Facility 096683 - Facility Scrapers

Stale Data Warning: This facility has not been successfully scraped in 81 days (threshold: 3 days). Data may be outdated.
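
The warning above compares the time since the last successful scrape against a 3-day threshold. A minimal sketch of such a check follows. The function name `is_stale` and its parameters are hypothetical, and which stored timestamp the dashboard actually uses for this comparison is an assumption, not something this page confirms.

```python
from datetime import datetime, timedelta


def is_stale(last_success: datetime, now: datetime, threshold_days: int = 3) -> bool:
    """Return True when the last successful scrape is older than the threshold.

    Hypothetical helper: the dashboard's real staleness logic is not shown
    on this page, so this only illustrates the comparison the warning implies.
    """
    return now - last_success > timedelta(days=threshold_days)


if __name__ == "__main__":
    last_success = datetime(2026, 3, 23, 3, 22, 5)
    # One day later: still within the 3-day threshold.
    print(is_stale(last_success, datetime(2026, 3, 24)))
    # Four calendar days later: past the threshold.
    print(is_stale(last_success, datetime(2026, 3, 27)))
```

Passing `now` explicitly, rather than calling `datetime.now()` inside the function, keeps the check deterministic and easy to test.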

Facility Information (active)

Facility ID: 096683
Name: WW Mini Storage
URL: http://www.wwministorage.com/

Address: N/A
Platform: custom_facility_096683
Parser File: src/parsers/custom/facility_096683_parser.py

Last Scraped: 2026-03-23 03:22:05.180363
Created: 2026-03-06 23:45:35.865957
Updated: 2026-03-23 03:22:05.180363

Parser & Healing Diagnosis (needs_fix)

Parser Status: ⚠ Needs Fix
Status Reason: Parser returned 0 units

Last Healing Attempt: Not attempted

Parser Source (src/parsers/custom/facility_096683_parser.py)

"""Parser for Isaacs Mini Storage (wwministorage.com) facility.

This StorageUnitSoftware-powered site does not publish prices online.
The units/rates page explicitly instructs visitors to call for pricing.
Available unit sizes are listed as a checklist on the home page and
as a bullet list on the units/rates page.

Parsing extracts unit sizes only; price fields are left as None and
a warning is recorded to document that pricing requires a phone call.
"""

from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult

# Matches dimension patterns like "10' x 20'", "4' x 6'", "10'x14'"
_SIZE_RE = re.compile(
    r"(\d+(?:\.\d+)?)['\u2019\u2032]?\s*[xX\u00d7]\s*(\d+(?:\.\d+)?)['\u2019\u2032]?",
)

# Matches availability annotations like "- AVAILABLE" that follow a size
_AVAILABILITY_RE = re.compile(r"-\s*AVAILABLE", re.IGNORECASE)


class Facility096683Parser(BaseParser):
    """Extract storage unit sizes from Isaacs Mini Storage.

    Pricing is not published on the website; customers are directed to call
    (509) 540-4583 for rates. Unit sizes are extracted from the checklist on
    the home page or the bullet list on the /pages/rent page.
    """

    platform = "custom_facility_096683"

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)

        result.warnings.append(
            "Pricing not published on website; call (509) 540-4583 for rates"
        )

        seen_sizes: set[str] = set()

        # Strategy 1: widget-checklist items (home page)
        checklist = soup.find(class_="widget-checklist")
        if checklist:
            for span in checklist.find_all("span"):
                text = span.get_text(strip=True)
                match = _SIZE_RE.search(text)
                if match:
                    width = float(match.group(1))
                    length = float(match.group(2))
                    # ":g" keeps fractional sizes (7.5 stays 7.5); int() would truncate
                    size_label = f"{width:g}' x {length:g}'"
                    if size_label not in seen_sizes:
                        seen_sizes.add(size_label)
                        available = bool(_AVAILABILITY_RE.search(text))
                        unit = UnitResult(
                            size=size_label,
                            price=None,
                            description=text,
                            scarcity="available" if available else None,
                            url=url or None,
                            metadata={
                                "width": width,
                                "length": length,
                                "sqft": width * length,
                                "price_source": "call_for_pricing",
                            },
                        )
                        result.units.append(unit)

        # Strategy 2: bullet list on units/rates page (bullets inside h1-h6 headings)
        if not result.units:
            for heading in soup.find_all(re.compile(r"h[1-6]")):
                raw = heading.get_text(separator="\n", strip=True)
                for line in raw.splitlines():
                    line = line.strip().lstrip("•").strip()
                    match = _SIZE_RE.search(line)
                    if match:
                        width = float(match.group(1))
                        length = float(match.group(2))
                        # ":g" keeps fractional sizes (7.5 stays 7.5); int() would truncate
                        size_label = f"{width:g}' x {length:g}'"
                        if size_label not in seen_sizes:
                            seen_sizes.add(size_label)
                            unit = UnitResult(
                                size=size_label,
                                price=None,
                                description=line,
                                url=url or None,
                                metadata={
                                    "width": width,
                                    "length": length,
                                    "sqft": width * length,
                                    "price_source": "call_for_pricing",
                                },
                            )
                            result.units.append(unit)

        if not result.units:
            result.warnings.append("No unit size listings found on page")

        return result
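
The parser source above relies on two regular expressions to pull dimensions and an availability flag out of free text. The sketch below copies those two patterns into a standalone script so their behavior can be checked without the project's `BaseParser` classes. The helper `extract_size` and the sample strings are hypothetical and are not taken from the facility's actual HTML.

```python
import re

# Same patterns as _SIZE_RE and _AVAILABILITY_RE in the parser source.
SIZE_RE = re.compile(
    r"(\d+(?:\.\d+)?)['\u2019\u2032]?\s*[xX\u00d7]\s*(\d+(?:\.\d+)?)['\u2019\u2032]?",
)
AVAILABILITY_RE = re.compile(r"-\s*AVAILABLE", re.IGNORECASE)


def extract_size(text):
    """Return (width, length, available) for the first size in text, else None.

    Hypothetical helper for illustration; the real parser builds UnitResult
    objects instead of tuples.
    """
    match = SIZE_RE.search(text)
    if match is None:
        return None
    width = float(match.group(1))
    length = float(match.group(2))
    available = bool(AVAILABILITY_RE.search(text))
    return width, length, available


if __name__ == "__main__":
    # Made-up sample strings in the shapes the parser comments describe.
    samples = [
        "10' x 20' - AVAILABLE",
        "10'x14'",
        "5\u2019 \u00d7 6\u2019",
        "Call for pricing",
    ]
    for sample in samples:
        print(repr(sample), "->", extract_size(sample))
```

Note that the availability pattern requires a leading hyphen, so a line such as "4' x 6' available" would be reported as a size without the availability flag.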

Stage	Duration
Fetch	2507ms
Detect	7ms
Parse	3ms
Export	14ms

Scrape Runs (5)

Run #97 Details

Parsed Units (4)

10' x 20'

10' x 14'

4' x 6'

5' x 6'

All Failures for this Facility (1)
