Facility 8 - Facility Scrapers

Stale Data Warning: This facility has not been successfully scraped in 81 days (threshold: 3 days). Data may be outdated.
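
The staleness check behind this warning can be sketched as a simple comparison of elapsed time against the threshold. This is a minimal illustration, not the dashboard's actual code: the names `STALE_THRESHOLD`, `days_since`, and `is_stale` are hypothetical, and the "current" time is constructed only to reproduce the 81-day figure shown above.

```python
from datetime import datetime, timedelta

# Assumption: mirrors the 3-day threshold reported in the warning.
STALE_THRESHOLD = timedelta(days=3)


def days_since(last_scraped: datetime, now: datetime) -> int:
    """Whole days elapsed since the last successful scrape."""
    return (now - last_scraped).days


def is_stale(last_scraped: datetime, now: datetime,
             threshold: timedelta = STALE_THRESHOLD) -> bool:
    """True when the last successful scrape is older than the threshold."""
    return now - last_scraped > threshold


# "Last Scraped" value taken from this page.
last_scraped = datetime.fromisoformat("2026-03-23 03:16:11.602668")
# Illustrative "current" time, chosen to land 81 days later.
now = last_scraped + timedelta(days=81, hours=2)

print(days_since(last_scraped, now), is_stale(last_scraped, now))
```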

Facility Information (active)

Facility ID: 8
Name: West Theater Barberton Storage
URL: https://westtheaterbarberton.com/unitsprices.html

Address: N/A
Platform: custom_facility_8
Parser File: src/parsers/custom/facility_8_parser.py

Last Scraped: 2026-03-23 03:16:11.602668
Created: 2026-03-14 16:21:53.706708
Updated: 2026-03-23 03:16:11.602668

Parser & Healing Diagnosis (needs_fix)

Parser Status: ⚠ Needs Fix
Status Reason: Parser returned 0 units

Last Healing Attempt: Not attempted
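
A "Parser returned 0 units" status can come from two different exits in the parser source listed on this page, and each exit appends its own warning string. The sketch below shows one way to tell them apart from the warnings alone. The function name `diagnose_zero_units` and the returned labels are hypothetical, and the mapping is a heuristic rather than a confirmed diagnosis.

```python
# Warning strings copied from facility_8_parser.py.
NO_CONTENT = "No #content section found"
NO_PRICING = "No unit pricing found in page text"


def diagnose_zero_units(warnings: list[str]) -> str:
    """Map the parser's warnings to a likely cause (heuristic only)."""
    if NO_CONTENT in warnings:
        # The element with id="content" was not in the fetched HTML,
        # so the parser returned before trying either regex.
        return "layout_changed"
    if NO_PRICING in warnings:
        # The container exists, but neither pricing regex matched its text.
        return "wording_changed"
    return "unknown"


print(diagnose_zero_units([NO_CONTENT]))
print(diagnose_zero_units([NO_PRICING]))
```

Checking the stored HTML snapshot for an element with `id="content"` would be the natural first step before touching the regexes.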

Parser Source (src/parsers/custom/facility_8_parser.py)

"""Parser for A & D West Theater Storage (Barberton, Ohio)."""

from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult


class Facility8Parser(BaseParser):
    """Extract storage units from West Theater Storage.

    Pricing is embedded in paragraph text with various patterns:
        "$35 per month for a 4x5"
        "5 x 5 unit runs $40"
        "5 x 8 units are $45"
        "10 x 12 at $85 per month"
    All units are indoor/heated.
    """

    platform = "custom_facility_8"

    # Pattern 1: "$XX for a WxL" / "$XX per month for a WxL"
    _PRICE_FIRST_RE = re.compile(
        r"\$([\d,]+(?:\.\d+)?)\s+(?:per\s+month\s+)?for\s+a\s+"
        r"(\d+)\s*[xX]\s*(\d+)",
        re.IGNORECASE,
    )

    # Pattern 2: "WxL ... (runs|are|is|at) $XX"
    _SIZE_FIRST_RE = re.compile(
        r"(\d+)\s*[xX]\s*(\d+)\s+(?:units?\s+)?(?:runs?|are|is|at)\s+\$"
        r"([\d,]+(?:\.\d+)?)",
        re.IGNORECASE,
    )

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)

        content = soup.find(id="content")
        if not content:
            result.warnings.append("No #content section found")
            return result

        text = content.get_text(separator=" ", strip=True)

        # Check if facility is described as heated/indoor
        full_page_text = soup.get_text(separator=" ", strip=True).lower()
        is_indoor = "indoor" in full_page_text
        is_heated = "heated" in full_page_text

        # Dedupe by (width, length); the first pattern to match a size wins.
        seen: set[tuple[float, float]] = set()

        def _make_metadata(width: float, length: float) -> dict:
            meta: dict = {"width": width, "length": length, "sqft": width * length}
            if is_indoor:
                meta["indoor"] = True
            if is_heated:
                meta["climateControlled"] = True
            return meta

        # Pattern 1: price first
        for match in self._PRICE_FIRST_RE.finditer(text):
            price = self.normalize_price(match.group(1))
            width = float(match.group(2))
            length = float(match.group(3))
            key = (width, length)
            if key not in seen:
                seen.add(key)
                unit = UnitResult(
                    size=f"{int(width)}x{int(length)}",
                    sale_price=price,
                    description=match.group(0).strip(),
                    metadata=_make_metadata(width, length),
                )
                result.units.append(unit)

        # Pattern 2: size first
        for match in self._SIZE_FIRST_RE.finditer(text):
            width = float(match.group(1))
            length = float(match.group(2))
            price = self.normalize_price(match.group(3))
            key = (width, length)
            if key not in seen:
                seen.add(key)
                unit = UnitResult(
                    size=f"{int(width)}x{int(length)}",
                    sale_price=price,
                    description=match.group(0).strip(),
                    metadata=_make_metadata(width, length),
                )
                result.units.append(unit)

        if not result.units:
            result.warnings.append("No unit pricing found in page text")

        return result
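
To check the two regexes in isolation, without `BaseParser` or BeautifulSoup, the matching logic can be exercised against the sample phrasings from the class docstring. This is a standalone sketch: the patterns are copied from the parser above, while `to_price` is a hypothetical stand-in for `normalize_price` (whose real behavior is not shown on this page), and `extract_units` and `SAMPLE` exist only for illustration.

```python
import re

# Patterns copied verbatim from Facility8Parser.
PRICE_FIRST_RE = re.compile(
    r"\$([\d,]+(?:\.\d+)?)\s+(?:per\s+month\s+)?for\s+a\s+"
    r"(\d+)\s*[xX]\s*(\d+)",
    re.IGNORECASE,
)
SIZE_FIRST_RE = re.compile(
    r"(\d+)\s*[xX]\s*(\d+)\s+(?:units?\s+)?(?:runs?|are|is|at)\s+\$"
    r"([\d,]+(?:\.\d+)?)",
    re.IGNORECASE,
)


def to_price(raw: str) -> float:
    """Hypothetical stand-in for BaseParser.normalize_price."""
    return float(raw.replace(",", ""))


def extract_units(text: str) -> list[tuple[str, float]]:
    """Return (size, price) pairs, deduplicated by dimensions."""
    units: list[tuple[str, float]] = []
    seen: set[tuple[int, int]] = set()

    # Price-first matches are collected before size-first ones,
    # in the same order the parser applies them.
    for m in PRICE_FIRST_RE.finditer(text):
        key = (int(m.group(2)), int(m.group(3)))
        if key not in seen:
            seen.add(key)
            units.append((f"{key[0]}x{key[1]}", to_price(m.group(1))))

    for m in SIZE_FIRST_RE.finditer(text):
        key = (int(m.group(1)), int(m.group(2)))
        if key not in seen:
            seen.add(key)
            units.append((f"{key[0]}x{key[1]}", to_price(m.group(3))))

    return units


# Sample sentences assembled from the phrasings in the class docstring.
SAMPLE = (
    "$35 per month for a 4x5. A 5 x 5 unit runs $40. "
    "5 x 8 units are $45. 10 x 12 at $85 per month."
)
units = extract_units(SAMPLE)
print(units)
# → [('4x5', 35.0), ('5x5', 40.0), ('5x8', 45.0), ('10x12', 85.0)]
```

If the live page text yields an empty list here while the docstring samples still match, the site's wording has probably drifted from these four phrasings.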

Stage	Duration
Fetch	1915ms
Detect	0ms
Parse	0ms
Export	11ms

Scrape Runs (3)

Run #495 Details

All Failures for this Facility (3)

HTML Snapshot — Run #495