Facility 41 - Facility Scrapers

Stale Data Warning: This facility has not been successfully scraped in 81 days (threshold: 3 days). Data may be outdated.
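The warning above compares the age of the last successful scrape against a threshold. A minimal sketch of that check, assuming the age is counted in whole days from the "Last Scraped" timestamp; the function names and the viewing time `now` are hypothetical and are not taken from the scraper's code:

```python
from datetime import datetime, timedelta

# Threshold quoted in the warning (3 days).
STALE_THRESHOLD = timedelta(days=3)


def days_since(last_scraped: datetime, now: datetime) -> int:
    """Whole days elapsed since the last successful scrape."""
    return (now - last_scraped).days


def is_stale(last_scraped: datetime, now: datetime,
             threshold: timedelta = STALE_THRESHOLD) -> bool:
    """True when the last successful scrape is older than the threshold."""
    return now - last_scraped > threshold


# "Last Scraped" value shown on this page.
last_scraped = datetime.fromisoformat("2026-03-23 03:15:35.061054")
# Hypothetical viewing time, chosen only to illustrate the arithmetic.
now = datetime(2026, 6, 12, 12, 0, 0)

print(days_since(last_scraped, now), is_stale(last_scraped, now))
```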

Facility Information (active)

Facility ID: 41
Name: Acme Storage
URL: http://www.acmestorage.com/rates.html

Address: N/A
Platform: custom_facility_41
Parser File: src/parsers/custom/facility_41_parser.py

Last Scraped: 2026-03-23 03:15:35.061054
Created: 2026-03-14 16:21:53.706708
Updated: 2026-03-23 03:15:35.071229

Parser & Healing Diagnosis (working)

Parser Status: ✓ Working
Status Reason: N/A

Last Healing Attempt: Not attempted

Parser Source (src/parsers/custom/facility_41_parser.py)

"""Parser for ACME Storage (Newville, PA)."""
from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult


class Facility41Parser(BaseParser):
    """Extract storage units from acmestorage.com/rates.html.

    Old FrontPage site with unit data spread across multiple tables.
    Each unit row contains: Size link, Square Footage text, Monthly Rate,
    Security/Cleaning Deposit.
    """

    platform = "custom_facility_41"

    _SIZE_RE = re.compile(
        r"(\d+)['\u2018\u2019\u2032\u00b4]?\s*x\s*(\d+)['\u2018\u2019\u2032\u00b4]?",
        re.IGNORECASE,
    )

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)

        # Scan every table row; a unit row is one whose first cell holds a
        # size such as 5' x 10'
        for row in soup.find_all("tr"):
            # Only look at immediate children td cells (not deeply nested)
            cells = row.find_all("td", recursive=False)
            if len(cells) < 3:
                continue

            # Skip rows that contain nested tables (layout rows)
            if cells[0].find("table"):
                continue

            first_text = cells[0].get_text(strip=True)
            size_match = self._SIZE_RE.search(first_text)
            if not size_match:
                continue

            # Join cell texts with a space so the description stays readable,
            # then skip rows that describe door dimensions rather than units
            row_text = row.get_text(" ", strip=True)
            if "door" in row_text.lower() or "opening" in row_text.lower():
                continue

            # Skip header rows
            if "size" in first_text.lower() and "footage" in row_text.lower():
                continue

            width = float(size_match.group(1))
            length = float(size_match.group(2))

            # Look for a price in the row cells
            price = None
            for cell in cells[1:]:
                cell_text = cell.get_text(strip=True)
                if "$" in cell_text:
                    parsed = self.normalize_price(cell_text)
                    if parsed is not None:
                        price = parsed
                        break

            unit = UnitResult(
                size=f"{int(width)}' x {int(length)}'",
                price=price,
                description=row_text,
                metadata={"width": width, "length": length, "sqft": width * length},
            )
            result.units.append(unit)

        if not result.units:
            result.warnings.append("No unit rows found in tables")

        return result

Stage	Duration
Fetch	2753ms
Detect	18ms
Parse	11ms
Export	5ms
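The stage durations above can be summed to see where the run spends its time. A small sketch, assuming the four stages run one after another so that the total is their sum; the variable names are illustrative only:

```python
# Stage durations in milliseconds, as listed in the table above.
stage_ms = {"Fetch": 2753, "Detect": 18, "Parse": 11, "Export": 5}

total_ms = sum(stage_ms.values())
slowest = max(stage_ms, key=stage_ms.get)
shares = {stage: ms / total_ms for stage, ms in stage_ms.items()}

print(f"total: {total_ms} ms")
for stage, share in shares.items():
    print(f"{stage}: {share:.1%}")
```

Under that assumption the total is 2787 ms, and the fetch stage accounts for nearly all of it.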

Scrape Runs (3)

Run #935 Details

Parsed Units (5)

5' x 10'

10' x 10'

10' x 20'

10' x 30'

10' x 40'

HTML Snapshot — Run #935