Facility: 41

Acme Storage

Stale Data Warning: This facility has not been successfully scraped in 30 days (threshold: 3 days). Data may be outdated.
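The warning above compares the age of the last successful scrape against a threshold in days. A minimal sketch of such a check follows, assuming a hypothetical `is_stale` helper; the dashboard's actual implementation is not shown on this page and may differ.

```python
from datetime import datetime, timedelta


def is_stale(last_success: datetime, now: datetime, threshold_days: int = 3) -> bool:
    """Return True when the last successful scrape is older than the threshold.

    Hypothetical helper: the name, signature, and default are assumptions.
    """
    return now - last_success > timedelta(days=threshold_days)
```

For example, a scrape that last succeeded 30 days before `now` is stale under the 3-day threshold, while one from 2 days earlier is not.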
Facility Information (active)
Facility ID: 41
Name: Acme Storage
URL: http://www.acmestorage.com/rates.html
Address: N/A
Platform: custom_facility_41
Parser File: src/parsers/custom/facility_41_parser.py
Last Scraped: 2026-03-23 03:15:35.061054
Created: 2026-03-14 16:21:53.706708
Updated: 2026-03-23 03:15:35.071229
Parser & Healing Diagnosis (working)
Parser Status: ✓ Working
Status Reason: N/A
Last Healing Attempt: Not attempted
Parser Source (src/parsers/custom/facility_41_parser.py)
"""Parser for ACME Storage (Newville, PA)."""
from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult


class Facility41Parser(BaseParser):
    """Extract storage units from acmestorage.com/rates.html.

    Old FrontPage site with unit data spread across multiple tables.
    Each unit row contains: Size link, Square Footage text, Monthly Rate,
    Security/Cleaning Deposit.
    """

    platform = "custom_facility_41"

    _SIZE_RE = re.compile(
        r"(\d+)['\u2018\u2019\u2032\u00b4]?\s*x\s*(\d+)['\u2018\u2019\u2032\u00b4]?",
        re.IGNORECASE,
    )

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)

        # Scan every table row; a unit row has a size such as 10' x 20' in its first cell
        for row in soup.find_all("tr"):
            # Only consider direct child <td> cells, not cells of nested tables
            cells = row.find_all("td", recursive=False)
            if len(cells) < 3:
                continue

            # Skip rows that contain nested tables (layout rows)
            if cells[0].find("table"):
                continue

            first_text = cells[0].get_text(strip=True)
            size_match = self._SIZE_RE.search(first_text)
            if not size_match:
                continue

            # Skip rows that describe door or opening dimensions rather than units
            row_text = row.get_text(strip=True)
            if "door" in row_text.lower() or "opening" in row_text.lower():
                continue

            # Skip header rows
            if "size" in first_text.lower() and "footage" in row_text.lower():
                continue

            width = float(size_match.group(1))
            length = float(size_match.group(2))

            # Take the first cell after the size cell that contains a parseable "$" price
            price = None
            for cell in cells[1:]:
                cell_text = cell.get_text(strip=True)
                if "$" in cell_text:
                    parsed = self.normalize_price(cell_text)
                    if parsed is not None:
                        price = parsed
                        break

            unit = UnitResult(
                size=f"{int(width)}' x {int(length)}'",
                price=price,
                description=row_text,
                metadata={"width": width, "length": length, "sqft": width * length},
            )
            result.units.append(unit)

        if not result.units:
            result.warnings.append("No unit rows found in tables")

        return result
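
The size and price matching in the parser above can be exercised in isolation. The following is a minimal sketch, assuming a standalone copy of `_SIZE_RE` and a hypothetical `parse_price` helper standing in for `BaseParser.normalize_price`, whose real behavior is not shown on this page.

```python
from __future__ import annotations

import re

# Copy of Facility41Parser._SIZE_RE: two integers separated by "x",
# each optionally followed by an ASCII or typographic foot mark.
SIZE_RE = re.compile(
    r"(\d+)['\u2018\u2019\u2032\u00b4]?\s*x\s*(\d+)['\u2018\u2019\u2032\u00b4]?",
    re.IGNORECASE,
)

# Hypothetical stand-in for BaseParser.normalize_price (an assumption).
_PRICE_RE = re.compile(r"\$\s*(\d+(?:,\d{3})*(?:\.\d+)?)")


def parse_size(text: str) -> tuple[int, int] | None:
    """Return (width, length) in feet, or None if no size is present."""
    match = SIZE_RE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def parse_price(text: str) -> float | None:
    """Return the first dollar amount in the text, or None if there is none."""
    match = _PRICE_RE.search(text)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))
```

Under these assumptions, `parse_size("10' x 20'")` yields `(10, 20)` and `parse_price("$105.00")` yields `105.0`, matching the third unit in the run below.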

Scrape Runs (3)

Run #484 Details

Status: exported
Parser Used: Facility41Parser
Platform Detected: table_layout
Units Found: 5
Stage Reached: exported
Timestamp: 2026-03-14 16:47:05.378802
Timing
Stage    Duration
Fetch    3055 ms
Detect   8 ms
Parse    4 ms
Export   14 ms

Snapshot: 41_20260314T164708Z.html

Parsed Units (5)

5' x 10': $45.00/mo
10' x 10': $70.00/mo
10' x 20': $105.00/mo
10' x 30': $135.00/mo
10' x 40': $185.00/mo
