Facility: 080081

Storage (Business Blog)

Stale Data Warning: This facility has not been successfully scraped in 30 days (threshold: 3 days). Data may be outdated.
Facility Information (active)
Facility ID
080081
Name
Storage (Business Blog)
URL
https://storage.business.blog/
Address
N/A
Platform
custom_facility_080081
Parser File
src/parsers/custom/facility_080081_parser.py
Last Scraped
2026-03-23 03:18:02.978727
Created
2026-03-06 23:45:35.865957
Updated
2026-03-23 03:18:02.985180
Parser & Healing Diagnosis (working)
Parser Status
✓ Working
Status Reason
N/A
Last Healing Attempt
Not attempted
Parser Source (src/parsers/custom/facility_080081_parser.py)
"""Parser for Downtown Laramie Storage facility (storage.business.blog).

This is a WordPress.com blog that lists storage unit pricing as plain text
inside <h4> elements within a <ul> list. Each <li> contains one unit.

Format: "5′ x 5′ storage unit -$25 a month with a $15 deposit"
"""

from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult


class Facility080081Parser(BaseParser):
    """Extract storage units from Downtown Laramie Storage (WordPress blog).

    Units are listed as <h4> elements inside <li> items in the entry-content
    section. Format: "{W}′ x {L}′ storage unit -${price} a month with a ${deposit} deposit"
    """

    platform = "custom_facility_080081"

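    # Matches: "5′ x 5′ storage unit -$25 a month with a $15 deposit"
    # Accepts a hyphen, en dash, or em dash as the separator, prime/apostrophe
    # variants after each dimension, and "x", "X", or "×" between dimensions.
    # The deposit amount is not captured.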
    _UNIT_RE = re.compile(
        r"(\d+(?:\.\d+)?)['\u2019\u2032\u2018\xb4]?\s*[xX\u00d7]\s*(\d+(?:\.\d+)?)['\u2019\u2032\u2018\xb4]?"
        r"\s+storage\s+unit\s*[-\u2013\u2014\s]+\$"
        r"([\d,]+(?:\.\d+)?)"
        r"\s+a\s+month",
        re.IGNORECASE,
    )

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)

        # Find the main entry content area
        content = soup.select_one("div.entry-content")
        if not content:
            result.warnings.append("No entry-content div found on page")
            return result

        # Find all <li> items that contain <h4> pricing text
        for li in content.select("li"):
            h4 = li.select_one("h4")
            if not h4:
                continue

            text = h4.get_text(strip=True)
            match = self._UNIT_RE.search(text)
            if not match:
                continue

            width = float(match.group(1))
            length = float(match.group(2))
            price_str = match.group(3).replace(",", "")
            price = float(price_str)

            unit = UnitResult(
                # ":g" keeps fractional dimensions (e.g. 7.5) that int() would truncate.
                size=f"{width:g}' x {length:g}'",
                price=price,
                description=text,
                metadata={
                    "width": width,
                    "length": length,
                    "sqft": width * length,
                },
            )
            result.units.append(unit)

        if not result.units:
            result.warnings.append("No unit lines matched in entry-content list items")

        return result
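
The pattern above can be exercised on its own, without BeautifulSoup or the project's base classes. The following is a minimal sketch, assuming the sample line from the module docstring; `UNIT_RE` is a copy of `_UNIT_RE`, and `extract_unit` is a hypothetical helper that is not part of the parser. As in the parser, the deposit amount is not captured.

```python
import re

# Copy of Facility080081Parser._UNIT_RE so this sketch runs standalone.
UNIT_RE = re.compile(
    r"(\d+(?:\.\d+)?)['\u2019\u2032\u2018\xb4]?\s*[xX\u00d7]\s*(\d+(?:\.\d+)?)['\u2019\u2032\u2018\xb4]?"
    r"\s+storage\s+unit\s*[-\u2013\u2014\s]+\$"
    r"([\d,]+(?:\.\d+)?)"
    r"\s+a\s+month",
    re.IGNORECASE,
)


def extract_unit(text):
    """Hypothetical helper: return (width, length, monthly_price) or None."""
    match = UNIT_RE.search(text)
    if match is None:
        return None
    width = float(match.group(1))
    length = float(match.group(2))
    # Drop thousands separators before converting the price.
    price = float(match.group(3).replace(",", ""))
    return width, length, price


# Sample line from the module docstring (U+2032 is the prime character).
sample = "5\u2032 x 5\u2032 storage unit -$25 a month with a $15 deposit"
print(extract_unit(sample))                 # (5.0, 5.0, 25.0)
print(extract_unit("Call us for pricing"))  # None
```

Lines that do not follow the "{W} x {L} storage unit -${price} a month" shape return None, which corresponds to the parser skipping that list item.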

Scrape Runs (5)

Run #1463 Details

Status
exported
Parser Used
Facility080081Parser
Platform Detected
unknown
Units Found
3
Stage Reached
exported
Timestamp
2026-03-23 03:17:54.592679
Timing
Stage    Duration
Fetch    8318 ms
Detect   28 ms
Parse    17 ms
Export   5 ms

Snapshot: 080081_20260323T031802Z.html

Parsed Units (3)

5' x 5'

$25.00/mo

5' x 10'

$35.00/mo

10' x 10'

$60.00/mo

All Failures for this Facility (1)

parse | _WarningAsException | scraper | no_units_extracted | warning | Run #N/A | 2026-03-13 19:10:29.053939

No units extracted for 080081

Stack trace
src.reporting.failure_reporter._WarningAsException: No units extracted for 080081
