Facility: 042202

Laramie A1 Storage

Stale Data Warning: This facility has not been successfully scraped in 30 days (threshold: 3 days). Data may be outdated.
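A staleness check like the one behind this warning can be sketched as a comparison between the time since the last successful scrape and the threshold. This is a minimal illustration only: `is_stale` and its parameters are hypothetical names, not taken from the dashboard's code.

```python
from datetime import datetime, timedelta


def is_stale(last_scraped: datetime, now: datetime, threshold_days: int = 3) -> bool:
    """Return True when the last successful scrape is older than the threshold.

    Hypothetical helper; the 3-day default mirrors the threshold in the warning.
    """
    return now - last_scraped > timedelta(days=threshold_days)


last = datetime(2026, 3, 23, 3, 21, 32)
stale_after_30_days = is_stale(last, last + timedelta(days=30))
stale_after_2_days = is_stale(last, last + timedelta(days=2))
print(stale_after_30_days, stale_after_2_days)
```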
Facility Information (active)
Facility ID
042202
Name
Laramie A1 Storage
URL
https://laramiea1storage.wixsite.com/a1-storage
Address
N/A
Platform
custom_facility_042202
Parser File
src/parsers/custom/facility_042202_parser.py
Last Scraped
2026-03-23 03:21:32.295086
Created
2026-03-06 23:45:35.865957
Updated
2026-03-23 03:21:32.301163
Parser & Healing Diagnosis (working)
Parser Status
✓ Working
Status Reason
N/A
Last Healing Attempt
Not attempted
Parser Source (src/parsers/custom/facility_042202_parser.py)
"""Parser for A1 Storage Laramie WY facility (Wix site).

The pricing page is a single Wix rich-text component (comp-mfzsuxvb) containing
size headers followed by door-type sub-headers and outlet/price lines.

Structure example:
    5X10 - Only 5 Left!
    Standard Single Door
    Without outlet: $55

    10X10 - Only 1 left!
    Standard Garage Door
    Without outlet: - Unavailable
    With outlet: $135
    Standard Double Door
    ...

    10X20 - Call for Availability
    ...
"""

from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult


class Facility042202Parser(BaseParser):
    """Extract storage units from A1 Storage Laramie WY (Wix site).

    All pricing is contained in a single Wix rich-text element.
    Units are grouped by size; each size may have multiple door-type
    sub-sections with outlet/no-outlet pricing variants.
    """

    platform = "custom_facility_042202"

    # Matches size headings like "5X10 - Only 5 Left!", "10X10 -", "10X20 - Call for Availability";
    # group 3 captures any text that follows the dash
    _SIZE_RE = re.compile(r"^(\d+)[Xx](\d+)\s*-?\s*(.*)$")

    # Matches scarcity text like "Only 5 Left!", "Only 1 left!" (case-insensitive)
    _SCARCITY_RE = re.compile(r"Only\s+\d+\s+left", re.IGNORECASE)

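    # Matches price lines like "Without outlet: $55", "With outlet: $135",
    # and "Without outlet: - Unavailable". Lines are normalized before matching
    # (non-breaking spaces become plain spaces), so the optional dash must not
    # depend on a literal \xa0.
    _PRICE_LINE_RE = re.compile(
        r"(With(?:out)?\s+outlet)\s*:\s*(?:-\s*)?\$?([\d,]+(?:\.\d+)?|Unavailable)",
        re.IGNORECASE,
    )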

    # Door type sub-headers
    _DOOR_TYPE_RE = re.compile(
        r"^(Standard\s+(?:Single|Double|Garage)\s+Door)$",
        re.IGNORECASE,
    )

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)

        # The entire pricing content lives in the Wix rich-text container
        container = soup.find("div", id="comp-mfzsuxvb")
        if not container:
            result.warnings.append("Wix rich-text container 'comp-mfzsuxvb' not found")
            return result

        # Collect all meaningful text lines from paragraphs and list items
        raw_lines: list[str] = []
        for elem in container.find_all(["p", "li", "h1", "h2", "h3", "h4", "h5", "h6"]):
            text = elem.get_text(separator=" ", strip=True)
            # Strip zero-width spaces and non-breaking spaces used as spacers
            text = text.replace("\u200b", "").replace("\xa0", " ").strip()
            if text:
                raw_lines.append(text)

        # De-duplicate consecutive identical lines (Wix sometimes renders both
        # <li> and a sibling <p> for the same content)
        lines: list[str] = []
        for line in raw_lines:
            if not lines or line != lines[-1]:
                lines.append(line)

        # Parse lines into unit records
        current_size: tuple[int, int] | None = None  # (width, length)
        current_scarcity: str | None = None
        current_door_type: str | None = None

        for line in lines:
            # Check for size heading (may include scarcity inline, e.g. "5X10 -Only 5 Left!")
            size_match = self._SIZE_RE.match(line)
            if size_match:
                width = int(size_match.group(1))
                length = int(size_match.group(2))
                current_size = (width, length)
                current_door_type = None
                # Scarcity may be appended directly: "5X10 -Only 5 Left!"
                suffix = size_match.group(3).strip() if size_match.group(3) else ""
                if self._SCARCITY_RE.search(suffix):
                    current_scarcity = suffix
                elif re.search(r"Call for Availability", suffix, re.IGNORECASE):
                    current_scarcity = "Call for Availability"
                else:
                    current_scarcity = None
                continue

            # Check for door-type sub-header
            if self._DOOR_TYPE_RE.match(line):
                current_door_type = line
                continue

            # Check for scarcity-only lines (e.g. "Only 1 left!")
            if self._SCARCITY_RE.search(line) or re.search(r"Call for Availability", line, re.IGNORECASE):
                current_scarcity = line
                continue

            # Check for price lines
            price_match = self._PRICE_LINE_RE.match(line)
            if price_match and current_size:
                outlet_label = price_match.group(1).strip()
                price_raw = price_match.group(2).strip()

                if price_raw.lower() == "unavailable" or not price_raw:
                    # Skip unavailable units
                    continue

                width, length = current_size
                price = float(price_raw.replace(",", ""))
                has_outlet = "without" not in outlet_label.lower()

                description_parts = [f"{width}x{length}"]
                if current_door_type:
                    description_parts.append(current_door_type)
                description_parts.append(outlet_label)
                description = " | ".join(description_parts)

                unit = UnitResult(
                    size=self.normalize_size(f"{width}x{length}"),
                    price=price,
                    description=description,
                    scarcity=current_scarcity,
                    url=url,
                    metadata={
                        "width": width,
                        "length": length,
                        "sqft": width * length,
                        "door_type": current_door_type,
                        "has_outlet": has_outlet,
                    },
                )
                result.units.append(unit)

        if not result.units:
            result.warnings.append(
                "No available units found; all may be unavailable or page structure changed"
            )

        return result
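
The line-by-line state machine in the parser above can be exercised without HTML by feeding it the lines from the docstring's structure example. The following is a simplified sketch, not the parser itself: `walk` and `SAMPLE` are hypothetical names, the patterns are modelled on the parser's regexes, and the price pattern here assumes the optional "- " before "Unavailable" is a plain dash and space.

```python
import re

# Simplified patterns modelled on the parser's regexes (assumptions of this sketch).
SIZE_RE = re.compile(r"^(\d+)[Xx](\d+)\s*-?\s*(.*)$")
DOOR_RE = re.compile(r"^Standard\s+(?:Single|Double|Garage)\s+Door$", re.IGNORECASE)
PRICE_RE = re.compile(
    r"^(With(?:out)?\s+outlet)\s*:\s*(?:-\s*)?\$?([\d,]+(?:\.\d+)?|Unavailable)$",
    re.IGNORECASE,
)


def walk(lines):
    """Yield (width, length, door_type, has_outlet, price) for each priced line."""
    size = door = None
    for line in lines:
        if m := SIZE_RE.match(line):
            # A new size heading resets the door type.
            size, door = (int(m.group(1)), int(m.group(2))), None
        elif DOOR_RE.match(line):
            door = line
        elif (m := PRICE_RE.match(line)) and size:
            label, raw = m.group(1), m.group(2)
            if raw.lower() == "unavailable":
                continue  # unavailable variants produce no unit
            has_outlet = "without" not in label.lower()
            yield (*size, door, has_outlet, float(raw.replace(",", "")))


SAMPLE = [
    "5X10 - Only 5 Left!",
    "Standard Single Door",
    "Without outlet: $55",
    "10X10 - Only 1 left!",
    "Standard Garage Door",
    "Without outlet: - Unavailable",
    "With outlet: $135",
]

units = list(walk(SAMPLE))
for unit in units:
    print(unit)
```

On this sample the sketch yields two units, a 5x10 without outlet at 55.0 and a 10x10 with outlet at 135.0, while the unavailable 10x10 variant is skipped.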

Scrape Runs (5)

Run #1497 Details

Status
exported
Parser Used
Facility042202Parser
Platform Detected
table_layout
Units Found
2
Stage Reached
exported
Timestamp
2026-03-23 03:21:26.453856
Timing
Stage     Duration
Fetch     5764 ms
Detect    41 ms
Parse     18 ms
Export    4 ms

Snapshot: 042202_20260323T032132Z.html

Parsed Units (2)

(5.0,10.0,50.0)

$55.00/mo
Only 5 Left!

(10.0,10.0,100.0)

$135.00/mo
Only 1 left!
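
The size tuples shown for each parsed unit appear to be (width, length, square feet), which agrees with the `sqft` value the parser stores as width * length. A minimal sketch under that assumption follows; `size_tuple` is a hypothetical helper for illustration and is not the project's `normalize_size`, whose implementation is not shown here.

```python
def size_tuple(width: float, length: float) -> tuple[float, float, float]:
    """Return (width, length, area) as floats; hypothetical helper."""
    w, ln = float(width), float(length)
    return (w, ln, w * ln)


small = size_tuple(5, 10)
medium = size_tuple(10, 10)
print(small, medium)
```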
