Facility: 037351

Riverside Storage WY

Stale Data Warning: This facility has not been successfully scraped in 30 days (threshold: 3 days). Data may be outdated.
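
The warning above compares the age of the last successful scrape with a 3-day threshold. A minimal sketch of such a check follows. The function name `is_stale` and the two "now" values are hypothetical and not taken from the dashboard code; only the last-scraped timestamp and the threshold come from this page.

```python
from datetime import datetime, timedelta


def is_stale(last_scraped: datetime, now: datetime, threshold_days: int = 3) -> bool:
    """Return True when the last successful scrape is older than the threshold."""
    return now - last_scraped > timedelta(days=threshold_days)


# "Last Scraped" value shown on this page.
last_scraped = datetime.fromisoformat("2026-03-23 03:19:35.996183")

# Hypothetical viewing times, chosen only for illustration.
print(is_stale(last_scraped, datetime(2026, 3, 24)))  # about one day later
print(is_stale(last_scraped, datetime(2026, 4, 22)))  # about thirty days later
```

The first call falls inside the threshold and the second exceeds it, which is the condition under which a warning like the one above would be shown.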
Facility Information (active)
Facility ID: 037351
Name: Riverside Storage WY
URL: https://www.riversidestoragewy.com/storage-units-rates-specials
Address: N/A
Platform: custom_facility_037351
Parser File: src/parsers/custom/facility_037351_parser.py
Last Scraped: 2026-03-23 03:19:35.996183
Created: 2026-03-06 23:45:35.865957
Updated: 2026-03-23 03:19:36.004135
Parser & Healing Diagnosis (working)
Parser Status: ✓ Working
Status Reason: N/A
Last Healing Attempt: Not attempted
Parser Source (src/parsers/custom/facility_037351_parser.py)
"""Parser for Riverside Storage WY facility.

The pricing page uses nested `ul.innerList.defaultList` lists where each
indoor unit is structured as:
  <li><b>10x10</b></li>
  <ul>
    <li>$85 / $935 / $77 / 1BR Apartment</li>
    ...
  </ul>

Outdoor storage entries follow the same pattern but the li text contains
a description with an embedded size (e.g. "Back-in Spots – 12x20 ft").

The price string format is: "$mo / $annual / $senior / description"
"""

from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult


class Facility037351Parser(BaseParser):
    """Extract storage units from Riverside Storage WY pricing page."""

    platform = "custom_facility_037351"

    # Matches price lines like "$65 / $715 / $59" or "$85 / $935 / $77 / 1BR Apartment"
    _PRICE_RE = re.compile(
        r"\$\s*([\d,]+(?:\.\d+)?)"  # monthly price
        r"\s*/\s*\$\s*([\d,]+(?:\.\d+)?)"  # annual total
        r"\s*/\s*\$\s*([\d,]+(?:\.\d+)?)"  # senior monthly price
        r"(?:\s*/\s*(.+))?",  # optional description suffix
        re.IGNORECASE,
    )

    # Matches sizes embedded in outdoor descriptions like "12x20" or "30-40 ft"
    _OUTDOOR_SIZE_RE = re.compile(r"(\d+)[xX\u00d7](\d+)(?:\s*ft)?|(\d+)-(\d+)\s*ft", re.IGNORECASE)

    def _parse_price_line(self, text: str) -> tuple[float | None, str | None]:
        """Return (monthly_price, description_suffix) from a price line."""
        m = self._PRICE_RE.search(text)
        if not m:
            return None, None
        monthly = float(m.group(1).replace(",", ""))
        desc_suffix = m.group(4).strip() if m.group(4) else None
        return monthly, desc_suffix

    def _parse_unit_list(
        self,
        main_ul: object,
        result: ParseResult,
        outdoor: bool = False,
    ) -> None:
        """Walk the top-level children of a unit list and extract units."""
        children = [c for c in main_ul.children if hasattr(c, "name") and c.name]

        i = 0
        while i < len(children):
            child = children[i]

            if child.name == "li":
                size_text = child.get_text(strip=True)

                # Peek at the next sibling — it should be a ul with the price
                if i + 1 < len(children) and children[i + 1].name == "ul":
                    price_ul = children[i + 1]
                    price_li = price_ul.find("li")
                    price_text = price_li.get_text(strip=True) if price_li else ""
                    monthly, desc_suffix = self._parse_price_line(price_text)

                    if outdoor:
                        # For outdoor units, the size is embedded in the description
                        m = self._OUTDOOR_SIZE_RE.search(size_text)
                        if m:
                            if m.group(1):
                                width, length = float(m.group(1)), float(m.group(2))
                                size = f"{int(width)}' x {int(length)}'"
                                metadata: dict = {"width": width, "length": length, "sqft": width * length}
                            else:
                                # Variable length like "30-40 ft"
                                size = size_text
                                metadata = {}
                        else:
                            size = size_text
                            metadata = {}

                        description = size_text
                        if desc_suffix:
                            description = f"{size_text} – {desc_suffix}"
                    else:
                        # Indoor unit: size_text is "10x10" form
                        w, ln, sq = self.normalize_size(size_text)
                        if w is not None:
                            size = f"{int(w)}' x {int(ln)}'"
                            metadata = {"width": w, "length": ln, "sqft": sq}
                        else:
                            size = size_text
                            metadata = {}

                        description = size_text
                        if desc_suffix:
                            description = f"{size_text} – {desc_suffix}"

                    unit = UnitResult(
                        size=size,
                        price=monthly,
                        description=description,
                        metadata=metadata if metadata else None,
                    )
                    result.units.append(unit)
                    i += 2  # skip the price ul
                    continue

            i += 1

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)

        all_uls = soup.select("ul.innerList.defaultList")
        if not all_uls:
            result.warnings.append("No innerList.defaultList elements found on page")
            return result

        # The first matched list is treated as the indoor self-storage list.
        # An outdoor list is only looked for when an "Outdoor Storage Pricing"
        # heading exists; it is then identified by the text of its first <li>.
        outdoor_heading = soup.find(string=re.compile(r"Outdoor Storage Pricing", re.I))
        outdoor_ul = None
        if outdoor_heading:
            # Pick the first innerList.defaultList that looks like an outdoor list
            for ul in all_uls:
                # Skip lists nested inside the heading's own parent element
                # (note: this does not verify document order)
                if outdoor_heading.parent and outdoor_heading.parent in ul.parents:
                    continue
                # Heuristic: the first <li> mentions "Back-in", "Pull-through" or "Outdoor"
                first_li = ul.find("li")
                if first_li and re.search(r"Back-in|Pull-through|Outdoor", first_li.get_text(), re.I):
                    outdoor_ul = ul
                    break

        # Parse indoor units from the first top-level list
        main_ul = all_uls[0]
        self._parse_unit_list(main_ul, result, outdoor=False)

        # Parse outdoor units if found
        if outdoor_ul and outdoor_ul is not main_ul:
            self._parse_unit_list(outdoor_ul, result, outdoor=True)

        if not result.units:
            result.warnings.append("No units extracted from pricing page")

        return result

Scrape Runs (5)

Run #990 Details

Status: exported
Parser Used: Facility037351Parser
Platform Detected: table_layout
Units Found: 8
Stage Reached: exported
Timestamp: 2026-03-21 19:12:30.150729

Timing (stage: duration)
Fetch: 4918 ms
Detect: 30 ms
Parse: 18 ms
Export: 9 ms

Snapshot: 037351_20260321T191235Z.html

Parsed Units (8)

Size                       Price
5' x 10'                   $65.00/mo
10' x 10'                  $85.00/mo
10' x 15'                  $105.00/mo
10' x 20'                  $125.00/mo
10' x 30'                  $170.00/mo
12' x 20'                  $35.00/mo
12' x 30'                  $40.00/mo
Pull-through – 30-40 ft    $50.00/mo
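
The last parsed unit keeps its full description as the size because only the "WxL" branch of `_OUTDOOR_SIZE_RE` produces numeric dimensions. The sketch below reproduces that branch logic with the standard library only. The function `outdoor_size` is a hypothetical stand-in for the outdoor branch of `_parse_unit_list`, and the first sample string is the example given in the parser docstring.

```python
import re

# Same pattern as _OUTDOOR_SIZE_RE in the parser above.
OUTDOOR_SIZE_RE = re.compile(
    r"(\d+)[xX\u00d7](\d+)(?:\s*ft)?|(\d+)-(\d+)\s*ft", re.IGNORECASE
)


def outdoor_size(text: str) -> tuple[str, dict]:
    """Hypothetical helper: return (size label, metadata) for an outdoor entry."""
    m = OUTDOOR_SIZE_RE.search(text)
    if m and m.group(1):
        # Fixed dimensions such as "12x20": normalise and compute the area.
        width, length = float(m.group(1)), float(m.group(2))
        label = f"{int(width)}' x {int(length)}'"
        return label, {"width": width, "length": length, "sqft": width * length}
    # Variable length such as "30-40 ft", or no size at all: keep the raw text.
    return text, {}


print(outdoor_size("Back-in Spots – 12x20 ft"))
print(outdoor_size("Pull-through – 30-40 ft"))
```

A range such as "30-40 ft" matches the second alternative of the pattern, so no width or length is derived and the original text is used as the size label, consistent with the final entry in the parsed units above.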
