Facility: 012860

Al's Mini Storage - Forsyth

Stale Data Warning: This facility has not been successfully scraped in 26 days (threshold: 3 days). Data may be outdated.
⚠ Unit Count Anomaly (Critical): Current run has 0 units, expected baseline is 6 (-100.0% change, delta: -6).
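
The anomaly figures above follow from simple arithmetic on the current and baseline unit counts. A minimal sketch of that calculation is shown here; the function name `unit_count_change` is hypothetical and is not taken from the pipeline's code.

```python
def unit_count_change(current: int, baseline: int) -> tuple[int, float]:
    """Return (delta, percent_change) of the current count vs. the baseline.

    Hypothetical helper for illustration; the real pipeline may differ.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive to compute a percentage")
    delta = current - baseline
    return delta, 100.0 * delta / baseline


# Values from the warning: 0 units found, baseline of 6.
delta, pct = unit_count_change(current=0, baseline=6)
print(f"{pct:+.1f}% change, delta: {delta:+d}")  # -100.0% change, delta: -6
```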
Facility Information (active)
Facility ID: 012860
Name: Al's Mini Storage - Forsyth
URL: https://www.alsministorage.com/location/forsyth/
Address: N/A
Platform: custom_facility_012860
Parser File: src/parsers/custom/facility_012860_parser.py
Last Scraped: 2026-03-27 14:00:50.522115
Created: 2026-03-06 23:45:35.865957
Updated: 2026-03-27 14:00:50.553341
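
The stale-data warning compares the Last Scraped timestamp against a 3-day threshold. A rough sketch of such a check follows; the helper names and the `now` value are assumptions chosen for illustration (the page does not record when the report was generated).

```python
from datetime import datetime, timedelta


def days_since(last_scraped: datetime, now: datetime) -> int:
    """Whole days elapsed between the last successful scrape and `now`."""
    return (now - last_scraped).days


def is_stale(last_scraped: datetime, now: datetime, threshold_days: int = 3) -> bool:
    """True when the last scrape is older than the threshold."""
    return now - last_scraped > timedelta(days=threshold_days)


last_scraped = datetime.fromisoformat("2026-03-27 14:00:50.522115")
now = datetime(2026, 4, 22, 15, 0, 0)  # hypothetical report time

print(days_since(last_scraped, now))  # 26
print(is_stale(last_scraped, now))    # True
```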
Parser & Healing Diagnosis (working)
Parser Status: ✓ Working
Status Reason: N/A
Last Healing Attempt: Not attempted
Parser Source (src/parsers/custom/facility_012860_parser.py)
"""Parser for Al's Mini Storage — Forsyth, MT (WordPress static listing).

The site is a WordPress page that lists available unit sizes in a
``<ul class="two-column-list">`` element inside ``<div class="entry-content">``.
No pricing is published; prices must be obtained by calling the facility.
A second ``<ul class="two-column-list">`` contains facility features/amenities.
"""

from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult


class Facility012860Parser(BaseParser):
    """Extract storage unit sizes from Al's Mini Storage (Forsyth, MT).

    The page lists sizes such as "5 x 10" in a two-column unordered list.
    Pricing is not published on the website; ``price`` is always ``None``.
    Facility features (amenities) are captured in ``metadata``.
    """

    platform = "custom_facility_012860"

    # Matches size strings like "5 x 10", "10 x 20", "7 x 15"
    _SIZE_RE = re.compile(r"^\s*(\d+)\s*x\s*(\d+)\s*$", re.IGNORECASE)

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)

        entry = soup.find("div", class_="entry-content")
        if entry is None:
            result.warnings.append("Could not find div.entry-content on page")
            return result

        size_items: list[str] = []
        feature_items: list[str] = []

        for ul in entry.find_all("ul", class_="two-column-list"):
            items = [li.get_text(strip=True) for li in ul.find_all("li")]
            # Decide whether this list holds sizes or features: it is a size
            # list if any item matches the "width x length" dimension pattern.
            if any(self._SIZE_RE.match(item) for item in items):
                size_items.extend(items)
            else:
                feature_items.extend(items)

        for size_text in size_items:
            m = self._SIZE_RE.match(size_text)
            if not m:
                continue

            width = float(m.group(1))
            length = float(m.group(2))
            size = f"{int(width)}' x {int(length)}'"

            unit = UnitResult(
                size=size,
                description=size_text.strip(),
                price=None,
                metadata={
                    "width": width,
                    "length": length,
                    "sqft": width * length,
                    # Copy so units do not share one mutable list.
                    "amenities": list(feature_items),
                    "price_note": "Call facility for pricing",
                },
            )
            result.units.append(unit)

        if not result.units:
            result.warnings.append("No unit sizes found in two-column-list elements")

        return result
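
The parser only emits a unit when a list item matches `_SIZE_RE` exactly, so the strictness of that pattern determines how many units are found. The standalone sketch below reuses the same regular expression to show which strings it accepts; the sample strings and the `parse_size` helper are hypothetical and are not taken from the live page.

```python
import re

# Same pattern as Facility012860Parser._SIZE_RE.
SIZE_RE = re.compile(r"^\s*(\d+)\s*x\s*(\d+)\s*$", re.IGNORECASE)


def parse_size(text: str):
    """Return (width, length) as ints, or None if the text is not a bare size."""
    m = SIZE_RE.match(text)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


# Hypothetical list-item texts, for illustration only.
samples = ["5 x 10", "10X20", " 7 x 15 ", "5' x 10'", "10 x 20 (drive-up)", "24-hour access"]
for s in samples:
    print(repr(s), "->", parse_size(s))
```

Items with foot marks or trailing text do not match, so a page that changes its size formatting would yield zero units and trigger the "No unit sizes found" warning.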

Scrape Runs (7)

Run #8 Details

Status: started
Parser Used: N/A
Platform Detected: N/A
Units Found: 0
Stage Reached: started
Timestamp: 2026-03-07 01:05:34.194439

All Failures for this Facility (1)

fetch DatatypeMismatch unknown unknown permanent Run #17 | 2026-03-07 01:42:25.505898

column "success" is of type boolean but expression is of type integer
LINE 3: ... VALUES ('012860', 17, '012860_20260307T014225Z.html', 0)
                                                                  ^
HINT: You will need to rewrite or cast the expression.

Stack trace
Traceback (most recent call last):
  File "/app/src/pipeline.py", line 329, in _process_facility
    manifest_id = storage.insert_snapshot_manifest(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/src/db/pg_backend.py", line 615, in insert_snapshot_manifest
    row = self._execute_returning(
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/src/db/pg_backend.py", line 54, in _execute_returning
    cur.execute(sql, params)
  File "/app/.venv/lib/python3.11/site-packages/psycopg2/extras.py", line 236, in execute
    return super().execute(query, vars)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.errors.DatatypeMismatch: column "success" is of type boolean but expression is of type integer
LINE 3: ...    VALUES ('012860', 17, '012860_20260307T014225Z.html', 0)
                                                                     ^
HINT:  You will need to rewrite or cast the expression.
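
The trace shows the integer literal `0` being sent for the boolean column `success`. PostgreSQL does not implicitly cast integer to boolean in an INSERT, and psycopg2 renders a Python `int` as an integer literal but a Python `bool` as a boolean literal. One possible fix is to coerce the flag to `bool` before building the parameters. The sketch below assumes this approach; the table name, the column names other than `success`, and the helper name are hypothetical and do not come from `pg_backend.py`.

```python
# Hypothetical statement; only the "success" column name comes from the error.
INSERT_SQL = """
    INSERT INTO snapshot_manifest (facility_id, run_id, filename, success)
    VALUES (%s, %s, %s, %s)
    RETURNING id
"""


def snapshot_manifest_params(facility_id: str, run_id: int, filename: str, success) -> tuple:
    """Build query parameters, coercing `success` to a real bool.

    SQLite-style 0/1 flags become False/True, which psycopg2 adapts to
    PostgreSQL boolean literals instead of integers.
    """
    return (facility_id, run_id, filename, bool(success))


params = snapshot_manifest_params("012860", 17, "012860_20260307T014225Z.html", 0)
print(params)  # ('012860', 17, '012860_20260307T014225Z.html', False)
# With an open cursor this would be run as: cur.execute(INSERT_SQL, params)
```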
