Facility: 004192

Mallett's Bay Self-Storage Llc

Stale Data Warning: This facility has not been successfully scraped in 26 days (threshold: 3 days). Data may be outdated.
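
The warning above compares the age of the last successful scrape against a threshold. A minimal sketch of such a check follows, assuming a simple "older than N days" rule. The `is_stale` helper and the simulated current time are hypothetical and are not taken from the dashboard's code.

```python
from datetime import datetime, timedelta

# Threshold quoted in the warning text (3 days).
STALE_THRESHOLD = timedelta(days=3)


def is_stale(last_scraped: datetime, now: datetime,
             threshold: timedelta = STALE_THRESHOLD) -> bool:
    """Return True when the last successful scrape is older than the threshold."""
    return now - last_scraped > threshold


# "Last Scraped" value shown on this page.
last_scraped = datetime.fromisoformat("2026-03-27 13:57:54.781983")
# Hypothetical current time, 26 days later, to mirror the warning.
now = last_scraped + timedelta(days=26)

print(is_stale(last_scraped, now))
```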
Facility Information (active)
Facility ID
004192
Name
Mallett's Bay Self-Storage Llc
URL
https://mallettsbaystorage.net/sizes-and-rates/
Address
115 Heineberg Dr, Colchester, VT 05446, USA
Platform
custom_facility_004192
Parser File
src/parsers/custom/facility_004192_parser.py
Last Scraped
2026-03-27 13:57:54.781983
Created
2026-03-14 16:21:53.706708
Updated
2026-03-27 13:57:54.810242
Parser & Healing Diagnosis (working)
Parser Status
✓ Working
Status Reason
N/A
Last Healing Attempt
Not attempted
Parser Source (src/parsers/custom/facility_004192_parser.py)
"""Parser for Mallett's Bay Self-Storage Llc (sizes only, no prices listed)."""

from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult


class Facility004192Parser(BaseParser):
    """Extract storage units from Mallett's Bay Self-Storage Llc.

    The sizes-and-rates page lists unit dimensions in an <h2> tag
    separated by en-dashes (e.g. "5×10 – 10×10 – 10×15 – 10×20 – 10×30").
    No prices are published; customers must call for rates.
    """

    platform = "custom_facility_004192"

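    # Matches dimensions like 5×10, 10x20, 5'x10', etc.
    # The optional foot mark (', ′ or ’) after the first number is what lets 5'x10' match.
    _SIZE_RE = re.compile(
        r"(\d+)\s*['\u2032\u2019]?\s*[xX\u00d7]\s*(\d+)"
    )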

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)

        for tag in soup.find_all(["script", "style"]):
            tag.decompose()

        units: list[UnitResult] = []
        seen: set[tuple[int, int]] = set()

        # Strategy 1: Look for the heading that contains the size listing.
        # The page uses <h2> with sizes separated by en-dashes.
        for heading in soup.find_all(["h1", "h2", "h3"]):
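            # Join nested text nodes with a space so adjacent sizes cannot fuse
            # into one number (e.g. "5×10" + "10×10" becoming "5×1010×10").
            text = heading.get_text(" ", strip=True)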
            matches = list(self._SIZE_RE.finditer(text))
            # Only use headings with 2+ size matches (the listing row)
            if len(matches) >= 2:
                for m in matches:
                    w = int(m.group(1))
                    ln = int(m.group(2))
                    if w < 3 or ln < 3:
                        continue
                    key = (w, ln)
                    if key in seen:
                        continue
                    seen.add(key)
                    size_text = f"{w}x{ln}"
                    unit = UnitResult()
                    unit.size = size_text
                    _, _, sq = self.normalize_size(size_text)
                    unit.metadata = {"width": w, "length": ln, "sqft": sq}
                    units.append(unit)

        # Strategy 2: Fallback -- scan all text for size×size patterns
        if not units:
            body_text = soup.get_text(separator="\n")
            for m in self._SIZE_RE.finditer(body_text):
                w = int(m.group(1))
                ln = int(m.group(2))
                if w < 3 or ln < 3:
                    continue
                key = (w, ln)
                if key in seen:
                    continue
                seen.add(key)
                size_text = f"{w}x{ln}"
                unit = UnitResult()
                unit.size = size_text
                _, _, sq = self.normalize_size(size_text)
                unit.metadata = {"width": w, "length": ln, "sqft": sq}
                units.append(unit)

        if units:
            result.units = units
            result.warnings.append("No prices listed on site; sizes only")
        else:
            result.warnings.append("No units found")

        return result
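
The heading strategy can be tried in isolation, without `BaseParser` or BeautifulSoup. The sketch below uses a simplified copy of the size pattern and the example heading text from the class docstring. The `extract_sizes` helper is hypothetical and only illustrates the match, filter, and de-duplicate steps.

```python
import re

# Simplified stand-in for the parser's size pattern: digits, a separator, digits.
SIZE_RE = re.compile(r"(\d+)\s*[xX\u00d7]\s*(\d+)")


def extract_sizes(text: str, min_side: int = 3) -> list[str]:
    """Return unique WxL sizes in order of appearance, skipping sides under min_side."""
    sizes: list[str] = []
    seen: set[tuple[int, int]] = set()
    for m in SIZE_RE.finditer(text):
        w, ln = int(m.group(1)), int(m.group(2))
        if w < min_side or ln < min_side:
            continue  # same lower bound the parser applies
        if (w, ln) in seen:
            continue  # drop repeated sizes
        seen.add((w, ln))
        sizes.append(f"{w}x{ln}")
    return sizes


heading = "5×10 – 10×10 – 10×15 – 10×20 – 10×30"
print(extract_sizes(heading))
# → ['5x10', '10x10', '10x15', '10x20', '10x30']
```

The result lines up with the five parsed units recorded for Run #755.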

Scrape Runs (5)

Run #755 Details

Status
exported
Parser Used
Facility004192Parser
Platform Detected
unknown
Units Found
5
Stage Reached
exported
Timestamp
2026-03-21 18:51:02.426404
Timing
Stage Duration
Fetch 3505 ms
Detect 12 ms
Parse 5 ms
Export 35 ms
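
To see where the run spent its time, the stage durations can be summed. The sketch below is only an illustration: the `parse_timings` helper is hypothetical, and it accepts the stage name and value with or without spaces between them.

```python
import re

# Stage name, then an integer duration in milliseconds; spaces are optional.
TIMING_RE = re.compile(r"([A-Za-z]+)\s*(\d+)\s*ms")


def parse_timings(lines: list[str]) -> dict[str, int]:
    """Map each stage name to its duration in milliseconds."""
    timings: dict[str, int] = {}
    for line in lines:
        m = TIMING_RE.fullmatch(line.strip())
        if m:
            timings[m.group(1)] = int(m.group(2))
    return timings


timings = parse_timings(["Fetch3505ms", "Detect12ms", "Parse5ms", "Export35ms"])
total_ms = sum(timings.values())
fetch_share = timings["Fetch"] / total_ms

print(total_ms)              # → 3557
print(f"{fetch_share:.1%}")  # → 98.5%
```

For this run, nearly all of the time goes to the fetch stage. Detection, parsing, and export together take 52 ms.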

Snapshot: 004192_20260321T185105Z.html
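
The snapshot filename appears to combine the facility ID with a compact UTC timestamp. The sketch below splits it on that assumption. The naming convention is inferred from this one filename, and `parse_snapshot_name` is a hypothetical helper.

```python
from datetime import datetime, timezone


def parse_snapshot_name(name: str) -> tuple[str, datetime]:
    """Split '<facility_id>_<YYYYMMDDTHHMMSSZ>.html' into its two parts."""
    stem = name.rsplit(".", 1)[0]              # drop the file extension
    facility_id, stamp = stem.split("_", 1)    # ID first, timestamp second
    # Assumption: the trailing "Z" marks the timestamp as UTC.
    captured_at = datetime.strptime(stamp, "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)
    return facility_id, captured_at


facility_id, captured_at = parse_snapshot_name("004192_20260321T185105Z.html")
print(facility_id, captured_at.isoformat())
# → 004192 2026-03-21T18:51:05+00:00
```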

Parsed Units (5)

5x10

No price

10x10

No price

10x15

No price

10x20

No price

10x30

No price
