Facility: 004192
Mallett's Bay Self-Storage Llc
- Facility ID
- 004192
- Name
- Mallett's Bay Self-Storage Llc
- URL
- https://mallettsbaystorage.net/sizes-and-rates/
- Address
- 115 Heineberg Dr, Colchester, VT 05446, USA
- Platform
- custom_facility_004192
- Parser File
- src/parsers/custom/facility_004192_parser.py
- Last Scraped
- 2026-03-27 13:57:54.781983
- Created
- 2026-03-14 16:21:53.706708
- Updated
- 2026-03-27 13:57:54.810242
- Parser Status
- ✓ Working
- Status Reason
- N/A
- Last Healing Attempt
- Not attempted
Parser Source (src/parsers/custom/facility_004192_parser.py)
```python
"""Parser for Mallett's Bay Self-Storage Llc (sizes only, no prices listed)."""
from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult


class Facility004192Parser(BaseParser):
    """Extract storage units from Mallett's Bay Self-Storage Llc.

    The sizes-and-rates page lists unit dimensions in an <h2> tag
    separated by en-dashes (e.g. "5×10 – 10×10 – 10×15 – 10×20 – 10×30").
    No prices are published; customers must call for rates.
    """

    platform = "custom_facility_004192"

    # Matches dimensions like 5×10, 10x20, or 10 X 15 -- lowercase x,
    # uppercase X, or the × multiplication sign, with optional spaces.
    _SIZE_RE = re.compile(r"(\d+)\s*[xX\u00d7]\s*(\d+)")

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(
            platform=self.platform, parser_name=self.__class__.__name__
        )

        # Drop script/style blocks so the full-text fallback stays clean.
        for tag in soup.find_all(["script", "style"]):
            tag.decompose()

        units: list[UnitResult] = []
        seen: set[tuple[int, int]] = set()

        # Strategy 1: look for the heading that contains the size listing.
        # The page uses an <h2> with sizes separated by en-dashes.
        for heading in soup.find_all(["h1", "h2", "h3"]):
            text = heading.get_text(strip=True)
            matches = list(self._SIZE_RE.finditer(text))
            # Only use headings with 2+ size matches (the listing row).
            if len(matches) >= 2:
                for m in matches:
                    w = int(m.group(1))
                    ln = int(m.group(2))
                    if w < 3 or ln < 3:
                        continue
                    key = (w, ln)
                    if key in seen:
                        continue
                    seen.add(key)
                    size_text = f"{w}x{ln}"
                    unit = UnitResult()
                    unit.size = size_text
                    _, _, sq = self.normalize_size(size_text)
                    unit.metadata = {"width": w, "length": ln, "sqft": sq}
                    units.append(unit)

        # Strategy 2: fallback -- scan all page text for size×size patterns.
        if not units:
            body_text = soup.get_text(separator="\n")
            for m in self._SIZE_RE.finditer(body_text):
                w = int(m.group(1))
                ln = int(m.group(2))
                if w < 3 or ln < 3:
                    continue
                key = (w, ln)
                if key in seen:
                    continue
                seen.add(key)
                size_text = f"{w}x{ln}"
                unit = UnitResult()
                unit.size = size_text
                _, _, sq = self.normalize_size(size_text)
                unit.metadata = {"width": w, "length": ln, "sqft": sq}
                units.append(unit)

        if units:
            result.units = units
            result.warnings.append("No prices listed on site; sizes only")
        else:
            result.warnings.append("No units found")
        return result
```
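The heading strategy above turns on `_SIZE_RE` pulling dimension pairs out of the en-dash-separated listing while the `w < 3 or ln < 3` guard and the `seen` set filter noise and duplicates. A minimal standalone sketch, reusing the same pattern against the example string from the docstring:

```python
import re

# Same pattern as Facility004192Parser._SIZE_RE.
SIZE_RE = re.compile(r"(\d+)\s*[xX\u00d7]\s*(\d+)")

heading_text = "5×10 – 10×10 – 10×15 – 10×20 – 10×30"

seen: set[tuple[int, int]] = set()
sizes: list[str] = []
for m in SIZE_RE.finditer(heading_text):
    w, ln = int(m.group(1)), int(m.group(2))
    if w < 3 or ln < 3:   # skip tiny matches (e.g. "2x4" in unrelated copy)
        continue
    if (w, ln) in seen:   # dedupe repeated listings
        continue
    seen.add((w, ln))
    sizes.append(f"{w}x{ln}")

print(sizes)  # → ['5x10', '10x10', '10x15', '10x20', '10x30']
```

Note the character class covers the literal `×` (U+00D7) the site uses, not just ASCII `x`/`X`, which is why the heading text matches at all.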
Scrape Runs (5)
| Run | Timestamp | Units | Parser | Status |
|---|---|---|---|---|
| #1971 | 2026-03-27 13:57:52.055889 | 5 | Facility004192Parser | exported |
| #1970 | 2026-03-27 13:57:51.931391 | 5 | Facility004192Parser | exported |
| #1248 | 2026-03-23 02:59:17.498845 | 5 | Facility004192Parser | exported |
| #755 | 2026-03-21 18:51:02.426404 | 5 | Facility004192Parser | exported |
| #304 | 2026-03-14 16:32:10.627647 | 5 | Facility004192Parser | exported |
Run #1970 Details
- Status
- exported
- Parser Used
- Facility004192Parser
- Platform Detected
- unknown
- Units Found
- 5
- Stage Reached
- exported
- Timestamp
- 2026-03-27 13:57:51.931391
Timing
| Stage | Duration |
|---|---|
| Fetch | 2779ms |
| Detect | 5ms |
| Parse | 3ms |
| Export | 19ms |
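Per the timing table, the network fetch dominates end-to-end latency; detection, parsing, and export are all single-digit to low-double-digit milliseconds. A quick sketch totaling the stages (values copied from the table above):

```python
# Stage durations for Run #1970, in milliseconds (from the timing table).
stages = {"Fetch": 2779, "Detect": 5, "Parse": 3, "Export": 19}

total_ms = sum(stages.values())
fetch_share = stages["Fetch"] / total_ms

print(total_ms)              # → 2806
print(round(fetch_share, 2)) # → 0.99
```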
Snapshot: 004192_20260327T135754Z.html
Parsed Units (5)
| Size | Price |
|---|---|
| 5x10 | No price |
| 10x10 | No price |
| 10x15 | No price |
| 10x20 | No price |
| 10x30 | No price |
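The `sqft` values stored in each unit's metadata come from `BaseParser.normalize_size`, whose implementation isn't shown in this record. A hypothetical sketch of the assumed contract (a `"WxL"` string in, a `(width, length, sqft)` tuple out) -- the real base-class method may differ:

```python
def normalize_size(size_text: str) -> tuple[int, int, int]:
    """Hypothetical stand-in for BaseParser.normalize_size: split a
    'WxL' size string into (width, length, sqft). Assumes the parser
    always feeds it the normalized f"{w}x{ln}" form."""
    w_str, _, l_str = size_text.lower().partition("x")
    w, ln = int(w_str), int(l_str)
    return w, ln, w * ln


print(normalize_size("10x15"))  # → (10, 15, 150)
print(normalize_size("5x10"))   # → (5, 10, 50)
```

Under this assumption, the five parsed units above would carry square footages of 50, 100, 150, 200, and 300.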