Facility 002106 - Facility Scrapers

Stale Data Warning: This facility has not been successfully scraped in 26 days (threshold: 3 days). Data may be outdated.

Facility Information (active)

Facility ID: 002106
Name: Airport Mini Storage
URL: https://www.airportministorage106.com/storage-unit.html

Address: 106 Industrial Park Dr, Sevierville, TN 37862, USA
Platform: custom_facility_002106
Parser File: src/parsers/custom/facility_002106_parser.py

Last Scraped: 2026-03-27 13:49:31.324661
Created: 2026-03-14 16:21:53.706708
Updated: 2026-03-27 13:49:31.352720
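
The stale-data warning at the top of this page follows from the Last Scraped timestamp and the 3-day threshold. A minimal sketch of that check is below. The function names and the `now` value are assumptions made for illustration; the dashboard's actual implementation is not shown on this page.

```python
from datetime import datetime, timedelta


def days_since(last_scraped: str, now: datetime) -> int:
    """Whole days elapsed between the last successful scrape and `now`."""
    return (now - datetime.fromisoformat(last_scraped)).days


def is_stale(last_scraped: str, now: datetime, threshold_days: int = 3) -> bool:
    """True when the last scrape is older than the threshold."""
    age = now - datetime.fromisoformat(last_scraped)
    return age > timedelta(days=threshold_days)


# Hypothetical "current time", chosen so the gap matches the 26 days reported above.
now = datetime(2026, 4, 22, 14, 0)
last_scraped = "2026-03-27 13:49:31.324661"

print(days_since(last_scraped, now), is_stale(last_scraped, now))
```

Under these assumptions the gap is 26 days, well past the 3-day threshold, so the facility is flagged.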

Parser & Healing Diagnosis (working)

Parser Status: ✓ Working
Status Reason: N/A

Last Healing Attempt: Not attempted

Parser Source (src/parsers/custom/facility_002106_parser.py)

"""Parser for Airport Mini Storage (Sevierville, TN).

This facility's website does not have a structured pricing table.
The storage-unit page mentions sizes in prose (e.g. "5' by 5'") and a
starting price ("$40 per month").  The parser extracts whatever size/price
pairs it can find from the page text, handling "by" as a dimension separator
in addition to "x" / "×".
"""

from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult


class Facility002106Parser(BaseParser):
    """Extract storage units from Airport Mini Storage."""

    platform = "custom_facility_002106"

    # Match dimensions like  5' by 5',  10x20,  10'×20', etc.
    _SIZE_RE = re.compile(
        r"(\d+)\s*['\u2019\u2032]?\s*"
        r"(?:by|[xX\u00d7])\s*"
        r"(\d+)\s*['\u2019\u2032]?",
    )

    # Match prices like  $40,  $125.00
    _PRICE_RE = re.compile(r"\$(\d[\d,.]*)")

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)

        for tag in soup.find_all(["script", "style"]):
            tag.decompose()

        body_text = soup.get_text(separator="\n")

        # Collect all sizes and prices from the page text
        sizes: list[tuple[int, int, str]] = []
        for m in self._SIZE_RE.finditer(body_text):
            w, ln = int(m.group(1)), int(m.group(2))
            # Filter out non-storage dimensions (too small or nonsensical)
            if w < 3 or ln < 3:
                continue
            raw = m.group(0).strip()
            sizes.append((w, ln, raw))

        prices: list[float] = []
        for m in self._PRICE_RE.finditer(body_text):
            price = self.normalize_price(m.group(1))
            if price is not None and price > 0:
                prices.append(price)

        # De-duplicate sizes
        seen_dims: set[tuple[int, int]] = set()
        unique_sizes: list[tuple[int, int, str]] = []
        for w, ln, raw in sizes:
            key = (w, ln)
            if key not in seen_dims:
                seen_dims.add(key)
                unique_sizes.append((w, ln, raw))

        if unique_sizes and prices:
            # If we have equal counts, pair them 1:1
            if len(unique_sizes) == len(prices):
                for (w, ln, raw), price in zip(unique_sizes, prices):
                    unit = UnitResult()
                    unit.size = raw
                    unit.price = price
                    _, _, sqft = self.normalize_size(f"{w}x{ln}")
                    unit.metadata = {"width": w, "length": ln, "sqft": sqft}
                    result.units.append(unit)
            else:
                # Unequal counts: attach the lowest price to each size as a starting price
                starting_price = min(prices)
                for w, ln, raw in unique_sizes:
                    unit = UnitResult()
                    unit.size = raw
                    unit.price = starting_price
                    _, _, sqft = self.normalize_size(f"{w}x{ln}")
                    unit.metadata = {"width": w, "length": ln, "sqft": sqft}
                    unit.description = f"Starting at ${starting_price:.0f}/mo"
                    result.units.append(unit)
        elif unique_sizes:
            # Sizes without prices
            for w, ln, raw in unique_sizes:
                unit = UnitResult()
                unit.size = raw
                _, _, sqft = self.normalize_size(f"{w}x{ln}")
                unit.metadata = {"width": w, "length": ln, "sqft": sqft}
                result.units.append(unit)
        elif prices:
            # Prices without sizes — record the starting price
            unit = UnitResult()
            unit.price = min(prices)
            unit.description = f"Starting at ${min(prices):.0f}/mo"
            result.units.append(unit)

        if not result.units:
            result.warnings.append("No units found on page")

        return result
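
To try the extraction logic without the project's `BaseParser`, the standalone sketch below copies the two patterns out of the class and mirrors the pairing rules with plain dicts. The `extract` helper, the inline price conversion (a simplified stand-in for `normalize_price`), and the sample sentences are all assumptions for illustration, not part of the parser or the facility's page.

```python
from __future__ import annotations

import re

# Copies of the parser's two patterns, lifted out of the class for illustration.
SIZE_RE = re.compile(
    r"(\d+)\s*['\u2019\u2032]?\s*"
    r"(?:by|[xX\u00d7])\s*"
    r"(\d+)\s*['\u2019\u2032]?",
)
PRICE_RE = re.compile(r"\$(\d[\d,.]*)")


def extract(text: str) -> list[dict]:
    """Mirror the parser's size/price pairing using plain dicts (hypothetical helper)."""
    sizes: list[tuple[int, int]] = []
    for m in SIZE_RE.finditer(text):
        dims = (int(m.group(1)), int(m.group(2)))
        # Same filter as the parser: skip tiny dimensions, then de-duplicate.
        if min(dims) >= 3 and dims not in sizes:
            sizes.append(dims)

    # Simplified stand-in for BaseParser.normalize_price (an assumption).
    prices = [
        float(m.group(1).replace(",", "").rstrip("."))
        for m in PRICE_RE.finditer(text)
    ]

    if len(sizes) == len(prices):
        # Equal counts (including zero of each): pair 1:1.
        return [{"size": s, "price": p} for s, p in zip(sizes, prices)]
    if not sizes:
        # Prices without sizes: record only the lowest price.
        return [{"size": None, "price": min(prices)}]
    # Unequal counts or no prices: attach the lowest price (or None) to every size.
    lowest = min(prices) if prices else None
    return [{"size": s, "price": lowest} for s in sizes]


# Hypothetical sample text, not the real page content.
print(extract("Units range from 5' by 5' to 10x20, starting at $40 per month."))
```

With two sizes and one price in the sample, the counts differ, so both sizes receive the single $40 starting price. This is the same fallback branch the parser takes.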

Stage	Duration
Fetch	3328ms
Detect	11ms
Parse	4ms
Export	18ms
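
The stage timings above add up to 3361 ms, almost all of it in the fetch stage. A short sketch of that arithmetic follows; the dictionary layout is an assumption, and the values are copied from the table.

```python
# Stage durations in milliseconds, copied from the table above.
stage_ms = {"Fetch": 3328, "Detect": 11, "Parse": 4, "Export": 18}

total_ms = sum(stage_ms.values())
shares = {stage: ms / total_ms for stage, ms in stage_ms.items()}

print(f"total: {total_ms} ms")
for stage, share in shares.items():
    print(f"{stage}: {share:.1%}")
```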

Scrape Runs (5)

Run #1757 Details

Parsed Units (1)

5' by 5'

HTML Snapshot — Run #1757