Facility: 043481
Waitsburg Storage
- Facility ID: 043481
- Name: Waitsburg Storage
- URL: http://www.waitsburgstorage.com/
- Address: N/A
- Platform: custom_facility_043481
- Parser File: src/parsers/custom/facility_043481_parser.py
- Last Scraped: 2026-03-23 03:17:17.826847
- Created: 2026-03-06 23:45:35.865957
- Updated: 2026-03-23 03:17:17.843629
- Parser Status: ✓ Working
- Status Reason: N/A
- Last Healing Attempt: Not attempted
Parser Source (src/parsers/custom/facility_043481_parser.py)
"""Parser for Waitsburg Town & Country Storage (facility 043481).

This is a simple static HTML page that lists available unit sizes as plain
text but does not publish pricing information. The parser extracts the size
offerings from the descriptive paragraph so the facility is represented in
the database even though no prices can be collected.
"""
from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult


class Facility043481Parser(BaseParser):
    """Extract storage unit sizes from Waitsburg Town & Country Storage.

    The page contains a sentence such as:
        "We have storage units in the following sizes: 5 x 10, 10 x 10, 15 x 10 & 20 x 10."

    No pricing is published on the site, so ``price`` and ``sale_price`` are
    left as ``None`` and a warning is recorded.
    """

    platform = "custom_facility_043481"

    # Matches dimension tokens like "5 x 10", "10 x 10", "15 x 10", "20 x 10"
    _SIZE_RE = re.compile(
        r"(\d+(?:\.\d+)?)\s*[xX\u00d7]\s*(\d+(?:\.\d+)?)",
    )

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)

        # Find the paragraph/cell that mentions unit sizes; skip <style>/<script> elements
        size_text: str | None = None
        for element in soup.find_all(string=re.compile(r"sizes?", re.IGNORECASE)):
            parent = element.parent
            if not parent:
                continue
            if parent.name in ("style", "script"):
                continue
            size_text = parent.get_text(separator=" ", strip=True)
            break

        if not size_text:
            result.warnings.append("Could not locate a unit-sizes description on the page")
            return result

        for match in self._SIZE_RE.finditer(size_text):
            width = float(match.group(1))
            length = float(match.group(2))
            # ":g" drops a trailing ".0" but keeps fractional sizes such as 7.5,
            # which int() would silently truncate.
            size_label = f"{width:g}' x {length:g}'"
            unit = UnitResult(
                size=size_label,
                description=f"Unit size {size_label} — no pricing published on site",
                price=None,
                sale_price=None,
                metadata={
                    "width": width,
                    "length": length,
                    "sqft": width * length,
                    "no_pricing": True,
                },
                url=url or None,
            )
            result.units.append(unit)

        if result.units:
            result.warnings.append(
                "No pricing information is published on this facility's website; "
                "unit sizes were extracted but all price fields are None."
            )
        else:
            result.warnings.append("Size description found but no dimension patterns matched")

        return result
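The size pattern can be exercised on its own, without `BaseParser` or BeautifulSoup. The following is a minimal standalone sketch that copies the regular expression from `_SIZE_RE` and runs it against the sample sentence quoted in the class docstring. The helper name `extract_sizes` is hypothetical and is not part of the parser.

```python
import re

# Same pattern as the parser's _SIZE_RE: width, an "x"/"X"/"×" separator, length.
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*[xX\u00d7]\s*(\d+(?:\.\d+)?)")


def extract_sizes(text: str) -> list[tuple[float, float]]:
    """Return (width, length) pairs for every dimension token in ``text``."""
    return [(float(m.group(1)), float(m.group(2))) for m in SIZE_RE.finditer(text)]


# Sample sentence taken from the parser's class docstring.
sample = (
    "We have storage units in the following sizes: "
    "5 x 10, 10 x 10, 15 x 10 & 20 x 10."
)
sizes = extract_sizes(sample)
for width, length in sizes:
    print(f"{width:g} x {length:g} -> {width * length:g} sq ft")
```

Text with no dimension tokens yields an empty list, which corresponds to the parser's "Size description found but no dimension patterns matched" warning.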
Scrape Runs (4)
- exported | Run #1454 | 2026-03-23 03:17:14.993700 | 4 units | Facility043481Parser
- exported | Run #961 | 2026-03-21 19:10:02.486012 | 4 units | Facility043481Parser
- exported | Run #514 | 2026-03-14 16:52:57.236497 | 4 units | Facility043481Parser
- exported | Run #113 | 2026-03-14 01:04:18.406454 | 4 units | Facility043481Parser
Run #961 Details
- Status: exported
- Parser Used: Facility043481Parser
- Platform Detected: table_layout
- Units Found: 4
- Stage Reached: exported
- Timestamp: 2026-03-21 19:10:02.486012
Timing
| Stage | Duration |
|---|---|
| Fetch | 3869ms |
| Detect | 1ms |
| Parse | 1ms |
| Export | 7ms |
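To put the table in proportion, the sketch below sums the four stage durations and computes each stage's share of the run. The values are copied from the table above. The variable names are illustrative and not part of the scraper.

```python
# Stage durations in milliseconds, copied from the Run #961 timing table.
durations_ms = {"Fetch": 3869, "Detect": 1, "Parse": 1, "Export": 7}

total_ms = sum(durations_ms.values())
shares = {stage: 100 * ms / total_ms for stage, ms in durations_ms.items()}

print(f"Total: {total_ms}ms")
for stage, share in shares.items():
    print(f"{stage}: {share:.1f}%")
```

The total is 3878ms, and the fetch stage accounts for nearly all of it, so parsing cost is negligible for this facility.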
Snapshot: 043481_20260321T191006Z.html
Parsed Units (4)
- 5' x 10': No price
- 10' x 10': No price
- 15' x 10': No price
- 20' x 10': No price