Facility: 080478 · C&T Storage

- Facility ID: 080478
- Name: C&T Storage
- URL: https://www.candtstorage.com/
- Address: N/A
- Platform: custom_facility_080478
- Parser File: src/parsers/custom/facility_080478_parser.py
- Last Scraped: 2026-03-23 03:21:51.319900
- Created: 2026-03-06 23:45:35.865957
- Updated: 2026-03-23 03:21:51.319900
- Parser Status: ⚠ Needs Fix
- Status Reason: Parser returned 0 units
- Last Healing Attempt: Not attempted
Parser Source (src/parsers/custom/facility_080478_parser.py)
"""Parser for C&T Storage (Google Sites) facility.
This is a Google Sites page that lists storage unit sizes with descriptions
but no pricing. Sizes are split across separate spans inside CjVfdc containers
within h3 headings. The description for each unit appears in a sibling element.
"""
from __future__ import annotations
import re
from bs4 import BeautifulSoup
from src.parsers.base import BaseParser, ParseResult, UnitResult
# Stops that indicate the end of the unit listing section
_STOP_TEXTS = {"larger sizes or outside parking", "rv parking", "indoor & outside storage", "we're here for you"}
# Regex to detect a valid size pattern like "5 X 8" or "10 X 20"
_SIZE_PATTERN = re.compile(r"^\d+\s*X\s*\d+", re.IGNORECASE)
class Facility080478Parser(BaseParser):
"""Extract storage units from C&T Storage (candtstorage.com).
The site is built on Google Sites. Unit sizes are rendered as three
separate spans (width, "X", length) inside a ``div.CjVfdc`` container
within an h3 heading. An optional description follows in a sibling div.
No pricing data is available on this page.
"""
platform = "custom_facility_080478"
def parse(self, html: str, url: str = "") -> ParseResult:
soup = BeautifulSoup(html, "lxml")
result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)
# Each unit entry is an h3 element with class CDt4Ke that contains
# a div.CjVfdc with the size spans inside.
h3_units = soup.find_all("h3", class_="CDt4Ke")
for h3 in h3_units:
container = h3.find("div", class_="CjVfdc")
if not container:
continue
spans = container.find_all("span")
# Filter out empty spans (Google Sites adds empty decorative spans)
span_texts = [s.get_text(strip=True) for s in spans if s.get_text(strip=True)]
# Reconstruct size text from spans: expect [width, "X", length, ...]
# Filter to spans that look like a size pattern
size_text = " ".join(span_texts).strip()
# Skip non-unit headings (section titles, footer text, etc.)
if not _SIZE_PATTERN.match(size_text):
text_lower = size_text.lower()
if "rv parking" in text_lower:
# Include RV Parking as a special unit type
unit = UnitResult(
size="RV Parking",
description="RV Parking",
url=url,
)
result.units.append(unit)
continue
# Parse the size: e.g. "5 X 8 Storage" → "5x8"
size_match = re.match(r"(\d+)\s*X\s*(\d+)", size_text, re.IGNORECASE)
if not size_match:
continue
width = float(size_match.group(1))
length = float(size_match.group(2))
normalized_size = f"{int(width)}x{int(length)}"
w, ln, sq = self.normalize_size(normalized_size)
# span_texts (empty-filtered) is now [width, "X", length] or [width, "X", length, type]
# Extract unit type label (e.g. "Storage") if present as the 4th span
unit_type = span_texts[3] if len(span_texts) > 3 else ""
# Find the sibling description element.
# Structure: h3 → (grandparent chain) → unnamed div with 2 children:
# child[0] = the h3 wrapper chain, child[1] = description div
description = ""
try:
# Walk up: h3.parent (tyJCtd div) → jXK9ad-SmKAyb → hJDwNd... → oKdM2c → unnamed div
ancestor = h3.parent.parent.parent.parent.parent
sibling_children = [
c for c in ancestor.children if hasattr(c, "get_text")
]
if len(sibling_children) > 1:
description = sibling_children[1].get_text(strip=True)
except (AttributeError, IndexError):
pass
# Build the display size label
display_size = f"{int(width)}' x {int(length)}'"
if unit_type and unit_type.lower() not in ("storage",):
display_size = f"{display_size} {unit_type}"
unit = UnitResult(
size=display_size,
description=description or unit_type or None,
url=url,
metadata={
"width": w,
"length": ln,
"sqft": sq,
},
)
result.units.append(unit)
if not result.units:
result.warnings.append("No units found on page")
return result
Scrape Runs (5)

- Run #1501 · 2026-03-23 03:21:47.118892 · Facility080478Parser · exported
- Run #1008 · 2026-03-21 19:15:07.320089 · Facility080478Parser · exported
- Run #561 · 2026-03-14 16:56:38.778064 · Facility080478Parser · exported
- Run #168 · 2026-03-14 05:00:46.716489 · Facility080478Parser · exported
- Run #91 · 2026-03-14 01:02:22.953166 · Facility080478Parser · exported
Run #168 Details

- Status: exported
- Parser Used: Facility080478Parser
- Platform Detected: table_layout
- Units Found: 0
- Stage Reached: exported
- Timestamp: 2026-03-14 05:00:46.716489
Timing
| Stage | Duration |
|---|---|
| Fetch | 2144ms |
| Detect | 30ms |
| Parse | 14ms |
| Export | 3ms |
Snapshot: 080478_20260314T050048Z.html
No units found in this run.
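Zero units against a saved snapshot usually means the obfuscated Google Sites class names the selectors depend on (`CDt4Ke`, `CjVfdc`) no longer appear in the fetched HTML. A minimal stdlib-only sketch (the `check_snapshot` helper and class list are illustrative assumptions, not part of the project) can confirm this against the snapshot file:

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count occurrences of the CSS classes the parser's selectors rely on."""

    def __init__(self, targets):
        super().__init__()
        self.counts = {t: 0 for t in targets}

    def handle_starttag(self, tag, attrs):
        # The class attribute may be absent; split multi-class values.
        classes = (dict(attrs).get("class") or "").split()
        for c in classes:
            if c in self.counts:
                self.counts[c] += 1


def check_snapshot(html: str) -> dict:
    """Hypothetical diagnostic: count the selector classes in a snapshot."""
    counter = ClassCounter(["CDt4Ke", "CjVfdc"])
    counter.feed(html)
    return counter.counts


sample = '<h3 class="CDt4Ke"><div class="CjVfdc"><span>5</span></div></h3>'
print(check_snapshot(sample))  # {'CDt4Ke': 1, 'CjVfdc': 1}
```

If either count is 0 against a fresh snapshot, the class names have likely rotated and the selectors in `facility_080478_parser.py` need to be re-derived from the current markup.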
All Failures for this Facility (5)

All five failures are the same parse-stage warning from the scraper, error code `no_units_extracted`, raised as `src.reporting.failure_reporter._WarningAsException: No units extracted for 080478`:

| Run | Timestamp | Stage | Error Type | Severity |
|---|---|---|---|---|
| N/A | 2026-03-23 03:21:51.316005 | parse | _WarningAsException | warning |
| N/A | 2026-03-21 19:15:11.926773 | parse | _WarningAsException | warning |
| N/A | 2026-03-14 16:56:41.149060 | parse | _WarningAsException | warning |
| N/A | 2026-03-14 05:00:48.923804 | parse | _WarningAsException | warning |
| N/A | 2026-03-14 01:02:25.025794 | parse | _WarningAsException | warning |
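Since the failure list is five copies of one underlying problem, a report like this one could collapse repeats before display. A hedged sketch, not the reporter's actual API: `summarize_failures` is a hypothetical helper that groups records by (stage, error type, code) and keeps the count plus first/last timestamps (ISO timestamps compare correctly as strings).

```python
from collections import defaultdict


def summarize_failures(failures: list) -> list:
    """Hypothetical helper: collapse repeated failure records into one
    summary row per (stage, error_type, code)."""
    groups = defaultdict(list)
    for f in failures:
        groups[(f["stage"], f["error_type"], f["code"])].append(f["timestamp"])
    return [
        {
            "stage": stage,
            "error_type": error_type,
            "code": code,
            "count": len(stamps),
            "first_seen": min(stamps),  # ISO-8601 strings sort chronologically
            "last_seen": max(stamps),
        }
        for (stage, error_type, code), stamps in groups.items()
    ]


records = [
    {"stage": "parse", "error_type": "_WarningAsException",
     "code": "no_units_extracted", "timestamp": "2026-03-14 01:02:25"},
    {"stage": "parse", "error_type": "_WarningAsException",
     "code": "no_units_extracted", "timestamp": "2026-03-23 03:21:51"},
]
print(summarize_failures(records))
# [{'stage': 'parse', ..., 'count': 2,
#   'first_seen': '2026-03-14 01:02:25', 'last_seen': '2026-03-23 03:21:51'}]
```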