Facility: 080478

C&T Storage

Stale Data Warning: This facility has not been successfully scraped in 30 days (threshold: 3 days). Data may be outdated.
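
The warning above is a simple age-versus-threshold comparison. A minimal sketch of that check follows, assuming staleness is measured from the last successful scrape timestamp; the function name `staleness` and the "now" value are hypothetical, and the dashboard's real implementation is not shown on this page.

```python
from datetime import datetime, timedelta


def staleness(last_success: datetime, now: datetime, threshold_days: int = 3):
    """Return (whole days since last success, whether that exceeds the threshold)."""
    age = now - last_success
    return age.days, age > timedelta(days=threshold_days)


# Hypothetical "now": 30 days after the Last Scraped value shown on this page.
last = datetime(2026, 3, 23, 3, 21, 51)
print(staleness(last, datetime(2026, 4, 22, 12, 0, 0)))
```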
Facility Information (active)
Facility ID
080478
Name
C&T Storage
URL
https://www.candtstorage.com/
Address
N/A
Platform
custom_facility_080478
Parser File
src/parsers/custom/facility_080478_parser.py
Last Scraped
2026-03-23 03:21:51.319900
Created
2026-03-06 23:45:35.865957
Updated
2026-03-23 03:21:51.319900
Parser & Healing Diagnosis (needs_fix)
Parser Status
⚠ Needs Fix
Status Reason
Parser returned 0 units
Last Healing Attempt
Not attempted
Parser Source (src/parsers/custom/facility_080478_parser.py)
"""Parser for C&T Storage (Google Sites) facility.

This is a Google Sites page that lists storage unit sizes with descriptions
but no pricing. Sizes are split across separate spans inside CjVfdc containers
within h3 headings. The description for each unit appears in a sibling element.
"""

from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult

# Heading texts that mark the end of the unit listing section.
# NOTE: not referenced anywhere else in this file; parse() never stops early on these.
_STOP_TEXTS = {
    "larger sizes or outside parking",
    "rv parking",
    "indoor & outside storage",
    "we're here for you",
}

# Regex to detect a valid size pattern like "5 X 8" or "10 X 20"
_SIZE_PATTERN = re.compile(r"^\d+\s*X\s*\d+", re.IGNORECASE)


class Facility080478Parser(BaseParser):
    """Extract storage units from C&T Storage (candtstorage.com).

    The site is built on Google Sites. Unit sizes are rendered as three
    separate spans (width, "X", length) inside a ``div.CjVfdc`` container
    within an h3 heading. An optional description follows in a sibling div.

    No pricing data is available on this page.
    """

    platform = "custom_facility_080478"

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)

        # Each unit entry is an h3 element with class CDt4Ke that contains
        # a div.CjVfdc with the size spans inside.
        h3_units = soup.find_all("h3", class_="CDt4Ke")

        for h3 in h3_units:
            container = h3.find("div", class_="CjVfdc")
            if not container:
                continue

            spans = container.find_all("span")
            # Filter out empty spans (Google Sites adds empty decorative spans)
            span_texts = [s.get_text(strip=True) for s in spans if s.get_text(strip=True)]

            # Reconstruct the size text by joining the non-empty span texts,
            # e.g. ["5", "X", "8", "Storage"] → "5 X 8 Storage".
            size_text = " ".join(span_texts).strip()

            # Skip non-unit headings (section titles, footer text, etc.)
            if not _SIZE_PATTERN.match(size_text):
                text_lower = size_text.lower()
                if "rv parking" in text_lower:
                    # Include RV Parking as a special unit type
                    unit = UnitResult(
                        size="RV Parking",
                        description="RV Parking",
                        url=url,
                    )
                    result.units.append(unit)
                continue

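            # Parse the size: e.g. "5 X 8 Storage" → width 5, length 8.
            # _SIZE_PATTERN has no capture groups, so match again with groups.
            size_match = re.match(r"(\d+)\s*X\s*(\d+)", size_text, re.IGNORECASE)
            if not size_match:
                # Defensive only: _SIZE_PATTERN already matched the same prefix.
                continue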

            # The regex only captures digits, so the values are whole numbers.
            width = int(size_match.group(1))
            length = int(size_match.group(2))
            normalized_size = f"{width}x{length}"
            w, ln, sq = self.normalize_size(normalized_size)

            # span_texts (empty-filtered) is now [width, "X", length] or [width, "X", length, type]
            # Extract unit type label (e.g. "Storage") if present as the 4th span
            unit_type = span_texts[3] if len(span_texts) > 3 else ""

            # Find the sibling description element.
            # Structure: h3 → (grandparent chain) → unnamed div with 2 children:
            #   child[0] = the h3 wrapper chain, child[1] = description div
            description = ""
            try:
                # Walk up: h3.parent (tyJCtd div) → jXK9ad-SmKAyb → hJDwNd... → oKdM2c → unnamed div
                ancestor = h3.parent.parent.parent.parent.parent
                sibling_children = [
                    c for c in ancestor.children if hasattr(c, "get_text")
                ]
                if len(sibling_children) > 1:
                    description = sibling_children[1].get_text(strip=True)
            except (AttributeError, IndexError):
                pass

            # Build the display size label
            display_size = f"{int(width)}' x {int(length)}'"
            if unit_type and unit_type.lower() not in ("storage",):
                display_size = f"{display_size} {unit_type}"

            unit = UnitResult(
                size=display_size,
                description=description or unit_type or None,
                url=url,
                metadata={
                    "width": w,
                    "length": ln,
                    "sqft": sq,
                },
            )
            result.units.append(unit)

        if not result.units:
            result.warnings.append("No units found on page")

        return result
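
The size handling in the parser above can be exercised on its own. The following is a minimal standalone sketch, not the parser itself: `parse_size_label` and `ParsedSize` are hypothetical names, square footage is assumed to be plain width times length (the real `normalize_size` in `BaseParser` is not shown here), and the unit type is taken as whatever text follows the size rather than as the fourth span.

```python
import re
from typing import NamedTuple, Optional

# Same shape as the parser's size regex, with capture groups added.
SIZE_RE = re.compile(r"^(\d+)\s*X\s*(\d+)\s*(.*)$", re.IGNORECASE)


class ParsedSize(NamedTuple):
    width: int
    length: int
    sqft: int       # assumption: width * length
    display: str    # label in the parser's "5' x 8'" style
    unit_type: str  # trailing label such as "Storage", or ""


def parse_size_label(text: str) -> Optional[ParsedSize]:
    """Return the parsed size, or None when the heading is not a unit size."""
    match = SIZE_RE.match(text.strip())
    if match is None:
        return None
    width, length = int(match.group(1)), int(match.group(2))
    unit_type = match.group(3).strip()
    display = f"{width}' x {length}'"
    # Mirror the parser: a generic "Storage" label is not appended.
    if unit_type and unit_type.lower() != "storage":
        display = f"{display} {unit_type}"
    return ParsedSize(width, length, width * length, display, unit_type)


print(parse_size_label("5 X 8 Storage"))
print(parse_size_label("RV Parking"))
```

Headings that do not start with a size, such as "RV Parking", return None here; the parser above handles that case separately.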

Scrape Runs (5)

Run #91 Details

Status
exported
Parser Used
Facility080478Parser
Platform Detected
table_layout
Units Found
0
Stage Reached
exported
Timestamp
2026-03-14 01:02:22.953166
Timing
Stage Duration
Fetch 2001 ms
Detect 17 ms
Parse 9 ms
Export 11 ms

Snapshot: 080478_20260314T010224Z.html

No units found in this run.
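
Run #91 reached export with 0 units, and the detected platform was table_layout while the parser expects Google Sites markup. One possible first check (not confirmed as the cause) is whether the saved snapshot still contains the elements the parser selects, `h3.CDt4Ke` and `div.CjVfdc`. The sketch below uses only the standard library; `SelectorCounter` and `count_selectors` are hypothetical helpers, and the sample HTML is illustrative rather than taken from the snapshot.

```python
from collections import Counter
from html.parser import HTMLParser


class SelectorCounter(HTMLParser):
    """Count start tags that carry a given (tag, class) pair."""

    def __init__(self, targets):
        super().__init__()
        self.targets = set(targets)
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # The class attribute may be missing or valueless (None).
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if (tag, cls) in self.targets:
                self.counts[(tag, cls)] += 1


def count_selectors(html, targets):
    """Return {(tag, class): occurrences} for each requested target."""
    counter = SelectorCounter(targets)
    counter.feed(html)
    counter.close()
    return {t: counter.counts[t] for t in targets}


TARGETS = [("h3", "CDt4Ke"), ("div", "CjVfdc")]
sample = (
    '<h3 class="CDt4Ke other"><div class="CjVfdc">'
    "<span>5</span><span>X</span><span>8</span></div></h3>"
)
print(count_selectors(sample, TARGETS))
```

If both counts are zero for the real snapshot, the selectors no longer match the page and the parser loop never runs; if they are non-zero, the size regex or span handling would be the next place to look.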

All Failures for this Facility (5)

parse _WarningAsException scraper no_units_extracted warning Run #N/A | 2026-03-23 03:21:51.316005

No units extracted for 080478

Stack trace
src.reporting.failure_reporter._WarningAsException: No units extracted for 080478

parse _WarningAsException scraper no_units_extracted warning Run #N/A | 2026-03-21 19:15:11.926773

No units extracted for 080478

Stack trace
src.reporting.failure_reporter._WarningAsException: No units extracted for 080478

parse _WarningAsException scraper no_units_extracted warning Run #N/A | 2026-03-14 16:56:41.149060

No units extracted for 080478

Stack trace
src.reporting.failure_reporter._WarningAsException: No units extracted for 080478

parse _WarningAsException scraper no_units_extracted warning Run #N/A | 2026-03-14 05:00:48.923804

No units extracted for 080478

Stack trace
src.reporting.failure_reporter._WarningAsException: No units extracted for 080478

parse _WarningAsException scraper no_units_extracted warning Run #N/A | 2026-03-14 01:02:25.025794

No units extracted for 080478

Stack trace
src.reporting.failure_reporter._WarningAsException: No units extracted for 080478
