Facility: 085615

Hardin Mini Storage

Stale Data Warning: This facility has not been successfully scraped in 26 days (threshold: 3 days). Data may be outdated.
Facility Information (status: active)
Facility ID: 085615
Name: Hardin Mini Storage
URL: https://www.hardinministorage.com/
Address: N/A
Platform: custom_facility_085615
Parser File: src/parsers/custom/facility_085615_parser.py
Last Scraped: 2026-03-27 14:03:58.665726
Created: 2026-03-06 23:45:35.865957
Updated: 2026-03-27 14:03:58.697685
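
The stale-data warning above compares the age of the "Last Scraped" timestamp against a 3-day threshold. A minimal sketch of such a check follows, assuming age is counted in whole elapsed days. The function names and the viewing time are hypothetical; the dashboard's actual rule is not shown on this page.

```python
from datetime import datetime, timedelta


def days_since(last_scraped: datetime, now: datetime) -> int:
    """Whole days elapsed between the last successful scrape and `now`."""
    return (now - last_scraped).days


def is_stale(last_scraped: datetime, now: datetime, threshold_days: int = 3) -> bool:
    """True when the last successful scrape is older than the threshold."""
    return now - last_scraped > timedelta(days=threshold_days)


# "Last Scraped" value taken from this page.
last_scraped = datetime.fromisoformat("2026-03-27 14:03:58.665726")
# Assumed viewing time; the page does not state when it was rendered.
viewed_at = datetime(2026, 4, 22, 15, 0, 0)

print(days_since(last_scraped, viewed_at), is_stale(last_scraped, viewed_at))
```

With the assumed viewing time, the age works out to 26 whole days, which matches the figure in the warning.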
Parser & Healing Diagnosis (status: working)
Parser Status: ✓ Working
Status Reason: N/A
Last Healing Attempt: Not attempted
Parser Source (src/parsers/custom/facility_085615_parser.py)
"""Parser for Hardin Mini Storage (UnitTrac platform)."""

from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult


class Facility085615Parser(BaseParser):
    """Extract storage units from Hardin Mini Storage (hardinministorage.com).

    The site is powered by UnitTrac and renders unit listings as a list-group
    inside a card aside. Each list-group-item contains a nested table with:
    - Size label (Small/Medium/Large) and dimension text (e.g. "5' x 10'")
    - Square footage below the dimensions
    - Price formatted as "$45<sup>/mo</sup>"
    - Availability badge with text like "1 Available" or "0 Available"
    """

    platform = "custom_facility_085615"

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)

        # Unit rows are list-group-item divs. Non-unit items (such as the one
        # holding the address) have no nested table and are filtered out below.
        items = soup.select("div.list-group-item")

        for item in items:
            # Only process items that have a nested table (unit rows)
            outer_table = item.find("table")
            if not outer_table:
                continue

            unit = UnitResult()

            # --- Size label (Small / Medium / Large) ---
            label_span = item.find("span", style=re.compile(r"font-weight:\s*bold"))
            if label_span:
                unit.description = label_span.get_text(strip=True)

            # --- Dimensions: text node immediately after the label span ---
            # The dimension text (e.g. "5' x 10'") appears as a direct text node
            # in the same <td> as the label span, after a <br> tag.
            dimension_text = ""
            if label_span:
                parent_td = label_span.find_parent("td")
                if parent_td:
                    # Collect all text nodes in the td, skip blank ones
                    raw_texts = list(parent_td.stripped_strings)
                    # raw_texts looks like: ["Small", "5' x 10'", "50"]
                    # First is the label, second is dimension, third is sqft
                    if len(raw_texts) >= 2:
                        dimension_text = raw_texts[1]
                        unit.size = dimension_text
                    if len(raw_texts) >= 3:
                        # Third item is the listed square footage. It is stored
                        # only when the dimension text cannot be normalized.
                        sqft_str = raw_texts[2]
                        if sqft_str.isdigit():
                            w, ln, sq = self.normalize_size(dimension_text)
                            if w is not None:
                                unit.metadata = {"width": w, "length": ln, "sqft": sq}
                            else:
                                unit.metadata = {"sqft": int(sqft_str)}

            # Fall back: try normalize_size if metadata not yet set
            if unit.size and not unit.metadata:
                w, ln, sq = self.normalize_size(unit.size)
                if w is not None:
                    unit.metadata = {"width": w, "length": ln, "sqft": sq}

            # --- Price: span with font-size:18px contains "$45<sup>/mo</sup>" ---
            price_span = item.find("span", style=re.compile(r"font-size:\s*18px"))
            if price_span:
                # Remove the <sup> element so get_text only gives the dollar amount
                sup = price_span.find("sup")
                if sup:
                    sup.decompose()
                price_text = price_span.get_text(strip=True)
                unit.price = self.normalize_price(price_text)

            # --- Availability badge ---
            badge = item.find("div", class_="badge")
            if badge:
                unit.scarcity = badge.get_text(strip=True)

            if unit.size or unit.price:
                result.units.append(unit)

        if not result.units:
            result.warnings.append("No units found on page")

        return result
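
The parser above relies on `self.normalize_size` and `self.normalize_price` from `BaseParser`, whose source is not shown on this page. The sketch below is a hypothetical stand-in for both helpers, written only to illustrate the kind of normalization the parser expects. The `(width, length, sqft)` return shape is inferred from how the parser unpacks the result; the regular expressions and the float return types are assumptions.

```python
from __future__ import annotations

import re

# Hypothetical stand-ins for BaseParser.normalize_size / normalize_price.
# Matches dimension text such as "5' x 10'" or "10x20".
_SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*'?\s*[xX×]\s*(\d+(?:\.\d+)?)")
# Matches a dollar amount such as "$45" or "$1,250.50".
_PRICE_RE = re.compile(r"\$?\s*(\d[\d,]*(?:\.\d+)?)")


def normalize_size(text: str) -> tuple[float | None, float | None, float | None]:
    """Return (width, length, sqft), or (None, None, None) if nothing matches."""
    match = _SIZE_RE.search(text)
    if not match:
        return None, None, None
    width, length = float(match.group(1)), float(match.group(2))
    return width, length, width * length


def normalize_price(text: str) -> float | None:
    """Return the dollar amount as a float, or None if no number is present."""
    match = _PRICE_RE.search(text)
    if not match:
        return None
    return float(match.group(1).replace(",", ""))


print(normalize_size("5' x 10'"))
print(normalize_price("$45"))
```

Returning `None` values instead of raising lets the caller fall back to the listed square footage, as the parser does when `w is None`.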

Scrape Runs (6)

Run #1326 Details

Status: exported
Parser Used: Facility085615Parser
Platform Detected: table_layout
Units Found: 3
Stage Reached: exported
Timestamp: 2026-03-23 03:06:12.115796

Timing
Stage     Duration
Fetch     2820 ms
Detect    28 ms
Parse     9 ms
Export    7 ms

Snapshot: 085615_20260323T030614Z.html

Parsed Units (3)

Size         Price        Availability
5' x 10'     $45.00/mo    0 Available
10' x 15'    $55.00/mo    0 Available
10' x 20'    $70.00/mo    0 Available

All Failures for this Facility (1)

fetch DatatypeMismatch unknown unknown permanent Run #21 | 2026-03-07 01:42:37.184235

column "success" is of type boolean but expression is of type integer
LINE 3: ... VALUES ('085615', 21, '085615_20260307T014237Z.html', 0)
HINT: You will need to rewrite or cast the expression.

Stack trace
Traceback (most recent call last):
  File "/app/src/pipeline.py", line 329, in _process_facility
    manifest_id = storage.insert_snapshot_manifest(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/src/db/pg_backend.py", line 615, in insert_snapshot_manifest
    row = self._execute_returning(
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/src/db/pg_backend.py", line 54, in _execute_returning
    cur.execute(sql, params)
  File "/app/.venv/lib/python3.11/site-packages/psycopg2/extras.py", line 236, in execute
    return super().execute(query, vars)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
psycopg2.errors.DatatypeMismatch: column "success" is of type boolean but expression is of type integer
LINE 3: ...    VALUES ('085615', 21, '085615_20260307T014237Z.html', 0)
                                                                     ^
HINT:  You will need to rewrite or cast the expression.
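
The failure above comes from passing the integer `0` for the boolean `success` column. psycopg2 adapts a Python `int` to an integer literal and a Python `bool` to a SQL boolean, so one possible fix is to coerce the flag before it reaches `cur.execute`. The sketch below is a hypothetical helper, not the code in `pg_backend.py`. It builds only the parameter tuple, because the INSERT statement is truncated in the log and is not reproduced here.

```python
from __future__ import annotations


def snapshot_manifest_params(
    facility_id: str, run_id: int, filename: str, success: object
) -> tuple[str, int, str, bool]:
    """Build the parameter tuple with `success` coerced to a real bool.

    Assumes the flag arrives as an int (0/1) or a bool. A string such as
    "0" would need explicit parsing, since bool("0") is True.
    """
    return (facility_id, run_id, filename, bool(success))


# Values taken from the failing statement in the log above.
params = snapshot_manifest_params("085615", 21, "085615_20260307T014237Z.html", 0)
print(params)
```

An alternative is to cast inside the SQL, as the HINT suggests, but coercing on the Python side keeps the statement unchanged.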
