Facility: 001967
B & B Mini Storage
- Facility ID: 001967
- Name: B & B Mini Storage
- URL: http://bnbministorage.com/
- Address: 1051 38th St, Peru, IL 61354, USA, Peru, Illinois 61354
- Platform: custom_facility_001967
- Parser File: src/parsers/custom/facility_001967_parser.py
- Last Scraped: 2026-03-27 13:49:17.449809
- Created: 2026-03-14 16:21:53.706708
- Updated: 2026-03-27 13:49:17.478847
- Parser Status: ✓ Working
- Status Reason: N/A
- Last Healing Attempt: Not attempted
Parser Source (src/parsers/custom/facility_001967_parser.py)
"""Parser for B & B Mini Storage."""
from __future__ import annotations

import re

from bs4 import BeautifulSoup

from src.parsers.base import BaseParser, ParseResult, UnitResult


class Facility001967Parser(BaseParser):
    """Extract storage units from B & B Mini Storage."""

    platform = "custom_facility_001967"

    _UNIT_RE = re.compile(
        r"(\d+\s*[\'\'\u2032]?\s*[xX\u00d7]\s*\d+\s*[\'\'\u2032]?)"
        r"[^\$]{0,120}"
        r"\$(\d[\d,.]*)",
        re.DOTALL,
    )
    _PRICE_SIZE_RE = re.compile(
        r"\$(\d[\d,.]*)"
        r".{0,120}"
        r"(\d+\s*[\'\'\u2032]?\s*[xX\u00d7]\s*\d+\s*[\'\'\u2032]?)",
        re.DOTALL,
    )
    _SIZE_ONLY_RE = re.compile(
        r"(\d+\s*[\'\'\u2032]?\s*[xX\u00d7]\s*\d+\s*[\'\'\u2032]?)"
    )

    def parse(self, html: str, url: str = "") -> ParseResult:
        soup = BeautifulSoup(html, "lxml")
        result = ParseResult(platform=self.platform, parser_name=self.__class__.__name__)

        for tag in soup.find_all(["script", "style"]):
            tag.decompose()

        body_text = soup.get_text(separator="\n")
        seen: set[tuple[str, str]] = set()

        # Try size-then-price pattern
        for m in self._UNIT_RE.finditer(body_text):
            size_text = m.group(1).strip()
            price_text = m.group(2).strip()
            key = (size_text, price_text)
            if key in seen:
                continue
            seen.add(key)
            unit = UnitResult()
            unit.size = size_text
            w, ln, sq = self.normalize_size(size_text)
            if w is not None:
                unit.metadata = {"width": w, "length": ln, "sqft": sq}
            unit.price = self.normalize_price(price_text)
            unit.description = m.group(0).strip()[:200]
            if unit.size or unit.price:
                result.units.append(unit)

        # Try price-then-size pattern if no results
        if not result.units:
            for m in self._PRICE_SIZE_RE.finditer(body_text):
                price_text = m.group(1).strip()
                size_text = m.group(2).strip()
                key = (size_text, price_text)
                if key in seen:
                    continue
                seen.add(key)
                unit = UnitResult()
                unit.size = size_text
                w, ln, sq = self.normalize_size(size_text)
                if w is not None:
                    unit.metadata = {"width": w, "length": ln, "sqft": sq}
                unit.price = self.normalize_price(price_text)
                unit.description = m.group(0).strip()[:200]
                if unit.size or unit.price:
                    result.units.append(unit)

        # Fallback: extract sizes without prices
        if not result.units:
            seen_sizes: set[str] = set()
            for m in self._SIZE_ONLY_RE.finditer(body_text):
                size_text = m.group(1).strip()
                if size_text in seen_sizes:
                    continue
                w, ln, sq = self.normalize_size(size_text)
                if w is None or w < 3 or ln < 3:
                    continue
                seen_sizes.add(size_text)
                unit = UnitResult()
                unit.size = size_text
                unit.metadata = {"width": w, "length": ln, "sqft": sq}
                result.units.append(unit)

        if not result.units:
            result.warnings.append("No units found via regex")

        return result
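To illustrate how the size-then-price pass behaves, the following minimal sketch applies a pattern that mirrors `_UNIT_RE` to a short piece of text. The sample text is hypothetical (the real page snapshot is not reproduced here), and the standalone name `UNIT_RE` exists only for this example.

```python
import re

# Mirrors the parser's _UNIT_RE: a size such as 5x10, then up to 120
# non-"$" characters, then a dollar amount.
UNIT_RE = re.compile(
    r"(\d+\s*['\u2032]?\s*[xX\u00d7]\s*\d+\s*['\u2032]?)"
    r"[^\$]{0,120}"
    r"\$(\d[\d,.]*)",
    re.DOTALL,
)

# Hypothetical text, shaped like the output of soup.get_text(separator="\n").
sample_text = "5x10\nDrive-up unit\n$45.00\n10 x 20\nLarge unit\n$110"

# Group 1 can carry trailing whitespace, so strip it as the parser does.
pairs = [(m.group(1).strip(), m.group(2)) for m in UNIT_RE.finditer(sample_text)]
print(pairs)  # [('5x10', '45.00'), ('10 x 20', '110')]
```

Because the gap between size and price is capped at 120 characters and may not contain a `$`, a size with no nearby price yields no match in this pass, which is when the later fallbacks in `parse` come into play.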
Scrape Runs (7)
- exported | Run #1751 | 2026-03-27 13:48:56.677367 | 7 units | Facility001967Parser
- exported | Run #1750 | 2026-03-27 13:48:56.675595 | 7 units | Facility001967Parser
- exported | Run #1138 | 2026-03-23 02:50:39.073674 | 7 units | Facility001967Parser
- exported | Run #645 | 2026-03-21 18:41:51.757326 | 7 units | Facility001967Parser
- failed | Run #644 | 2026-03-21 18:41:01.716790 | 1 failure(s)
- failed | Run #643 | 2026-03-21 18:40:31.666538 | 1 failure(s)
- exported | Run #194 | 2026-03-14 16:24:00.219007 | 7 units | Facility001967Parser
Run #645 Details
- Status: exported
- Parser Used: Facility001967Parser
- Platform Detected: table_layout
- Units Found: 7
- Stage Reached: exported
- Timestamp: 2026-03-21 18:41:51.757326
Timing
| Stage | Duration |
|---|---|
| Fetch | 4482ms |
| Detect | 60ms |
| Parse | 25ms |
| Export | 10ms |
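As a quick worked calculation on the table above, the sketch below sums the stage durations and computes the share spent in the fetch stage. The variable names are illustrative only.

```python
# Stage durations in milliseconds, copied from the Timing table.
stage_ms = {"Fetch": 4482, "Detect": 60, "Parse": 25, "Export": 10}

total_ms = sum(stage_ms.values())
fetch_share = stage_ms["Fetch"] / total_ms

print(f"total: {total_ms} ms")            # total: 4577 ms
print(f"fetch share: {fetch_share:.1%}")  # fetch share: 97.9%
```

For this run, almost all of the wall-clock time is spent fetching the page; detection, parsing, and export together account for under 100 ms.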
Snapshot: 001967_20260321T184156Z.html
Parsed Units (7)
5x10
10x10
10x15
10×15
10x20
10x25
10x30
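The list above contains both `10x15` and `10×15`, which differ only in the multiplication sign. The parser deduplicates on the raw matched text, so the two spellings count as different keys. Whether they are one unit type or two distinct offerings cannot be determined from this page. Assuming they are the same size, one possible approach is to canonicalize the label before deduplicating. The helpers `canonical_size` and `dedupe_sizes` below are hypothetical and not part of the parser.

```python
from __future__ import annotations

import re

# Width and length separated by x, X, or the multiplication sign,
# with an optional foot mark after the width.
SIZE_RE = re.compile(r"(\d+)\s*['\u2032]?\s*[xX\u00d7]\s*(\d+)")


def canonical_size(label: str) -> str | None:
    """Return a label such as '10x15', or None if no size is found."""
    m = SIZE_RE.search(label)
    if m is None:
        return None
    return f"{int(m.group(1))}x{int(m.group(2))}"


def dedupe_sizes(labels: list[str]) -> list[str]:
    """Keep the first occurrence of each canonical size, in order."""
    seen: set[str] = set()
    unique: list[str] = []
    for label in labels:
        key = canonical_size(label)
        if key is None or key in seen:
            continue
        seen.add(key)
        unique.append(key)
    return unique


# The seven labels from the Parsed Units list.
parsed = ["5x10", "10x10", "10x15", "10\u00d715", "10x20", "10x25", "10x30"]
unique = dedupe_sizes(parsed)
print(unique)  # ['5x10', '10x10', '10x15', '10x20', '10x25', '10x30']
```

If the two entries do correspond to different prices or unit features, the canonical size should be combined with those fields in the dedup key rather than used alone.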
All Failures for this Facility (2)
Message: timeout: Timed out receiving message from renderer: 1.528 (Session info: chrome=146.0.7680.153)
Stack trace
Traceback (most recent call last):
  File "/app/src/pipeline.py", line 361, in _process_facility
    fetch_result = fetch_page(driver, url, snapshot_mgr, facility_id, **fetch_kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/src/fetcher/fetcher.py", line 125, in fetch_page
    driver.get(url)
  File "/app/.venv/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 466, in get
    self.execute(Command.GET, {"url": url})
  File "/app/.venv/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 446, in execute
    self.error_handler.check_response(response)
  File "/app/.venv/lib/python3.11/site-packages/selenium/webdriver/remote/errorhandler.py", line 232, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: timeout: Timed out receiving message from renderer: 1.528
(Session info: chrome=146.0.7680.153)
Stacktrace:
#0 0x55fba75ff8ce <unknown>
#1 0x55fba6fbd1d2 <unknown>
#2 0x55fba6fa81fc <unknown>
#3 0x55fba6fa7fe9 <unknown>
#4 0x55fba6fa65e6 <unknown>
#5 0x55fba6fa6a96 <unknown>
#6 0x55fba6fb50f7 <unknown>
#7 0x55fba6fca77d <unknown>
#8 0x55fba6fd002b <unknown>
#9 0x55fba6fa70a1 <unknown>
#10 0x55fba6fca5bf <unknown>
#11 0x55fba704cb79 <unknown>
#12 0x55fba702ca03 <unknown>
#13 0x55fba6ffd5d5 <unknown>
#14 0x55fba6ffe1c1 <unknown>
#15 0x55fba75c3f10 <unknown>
#16 0x55fba75c71d8 <unknown>
#17 0x55fba75c6c8a <unknown>
#18 0x55fba75c7645 <unknown>
#19 0x55fba75b38fb <unknown>
#20 0x55fba75c79a7 <unknown>
#21 0x55fba759b836 <unknown>
#22 0x55fba75ec8a5 <unknown>
#23 0x55fba75eca9c <unknown>
#24 0x55fba75fe38a <unknown>
#25 0x7f34a7eb71f5 <unknown>
Message: timeout: Timed out receiving message from renderer: -0.003 (Session info: chrome=146.0.7680.153)
Stack trace
Traceback (most recent call last):
  File "/app/src/pipeline.py", line 361, in _process_facility
    fetch_result = fetch_page(driver, url, snapshot_mgr, facility_id, **fetch_kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/src/fetcher/fetcher.py", line 125, in fetch_page
    driver.get(url)
  File "/app/.venv/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 466, in get
    self.execute(Command.GET, {"url": url})
  File "/app/.venv/lib/python3.11/site-packages/selenium/webdriver/remote/webdriver.py", line 446, in execute
    self.error_handler.check_response(response)
  File "/app/.venv/lib/python3.11/site-packages/selenium/webdriver/remote/errorhandler.py", line 232, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: timeout: Timed out receiving message from renderer: -0.003
(Session info: chrome=146.0.7680.153)
Stacktrace:
#0 0x55fba75ff8ce <unknown>
#1 0x55fba6fbd1d2 <unknown>
#2 0x55fba6fa81fc <unknown>
#3 0x55fba6fa7fe9 <unknown>
#4 0x55fba6fa65e6 <unknown>
#5 0x55fba6fa6a96 <unknown>
#6 0x55fba6fb50f7 <unknown>
#7 0x55fba6fca77d <unknown>
#8 0x55fba6fd002b <unknown>
#9 0x55fba6fa70a1 <unknown>
#10 0x55fba6fca5bf <unknown>
#11 0x55fba704c86a <unknown>
#12 0x55fba702ca03 <unknown>
#13 0x55fba6ffd5d5 <unknown>
#14 0x55fba6ffe1c1 <unknown>
#15 0x55fba75c3f10 <unknown>
#16 0x55fba75c71d8 <unknown>
#17 0x55fba75c6c8a <unknown>
#18 0x55fba75c7645 <unknown>
#19 0x55fba75b38fb <unknown>
#20 0x55fba75c79a7 <unknown>
#21 0x55fba759b836 <unknown>
#22 0x55fba75ec8a5 <unknown>
#23 0x55fba75eca9c <unknown>
#24 0x55fba75fe38a <unknown>
#25 0x7f34a7eb71f5 <unknown>
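Both failures are `TimeoutException`s raised from `driver.get(url)`, and the run log shows runs #643 and #644 failing before run #645 succeeded under a minute later. That pattern suggests the timeouts were transient. The pipeline's actual retry policy is not shown on this page, so the following is only a minimal sketch of one way to retry a flaky call with exponential backoff. The helper `call_with_retries` and the simulated `flaky_fetch` are hypothetical. With Selenium, one would pass something like `lambda: driver.get(url)` as the action and `selenium.common.exceptions.TimeoutException` as `retry_on`.

```python
import time


def call_with_retries(action, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call action(); on an exception in retry_on, back off and try again.

    The delay doubles after each failed attempt. The last error is
    re-raised once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return action()
        except retry_on as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise last_error


# Simulated fetch that times out twice and then succeeds (hypothetical).
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated renderer timeout")
    return "<html>ok</html>"


delays = []  # record the requested delays instead of actually sleeping
html = call_with_retries(flaky_fetch, TimeoutError, sleep=delays.append)
print(html, delays)  # <html>ok</html> [1.0, 2.0]
```

Injecting `sleep` keeps the helper testable without real waiting. In production the default `time.sleep` applies.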