Lead Generation

County Records Scraper

Playwright-based nightly scraper across county assessor and court sites — extracting lis pendens, tax delinquencies, and probate notices into a clean structured dataset.

Overview

The County Records Scraper automates overnight data collection from public county records across multiple states. It extracts distress signals — lis pendens, tax delinquencies, and probate notices — and structures the data for immediate lead qualification and outreach the next morning.

Challenges This Agent Addresses

Manual county records research consumes 15+ hours per week per analyst, and the data arrives too late for same-day outreach. Competitors with automated systems reach distressed sellers first while you are still pulling spreadsheets by hand. The County Records Scraper runs nightly, pulling current data from public records systems in your target markets across multiple states. By 8am, your team has a fresh list of tax-delinquent properties and foreclosure notices ready to work.

How It Works

Every night at midnight, the scraper launches Playwright instances against county assessor and court record websites, extracts distress signals, and cleanses the data for the next morning's outreach.

1. Nightly Web Scraping

Playwright headless browser instances launch against county assessor and court record portals in your target markets. Each scraper navigates search forms, executes queries, and extracts result tables.

Parse county assessor websites for recent tax delinquency filings
Scrape court record systems for lis pendens and foreclosure notices
Handle pagination and JavaScript-rendered content via Playwright
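
As an illustration of the pagination step above, here is a minimal sketch using Playwright's Python sync API. The URL, the CSS selectors, and the function names (scrape_result_rows, run) are hypothetical placeholders, since every county portal has its own markup; treat this as one possible shape for a per-site scraper rather than the agent's actual code.

```python
from typing import Iterator, List

# Hypothetical URL and selectors: placeholders only, not taken from a real portal.
SEARCH_URL = "https://records.example-county.gov/search"
ROW_SELECTOR = "table#results tbody tr"
NEXT_SELECTOR = "a.next-page"


def scrape_result_rows(page, max_pages: int = 50) -> Iterator[List[str]]:
    """Yield the cell texts of each result row, following 'next' links.

    `page` is expected to behave like a Playwright sync-API Page.
    """
    for _ in range(max_pages):
        # Wait until the JavaScript-rendered rows are present in the DOM.
        page.wait_for_selector(ROW_SELECTOR)
        rows = page.locator(ROW_SELECTOR)
        for i in range(rows.count()):
            yield rows.nth(i).locator("td").all_inner_texts()

        next_link = page.locator(NEXT_SELECTOR)
        if next_link.count() == 0 or not next_link.first.is_enabled():
            break  # No further pages.
        next_link.first.click()
        # Crude wait for the next page; a real portal may need a stricter check.
        page.wait_for_load_state("networkidle")


def run(notice_type: str) -> List[List[str]]:
    """Open a headless browser, submit the search form, collect all rows."""
    # Imported here so scrape_result_rows can be exercised without a browser.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(SEARCH_URL)
        page.fill("#notice-type", notice_type)  # hypothetical form field
        page.click("button[type=submit]")
        rows = list(scrape_result_rows(page))
        browser.close()
        return rows
```

The max_pages cap is a guard against portals whose "next" link never disables; the right value depends on the site.
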
2. Data Extraction & Structuring

Raw HTML and table data are parsed into structured records: property address, owner name, filing date, amount owed, notice type. OCR handles scanned documents where needed.

Extract property address, owner name, phone, email (where available)
Parse filing dates and amounts owed from unstructured text
Standardise addresses and normalise phone numbers for CRM
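
The parsing and normalisation steps above could look roughly like the sketch below. The DistressRecord fields mirror the ones named in this section, but the regular expressions, the MM/DD/YYYY date format, the US phone format, and the small street-suffix table are assumptions made for illustration; real county data will need site-specific rules.

```python
import re
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass
class DistressRecord:
    property_address: str
    owner_name: str
    filing_date: Optional[date]
    amount_owed: Optional[Decimal]
    notice_type: str
    phone: Optional[str] = None


# Assumed formats: US-style MM/DD/YYYY dates and dollar amounts like $4,215.37.
DATE_RE = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")
AMOUNT_RE = re.compile(r"\$\s*(\d[\d,]*(?:\.\d{2})?)")
# Illustrative subset of street-suffix abbreviations.
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD", "DRIVE": "DR"}


def parse_filing_date(text: str) -> Optional[date]:
    """Return the first valid MM/DD/YYYY date in the text, or None."""
    m = DATE_RE.search(text)
    if not m:
        return None
    month, day, year = map(int, m.groups())
    try:
        return date(year, month, day)
    except ValueError:  # e.g. month 13
        return None


def parse_amount(text: str) -> Optional[Decimal]:
    """Return the first dollar amount in the text, or None."""
    m = AMOUNT_RE.search(text)
    return Decimal(m.group(1).replace(",", "")) if m else None


def normalise_phone(raw: str) -> Optional[str]:
    """Normalise a US number to +1XXXXXXXXXX; None if it is not 10 digits."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return f"+1{digits}" if len(digits) == 10 else None


def normalise_address(raw: str) -> str:
    """Uppercase, drop commas, collapse whitespace, abbreviate known suffixes."""
    words = (w.strip(".") for w in raw.upper().replace(",", " ").split())
    return " ".join(SUFFIXES.get(w, w) for w in words)


def build_record(address: str, owner: str, notice_type: str,
                 free_text: str, phone: str = "") -> DistressRecord:
    """Combine the parsers into one structured record."""
    return DistressRecord(
        property_address=normalise_address(address),
        owner_name=" ".join(owner.split()),
        filing_date=parse_filing_date(free_text),
        amount_owed=parse_amount(free_text),
        notice_type=notice_type,
        phone=normalise_phone(phone),
    )
```

Fields that cannot be parsed are left as None rather than guessed, so incomplete records can be reviewed before they reach the CRM.
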
3. Deduplication & Storage

New records are checked against previous nights' data to avoid duplicate outreach. The clean dataset is stored in a database and synced to your CRM by 7am.

Deduplicate against last 90 days of records using property address + owner name
Flag new signals (e.g. a lis pendens filed yesterday) for priority outreach
Store in structured database and auto-push to CRM
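
A minimal sketch of the deduplication and flagging logic described above. It assumes records are dicts with property_address, owner_name, and filing_date keys, and it uses an in-memory dict as a stand-in for the database; the function names and the one-day priority window are illustrative choices, not the agent's confirmed behaviour.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple

Key = Tuple[str, str]


def dedup_key(address: str, owner: str) -> Key:
    """Case- and whitespace-insensitive key of property address + owner name."""
    def norm(s: str) -> str:
        return " ".join(s.upper().split())
    return (norm(address), norm(owner))


def split_new_records(
    tonight: List[dict],
    seen: Dict[Key, date],
    today: date,
    window_days: int = 90,
    priority_days: int = 1,
) -> Tuple[List[dict], List[dict]]:
    """Return (new_records, priority_records) for tonight's scrape.

    `seen` maps each key to the date it was last stored and is updated
    in place. A record is a duplicate if its key was stored within the
    last `window_days` days.
    """
    cutoff = today - timedelta(days=window_days)
    new: List[dict] = []
    for rec in tonight:
        key = dedup_key(rec["property_address"], rec["owner_name"])
        last_seen = seen.get(key)
        if last_seen is not None and last_seen >= cutoff:
            continue  # Already stored within the window, skip.
        seen[key] = today
        new.append(rec)

    # Newest filings first, so the freshest signals are worked first.
    new.sort(key=lambda r: r["filing_date"], reverse=True)
    priority = [
        r for r in new if (today - r["filing_date"]).days <= priority_days
    ]
    return new, priority
```

Because seen is updated as records are accepted, repeats within the same night's batch are dropped as well.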

Key Benefits

15+ hrs/week manual research cut to zero — fully automated overnight
Fresh data in your CRM every morning while competitors are still researching
Multi-state coverage in one system instead of multiple manual searches
Distress signals prioritised by recency — newest signals first
County Records Scraper | The Independent Broker Collective