EP01 · The Bargain Patrol

Transcript & receipts — the full written case study

01 · Cold open

On June 8, the patrol flagged an Epson EB-680 short-throw projector for NT$2,300. The same unit sells new for NT$31,900. The listing had been up for a few hours; by the time a human browsing Carousell on their phone would have scrolled past it, the loop had already scraped it, checked the new price against two sources, computed that NT$2,300 is 7% of retail, and filed it at the top of a report titled "好貨" — good stuff, buy-ready.

Nobody was at the keyboard. Nobody had been at the keyboard for most of the six weeks the loop has been running.

02 · The problem

Carousell Taiwan lists thousands of second-hand items a day. Real bargains exist — people sell ANC headphones at a third of retail, camera lenses at half the used-market median — but they're buried under sellers who price "20% off new" and call it a deal, under counterfeit batches, and under listings so vague (an iPad with no storage size, a lens with no mount) that the price means nothing.

Browsing manually has two costs. Time: a serious sweep across categories takes an hour, and good items sell within hours, so you'd need to sweep several times a day. Judgment fatigue: every listing demands a price check against current retail and the used-market median, because "vs new" is the seller's favorite misleading anchor. Nobody sustains that. So you either waste hours or miss the projector.

03 · The loop

One command starts a patrol. Everything else is split along a single design line: the agent makes judgments; deterministic scripts do everything repeatable.

patrol.sh (one background process, every few hours)
│
├─ scrape.js          ~2,000 raw listings, 11 categories      [script]
├─ velocity checks    log what reappeared, what sold          [script]
├─ process.js         rule filter: dedup vs 16k seen IDs,     [script]
│                     blacklisted sellers, vague titles,
│                     <NT$2K noise, recurring fraud patterns
│                     → 10–30 survivors in need_verify.json
│
├─ agent (Claude)     price-verifies each survivor against    [judgment]
│                     retail + used-median sources, applies
│                     the rulebook, buckets the results
│
└─ DEALS.md           good / haggle / manual / fair / overpriced,
                      synced to GitHub — the human reads one page

Each filter layer cuts roughly 90% of what it receives. Scraping is cheap and verification is expensive (it costs agent tokens), so the rules exist to make sure only ambiguous cases reach the agent. The human's only job is to read the final report and mark decisions, which feed back into seen_ids.json so nothing appears twice.
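The rule filter described above might look roughly like the sketch below. It is an illustration, not the real process.js: the listing fields (id, sellerId, title, price), the iPad-only vague-title check, and both function names are assumptions. Only the filter categories and the NT$2,000 floor come from the pipeline description.

```javascript
// Hypothetical sketch of a rule filter in the spirit of process.js.
// Field names (id, sellerId, title, price) and the vague-title heuristic
// are assumptions for illustration, not the real implementation.

const MIN_PRICE_NTD = 2000; // listings under NT$2K are treated as noise

// Assumed heuristic: an iPad listing with no storage size cannot be priced.
function isVagueTitle(title) {
  const t = title.toLowerCase();
  return t.includes('ipad') && !/\b\d+\s?(gb|tb)\b/.test(t);
}

// Returns the listings that survive every cheap rule, in input order.
function filterListings(listings, seenIds, blacklistedSellers) {
  const inThisBatch = new Set();
  return listings.filter((item) => {
    if (seenIds.has(item.id) || inThisBatch.has(item.id)) return false; // dedup
    inThisBatch.add(item.id);
    if (blacklistedSellers.has(item.sellerId)) return false;
    if (item.price < MIN_PRICE_NTD) return false;
    if (isVagueTitle(item.title)) return false;
    return true;
  });
}
```

Every check here is a set lookup, a comparison, or a regex, so running it over ~2,000 listings costs nothing compared with a single agent verification.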

The rulebook the agent applies is where the domain knowledge lives, and it was learned, not designed up front:

Used-median beats new-price. Sellers anchor on "% off new"; the loop trusts the second-hand median and auto-rejects anything above it, however pretty the vs-new number looks.
Vague title = skip. Missing capacity, mount, size, or generation can mean a 2× price difference. Those listings never reach verification.
Too sweet is a trap. Under 50% of used-median usually means battery rot, parts swaps, or stolen goods. The loop flags instead of celebrating.
Repetition is a signal. The same model at the same price across multiple patrols means a grey-import or fraud batch, not three lucky bargains. Recurring patterns get auto-excluded.
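Two of these rules are plain arithmetic, so they are easy to sketch. The version below is a guess at the shape, not the loop's actual rulebook: the verdict labels ('reject', 'flag', 'pass') are placeholders rather than the report's buckets, the three-patrol threshold for repetition is an assumption, and the function names are invented. Only the above-median rejection and the 50%-of-used-median trap line come from the rules above.

```javascript
// Hypothetical sketch of the price rules. Labels and the repetition
// threshold are assumptions; the 50% trap line is from the rulebook.

// Price as a whole-number percentage of a reference price.
function percentOf(price, reference) {
  return Math.round((price / reference) * 100);
}

// Judge a price against the used-market median, never against new.
function judgePrice(price, usedMedian) {
  if (price > usedMedian) return 'reject';     // above used median: not a deal
  if (price < 0.5 * usedMedian) return 'flag'; // too sweet: likely a trap
  return 'pass';
}

// Find model+price pairs sighted in at least `minPatrols` distinct patrols.
function findRecurring(sightings, minPatrols = 3) {
  const patrolsByKey = new Map();
  for (const { model, price, patrolId } of sightings) {
    const key = `${model}|${price}`;
    if (!patrolsByKey.has(key)) patrolsByKey.set(key, new Set());
    patrolsByKey.get(key).add(patrolId);
  }
  return [...patrolsByKey]
    .filter(([, patrols]) => patrols.size >= minPatrols)
    .map(([key]) => key);
}

console.log(percentOf(2300, 31900)); // 7
```

The projector from the cold open comes out as 7 here because that figure is measured against retail. The trap rule is a separate test against the used median.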

04 · What broke

The two-layer relay hang (patrol #484). Early on, a sub-agent backgrounded the scraper and waited to be woken when it finished. The wake-up sometimes never fired, and everything downstream hung silently. The fix is structural and now lives as a comment at the top of patrol.sh: one script, one background process, one completion notification. No relay.
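The real fix lives in patrol.sh, but the same "no relay" shape can be sketched in Node. This is an illustration under assumptions: the stage list format and the function name are invented. Each stage runs to completion inside one process, and the caller gets exactly one result back.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical sketch: run every stage in order inside this one process.
// Nothing is backgrounded and nothing waits to be woken, so there is a
// single completion result to report.
function runStages(stages) {
  for (const { name, command, args } of stages) {
    const result = spawnSync(command, args, { encoding: 'utf8' });
    if (result.error) {
      return { ok: false, failedStage: name, exitCode: null };
    }
    if (result.status !== 0) {
      // status is null when the child was killed by a signal
      return { ok: false, failedStage: name, exitCode: result.status };
    }
  }
  return { ok: true, failedStage: null, exitCode: 0 };
}
```

Because spawnSync blocks until the child exits, a stage cannot finish without the caller finding out, and that missed wake-up was the failure mode of the relay.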

The Cloudflare ambush. The scraper started returning "Just a moment…" challenge pages. Cookies were fine; fingerprints were fine. The actual cause: a VPN exit through Japan. Carousell's Cloudflare rules block at the IP level. The fix: detect the challenge, exit with a dedicated code, and have the patrol report "CF_BLOCKED — turn off the VPN" instead of writing garbage into state files.
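A minimal sketch of that detection step, assuming the challenge can be recognized by its page title. The exit code value (42), the function names, and the exact message wording are hypothetical; the source only says the scraper exits with "a dedicated code".

```javascript
// Hypothetical exit code: the source only says "a dedicated code".
const EXIT_CF_BLOCKED = 42;

// Assumption: the challenge page is identified by its "Just a moment" title,
// written with either three dots or a single ellipsis character.
function isCloudflareChallenge(html) {
  return /<title>\s*Just a moment(\.\.\.|…)/i.test(html);
}

// Decide what the scraper should do with a fetched page.
function checkScrapeResponse(html) {
  if (isCloudflareChallenge(html)) {
    return {
      ok: false,
      exitCode: EXIT_CF_BLOCKED,
      message: 'CF_BLOCKED: turn off the VPN',
    };
  }
  return { ok: true, exitCode: 0, message: '' };
}
```

In a scraper, a failed check would print the message and call process.exit with the returned code before any state file is touched, so a challenge page never gets parsed as listings.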

The deleted-memory incident. During a test, seen_ids.json — the loop's record of every listing already judged — got deleted as "just cache." Every previously rejected item flooded back into the next report. The file is state, not cache; it now has a standing never-delete rule.
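One way to act on "state, not cache" in code is sketched below. It goes beyond what the incident report describes: the on-disk format (a JSON array of ID strings), the temp-file-plus-rename write, and the function names are all assumptions. The point is that a missing file is an error to stop on, not an empty set to start from.

```javascript
const fs = require('node:fs');

// Assumption: seen_ids.json holds a JSON array of listing ID strings.
// A missing or corrupt file throws instead of silently becoming "no history".
function loadSeenIds(file) {
  return new Set(JSON.parse(fs.readFileSync(file, 'utf8')));
}

// Write to a temp file first, then rename over the original, so a crash
// mid-write cannot leave a truncated state file behind.
function saveSeenIds(file, seenIds) {
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify([...seenIds]));
  fs.renameSync(tmp, file);
}

// Merge newly judged IDs into the existing state; returns the new total.
function recordSeen(file, newIds) {
  const seen = loadSeenIds(file);
  for (const id of newIds) seen.add(id);
  saveSeenIds(file, seen);
  return seen.size;
}
```

With this shape, a deleted file makes the next patrol fail at load time, not flood the report with every previously rejected item.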

The flooded report. One hot category could fill the entire deals page and bury everything else. The fix: cap each category at five items per patrol, ranked by discount depth, and sort on the listing's absolute timestamp. The scraped "3 hours ago" strings go stale once stored in state and had silently broken ordering.
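Both halves of that fix are small. The sketch below is one interpretation, with assumptions labeled: the relative-string format ("N minutes/hours/days ago"), the item fields (category, pctOfMedian, listedAtMs), and the newest-first final ordering are guesses. Only the cap of five and the ranking by discount depth come from the fix as described.

```javascript
// Assumed relative-time format: "<n> minute(s)|hour(s)|day(s) ago".
const UNIT_MS = { minute: 60e3, hour: 3600e3, day: 86400e3 };

// Convert a scraped relative string to an absolute epoch-ms timestamp,
// using the moment of scraping as the reference. Returns null if unparsed.
function toAbsoluteTime(relative, scrapedAtMs) {
  const m = /^(\d+)\s+(minute|hour|day)s?\s+ago$/.exec(relative.trim());
  if (!m) return null;
  return scrapedAtMs - Number(m[1]) * UNIT_MS[m[2]];
}

// Keep at most `cap` items per category (deepest discount first), then
// order the kept items newest first by absolute timestamp.
function capPerCategory(items, cap = 5) {
  const byCategory = new Map();
  for (const item of items) {
    if (!byCategory.has(item.category)) byCategory.set(item.category, []);
    byCategory.get(item.category).push(item);
  }
  const kept = [];
  for (const group of byCategory.values()) {
    group.sort((a, b) => a.pctOfMedian - b.pctOfMedian); // lower % = deeper discount
    kept.push(...group.slice(0, cap));
  }
  return kept.sort((a, b) => b.listedAtMs - a.listedAtMs);
}
```

Converting at scrape time is what matters: "3 hours ago" is only true at the moment it was read, while the computed timestamp stays correct however long it sits in state.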

05 · The numbers

Six weeks of operation, May 1 to June 11, 2026 (git history: 785 commits).

Metric	Value	Source
Listings triaged	16,705	`state/seen_ids.json`
Prices verified & cached	7,978	`state/verified_prices.json`
Sellers blacklisted	149	report header
Per patrol	~2,000 raw → 10–30 verified	pipeline counts in `patrol.sh` output
On the board today	41 buy-ready · 41 haggle · 27 manual	`DEALS.md`, 2026-06-11
Sample finds	projector at 7% of retail · Leica-branded lens at 31% · ANC earbuds at 33%	`DEALS.md` rows

⚠️ Not tracked: total agent token cost and money actually saved on completed purchases. Honest gap — the loop measures market coverage, not wallet outcomes.

06 · Take it home

Three design rules transfer to any patrol-shaped loop (price watching, job boards, apartment hunting, grant deadlines):

Funnel before judgment. Cheap deterministic rules must cut ~90% per layer so the expensive agent only sees genuinely ambiguous cases. If your agent reads everything, your loop dies of token cost.
One process, one notification. Never chain "background task wakes a waiter that wakes a reporter." Flatten the pipeline into a single script the agent fires once and reads once.
State files are sacred. The seen-IDs file is the loop's memory. Treat it like a database, not a cache, and your loop gets smarter every cycle instead of resetting.

The full pipeline (scraper, rule filter, patrol script) is a few hundred lines of Node plus one bash file. The rulebook above is the part that took six weeks to learn.

Six weeks. Nobody watching.

Your move