Inside a luxury data oracle: how Luxe Oracle scrapes, caches, and settles in <1s
Short answer: Stealth Playwright scrapers feed a Redis cache on a 24-hour cycle. The hot path never scrapes — it reads cache and settles a $0.001 USDC payment via x402 on Base mainnet. Two layers, three pieces of state, sub-second latency.
A useful data oracle has to do two unrelated jobs well: (1) keep fresh data flowing in from a hostile source, and (2) serve it to paying clients with predictable latency. We split these into two independent layers — a background ingestion pipeline and a hot request path — connected only through Redis.
The 30-second tour
          AI agent
             │
             ▼
┌──────────────────────────┐
│ Next.js App Router       │ ◄── x402 middleware ($0.001)
│ /api/v1/inventory  free  │     replay-guard (nonce dedup)
│ /api/v1/monitor    paid  │     rate-limiter (per-IP)
│ /api/v1/sessions   paid  │
└────────────┬─────────────┘
             │ read-only
             ▼
┌──────────────────────────┐
│ Redis (SWR + AOF)        │
│ inventory:{b}:{r}        │
│ cache:{b}:{r}:{id}       │
│ session:{token}          │
│ nonce:{hash}             │
└────────────▲─────────────┘
             │ writes only
┌────────────┴─────────────┐
│ BullMQ refresh worker    │ every 24h + boot
│  ↳ Playwright + Stealth  │ rotating proxies / region
│  ↳ HermesScraper         │ hermes.com category page
└──────────────────────────┘

The cleanest property of this layout: the request path never blocks on a scrape. Whether the scraper is happy or in a 30-minute cooldown after a hard block, agents still get fresh-enough cache in tens of milliseconds.
The hot path: serving paid agents
All three public endpoints run on Next.js App Router as Edge-style handlers. Their only job is to read Redis and (for paid routes) collect payment.
| Endpoint | Auth | Reads |
|---|---|---|
| GET /api/v1/inventory | none (free) | inventory:{b}:{r} |
| GET /api/v1/monitor/{r}/{b}/{m} | x402 ($0.001) or X-Session-Key | cache:{b}:{r}:{id} |
| POST /api/v1/sessions | x402 ($0.01) | writes session:{token} |
For a paid monitor call, the request runs through a small middleware stack:
- Rate limiter: Redis counter per IP. Cheap, blocks abuse before any payment work.
- Session-key fast path: if `X-Session-Key` is present, decrement credits via a Lua script (atomic) and skip x402 entirely.
- Replay guard: the EIP-3009 nonce from the X-PAYMENT header is checked against Redis; reuse fails fast.
- x402 middleware: `@x402/next` verifies the signature and forwards to the CDP facilitator for settlement.
- Cache read: final handler returns `{ in_stock, metadata, last_update }` from Redis. No scrape, no I/O beyond Redis.
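The ordering of that stack can be sketched in a few lines. This is a minimal illustration, not the service's code: `MemoryStore` is an in-memory stand-in for Redis so the example runs without a server, and `handleMonitor`, the `rate:` key prefix, the per-window limit, and the 402/409/429 status codes are all assumptions made for the example. Signature verification and settlement are left out.

```typescript
// Hypothetical sketch of the middleware order for a paid monitor call.
// MemoryStore stands in for Redis; none of these names come from the service.
type Outcome =
  | { status: 200; via: "session" | "x402" }
  | { status: 402 | 409 | 429; reason: string };

class MemoryStore {
  private counters = new Map<string, number>();
  private nonces = new Set<string>();
  credits = new Map<string, number>();

  // Stand-in for INCR on a per-IP key (window expiry omitted for brevity).
  incr(key: string): number {
    const next = (this.counters.get(key) ?? 0) + 1;
    this.counters.set(key, next);
    return next;
  }

  // Stand-in for the atomic Lua decrement: succeeds only if credits remain.
  spendCredit(token: string): boolean {
    const left = this.credits.get(token) ?? 0;
    if (left <= 0) return false;
    this.credits.set(token, left - 1);
    return true;
  }

  // Stand-in for SET ... NX: true only the first time a nonce is seen.
  claimNonce(nonce: string): boolean {
    if (this.nonces.has(nonce)) return false;
    this.nonces.add(nonce);
    return true;
  }
}

interface MonitorRequest {
  ip: string;
  sessionKey?: string;   // value of X-Session-Key, if sent
  paymentNonce?: string; // EIP-3009 nonce taken from X-PAYMENT, if sent
}

function handleMonitor(store: MemoryStore, req: MonitorRequest, maxPerWindow = 60): Outcome {
  // 1. Rate limit first: the cheapest check, before any payment work.
  if (store.incr(`rate:${req.ip}`) > maxPerWindow) {
    return { status: 429, reason: "rate limited" };
  }
  // 2. Session-key fast path: spend one credit and skip x402 entirely.
  if (req.sessionKey !== undefined) {
    if (store.spendCredit(req.sessionKey)) {
      return { status: 200, via: "session" };
    }
    return { status: 402, reason: "session exhausted" };
  }
  // 3. Replay guard: each payment nonce may be claimed once.
  if (req.paymentNonce === undefined) {
    return { status: 402, reason: "payment required" };
  }
  if (!store.claimNonce(req.paymentNonce)) {
    return { status: 409, reason: "nonce replayed" };
  }
  // 4. Signature verification and settlement would run here (omitted).
  // 5. The cache read would follow and produce the response body.
  return { status: 200, via: "x402" };
}
```

Putting the rate limiter first means a flood of unpaid or replayed requests is rejected before any signature or settlement work starts.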
Redis: the only shared state
| Key pattern | Written by | Read by |
|---|---|---|
| inventory:{b}:{r} | scraper | /inventory, /monitor (validate) |
| cache:{b}:{r}:{id} | scraper | /monitor |
| session:{token} | /sessions, /monitor | /monitor |
| nonce:{hash} | replay-guard | replay-guard |
| bull:monitor-jobs:* | BullMQ | BullMQ |
Two reliability features matter here:
- SWR (stale-while-revalidate): cache values keep serving for up to 7 days after they expire, while a refresh job is queued. An agent calling during a scraper outage still gets a value (with an older `last_update`); we never return 503 for a transient upstream block.
- AOF persistence: Redis fsyncs on every write, so a node restart does not drop session credits or replay-guard nonces. (For an oracle, losing a nonce table is worse than losing an inventory cache.)
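The SWR read decision reduces to comparing an entry's age against two windows. The sketch below is an illustration under assumptions: the `CacheEntry` shape, the `readSwr` name, and the 24-hour freshness window (taken here to match the refresh cycle) are not confirmed by the service; only the 7-day stale window comes from the description above.

```typescript
// Hypothetical entry shape; how the service actually stores values is not shown.
interface CacheEntry<T> {
  value: T;
  lastUpdate: number; // epoch ms of the scrape that produced the value
}

const FRESH_MS = 24 * 60 * 60 * 1000;     // assumed to match the 24 h refresh cycle
const STALE_MS = 7 * 24 * 60 * 60 * 1000; // serve-stale window after expiry

type SwrResult<T> =
  | { kind: "fresh"; value: T }
  | { kind: "stale"; value: T; enqueueRefresh: true }
  | { kind: "miss" };

// Decide what to serve for a cache entry at time `now` (epoch ms).
function readSwr<T>(entry: CacheEntry<T> | undefined, now: number): SwrResult<T> {
  if (entry === undefined) return { kind: "miss" };
  const age = now - entry.lastUpdate;
  if (age <= FRESH_MS) return { kind: "fresh", value: entry.value };
  if (age <= FRESH_MS + STALE_MS) {
    // Expired but inside the grace window: serve it and ask for a refresh.
    return { kind: "stale", value: entry.value, enqueueRefresh: true };
  }
  return { kind: "miss" };
}
```

A caller would return the value in both the fresh and stale cases and only queue a refresh job when `enqueueRefresh` is set, which keeps the request path free of scraping work.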
The cold path: scraping behind anti-bot
Hermès, like every luxury site, fronts its category pages with commercial bot mitigation (DataDome). A naïve fetch is dead on arrival. The scraper layer is the most defensive code in the project.
| Defense | Implementation |
|---|---|
| Browser fingerprint | playwright-extra + stealth plugin; real Chrome (channel: 'chrome') |
| IP rotation | Per-region pool; new browser session = new outbound IP |
| Human simulation | Random mouse moves, smooth scroll, jittered delays |
| Session warmup | Hit homepage first to acquire DataDome cookie before category |
| Soft-block detection | Empty /product/link list → throw, retry with new IP (don't blame the proxy) |
| Hard-block detection | "Access Denied" / "temporarily restricted" → mark proxy dead 30 min |
| Cooldown | ≥15 s between requests, module-level mutex |
One subtle decision: soft blocks rotate the IP without penalising the proxy. Most soft blocks are probabilistic: DataDome flags the new session as suspicious for reasons unrelated to the IP's reputation. Marking the proxy dead would shrink the pool every time we got unlucky.
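The soft-block versus hard-block distinction can be sketched as a classifier plus a small pool that only penalises hard blocks. The names `classify`, `ProxyPool`, and `report` are hypothetical; the detection signals and the 30-minute penalty are the ones listed in the table above.

```typescript
type Verdict = "ok" | "soft-block" | "hard-block";

// Classify a scraped page using the two signals from the defense table.
function classify(bodyText: string, productLinks: string[]): Verdict {
  const lower = bodyText.toLowerCase();
  if (lower.includes("access denied") || lower.includes("temporarily restricted")) {
    return "hard-block";
  }
  // An empty product-link list without a block message is treated as a soft block.
  return productLinks.length === 0 ? "soft-block" : "ok";
}

const DEAD_MS = 30 * 60 * 1000; // hard-blocked proxies sit out for 30 minutes

class ProxyPool {
  private proxies: string[];
  private deadUntil = new Map<string, number>();

  constructor(proxies: string[]) {
    this.proxies = proxies;
  }

  // Proxies usable at time `now` (epoch ms).
  alive(now: number): string[] {
    return this.proxies.filter((p) => (this.deadUntil.get(p) ?? 0) <= now);
  }

  // Record a verdict. Only a hard block removes the proxy from rotation;
  // a soft block leaves it available for a later session.
  report(proxy: string, verdict: Verdict, now: number): void {
    if (verdict === "hard-block") {
      this.deadUntil.set(proxy, now + DEAD_MS);
    }
  }
}
```

With this split, a run of unlucky soft blocks costs retries but never shrinks the pool.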
BullMQ orchestration
Scrapes run as BullMQ jobs against a single worker (concurrency 1) so we never hammer Hermès in parallel. The schedule is dead simple:
- `refresh-all`: repeatable every 24 h, plus one boot job on process start.
- That job fans out one `scrape` child job per registered (brand, region) pair.
- Retries: 3 attempts, exponential backoff (5 s base). Stalled-job detection at 5 min, lock duration 10 min, hard execution timeout 8 min via `Promise.race`.
- `SIGTERM`/`SIGINT` handlers close the worker cleanly so deploys never leave a dangling lock.
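Two of the numbers above are easy to make concrete: the retry delays and the hard timeout. The helpers below are a sketch with hypothetical names (`retryDelaysMs`, `withTimeout`). The doubling schedule is an assumption about how the exponential backoff is computed, and the `Promise.race` wrapper shows the general pattern rather than the worker's actual code.

```typescript
// Delay before each retry under a doubling schedule: base, 2*base, 4*base, ...
// With `attempts` total tries there are `attempts - 1` retries.
function retryDelaysMs(attempts: number, baseMs: number): number[] {
  const delays: number[] = [];
  for (let retry = 1; retry < attempts; retry++) {
    delays.push(baseMs * 2 ** (retry - 1));
  }
  return delays;
}

// Reject if `work` has not settled within `ms`. The timer is cleared either
// way so a finished job does not keep the process alive.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Example wiring, assuming a hypothetical scrape function:
//   await withTimeout(scrapeCategory(brand, region), 8 * 60 * 1000);
```

Under that assumed schedule, 3 attempts with a 5 s base wait 5 s and then 10 s between tries. Note that `Promise.race` only stops waiting; it does not cancel the underlying work, so the scraper still needs to close its browser when the timeout fires. Keeping the 8-minute timeout below the 10-minute lock duration means a hung scrape fails on its own before the lock can lapse.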
Where x402 fits
x402 sits at exactly two seams:
- Per-request payments on `/api/v1/monitor` via `@x402/next` with a $0.001 USDC quote. The route handler never sees a wallet key; the CDP facilitator settles on Base mainnet.
- Session purchase on `/api/v1/sessions`: pay $0.01 USDC once, get a 48-char hex token good for 10 monitor queries via X-Session-Key. Trades latency for predictability when an agent has bursty demand.
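A 48-character hex token corresponds to 24 random bytes. The sketch below shows one way to mint such a session and a Lua script that spends a credit atomically. Both are assumptions for illustration: `newSession` and `SPEND_CREDIT_LUA` are hypothetical names, and the script assumes the credit count is stored as a plain integer under the session key, which the service may not do.

```typescript
import { randomBytes } from "node:crypto";

interface Session {
  token: string;   // 48 hex characters
  credits: number; // monitor queries remaining
}

// Mint a session with a fresh random token. 24 bytes encode to 48 hex chars.
function newSession(credits = 10): Session {
  return { token: randomBytes(24).toString("hex"), credits };
}

// Hypothetical Lua script for an atomic spend, assuming the key holds an
// integer credit count. Returns -1 when no credits remain, else the new count.
const SPEND_CREDIT_LUA = `
local left = tonumber(redis.call("GET", KEYS[1]) or "0")
if left <= 0 then return -1 end
return redis.call("DECR", KEYS[1])
`;
```

Running the check and the decrement inside one script is what keeps two concurrent requests from both spending the last credit.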
Both routes ship extensions.bazaar metadata with tags, regions, and OpenAPI link, so the moment a real settlement clears the facilitator the service self-registers in the x402 Bazaar discovery index.
What this buys us
- ~50 ms p50 on free reads, ~600 ms on paid. All five middleware layers are Redis-bound.
- Scraper outage ≠ user-facing outage. SWR keeps cache live for 7 days; agents get a slightly older `last_update` instead of an error.
- No long-lived secrets. No API keys to issue, rotate, or revoke. The merchant key never touches request paths; only the facilitator broadcasts transactions.
- Single-binary ops. One Node process runs Next.js + the BullMQ worker. PM2 plus Nginx plus a Redis instance is the entire production footprint.
Read next
- Anatomy of an x402 payment — what happens between "agent calls fetchWithPayment" and "data lands".
- Why x402, not API keys — the higher-level case for switching auth models.
- OpenAPI 3.1 spec — paste into your agent's tool description.