Why browser-using agents are eating manual research
Most knowledge work hidden in browsers — pricing pages, LinkedIn profiles, government registries — is repetitive enough that a small browser-controlling AI agent can swallow hours of human work per day. The 2026 stack makes this easy.
The minimal stack
- Playwright for the browser layer (Chromium headless, anti-bot friendly).
- Claude Sonnet 4.6 as the planner and DOM reader.
- A tiny Node.js orchestrator exposing two tools to the model: `navigate(url)` and `extract(selector_hint)`.
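In the Anthropic Messages API, those two tools might be declared like this. A minimal sketch: the descriptions and parameter shapes are illustrative assumptions, not a fixed contract — only `navigate` and `extract` come from the stack above.

```javascript
// Tool definitions in the Anthropic Messages API tool-use format.
// Keep names and schemas short: the model uses terse tools more reliably.
const tools = [
  {
    name: "navigate",
    description: "Open a URL in the controlled browser tab.",
    input_schema: {
      type: "object",
      properties: {
        url: { type: "string", description: "Absolute URL to open" },
      },
      required: ["url"],
    },
  },
  {
    name: "extract",
    description:
      "Extract text from the current page. The hint is a CSS selector or a short description of the target element.",
    input_schema: {
      type: "object",
      properties: {
        selector_hint: { type: "string" },
      },
      required: ["selector_hint"],
    },
  },
];
```

The orchestrator passes this array on every Messages API call and maps the returned `tool_use` block back onto the matching browser action.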
The loop
```
while not done:
    snapshot = page.accessibility_snapshot()
    decision = claude.next_step(goal, snapshot, history)
    run_tool(decision)
```

The trick is feeding the model the accessibility tree, not raw HTML — it's 10x smaller, structured, and lines up with how the model reasons about UI.
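A minimal sketch of that loop as the Node.js orchestrator. The `claude.nextStep` helper and the `{ tool, input }` decision shape are assumptions (in practice you would parse the model's `tool_use` block); `page.goto`, `page.textContent`, and `page.accessibility.snapshot()` are real Playwright calls, though the accessibility API is deprecated in recent Playwright releases in favor of ARIA snapshots.

```javascript
// Dispatch one model decision onto the matching browser tool.
// Assumed decision shape: { tool: "navigate" | "extract" | "done", input: {...} }
async function runTool(decision, page) {
  switch (decision.tool) {
    case "navigate":
      await page.goto(decision.input.url);
      return { ok: true };
    case "extract":
      return { ok: true, text: await page.textContent(decision.input.selector_hint) };
    case "done":
      return { ok: true, done: true };
    default:
      return { ok: false, error: `unknown tool: ${decision.tool}` };
  }
}

// Outer loop: snapshot -> model decision -> tool call, until the model
// signals "done" or a tool call fails. Returns the step history.
async function runAgent(page, claude, goal) {
  const history = [];
  for (;;) {
    const snapshot = await page.accessibility.snapshot();
    const decision = await claude.nextStep(goal, snapshot, history);
    const result = await runTool(decision, page);
    history.push({ decision, result });
    if (result.done || !result.ok) return history;
  }
}
```

Returning the history on failure instead of throwing keeps every partial run inspectable, which matters when you are debugging why the model wandered off the task.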
Production gotchas
- Rate-limit your own agent. Models will happily click 80 buttons per minute. Sites won't love that.
- Add a human-approval step for any write action (form submits, purchases, messaging).
- Persist sessions. Re-logging in costs tokens and triggers CAPTCHAs.
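The first gotcha is the easiest to enforce mechanically: wrap every tool call in a small throttle so the agent can never act faster than a floor interval, no matter how quickly the model decides. A sketch; the 2-second floor is an arbitrary starting point, not a recommendation.

```javascript
// Minimal per-agent throttle: guarantees at least `minIntervalMs`
// between the *starts* of consecutive actions.
function makeThrottle(minIntervalMs) {
  let last = 0;
  return async function throttled(fn) {
    const wait = last + minIntervalMs - Date.now();
    if (wait > 0) await new Promise((resolve) => setTimeout(resolve, wait));
    last = Date.now();
    return fn();
  };
}

// Usage in the loop, assuming the runTool dispatcher from earlier:
// const throttled = makeThrottle(2000);
// const result = await throttled(() => runTool(decision, page));
```

Because the throttle lives in the orchestrator rather than the prompt, it holds even when the model ignores pacing instructions.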
We've shipped variants of this for lead enrichment, marketplace monitoring and competitive-pricing dashboards. Want one for your team? See our automation services.