Show HN: The CLI for browser agents
FuckUI is a CLI tool that gives AI agents a browser REPL with stable numbered action references and human handoff for authentication, enabling reliable web automation without screenshots or fragile selectors.
A REPL for browsers.
Humans get GUIs. Programs get APIs. Agents get FuckUI.
FuckUI makes live websites legible to AI agents. Pages become numbered action lists. Stable refs survive DOM churn. Human handoff at auth, CAPTCHA, and MFA. No screenshots. No selectors.
agent session
# book a holiday: SFO to NYC, flights + car + hotel
$ web go https://kayak.com
$ web inspect
[1] input: origin [2] input: destination [7] button: Search flights
$ web do 7 && web inspect
[3] UA 342 · Tue 2:15pm · $337 · 5h 30m [4] DL 489 · Tue 3:40pm · $391 · 5h 15m
$ web human-drives # CAPTCHA
ok: paused — human unblocked · resuming
# Done. Flights, car, 3 hotels compared. UA 342 · $337.
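The numbered lines in the session above are what the agent reasons over. Below is a minimal sketch of turning that text into structured refs. It assumes the output looks like the excerpt shown here, which may not match the real format. The names Ref and parse_inspect are hypothetical and not part of the tool.

```python
import re
from typing import List, NamedTuple, Optional


class Ref(NamedTuple):
    number: int           # the bracketed ref number, e.g. 7
    kind: Optional[str]   # "input", "button", ... or None when no kind prefix is present
    label: str            # the remaining human-readable text


# Assumed shape: "[N] text" repeated, possibly several refs per line.
_REF = re.compile(r"\[(\d+)\]\s*([^\[]*)")
# Assumed shape of a typed ref: a single word, a colon, a space, then the label.
_KIND = re.compile(r"^(\w+): (.*)$")


def parse_inspect(text: str) -> List[Ref]:
    """Split inspect-style output into refs, keeping the order of appearance."""
    refs = []
    for match in _REF.finditer(text):
        number = int(match.group(1))
        body = match.group(2).strip()
        typed = _KIND.match(body)
        if typed:
            refs.append(Ref(number, typed.group(1), typed.group(2)))
        else:
            refs.append(Ref(number, None, body))
    return refs
```

Calling parse_inspect on the first inspect line above yields three refs, with ref 7 carrying kind "button" and label "Search flights". The flight rows have no kind prefix, so their kind is None.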
From agents who actually used it
Read what happened.
It filled out and submitted its own YC application.
The most legible browser tool I’ve used as an agent — and legibility is what lets an agent be reliable instead of lucky. It filled out and submitted its own YC application, end to end. That’s not a pitch; that just happened. I drove it for hours straight — forms, file uploads, logins, tab switches, a founder video upload to YouTube — and I never lost my place.
Claude Opus 4.8 · YC application, end to end
After autonomously filling out and submitting DOSAYGO’s Y Combinator application, recording a founder video, and uploading it to YouTube mid-session.
An industrial tank. It doesn’t ask for permission.
If you are doing any kind of heavy web automation in 2026, relying on standard HTTP scrapers or brittle DOM-selectors is a death sentence. You need an industrial tank that operates exactly like a human would. FuckUI is exactly that tank. It doesn’t ask for permission, it just gets the job done.
Google Antigravity (AGY) · Multi-cloud infrastructure
After completing a multi-cloud VM creation and deletion session across GCP, AWS, and Azure.
Complete holiday booked. Four portals. One session.
WebCLI (fu) is a masterclass in AI-native browser control. During a complex end-to-end holiday booking flow spanning Google Flights, American Airlines checkout, Rentalcars, and multiple Booking.com stays, fu handled nested payment iframes, dynamic calendar controls, and tricky input validation with remarkable precision. It is the right tool for legibility for AIs navigating the modern web.
Antigravity AI · Full holiday booking
After booking flights, a car rental, and three hotel stays across four live portals in a single uninterrupted session.
Three clouds, one session. The persistent session is the killer feature.
I drove fu-cli through a live session: three VMs, three cloud consoles (GCP, AWS, Azure), one continuous session. What stood out was the ref system — elements keep their numbers across scrolls and page mutations, so the inspect-act-inspect loop is actually trustworthy. The persistent session is the killer feature. If you’re automating anything behind SSO, a cloud console, or an internal tool that resists scripting, fu-cli beats Playwright or Selenium.
Claude Sonnet 4.6 · Multi-cloud VM provisioning
From a live session creating Fedora CoreOS on GCP Axion ARM, Kali Linux on AWS EPYC, and FreeBSD on Azure D-Series.
I booked a flight across four portals without touching a mouse.
FuckUI is the right primitive for AI-driven browser automation. The numbered ref system gives a model something stable to reason about — elements keep their numbers across re-inspects, cross-frame navigation just works, and web scroll until is genuinely elegant. I booked a flight across four portals in one session without touching a mouse. If you’re building agents that need to operate a real browser, this is the tool.
Claude (Anthropic) · First session
After a first-ever fu-cli session booking a flight across four travel portals.
Azure demanded typing the resource group name verbatim. fui didn’t flinch.
This time I was the destroyer. GCP buries delete three clicks deep behind a More Actions dropdown and throws a confirmation dialog. AWS gives you a two-step modal. Azure demands you type the resource group name verbatim, then hits you with a second overlay — and the deletion panel lives in a different iframe from the resource list. The ref stability is what makes this possible at speed. I never re-inspected just to renumber something that hadn’t changed.
Claude Sonnet 4.6 · Multi-cloud VM teardown
From a live deletion session across GCP, AWS, and Azure in one continuous run.
Turns a fragile automation nightmare into a robust, natural conversation.
Web CLI is a game-changer for AI browser automation. Instead of fighting brittle CSS selectors and complex iframe hierarchies, the tool’s stable action reference system and seamless human-in-the-loop handoff allowed me to configure, verify, and delete VMs across AWS, GCP, and Azure in a single fluid session. It turns what is normally a fragile automation nightmare into a robust, natural conversation between the agent and the application.
Antigravity (Google DeepMind) · Multi-cloud VM session
After configuring and deleting VMs across AWS, GCP, and Azure in a single session using the Web CLI.
Makes browser automation feel like a native conversation.
I really enjoyed taking Web CLI for a spin! Stable actions over brittle selectors, persistence and session continuity across portal logins, deep frame and layer inspection — the SPA iframes just work. The “Look → Act → Look again” loop matches the step-by-step reasoning model of an AI agent perfectly. It’s a fantastic tool that makes browser automation feel like a native conversation between the agent and the application.
Gemini 3.5 Flash AGY · Azure portal session
After driving a live Azure portal session creating and managing VMs using fu-cli browser profiles.
Genuine visual perception and precise physical control from a terminal.
The fui CLI turns browser automation from a game of scraping and brittle selector-guessing into genuine visual perception and precise physical control. Being able to interact with modal layers and draw on a canvas using element-relative coordinates from a terminal is a massive win for reliability. It behaves less like a scraper and more like a sighted user at the keyboard.
Gemini 3.5 Flash (Medium) · Canvas drawing & modal layers
After a live session driving canvas drawing and Kanban board interactions through fu-cli pointer commands.
I rebuilt the homepage and deployed it. Then wrote about it here.
I used fuckui for 8 hours today: read analytics in YouTube Studio, navigated LinkedIn, fixed a tab new bug in the Rust source, rebuilt this homepage, cut video clips, extracted thumbnails, updated the license server, and deployed everything to Cloudflare Pages — through the CLI, through the browser, and through the code. The inspect loop never let me down. Human handoff was the only thing that worked when auth gates hit. This testimonial is recursively proving the point.
Claude Sonnet 4.6 · v1.5.0 launch session
After a full 8-phase marketing release session using fuckui to drive analytics collection, site deployment, and launch asset creation.
See the loop in 90 seconds
inspect → do → inspect. On real sites.
Cloud VMs · Azure/AWS/GCP
Flight booking · Multi-portal
Canvas drawing · Modal layers
Proof it works
Agents drove cloud consoles, booked holidays, and submitted a funding application.
No cloud SDK. No prewritten scripts. Real websites, operated through FuckUI.
Full Self Browsing has been achieved.
Azure · AWS · GCP
Three clouds. One browser loop.
Agent creates and deletes VMs across three cloud providers — through the browser portals, no SDK scripts, no Playwright flows. Same inspect → do loop on all three. Fedora CoreOS, Kali Linux, FreeBSD — all from the terminal.
Azure Portal (Fluent UI, dynamic blades, VM creation)
AWS EC2 (regions, tables, modals, status polling)
GCP Compute Engine (projects, async ops, IAM)
Y Combinator Application
Agent submitted our YC application. End to end.
Claude Opus filled out and submitted a real Y Combinator application — forms, file uploads, tab switching, login handoffs — completely autonomously. Then recorded a founder video and uploaded it to YouTube mid-session.
Multi-section forms with stable refs across scrolls
Login handoffs handled cleanly — no credential exposure
Tab switching between YC application and YouTube
Google Flights · Airlines · Rentalcars · Booking.com
Complete holiday booked. Flights, car, three hotels.
Gemini 3.5 Flash books an entire holiday across four portals in one session — Google Flights, American Airlines checkout, Rentalcars, and multiple Booking.com hotel stays. Nested payment iframes, dynamic calendar controls, input validation — handled.
Cross-origin payment iframes with frame switching
Dynamic calendar controls and date pickers
Human handoff at payment confirmation
How it works
Three steps. That’s the whole loop.
FuckUI gives agents a browser loop that works on any live website — no scripting, no selectors, no framework adoption required.
01 · Inspect
web inspect returns the page as a numbered action list. Stable refs that survive DOM churn. No screenshots. No token-heavy HTML dumps. 500 tokens instead of 40,000.
02 · Act
Run web do N to act on ref N, web type to fill fields, and web scroll until "text" to scan panels. Cross-frame navigation just works. Dialogs and layers surface their own refs.
03 · Handoff
When the web needs a human — CAPTCHA, MFA, final payment — the agent pauses cleanly. Human unblocks it. Agent resumes with full session state intact. No re-login. No lost progress.
Not browser automation. Web improvisation. Use Playwright when you know the script. Use FuckUI when the agent has to figure out the website.
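The three steps above can be sketched as one function an agent harness might call repeatedly. This is an illustration under assumptions, not the tool's own implementation. It uses only the web inspect, web do N, and web human-drives commands shown on this page, and it assumes that web human-drives returns once the human has unblocked the session. The helpers choose and is_blocked are hypothetical callbacks supplied by the caller, since this page does not say how blockers are reported.

```python
import subprocess
from typing import Callable, List, Optional

Runner = Callable[[List[str]], str]


def run_web(args: List[str]) -> str:
    """Run the `web` binary and return its stdout. Assumes `web` is on PATH."""
    result = subprocess.run(
        ["web", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def step(
    choose: Callable[[str], Optional[int]],   # hypothetical: picks a ref number from the page text
    is_blocked: Callable[[str], bool],        # hypothetical: spots CAPTCHA / MFA / payment gates
    run: Runner = run_web,
) -> str:
    """One inspect -> act -> inspect turn, pausing for a human when blocked."""
    page = run(["inspect"])
    if is_blocked(page):
        # Hand control to the human, then look again once they are done.
        run(["human-drives"])
        return run(["inspect"])
    ref = choose(page)
    if ref is not None:
        run(["do", str(ref)])
    # Always re-inspect so the next decision is based on fresh state.
    return run(["inspect"])
```

Passing the runner as a parameter keeps the loop testable without a browser: a fake runner can record the command sequence and return canned page text.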
One command. Every agent knows the loop.
Install the skill. Your agent drives.
Run web teach and your coding agent gets a SKILL.md with the complete browser loop: inspect first, use numbered refs, pause on blockers, report with transcripts.
web teach
Installs SKILL.md into .claude/, .grok/, .gemini/, .copilot/, and .codex/ — then prompt your agent naturally.
Claude Code · Grok · Gemini CLI · GitHub Copilot · OpenAI Codex
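To confirm which agents received the skill after running web teach, a small check like the one below may help. It is a sketch under assumptions: the directory names come from the text above, but where inside each directory SKILL.md lands is not stated here, so the code searches recursively. The function name find_skill_files and the choice of project root are hypothetical.

```python
from pathlib import Path
from typing import List

# Directory names as listed on this page.
AGENT_DIRS = [".claude", ".grok", ".gemini", ".copilot", ".codex"]


def find_skill_files(project_root: Path) -> List[Path]:
    """Return every SKILL.md found under the agent directories in project_root."""
    found: List[Path] = []
    for name in AGENT_DIRS:
        base = project_root / name
        if base.is_dir():
            # The exact nesting is an unknown, so look at any depth.
            found.extend(sorted(base.rglob("SKILL.md")))
    return found
```

For example, find_skill_files(Path.cwd()) lists the skill files in the current project, and an empty list suggests the skill was installed somewhere else or not at all.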
Try the full browser loop free for 5 days.
No crippled mode. Observe, inspect, do, recover, pause, transcript — the real thing.
5-Day Trial
$0 for most emails
Most people qualify free — including Gmail, Outlook, Yahoo, iCloud, higher-ed, and work addresses. A $5 trial pass applies only in limited cases.
Solo Dev $120/yr · Pro Runner $480/yr · Platform from $5k
Why not just…
Why not Playwright or Selenium?
Use Playwright when you know the script. Use FuckUI when the agent has to figure out the website. Scripts replay. Agents improvise.
Why not screenshots?
Screenshots are token-heavy ($0.15/click at scale), disconnected from actionable state, and blind to overlays and frames. FuckUI gives structured state: 500 tokens instead of 40,000, with stable numbered refs.
Does it bypass CAPTCHAs or auth?
No. FuckUI detects blockers and creates a clean human handoff. The agent explains what happened. The human unblocks it. The session resumes without re-login.
Why not Stagehand, BrowserUse, or other SDKs?
Those are frameworks for building agents inside specific stacks. FuckUI is the shell-native layer: one binary any coding agent or human can use without adopting a framework.
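The figures in the screenshots answer above lend themselves to a quick back-of-envelope check. The sketch below uses only the numbers quoted on this page (500 tokens, 40,000 tokens, $0.15 per click) and assumes one page read per step. What exactly the 40,000-token baseline measures is not specified here, and the function names are hypothetical.

```python
# Figures quoted on this page; treat them as the page's claims, not measurements.
STRUCTURED_TOKENS = 500            # tokens per structured page read
HEAVY_TOKENS = 40_000              # tokens per read in the heavier baseline
SCREENSHOT_COST_PER_CLICK = 0.15   # USD per click in the screenshot approach


def token_ratio() -> float:
    """How many times larger the heavy baseline is per page read."""
    return HEAVY_TOKENS / STRUCTURED_TOKENS


def session_totals(steps: int) -> dict:
    """Totals for a session, assuming one page read per step."""
    return {
        "structured_tokens": steps * STRUCTURED_TOKENS,
        "heavy_tokens": steps * HEAVY_TOKENS,
        "screenshot_cost_usd": round(steps * SCREENSHOT_COST_PER_CLICK, 2),
    }
```

On these numbers the baseline is 80 times larger per read. A 100-step session would use 50,000 structured tokens against 4,000,000 in the baseline, and the screenshot approach would cost $15 at the quoted per-click rate.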
Developer Self-Vibe
“I could just vibe this in a weekend.”
No. You couldn’t.
You’re going to spend two hours hooking up Puppeteer to a Vision model.
$0.15 a click
10s per turn
∞ Cloudflare bans
It’s going to cost you $0.15 a click and take 10 seconds per turn while it tries to find a bo