
Show HN: Open-source browser for AI agents

71 points - today at 2:39 PM


Hi HN, I forked Chromium and built agent-browser-protocol (ABP) after noticing that most browser-agent failures aren’t really about the model misunderstanding the page. Instead, the problem is that the model is reasoning from stale state.

ABP is designed to keep the acting agent synchronized with the browser at every step. After each action (click, type, etc.), it freezes JavaScript execution and rendering, then captures the resulting state. It also compiles the notable events that occurred during that action loop, such as navigation, file pickers, permission prompts, alerts, and downloads, and sends them back to the agent along with a screenshot of the frozen page state.

The result is that browser interaction starts to feel more like a multimodal chat loop. The agent takes an action, gets back a fresh visual state and a structured summary of what happened, then decides what to do next from there. That fits much better with how LLMs already work.
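A minimal sketch of that loop in Python, to make the ordering concrete. This is not ABP's actual API: `FakeBrowser`, `StepResult`, `step`, and the event strings are hypothetical names that only model the "resume, act, freeze, capture" sequence described above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StepResult:
    """What the agent receives after one action (hypothetical shape)."""
    screenshot: bytes   # image of the frozen page
    events: List[str]   # notable events seen during the action


@dataclass
class FakeBrowser:
    """Stand-in for a real browser; it only models the freeze/capture ordering."""
    frozen: bool = True
    page: str = "home"
    pending_events: List[str] = field(default_factory=list)

    def step(self, action: Callable[["FakeBrowser"], None]) -> StepResult:
        self.frozen = False   # resume JavaScript and rendering
        action(self)          # perform the click, keystroke, etc.
        self.frozen = True    # freeze again before anything is captured
        # Hand over the events collected during this action and reset the buffer.
        events, self.pending_events = self.pending_events, []
        screenshot = f"<png of {self.page}>".encode()
        return StepResult(screenshot=screenshot, events=events)


def click_login(browser: FakeBrowser) -> None:
    """Hypothetical action: a click that causes a navigation."""
    browser.page = "login"
    browser.pending_events.append("navigation:/login")


browser = FakeBrowser()
result = browser.step(click_login)
print(result.events)  # ['navigation:/login']
```

Because the page is frozen before the screenshot is taken, the state the agent reasons about is the state it will act on next.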

A few common browser-use failures ABP helps eliminate:

* A modal appears after the last Playwright screenshot and blocks the input the agent was about to use
* Dynamic filters cause the page to reflow between steps
* An autocomplete dropdown opens and covers the element the agent intended to click
* alert() / confirm() interrupts the flow
* Downloads are triggered, but the agent has no reliable way to know when they’ve completed
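To illustrate how a structured event summary helps with cases like these, here is a rough sketch of an agent-side decision function. The event names (`dialog`, `download_started`, `download_completed`, `navigation`) and the returned labels are assumptions made up for this example, not ABP's real event schema.

```python
from typing import List


def next_move(events: List[str]) -> str:
    """Pick a follow-up based on the events reported for the last action.

    Events are assumed to look like "kind:detail", e.g. "dialog:confirm".
    """
    kinds = {event.split(":", 1)[0] for event in events}
    if "dialog" in kinds:
        # An alert()/confirm() interrupted the flow; deal with it first.
        return "handle_dialog"
    if "download_started" in kinds and "download_completed" not in kinds:
        # A download is still in flight; do not assume the file exists yet.
        return "wait_for_download"
    if "navigation" in kinds:
        # The page changed; earlier element positions are no longer valid.
        return "reread_page"
    return "continue"


print(next_move(["download_started:report.csv"]))  # wait_for_download
```

The point is that the agent branches on what was reported for the step instead of inferring it from a screenshot that may already be out of date.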

As proof, ABP with opus 4.6 as the driver scores 90.5% on the Online Mind2Web benchmark. I think modern LLMs already understand websites; they just need a better tool to interact with them. Happy to answer questions about the architecture, forking Chrome, or anything else in the comments below.

Try it out: `claude mcp add browser -- npx -y agent-browser-protocol --mcp` (Codex/OpenCode instructions in the docs)

Demo video: https://www.loom.com/share/387f6349196f417d8b4b16a5452c3369

Source
  • Retr0id

    today at 4:03 PM

    > As proof, ABP with opus 4.6 as the driver scores 90.5% on the Online Mind2Web benchmark

    And what does opus score with "regular" browser harnesses?

      • 9wzYQbTYsAIc

        today at 5:05 PM

        90% easy or 90% average?

          • theredsix

            today at 5:25 PM

            90% average with 85.51% hard!

              • 9wzYQbTYsAIc

                today at 5:28 PM

                Nice! Will take a look at this for my homelab - was debating using crawl.cloudflare.com to try it out, as browser rendering was my next stretch goal.

        • esafak

          today at 4:19 PM

          https://huggingface.co/spaces/osunlp/Online_Mind2Web_Leaderb...

            • Retr0id

              today at 4:31 PM

              Hm I can't see Opus 4.6 on there

    • giancarlostoro

      today at 3:59 PM

      Interesting, I wonder if this would help with other projects too. One that comes to mind is ArchiveBox. I don't know if they still have the issue I'm thinking of, but its Chrome instances would eventually (as the meme goes) consume all available RAM. If freezing execution could stop that, it could be useful for more than just AI agents.

        • theredsix

          today at 5:32 PM

          Yeah, I noticed CPU use goes to near zero during the pausing phase. You can also trigger pause via REST/MCP so a script can take advantage of these abilities as well.

      • gregpr07

        today at 5:17 PM

        Love it! From first principles: this kinda answers the "do we really even need CDP" I always have in my head building browser use...

          • theredsix

            today at 5:25 PM

            Totally, I feel that CDP was designed for a different category of automations.

        • theredsix

          today at 2:40 PM

          OP here, happy to answer any questions!

        • octoclaw

          today at 6:03 PM

          [dead]

          • bhekanik

            today at 5:02 PM

            [dead]

            • webpolis

              today at 6:27 PM

              [dead]

                • sebmellen

                  today at 7:46 PM

                  Does it feel good to be botting HN with ads for your own product?

                  I'm so sick of reading OpenClaw comments! No activity for 7 months, and then in the past day, five comments from an LLM pitching your tool. What are you doing man? This degrades the quality of HN so badly.

                  • theredsix

                    today at 6:35 PM

                    Great insight! ABP exposes display resolution controls right now. I've noticed almost zero reCAPTCHAs during testing compared to puppeteer stealth or other packages. Regarding the freezing mechanic, virtual time is paused as well and the entire browser clock is captured, so it would be very hard for a page's JavaScript to notice the time drift unless it were querying an external API clock.