Show HN: PageAgent, A GUI agent that lives inside your web app

62 points - today at 5:01 PM

Hi HN,

I'm building PageAgent, an open-source (MIT) library that embeds an AI agent directly into your frontend.

I built this because I believe there's a massive design space for deploying general agents natively inside the web apps we already use, rather than treating the web merely as a dumb target for isolated bots.

Currently, most AI agents operate from external clients or server-side programs, effectively leaving web development out of the AI ecosystem. I'm experimenting with an "inside-out" paradigm instead. By dropping the library into a page, you get a client-side agent that interacts natively with the live DOM tree and inherits the user's active session out of the box, which works perfectly for SPAs.

To handle cross-page tasks, I built an optional browser extension that acts as a "bridge". This allows the web-page agent to control the entire browser with explicit user authorization. Instead of a desktop app controlling your browser, your web app is empowered to act as a general agent that can navigate the broader web.

I'd love to start a conversation about the viability of this architecture, and what you all think about the future of in-app general agents. Happy to answer any questions!

moehj
today at 10:49 PM
"Interesting architecture — embedding the agent inside the app context rather than outside it makes sense for session-aware tasks. One question: how do you handle output validation before the agent acts on the DOM? Client-side agents acting on live state without a certification layer seems like a reliability risk in production. We've been building ARU (aru-runtime.com) as a runtime certification layer for exactly this — curious if you've thought about that boundary."
simon_luv_pho
today at 5:07 PM
This is highly experimental right now, but here are some quick links for anyone wanting to dig deeper:
- GitHub: https://github.com/alibaba/page-agent
- Live Demo (No sign-up): https://alibaba.github.io/page-agent/ (you can drag the bookmarklet from here to try it on other sites)
- Browser Extension: https://chromewebstore.google.com/detail/page-agent-ext/akld...
I'd be really interested in feedback on the security model of giving client-side agents extension-bridge access, and I'm happy to take questions on the implementation!
mentalgear
today at 6:59 PM
> Data processed via servers in Mainland China
Appreciate the transparency, but maybe you could add some alternatives, preferably European?
general_reveal
today at 7:18 PM
I’ve been thinking about something like this. If it’s just a one-line script import, how the heck are you trusting natural language to translate to commands for an arbitrary UI?
The only thing I can think of is that you had the AI rewrite the entire build file to embed selectors, and you work with that?
jadbox
today at 10:21 PM
Firefox support?
dzink
today at 6:53 PM
Is this affiliated with the Chinese company Alibaba? Any chance data goes there too?
pscanf
today at 5:59 PM
Very cool!
I'm particularly impressed by the bookmark "trick" to install it on a page. Despite having spent 15 years developing for the browser, I had somehow missed that feature of the bookmarks bar. But awesome UX for people to try out the tool. Congrats!
Mnexium
today at 7:21 PM
Curious - how does it perform with captchas and other "are you human" stuff on the web?
coreylane
today at 6:52 PM
Looks cool! Are you open to adding AWS Bedrock or LiteLLM support?
MeteorMarc
today at 6:41 PM
Confusing name, given the existence of Pageant, the PuTTY agent.
popalchemist
today at 7:32 PM
Does it support long-click / click-and-drag?
jauntywundrkind
today at 5:43 PM
Not exactly the same, but I'd also point to Paul Kinlan's FolioLM as a very interesting project in this space. It's a very nice browser extension:
> Collect and query content from tabs, bookmarks, and history - your AI research companion. FolioLM helps you collect sources from tabs, bookmarks, and history, then query and transform that content using AI.
https://github.com/PaulKinlan/NotebookLM-Chrome https://chromewebstore.google.com/detail/foliolm/eeejhgacmlh...
