Tiled Hacker news on React Router

We replaced RAG with a virtual filesystem for our AI documentation assistant

116 points - yesterday at 6:24 PM

softwaredoug
today at 5:41 PM
The real thing I think people are rediscovering with file system based search is that there’s a type of semantic search that’s not embedding based retrieval. One that looks more like how a librarian organizes files into shelves based on the domain.
We’re rediscovering forms of search we’ve known about for decades. And it turns out they’re more interpretable to agents.
https://softwaredoug.com/blog/2026/01/08/semantic-search-wit...
sunir
today at 7:25 PM
I am really enjoying this renaissance in CLI world applications. There's so much possible.
I'm working on a related challenge which is mounting a virtual filesystem with FUSE that mirrors my Mac's actual filesystem (over a subtree like ~/source), so I can constrain the agents within that filesystem, and block destructive changes outside their repo.
I have it so every repo has its own long-lived agent. They do get excited and start changing other repos, which messes up memory.
I didn't want to create a system user per repo because that's obnoxious, so I created a single claude system user, and I am using the virtual file system to manage permissions. My gmail repo's agent can for instance change the gmail repo and the google_auth repo, but it can't change the rag repo.
Edit: I'm publishing it here. It's still under development. https://github.com/sunir/bashguard
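The per-repo permission rule described above can be sketched as a path allow-list check. This is only an illustration and not how bashguard is implemented: the `SOURCE_ROOT` path, the `ALLOWED_REPOS` table, and the `can_write` function are hypothetical names, and a real FUSE layer would also need to deal with symlinks.

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical policy: the mirrored subtree and the repos each agent may modify.
SOURCE_ROOT = PurePosixPath("/Users/me/source")
ALLOWED_REPOS = {
    "gmail": {"gmail", "google_auth"},
    "rag": {"rag"},
}


def can_write(agent: str, path: str) -> bool:
    """Return True if `agent` may modify `path` under the policy above."""
    # Collapse "." and ".." lexically so "gmail/../rag" cannot escape its repo.
    normalized = PurePosixPath(posixpath.normpath(path))
    try:
        relative = normalized.relative_to(SOURCE_ROOT)
    except ValueError:
        return False  # outside the mirrored subtree entirely
    if not relative.parts:
        return False  # the root itself does not belong to any repo
    # The first path segment under the root names the repo being touched.
    return relative.parts[0] in ALLOWED_REPOS.get(agent, set())
```

With this policy the gmail agent can write under `gmail` and `google_auth`, while a path such as `/Users/me/source/gmail/../rag/index.py` is rejected because it normalizes into the `rag` repo.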
badgersnake
today at 7:56 PM
So you did GraphRAG but your graph is a filesystem tree.
jdthedisciple
today at 7:33 PM
But SQLite is notoriously 35% faster than the filesystem [0], so why not use that?
[0] https://news.ycombinator.com/item?id=14550060
tylergetsay
today at 6:33 PM
I don't understand the additional complexity of mocking bash when they could just provide grep, ls, find, etc. tools to the LLM
seanlinehan
today at 5:42 PM
This is definitely the way. There are good use cases for real sandboxes (if your agent is executing arbitrary code, you'd better have it do so in an air-gapped environment).
But the idea of spinning up a whole VM to use unix IO primitives is way overkill. Makes way more sense to let the agent spit out unix-like tool calls and then use whatever your prod stack uses to do IO.
Galanwe
today at 6:21 PM
I am not familiar with the tech stack they use, but from an outsider point of view, I was sort of expecting some kind of fuse solution. Could someone explain why they went through a fake shell? There has to be a reason.
pboulos
today at 6:04 PM
I think this is a great approach for a startup like Mintlify. I do have skepticism around how practical this would be in some of the “messier” organisations where RAG stands to add the most value. From personal experience, getting RAG to work well in places where the structure of the organisation and the information contained therein is far from hierarchical or partition-able is a very hard task.
kenforthewin
today at 6:34 PM
I don't get it - everybody in this thread is talking about the death of vector DBs and files being all you need. The article clearly states that this is a layer on top of their existing Chroma db.
bluegatty
today at 6:35 PM
RAG should never have been represented as a context tool but rather just vector querying as a variation of search/query - and that's it.
We were bit by our own nomenclature.
Just a small variation in chosen acronym may have wrought a different outcome.
Different ways to find context are welcome, we have a long way to go!
dmix
today at 6:19 PM
This puts a lot of LLM in front of the information discovery. That would require far more sophisticated prompting and guardrails. I'd be curious to see how people architect an LLM->document approach with tool calling, rather than RAG->reranker->LLM. I'm also curious what the response times are like since it's more variable.
devops000
today at 7:44 PM
Why not a simple full text search in Postgres?
mandeepj
today at 6:08 PM
> even a minimal setup (1 vCPU, 2 GiB RAM, 5-minute session lifetime) would put us north of $70,000 a year based on Daytona's per-second sandbox pricing ($0.0504/h per vCPU, $0.0162/h per GiB RAM)
$70k?
how about if we round off one zero? Give us $7000.
That number still seems to be very high.
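For reference, the quoted rates can be turned into a per-session figure. The sketch below uses only the numbers in the quote; the implied session count is back-derived from the $70,000 total and is not stated anywhere in the thread.

```python
# Rates and session shape as quoted from the article.
VCPU_RATE_PER_HOUR = 0.0504      # USD per vCPU-hour
RAM_RATE_PER_GIB_HOUR = 0.0162   # USD per GiB-hour
VCPUS = 1
RAM_GIB = 2
SESSION_MINUTES = 5

# Hourly cost of one sandbox, then the share used by a single session.
hourly_cost = VCPUS * VCPU_RATE_PER_HOUR + RAM_GIB * RAM_RATE_PER_GIB_HOUR
session_cost = hourly_cost * SESSION_MINUTES / 60

# Back-derived, not from the source: sessions per year that would reach $70,000.
implied_sessions_per_year = 70_000 / session_cost

print(f"cost per session: ${session_cost:.4f}")
print(f"implied sessions per year: {implied_sessions_per_year:,.0f}")
```

Each session costs well under a cent at these rates, so the annual figure is driven by volume: it corresponds to roughly ten million sessions a year.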
maille
today at 5:57 PM
Let's say I want a free, local or free-tier LLM, simple solution to search information mostly from my emails and a little bit from text, doc and pdf files. Is there any tool I should try so that Ollama or Gemini can reply using my own knowledge base?
tschellenbach
today at 6:24 PM
I think generally we are going from vector based search, to agentic tool use, and hierarchy based systems like skills.
yieldcrv
today at 7:48 PM
I love the multipronged attack on RAG
RIP RAG: lasted one year as a skillset that recruiters would list on job descriptions, collectively shut down by industry professionals
dust42
today at 6:33 PM
If grep and ls do the trick, then sure you don't need RAG/embeddings. But you also don't need an LLM: a full text search in a database will be a lot more performant, faster, and use fewer resources.
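As a rough illustration of the full-text alternative, here is a minimal in-memory inverted index with boolean AND matching. It stands in for a database's full-text feature rather than reproducing one; the sample documents and the `tokenize`, `build_index`, and `search` names are made up for this sketch.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each term to the set of document ids that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return ids of documents containing every query term (boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical documentation pages.
DOCS = {
    "auth/api-keys.md": "Create an API key in the dashboard settings.",
    "auth/oauth.md": "Configure OAuth redirect URLs in settings.",
    "deploy/cli.md": "Install the CLI and run deploy.",
}
INDEX = build_index(DOCS)
print(search(INDEX, "settings api"))
```

A query is answered by set intersection over precomputed postings, with no model call involved, which is the resource argument being made here.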
HanClinto
today at 6:40 PM
> "The agent doesn't need a real filesystem; it just needs the illusion of one. Our documentation was already indexed, chunked, and stored in a Chroma database to power our search, so we built ChromaFs: a virtual filesystem that intercepts UNIX commands and translates them into queries against that same database. Session creation dropped from ~46 seconds to ~100 milliseconds, and since ChromaFs reuses infrastructure we already pay for, the marginal per-conversation compute cost is zero."
Not to be "that guy" [0], but (especially for users who aren't already in ChromaDB) -- how would this be different for us from using a RAM disk?
> "ChromaFs is built on just-bash ... a TypeScript reimplementation of bash that supports grep, cat, ls, find, and cd. just-bash exposes a pluggable IFileSystem interface, so it handles all the parsing, piping, and flag logic while ChromaFs translates every underlying filesystem call into a Chroma query."
It sounds like the expected use-case is that agents would interact with the data via standard CLI tools (grep, cat, ls, find, etc), and there is nothing Chroma-specific in the final implementation (? Do I have that right?).
The author compares the speed of the Chroma implementation against a physical HDD, but I wonder how the benchmark would compare against a RAM disk with the same information / queries?
I'm very willing to believe that Chroma would still be faster / better for X/Y/Z reason, but I would be interested in seeing it compared, since for many people who already have their data in a hierarchical tree view, I bet there could be some massive speedups by mounting the memory directories in RAM instead of HDD.
[0] - https://news.ycombinator.com/item?id=9224
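The idea in the quoted passages, shell-style commands answered from a store instead of a disk, can be sketched as follows. This is not ChromaFs and not the just-bash `IFileSystem` interface: the `DictFs` class, its method names, the `run` dispatcher, and the dict standing in for the database are all assumptions made for illustration.

```python
import shlex


class DictFs:
    """Toy virtual filesystem: paths map to document text held in a dict.

    The dict is a stand-in for a database of already indexed documents.
    """

    def __init__(self, files: dict[str, str]):
        self.files = files

    def ls(self, directory: str) -> list[str]:
        prefix = directory.rstrip("/") + "/"
        # Immediate children only: keep the first path segment after the prefix.
        names = {p[len(prefix):].split("/")[0] for p in self.files if p.startswith(prefix)}
        return sorted(names)

    def cat(self, path: str) -> str:
        return self.files[path]

    def grep(self, pattern: str, directory: str) -> list[str]:
        prefix = directory.rstrip("/") + "/"
        hits = []
        for path in sorted(self.files):
            if not path.startswith(prefix):
                continue
            for number, line in enumerate(self.files[path].splitlines(), start=1):
                if pattern in line:  # plain substring match, not a regex
                    hits.append(f"{path}:{number}:{line}")
        return hits


def run(fs: DictFs, command: str) -> str:
    """Dispatch one command string to the virtual filesystem (no pipes or flags)."""
    name, *args = shlex.split(command)
    if name == "ls":
        return "\n".join(fs.ls(args[0]))
    if name == "cat":
        return fs.cat(args[0])
    if name == "grep":
        return "\n".join(fs.grep(args[0], args[1]))
    raise ValueError(f"unsupported command: {name}")


# Hypothetical documentation tree.
fs = DictFs({
    "/docs/auth/api-keys.md": "Create an API key.\nRotate keys monthly.",
    "/docs/auth/oauth.md": "Configure OAuth.",
    "/docs/deploy.md": "Run deploy.",
})
print(run(fs, "ls /docs"))
print(run(fs, "grep keys /docs"))
```

Nothing here touches a disk, so the RAM disk question comes down to where the data already lives: a sketch like this reads from whatever store backs `files`, whereas a RAM disk needs the documents copied into it first.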
jrm4
today at 6:40 PM
Is this related to that thing where somehow the entire damn world forgot about the power of boolean (and other precise) searching?
ctxc
today at 6:25 PM
haha, sweet. One of the cooler things I've read lately
RodMiller
today at 7:54 PM
[dead]
wazionapps
today at 7:07 PM
[dead]
volume_tech
today at 6:12 PM
[dead]
hyperlambda
today at 6:17 PM
[dead]
