koakuma-chan
yesterday at 10:41 PM
Keep in mind that I am a dashboard copy-and-paste workflow user, so the following may not be the same for Cursor users or Claude Code users.
> Which of these have you used and how are they useful to you?
llms-full.txt is generally not useful to me because these files are too big and consume too many tokens. For example, Next.js has an llms-full.txt[0] which is, IIRC, around 800K tokens. I don't know how this was intended to be used. I think llms-full.txt should look like Astro's /_llms-txt/api-reference.txt; more on that later.
[0]: https://nextjs.org/docs/llms-full.txt
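(If you want to sanity-check the size yourself, a crude estimate is easy to get. This is just a sketch using the rough "~4 characters per token" rule of thumb, not a real tokenizer, so the number is only a ballpark.)

```ts
// Crude sketch: estimate how many tokens an llms-full.txt would consume.
// The 4-characters-per-token ratio is only a rough heuristic, not a tokenizer.
const text = await (await fetch("https://nextjs.org/docs/llms-full.txt")).text();
console.log(`~${Math.round(text.length / 4).toLocaleString()} tokens`);
```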
Regarding llms.txt, there is some ambiguity because the format varies in my experience, but the most common kind looks like this[1], i.e., a list of URLs, and I consider those moderately useful. My LLM cannot read URLs, so what I do is look through the llms.txt for files relevant to what I am working on and `curl -LO` them into a dedicated folder in my project (this kind of llms.txt usually lists LLM-friendly .md files). Those downloaded files are then included in the context.
[1]: https://bun.sh/llms.txt
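In case it's useful, here's a rough sketch of that download step in TypeScript (Node 18+ ESM) instead of `curl`. The folder name "llm-docs" and the assumption that the index contains absolute URLs ending in .md are mine; adjust the pattern for whatever your llms.txt actually lists.

```ts
// Rough sketch: pull the .md files listed in an llms.txt into a local folder
// that later gets included in the context. Assumes the index contains absolute
// URLs ending in .md (as this kind of llms.txt usually does).
import { mkdir, writeFile } from "node:fs/promises";
import { basename } from "node:path";

const LLMS_TXT = "https://bun.sh/llms.txt";
const OUT_DIR = "llm-docs"; // arbitrary name for the dedicated docs folder

const index = await (await fetch(LLMS_TXT)).text();
// Grab every markdown URL mentioned in the index.
const urls = [...index.matchAll(/https?:\/\/\S+\.md/g)].map((m) => m[0]);

await mkdir(OUT_DIR, { recursive: true });
for (const url of urls) {
  const body = await (await fetch(url)).text();
  await writeFile(`${OUT_DIR}/${basename(url)}`, body);
  console.log(`saved ${basename(url)}`);
}
```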
Now, what really impressed me is Astro's llms-small.txt. To be honest, it still looks a little too big and still contains some irrelevant material like "Editor setup," but I think it's already small enough to be included directly in the prompt without any additional preprocessing. I haven't seen anyone else do this (llms-small.txt) before, even though it seems like pretty low-hanging fruit.
But Astro actually has something that's, in my opinion, even better: /_llms-txt/api-reference.txt[2]. This appears to be just the API reference with no unnecessary data, and it even includes a list of common errors (something I have to maintain myself for other tools so that the LLM doesn't keep making the same mistakes over and over). It looks perfect for my dashboard copy-and-paste workflow, though I haven't actually tested it yet (I only just found it).
[2]: https://docs.astro.build/_llms-txt/api-reference.txt
> Do you think this is relevant at earlier stages of a project or only once you have tons and tons of docs?
I think this is definitely relevant at early stages, and for as long as LLMs don't have your APIs in their own knowledge (check the "knowledge cut-off" date in model descriptions). I would go as far as saying it is very important: if you don't provide this and LLMs don't already know your APIs, your library/SDK/whatever will be a pain to use when coding with LLMs.
Tips:
- Maintain an LLM-friendly list of errors that LLMs commonly make when using your thing. For example, in recent Next.js versions the `headers` function returns a Promise (it used to return the headers directly), so you now have to `await` it. It is extremely common for LLMs to omit the `await`, which breaks your app and wastes your time fixing it (sketched below). It would be really helpful if Next.js provided an LLM-friendly list of common errors like this one, and there are many others.
- Maintain an LLM-friendly list of guidelines/best practices. This can be used, among other things, to discourage LLMs from using deprecated APIs that new apps should not use. Example: in Angular you can inject dependencies into your components via constructor parameters, but that is apparently the old way; they now want you to use the `inject` function (also sketched below). So their website publishes LLM prompts[3] that list guidelines/best practices, including using `inject`.
[3]: https://angular.dev/ai/develop-with-ai
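To make the `headers` point concrete, here's a minimal sketch of a Next.js 15 route handler; the file path is hypothetical, and the commented-out line is the mistake LLMs keep making:

```ts
// app/api/ua/route.ts (hypothetical path, Next.js 15+)
import { headers } from "next/headers";

export async function GET() {
  // Common LLM mistake: `const h = headers();` with no await. In Next.js 15
  // headers() returns a Promise, so h.get(...) would fail at runtime.
  const h = await headers();
  return Response.json({ userAgent: h.get("user-agent") });
}
```

And a sketch of the Angular guideline, old style vs. the `inject` function (the component and injected service are just examples):

```ts
import { Component, inject } from "@angular/core";
import { HttpClient } from "@angular/common/http";

@Component({ selector: "app-example", standalone: true, template: "" })
export class ExampleComponent {
  // Old style that LLMs trained on older Angular tend to produce:
  //   constructor(private http: HttpClient) {}
  // Current Angular guidance prefers the inject() function:
  private http = inject(HttpClient);
}
```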
> My instinct is that many LLMs.txt become less relevant over time as AI tokens become cheaper and context becomes longer.
I don't think llms.txt will become less relevant any time in the near future. I think, as LLM capabilities increase, you will just be able to put more llms.txt content into your context. But right now, in my experience, if your prompt is longer than ~200K tokens, LLM performance degrades significantly. Keep in mind (this is just my mental model, and I am not an AI scientist) that a model description advertising, say, a 1M-token context doesn't necessarily mean its "attention" spans all 1M tokens: you can feed 1M tokens into, say, Gemini 2.5 Pro, but it doesn't work well.
colonCapitalDee
today at 12:30 AM
With Claude Code I've had great success maintaining a references folder of useful docs, cloned repos, downloaded HTML, etc. Claude Code can use its filesystem traversal tools to explore that material, and it works very well. It's amazing to be able to say something like "Figure out how to construct $OBSCURE_TYPE. This is a research task; use the library" and have it nail it.
I'm curious – how are you organizing the folder/instructing Claude Code on its layout? I'm trying to get an LLM-aided dev environment set up for an ancient application dev framework, and I'm resigned to the fact that I'm going to have to curate my own "AI source material" for it.
koakuma-chan
today at 12:44 AM
That’s not efficient though when you’re doing real work. I prefer to manage the context myself and just copy and paste everything that’s needed into the dashboard, rather than waiting for Claude to read everything it needs, which takes longer and costs more.