Levels of Agentic Engineering

53 points - today at 8:48 AM

vidimitrov
today at 8:18 PM
Level 4 is where I see the most interesting design decisions get made, and also where most practitioners take a shortcut that compounds badly later.
When the author talks about "codifying" lessons, the instinct for most people is to update the rules file. That works fine for conventions - naming patterns, library preferences, relatively stable stuff. But there's a different category of knowledge that rules files handle poorly: the why behind decisions. Not what approach was chosen, but what was rejected and why the tradeoff landed where it did.
"Never use GraphQL for this service" is a useful rule to have in CLAUDE.md. What's not there: that GraphQL was actually evaluated, got pretty far into prototyping, and was abandoned because the caching layer had been specifically tuned for REST response shapes, and the cost of changing that was higher than the benefit for the team's current scale. The agent follows the rule. It can't tell when the rule is no longer load-bearing.
The place where this reasoning fits most naturally is git history - decisions and rejections captured in commit messages, versioned alongside the code they apply to. Good engineers have always done this informally. The discipline to do it consistently enough that agents can actually retrieve and use it is what's missing, and structuring it for that purpose is genuinely underexplored territory.
At level 7, this matters more than people expect. Background agents running across sessions with no human-in-the-loop have nothing to draw on except whatever was written down. A stale rules file in that context doesn't just cause mistakes - it produces confident mistakes.
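One way the commit-message idea in this comment might be made retrievable by an agent is with structured trailers. The sketch below is only an illustration: the trailer keys (`Decision`, `Rejected`, `Reason`, `Revisit-When`) and the `parse_commit_message` helper are hypothetical names, not a git convention or anything the commenter describes.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical trailer keys; neither git nor the comment defines these.
TRAILER_RE = re.compile(r"^(Decision|Rejected|Reason|Revisit-When):\s*(.+)$")


@dataclass
class DecisionRecord:
    subject: str
    trailers: Dict[str, List[str]] = field(default_factory=dict)


def parse_commit_message(message: str) -> DecisionRecord:
    """Split a commit message into its subject line and decision trailers."""
    lines = message.strip().splitlines()
    record = DecisionRecord(subject=lines[0] if lines else "")
    for line in lines[1:]:
        match = TRAILER_RE.match(line.strip())
        if match:
            key, value = match.groups()
            record.trailers.setdefault(key, []).append(value)
    return record


# Example message modeled on the GraphQL story in the comment.
example = """Keep REST for this service

Decision: stay on REST
Rejected: GraphQL
Reason: caching layer is tuned for REST response shapes
Revisit-When: the caching layer is replaced or team scale changes
"""

record = parse_commit_message(example)
print(record.subject)
print(record.trailers)
```

With something like a `Revisit-When` line recorded next to the rule, an agent would at least have a written condition to check before treating the rule as still load-bearing.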
holtkam2
today at 8:21 PM
Level 9: agent managers running agent teams
Level 10: agent CEOs overseeing agent managers
Level 11: agent board of directors overseeing the agent CEO
Level 12: agent superintelligence - single entity doing everything
Level 13: agent superagent, agenting agency agentically, in a loop, recursively, mega agent, agentic agent agent agency super AGI agent
Level 14: A G E N T
jjmarr
today at 6:50 PM
I coded a level 8 orchestration layer in CI for code review, two months before Claude launched theirs.
It's very powerful and agents can create dynamic microbenchmarks and evaluate what data structure to use for optimal performance, among other things.
I also have validation layers that trim hallucinations with handwritten linters.
I'd love to find people to network with. Right now this is a side project at work on top of writing test coverage for a factory. I don't have anyone to talk about this stuff with so it's sad when I see blog posts talking about "hype".
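For readers wondering what a "dynamic microbenchmark" might look like in practice, here is a minimal sketch using Python's standard `timeit` module. It is not the commenter's system: the `drain_list`, `drain_deque`, and `pick_fastest` names, the workload, and the sizes are all assumptions chosen for illustration.

```python
import timeit
from collections import deque


def drain_list(n: int) -> int:
    """Pop n items from the front of a list (each pop shifts the remainder)."""
    items = list(range(n))
    total = 0
    while items:
        total += items.pop(0)
    return total


def drain_deque(n: int) -> int:
    """Pop n items from the front of a deque (constant time per pop)."""
    items = deque(range(n))
    total = 0
    while items:
        total += items.popleft()
    return total


def pick_fastest(candidates, n=2_000, repeat=3, number=5):
    """Time each candidate on the same workload; return (winner, timings)."""
    timings = {}
    for name, fn in candidates.items():
        # Taking the minimum over several repeats reduces timing noise.
        timings[name] = min(
            timeit.repeat(lambda: fn(n), repeat=repeat, number=number)
        )
    winner = min(timings, key=timings.get)
    return winner, timings


candidates = {"list": drain_list, "deque": drain_deque}
winner, timings = pick_fastest(candidates)
print(winner, timings)
```

The timings depend on the machine and the workload size, so an agent using this pattern would need to benchmark with data that resembles the real call site.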
mzg
today at 6:38 PM
As a lowly level 2 who remains skeptical of these software “dark factories” described at the top of this ladder, what I don’t understand is this:
If software engineering is enough of a solved problem that you can delegate it entirely to LLM agents, what part of it remains context-specific enough that it can’t be better solved by a general-purpose software factory product? In other words, if you’re a company that is using LLMs to develop non-AI software, and you’ve built a sufficient factory to generate that software, why don’t you start selling the factory instead of whatever you were selling before? It has a much higher TAM (all of software).
nimasadri11
today at 7:19 PM
I really like your post and agree with most things. The one thing I am not fully sure about:
> Look at your app, describe a sequence of changes out loud, and watch them happen in front of you.
The problem a lot of the time is that either you don't know what you want, or you can't communicate it (and usually you can't communicate it properly because you don't know exactly what you want). I think this is going to be the bottleneck very soon (for some people, it already is). I am curious what your thoughts are on this. Where do you see that going, and how do you think we can prepare for and address it? Or do you not see it as an issue?
politelemon
today at 6:33 PM
These are levels of gatekeeping. The items are barely related to each other. Lists like these will only promote toxicity; you should be using the tools and techniques that solve your problems and fit your comfort levels.
eikenberry
today at 6:38 PM
In my opinion there are 2 levels: human writes the code with AI assist, or AI writes the code with human assist; centaur or reverse-centaur. But this article tries to focus on the evolution of the ideas and mistakenly terms them as levels (indicating a skill ladder, as other commenters have noted) when they are more like stages that the AI ecosystem has evolved through. The article reads better if you think of it that way.
smy20011
today at 6:13 PM
I will not put it into a ladder. It implies that the higher the rank, the better. However, you want to choose the best solution for your needs.
ftkftk
today at 6:51 PM
I prefer Dan Shapiro's 5-level analogy (based on car autonomy levels) because it makes for a cleaner maturity model when discussing with people who are not as deeply immersed in the current state of the art. But there are some good overall insights in this piece, and there are enough breadcrumbs to lead to further exploration, which I appreciate. I think levels 3 and 4 should be collapsed, and the real magic starts to happen after combining 5 and 6; maybe they should be merged as well.
jackby03
today at 7:19 PM
Good taxonomy. One thing missing from most discussions at these levels is how agents discover project context — most tools still rely on vendor-specific files (CLAUDE.md, .cursorrules). Would love to see standardization at that layer too.
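To make the discovery problem concrete, here is a rough sketch of how a tool might look for context files by walking up from the working directory. The two filenames come from the comment. Everything else is an assumption for illustration and not a description of how any particular agent works, including the `find_context_files` helper, the nearest-first ordering, and the `stop_at` parameter.

```python
from pathlib import Path
from typing import List, Optional

# The filenames come from the comment; treating them as one shared lookup
# list is an assumption, not an existing standard.
CONTEXT_FILENAMES = ("CLAUDE.md", ".cursorrules")


def find_context_files(start: Path, stop_at: Optional[Path] = None) -> List[Path]:
    """Walk from `start` toward the filesystem root (or `stop_at`),
    collecting known context files, nearest directory first."""
    found: List[Path] = []
    current = start.resolve()
    stop = stop_at.resolve() if stop_at else None
    for directory in [current, *current.parents]:
        for name in CONTEXT_FILENAMES:
            candidate = directory / name
            if candidate.is_file():
                found.append(candidate)
        # Stop after checking the boundary directory, e.g. the repo root.
        if stop is not None and directory == stop:
            break
    return found


if __name__ == "__main__":
    for path in find_context_files(Path.cwd()):
        print(path)
```

A standardized layer would mostly amount to agreeing on the contents of a list like `CONTEXT_FILENAMES` and on the precedence rule when several files are found.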
efsavage
today at 6:26 PM
Yegge's list resonated a little more closely with my progression to a clumsy L8.
I think eventually 4-8 will be collapsed behind a more capable layer that can handle this stuff on its own. Maybe I tinker with MCP settings and granular control to min-max the process, but for the most part I shouldn't have to worry about it any more than I worry about how many threads my compiler is using.
dolebirchwood
today at 8:05 PM
> Voice-to-voice (thought-to-thought, maybe?) interaction with your coding agent — conversational Claude Code, not just voice-to-text input — is a natural next step.
Maybe it's just me, but I don't see the appeal in verbal dictation, especially where complexity is involved. I want to think through issues deliberately, carefully, and slowly to ensure I'm not glossing over subtle nuances. I don't find speaking to be conducive to that.
For me, the process of writing (and rewriting) gives me the time, space, and structure to more precisely articulate what I want with a more heightened degree of specificity. Being able to type at 80+ wpm probably helps as well.
sjkoelle
today at 6:12 PM
Oceania has always been context engineering. It's been interesting to see this take over the zeitgeist from "long context" over the last 6 months.
C0ldSmi1e
today at 7:47 PM
One of the best articles I've read recently.
ramesh31
today at 7:26 PM
>(Re: level 8) "...I honestly don't think the models are ready for this level of autonomy for most tasks. And even if they were smart enough, they're still too slow and too token-hungry for it to be economical outside of moonshot projects like compilers and browser builds (impressive, but far from clean)."
This is increasingly untrue with Opus 4.6. Claude Max gives you enough tokens to run ~5-10 agents continuously, and I'm doing all of my work with agent teams now. Token usage is up 10x or more, but the results are infinitely better and faster. Multi-agent team orchestration will be to 2026 what agents were to 2025. Much of the OP article feels 3-6 months behind the times.
measurablefunc
today at 6:49 PM
What level is numeric patterns that evolve according to a sequence of arithmetic operations?
