Let me give you a little anecdote. I use ChatGPT to learn Spanish. The prompt I use is below.
It gets things wrong about half the time, and I have to tell it that it’s wrong. If I can’t trust an LLM to follow simple instructions, why would I trust it “agentically” with business-critical decision making?
I work in cloud consulting, specializing in app dev, and every project I’ve done in the last year and a half has had a Bedrock-based LLM somewhere in the process, i.e. in the running system. But I know what to trust it for and what not to trust it for, and I guide my clients accordingly.
The prompt I use for studying Spanish that ChatGPT gets wrong:
---
I am learning Spanish at an A2 level. When I ask you to do a lightning round, I will give you a list of sentences first. You will give me each English sentence one by one and I will translate it to Spanish. If I get one wrong, save it for the next round.
When I ask you to create sentences from a verb, create one sentence each for first, second, and third person singular and for first and third person plural, in both the present and the simple past, plus three sentences in the progressive. Each sentence must be at least five words long.
These are some words and phrases I need to review. Only use them in the first through third person singular present sentences, and only when they make sense. If a target word does not fit naturally, skip it and prioritize a natural sentence; don’t force yourself to use these words. When I list a verb, it means I need to practice it in the present and simple past:
<a relatively short list of words>
Never use:
<a relatively short list of words>
That’s a very real example of the core problem: LLMs don’t reliably honor constraints, even when they’re explicit and simple. Instruction drift shows up fast in learning tasks, and quietly in production systems.
That’s why trusting them “agentically” is risky. The safer approach is to assume outputs are unreliable and validate them after generation.
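For what it’s worth, “validate after generation” can start very small: re-check the model’s output against the same constraints the prompt already stated, and reject or retry on failure. Here is a minimal sketch in Python, using made-up names and only the constraints from your Spanish prompt; it’s an illustration of the idea, not how Verdic Guard actually works:

```python
# Minimal sketch: treat model output as untrusted and re-check it against
# the constraints the prompt asked for (illustrative only, not a real API).

MIN_WORDS = 5  # "Each sentence must be at least five words."

def validate_sentences(sentences: list[str], banned: set[str]) -> list[str]:
    """Return human-readable violations; an empty list means the batch passed."""
    violations = []
    for i, sentence in enumerate(sentences, start=1):
        # Naive tokenization by whitespace after stripping basic punctuation.
        words = sentence.lower().replace(",", " ").replace(".", " ").split()
        if len(words) < MIN_WORDS:
            violations.append(f"sentence {i} has only {len(words)} words: {sentence!r}")
        used_banned = banned.intersection(words)
        if used_banned:
            violations.append(f"sentence {i} uses banned word(s) {sorted(used_banned)}: {sentence!r}")
    return violations

if __name__ == "__main__":
    # Hypothetical model output and banned list, for illustration only.
    output = [
        "Yo como una manzana roja cada mañana.",
        "Tú hablas.",  # too short: should be flagged
    ]
    problems = validate_sentences(output, banned={"coche"})
    for p in problems:
        print("REJECT:", p)
    if not problems:
        print("OK: all constraints satisfied")
```

A rejected batch can then be retried or surfaced to the user instead of silently accepted, which is the whole point of putting the check outside the model.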
I’m working on this exact gap with Verdic Guard (verdic.dev): treating LLM output as untrusted input and enforcing scope and correctness outside the model. It’s less about smarter prompts and more about predictable behavior.
Your Spanish example is basically the small-scale version of the same failure mode.