
Show HN: Context Gateway – Compress agent context before it hits the LLM

32 points - today at 5:58 PM

We built an open-source proxy that sits between coding agents (Claude Code, OpenClaw, etc.) and the LLM, compressing tool outputs before they enter the context window.

Demo: https://www.youtube.com/watch?v=-vFZ6MPrwjw#t=9s.

Motivation: Agents are terrible at managing context. A single file read or grep can dump thousands of tokens into the window, most of it noise. This isn't just expensive — it actively degrades quality. Long-context benchmarks consistently show steep accuracy drops as context grows (OpenAI's GPT-5.4 eval goes from 97.2% at 32k to 36.6% at 1M https://openai.com/index/introducing-gpt-5-4/).

Our solution uses small language models (SLMs): we look at model internals and train classifiers to detect which parts of the context carry the most signal. When a tool returns output, we compress it conditioned on the intent of the tool call—so if the agent called grep looking for error handling patterns, the SLM keeps the relevant matches and strips the rest.
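
To make the idea of intent-conditioned compression concrete, here is a minimal sketch. It is not the gateway's implementation: the post describes a trained SLM classifier, while this stand-in just scores each output line by keyword overlap with the stated intent. The names `tokenize`, `score_line`, and `compress_output` are hypothetical.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens; a crude stand-in for a learned relevance model."""
    return set(re.findall(r"[a-z_]+", text.lower()))


def score_line(line: str, intent_terms: set[str]) -> int:
    """Count how many intent terms appear in the line (assumed scoring rule)."""
    return len(tokenize(line) & intent_terms)


def compress_output(tool_output: str, intent: str, min_score: int = 1) -> str:
    """Keep only the lines whose overlap with the intent reaches min_score."""
    intent_terms = tokenize(intent)
    kept = [
        line
        for line in tool_output.splitlines()
        if score_line(line, intent_terms) >= min_score
    ]
    return "\n".join(kept)


grep_output = (
    "src/a.py:10: except ValueError as error:\n"
    "src/b.py:3: import os\n"
    "src/c.py:7: raise RuntimeError('bad')"
)
compressed = compress_output(grep_output, intent="error handling")
```

With the intent "error handling", only the first line survives here, because it is the only one containing the standalone token `error`. A real classifier would presumably also catch the `raise RuntimeError` line, which is exactly the gap a learned model is meant to close.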

If the model later needs something we removed, it calls expand() to fetch the original output. We also do background compaction at 85% window capacity and lazy-load tool descriptions so the model only sees tools relevant to the current step.
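
The expand-on-demand and compaction-threshold behavior can be sketched roughly as follows. This is an illustration under assumptions, not the gateway's code: the `ExpandStore` class, its key scheme, and `needs_compaction` are hypothetical, and only the 85% figure comes from the description above.

```python
import hashlib


class ExpandStore:
    """Keeps original tool outputs so a compressed view can be expanded later."""

    def __init__(self) -> None:
        self._originals: dict[str, str] = {}

    def put(self, original: str) -> str:
        """Store the full output and return a short key to embed in the context."""
        key = hashlib.sha256(original.encode("utf-8")).hexdigest()[:12]
        self._originals[key] = original
        return key

    def expand(self, key: str) -> str:
        """Return the original, uncompressed output for a previously stored key."""
        return self._originals[key]


def needs_compaction(used_tokens: int, window_tokens: int, threshold_percent: int = 85) -> bool:
    """True once usage reaches the threshold share of the window (integer math)."""
    return used_tokens * 100 >= threshold_percent * window_tokens


store = ExpandStore()
key = store.put("full grep output, thousands of lines ...")
restored = store.expand(key)
```

The model would see only the compressed text plus the key, and a later `expand(key)` call returns the stored original unchanged. The threshold check uses integer arithmetic to avoid floating-point edge cases at exactly 85%.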

The proxy also gives you spending caps, a dashboard for tracking running and past sessions, and Slack pings when an agent is sitting there waiting on you.

Repo is here: https://github.com/Compresr-ai/Context-Gateway. You can try it with:

  curl -fsSL https://compresr.ai/api/install | sh

Happy to go deep on any of it: the compression model, how the lazy tool loading works, or anything else about the gateway. Try it out and let us know how you like it!


sethcronin
today at 8:08 PM
I guess I'm skeptical that this actually improves performance. I'm worried that a middleman compressing the tool outputs can strip useful context that the agent actually needs to diagnose a problem.
kuboble
today at 7:26 PM
I wonder what the business model is.
It seems like a tool that solves a problem which won't last longer than a couple of months, and it is something that e.g. Claude Code can and probably will tackle themselves soon.
tontinton
today at 7:33 PM
Is it similar to rtk? Where the output of tool calls is compressed? Or does it actively compress your history once in a while?
If it's the latter, then users will pay uncached rates for the entire token history after the change: https://platform.claude.com/docs/en/build-with-claude/prompt...
How is this better?
root_axis
today at 7:21 PM
Funny enough, Anthropic just went GA with a 1M-context Claude that has supposedly solved the lost-in-the-middle problem.
thesiti92
today at 6:07 PM
Do you guys have any stats on how much faster this is than Claude's or Codex's compression? Claude's is super, super slow, but Codex's feels like an acceptable amount of time. Looks cool though; I'll have to try it out and see if it messes with outputs or not.
esafak
today at 6:55 PM
I can already prevent context pollution with subagents. How is this better?
lambdaone
today at 7:44 PM
This company sounds like it has months to live, or until the VC money runs out at most. If this idea is good, Anthropic et al. will roll it into their own product, eliminating any purpose for it to exist as an independent product. And if it isn't any good, the company won't get traction.
uaghazade
today at 6:53 PM
OK, it's great
verdverm
today at 6:05 PM
I don't want some other tooling messing with my context. It's too important to leave to something that needs to optimize across many users, thereby not being the best for my specifics.
The framework I use (ADK) already handles this. It is very low-hanging fruit that should be a part of any framework, not something external. In ADK, this is a boolean you can turn on per tool or subagent; you can even decide turn by turn, or based on any context you see fit, by supplying a function.
YC over-indexed on AI startups too early, not realizing how trivial these startup "products" are; they are more of a line item in the feature list of a mature agent framework.
I've also seen dozens of this same project submitted by the claws that led to our new rule addition this week. If your project can be vibe coded by dozens of people in mere hours...
poushwell
today at 7:56 PM
[flagged]
BrianFHearn
today at 6:11 PM
[flagged]
zenon_paradox
today at 6:22 PM
[dead]
eegG0D
today at 7:05 PM
[flagged]
jameschaearley
today at 6:33 PM
[flagged]
