Tiled Hacker news on React Router

LLM Inference Handbook

366 points - 07/11/2025

Source

sherlockxu
07/11/2025
Hi everyone. I'm one of the maintainers of this project. We're both excited and humbled to see it on Hacker News!
We created this handbook to make LLM inference concepts more accessible, especially for developers building real-world LLM applications. The goal is to pull together scattered knowledge into something clear, practical, and easy to build on.
We’re continuing to improve it, so feedback is very welcome!
GitHub repo: https://github.com/bentoml/llm-inference-in-production
subset
07/11/2025
Ooh this looks really neat! I'd love to see more content in the future on Structured outputs/Guided generation and sampling. Another great reference on inference-time algorithms for sampling is here: https://rentry.co/samplers
aligundogdu
07/11/2025
It's a really beautiful project, and I’d like to ask something purely out of curiosity and with the best intentions. What’s the name of the design trend you used for your website? I really loved the website too.
gchadwick
07/11/2025
Very glad to see this. There is (understandably) much excitement and focus on training models in publicly available material.
Running them well is very important too. As we get to grips with everything models can do and look to deploy them widely knowledge of how to best run them becomes ever more important.
qrios
07/11/2025
Thanks for putting this together! From now on I only need one link to point interested ones to learn.
Only one suggestion: On page "OpenAI-compatible API" it would be great to have also a simple example for the pure REST call instead of the need to import the OpenAI package.
srameshc
07/11/2025
If I remember, BentoML was about MLOps, I remember trying it about a year back. Did the company pivot ?
holografix
07/11/2025
Very good reference thanks for collating this!
Domainzsite
07/11/2025
[dead]

LLM Inference Handbook

sherlockxu

DiabloD3

leopoldj

DiabloD3

nl

DiabloD3

nl

DiabloD3

ChromaticPanic

criemen

sherlockxu

armcat

sethherr

subset

aarnphm

larme

aligundogdu

Jimmc414

aligundogdu

gchadwick

qrios

sherlockxu

srameshc

aarnphm

fsjayess

holografix

Domainzsite