Ask HN: Build Your Own LLM?
14 points - last Friday at 10:18 AM
The best way to really understand how something works is to build it yourself, so I am wondering if there are any good tutorials on building your own LLM from scratch, i.e. implementing tokenisation, embeddings, attention and so on. I am not suggesting one could replicate ChatGPT. I mean a toy model that implements the core features but is trained on a much smaller corpus.
liqilin1567
today at 10:04 AM
Karpathy has a new repo: https://github.com/karpathy/nanochat. It's a full-stack implementation of an LLM like ChatGPT in a single, clean, minimal, hackable, dependency-lite codebase.
khamidou
last Saturday at 5:51 PM
Sorry to self-promote but I did exactly that a few months back: https://khamidou.com/gpt2/
Generally, I think the Karpathy tutorials are a good starting point, but they're very mathy. People will tell you that you only need high school math to understand LLMs, but a lot of the abstractions and concepts he uses are a bit foreign to programmers.
I found that rebuilding inference for a known model taught me a lot more than passively sitting through the videos and maybe retyping his code. You should try it with something simple, like a model from a few years back!
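To give a sense of how small the core of inference is, here is a minimal sketch of single-head causal self-attention in plain Python. It is an illustration only, not code from the Karpathy repos or from any specific model: the names `softmax` and `causal_attention` are made up, and a real GPT-2-style model adds learned projections, multiple heads, layer norm and an MLP on top of this.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v are lists of T vectors (one per token position).
    Hypothetical helper for illustration; no learned weights involved.
    """
    d = len(q[0])
    out = []
    for t in range(len(q)):
        # Position t may only look at positions 0..t (the causal mask).
        scores = [
            sum(a * b for a, b in zip(q[t], k[s])) / math.sqrt(d)
            for s in range(t + 1)
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the visible value vectors.
        out.append([
            sum(w * v[s][j] for s, w in enumerate(weights))
            for j in range(len(v[0]))
        ])
    return out

q = k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
result = causal_attention(q, k, v)
```

The first position can only see itself, so its output is just its own value vector. Everything else in a transformer block is wrapped around this loop.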