Tiled Hacker news on React Router

What even is a small language model now?

109 points - 05/21/2025


antirez
05/24/2025
Very small: can run on the edge, allowing something like a Raspberry Pi to make basic decisions for your appliance even when disconnected from the internet. Examples: given some time series parameters and instructions, decide whether or not to water the plants; vision models that can watch a camera and transcribe what they see in a basic way, ...
Small: runs on an average laptop not optimized for LLM inference, like Gemma 3 4B.
Medium: runs on a very high spec computer that people can buy for less than 5k. 30B, 70B dense models or larger MoEs.
Large: Models that big LLM providers sell as "mini", "flash", ...
Extra Large / SOTA: Gemini 2.5 PRO, Claude 4 Opus, ChatGPT O3, ...
nickpsecurity
05/24/2025
The term is too overloaded.
I'll add one more: an LLM small enough that it can be trained from scratch on one A100 in 24 hours. Is it really small if it takes $10,000 to train? Or leave that term for $200 models?
Back to your definitions, there are sub-1B models people are using. I think I saw one in the 400-600M range for audio. Another person posted here a 100M-200M model for extracting data from web pages. We told them to just use a rules-based approach where possible but they believed the SLM worked better.
Then, there are projects like BabyLM that can be useful at 10M:
https://babylm.github.io/
zellyn
05/24/2025
I think of “fits on the overpowered M1/2/3/4 64GB MacBook Pro my employer gave me” as the dividing line. We’re getting to within spitting distance of models that can code well at that size.
armcat
05/24/2025
There is a "small language model", and then there is a "small LARGE language model". In late 2018, BERT (110 million params) would've been considered a "large" language model. A "small" LM would be some markov chain or a topic model (e.g. latent dirichlet allocation) - technically they would be considered generative language models since they learn joint distributions of params and data (words), and can then sample from that distribution. But today, we usually map "small" LMs to "small" LLMs, so in that sense a small LLM would be anything from BERT to around 3-4B params.
breckinloggins
05/24/2025
Maybe we should appropriate the old DOS/x86 memory model names and give them “class-relative” sizes.
“tiny” can run on a microcontroller, “compact” on a Rpi, “small” on a phone, “medium” on a single GPU machine, “large” on AI class workstation hardware, and “huge” on a data center cluster.
alexpham14
05/24/2025
I appreciate how it redefines “small” not by parameter count but by practical impact and deployability.
croes
05/24/2025
How can a Large Language Model be a small language model?
srikz
05/24/2025
I want to see more models that can be streamed to a browser and run locally via wasm. That would be my hope for small models. In the <100 MB range.
mcswell
05/24/2025
> Small models used to mean tiny. Now they mean "runs without drama."
Does this mean without a dedicated electric power plant?
I wanted to say "Right, big-sized. Do you want fries with that?", but I couldn't figure out how to work that in, so I won't say it.
rickstanley
05/24/2025
On this topic, I've been wondering if models are capable of recommending other models for a given machine spec, for example: which model, if any, would be recommended for a laptop with a Ryzen 9 6000S and RTX 3060m (random spec).
Dwedit
05/24/2025
These terms are all relative, but there's also "BabyLlama", which measures its parameter count in millions rather than billions.
GolDDranks
05/24/2025
A traditional Markov model trained (rather, just "fitted") on tokens or words is a small language model.
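To make the Markov-model point concrete, here is a minimal sketch of a bigram model "fitted" by counting and then sampled from. The function names (`fit_bigram`, `sample`) and the toy corpus are illustrative assumptions, not anything from the thread.

```python
import random
from collections import Counter, defaultdict


def fit_bigram(tokens):
    """Count how often each token follows each other token (the "fitting" step)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def sample(counts, start, length, rng):
    """Generate up to `length` tokens by sampling followers in proportion to their counts."""
    out = [start]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # dead end: this token was never followed by anything
        words = list(followers)
        weights = [followers[w] for w in words]
        out.append(rng.choices(words, weights=weights, k=1)[0])
    return out


# Hypothetical toy corpus, split on whitespace as a stand-in for a tokenizer.
corpus = "the cat sat on the mat and the cat ran".split()
model = fit_bigram(corpus)
generated = sample(model, "the", 8, random.Random(0))
print(" ".join(generated))
```

The whole "model" is a table of counts, which is why it sits at the far small end of the scale discussed here.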
Havoc
05/25/2025
It’s always been a little arbitrary. “Can it fit on a 3090?” seems like a reasonable cutoff to me for now.
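A rough way to check the "fits on a 3090" cutoff is to estimate weight memory from parameter count and precision. This is a back-of-the-envelope sketch: the 24 GB figure is the RTX 3090's VRAM, but the 20% overhead for KV cache and activations is an assumed placeholder, and the function names are hypothetical.

```python
def weight_memory_gib(params_billions, bits_per_param):
    """Memory needed for the weights alone, in GiB."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 2**30


def fits(params_billions, bits_per_param, vram_gib, overhead_fraction=0.2):
    """True if weights plus an assumed runtime overhead fit in the given VRAM."""
    needed = weight_memory_gib(params_billions, bits_per_param) * (1 + overhead_fraction)
    return needed <= vram_gib


RTX_3090_VRAM_GIB = 24  # treated as GiB here for simplicity

for params, bits in [(7, 16), (30, 4), (70, 4)]:
    verdict = "fits" if fits(params, bits, RTX_3090_VRAM_GIB) else "does not fit"
    print(f"{params}B at {bits}-bit: {weight_memory_gib(params, bits):.1f} GiB weights, {verdict}")
```

Under these assumptions a 7B model at 16-bit and a 30B model at 4-bit fit, while a 70B model at 4-bit does not; real usage depends on context length and the inference runtime.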
stephantul
05/24/2025
This post is 100% rewritten or fully generated by gpt-4o. It has the gpt smell all over it.
KasianFranks
05/24/2025
This is also where MoE shines with a mixture of small and large language models.
option
05/24/2025
Whatever fits into a gaming GPU such as a GeForce RTX 3080.
Velorivox
05/24/2025
[dead]
MiddleEndian
05/24/2025
Just ask my ex-wife!
