An interactive primer
Trace one message through the whole machine — then take every piece apart with your own hands. No math required; bring only curiosity.
Reader's contract. You are smart and curious but you are not an ML engineer, and you don't want to become one. You want to understand — well enough to look a founder or a researcher in the eye and know whether their claim holds water.
This document leads with pictures and analogies, defines every piece of jargon the moment it appears, and never makes you read a wall of text when a diagram would do. Math stays in the basement; we'll only come upstairs for it when a number actually changes how you think.
01 The on-ramp
Before we take anything apart, let's watch the whole machine run once, end to end, on a single real message. Everything later is just a zoom-in on one of these steps. Keep this picture in your head; we'll hang every later idea off it.
You type into a chat box: "How many r's are in strawberry?" You hit enter. To you it feels instant. To the model, your sentence is about to go through six transformations before a single word comes back.
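The model answers "two", which is wrong (there are three), because it never saw the letters. It saw tokens: chunks of text, often several letters long, that the model handles as single units. That single failure is the whole article in miniature: to know why it breaks, you have to know what it actually does.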
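To make "it saw tokens, not letters" concrete, here is a minimal sketch. The three-piece vocabulary and its ID numbers are invented for illustration; real tokenizers learn tens of thousands of pieces and may split this word differently.

```python
# Hypothetical toy vocabulary: the pieces and their IDs are made up for
# this example and do not come from any real tokenizer.
TOY_VOCAB = {"str": 101, "aw": 102, "berry": 103}


def toy_tokenize(word, vocab=TOY_VOCAB):
    """Split `word` into vocabulary pieces, always taking the longest match."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining slice first, then shrink it.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary piece covers {word[i:]!r}")
    return ids


word = "strawberry"
token_ids = toy_tokenize(word)   # what the model receives: a list of IDs
letter_count = word.count("r")   # what you were asking about: the letters

print(token_ids)
print(letter_count)
```

Counting letters is trivial when you hold the string. The model holds only the ID list, and nothing in those numbers says how many r's each piece contains.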
02 Embeddings
Once text has been split into tokens, each token is turned into a vector: a long list of numbers that places it at a point in space. Words that mean similar things land near each other. That's the whole trick behind "the model understands meaning": meaning is position.
Figure: Words become coordinates. Related words land together. Watch the arithmetic: take king, apply the same step that turns man into woman, and you arrive at queen. The offset itself carries the meaning. (Illustrative; real embeddings have thousands of dimensions, flattened here to two. The king − man + woman result comes from static word2vec embeddings, not from inside an LLM.)
In a real model this space has thousands of dimensions, not two, but the intuition survives the flattening: direction and distance carry meaning. Follow the dotted arrow above: the step from king to queen is the same move as "man → woman." That's why king − man + woman lands closest to queen: the offset itself carries the meaning.
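The king − man + woman arithmetic can be tried by hand. This is a sketch with made-up two-dimensional coordinates (the numbers and the extra word "prince" are assumptions chosen so the picture works out cleanly). Real embeddings are learned, far higher-dimensional, and only approximately this tidy.

```python
import math

# Hypothetical 2-D coordinates, hand-picked for illustration only.
# Axis 0 loosely means "royalty", axis 1 loosely means "gender".
TOY_VECTORS = {
    "man": (1, 1),
    "woman": (1, 9),
    "king": (9, 1),
    "queen": (9, 9),
    "prince": (7, 1),
}


def analogy(a, b, c, vectors=TOY_VECTORS):
    """Answer "a is to b as c is to ?" by vector arithmetic.

    Computes c - a + b, then returns the nearest remaining word.
    The three input words are left out of the search.
    """
    target = tuple(
        vc - va + vb
        for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    )
    candidates = [w for w in vectors if w not in (a, b, c)]
    return min(candidates, key=lambda w: math.dist(vectors[w], target))


result = analogy("man", "woman", "king")
print(result)
```

The step from man to woman is "move up the gender axis." Applying that same step to king lands on the point where queen sits, which is all the famous equation is saying.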
03 Sampling
When the model picks the next word, it doesn't always pick the most likely one. A single setting, temperature, controls how much it's willing to gamble. Low temperature plays it safe, concentrating its bets on the top few candidates; high temperature takes risks, giving unlikely words a real chance.
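For readers who want to see the dial at work, here is a minimal sketch of the standard recipe: divide each candidate's score by the temperature, then convert the scores to probabilities. The candidate words and their scores are invented for illustration; a real model scores tens of thousands of candidates at every step.

```python
import math
import random


def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [s / temperature for s in scores]
    top = max(scaled)  # subtracting the max keeps exp() from overflowing
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_word(words, scores, temperature, rng):
    """Pick one word at random, weighted by its temperature-adjusted probability."""
    probs = softmax_with_temperature(scores, temperature)
    return rng.choices(words, weights=probs, k=1)[0]


# Hypothetical candidates for "The cat sat on the ..." with made-up scores.
WORDS = ["mat", "sofa", "moon"]
SCORES = [2.0, 1.0, 0.1]

cold = softmax_with_temperature(SCORES, 0.2)  # low temperature: play it safe
hot = softmax_with_temperature(SCORES, 2.0)   # high temperature: take risks
pick = sample_word(WORDS, SCORES, 1.0, random.Random(0))

print([round(p, 3) for p in cold])
print([round(p, 3) for p in hot])
print(pick)
```

At the low setting nearly all the probability piles onto "mat"; at the high setting "mat" is only about a coin flip, and "moon" gets a real chance.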