Long-Form Explainers
Long-form, interactive walkthroughs
A growing collection of long-form, interactive explainers. Each one builds an idea from the ground up, defines every term the first time it appears, and gives you widgets to poke at instead of walls of text.
How a large language model is actually built: from the next-token objective and the gradient that trains it, through the GPU memory budget, precision tricks, and parallelism that make trillion-token runs possible, to the architecture and data innovations of every major paper from "Attention Is All You Need" to the 2026 frontier models. Pre-training only.
How a raw pre-trained model becomes a helpful, honest, reasoning assistant: the whole post-training stack in historical order. It runs from supervised fine-tuning and instruction tuning, through RLHF, reward models, PPO and the policy-gradient family, to DPO and the offline-preference wave, and on to RL from verifiable rewards, the reasoning era (o1, DeepSeek-R1), GRPO and its 2026 refinements, and agentic tool-use RL. Post-training only.
How modern large language models actually run: from the moment text becomes a list of tokens, through every matrix multiplication inside a transformer, to the memory tricks that production serving systems like vLLM use to keep an NVIDIA GPU saturated.