Study Notes
Interactive, hands-on explainers of ideas in machine learning and math — built to be poked, dragged, and run.
June 23, 2026
Reinforcement Learning
Machine Learning
Interactive
A World Models agent splits into a large world model trained unsupervised and a tiny controller trained by evolution: V compresses each frame to a latent vector, M learns to predict the next latent given the action, and C maps that compressed state to an action — small enough to train entirely inside the model's own hallucinated rollouts, then transfer to reality.
June 22, 2026
Information Theory
Machine Learning
Interactive
The temperature parameter in an LLM divides the logits before the softmax. Raise it and the output distribution flattens, so its entropy rises — and that is not a loose analogy but the literal statistical-mechanics relationship: a softmax is a Boltzmann distribution with energy = −logit, and temperature trades expected energy against entropy, monotonically, from a deterministic argmax to the uniform distribution.
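A minimal sketch of that relationship, assuming three made-up logits (the values and the helper names `softmax` and `entropy_nats` are illustrative, not taken from the post): it divides the logits by the temperature before the softmax and prints the entropy, which should rise toward log 3 as the temperature grows.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature; subtracts the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy_nats(probs):
    """Shannon entropy in nats, skipping zero-probability outcomes."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical logits, chosen only for illustration.
logits = [2.0, 1.0, 0.1]

for t in (0.1, 0.5, 1.0, 2.0, 10.0):
    probs = softmax(logits, t)
    print(f"T={t:>4}: probs={[round(p, 3) for p in probs]} "
          f"entropy={entropy_nats(probs):.4f}")
```

At low temperature nearly all the mass sits on the largest logit (the argmax limit); at high temperature the three probabilities approach 1/3 each.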
June 19, 2026
Finance
Interactive
The Black–Scholes formula — N(d1), N(d2) — is easy to memorise and hard to reconstruct. Rebuilt from the payoff up: an option's price is the discounted, risk-neutral expectation of its payoff, and volatility is what gives it value. Drag the strike, widen the volatility, watch a Monte-Carlo average crawl onto the Black–Scholes price, and read the Greeks as slopes and curvature.
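The idea can be sketched for a European call, assuming the standard lognormal model for the terminal price. The parameter values and function names below are illustrative assumptions, not taken from the post. The code compares the closed-form price with a discounted Monte-Carlo average of the payoff under the risk-neutral measure.

```python
import math
import random

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(s0, k, r, sigma, t):
    """Closed-form Black-Scholes price of a European call."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)

def monte_carlo_call(s0, k, r, sigma, t, n_paths, seed=0):
    """Discounted average payoff over simulated risk-neutral terminal prices."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(s_t - k, 0.0)
    return math.exp(-r * t) * total / n_paths

# Hypothetical inputs: spot, strike, rate, volatility, maturity in years.
params = dict(s0=100.0, k=100.0, r=0.05, sigma=0.2, t=1.0)
exact = black_scholes_call(**params)
for n in (1_000, 10_000, 100_000):
    estimate = monte_carlo_call(**params, n_paths=n)
    print(f"paths={n:>7}: MC={estimate:.4f}  closed form={exact:.4f}")
```

The Monte-Carlo error shrinks roughly like one over the square root of the number of paths, which is why the average crawls rather than jumps onto the closed-form price.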
June 19, 2026
Statistics
Interactive
How many flips does it take to confidently tell a rigged coin from a fair one? Flip a coin and watch the evidence pile up, see a p-value light up, push the false-alarm and missed-detection curves apart, then read the exact number off one clean formula, and confirm it with a thousand simulated experiments. The punchline: near-fair coins are expensive to expose, and the cost scales like 1/(p − 1/2)².
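One way to see the inverse-square scaling is the textbook normal-approximation sample size for a one-sided test of a fair coin against bias p. This is an assumption for illustration and may not be the exact formula the post uses. The function name `flips_needed` and the defaults (5% false-alarm rate, 80% detection rate) are likewise hypothetical.

```python
import math
from statistics import NormalDist

def flips_needed(p, alpha=0.05, power=0.8):
    """Approximate flips to detect bias p against a fair coin.

    Normal approximation, one-sided test (an assumed textbook form):
        n = ((z_alpha * sigma0 + z_beta * sigma1) / (p - 1/2))^2
    with sigma0 = 1/2 under the fair coin and sigma1 = sqrt(p(1-p)).
    """
    if p == 0.5:
        raise ValueError("a fair coin cannot be distinguished from itself")
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # controls false alarms
    z_beta = NormalDist().inv_cdf(power)       # controls missed detections
    sigma0 = 0.5
    sigma1 = math.sqrt(p * (1 - p))
    n = ((z_alpha * sigma0 + z_beta * sigma1) / (p - 0.5)) ** 2
    return math.ceil(n)

for p in (0.6, 0.55, 0.51):
    print(f"p={p}: about {flips_needed(p)} flips")
```

Shrinking the bias from 0.1 to 0.01 multiplies the required flips by roughly a hundred, which is the 1/(p − 1/2)² behaviour in action.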
June 16, 2026
Information Theory
Interactive
Entropy is usually introduced as a formula — −Σ p log p — with the meaning left implicit. Rebuilt from the ground up: entropy is average surprise, measured in yes/no questions. Drag a coin's bias, reshape a distribution, and watch a sampler's running surprise converge onto the entropy.
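A small sketch of the "average surprise" reading, with an illustrative three-outcome distribution and hypothetical helper names: the surprise of an outcome is −log₂ p, entropy is its expectation, and the running mean over samples should settle near that value.

```python
import math
import random

def entropy_bits(probs):
    """Shannon entropy in bits: the expected surprise -log2(p)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def average_surprise(probs, n_samples, seed=0):
    """Mean surprise, in bits, of n_samples draws from the distribution."""
    rng = random.Random(seed)
    draws = rng.choices(range(len(probs)), weights=probs, k=n_samples)
    return sum(-math.log2(probs[i]) for i in draws) / n_samples

# Hypothetical distribution, chosen so the surprises are whole numbers of bits.
probs = [0.5, 0.25, 0.25]
print(f"entropy: {entropy_bits(probs):.4f} bits")
for n in (10, 100, 10_000):
    print(f"n={n:>6}: average surprise = {average_surprise(probs, n):.4f} bits")
```

With probabilities that are powers of one half, each surprise is exactly the number of yes/no questions needed to pin down that outcome.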
May 30, 2026
Machine Learning
Interactive
Hopfield networks slide downhill into the nearest memory — and get stuck. Add a single ingredient, temperature, and that deterministic descent becomes a Boltzmann machine that samples, escapes local minima, and learns. A hands-on, visual walk-through with live energy landscapes, annealing, and a trainable RBM.
May 21, 2026
Reinforcement Learning
Interactive
A Markov Decision Process models sequential decisions under uncertainty, where today's choice shapes tomorrow's options. An interactive introduction to MDPs and value iteration, worked through one small example: should a 9-year-old study, play, or rest after school?