"A clean, from-scratch implementation of the OLMo architecture with KV caching, RoPE, and an efficient autoregressive inference pipeline. Designed as a minimal yet extensible foundation for post-training research, including RLHF, preference optimization, and reasoning-focused systems."
Updated Apr 15, 2026 - Jupyter Notebook