# 🧠 Building LLMs from Scratch – A 30-Day Journey

This repository guides you through the process of building a GPT-style Large Language Model (LLM) from scratch using PyTorch. The structure and approach are inspired by the book Build a Large Language Model (From Scratch) by Sebastian Raschka.


## 📘 Reference Book

*Build a Large Language Model (From Scratch)* by Sebastian Raschka.

## 🗓️ Weekly Curriculum Overview

### 🔹 Week 1: Foundations of Language Models

- Set up the environment and tools.
- Learn about tokenization, embeddings, and the idea of a "language model".
- Encode input/output sequences and build basic forward models.
- Understand unidirectional processing and causal language modeling.
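
As a first taste of these topics, here is a minimal sketch of character-level tokenization and an embedding lookup in PyTorch. The toy corpus, the vocabulary, and the 16-dimensional embedding size are arbitrary choices for illustration, not the repository's actual setup:

```python
import torch
import torch.nn as nn

# Build a character-level vocabulary from a toy corpus.
text = "hello world"
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}  # char -> id
itos = {i: ch for ch, i in stoi.items()}      # id -> char

def encode(s):
    return [stoi[c] for c in s]

def decode(ids):
    return "".join(itos[i] for i in ids)

ids = encode("hello")
assert decode(ids) == "hello"  # encoding round-trips

# Map token ids to dense vectors with a learnable embedding table.
emb = nn.Embedding(num_embeddings=len(chars), embedding_dim=16)
x = emb(torch.tensor([ids]))   # shape: (batch=1, time=5, dim=16)
print(x.shape)
```

Everything downstream (attention, the decoder stack, generation) operates on these `(batch, time, dim)` tensors.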

### 🔹 Week 2: Building the Transformer Decoder

- Explore Transformer components: attention, multi-head attention, and positional encoding.
- Implement residual connections, normalization, and feedforward layers.
- Build a GPT-style decoder-only transformer architecture.
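
The core of the decoder-only design is *causal* self-attention: each position may attend only to itself and earlier positions. A simplified single-head sketch in PyTorch (the class name and sizes are illustrative, not this repository's implementation):

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Single-head causal self-attention: position t sees only positions <= t."""
    def __init__(self, d_model, max_len=128):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)   # fused Q, K, V projection
        self.proj = nn.Linear(d_model, d_model)
        # Lower-triangular mask blocks attention to future tokens.
        self.register_buffer("mask", torch.tril(torch.ones(max_len, max_len)))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        att = (q @ k.transpose(-2, -1)) / math.sqrt(C)     # scaled dot-product
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        return self.proj(att @ v)

x = torch.randn(2, 10, 32)              # (batch, time, channels)
out = CausalSelfAttention(32)(x)
print(out.shape)                        # same shape as the input
```

A full GPT block wraps this in residual connections, layer normalization, and a feedforward layer, then stacks several such blocks.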

### 🔹 Week 3: Training and Dataset Handling

- Load and preprocess datasets like TinyShakespeare.
- Implement batch creation, context windows, and training routines.
- Use cross-entropy loss, optimizers, and learning rate schedulers.
- Monitor perplexity and improve generalization.
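
The batching idea can be sketched as follows, assuming the corpus is already encoded as a 1-D tensor of token ids. The `get_batch` helper and all sizes here are illustrative stand-ins:

```python
import torch
import torch.nn.functional as F

def get_batch(data, block_size, batch_size):
    """Sample random context windows; targets are the inputs shifted by one."""
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + 1 + block_size] for i in ix])
    return x, y

data = torch.randint(0, 65, (1000,))    # stand-in for encoded TinyShakespeare
xb, yb = get_batch(data, block_size=8, batch_size=4)

# Cross-entropy expects (N, vocab) logits against (N,) integer targets.
logits = torch.randn(4, 8, 65)          # stand-in for the model's output
loss = F.cross_entropy(logits.view(-1, 65), yb.view(-1))
ppl = loss.exp()                        # perplexity is exp of the mean loss
print(xb.shape, yb.shape, loss.item(), ppl.item())
```

In a real training loop, `logits` comes from the model, and an optimizer step plus a learning rate scheduler follow the loss computation.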

### 🔹 Week 4: Text Generation and Deployment

- Generate text using greedy, top-k, top-p, and temperature sampling.
- Evaluate and tune generation.
- Export and convert the model for Hugging Face compatibility.
- Deploy via the Hugging Face Hub and a Gradio Space.
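
Temperature and top-k sampling can be sketched in a few lines; `sample_next` is a hypothetical helper name, not code from this repository:

```python
import torch
import torch.nn.functional as F

def sample_next(logits, temperature=1.0, top_k=None):
    """Sample one token id from a (batch, vocab) logits tensor."""
    logits = logits / temperature            # <1.0 sharpens, >1.0 flattens
    if top_k is not None:
        v, _ = torch.topk(logits, top_k)
        # Zero out everything below the k-th largest logit.
        logits = logits.masked_fill(logits < v[..., -1, None], float("-inf"))
    probs = F.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1)

torch.manual_seed(0)
logits = torch.randn(1, 65)                  # one step of vocab-size-65 logits
nxt = sample_next(logits, temperature=0.8, top_k=10)
print(nxt.shape)                             # (1, 1): one token id per batch row
```

Greedy decoding is the `temperature -> 0` limit (an `argmax`), and top-p (nucleus) sampling instead keeps the smallest set of tokens whose cumulative probability exceeds `p`.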

## 🛠️ Getting Started

### Prerequisites

- Python 3.8+
- PyTorch
- NumPy
- Matplotlib
- JupyterLab or Jupyter Notebook
- Hugging Face libraries: `transformers`, `datasets`, `huggingface_hub`
- `gradio` for deployment

### Installation

```bash
git clone https://github.com/codewithdark-git/Building-LLMs-from-scratch.git
cd Building-LLMs-from-scratch
pip install -r requirements.txt
```

## 📁 Project Structure

```
Building-LLMs-from-scratch/
├── notebooks/          # Weekly learning notebooks
├── models/             # Model architectures & checkpoints
├── data/               # Preprocessing and datasets
├── hf_deploy/          # Hugging Face config & deployment scripts
├── theoretical/        # Podcast & theoretical discussions
├── utils/              # Helper scripts
├── requirements.txt
└── README.md
```

## 🚀 Hugging Face Deployment

This project includes:

- Scripts to convert the model for 🤗 Transformers compatibility
- Scripts to upload the model to the Hugging Face Hub
- An interactive demo on Hugging Face Spaces built with Gradio

You'll find detailed instructions inside the `hf_deploy/` folder.


## 📄 License

MIT License. See the LICENSE file for details.
