# 🧠 Building LLMs from Scratch – A 30-Day Journey

This repository guides you through the process of building a GPT-style Large Language Model (LLM) from scratch using PyTorch. The structure and approach are inspired by the book Build a Large Language Model (From Scratch) by Sebastian Raschka.


## 📘 Reference Book

*Build a Large Language Model (From Scratch)* by Sebastian Raschka.

## 🗓️ Weekly Curriculum Overview

### 🔹 Week 1: Foundations of Language Models

- Set up the environment and tools.
- Learn about tokenization, embeddings, and the idea of a "language model".
- Encode input/output sequences and build basic forward models.
- Understand unidirectional processing and causal language modeling.
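
As a first taste of these topics, here is a minimal sketch of character-level tokenization and an embedding lookup in PyTorch. The toy corpus, the vocabulary, and the 16-dimensional embedding size are arbitrary choices for illustration, not the repository's actual setup:

```python
import torch
import torch.nn as nn

# Build a character-level vocabulary from a toy corpus.
text = "hello world"
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}  # char -> id
itos = {i: ch for ch, i in stoi.items()}      # id -> char

def encode(s):
    return [stoi[c] for c in s]

def decode(ids):
    return "".join(itos[i] for i in ids)

ids = encode("hello")
assert decode(ids) == "hello"  # encoding round-trips

# Map token ids to dense vectors with a learnable embedding table.
emb = nn.Embedding(num_embeddings=len(chars), embedding_dim=16)
x = emb(torch.tensor([ids]))   # shape: (batch=1, time=5, dim=16)
print(x.shape)
```

Everything downstream (attention, the decoder stack, generation) operates on these `(batch, time, dim)` tensors.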

### 🔹 Week 2: Building the Transformer Decoder

- Explore Transformer components: attention, multi-head attention, and positional encoding.
- Implement residual connections, normalization, and feedforward layers.
- Build a GPT-style decoder-only transformer architecture.
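
The core of the decoder-only design is *causal* self-attention: each position may attend only to itself and earlier positions. A simplified single-head sketch in PyTorch (the class name and sizes are illustrative, not this repository's implementation):

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Single-head causal self-attention: position t sees only positions <= t."""
    def __init__(self, d_model, max_len=128):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)   # fused Q, K, V projection
        self.proj = nn.Linear(d_model, d_model)
        # Lower-triangular mask blocks attention to future tokens.
        self.register_buffer("mask", torch.tril(torch.ones(max_len, max_len)))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        att = (q @ k.transpose(-2, -1)) / math.sqrt(C)     # scaled dot-product
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        return self.proj(att @ v)

x = torch.randn(2, 10, 32)              # (batch, time, channels)
out = CausalSelfAttention(32)(x)
print(out.shape)                        # same shape as the input
```

A full GPT block wraps this in residual connections, layer normalization, and a feedforward layer, then stacks several such blocks.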

### 🔹 Week 3: Training and Dataset Handling

- Load and preprocess datasets like TinyShakespeare.
- Implement batch creation, context windows, and training routines.
- Use cross-entropy loss, optimizers, and learning rate schedulers.
- Monitor perplexity and improve generalization.
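
The batching idea can be sketched as follows, assuming the corpus is already encoded as a 1-D tensor of token ids. The `get_batch` helper and all sizes here are illustrative stand-ins:

```python
import torch
import torch.nn.functional as F

def get_batch(data, block_size, batch_size):
    """Sample random context windows; targets are the inputs shifted by one."""
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + 1 + block_size] for i in ix])
    return x, y

data = torch.randint(0, 65, (1000,))    # stand-in for encoded TinyShakespeare
xb, yb = get_batch(data, block_size=8, batch_size=4)

# Cross-entropy expects (N, vocab) logits against (N,) integer targets.
logits = torch.randn(4, 8, 65)          # stand-in for the model's output
loss = F.cross_entropy(logits.view(-1, 65), yb.view(-1))
ppl = loss.exp()                        # perplexity is exp of the mean loss
print(xb.shape, yb.shape, loss.item(), ppl.item())
```

In a real training loop, `logits` comes from the model, and an optimizer step plus a learning rate scheduler follow the loss computation.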

### 🔹 Week 4: Text Generation and Deployment

- Generate text using greedy, top-k, top-p, and temperature sampling.
- Evaluate and tune generation.
- Export and convert the model for Hugging Face compatibility.
- Deploy via the Hugging Face Hub and a Gradio Space.
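
Temperature and top-k sampling can be sketched in a few lines; `sample_next` is a hypothetical helper name, not code from this repository:

```python
import torch
import torch.nn.functional as F

def sample_next(logits, temperature=1.0, top_k=None):
    """Sample one token id from a (batch, vocab) logits tensor."""
    logits = logits / temperature            # <1.0 sharpens, >1.0 flattens
    if top_k is not None:
        v, _ = torch.topk(logits, top_k)
        # Zero out everything below the k-th largest logit.
        logits = logits.masked_fill(logits < v[..., -1, None], float("-inf"))
    probs = F.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1)

torch.manual_seed(0)
logits = torch.randn(1, 65)                  # one step of vocab-size-65 logits
nxt = sample_next(logits, temperature=0.8, top_k=10)
print(nxt.shape)                             # (1, 1): one token id per batch row
```

Greedy decoding is the `temperature -> 0` limit (an `argmax`), and top-p (nucleus) sampling instead keeps the smallest set of tokens whose cumulative probability exceeds `p`.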

## 🛠️ Getting Started

### Prerequisites

- Python 3.8+
- PyTorch
- NumPy
- Matplotlib
- JupyterLab or Jupyter Notebook
- Hugging Face libraries: `transformers`, `datasets`, `huggingface_hub`
- `gradio` for deployment

### Installation

```bash
git clone https://github.com/codewithdark-git/Building-LLMs-from-scratch.git
cd Building-LLMs-from-scratch
pip install -r requirements.txt
```

## 📁 Project Structure

```
Building-LLMs-from-scratch/
├── notebooks/          # Weekly learning notebooks
├── models/             # Model architectures & checkpoints
├── data/               # Preprocessing and datasets
├── hf_deploy/          # Hugging Face config & deployment scripts
├── theoretical/        # Podcast & theoretical discussions
├── utils/              # Helper scripts
├── requirements.txt
└── README.md
```

## 🚀 Hugging Face Deployment

This project includes:

- Scripts to convert the model for 🤗 Transformers compatibility
- Scripts to upload the model to the Hugging Face Hub
- An interactive demo on Hugging Face Spaces built with Gradio

You'll find detailed instructions inside the `hf_deploy/` folder.


## 📄 License

MIT License. See the LICENSE file for details.
