@@ -11,41 +11,38 @@ This repository guides you through the process of building a GPT-style **Large L
 * **Publisher**: Manning Publications
 * **Link**: [manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
 * **Free Version**: [On Github Gist](https://gist.github.com/codewithdark-git/e204e6c06546f652e76ced9d479d914e)
-* **Donwload pdf**: [PDF Version](https://raw.github.com/codewithdark-git/Building-LLMs-from-scratch/379208ccc204218f0ffc9114464b36d96a97505e/Building%20LLMs%20From%20Scratch.pdf)
+* **Download PDF**: [PDF Version](https://raw.github.com/codewithdark-git/Building-LLMs-from-scratch/379208ccc204218f0ffc9114464b36d96a97505e/Building%20LLMs%20From%20Scratch.pdf)
+
 ---

 ## 🗓️ Weekly Curriculum Overview

-### 🔹 Week 1: Core Concepts of Language Modeling
+### 🔹 Week 1: Foundations of Language Models

-* Set up your development environment and explore foundational concepts in NLP and tokenization.
-* Learn how to numerically encode language, build vocabularies, and understand token embeddings.
-* Grasp the importance of attention mechanisms and understand how to implement them manually.
+* Set up the environment and tools.
+* Learn about tokenization, embeddings, and the idea of a "language model".
+* Encode input/output sequences and build basic forward models.
+* Understand unidirectional processing and causal language modeling.

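To make the Week 1 topics concrete, here is a minimal, self-contained sketch (illustrative only, not code from this repository) of character-level tokenization and the shifted-by-one targets used in causal language modeling:

```python
# Minimal character-level tokenization and causal-LM targets (illustrative sketch).
text = "hello world"

# Build a vocabulary from the unique characters in the corpus.
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}   # string -> integer id
itos = {i: ch for ch, i in stoi.items()}       # integer id -> string

def encode(s: str) -> list[int]:
    return [stoi[c] for c in s]

def decode(ids: list[int]) -> str:
    return "".join(itos[i] for i in ids)

ids = encode(text)
# In causal language modeling the model predicts token t+1 from tokens <= t,
# so inputs and targets are the same sequence shifted by one position.
inputs, targets = ids[:-1], ids[1:]
print(inputs, targets, decode(ids))
```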
----
+### 🔹 Week 2: Building the Transformer Decoder

-### 🔹 Week 2: Building the Transformer
+* Explore Transformer components: attention, multi-head attention, and positional encoding.
+* Implement residual connections, normalization, and feedforward layers.
+* Build a GPT-style decoder-only transformer architecture.

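The heart of the Week 2 decoder is masked (causal) self-attention. A minimal single-head version might look like the sketch below (assuming PyTorch; the repository's actual modules may differ):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Single-head causal self-attention (illustrative sketch)."""

    def __init__(self, d_model: int, max_len: int = 256):
        super().__init__()
        self.qkv = nn.Linear(d_model, 3 * d_model)
        # Lower-triangular mask so position t can only attend to positions <= t.
        self.register_buffer("mask", torch.tril(torch.ones(max_len, max_len)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        att = (q @ k.transpose(-2, -1)) / (C ** 0.5)   # scaled dot-product scores
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        return att @ v                                  # weighted sum of values

x = torch.randn(2, 16, 64)               # (batch, sequence length, embedding dim)
print(CausalSelfAttention(64)(x).shape)   # torch.Size([2, 16, 64])
```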
-* Dive into the architecture of Transformer models from the ground up.
-* Learn about positional encoding, residual connections, normalization, and multi-head attention.
-* Construct and test a decoder-style Transformer (like GPT) with causal masking.
+### 🔹 Week 3: Training and Dataset Handling

----
+* Load and preprocess datasets like TinyShakespeare.
+* Implement batch creation, context windows, and training routines.
+* Use cross-entropy loss, optimizers, and learning rate schedulers.
+* Monitor perplexity and improve generalization.

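A Week 3 training step typically pairs random context windows with next-token targets and reports perplexity as the exponential of the cross-entropy loss. Below is a hedged sketch, assuming PyTorch, a 1-D `LongTensor` of token ids, and a hypothetical `model` that returns per-token logits:

```python
import torch
import torch.nn.functional as F

def get_batch(ids: torch.Tensor, block_size: int, batch_size: int):
    """Sample random context windows and their shifted-by-one targets (sketch)."""
    ix = torch.randint(len(ids) - block_size - 1, (batch_size,))
    x = torch.stack([ids[i : i + block_size] for i in ix.tolist()])
    y = torch.stack([ids[i + 1 : i + block_size + 1] for i in ix.tolist()])
    return x, y

def train_step(model, optimizer, ids, block_size=64, batch_size=8):
    # `model` is assumed to map (batch, block) token ids to (batch, block, vocab) logits.
    x, y = get_batch(ids, block_size, batch_size)
    logits = model(x)
    loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Perplexity is exp(cross-entropy); lower is better.
    return loss.item(), torch.exp(loss).item()
```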
-### 🔹 Week 3: Training and Optimization
+### 🔹 Week 4: Text Generation and Deployment

-* Prepare and preprocess datasets such as TinyShakespeare or WikiText.
-* Create efficient data pipelines and define model training loops.
-* Apply optimizer strategies, monitor model perplexity, and manage model checkpoints.
-
----
-
-### 🔹 Week 4: Evaluation and Hugging Face Deployment
-
-* Implement text generation methods including greedy and top-k sampling.
-* Evaluate the model's outputs and compare them with other LLMs.
-* Learn how to convert your model for Hugging Face Hub and push it live.
-* Create a Hugging Face Space using Gradio to serve your model with an interactive UI.
+* Generate text using greedy, top-k, top-p, and temperature sampling.
+* Evaluate and tune generation.
+* Export and convert model for Hugging Face compatibility.
+* Deploy via Hugging Face Hub and Gradio Space.

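The Week 4 sampling strategies can be combined in a single generation loop. An illustrative sketch (assuming PyTorch and a hypothetical `model` that returns logits) showing temperature scaling and top-k filtering:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, ids, max_new_tokens=50, temperature=1.0, top_k=40):
    """Autoregressive sampling with temperature and top-k filtering (sketch).

    `model` is assumed to return (batch, seq, vocab) logits for (batch, seq) token ids.
    """
    for _ in range(max_new_tokens):
        logits = model(ids)[:, -1, :] / temperature       # logits for the next token
        if top_k is not None:
            v, _ = torch.topk(logits, top_k)
            logits[logits < v[:, [-1]]] = float("-inf")   # keep only the k best tokens
        probs = F.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1) # sample one token per sequence
        ids = torch.cat([ids, next_id], dim=1)
    return ids
```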
 ---

@@ -67,7 +64,7 @@ This repository guides you through the process of building a GPT-style **Large L
 git clone https://github.com/codewithdark-git/Building-LLMs-from-scratch.git
 cd Building-LLMs-from-scratch
 pip install -r requirements.txt
-```
+````

 ---

@@ -79,6 +76,7 @@ Building-LLMs-from-scratch/
 ├── models/          # Model architectures & checkpoints
 ├── data/            # Preprocessing and datasets
 ├── hf_deploy/       # Hugging Face config & deployment scripts
+├── theoretical/     # Podcast & theoretical discussions
 ├── utils/           # Helper scripts
 ├── requirements.txt
 └── README.md
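For the Gradio-based Space mentioned in Week 4, a minimal app might look like the following sketch (hypothetical; the real deployment scripts live in the repository's `hf_deploy/` folder):

```python
# Hypothetical Gradio demo for a Hugging Face Space (illustrative sketch only).
import gradio as gr

def complete(prompt: str) -> str:
    # Replace this stub with a call to the trained model's generation function.
    return prompt + " ..."

demo = gr.Interface(fn=complete, inputs="text", outputs="text",
                    title="GPT-from-scratch demo")

if __name__ == "__main__":
    demo.launch()
```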
@@ -108,3 +106,4 @@ You’ll find detailed instructions inside the `hf_deploy/` folder.
 ## 📄 License

 MIT License — see the `LICENSE` file for details.
+