# 🧠 Building LLMs from Scratch – A 30-Day Journey
This repository guides you through the process of building a GPT-style **Large Language Model (LLM)** from scratch using PyTorch. The structure and approach are inspired by the book ***Build a Large Language Model (From Scratch)*** by **Sebastian Raschka**.

---
## 📘 Reference Book
* **Title**: *Build a Large Language Model (From Scratch)*
* **Author**: Sebastian Raschka
* **Publisher**: Manning Publications
* **Link**: [manning.com/books/build-a-large-language-model-from-scratch](https://www.manning.com/books/build-a-large-language-model-from-scratch)
---
## 🗓️ Weekly Curriculum Overview
### 🔹 Week 1: Core Concepts of Language Modeling
* Set up your development environment and explore foundational concepts in NLP and tokenization.
* Learn how to numerically encode language, build vocabularies, and understand token embeddings.
* Grasp the importance of attention mechanisms and understand how to implement them manually, as in the sketch below.
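
A minimal sketch of those ideas, using a toy corpus, an 8-dimensional embedding, and hand-computed scaled dot-product attention; the corpus and sizes are purely illustrative, not the repository's actual code:

```python
import torch

# Toy corpus and vocabulary (illustrative only).
corpus = "the cat sat on the mat".split()
vocab = {tok: i for i, tok in enumerate(sorted(set(corpus)))}
ids = torch.tensor([vocab[tok] for tok in corpus])   # shape: (6,)

# Token embeddings: one trainable vector per vocabulary entry.
emb = torch.nn.Embedding(num_embeddings=len(vocab), embedding_dim=8)
x = emb(ids)                                         # shape: (6, 8)

# Scaled dot-product self-attention, computed by hand:
# each output row is a weighted mixture of all token embeddings.
scores = x @ x.T / x.shape[-1] ** 0.5                # pairwise similarities
weights = torch.softmax(scores, dim=-1)              # each row sums to 1
context = weights @ x
print(context.shape)                                 # torch.Size([6, 8])
```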
---
### 🔹 Week 2: Building the Transformer
* Dive into the architecture of Transformer models from the ground up.
* Learn about positional encoding, residual connections, normalization, and multi-head attention.
* Construct and test a decoder-style Transformer (like GPT) with causal masking, as sketched below.
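
As a rough sketch of causal masking, here is single-head self-attention with a triangular mask; the learned Q/K/V projections, multi-head split, residual connections, and normalization of a full GPT block are omitted for brevity:

```python
import torch
import torch.nn.functional as F

def causal_self_attention(x: torch.Tensor) -> torch.Tensor:
    """Single-head causal self-attention; x has shape (seq_len, d_model)."""
    seq_len, d_model = x.shape
    q = k = v = x                       # real blocks use learned Q/K/V projections
    scores = q @ k.T / d_model ** 0.5
    # Causal mask: position i may attend only to positions <= i,
    # so the model can never peek at future tokens during training.
    future = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(future, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

out = causal_self_attention(torch.randn(4, 16))
print(out.shape)                        # torch.Size([4, 16])
```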
---
### 🔹 Week 3: Training and Optimization
* Prepare and preprocess datasets such as TinyShakespeare or WikiText.
* Create efficient data pipelines and define model training loops.
* Apply optimizer strategies, monitor model perplexity, and manage model checkpoints; a skeleton loop is sketched below.
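
A skeleton of such a loop might look as follows; `model` and `dataloader` are placeholders for whatever the notebooks define, and the hyperparameters are illustrative. Since the loss is mean cross-entropy in nats, its exponential is the perplexity:

```python
import torch
import torch.nn.functional as F

def train(model, dataloader, epochs=3, lr=3e-4, ckpt_path="checkpoint.pt"):
    """Next-token training: model maps (B, T) token ids to (B, T, V) logits;
    each batch is (inputs, targets) with targets shifted one token right."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for epoch in range(epochs):
        for inputs, targets in dataloader:
            logits = model(inputs)
            loss = F.cross_entropy(logits.view(-1, logits.size(-1)),
                                   targets.view(-1))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        # Perplexity is the exponential of the mean cross-entropy loss.
        print(f"epoch {epoch}: loss {loss.item():.3f}, ppl {loss.exp().item():.1f}")
        torch.save({"epoch": epoch, "model": model.state_dict()}, ckpt_path)
```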
---
### 🔹 Week 4: Evaluation and Hugging Face Deployment
* Implement text generation methods including greedy and top-k sampling (see the sketch after this list).
* Evaluate the model's outputs and compare them with other LLMs.
* Learn how to convert your model to a Hugging Face-compatible format and push it to the Hub.
* Create a Hugging Face Space using Gradio to serve your model with an interactive UI.
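
A compact sketch of both decoding strategies, assuming a model that maps a `(batch, seq_len)` tensor of token ids to `(batch, seq_len, vocab_size)` logits; greedy decoding is deterministic, while top-k sampling adds diversity by sampling among the k most likely tokens:

```python
import torch

@torch.no_grad()
def generate(model, ids, max_new_tokens=50, top_k=None):
    """Extend ids (shape (1, T)) one token at a time.
    Greedy decoding by default; pass top_k for top-k sampling."""
    for _ in range(max_new_tokens):
        logits = model(ids)[:, -1, :]          # logits for the last position
        if top_k is None:
            next_id = logits.argmax(dim=-1, keepdim=True)            # greedy
        else:
            vals, idx = torch.topk(logits, top_k)                    # k best logits
            probs = torch.softmax(vals, dim=-1)
            next_id = idx.gather(-1, torch.multinomial(probs, 1))    # sample among them
        ids = torch.cat([ids, next_id], dim=1)
    return ids
```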
---
## 🛠️ Getting Started
### Prerequisites
* Python 3.8+
* PyTorch
* NumPy
* Matplotlib
* JupyterLab or Jupyter Notebook
* Hugging Face libraries: `transformers`, `datasets`, `huggingface_hub`
* `gradio` for deployment
### Installation
```bash
git clone https://github.com/codewithdark-git/Building-LLMs-from-scratch.git
cd Building-LLMs-from-scratch
pip install -r requirements.txt
```
---
## 📁 Project Structure
```
Building-LLMs-from-scratch/
├── notebooks/ # Weekly learning notebooks
├── models/ # Model architectures & checkpoints
├── data/ # Preprocessing and datasets
├── hf_deploy/ # Hugging Face config & deployment scripts
├── utils/ # Helper scripts
├── requirements.txt
└── README.md
```
---
## 🚀 Hugging Face Deployment
This project includes:
* Scripts to convert the model for 🤗 Transformers compatibility
* A workflow for uploading the model to the Hugging Face Hub
* A Gradio app for launching an interactive demo on Hugging Face Spaces
95+
96+
You’ll find detailed instructions inside the `hf_deploy/` folder.
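
As a rough outline of the flow (the repo id below is hypothetical, and `complete` stands in for your model's actual generation function):

```python
import gradio as gr
from huggingface_hub import HfApi

# Push the deployment folder to a model repo on the Hub.
api = HfApi()
api.create_repo("your-username/building-llms-from-scratch", exist_ok=True)
api.upload_folder(folder_path="hf_deploy",
                  repo_id="your-username/building-llms-from-scratch")

def complete(prompt: str) -> str:
    # Placeholder: call your trained model's generation function here.
    return prompt + " ..."

# Minimal Gradio UI; on a Hugging Face Space this would typically live in app.py.
gr.Interface(fn=complete, inputs="text", outputs="text").launch()
```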
---
## 📚 Resources
* [Transformers Docs](https://huggingface.co/docs/transformers)
* [Hugging Face](https://huggingface.co)
---
## 📄 License
MIT License — see the `LICENSE` file for details.
