Commit bd6dec7

Author: Susannah Klaneček
Message: Add custom diagrams for GPT-2
1 parent 6f9f044 · commit bd6dec7

File tree

3 files changed: +3 −7 lines

6_gpt2_finetuned_text_generation.ipynb

Lines changed: 3 additions & 7 deletions
@@ -15,17 +15,13 @@
     "\n",
     "The model, a language model, was trained by just trying to predict the next word for many many millions of documents found on the web. This is called unsupervised learning because we don't have a set of labels we are trying to predict. \n",
     "\n",
-    "The GPT-2 blog post and paper do not go into much detail into how the model was designed. However, we know that they use a transformer architecture. At a high level, the Transformer converts input sequences into output sequences.\n",
+    "The GPT-2 blog post and paper do not go into much detail into how the model was designed. However, we know that they use a transformer architecture. At a high level, the Transformer converts input sequences into output sequences. It's composed of an encoding component and a decoding component.\n",
     "\n",
-    "![transformer at a high level](http://jalammar.github.io/images/t/the_transformer_3.png)\n",
-    "\n",
-    "The Transformer is composed of an encoding component and a decoding component.\n",
-    "\n",
-    "![encoding and decoding](http://jalammar.github.io/images/t/The_transformer_encoders_decoders.png)\n",
+    "![transformer at a high level](images/transformer.png)\n",
     "\n",
     "The Transformer is actually composed of stacks of encoders and decoders.\n",
     "\n",
-    "![stacks of encoders and decoders](http://jalammar.github.io/images/t/The_transformer_encoder_decoder_stack.png)\n",
+    "![stacks of encoders and decoders](images/stack_encoders_decoders.png)\n",
     "\n",
     "We can see a snapshot of how tensors flow through this encoder-decoder architecture:\n",
     "\n",

images/stack_encoders_decoders.png — 38.8 KB

images/transformer.png — 28.4 KB
