Transformers for Natural Language Processing and Computer Vision, Third Edition: Take Generative AI and LLMs to the next level with Hugging Face, Google Vertex AI, ChatGPT, GPT-4V, and DALL-E 3

by Denis Rothman
This repo is continually updated and upgraded.
Last updated: August 14, 2025
📝 For details on updates and improvements, see the Changelog.
🚩If you see anything that doesn't run as expected, raise an issue, and we'll work on it!
Look for 🐬 to explore new bonus notebooks, such as DeepSeek-R1 and OpenAI o1 reasoning models, Midjourney's API, Google Vertex AI Gemini's API, and OpenAI asynchronous batch API calls!
Look for 🎏 to explore existing notebooks for the latest model or platform releases, such as OpenAI's latest models (GPT-4o and o1).
Look for 🛠 to run existing notebooks with new dependency versions and platform API constraints and tweaks.
This is the code repository for Transformers for Natural Language Processing and Computer Vision, published by Packt.
Explore Generative AI and Large Language Models with Hugging Face, ChatGPT, GPT-4V, and DALL-E 3
Transformers for Natural Language Processing and Computer Vision, Third Edition, explores Large Language Model (LLM) architectures, applications, and various platforms (Hugging Face, OpenAI, and Google Vertex AI) used for Natural Language Processing (NLP) and Computer Vision (CV).
Dive into generative vision transformers and multimodal model architectures and build applications, such as image and video-to-text classifiers. Go further by combining different models and platforms and learning about AI agent replication.
- Learn how to pretrain and fine-tune LLMs
- Learn how to work with multiple platforms, such as Hugging Face, OpenAI, and Google Vertex AI
- Learn about different tokenizers and the best practices for preprocessing language data
- Implement Retrieval Augmented Generation (RAG) and rule bases to mitigate hallucinations
- Visualize transformer model activity for deeper insights using BertViz, LIME, and SHAP
- Create and implement cross-platform chained models, such as HuggingGPT
- Go in-depth into vision transformers with CLIP, DALL-E 2, DALL-E 3, and GPT-4V
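The full notebooks in the chapters cover these topics in depth. As a minimal, library-free sketch of the Retrieval Augmented Generation idea from the list above (all function names are illustrative, and the bag-of-words "embedding" is a toy stand-in for a real embedding model):

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words "embedding": word -> count.
    # A real pipeline would call an embedding model instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, k=1):
    # Rank documents by similarity to the query and keep the top k
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents):
    # Ground the model's answer in retrieved context to mitigate hallucinations
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."

docs = [
    "The transformer architecture relies on self-attention.",
    "RoBERTa is pretrained with dynamic masking.",
]
prompt = build_prompt("What does the transformer rely on?", docs)
```

The resulting prompt would then be sent to an LLM (e.g., via the OpenAI or Vertex AI APIs covered in the book); grounding the question in retrieved context is what distinguishes RAG from plain generation.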
- What Are Transformers?
- Getting Started with the Architecture of the Transformer Model
- Emergent vs Downstream Tasks: The Unseen Depths of Transformers
- Advancements in Translations with Google Trax, Google Translate, and Gemini
- Diving into Fine-Tuning through BERT
- Pretraining a Transformer from Scratch through RoBERTa
- The Generative AI Revolution with ChatGPT
- Fine-Tuning OpenAI GPT Models
- Shattering the Black Box with Interpretable Tools
- Investigating the Role of Tokenizers in Shaping Transformer Models
- Leveraging LLM Embeddings as an Alternative to Fine-Tuning
- Toward Syntax-Free Semantic Role Labeling with ChatGPT and GPT-4
- Summarization with T5 and ChatGPT
- Exploring Cutting-Edge LLMs with Vertex AI and PaLM 2
- Guarding the Giants: Mitigating Risks in Large Language Models
- Beyond Text: Vision Transformers in the Dawn of Revolutionary AI
- Transcending the Image-Text Boundary with Stable Diffusion
- Hugging Face AutoTrain: Training Vision Models without Coding
- On the Road to Functional AGI with HuggingGPT and its Peers
- Beyond Human-Designed Prompts with Generative Ideation
- Appendix: Answers to the Questions
You can run the notebooks directly from the table below:
| Chapter | Colab | Kaggle | Gradient | StudioLab |
|---|---|---|---|---|
| Part I: The Foundations of Transformer Models | | | | |
| Chapter 1: What Are Transformers? | | | | |
| | Open In Colab Open In Colab | Kaggle Kaggle | Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| Getting started with DeepSeek-R1 reasoning models, integrated into the Hugging Face Hub and Together | | | | |
| | Open In Colab | Kaggle | Gradient | Open In SageMaker Studio Lab |
| Chapter 2: Getting Started with the Architecture of the Transformer Model | | | | |
| | Open In Colab Open In Colab | Kaggle Kaggle | Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| Explaining DeepSeek's training innovations, Part 1: RL | | | | |
| | Open In Colab | Kaggle | Gradient | Open In SageMaker Studio Lab |
| Explaining DeepSeek's training innovations, Part 2: RoPE | | | | |
| | Open In Colab | Kaggle | Gradient | Open In SageMaker Studio Lab |
| Chapter 3: Emergent vs Downstream Tasks: The Unseen Depths of Transformers | | | | |
| | Open In Colab Open In Colab | Kaggle Kaggle | Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| Chapter 4: Advancements in Translations with Google Trax, Google Translate, and Google Bard | | | | |
| | Open In Colab Open In Colab | Kaggle Kaggle | Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| Chapter 5: Diving into Fine-Tuning through BERT | | | | |
| | Open In Colab | Kaggle | Gradient | Open In SageMaker Studio Lab |
| Chapter 6: Pretraining a Transformer from Scratch through RoBERTa | | | | |
| | Open In Colab Open In Colab | Kaggle Kaggle | Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| Part II: The Rise of Suprahuman NLP | | | | |
| Chapter 7: The Generative AI Revolution with ChatGPT | | | | |
| | Open In Colab Open In Colab Open In Colab Open In Colab | Kaggle Kaggle Kaggle Kaggle | Gradient Gradient Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| OpenAI reasoning models: the o1 API | | | | |
| | Open In Colab | Kaggle | Gradient | Open In SageMaker Studio Lab |
| OpenAI reasoning models: the o1-preview API | | | | |
| | Open In Colab | Kaggle | Gradient | Open In SageMaker Studio Lab |
| Chapter 8: Fine-Tuning OpenAI Models | | | | |
| | Open In Colab Open In Colab | Kaggle Kaggle | Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| Fine-Tuning GPT-4.1 | | | | |
| | Open In Colab | Kaggle | Gradient | Open In SageMaker Studio Lab |
| 🐬 RAG as an alternative to fine-tuning: Building Scalable Knowledge-Based RAG-Driven Generative AI | | | | |
| Click here to access an open-source library to implement RAG | | | | |
| Chapter 9: Shattering the Black Box with Interpretable Tools | | | | |
| | Open In Colab Open In Colab | Kaggle Kaggle | Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| Chapter 10: Investigating the Role of Tokenizers in Shaping Transformer Models | | | | |
| | Open In Colab Open In Colab Open In Colab | Kaggle Kaggle Kaggle | Gradient Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| Chapter 11: Leveraging LLM Embeddings as an Alternative to Fine-Tuning | | | | |
| | Open In Colab Open In Colab Open In Colab | Kaggle Kaggle Kaggle | Gradient Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| Chapter 12: Toward Syntax-Free Semantic Role Labeling with BERT and OpenAI's ChatGPT | | | | |
| | Open In Colab | Kaggle | Gradient | Open In SageMaker Studio Lab |
| Chapter 13: Summarization with T5 and ChatGPT | | | | |
| | Open In Colab Open In Colab | Kaggle Kaggle | Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| Chapter 14: Exploring Cutting-Edge NLP with Google Vertex AI (PaLM and 🐬 Gemini with gemini-1.5-flash-001) | | | | |
| | Open In Colab Open In Colab | Kaggle Kaggle | Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| Gemini 2.5 Flash showcase of generative AI tasks | | | | |
| | Open In Colab | Kaggle | Gradient | Open In SageMaker Studio Lab |
| Chapter 15: Guarding the Giants: Mitigating Risks in Large Language Models | | | | |
| | Open In Colab Open In Colab Open In Colab Open In Colab Open In Colab Open In Colab | Kaggle Kaggle Kaggle Kaggle Kaggle Kaggle | Gradient Gradient Gradient Gradient Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab Open In SageMaker Studio Lab Open In SageMaker Studio Lab Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| Part III: Generative Computer Vision: A New Way to See the World | | | | |
| Chapter 16: Vision Transformers in the Dawn of Revolutionary AI | | | | |
| | Open In Colab Open In Colab Open In Colab | Kaggle Kaggle Kaggle | Gradient Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| Chapter 17: Transcending the Image-Text Boundary with Stable Diffusion | | | | |
| | Open In Colab Open In Colab Open In Colab Open In Colab Open In Colab | Kaggle Kaggle Kaggle Kaggle Kaggle | Gradient Gradient Gradient Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab Open In SageMaker Studio Lab Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
| Stable Diffusion with Hugging Face | | | | |
| | Open In Colab | Kaggle | Gradient | Open In SageMaker Studio Lab |
| Chapter 18: Automated Vision Transformer Training | | | | |
| | Open In Colab | Kaggle | Gradient | Open In SageMaker Studio Lab |
| Chapter 19: On the Road to Functional AGI with HuggingGPT and its Peers | | | | |
| | Open In Colab | Kaggle | Gradient | Open In SageMaker Studio Lab |
| Chapter 20: Generative AI Ideation with Vertex AI, LangChain, and Stable Diffusion | | | | |
| | Open In Colab Open In Colab Open In Colab Open In Colab | Kaggle Kaggle Kaggle Kaggle | Gradient Gradient Gradient Gradient | Open In SageMaker Studio Lab Open In SageMaker Studio Lab Open In SageMaker Studio Lab Open In SageMaker Studio Lab |
If you encounter a problem in the notebooks, create an issue in this repository. We will be glad to provide support!
If you feel this book is for you, get your copy today!
Know more on the Discord server

You can join the community for the latest updates and discussions on the Discord server at Discord.
Download a free PDF

If you have already purchased a print or Kindle version of this book, you can get a DRM-free PDF version at no cost. Simply click on the link to claim your free PDF.
We also provide a PDF file with color images of the screenshots and diagrams used in this book at ColorImages.
Denis Rothman graduated from Sorbonne University and Paris-Cité University, designing one of the first patented encoding and embedding systems and teaching at Paris-I Panthéon-Sorbonne. He authored one of the first patented word-encoding systems and AI bots/robots. He began his career delivering a Natural Language Processing (NLP) chatbot for Moët et Chandon (LVMH) and an AI tactical defense optimizer for Airbus (formerly Aerospatiale). Denis then authored an AI optimizer for IBM and luxury brands, leading to an Advanced Planning and Scheduling (APS) solution used worldwide. LinkedIn