MIT's introductory program on deep learning methods with applications to natural language processing, computer vision, biology, and more! Students will gain foundational knowledge of deep learning algorithms, practical experience in building neural networks, and an understanding of cutting-edge topics including large language models and generative AI. The program concludes with a project proposal competition with feedback from staff and a panel of industry sponsors. Prerequisites are calculus (i.e. taking derivatives) and linear algebra (i.e. matrix multiplication); we'll try to explain everything else along the way! Experience in Python is helpful but not necessary. Listeners are welcome!
The 2026 in-person edition, held in MIT Room 32-123, has concluded. The online edition of the course goes live every Monday at 10am ET!
We expect only very elementary knowledge of linear algebra and calculus: how to multiply matrices, take derivatives, and apply the chain rule. Familiarity with Python is a big plus as well. The program will be beginner friendly since we have many registered students from outside of computer science.
If you would like to receive related updates and lecture materials, please subscribe to our YouTube channel and sign up for our mailing list.
All materials are open-sourced to the world for free and are copyrighted under the MIT license. If you are an instructor and would like to use any materials from this program (slides, labs, code), you must add the following reference to each slide:
© Alexander Amini and Ava Amini
MIT Introduction to Deep Learning
IntroToDeepLearning.com
If you are an MIT student, postdoc, faculty, or affiliate and would like to become involved with this program, please email introtodeeplearning-staff@mit.edu. We are always accepting new applications to join the program staff.
This class would not be possible without our amazing sponsors. It has been sponsored by Google, IBM, NVIDIA, Microsoft, Amazon, LambdaLabs, Tencent AI, Ernst and Young, and Onepanel. If you are interested in becoming involved in this program as a sponsor, please contact us at introtodeeplearning-staff@mit.edu.
Daniela Rus
EECS Faculty Sponsor
Anisha Parsan
Lead TA
Shrika Eddula
Lead TA
Victory Yinka-Banjo
Teaching Assistant
Adrian Mittal
Teaching Assistant
Vanessa Xiao
Teaching Assistant
Jeannie She
Teaching Assistant
Benjamin Najib
Teaching Assistant
John Werner
Community & Strategy
Copyright © MIT 6.S191.
Scientific discovery is a central driver of human progress and is grounded in the iterative formulation and refinement of hypotheses, tested against the physical world. While AI can amplify many aspects of this process, there are also specific new opportunities to accelerate key aspects by many orders of magnitude. This talk examines how modern deep learning methods are being integrated into discovery pipelines, focusing on concrete examples from atmospheric modelling, materials design, and drug discovery.
Christopher Bishop is a Microsoft Technical Fellow and a member of Microsoft Research AI for Science. Chris obtained a BA in Physics from Oxford and a PhD in Theoretical Physics from the University of Edinburgh, with a thesis on quantum field theory. He joined Microsoft in 1997 and was Lab Director of Microsoft Research Cambridge from 2015 until 2022, when he founded the new AI for Science team. At Microsoft Research, Chris oversees a global portfolio of research focussed on machine learning for the natural sciences.
This lecture covers how to scale the training of deep neural networks to thousands of GPUs. It begins by motivating why GPUs are essential for training (comparing the FLOPs of GPUs and CPUs) and why scaling to larger models and datasets improves performance, drawing on scaling laws from LLaMA and Kaplan et al. The talk then explores the memory requirements of training and techniques to reduce them, including activation checkpointing and offloading. The bulk of the lecture covers parallelism strategies: data parallelism, tensor parallelism, pipeline parallelism, and sequence/context parallelism, as well as sharding approaches like DeepSpeed ZeRO and FSDP. It also touches on sparsity through Mixture of Experts and expert parallelism. Throughout, network bandwidth is highlighted as a key bottleneck. The lecture concludes with a case study of LFM2 showing how these techniques combine in practice.
Mathias Lechner is Co-Founder and Chief Technology Officer (CTO) at Liquid AI, as well as a Research Affiliate at the Computer Science and Artificial Intelligence Laboratory (CSAIL) at MIT, where he collaborates with Prof. Daniela Rus. He completed his PhD in 2022 at the Institute of Science and Technology Austria (ISTA), under the supervision of Tom Henzinger. Before his PhD, he earned his master’s (2017) and bachelor’s (2016) degrees in Computer Science from the Vienna University of Technology (TU Wien).
Coming soon!
In 1942, Isaac Asimov introduced the Three Laws of Robotics as a literary ethical framework to explore robot safety and prevent harm to humans. Until recently, these concepts were purely theoretical in relation to real AI. However, more than 80 years later, the challenge of creating a robust ethical and safety layer for autonomous systems is a pressing reality. In this presentation, we will explore the core ideas behind Asimov's laws and conduct interactive, hands-on demonstrations that utilize and challenge current Deep Learning (DL) techniques. By examining the application and inherent limitations of modern safety protocols in DL systems, we will consider Three New Laws of AI designed for contemporary intelligent systems.
Douglas Blank is the Head of Research at Comet ML, where he works with many teams, including Engineering, Customer Success, and Product Design. Prior to Comet, Douglas completed his PhD in Computer Science and Cognitive Science at Indiana University, Bloomington. His thesis explored the training of neural networks to make analogies. He taught courses in Robotics, Cognitive Science, and Computer Science at Bryn Mawr College, where he created a research agenda called "Developmental Robotics," focusing on using Deep Learning as the foundation for a mentally developing robot.