MIT's introductory course on deep learning methods with applications to computer vision, natural language processing, biology, and more! Students will gain foundational knowledge of deep learning algorithms and get practical experience in building neural networks in TensorFlow. The course concludes with a project proposal competition with feedback from staff and a panel of industry sponsors. Prerequisites assume calculus (i.e. taking derivatives) and linear algebra (i.e. matrix multiplication); we'll try to explain everything else along the way! Experience in Python is helpful but not necessary. This class is taught during MIT's IAP term by current MIT PhD researchers. Listeners are welcome!
Every day (M-F), 1:00-3:00pm EST
1:00pm-2:00pm: Technical lecture
2:00pm-3:00pm: Software labs and office hours on MIT Gather.Town
Note: Times above are for MIT students. All course materials will be released to the public afterwards.
We expect only very elementary knowledge of linear algebra and calculus: how to multiply matrices, take derivatives, and apply the chain rule. Familiarity with Python is a big plus as well. The course will be beginner friendly since we have many registered students from outside of computer science.
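As a rough self-check of the expected background, the short sketch below multiplies two small matrices and applies the chain rule to a simple composed function, comparing the result against a numerical estimate. It is an illustration only and not part of the course materials; the function names (`matmul`, `f`, `df_chain_rule`, `df_numeric`) and the example function f(x) = (3x + 1)^2 are our own assumptions, chosen for this example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (hypothetical helper)."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def f(x):
    # f(x) = g(h(x)) with h(x) = 3x + 1 and g(u) = u^2
    return (3 * x + 1) ** 2


def df_chain_rule(x):
    # Chain rule: f'(x) = g'(h(x)) * h'(x) = 2 * (3x + 1) * 3
    return 2 * (3 * x + 1) * 3


def df_numeric(x, eps=1e-6):
    # Central finite difference, used only to sanity-check the analytic derivative
    return (f(x + eps) - f(x - eps)) / (2 * eps)


# Entry (i, j) of the product is the dot product of row i of the first
# matrix with column j of the second.
product = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product)

# The analytic and numerical derivatives should agree closely.
print(df_chain_rule(2.0), df_numeric(2.0))
```

If both steps feel comfortable, you likely have the mathematical background the lectures assume.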
If you would like to receive course-related updates and lecture materials, please subscribe to our YouTube channel.
All course materials are available online for free but are copyrighted and licensed under the MIT license. If you are an instructor and would like to use any materials from this course (slides, labs, code), you must add the following reference to each slide:
© Alexander Amini and Ava Soleimany
MIT 6.S191: Introduction to Deep Learning
IntroToDeepLearning.com
If you are an MIT student, postdoc, faculty, or affiliate and would like to become involved with this course, please email introtodeeplearning-staff@mit.edu. We are always accepting new applications to join the course staff.
This class would not be possible without our amazing sponsors: Google, IBM, NVIDIA, Ernst and Young, LambdaLabs, and Onepanel. If you are interested in becoming involved in this course as a sponsor, please contact us at introtodeeplearning-staff@mit.edu.
Carmen Martin Alonso
William Chen
Kristian Georgiev
Shinjini Ghosh
Julia Moseyko
Jacob Phillips
Ryan Sander
Sam Sledzieski
Gilbert Yang
Copyright © MIT 6.S191.
We combine deep learning and Conditional Probabilistic Context Free Grammars (CPCFG) to create an end-to-end system for extracting structured information from complex documents. For each class of documents, we create a CPCFG that describes the structure of the information to be extracted. Conditional probabilities are modeled by deep neural networks. We use this grammar to parse 2-D documents to directly produce structured records containing the extracted information. This system is trained end-to-end with (Document, Record) pairs. We apply this approach to extract information from scanned invoices achieving state-of-the-art results.
Nigel is a technologist and entrepreneur serving as Global Artificial Intelligence (AI) Leader in Global Innovation at Ernst & Young (EY). In this role, he is responsible for the application of AI throughout EY. As leader of the EY AI Lab, he is responsible for projects driving strategic transformation of how we operate, compete and provide services. He is also strengthening relationships with start-ups and academic communities worldwide. He holds a master’s degree in Mathematics from University College Dublin and a PhD in Machine Learning from the University of California, Santa Cruz. His original research includes the first theoretical papers on gradient boosting.
Deep Learning has made exciting progress on many computer vision problems, but it requires large datasets that can be expensive and time-consuming to collect and label. Datasets also suffer from "dataset bias," which happens when the training data is not representative of the future deployment domain. Dataset bias is a major, pervasive problem in computer vision -- even the most powerful deep neural networks fail to generalize to out-of-sample data. A classic example of this is when a network trained to classify handwritten digits fails to recognize typed digits, but this problem happens in most practical situations, as no finite dataset is rich enough to represent the full complexity of the visual world. Can we solve dataset bias and learn with only a limited amount of supervision? Indeed, we can, under certain assumptions. I will describe past and recent work based on domain adaptation of deep learning models and point out several assumptions these methods make and situations they fail to handle. I will also describe recent efforts to improve adaptation by using unlabeled data to learn better features, with ideas from semi-supervised and self-supervised learning.
Kate is an Associate Professor of Computer Science at Boston University and a consulting professor for the MIT-IBM Watson AI Lab. She leads the Computer Vision and Learning Group at BU, is the founder and co-director of the Artificial Intelligence Research (AIR) initiative, and is a member of the Image and Video Computing research group. Kate received a PhD from MIT and did her postdoctoral training at UC Berkeley and Harvard. Her research interests are in the broad area of Artificial Intelligence with a focus on dataset bias, adaptive machine learning, learning for image and language understanding, and deep learning.
3D content is key in several domains such as architecture, film, gaming, and robotics. However, creating 3D content can be very time-consuming -- artists need to sculpt high-quality 3D assets, compose them into large worlds, and bring these worlds to life by writing behaviour models that "drive" the characters around in the world. This talk will discuss some of our recent efforts on introducing automation in the 3D content creation process using A.I.
Sanja Fidler is an Associate Professor at the Department of Computer Science, University of Toronto. She joined UofT in 2014. In 2018, she took on the role of Director of AI at NVIDIA, leading a research lab in Toronto. Previously she was a Research Assistant Professor at TTI-Chicago, a philanthropically endowed academic institute located on the campus of the University of Chicago. She completed her PhD in computer science at the University of Ljubljana in 2010, and was a postdoctoral fellow at the University of Toronto during 2011-2012. In 2010 she visited UC Berkeley as a visiting research scientist. She received the NVIDIA Pioneer of AI award, Amazon Academic Research Award, Facebook Faculty Award, and the Connaught New Researcher Award. In 2018 she was appointed as the Canadian CIFAR AI Chair. Her work on semi-automatic object instance annotation won the Best Paper Honorable Mention at CVPR’17. Her main research interests are scene parsing from images and videos, interactive annotation, 3D scene understanding, 3D content creation, and multimodal representations.
Investigate the role of tech and AI in improving healthcare and discuss the challenges we face when deploying in the real world.
Katherine Chou is the Director of Research and Innovations at Google, developing products that apply AI to healthcare and social good. Katherine is a serial intrapreneur at Google with a history of incubating products and establishing sustainable businesses. She previously developed products within Google[x] Labs for Life Sciences (now Verily) and ran global teams to develop partner solutions and establish developer ecosystems for Mobile Payments, Mobile Search, GeoCommerce, and Android. She is also a co-founder and committee chair for the AI for Social Good program at Google. Outside of Google, she is a Board member and Program Chair of Lewa Wildlife Conservancy, a fellow of the Zoological Society of London, and collaborates with other wildlife NGOs and the Cambridge Business Sustainability Programme in applying the Silicon Valley innovation mindset to new areas. She holds a double major in Computer Science and Economics and an M.S. in CS from Stanford University.