CSC413/2516 Fall 2024
Neural Networks and Deep Learning


Teaching staff:

  • Instructor and office hours:
  • TAs: Aida Ramezani (Head TA), Haonan Duan, Shihao Ma, Mustafa Ammous

Piazza: Students are encouraged to sign up for Piazza to join course discussions. If your question is about the course material and doesn’t give away any hints for the homework, please post it to Piazza so that the entire class can benefit from the answer.

Lecture and tutorial hours:

  Section   Time         Location
  Lec0301   Thur 1-3 pm  ES B142
  Tutorial  Tues 1-2 pm  ES B142

Prerequisites: Students should have taken courses on machine learning, linear algebra, and multivariate calculus. Further, it is recommended that students have some basic familiarity with statistical concepts. Finally, students must be proficient in reading and writing Python code. For a list of courses that serve as prerequisites for the undergraduate version of this course, see here.



Course Overview:

Deep learning is the branch of machine learning focused on training neural networks. Neural networks have proven to be powerful across a wide range of domains and tasks, including computer vision, natural language processing, speech recognition, and beyond. The success of these models is due in part to the fact that their performance tends to improve as they are trained on more and more data, and many advances over the past few decades have made it easier to attain good performance with them. In this course, we will provide a thorough introduction to the field of deep learning. We will cover the basics of building and optimizing neural networks, as well as the specifics of different model architectures and training schemes. The course will cover portions of the “Dive into Deep Learning” textbook.
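To give a concrete (purely illustrative) sense of what building and optimizing a neural network looks like in practice, here is a minimal PyTorch sketch; the layer sizes and the random batch are placeholders, not course material:

    import torch
    from torch import nn

    # A tiny multilayer perceptron; the sizes here are arbitrary placeholders.
    model = nn.Sequential(
        nn.Linear(784, 128),  # e.g. a flattened 28x28 image -> hidden units
        nn.ReLU(),
        nn.Linear(128, 10),   # hidden units -> 10 class logits
    )
    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    # One optimization step on a random batch (a stand-in for real data).
    x = torch.randn(32, 784)
    y = torch.randint(0, 10, (32,))
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()   # backpropagation: compute gradients
    optimizer.step()  # gradient descent: update the parameters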


Assignments:

                  Handout                                             Out / Due
Assignment 1      pdf, starter code (make a copy in your own Drive)   Sept. 5 (out), due Sept. 26
Assignment 2      pdf                                                 Sept. 26 (out), due Oct. 17
Assignment 3      pdf                                                 Oct. 17 (out), due Nov. 7
Assignment 4
Course Project    pdf                                                 Oct. 23 (out)



Calendar:

The following schedule is tentative; the content of each lecture may change depending on pacing. All readings refer to the corresponding sections of “Dive into Deep Learning.” Because the book is occasionally updated, the sections listed may become out of date. If a reading seems incongruous with the topic of the lecture, please let me know and I will check whether the sections have changed. Tutorials will more directly cover the background and tools needed for each homework assignment or, when preceding an exam, will consist of an exam review.

  Date      Topic      Slides      Suggested Readings      Homework
Lecture 1 Sept 5 Class introduction, linear & logistic regression Slides 2.1-2.7 (optional), 3.1-3.5, 4.1-4.5; Roger Grosse’s notes: Linear Regression, Linear Classifiers, Training a Classifier H1 assigned
Lecture 2 Sept 12 Multilayer Perceptrons & Backpropagation Slides 3.6-3.7, 5.1-5.4, 5.6; Roger Grosse’s notes: Multilayer Perceptrons, Backpropagation  
Lecture 3 Sept 19 Optimization & Generalization Slides 12.1-12.6, 12.10; Roger Grosse’s notes: Automatic Differentiation, Distributed Representations, Optimization  
Lecture 4 Sept 26 Convolutional Neural Networks and Image Classification Slides 7.1-7.5; Roger Grosse’s notes: ConvNets, Image Classification. Related papers: Yann LeCun’s 1998 LeNet, AlexNet H1 due, H2 assigned
Lecture 5 Oct 03 Batch/layer normalization, residual connections Slides 8.5-8.6, Roger Grosse’s notes: Generalization, Exploding Vanishing Gradients. Related papers: Dropout, ResNet  
Lecture 6 Oct 10 Recurrent Neural Networks, sequence-to-sequence learning Slides 9.1-9.7, 10.1-10.8, Roger Grosse’s notes: RNNs, Exploding Vanishing Gradients. Related papers: LSTM, ResNet, Neural machine translation  
Midterm exam Oct 17       H2 due, H3 assigned
Lecture 8 Oct 24 Attention, Transformers and Autoregressive Models Slides 11.1-11.7, 15.8-15.10; Related papers: Transformers, BERT pre-training, PixelRNNs, WaveNet, PixelCNNs.  
Reading Week Oct 31        
Lecture 9 Nov 07 Generative Models: GANs, VAEs, LLMs Slides 11.8-11.9 H3 due, H4 assigned
Lecture 10 Nov 14 Diffusion models, vision-language models      
Lecture 11 Nov 21 Additional architecture grab bag: GNNs, autoencoders, UNet, MoE      
Lecture 12 Nov 28 Deep learning engineering; fairness, accountability, transparency, and recent trends in deep learning   13.5-13.6, 4.7 H4 due

Logistics:

Grading:

  • Homework, 50 points: There will be 4 homework assignments. Homework will consist of some combination of math and coding. Each homework is worth 12.5 points.
  • Midterm, 20 points: The midterm will take place on 10/17 and will cover all topics discussed before the midterm.
  • Final Project, 30 points: For the course project, you will implement a research idea related to the course material. Details will be released later.

Late work, collaboration rules, and the honor code:

Every student has a total of 7 grace days to extend coursework deadlines throughout the semester. Each grace day allows a 24-hour deadline extension without late penalty; that is, you may apply grace days to a late submission to remove its late penalty. The maximum extension you may request is limited by the number of grace days you have remaining. We will keep track of grace days on MarkUs. After the grace period, assignments will be accepted up to 3 days late, with 10% deducted for each day late, rounded up to the nearest day. After that, submissions will not be accepted and will receive a score of 0.
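To make the arithmetic concrete, here is a small illustrative sketch (a hypothetical helper, not an official grade calculator) of how grace days and the late penalty combine under the policy above:

    import math

    def late_penalty(hours_late, grace_days_used):
        # Illustrative only: returns the fraction deducted from a submission.
        # Assumes each grace day covers 24 hours, then 10% per day late
        # (rounded up to the nearest day), and no credit after 3 late days.
        remaining = hours_late - 24 * grace_days_used
        if remaining <= 0:
            return 0.0                         # fully covered by grace days
        days_late = math.ceil(remaining / 24)  # round up to the nearest day
        if days_late > 3:
            return 1.0                         # no longer accepted: score of 0
        return 0.10 * days_late

    # e.g. 30 hours late with one grace day applied -> 6 hours over -> 10% off
    assert late_penalty(30, 1) == 0.10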

You are welcome to work together with other students on the homework. You are also welcome to use any resources you find (online tutorials, textbooks, papers, chatbots, etc.) to help you complete the homework. However, you must list, on each assignment, any collaborators and resources you used. If you hand in homework that involved collaboration and/or made use of content you did not create, and you do not disclose this, you will receive a 0 for that homework. Note that you will likely be able to find some resource (another student, ChatGPT, or whatever) that can help you solve many of the homework problems; if you rely too heavily on such resources, however, you will likely not learn the material and will do poorly on the exams, where such resources will not be available.


Resources:

Type Name Description
Related Textbooks Deep Learning (Goodfellow et al., 2016) The Deep Learning textbook is a resource intended to help students and practitioners enter the field of machine learning in general and deep learning in particular.
  Information Theory, Inference, and Learning Algorithms (MacKay, 2003) A good introductory textbook that combines information theory and machine learning.
General Framework PyTorch An open source deep learning platform that provides a seamless path from research prototyping to production deployment.
Computation Platform Colab Colaboratory is a free Jupyter notebook environment that requires no setup and runs entirely in the cloud.
  GCE Google Compute Engine delivers virtual machines running in Google’s innovative data centers and worldwide fiber network.
  AWS-EC2 Amazon Elastic Compute Cloud (EC2) forms a central part of Amazon.com’s cloud-computing platform, Amazon Web Services (AWS), by allowing users to rent virtual computers on which to run their own computer applications.