CSC413/2516 Neural Networks and Deep Learning (Winter 2023)

Teaching staff:

Instructor and office hours:
- Bo Wang, Mon 2-3pm, Gather Town
TAs: Aida Ramezani (Head TA), Haonan Duan, Shihao Ma, Mustafa Ammous

Piazza: Students are encouraged to sign up for Piazza to join course discussions. If your question is about the course material and doesn’t give away any hints for the homework, please post it to Piazza so that the entire class can benefit from the answer.

Lecture and tutorial hours:

	Time	Location
Tutorial Lec0301	Tues 1-2 pm Thur 1-3 pm	ES B142

Prerequisites: Students should have taken courses on machine learning, linear algebra, and multivariate calculus. Further, it is recommended that students have some basic familiarity with statistical concepts. Finally, students must be proficient in reading and writing Python code. For a list of courses that serve as prerequisites for the undergraduate version of this course, see here.

Course Overview:

Deep learning is the branch of machine learning focused on training neural networks. Neural networks have proven to be powerful across a wide range of domains and tasks, including computer vision, natural language processing, speech recognition, and beyond. The success of these models is partially thanks to the fact that their performance tends to improve as more and more data is used to train them. Further, there have been many advances over the past few decades that have made it easier to attain good performance when using neural networks. In this course, we will provide a thorough introduction to the field of deep learning. We will cover the basics of building and optimizing neural networks in addition to specifics of different model architectures and training schemes. The course will cover portions of the “Dive into Deep Learning” textbook.

Assignments:

	Handout	Due
Assignment 1	pdf, starter code (make a copy in your own Drive)	Sept. 5 (out), due Sept. 26
Assignment 2	pdf	Sept. 26 (out), due Oct. 17
Assignment 3	pdf	Oct. 17 (out), due Nov. 7
Assignment 4	pdf	Nov. 7 (out), due Nov. 28
Course Project	pdf	Oct. 23 (out)

Calendar:

The following schedule is tentative; the content of each lecture may change depending on pacing. All readings refer to the corresponding sections in “Dive into Deep Learning”. Because the book is occasionally updated, the sections listed may become out of date. If a reading seems incongruous with the topic of the lecture, please let me know and I will check whether the sections have changed. Tutorials will more directly cover the background and tools needed for each homework assignment or, when preceding an exam, will consist of an exam review.

	Date	Topic	Slides	Suggested Readings	Homework
Lecture 1	Sept 5	Class introduction, linear & logistic regression	Slides	2.1-2.7 (optional), 3.1-3.5, 4.1-4.5; Roger Grosse’s notes: Linear Regression, Linear Classifiers, Training a Classifier	H1 assigned
Lecture 2	Sept 12	Multilayer Perceptrons & Backpropagation	Slides	3.6-3.7, 5.1-5.4, 5.6; Roger Grosse’s notes: Multilayer Perceptrons, Backpropagation
Lecture 3	Sept 19	Optimization & Generalization	Slides	12.1-12.6, 12.10; Roger Grosse’s notes: Automatic Differentiation, Distributed Representations, Optimization
Lecture 4	Sept 26	Convolutional Neural Networks and Image Classification	Slides	7.1-7.5; Roger Grosse’s notes: ConvNets, Image Classification. Related papers: Yann LeCun’s 1998 LeNet, AlexNet	H1 due, H2 assigned
Lecture 5	Oct 03	Batch/layer normalization, residual connections	Slides	8.5-8.6, Roger Grosse’s notes: Generalization, Exploding Vanishing Gradients. Related papers: Dropout, ResNet
Lecture 6	Oct 10	Recurrent Neural Networks, sequence-to-sequence learning	Slides	9.1-9.7, 10.1-10.8, Roger Grosse’s notes: RNNs, Exploding Vanishing Gradients. Related papers: LSTM, ResNet, Neural machine translation
Midterm exams	Oct 17				H2 due, H3 assigned
Lecture 8	Oct 24	Attention, Transformers and Autoregressive Models	Slides	11.1-11.7, 15.8-15.10; Related papers: Transformers, BERT pre-training, PixelRNNs, WaveNet, PixelCNNs.
Reading Week	Oct 31
Lecture 9	Nov 07	Generative Models: GAN, VAE, LLMs	Slides	11.8-11.9	H3 due, H4 assigned
Lecture 10	Nov 14	Diffusion models, vision-language models	Slides
Lecture 11	Nov 21	Additional architecture grab bag: GNNs, autoencoders, UNet, MoE	Slides
Lecture 12	Nov 28	Deep learning engineering; fairness, accountability, transparency, and recent trends in deep learning		13.5-13.6, 4.7	H4 due

Logistics:

Grading:

Homework, 50 points: There will be 4 homework assignments. Homework will consist of some combination of math and coding. Each homework is worth 12.5 points.
Midterm, 20 points: The midterm will take place on 10/17 and will cover all topics discussed before the midterm.
Final Project, 30 points: For the course project, you will implement a research idea related to the course material. Details will be released later.

Late work, collaboration rules, and the honor code:

Every student has a total of 7 grace days to extend coursework deadlines throughout the semester. Each grace day allows for a 24-hour deadline extension without late penalty. That is, you may apply grace days to a late submission to remove its late penalty. The maximum extension you may request is limited to the number of grace days you have remaining. We will keep track of the grace days on MarkUs. After the grace period, assignments will be accepted up to 3 days late, but 10% will be deducted for each day late, rounded up to the nearest day. After that, submissions will not be accepted and will receive a score of 0.
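
As an illustration only, the late policy above can be sketched in Python. This is not an official calculator: the function name `late_outcome` is hypothetical, and the sketch assumes that lateness is rounded up to whole days, that remaining grace days are applied before any penalty, that the 10% deduction is taken from the full assignment score, and that no grace days are consumed when a submission is too late to be accepted. MarkUs and the course staff remain the authority on how grace days are actually counted.

```python
import math

PENALTY_PERCENT_PER_DAY = 10   # from the policy: 10% per late day
MAX_PENALIZED_DAYS = 3         # from the policy: accepted up to 3 days late


def late_outcome(hours_late, grace_days_remaining):
    """Return (percent_of_score_kept, grace_days_used).

    Assumption: grace days are applied first, in whole days, and any
    remaining lateness is penalized at 10% per day.
    """
    if hours_late <= 0:
        return 100, 0  # on time: no penalty, no grace days used

    # Lateness is rounded up to the nearest whole day.
    days_late = math.ceil(hours_late / 24)

    # Cover as many late days as possible with remaining grace days.
    grace_used = min(days_late, grace_days_remaining)
    penalized_days = days_late - grace_used

    # Beyond the 3-day late window the submission is not accepted.
    # Assumption: grace days are not consumed in that case.
    if penalized_days > MAX_PENALIZED_DAYS:
        return 0, 0

    return 100 - PENALTY_PERCENT_PER_DAY * penalized_days, grace_used
```

For example, under these assumptions a submission that is 30 hours late counts as 2 days late; with only 1 grace day remaining, that day is used and the other late day costs 10%.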

You are welcome to work together with other students on the homework. You are also welcome to use any resources you find (online tutorials, textbooks, papers, chatbots, etc.) to help you complete the homework. However, on each assignment you must list any collaborators and resources you used to complete it. If you hand in homework that involved collaboration and/or makes use of content that you did not create and you do not disclose this, you will get a 0 for that homework. In addition, it is likely that you will be able to use some resource (be it another student, ChatGPT, or whatever) that can help you solve many of the homework problems. However, note that if you rely too much on such resources, you will likely not learn the material and will do poorly on the exams, during which such resources will not be available.

Resources:

Type	Name	Description
Related Textbooks	Deep Learning (Goodfellow et al., 2016)	The Deep Learning textbook is a resource intended to help students and practitioners enter the field of machine learning in general and deep learning in particular.
	Information Theory, Inference, and Learning Algorithms (MacKay, 2003)	A good introductory textbook that combines information theory and machine learning.
General Framework	PyTorch	An open source deep learning platform that provides a seamless path from research prototyping to production deployment.
Computation Platform	Colab	Colaboratory is a free Jupyter notebook environment that requires no setup and runs entirely in the cloud.
	GCE	Google Compute Engine delivers virtual machines running in Google’s innovative data centers and worldwide fiber network.
	AWS-EC2	Amazon Elastic Compute Cloud (EC2) forms a central part of Amazon.com’s cloud-computing platform, Amazon Web Services (AWS), by allowing users to rent virtual computers on which to run their own computer applications.