Course Schedule

Date	Lecture	Links	Reading material
9/1	HW-0 Release	Campuswire	Fundamentals of Linear Algebra and Deep Learning
9/3	Part 1: Course overview Part 2: Supervised learning for control		1. A framework for behavioural cloning, Bain and Sammut, 1999. 2. A reduction of imitation learning and structured prediction to no-regret online learning, Ross et al., 2011.
9/10	Part 1: Introduction to RL Part 2: Value functions		Part 1: Chapter 1 of RL book. Part 2: Chapter 3 of RL book.
9/10	HW-0 Due
9/10	HW-1 Release		Imitation via Supervision.
9/17	Deep Q Learning		1. Human-level control through deep reinforcement learning, Mnih et al., 2015. 2. Rainbow: Combining Improvements in Deep Reinforcement Learning, Hessel et al., 2017. 3. Agent57, DeepMind blog post.
9/24	HW-1 Due
9/24	HW-2 Release		Deep Q Learning++.
9/24	Policy gradients		1. REINFORCE, Williams, 1992. 2. A Natural Policy Gradient, Kakade, 2002. 3. Proximal Policy Optimization Algorithms, Schulman et al., 2017.
10/1	Part 1: Actor-critic methods Part 2: Distributed RL		1. Deterministic Policy Gradient Algorithms, Silver et al., 2014. 2. Asynchronous Methods for Deep Reinforcement Learning, Mnih et al., 2016. 3. Massively Parallel Methods for Deep Reinforcement Learning, Nair et al., 2015.
10/8	HW-2 Due
10/8	HW-3 Release		RL with Policy Gradients.
10/8	Exploration in RL		1. Exploration strategies in deep RL, blog by Lilian Weng, 2020. 2. Intrinsic Motivation Systems for Autonomous Mental Development, Oudeyer et al., 2007. 3. Curiosity-driven Exploration by Self-supervised Prediction, Pathak et al., 2017.
10/9	Project Proposals Due
10/15	Generalization in RL		1. Quantifying Generalization in RL, blog by OpenAI, 2018. 2. Supersizing Self-supervision: Learning to Grasp from 50K Tries and 700 Robot Hours, Pinto and Gupta, 2016. 3. Visual Imitation Made Easy, Young et al., 2020.
10/22	HW-3 Due
10/22	HW-4 Release		Exploration with Bandits.
10/22	Imitation Learning		1. Apprenticeship Learning via Inverse Reinforcement Learning, Abbeel and Ng, 2004. 2. Generative Adversarial Imitation Learning, Ho and Ermon, 2016. 3. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations, Rajeswaran et al., 2018.
10/29	HW-4 Due
10/29	Control and planning		1. Iterative Linear Quadratic Regulator Design for Nonlinear Biological Movement Systems, Li and Todorov, 2004. 2. Benchmarking Model-Based Reinforcement Learning, Wang et al., 2019.
11/5	Latent State Discovery -- John Langford		Guest Lecture
11/12	Multi-Agent Learning -- Noam Brown		Guest Lecture
11/19	Unsupervised Reinforcement Learning -- Denis Yarats		Guest Lecture
12/3	Current frontiers
12/10	Final Project Presentations and Writeups