
Course Schedule

| Week | Lecture | Links | Reading material |
| --- | --- | --- | --- |
| 9/1 | HW-0 Release | Campuswire | Fundamentals of Linear Algebra and Deep Learning |
| 9/3 | Part 1: Course overview | | 1. A framework for behavioural cloning, Bain and Sammut, 1999. |
| | Part 2: Supervised learning for control | | 2. A reduction of imitation learning and structured prediction to no-regret online learning, Ross et al., 2011. |
| 9/10 | Part 1: Introduction to RL | | Part 1: Chapter 1 of RL book. |
| | Part 2: Value functions | | Part 2: Chapter 3 of RL book. |
| 9/10 | HW-0 Due | | |
| 9/10 | HW-1 Release | | Imitation via Supervision. |
| 9/17 | Deep Q Learning | | 1. Human-level control through deep reinforcement learning, Mnih et al., 2015. |
| | | | 2. Rainbow: Combining Improvements in Deep Reinforcement Learning, Hessel et al., 2017. |
| | | | 3. Agent57, DeepMind blog post. |
| 9/24 | HW-1 Due | | |
| 9/24 | HW-2 Release | | Deep Q Learning++. |
| 9/24 | Policy gradients | | 1. REINFORCE, Williams, 1992. |
| | | | 2. A Natural Policy Gradient, Kakade, 2002. |
| | | | 3. Proximal Policy Optimization Algorithms, Schulman et al., 2017. |
| 10/1 | Part 1: Actor-critic methods | | 1. Deterministic Policy Gradient Algorithms, Silver et al., 2014. |
| | Part 2: Distributed RL | | 2. Asynchronous Methods for Deep Reinforcement Learning, Mnih et al., 2016. |
| | | | 3. Massively Parallel Methods for Deep Reinforcement Learning, Nair et al., 2015. |
| 10/8 | HW-2 Due | | |
| 10/8 | HW-3 Release | | RL with Policy Gradients. |
| 10/8 | Exploration in RL | | 1. Exploration strategies in deep RL, blog by Lilian Weng, 2020. |
| | | | 2. Intrinsic Motivation Systems for Autonomous Mental Development, Oudeyer et al., 2007. |
| | | | 3. Curiosity-driven Exploration by Self-supervised Prediction, Pathak et al., 2017. |
| 10/9 | Project Proposals Due | | |
| 10/15 | Generalization in RL | | 1. Quantifying Generalization in RL, blog by OpenAI, 2018. |
| | | | 2. Supersizing Self-supervision: Learning to Grasp from 50K Tries and 700 Robot Hours, Pinto and Gupta, 2016. |
| | | | 3. Visual Imitation Made Easy, Young et al., 2020. |
| 10/22 | HW-3 Due | | |
| 10/22 | HW-4 Release | | Exploration with Bandits. |
| 10/22 | Imitation Learning | | 1. Apprenticeship Learning via Inverse Reinforcement Learning, Abbeel and Ng, 2004. |
| | | | 2. Generative Adversarial Imitation Learning, Ho and Ermon, 2016. |
| | | | 3. Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations, Rajeswaran et al., 2018. |
| 10/29 | HW-4 Due | | |
| 10/29 | Control and planning | | 1. Iterative Linear Quadratic Regulator Design for Nonlinear Biological Movement Systems, Li and Todorov, 2004. |
| | | | 2. Benchmarking Model-Based Reinforcement Learning, Wang et al., 2019. |
| 11/5 | Latent State Discovery -- John Langford | | Guest Lecture |
| 11/12 | Multi-Agent Learning -- Noam Brown | | Guest Lecture |
| 11/19 | Unsupervised Reinforcement Learning -- Denis Yarats | | Guest Lecture |
| 12/3 | Current frontiers | | |
| 12/10 | Final Project Presentations and Writeups | | |