

There will be four homework assignments, posted to the class Campuswire. They must be submitted online (platform TBD), and no extensions will be granted. All answers must be typeset in either LaTeX or Microsoft Word and submitted as a PDF file; handwritten answers will not be accepted. Each homework will include one or more programming assignments. Assignment 0 serves solely to check prerequisites and will not be graded.


The assignments will be released on Campuswire.

| Assignment | Release Date | Due Date | Link |
| --- | --- | --- | --- |
| Assignment 0: Prerequisites | January 31 | - | Link |
| Assignment 1: NN, BC, multimodal BC | February 9 | February 23 | Link |
| Assignment 2: DAgger, reward-weighted regression | February 23 | March 8 | Link |
| Assignment 3: Off-Policy Reinforcement Learning | March 8 | March 29 | Link |
| Assignment 4: PPO | March 29 | April 12 | Link |

Collaboration Policy

Collaboration is encouraged, but the work you submit for assignments is expected to be entirely your own. That is, the writing and code must be yours, and you must fully understand everything that you hand in. Discussing the details of how to solve a problem is fine, but you must write the solution yourself. To avoid plagiarizing, you shouldn't be looking at someone else's solution while you write down your own. If you collaborated significantly (use your own discretion for "significantly") on a problem, list the people you collaborated with next to your solution.

Final Project

  • Project proposals (1 page) will be due on 2/23.
  • Maximum (and recommended) team size is 2.
  • Final presentations of all projects will take place on 5/3.