Interpretability and Explainability in Machine Learning

COMPSCI 282BR, Harvard University

Fall 2019, Class: Friday 12:00pm - 2:30pm, Maxwell Dworkin G125


As machine learning models are increasingly employed to aid decision makers in high-stakes settings such as healthcare and criminal justice, it is important to ensure that decision makers (end users) correctly understand, and consequently trust, the functionality of these models. This graduate-level course aims to familiarize students with recent advances in the emerging field of interpretable and explainable ML. We will review seminal position papers of the field, examine the notions of model interpretability and explainability, discuss in detail different classes of interpretable models (e.g., prototype-based approaches, sparse linear models, rule-based techniques, generalized additive models) and post-hoc explanations (black-box explanations, including counterfactual explanations and saliency maps), and explore the connections between interpretability and causality, debugging, and fairness. The course will also emphasize various applications that can benefit immensely from model interpretability, including criminal justice and healthcare.
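To make one of the interpretable model classes named above concrete, here is a minimal sketch of a sparse linear model (a Lasso fit with scikit-learn, which the prerequisites assume). The data and feature names are synthetic illustrations, not course material: the point is only that the fitted coefficient vector itself serves as the explanation, with irrelevant features driven to (near-)zero weight.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic data: 5 features, but the outcome depends only on f0 and f2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.1, size=200)

# The L1 penalty encourages sparsity, so the model is "self-explaining":
# each nonzero coefficient is a directly readable feature effect.
model = Lasso(alpha=0.1).fit(X, y)
for name, coef in zip(["f0", "f1", "f2", "f3", "f4"], model.coef_):
    print(f"{name}: {coef:+.2f}")
```

Contrast this with a post-hoc approach (covered later in the course), where the model is a black box and an explanation must be constructed after the fact.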


We will first review the fundamentals through lectures, readings, discussions, and assignments. There will be two homework assignments. After the first few lectures by the instructor, students will be expected to present research papers in class. Students will also carry out a semester-long project applying and extending ideas learned in the course. For further details about grading and course format, see this.


Students are expected to be fluent in basic linear algebra, probability, algorithms, and machine learning (at the level of CS181). Students are also expected to have programming and software engineering skills to work with data sets using Python, numpy, and sklearn.


Please use this form to provide feedback about the course.


Ike Lage

Teaching Fellow
Office Hours: Thursday 2:00pm - 3:00pm
Location: Maxwell Dworkin 337
Hima Lakkaraju

Instructor
Office Hours: Tuesday 3:30pm - 4:30pm
Location: Maxwell Dworkin 337
Webpage | Twitter


Schedule

Week 1 (September 6): Understanding Interpretability
  Slides | Video
  Readings: Doshi-Velez and Kim, 2017; Lipton, 2017
  Additional Reading: Weller, 2019

Week 2 (September 13): Evaluating Interpretability
  Slides | Video
  Readings: Lage et al., 2019; Poursabzi-Sangdeh, 2018
  Assignments / Deadlines: HW1 out on 09/16

Week 3 (September 20): Rule-Based Approaches
  Slides | Video
  Readings: Letham and Rudin, 2015; Lakkaraju et al., 2016
  Background Material: Metropolis-Hastings Algorithm; Submodular Optimization [Sections 1 & 2]
  Assignments / Deadlines: CP1 due on 09/26

Week 4 (September 27): Prototype-Based Approaches
  Slides | Video
  Readings: Li et al., 2017; Kim et al., 2014
  Background Material: Gibbs Sampling
  Assignments / Deadlines: HW2 out on 09/30; HW1 due on 10/03

Week 5 (October 4): Risk Scores & Generalized Additive Models
  Slides | Video
  Readings: Ustun and Rudin, 2017; Caruana et al., 2015
  Background Material: Cutting Plane Methods [Section 7.1]; Generalized Additive Models

Week 6 (October 11): Explaining Black-Box Models
  Slides | Video
  Readings: Ribeiro et al., 2016; Rudin, 2019
  Additional Reading: Ghorbani et al., 2019

Week 7 (October 18): Visualizing Model Behavior
  Slides 1 | Slides 2 | Video
  Readings: Zintgraf et al., 2017; Adebayo et al., 2018
  Assignments / Deadlines: HW2 due on 10/22

Week 8 (October 25): Feature Importance Based Explanations
  Slides | Video
  Readings: Lundberg and Lee, 2017; Kim et al., 2018
  Assignments / Deadlines: CP2 due on 10/28

Week 9 (November 1): Actionable Explanations
  Slides | Video
  Readings: Wachter et al., 2018; Ustun et al., 2018

Week 10 (November 8): Causal Models & Explanations
  Slides | Video
  Readings: Zhao and Hastie, 2018; Lakkaraju and Rudin, 2017

Week 11 (November 15): Human-in-the-Loop Models & Explanations
  Slides | Video
  Readings: Lage et al., 2018; Lakkaraju et al., 2019
  Assignments / Deadlines: CP3 due on 11/18

Week 12 (November 22): Connections with Debugging & Fairness
  Slides | Video
  Readings: Koh and Liang, 2017; Kleinberg and Mullainathan, 2019

Week 13 (November 29): Thanksgiving Holiday (no class)

Week 14 (December 6): Final Presentations
  Assignments / Deadlines: Final report due on 12/09

    © Hima Lakkaraju 2019