**CPSC 532Y / 538L: Causal Machine Learning**

**Instructor**: [Mathias Lécuyer](https://mathias.lecuyer.me)

**Schedule**: MW 12:00-13:30 -- Term 1 (September - December 2022)

**Location**: ICCS 246 (the class is in person, but there will be a Zoom link for exceptional cases)

**Office Hours**: Thu 11-noon, on Zoom (class link) or in ICICS 317. I am also often available after class on Monday.

**Logistics**: Logistics and discussions will happen on Piazza [^piazza]. You can [login through Canvas here](https://canvas.ubc.ca/courses/111875), which also shows the access code.

**If you want to audit or attend the first sessions, please feel free to join.**

Objectives
==========

This class has two main educational goals:

1. Cover the basics of causal inference, including:
    1. Definitions and notations.
    2. Usefulness and challenges.
    3. Core techniques, with their strengths and limitations:
        1. Conditioning on confounders (regression; matching; propensity scoring)
        2. Instrumental variables
        3. Panel data
        4. Mediating mechanisms
2. Introduce recent developments that use Machine Learning (ML) to extend the core techniques and address some of their limitations. Topics include:
    1. Using ML to estimate heterogeneous causal effects, both in regression (meta-learners, random forests, orthogonal learning…) and in IV settings (Deep IV, method of moments).
    2. Using ML to learn confounders.
    3. Using ML for sensitivity analysis.
    4. Causal discovery.
    5. Links with RL.

We will cover these topics through a mix of lectures, paper presentations by students, and in-class discussions of the material.

**This is a graduate seminar, and all students are expected to be actively involved during lectures and paper presentations. The grading will reflect these expectations.**

Prerequisites
=============

There are no formal prerequisites, so anyone may take the class, but a general, basic background in the following topics is assumed: probability, statistics, and machine learning.
The assignment and mini-project will also require the ability to program in Python. A motivated student should be able to learn any missing background on their own as the class progresses (and students with missing background knowledge are expected to).

Evaluation
==========

Your course grade will be based on the following breakdown:

- In-class participation: 30% + 10% (for going above and beyond)
- Paper presentation: 20%
- Assignment: 10%
- Mini-project: 25% + 5% (for going above and beyond)

In-class participation
----------------------

In-class participation makes up the bulk of the grade, based on the consistency (participating in most classes) and quality of your participation. There are extra points for going above and beyond in terms of quality and enthusiasm. Participation is important, and graded, in both lectures and paper discussions.

Paper presentation
------------------

Each student must give a 45-60 min presentation of one paper (potentially in groups, depending on the number of students). The class discussion will begin after the presentation (though questions during the presentation are encouraged), and the presenter(s) will be expected to lead the discussion. The following questions can help structure the presentation and discussion:

1. High level: What problem is the paper solving? Why is that problem important/challenging?
2. What is the precise causal setting of applicability? What are the key assumptions?
3. What are the key results? (What was the previous state of the art? How does this paper advance it?)
4. How is it evaluated?
5. What are the challenges in applying the proposed solution?
6. What related problems are still open?

Students are welcome to use slides, and will present over Zoom (in addition to in person when available). The presentation will be graded on content, clarity, and delivery, as well as on the quality of the follow-up discussion (the presenters are expected to encourage and manage the discussion).
Students who are not presenting should ask questions and participate in the discussion; this participation will count towards their general participation grade. **Preparing topics of discussion, comments, and questions is part of the requirements** for participation. Students can participate in discussions in class (in person, or via Zoom over voice or chat) as well as on Piazza. If you are unable to attend a class on a given day, you can still get participation points by sharing discussion questions on Piazza before the class.

Assignment
----------

There will be one intermediary, open-ended assignment. You can start the assignment whenever you want. Each student will turn in a short explanation ($\leq 2$ pages, including plots -- shorter is good), and present their code and results to me in a short meeting scheduled for this purpose.

Assignment:

1. Create a data setting (based on real or generated data) that illustrates the challenges of causal inference.
2. Express these challenges using the Potential Outcomes framework and causal graphs.
3. Empirically demonstrate these challenges.
4. Give at least one condition sufficient for identifiability (expressed with the proper formalism).
5. Demonstrate it empirically.

**Presented to me on Tue Oct 18 or Fri Oct 21.** You should start thinking about it!

Mini-project
------------

The "mini-project" is a longer, more developed version of the above assignment.

1. Use a real data setting, or create an involved/modular data generation setting. Bonus points for links to your research area (if applicable; not a requirement).
2. Formalize your setting and identification goals using the Potential Outcomes framework and causal graphs.
3. Implement and apply causal inference techniques we covered (or other ones!) to work towards your identification goals.
4. Comment on the robustness of your approach(es), theoretically and empirically.

Key milestones:

- Pitch your project ideas in class to find teammates and get feedback.
- Project proposal: tell me your project area and group, and attach any written material you would like feedback on ($\leq 2$ pages, non-binding).
- Presentations (format depending on the number of projects).
- Final report.

Feel free to chat with me (early!) about project ideas, and for feedback on your ideas/progress.

Syllabus
========

Syllabus sketch
---------------

Potential readings are linked to give a sense of what we will discuss. We will not have time to read and discuss all of them.

0. Why causal inference? (class overview)
1. Causal effect definitions and formalisms (lectures and discussions)
    1. The Potential Outcomes framework
    2. Causal graphs / structural causal models
    3. Identifiability
2. Core estimation approaches, and how to use/adapt ML accordingly (lectures and paper discussions)
    1. Conditioning on confounders [[1](https://www.researchgate.net/profile/Guido-Imbens-2/publication/274644919_Machine_Learning_Methods_for_Estimating_Heterogeneous_Causal_Effects/links/553c02250cf2c415bb0b1720/Machine-Learning-Methods-for-Estimating-Heterogeneous-Causal-Effects.pdf), [2](https://arxiv.org/abs/1706.03461), [3](https://arxiv.org/abs/1608.00060), [4](https://arxiv.org/abs/1901.09036), [5](https://proceedings.mlr.press/v139/jung21b.html)]
    2. Inverse propensity weighting [[1](https://arxiv.org/pdf/1812.03372.pdf), [2](https://arxiv.org/abs/1906.02120)]
    3. Instrumental variables [[1](https://proceedings.mlr.press/v70/hartford17a/hartford17a.pdf), [2](https://arxiv.org/abs/1905.12495)]
    4. Repeated observations (panel data)
    5. Mediating mechanisms
3. Newer developments and synergies with ML (paper discussions---tentative)
    1. Sensitivity analysis [[1](https://arxiv.org/abs/2003.01747), [2](https://proceedings.mlr.press/v67/gutierrez17a)]
    2. Learning deconfounders [[1](https://arxiv.org/abs/1805.06826), [2](https://arxiv.org/abs/1902.10286)]
    3. Causal discovery [[1](https://www.frontiersin.org/articles/10.3389/fgene.2019.00524/full), [2](https://arxiv.org/abs/1705.10220)]
    4. Links with RL [[1](https://papers.nips.cc/paper/2020/file/6d34d468ac8876333c4d7173b85efed9-Paper.pdf), [2](https://arxiv.org/abs/1511.03722), [3](https://arxiv.org/abs/1103.4601)]

Schedule (in progress)
----------------------

(Will be updated based on the speed of progress in the material we cover.)

7 Sep 2022: Introduction: Why Causal Inference?

12 Sep 2022: The Potential Outcomes framework (lecture)
Definitions and notations.

14 Sep 2022: The Potential Outcomes framework (lecture)
Concepts: identifiability. Assumptions: ignorability and positivity. Techniques and intuition: matching and conditioning.

19 Sep 2022: No class

21 Sep 2022: The Potential Outcomes framework (lecture, continued)
Concepts: identifiability. Assumptions: ignorability and positivity. Techniques and intuition: matching and conditioning.

26 Sep 2022: Regression (lecture)
Regression basics. Regression interpretation: as conditioning, as matching, as counterfactual prediction.

28 Sep 2022: Causal graphs / structural causal models (lecture)
DAGs, SCMs, (in)dependence patterns, the do-operator, link with Potential Outcomes.

03 Oct 2022: Causal graphs / structural causal models (lecture, continued)
Causal and non-causal paths, d-separation, back-door criterion.

05 Oct 2022: Causal graphs / structural causal models (lecture, continued)
Do-calculus, and examples:
- Understanding colliders.
- Mediating mechanisms.

10 Oct 2022: No class (Thanksgiving)

12 Oct 2022: Modeling Causal Effects (lecture)
We will cover (probably a subset of) the following topics:
- Inverse propensity weighting and its graphical interpretation,
- Weighted regression,
- Causal effect heterogeneity (reminder),
- Regression and orthogonality,
- Doubly robust estimators.
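To make the conditioning/weighting ideas above concrete (and as a minimal example of the kind of synthetic setting the assignment asks for), here is a sketch of a confounded data-generating process in which the naive difference in means is biased, while inverse propensity weighting recovers the true effect. All coefficients and variable names are made up, and the propensity score is known here only because we simulated it:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Confounder c drives both treatment assignment and the outcome.
c = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-2.0 * c))            # true propensity score P(T=1 | c)
t = rng.binomial(1, p)
y = 1.0 * t + 2.0 * c + rng.normal(size=n)    # true average treatment effect: 1.0

# Naive difference in means: biased, because treated units tend to have larger c.
naive = y[t == 1].mean() - y[t == 0].mean()

# Inverse propensity weighting (Horvitz-Thompson form) removes the bias.
ipw = np.mean(t * y / p) - np.mean((1 - t) * y / (1 - p))

print(f"naive: {naive:.2f}  ipw: {ipw:.2f}")  # ipw is close to 1.0; naive is not
```

With an estimated propensity score (e.g. a logistic regression of `t` on `c`) the same estimator applies; that estimation step is where the ML-based methods discussed in class come in.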
17 Oct 2022: ML for heterogeneous causal effects (paper)
[Machine Learning Methods for Estimating Heterogeneous Causal Effects (Athey, Imbens, 2013)](https://www.researchgate.net/profile/Guido-Imbens-2/publication/274644919_Machine_Learning_Methods_for_Estimating_Heterogeneous_Causal_Effects/links/553c02250cf2c415bb0b1720/Machine-Learning-Methods-for-Estimating-Heterogeneous-Causal-Effects.pdf)

18 Oct 2022: ⚠️ Show your assignment. Times TBD.

19 Oct 2022: ML for predicting counterfactuals (paper)
[Meta-learners for Estimating Heterogeneous Treatment Effects using Machine Learning (Künzel, Sekhon, Bickel, Yu, 2017)](https://arxiv.org/abs/1706.03461)

21 Oct 2022: ⚠️ Show your assignment. Times TBD.

24 Oct 2022: Importance weighting for neural nets (papers)
[Adapting Neural Networks for the Estimation of Treatment Effects (Shi, Blei, Veitch, 2019)](https://arxiv.org/abs/1906.02120), with context from [What is the Effect of Importance Weighting in Deep Learning? (Byrd, Lipton, 2019)](https://arxiv.org/abs/1812.03372)

26 Oct 2022: Double/debiased/orthogonal ML (paper 1)
[Double/Debiased Machine Learning for Treatment and Causal Parameters (Chernozhukov, Chetverikov, Demirer, Duflo, Hansen, Newey, Robins, 2016)](https://arxiv.org/abs/1608.00060)

31 Oct 2022: Double/debiased/orthogonal ML (paper 2)
[Orthogonal Statistical Learning (Foster, Syrgkanis, 2019)](https://arxiv.org/abs/1901.09036)

2 Nov 2022: Review session

7 Nov 2022: ⚠️ Pitch your project idea to the class

9 Nov 2022: No class (Midterm Break)

14 Nov 2022: Instrumental Variables (lecture)
- Natural experiments
- Instrumental variables
- Two-stage least squares (2SLS)
- Method of moments

16 Nov 2022: Deep IV (paper)
[Deep IV: A Flexible Approach for Counterfactual Prediction (Hartford, Lewis, Leyton-Brown, Taddy, 2017)](https://proceedings.mlr.press/v70/hartford17a/hartford17a.pdf)

18 Nov 2022: ⚠️ Project groups and topic
- Send me your project group and topic (either over email or on Piazza).
- Feel free to attach written material you would like feedback on ($\leq 2$ pages, non-binding).

21 Nov 2022: Deep Method of Moments (paper)
[Deep Generalized Method of Moments for Instrumental Variable Analysis (Bennett, Kallus, Schnabel, 2019)](https://arxiv.org/abs/1905.12495)

23 Nov 2022: Sensitivity analysis (paper)
[Sense and Sensitivity Analysis: Simple Post-Hoc Analysis of Bias Due to Unobserved Confounding (Veitch, Zaveri, 2020)](https://proceedings.neurips.cc/paper/2020/hash/7d265aa7147bd3913fb84c7963a209d1-Abstract.html)

28 Nov 2022: Causal discovery (papers)
[Permutation-based Causal Inference Algorithms with Interventions (Wang, Solus, Dai Yang, Uhler, 2017)](https://arxiv.org/abs/1705.10220), with context from this [Review of Causal Discovery Methods Based on Graphical Models (Glymour, Zhang, Spirtes, 2019)](https://www.frontiersin.org/articles/10.3389/fgene.2019.00524/full)

30 Nov 2022: No class (snow day; the deconfounder saga moved to Dec 5)

5 Dec 2022: The deconfounder saga (papers)
[The Blessings of Multiple Causes (Wang, Blei, 2018)](https://arxiv.org/abs/1805.06826) and [On Multi-Cause Causal Inference with Unobserved Confounding: Counterexamples, Impossibility, and Alternatives (D'Amour, 2019)](https://arxiv.org/abs/1902.10286)

7 Dec 2022: Links with RL (papers)
We will discuss a set of papers:
- [Empirical Likelihood for Contextual Bandits (Karampatziakis, Langford, Mineiro, 2020)](https://proceedings.neurips.cc/paper/2020/file/6d34d468ac8876333c4d7173b85efed9-Supplemental.pdf).
- [Doubly Robust Off-policy Value Evaluation for Reinforcement Learning (Jiang, Li, 2015)](https://arxiv.org/abs/1511.03722), with context from [Doubly Robust Policy Evaluation and Learning (Dudik, Langford, Li, 2011)](https://arxiv.org/abs/1103.4601).

This discussion aims to:
- link causal inference and bandits (in particular, off-policy evaluation);
- introduce techniques for dealing with time, and doubly robust estimators in action;
- and introduce a new technique: empirical likelihood.
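To give a flavor of the off-policy-evaluation link above, here is a minimal inverse-propensity-scoring (IPS) sketch for a two-action bandit. The policies and reward distributions are made up for illustration: logged data from a behavior policy is reweighted to estimate the value of a different target policy.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Behavior policy mu plays action 1 with probability 0.3; we log (action, reward).
mu1 = 0.3
a = rng.binomial(1, mu1, size=n)
r = np.where(a == 1, 1.0, 0.0) + rng.normal(scale=0.5, size=n)  # E[r|a=1]=1, E[r|a=0]=0

# Target policy pi always plays action 1; its true value is 1.0.
# IPS estimator: average of r * pi(a) / mu(a) over the logged data.
w = np.where(a == 1, 1.0 / mu1, 0.0)
v_ips = (w * r).mean()

# For comparison, the logged (on-policy) value of the behavior policy itself.
v_mu = r.mean()
```

Doubly robust estimators (the Jiang & Li and Dudik et al. papers above) combine this weighting with a learned reward model to reduce variance.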
16 Dec 2022: ⚠️ Reports due.
One paper-like project report per group, and one short report on personal contributions per student.

[^piazza]: In this course, you will be using Piazza, which is a tool to help facilitate discussions. When creating an account in the tool, you will be asked to provide personally identifying information. Because this tool is hosted on servers in the U.S. and not in Canada, by creating an account you will also be consenting to the storage of your information in the U.S. Please know you are not required to consent to sharing this personal information with the tool, if you are uncomfortable doing so. If you choose not to provide consent, you may create an account using a nickname and a non-identifying email address, then let your instructor know what alias you are using in the tool.