**CPSC 532Y: Causal Machine Learning**

**Instructor**: [Mathias Lécuyer](https://mathias.lecuyer.me)

**Schedule**: MW 10:00-11:20 -- Term 1 (September - December 2024)

**Location**: [SWNG 210](https://maps.ubc.ca/?code=SWNG) (the class is in person)

**Office Hours**: right after class.

**Logistics**: Logistics and discussions will happen on Piazza [^piazza]. You can [login through Canvas here](https://canvas.ubc.ca/courses/155998/). **If you want to audit or attend the first sessions, please feel free to join**. Every UBC student should be able to access Canvas for now.

**Previous offerings**: while anything might change, this session will be similar to previous offerings [2023w1](previous_offerings/2023w1), [2022w1](previous_offerings/2022w1).

Objectives
==========

This class has two main educational goals:

1. Cover the basics of causal inference, including:
    a. The Potential Outcomes framework ("Rubin model")
    b. The Causal Graph framework ("Pearl model")
    c. Goals of causal inference, and core techniques
2. Introduce recent developments using Machine Learning (ML) to extend the core techniques, and address some of their limitations. Topics include:
    a. Using ML to estimate heterogeneous causal effects, including in regression (meta-learners, random forests, orthogonal learning…) and IV settings (deep-IV, method of moments).
    b. Evaluating causal ML models.

We will cover these topics through a mix of lectures, paper readings, and in-class discussions of the material.

**This is a graduate seminar, and all students are expected to be actively involved during lectures and paper discussions. The grading will reflect these expectations.**

Prerequisites
=============

While there are no formal prerequisites, so that anyone can take the class, a general, basic background in the following topics is assumed: probability, statistics, and Machine Learning. The assignments and mini-project will also require the ability to program in Python.
It should be very possible for a motivated student to learn any missing background on their own as the class progresses (and students with missing background knowledge are expected to).

Evaluation
==========

Your course grade will be based on (potentially a subset of) the following: in-class participation, assignments, project.

Paper Readings
==============

Reading all assigned papers, as well as regular discussion participation, is required and part of the grade. When reading a paper, focus on:

- The formal causal question addressed by the paper.
- The core challenges and main contributions (how they solve the challenges).
- Data settings / assumptions / models, and examples of what applying the approach looks like.

To encourage informed and lively discussions, before each class with a paper discussion you are required to submit a short write-up that highlights:

- What you think is the main take-away of the paper.
- One **specific** point that you liked or disliked or have questions on. It needs to be specific (e.g., not "there was too much math" or "it was hard to read"), and not too obscure (e.g., not "I didn't get this line in the proof," though we can discuss specifics about results and their proofs in class). As a concrete example, it could be about the meaning or validity of a hypothesis.

**Deliverable**: A pdf, written in LaTeX, **emailed to my UBC email with subject [532Y reading] your-name and lecture date**. One or two paragraphs.

**Due date**: Before the class during which we discuss the paper.

Assignments
===========

Assignment 1
------------

In the first assignment, you will create a data setting that demonstrates the pitfalls of causal inference, and some solutions.

1. Create a generative data model that illustrates the challenges of causal inference (i.e., when "naive" observational estimators differ from causal ones). Make sure to explicitly state the causal effect of interest. Be concise but precise when describing your model (e.g., use notation such as x ~ N(0, 1) to define the variables and their relationships).
2. Using the potential outcomes framework, show theoretically that the "naive" observational estimator (difference of conditional expectations) is biased.
3. Empirically demonstrate this bias by implementing your generative data model and showing (at least) one plot demonstrating the issue.
4. Give at least two estimators that exactly identify the causal effect. What assumptions do these estimators rely on? Why are those assumptions verified in your model? Prove identification under those assumptions.
5. Implement those estimators and demonstrate using plots that they correctly identify the causal effect of interest.

**Deliverable**: A pdf, written in LaTeX, **emailed to my UBC email with subject [532Y HW1] your-name**. **At most 2 pages, including plots and formulas** (less is more).

**Due date**: Oct. 1, 2024.

Assignment 2
------------

In the second (and last) assignment, you will create a data setting that demonstrates the pitfalls of fitting CATE models, and some solutions.

1. Create a generative data model that illustrates the challenges of causal inference: make sure to explicitly state your causal model (you can use plots for complex functions and other parts to make the communication clear and precise).
2. On this model, implement a "naive" CATE model (potentially using a meta-algorithm that combines several models) to empirically demonstrate the challenges of convergence. Show one version that does not converge to the CATE, and one that converges slowly. Use plots of the fit, as well as plots of the relevant error as a function of dataset size, to make your point.
3. Explain the intuition for why your data model creates those challenges, due to its interaction with your model class (this does not need to be fully formal, but consider using counterfactual notation to make your point---don't be too lengthy!).
4. Show an orthogonal ML version that will converge faster. Briefly explain why, based on the paper's results (no need to re-derive results from the paper! but be as formal as you can).
5. Implement this orthogonal ML approach with model classes of your choice (justify those choices), and show plots for how this impacts your results compared to the approach in 2.

**Deliverable**: A pdf, written in LaTeX, **emailed to my UBC email with subject [532Y HW2] your-name**. **At most 3 pages, including plots and formulas** (less is more).

**Due date**: Nov. 5, 2024.

Syllabus
========

See [what we covered last year](previous_offerings/2023w1/#schedule).

Schedule
--------

4 Sep 2024: Introduction: Why Causal Inference?

9 Sep 2024: The Potential Outcomes framework (lecture)
Definitions and notations.

11 Sep 2024: The Potential Outcomes framework (lecture)
Concept: identifiability. Assumptions: (conditional) ignorability. Techniques and intuition: randomization.

16 Sep 2024: The Potential Outcomes framework (lecture)
Identifiability without randomization, estimation, heterogeneity. Confidence intervals.

18 Sep 2024: Linear regression for causal inference (lecture)
Basic facts, interpretation(s) as an estimator for the ATE (using potential outcomes), confidence intervals.

23 Sep 2024: Linear regression and Propensity Scores (lecture)
Finish linear regression, start on propensity scores.

25 Sep 2024: Propensity Scores (lecture)
Identification through conditioning on the propensity. Estimators: Inverse Propensity Scores; Doubly Robust estimators.

30 Sep 2024: 🌴 *No Class*
National Day for Truth and Reconciliation

1 Oct 2024: ⚠️ Assignment 1 due
[Assignment 1](#a1) is due. Deliverable: a pdf, written in LaTeX, **emailed to my UBC email with subject [532Y HW1] your-name**. **At most 2 pages, including plots and formulas** (less is more).
2 Oct 2024: Causal Decision Trees (paper)
[Recursive Partitioning for Heterogeneous Causal Effects (Athey, Imbens, 2013/2015)](https://arxiv.org/abs/1504.01132). [**See guidelines and deliverable here**](#paperreadings).

7 Oct 2024: Meta-learners (paper)
[Meta-learners for Estimating Heterogeneous Treatment Effects using Machine Learning (Künzel, Sekhon, Bickel, Yu, 2019)](https://arxiv.org/abs/1706.03461). [**See guidelines and deliverable here**](#paperreadings).

9 Oct 2024: Orthogonal Learning (paper)
[Orthogonal Statistical Learning](https://arxiv.org/pdf/1901.09036), up to and **including** section 3. Pay attention to the examples (and ideally the associated proofs in the appendix!). [**See guidelines and deliverable here**](#paperreadings).

14 Oct 2024: 🌴 *No Class*
Thanksgiving Day

16 Oct 2024: Orthogonal Learning (paper)
We will continue the discussion of the [Orthogonal Learning](#schedule1_2024-10-9) paper.

21 Oct 2024: Review Session
[Post topics/questions on Piazza](https://piazza.com/class/m0n0fzottdo637/post/14).

22 Oct 2024: ⚠️ Submit a project idea [optional]
Optional but encouraged! I'll send (short) feedback, and it's an opportunity to chat about it before the deadline too. You can submit with a group (groups of 1 to 4 are fine--if you have a good reason for a larger group, let me know), or form a group later. If you decide to do it, **email me your pitch with subject [532Y project]**. Make sure to include the group in the email if you already have one.

23 Oct 2024: Structural Causal Models (lecture)
Definitions: DAGs, SCMs, Do().

28 Oct 2024: Structural Causal Models (lecture)
Definitions: DAG structures.

30 Oct 2024: Structural Causal Models (lecture)
Backdoor criterion, do-calculus.
4 Nov 2024: Structural Causal Models (lecture)
Using do-calculus, counterfactuals.

6 Nov 2024: Structural Causal Models (lecture)
Ladder of Causation, PNS.

8 Nov 2024: ⚠️ Project group and proposal
Deliverable: a pdf, written in LaTeX, **emailed to my UBC email with subject [532Y Project] names of group members**. A **short** (one paragraph) abstract for your project. Make sure you define the causal question, causal strategy, and expected outcomes (that last one doesn't need to be precise, but it'll help you). You can add other details and formalism (up to 2 pages total) if you want feedback.

11 Nov 2024: 🌴 *No Class*
Remembrance Day and Midterm Break

13 Nov 2024: 🌴 *No Class*
Midterm Break

18 Nov 2024: Instrumental Variables and RDDs (lecture)
TBD

19 Nov 2024: ⚠️ Assignment 2 due
[Assignment 2](#a2) is due. Deliverable: a pdf, written in LaTeX, **emailed to my UBC email with subject [532Y HW2] your-name**. **At most 3 pages, including plots and formulas** (less is more).

20 Nov 2024: IV with Deep Learning (papers)
[DeepIV](https://proceedings.mlr.press/v70/hartford17a/hartford17a.pdf) and [Deep Generalized Method of Moments](https://arxiv.org/abs/1905.12495). [**See guidelines and deliverable here**](#paperreadings).

25 Nov 2024: Causal Inference with Cocycles (paper)
[Causal Inference with Cocycles](https://arxiv.org/abs/2405.13844). Guest lecture by Benjamin Bloem-Reddy. [**See guidelines and deliverable here**](#paperreadings).

27 Nov 2024: Sensitivity analysis (paper)
[Sense and Sensitivity Analysis](https://proceedings.neurips.cc/paper/2020/hash/7d265aa7147bd3913fb84c7963a209d1-Abstract.html). [**See guidelines and deliverable here**](#paperreadings).

2 Dec 2024: Project presentations
Group order TBD.

4 Dec 2024: Project presentations
Group order TBD.

[^piazza]: In this course, you will be using Piazza, which is a tool to help facilitate discussions. When creating an account in the tool, you will be asked to provide personally identifying information.
Because this tool is hosted on servers in the U.S. and not in Canada, by creating an account you will also be consenting to the storage of your information in the U.S. Please know you are not required to consent to sharing this personal information with the tool, if you are uncomfortable doing so. If you choose not to provide consent, you may create an account using a nickname and a non-identifying email address, then let your instructor know what alias you are using in the tool.