Skill Guide

Propensity score methods (matching, weighting, stratification)

Propensity score methods are statistical techniques used to estimate causal treatment effects from observational data by reducing selection bias through balancing covariates between treated and control groups.

These methods enable organizations to derive causal insights from non-experimental data, informing strategic decisions in marketing, medicine, and policy when randomized controlled trials are impractical or unethical. Accurate causal estimation directly impacts ROI measurement, risk assessment, and resource allocation.

How to Learn Propensity score methods (matching, weighting, stratification)

Focus on 1) Understanding counterfactual frameworks and the fundamental problem of causal inference; 2) Mastering logistic regression for estimating propensity scores; 3) Grasping the core concept of covariate balance as the primary diagnostic for successful application.
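As a compact reference for the three ideas above, the standard potential-outcomes notation can be written as follows. The symbols are conventional choices rather than anything this guide defines: T is the treatment indicator, X the pre-treatment covariates, and Y(1), Y(0) the potential outcomes.

```latex
\begin{aligned}
Y &= T\,Y(1) + (1 - T)\,Y(0)
  && \text{(only one potential outcome is ever observed)} \\
\mathrm{ATE} &= \mathbb{E}\left[\,Y(1) - Y(0)\,\right] \\
\mathrm{ATT} &= \mathbb{E}\left[\,Y(1) - Y(0) \mid T = 1\,\right] \\
e(X) &= \Pr(T = 1 \mid X)
  && \text{(the propensity score)} \\
\operatorname{logit} e(X) &= \beta_0 + \beta^{\top} X
  && \text{(a logistic-regression model for } e(X)\text{)}
\end{aligned}
```

The first line is the fundamental problem of causal inference: each unit reveals Y(1) or Y(0), never both, so the ATE and ATT must be identified through assumptions rather than observed directly.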

Transition to practice by 1) Applying methods to real-world datasets (e.g., medical claims, A/B test logs) and interpreting balance tables built on standardized mean differences (SMDs); 2) Implementing and comparing matching (nearest neighbor), weighting (inverse probability of treatment weighting, IPTW), and stratification techniques; 3) Avoiding common pitfalls like misspecified propensity models, extrapolation beyond common support, and ignoring effect modification.
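The standardized mean difference mentioned above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name and the age values are hypothetical, and in practice a package such as `cobalt` would produce the full balance table.

```python
from math import sqrt
from statistics import mean, variance

def standardized_mean_difference(treated, control):
    """SMD = (mean_t - mean_c) / sqrt((var_t + var_c) / 2).

    Assumes each group has at least two values and some variation,
    otherwise the pooled standard deviation is undefined or zero.
    """
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd

# Hypothetical covariate values (patient age) for the two groups.
age_treated = [52, 60, 47, 58, 63]
age_control = [41, 39, 50, 44, 36]

smd = standardized_mean_difference(age_treated, age_control)
print(f"SMD for age: {smd:.2f}")
```

A commonly cited rule of thumb treats an absolute SMD below 0.1 as adequate balance; the value here is far above that, which is what a covariate looks like before any adjustment when it strongly drives treatment.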

Master the domain by 1) Designing and justifying complex analytical pipelines that combine methods (e.g., doubly robust estimation); 2) Strategically aligning method selection with business constraints (e.g., sample size, required precision, stakeholder interpretability); 3) Mentoring teams on assumptions, limitations, and the responsible communication of causal estimates to decision-makers.

Practice Projects

Beginner

Project

Estimating the Effect of a Marketing Intervention on Customer Spend

Scenario

You have historical data from a non-randomized marketing campaign sent to a subset of customers. Your goal is to estimate the campaign's average causal effect on total spend.

How to Execute

1) Clean the data and define the treatment (received campaign) and outcome (spend). 2) Estimate propensity scores using logistic regression on pre-intervention customer demographics and behavior. 3) Perform 1:1 nearest neighbor matching without replacement and assess balance using SMDs. 4) Estimate the Average Treatment Effect on the Treated (ATT) from the matched sample.
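Steps 2 to 4 can be sketched end to end on simulated data. This is an illustration under stated assumptions, not a production pipeline: the data-generating process (one covariate, a true effect of 5) is invented, the logistic regression is fitted with a hand-written gradient ascent so the block needs only the standard library, and all function names are hypothetical. A real analysis would typically use `MatchIt` or one of the Python libraries listed under Tools & Frameworks.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, t, lr=0.5, epochs=500):
    """Fit P(T=1 | X) by batch gradient ascent; returns [intercept, coef_1, ...]."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, ti in zip(X, t):
            err = ti - sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

def propensity_scores(w, X):
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]

def match_nearest(ps, t):
    """Greedy 1:1 nearest-neighbor matching on the score, without replacement."""
    available = {i for i, ti in enumerate(t) if ti == 0}
    pairs = []
    for i in (i for i, ti in enumerate(t) if ti == 1):
        if not available:
            break  # no controls left; remaining treated units stay unmatched
        j = min(available, key=lambda c: (abs(ps[c] - ps[i]), c))
        pairs.append((i, j))
        available.remove(j)  # each control is used at most once
    return pairs

def att_from_pairs(pairs, y):
    """Mean treated-minus-control outcome difference across matched pairs."""
    return sum(y[i] - y[j] for i, j in pairs) / len(pairs)

# Simulated (hypothetical) customers: x is standardized prior spend, which
# raises both the chance of receiving the campaign and later spend.
random.seed(0)
n = 400
X = [[random.gauss(0, 1)] for _ in range(n)]
t = [1 if random.random() < sigmoid(0.8 * x[0] - 1.5) else 0 for x in X]
y = [50 + 10 * x[0] + 5 * ti + random.gauss(0, 1) for x, ti in zip(X, t)]

w = fit_logistic(X, t)
ps = propensity_scores(w, X)
pairs = match_nearest(ps, t)

treated_y = [yi for yi, ti in zip(y, t) if ti == 1]
control_y = [yi for yi, ti in zip(y, t) if ti == 0]
naive = sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)
att_hat = att_from_pairs(pairs, y)
print(f"naive difference: {naive:.2f}, matched ATT: {att_hat:.2f} (true effect: 5)")
```

The naive difference mixes the campaign effect with the fact that higher spenders were targeted, so it should overshoot the simulated effect, while the matched estimate should land closer to it. Balance on each covariate still needs to be checked with SMDs in the matched sample before the ATT is trusted.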

Intermediate

Case Study/Exercise

Comparing Propensity Score Approaches for a Healthcare Study

Scenario

Analyze a dataset to compare the effectiveness of two surgical procedures. The assignment to procedure was not randomized, creating potential confounding by patient severity and hospital characteristics.

How to Execute

1) Estimate a propensity score model including key confounders. 2) Implement and compare three analyses: a) Matching (Mahalanobis distance with a propensity score caliper), b) Inverse Probability Weighting (with truncation or stabilization to handle extreme weights), c) Stratification on propensity score quintiles. 3) Report and compare the estimated treatment effect, its variance, and the covariate balance achieved by each method. 4) Justify the most credible estimate based on diagnostics.
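Of the three analyses, stratification on quintiles is the simplest to sketch. The example below assumes propensity scores have already been estimated; the function name and the toy data are hypothetical, and weighting each stratum by its size targets the ATE (weighting by the number of treated units would target the ATT instead).

```python
from statistics import mean

def stratified_effect(ps, t, y, n_strata=5):
    """Average within-stratum mean differences, weighted by stratum size."""
    order = sorted(range(len(ps)), key=lambda i: ps[i])
    n = len(order)
    total, used = 0.0, 0
    for k in range(n_strata):
        stratum = order[k * n // n_strata:(k + 1) * n // n_strata]
        y1 = [y[i] for i in stratum if t[i] == 1]
        y0 = [y[i] for i in stratum if t[i] == 0]
        if not y1 or not y0:
            continue  # no contrast possible here: a sign of poor overlap
        total += len(stratum) * (mean(y1) - mean(y0))
        used += len(stratum)
    return total / used

# Toy data: 5 strata of 4 patients. Sicker strata (higher k) are treated more
# often and have higher baseline outcomes; the true effect is 2 everywhere.
ps, t, y = [], [], []
for k, n_treated in enumerate([1, 1, 2, 3, 3]):
    for j in range(4):
        treated = 1 if j < n_treated else 0
        ps.append(0.1 + 0.18 * k + 0.01 * j)
        t.append(treated)
        y.append(10.0 * k + 2.0 * treated)

naive = mean(yi for yi, ti in zip(y, t) if ti == 1) - mean(
    yi for yi, ti in zip(y, t) if ti == 0
)
estimate = stratified_effect(ps, t, y)
print(f"naive: {naive:.1f}, stratified: {estimate:.1f}")
```

In this constructed example the naive contrast is inflated by severity while the stratified estimate recovers the built-in effect. With real data some residual confounding remains inside each quintile, so within-stratum balance should still be checked.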

Advanced

Case Study/Exercise

Designing a Causal Analysis Framework for a Platform's Feature Rollout

Scenario

A tech company is gradually rolling out a new platform feature. You must design a rigorous causal analysis plan to evaluate its impact on key business metrics, acknowledging observational data constraints.

How to Execute

1) Pre-specify the analysis protocol, defining the treatment, outcome, covariates, and primary estimand (e.g., ATE vs. ATT). 2) Propose a doubly robust estimator (e.g., Augmented IPTW) that combines propensity score weighting with an outcome model for enhanced robustness. 3) Develop a comprehensive diagnostics suite including balance, positivity, and model fit checks. 4) Create a communication plan to translate the statistical findings into actionable business recommendations for product managers and executives.
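The doubly robust estimator proposed in step 2 can be sketched as below. This assumes the propensity scores and the two outcome-model predictions have already been produced by models fitted elsewhere; the function name, argument names, and the eight-unit dataset are hypothetical and chosen so the arithmetic can be checked by hand.

```python
def aipw_ate(y, t, ps, m1, m0):
    """Augmented inverse probability weighted estimate of the ATE.

    ps[i]: estimated P(T=1 | X_i)
    m1[i], m0[i]: outcome-model predictions for unit i under treatment / control
    """
    n = len(y)
    mu1 = sum(m1[i] + t[i] * (y[i] - m1[i]) / ps[i] for i in range(n)) / n
    mu0 = sum(m0[i] + (1 - t[i]) * (y[i] - m0[i]) / (1 - ps[i]) for i in range(n)) / n
    return mu1 - mu0

# Toy population: x = 0 units are treated 1 time in 4, x = 1 units 3 times in 4.
# Outcomes follow y = 1 + 2x + 3t exactly, so the true ATE is 3.
x = [0, 0, 0, 0, 1, 1, 1, 1]
t = [1, 0, 0, 0, 1, 1, 1, 0]
y = [1 + 2 * xi + 3 * ti for xi, ti in zip(x, t)]

true_ps = [0.25 if xi == 0 else 0.75 for xi in x]
wrong_ps = [0.5] * len(x)
true_m1 = [4 + 2 * xi for xi in x]
true_m0 = [1 + 2 * xi for xi in x]
wrong_m = [0.0] * len(x)

print("outcome model right, propensity wrong:", aipw_ate(y, t, wrong_ps, true_m1, true_m0))
print("outcome model wrong, propensity right:", aipw_ate(y, t, true_ps, wrong_m, wrong_m))
print("both wrong:", aipw_ate(y, t, wrong_ps, wrong_m, wrong_m))
```

The first two calls recover the built-in effect while the third does not, which is the double-robustness property in miniature: one correct model is enough, but two wrong models offer no protection.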

Tools & Frameworks

Statistical Software & Libraries

R: `MatchIt`, `twang`, `WeightIt`, `cobalt`

Python: `causalml`, `DoWhy`, `EconML`

Stata: `psmatch2`, `teffects`

Use these for implementation. `MatchIt` (R) is a widely used package for matching, and `cobalt` produces balance tables and plots for matched or weighted samples. `causalml`, `DoWhy`, and `EconML` (Python) provide ML-integrated pipelines for estimation. In Stata, `teffects` is built in, while `psmatch2` is a user-written command that must be installed separately.

Mental Models & Methodologies

Directed Acyclic Graphs (DAGs)

Causal Inference Framework (Potential Outcomes)

Doubly Robust Estimation

DAGs are critical for identifying sufficient adjustment sets to avoid bias. The Potential Outcomes framework defines the target estimand. Doubly robust methods remain consistent if either the propensity model or the outcome model is correctly specified (though not if both are wrong), which provides insurance against misspecification and enhances analytical credibility.

Interview Questions

Answer Strategy

Structure the answer using the causal inference framework: define the estimand (ATT), select covariates based on a DAG, estimate the propensity score (logit or probit), perform matching (with a caliper), and evaluate the result. Emphasize key assumptions: unconfoundedness (no unmeasured confounders), positivity, and SUTVA. For diagnostics, prioritize SMD balance tables, visual inspection of score distributions, and a count of the units dropped for lack of common support.

Answer Strategy

This question tests understanding of practical implementation issues and robustness. The core concern is that extreme weights inflate variance and make estimates unstable; they usually signal near-violations of the positivity assumption or a misspecified propensity model. The professional response is to first investigate the extreme-weight units (are they near-deterministic treatment assignments?), then apply weight truncation or stabilization (e.g., placing the marginal probability of the received treatment in the numerator of the weight). Finally, report results from both the original and the truncated or stabilized analyses as a sensitivity check.
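The weight handling described in this answer can be sketched as follows. The function names and the four-unit example are hypothetical, and the fixed cap is a simplification: truncation is often applied at percentiles of the weight distribution instead. The effective sample size uses the common Kish approximation, (sum of weights) squared over the sum of squared weights.

```python
def iptw_weights(ps, t, stabilize=False, cap=None):
    """Inverse probability of treatment weights for the ATE.

    stabilize: put the marginal probability of the received treatment
               in the numerator instead of 1.
    cap: optional upper bound applied to every weight (truncation).
    """
    p_treated = sum(t) / len(t)
    weights = []
    for e, ti in zip(ps, t):
        prob_received = e if ti == 1 else 1 - e
        numerator = (p_treated if ti == 1 else 1 - p_treated) if stabilize else 1.0
        weights.append(numerator / prob_received)
    if cap is not None:
        weights = [min(w, cap) for w in weights]
    return weights

def effective_sample_size(weights):
    """Kish approximation: equals n when all weights are equal."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

# Hypothetical scores: the third unit was treated despite a 2% propensity.
ps = [0.5, 0.5, 0.02, 0.5]
t = [1, 0, 1, 0]

raw = iptw_weights(ps, t)
stabilized = iptw_weights(ps, t, stabilize=True)
capped = iptw_weights(ps, t, cap=10.0)

for name, w in [("raw", raw), ("stabilized", stabilized), ("capped", capped)]:
    print(name, [round(v, 2) for v in w], "ESS:", round(effective_sample_size(w), 2))
```

Stabilization rescales the weights but does not remove the dominance of the single unusual unit, so the effective sample size stays low; only truncation (or rethinking the target population) changes that. This is why the answer above recommends inspecting the extreme units first and reporting more than one analysis.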