Why Causality Matters: A Primer for Medical Research (Causal Inference)
- Mayta 
- Jun 19
1 Why Causality Matters
- Correlation → two variables move together. 
- Causation → changing exposure X would change outcome Y for the same individual at that moment (the counterfactual). 
- Clinical and public-health questions need causal answers because we intend to act on the exposure (administer a drug, remove a toxin, redesign a protocol). 
2 The Three-Step Causal-Inference Workflow
| Step | Key task | Typical tools | 
| 1. Define the estimand | Spell out what effect and for whom. | ATE, ATT, ATU, CACE, etc. | 
| 2. State assumptions | List the conditions that let data identify the estimand. | Exchangeability, Positivity, Consistency, No-interference (SUTVA), MAR/MCAR | 
| 3. Stress-test assumptions | Check robustness when assumptions wobble. | Sensitivity analyses, E-values, tipping-point plots, negative-control tests | 
3 Estimand ≠ Parameter
| Term | What it means | Why it matters | 
| Estimand | A target causal quantity written in words and math. | Anchors every design and analysis choice. | 
| Parameter | A numeric feature inside one statistical model (for example, a regression coefficient). | Too narrow—ties you to one model. | 
Mnemonic: “Design for the estimand, then fit whatever model achieves it.”
4 Rubin’s Causal Model (Potential-Outcomes Framework)
- For each subject i we imagine two potential outcomes:
  - Yᵢ(1) if exposed / treated
  - Yᵢ(0) if unexposed / control
- Fundamental problem: we see only one of the pair. 
- Individual Treatment Effect (ITE): Yᵢ(1) − Yᵢ(0) (unobservable). 
- Therefore we aim for population averages we can identify. 
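A small simulation makes the fundamental problem concrete. The sketch below is illustrative only: the severity confounder, the risks, and the treatment probabilities are invented numbers, and both potential outcomes are visible solely because the data are simulated.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is reproducible


def simulate(n=20_000):
    """Simulate both potential outcomes per subject (possible only in a simulation)."""
    rows = []
    for _ in range(n):
        severe = random.random() < 0.5                      # hypothetical confounder
        y0 = int(random.random() < (0.40 if severe else 0.10))  # death if untreated
        y1 = int(random.random() < (0.30 if severe else 0.05))  # death if treated
        d = int(random.random() < (0.80 if severe else 0.20))   # sicker patients treated more often
        y = y1 if d == 1 else y0                            # only one potential outcome is observed
        rows.append((severe, d, y0, y1, y))
    return rows


rows = simulate()

# Knowable only because we simulated both potential outcomes
true_ate = mean(y1 - y0 for _, _, y0, y1, _ in rows)

# What a naive comparison of observed outcomes would report
naive = (mean(y for _, d, _, _, y in rows if d == 1)
         - mean(y for _, d, _, _, y in rows if d == 0))

print(f"true ATE: {true_ate:.3f}, naive difference: {naive:.3f}")
```

In this setup the treatment lowers mortality for everyone on average, yet the naive comparison points the other way because sicker patients are treated more often.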
5 Core Treatment Effects
| Effect | Formula (written out) | Interpretation | 
| ATE (Average Treatment Effect) | E[Y(1) − Y(0)] | Mean causal effect in the entire target population | 
| ATT (Average Treatment Effect on the Treated) | E[Y(1) − Y(0) ∣ D = 1] | Mean causal effect only among those who actually received the treatment | 
| ATU (Average Treatment Effect on the Untreated) | E[Y(1) − Y(0) ∣ D = 0] | Mean causal effect only among those who did not receive the treatment | 
Key:
- Y(1) = outcome if exposed / treated
- Y(0) = outcome if unexposed / control
- D = 1 indicates the individual was treated; D = 0 indicates untreated
- E[ ] denotes the average (expected value) across the specified group
The ATE is a weighted average of the other two: ATE = P(D = 1) × ATT + P(D = 0) × ATU.
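To see how the three effects differ, here is a toy calculation. The six subjects and their potential outcomes are made up for illustration; in real data only one potential outcome per subject is ever observed, so these averages have to be identified through assumptions rather than computed directly.

```python
from statistics import mean

# Hypothetical toy data: (D, Y(1), Y(0)) for six subjects
subjects = [
    (1, 1, 0),
    (1, 1, 1),
    (1, 0, 0),
    (0, 1, 0),
    (0, 0, 0),
    (0, 0, 1),
]


def average_effect(rows):
    """Mean of the individual effects Y(1) - Y(0) over the given rows."""
    return mean(y1 - y0 for _, y1, y0 in rows)


ate = average_effect(subjects)                             # everyone
att = average_effect([r for r in subjects if r[0] == 1])   # treated only
atu = average_effect([r for r in subjects if r[0] == 0])   # untreated only

share_treated = mean(d for d, _, _ in subjects)
print(ate, att, atu, share_treated)
```

Here the treated benefit more than the untreated, so ATT, ATU and ATE all differ, while the ATE still equals the share-weighted average of ATT and ATU.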
6 Identification Assumptions in Plain Language
- Exchangeability / Conditional independence – after adjusting for measured confounders L, treated and untreated groups would have had the same prognosis. 
- Positivity (Overlap) – every pattern of confounders L had a non-zero chance to receive each exposure level. 
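- Consistency – “treatment” means one well-defined version for everyone, so each person’s observed outcome equals their potential outcome under the treatment actually received. 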
- No-interference (SUTVA) – one person’s treatment does not affect another’s outcome. 
- Missing at Random (MAR) – needed by many estimators when outcomes or covariates are incomplete; the probability that a value is missing depends only on observed data, not on the missing value itself. 
If any assumption fails, the causal estimate may be biased—hence the need for Step 3 (stress-testing).
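As a worked illustration of how these assumptions connect counterfactuals to data, the mean outcome under treatment can be rewritten step by step, assuming a discrete set of measured confounders L (the summation index l is notation introduced here):

```latex
\begin{align*}
E[Y(1)] &= \sum_{l} E[Y(1) \mid L = l] \, P(L = l)
  && \text{law of total expectation} \\
        &= \sum_{l} E[Y(1) \mid D = 1, L = l] \, P(L = l)
  && \text{conditional exchangeability (defined thanks to positivity)} \\
        &= \sum_{l} E[Y \mid D = 1, L = l] \, P(L = l)
  && \text{consistency}
\end{align*}
```

The last line contains only observable quantities. The same steps with D = 0 give E[Y(0)], and the difference between the two is the ATE.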
Causal Assumptions (Identifiability Conditions)
These four conditions let us turn the counterfactual story {Y(1), Y(0)} into an estimand we can actually calculate. Miss one, and your “causal effect” may simply be bias in disguise.
| Assumption | Plain-English test | Typical violations | How to diagnose / repair | 
| Exchangeability (ignorability / no unmeasured confounding) | After accounting for measured covariates, the treated and untreated groups would have had identical risk had they swapped exposure. | Confounding by indication, self-selection, and physician preference | RCTs: built-in by randomization. Observational: draw a DAG → choose an adjustment set; use propensity scores / instrumental variables. Keep in mind this assumption can never be proven from data alone. | 
| Positivity (Overlap) | Every combination of covariates seen in your data had a real—not zero—chance to receive each exposure level. | Structural zeros (e.g., pregnancy test in men), protocol-mandated treatment, rare subgroups | Plot or tabulate treatment probabilities; trim or weight observations outside the common-support region; redefine target population if necessary. | 
| SUTVA – Stable Unit Treatment Value Assumption (two components, listed in the next two rows) | | | | 
| • No Interference | My treatment does not change your outcome. | Herd immunity, spill-over in wards, social-network effects | Use cluster or network designs; model spill-over explicitly. | 
| • Consistency (Stability) | “Treatment = treatment”: there are no hidden versions that matter. | Dose heterogeneity, vague exposure definitions, evolving surgical techniques | Tighten protocol definitions; analyse distinct versions separately; in registries, verify coding accuracy. | 
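The adjustment and propensity-score repairs in the table can be sketched on a single binary confounder. The registry counts below are hypothetical, and the two helper functions are illustrative names rather than any library's API. With one fully stratified confounder, standardization and inverse-probability weighting (IPW) return the same answer.

```python
# Hypothetical registry counts: stratum -> {arm: (n, deaths)}; arm 1 = treated
counts = {
    "severe": {1: (800, 240), 0: (200, 80)},
    "mild":   {1: (200, 10),  0: (800, 80)},
}


def standardized_risk(counts, arm):
    """Stratum-specific risks in one arm, averaged over the whole population's strata."""
    total = sum(n for stratum in counts.values() for n, _ in stratum.values())
    risk = 0.0
    for stratum in counts.values():
        stratum_n = sum(n for n, _ in stratum.values())
        n_arm, deaths_arm = stratum[arm]
        risk += (deaths_arm / n_arm) * (stratum_n / total)
    return risk


def ipw_risk(counts, arm):
    """Risk in one arm after weighting each subject by 1 / P(D = arm | stratum)."""
    weighted_deaths = weighted_n = 0.0
    for stratum in counts.values():
        stratum_n = sum(n for n, _ in stratum.values())
        n_arm, deaths_arm = stratum[arm]
        prob_arm = n_arm / stratum_n          # estimated probability of this arm
        weighted_deaths += deaths_arm / prob_arm
        weighted_n += n_arm / prob_arm
    return weighted_deaths / weighted_n


crude_rd = (240 + 10) / 1000 - (80 + 80) / 1000
standardized_rd = standardized_risk(counts, 1) - standardized_risk(counts, 0)
ipw_rd = ipw_risk(counts, 1) - ipw_risk(counts, 0)
print(crude_rd, standardized_rd, ipw_rd)
```

The crude risk difference suggests harm, while both adjusted estimates suggest benefit. That reversal is only as trustworthy as the assumption that severity is the sole confounder.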
Why these assumptions are easier in RCTs
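Randomization balances both known and unknown confounders in expectation (exchangeability) and gives every participant a known, non-zero probability of each assignment (positivity); the trial protocol fixes treatment versions (consistency), and individual-level designs usually limit interference, although they cannot rule it out (vaccine trials are the classic exception). Hence, all four conditions are far more plausible “by design” in a well-run trial than in observational data.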
Practical tips for observational work
- Start with a DAG. It forces you to state why exchangeability might fail and how to close back-door paths. 
- Check overlap early. Plot the propensity-score (or exposure-probability) distributions; consider trimming regions where the estimated probability is at or near 0 or 1. 
- Define “treatment” precisely. If your exposure can vary in dose, timing, or formulation, either restrict or stratify. 
- Think about interference. Vaccination, decontamination, and social-behaviour interventions rarely act in isolation. 
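The overlap check can start as a plain tabulation before any propensity model is fitted. This is a minimal sketch: the covariate patterns, the counts, and the `eps` threshold are all hypothetical choices for illustration.

```python
from collections import defaultdict


def treatment_probabilities(records):
    """Return {covariate pattern: share treated} from (pattern, treated) pairs."""
    tallies = defaultdict(lambda: [0, 0])   # pattern -> [n treated, n total]
    for pattern, treated in records:
        tallies[pattern][0] += int(treated)
        tallies[pattern][1] += 1
    return {p: t / n for p, (t, n) in tallies.items()}


def positivity_violations(records, eps=0.0):
    """Patterns whose observed treatment share is within eps of 0 or 1."""
    probs = treatment_probabilities(records)
    return sorted(p for p, pr in probs.items() if pr <= eps or pr >= 1 - eps)


# Hypothetical data: (covariate pattern, treated?) per patient
records = (
    [(("age<65", "no CKD"), 1)] * 30 + [(("age<65", "no CKD"), 0)] * 70
    + [(("age>=65", "no CKD"), 1)] * 10 + [(("age>=65", "no CKD"), 0)] * 40
    + [(("age>=65", "CKD"), 0)] * 25      # never treated, e.g. a contraindication
)

probs = treatment_probabilities(records)
flagged = positivity_violations(records)
print(probs)
print("No overlap in:", flagged)
```

A flagged pattern is a prompt to ask whether the zero is structural (then redefine the target population) or just a small-sample accident (then consider coarser strata or a model).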
7 Examples
Question: Does a loading dose of Drug X reduce 30-day mortality in septic shock?
- Estimand: ATE of Drug X versus no Drug X among adult ICU patients with septic shock. 
- Assumptions & design aids:
  - Exchangeability → randomise, or adjust for APACHE II, infection source, early vasopressor use.
  - Positivity → remove patients with absolute contraindications to Drug X before analysis.
  - Consistency → one loading protocol (2 g IV over 30 min).
  - No-interference → mortality is individual, so this is plausible.
- Design / Analysis paths:
  - A. Pragmatic RCT → the unadjusted intention-to-treat risk difference has a causal interpretation.
  - B. Target-trial emulation in a registry → propensity-score weighting plus E-value for residual confounding.
- Sensitivity check: Vary the strength of an unmeasured confounder until the risk difference crosses zero. 
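For path B, the E-value is quick to compute by hand. The sketch below uses the standard point-estimate formula on the risk-ratio scale, E = RR + sqrt(RR × (RR − 1)), applied to 1/RR when the estimate is protective. Because the estimand in this example is a risk difference, the estimate would first have to be expressed as a risk ratio. The input of 0.70 is a made-up number, not a result for Drug X.

```python
import math


def e_value(rr):
    """E-value for a risk-ratio point estimate.

    Interpreted as the minimum strength of association (risk-ratio scale) an
    unmeasured confounder would need with both treatment and outcome to fully
    explain away the observed estimate.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr          # protective estimates: work with the inverse
    return rr + math.sqrt(rr * (rr - 1))


hypothetical_rr = 0.70       # illustrative only
print(f"E-value for RR = {hypothetical_rr}: {e_value(hypothetical_rr):.2f}")
```

A risk ratio of exactly 1 returns an E-value of 1, meaning no unmeasured confounding is needed to explain a null estimate.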
8 Common Pitfalls & Fixes
| Pitfall | Why it hurts | Quick fix | 
| Declaring an adjusted odds ratio “causal” | Odds ratios are non-collapsible: the covariate-conditional odds ratio differs from the marginal (population-level) one even when there is no confounding | Report risk difference or risk ratio from marginal models (g-formula, IPW) | 
| Poor overlap of propensity scores | Violates positivity | Trim to common support, match, or use overlap weights | 
| Mixing multiple treatment versions | Breaks consistency | Define exposure more tightly or stratify by version | 
| Ignoring interference (e.g., with vaccines) | Violates SUTVA | Use cluster designs or models that allow spill-over | 
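Non-collapsibility, the first pitfall in the table, is easiest to believe after seeing the arithmetic. The sketch assumes a randomized trial with two equally sized strata, so there is no confounding; the risks are hypothetical.

```python
def odds(p):
    return p / (1 - p)


def odds_ratio(p_treated, p_untreated):
    return odds(p_treated) / odds(p_untreated)


# Hypothetical risks: stratum -> (risk if treated, risk if untreated)
strata = {"low risk": (0.5, 0.2), "high risk": (0.8, 0.5)}

conditional_ors = {name: odds_ratio(p1, p0) for name, (p1, p0) in strata.items()}

# Equal stratum sizes, so marginal risks are simple averages
marginal_p1 = sum(p1 for p1, _ in strata.values()) / len(strata)
marginal_p0 = sum(p0 for _, p0 in strata.values()) / len(strata)
marginal_or = odds_ratio(marginal_p1, marginal_p0)

conditional_rds = {name: p1 - p0 for name, (p1, p0) in strata.items()}
marginal_rd = marginal_p1 - marginal_p0

print("conditional ORs:", conditional_ors, "marginal OR:", marginal_or)
print("conditional RDs:", conditional_rds, "marginal RD:", marginal_rd)
```

Both strata share the same odds ratio, yet the marginal odds ratio is smaller, whereas the risk difference is identical within strata and overall. This is why the choice of effect measure belongs to the estimand rather than to the regression model.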
9 Self-Check Quiz
- True / False: If exchangeability holds, positivity may still fail. 
True – positivity must be checked separately; without it, causal effects cannot be identified in regions where treatment probability is 0 or 1.
- List two sensitivity-analysis techniques for unmeasured confounding. 
Examples: E-values, tipping-point analysis, Rosenbaum bounds, negative-control outcomes/exposures, Bayesian bias analysis.
- Write the algebraic expression for ATT of an “antibiotic stewardship consult”. 
ATT = E[ Y(1) − Y(0) | D = 1 ].
- Why do published papers rarely quote ITEs? 
Because an individual’s counterfactual outcome is never observed; only population-level contrasts can be identified and estimated.
10 Key Takeaways
- Start with the estimand, not the model. 
- The potential-outcomes framework gives the language; the four (plus MAR) assumptions give the bridge to data. 
- ATE, ATT, ATU let you target different policy or clinical decisions. 
- Always pair your main causal estimate with a robustness or sensitivity check. 





