Why Causality Matters: A Primer for Medical Research (Causal Inference)

Mayta
Jun 19

1 Why Causality Matters

Correlation → two variables move together.
Causation → changing exposure X would change outcome Y for the same individual at that moment (the counterfactual).
Clinical and public-health questions need causal answers because we intend to act on the exposure (administer a drug, remove a toxin, redesign a protocol).

2 The Three-Step Causal-Inference Workflow

Step	Key task	Typical tools
1. Define the estimand	Spell out what effect and for whom.	ATE, ATT, ATU, CACE, etc.
2. State assumptions	List the conditions that let data identify the estimand.	Exchangeability, Positivity, Consistency, No-interference (SUTVA), MAR/MCAR
3. Stress-test assumptions	Check robustness when assumptions wobble.	Sensitivity analyses, E-values, tipping-point plots, negative-control tests

3 Estimand ≠ Parameter

Term	What it means	Why it matters
Estimand	A target causal quantity written in words and math.	Anchors every design and analysis choice.
Parameter	A numeric feature inside one statistical model (for example, a regression coefficient).	Too narrow—ties you to one model.

Mnemonic: “Design for the estimand, then fit whatever model achieves it.”

4 Rubin’s Causal Model (Potential-Outcomes Framework)

For each subject i we imagine two potential outcomes:
- Yᵢ(1) if exposed / treated
- Yᵢ(0) if unexposed / control
Fundamental problem: we see only one of the pair.
Individual Treatment Effect (ITE): Yᵢ(1) − Yᵢ(0) (unobservable).
Therefore we aim for population averages we can identify.
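The fundamental problem can be made concrete with a small simulation. The sketch below is illustrative only: the outcome scale, the effect size of -1.5, and the 50/50 randomisation are assumptions chosen for the demo, not values from any study. Because the data are simulated, both potential outcomes exist in the code, but the "observed" column reveals just one per subject, as in real data.

```python
import random

def simulate_trial(n=10_000, seed=0):
    """Simulate both potential outcomes per subject, then reveal only one."""
    rng = random.Random(seed)
    subjects = []
    for _ in range(n):
        y0 = rng.gauss(10.0, 2.0)            # Y_i(0): outcome if untreated (assumed scale)
        y1 = y0 + rng.gauss(-1.5, 1.0)       # Y_i(1): outcome if treated (assumed effect)
        d = 1 if rng.random() < 0.5 else 0   # randomised assignment
        y_obs = y1 if d == 1 else y0         # only one potential outcome is ever seen
        subjects.append((y0, y1, d, y_obs))
    return subjects

subjects = simulate_trial()

# Known only because this is a simulation: the mean of the individual effects.
true_ate = sum(y1 - y0 for y0, y1, _, _ in subjects) / len(subjects)

# What an analyst can compute from the observed data alone.
treated = [y for _, _, d, y in subjects if d == 1]
control = [y for _, _, d, y in subjects if d == 0]
diff_in_means = sum(treated) / len(treated) - sum(control) / len(control)

print(f"true ATE (unobservable in practice): {true_ate:.2f}")
print(f"difference in observed means:        {diff_in_means:.2f}")
```

With randomised assignment the two numbers should agree closely, even though no individual Yᵢ(1) − Yᵢ(0) was ever observed.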

5 Core Treatment Effects

Effect	Formula (written out)	Interpretation
ATE (Average Treatment Effect)	E[Y(1) − Y(0)]	Mean causal effect in the entire target population
ATT (Average Treatment Effect on the Treated)	E[Y(1) − Y(0) | D = 1]	Mean causal effect only among those who actually received the treatment
ATU (Average Treatment Effect on the Untreated)	E[Y(1) − Y(0) | D = 0]	Mean causal effect only among those who did not receive the treatment

Key:
- Y(1) = outcome if exposed / treated
- Y(0) = outcome if unexposed / control
- D = 1 indicates the individual was treated; D = 0 indicates untreated
- E[ ] denotes the average (expected value) across the specified group
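To see why the three effects can differ, here is a minimal simulated sketch. Everything in it is hypothetical: a severity score, an individual effect that grows with severity, and a treatment probability that also rises with severity. Under those assumptions the treated are sicker on average, so ATT, ATE, and ATU separate, while the identity ATE = P(D = 1) × ATT + P(D = 0) × ATU still holds.

```python
import random

rng = random.Random(1)
effects_treated, effects_untreated = [], []

for _ in range(20_000):
    severity = rng.random()                        # hypothetical severity score in [0, 1)
    effect = -0.20 * severity                      # Y(1) - Y(0): larger benefit when sicker
    treated = rng.random() < 0.2 + 0.6 * severity  # sicker patients are treated more often
    (effects_treated if treated else effects_untreated).append(effect)

def mean(values):
    return sum(values) / len(values)

all_effects = effects_treated + effects_untreated
ate = mean(all_effects)          # whole population
att = mean(effects_treated)      # those with D = 1
atu = mean(effects_untreated)    # those with D = 0
p_treated = len(effects_treated) / len(all_effects)

print(f"ATE={ate:.3f}  ATT={att:.3f}  ATU={atu:.3f}")
print(f"weighted recombination: {p_treated * att + (1 - p_treated) * atu:.3f}")
```

In real data the individual effects are not available, so each of these averages has to be identified through the assumptions discussed in the next section.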

6 Identification Assumptions in Plain Language

Exchangeability / Conditional independence – after adjusting for measured confounders L, treated and untreated groups would have had the same prognosis.
Positivity (Overlap) – every pattern of confounders L had a non-zero chance to receive each exposure level.
Consistency – “treatment” means one well-defined version for everyone.
No-interference (SUTVA) – one person’s treatment does not affect another’s outcome.
Missing at Random (MAR) – needed whenever outcome or covariate data are incomplete; the chance that a value is missing may depend on observed data but not on the unobserved value itself.

If any assumption fails, the causal estimate may be biased—hence the need for Step 3 (stress-testing).
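As a sketch of how these assumptions connect data to the estimand, the simulation below uses standardisation (the g-formula) with one binary confounder L. All numbers are invented for illustration: L raises both the chance of treatment and the risk of death, and the true risk difference is set to -0.05 in every stratum. If exchangeability holds within levels of L, averaging the stratum-specific risk differences over the distribution of L recovers the ATE, whereas the naive comparison does not.

```python
import random

rng = random.Random(2)
data = []
for _ in range(50_000):
    l = 1 if rng.random() < 0.4 else 0                  # confounder, e.g. severe illness
    d = 1 if rng.random() < (0.7 if l else 0.3) else 0  # sicker patients treated more often
    p_death = 0.10 + 0.30 * l - 0.05 * d                # assumed true effect: -0.05
    y = 1 if rng.random() < p_death else 0
    data.append((l, d, y))

def mean(values):
    return sum(values) / len(values)

# Naive contrast: ignores L, so it mixes the treatment effect with confounding.
naive_rd = (mean([y for _, d, y in data if d == 1])
            - mean([y for _, d, y in data if d == 0]))

# Standardisation: stratum-specific risk differences weighted by P(L = l).
standardised_rd = 0.0
for level in (0, 1):
    stratum = [(d, y) for l, d, y in data if l == level]
    risk_treated = mean([y for d, y in stratum if d == 1])
    risk_untreated = mean([y for d, y in stratum if d == 0])
    standardised_rd += (risk_treated - risk_untreated) * len(stratum) / len(data)

print(f"naive risk difference:        {naive_rd:+.3f}")
print(f"standardised risk difference: {standardised_rd:+.3f}")
```

In this setup the naive estimate makes the treatment look harmful, while the standardised estimate lands near the assumed -0.05. The adjustment works here only because L is the sole confounder and is fully measured.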

Causal Assumptions (Identifiability Conditions)

These four conditions let us turn the counterfactual story {Y(1), Y(0)} into a quantity we can actually calculate from observed data. Miss one, and your “causal effect” may simply be bias in disguise.

Assumption	Plain-English test	Typical violations	How to diagnose / repair
Exchangeability (ignorability / no unmeasured confounding)	After accounting for measured covariates, the treated and untreated groups would have had identical risk had they swapped exposure.	Confounding by indication, self-selection, and physician preference	RCTs: built-in by randomization. Observational: draw a DAG → choose an adjustment set; use propensity scores / instrumental variables. Keep in mind this assumption can never be proven from data alone.
Positivity (Overlap)	Every combination of covariates seen in your data had a real (non-zero) chance to receive each exposure level.	Structural zeros (e.g., pregnancy test in men), protocol-mandated treatment, rare subgroups	Plot or tabulate treatment probabilities; trim observations outside the common-support region or use overlap weights; redefine the target population if necessary.
SUTVA – Stable Unit Treatment Value Assumption
• No Interference	My treatment does not change your outcome.	Herd immunity, spill-over in wards, social-network effects	Use cluster or network designs; model spill-over explicitly.
• Consistency (Stability)	“Treatment = treatment”: there are no hidden versions that matter.	Dose heterogeneity, vague exposure definitions, evolving surgical techniques	Tighten protocol definitions; analyse distinct versions separately; in registries, verify coding accuracy.

Why these assumptions are easier in RCTs

Randomization balances both known and unknown confounders in expectation (exchangeability) and gives each participant a known, non-zero chance of each assignment (positivity); the trial protocol fixes treatment versions (consistency), and individual-level trials usually limit interference. Hence, all four conditions “hold by design”, or are at least far more plausible, in a well-run trial.

Practical tips for observational work

Start with a DAG. It forces you to state why exchangeability might fail and how to close back-door paths.
Check overlap early. Plot the propensity-score (or exposure-probability) distributions; consider trimming regions where the estimated probability is close to 0 or 1.
Define “treatment” precisely. If your exposure can vary in dose, timing, or formulation, either restrict or stratify.
Think about interference. Vaccination, decontamination, and social-behaviour interventions rarely act in isolation.
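For the overlap tip, a crude first check is to tabulate the share treated within each covariate stratum before fitting any model. The sketch below is one way to do that; the function names, the toy records, and the 0.05 cut-off are all assumptions for illustration, not a standard.

```python
from collections import defaultdict

def treatment_share_by_stratum(records):
    """records: iterable of (stratum_label, treated) pairs with treated in {0, 1}."""
    counts = defaultdict(lambda: [0, 0])   # stratum -> [n_untreated, n_treated]
    for stratum, treated in records:
        counts[stratum][treated] += 1
    return {s: c[1] / (c[0] + c[1]) for s, c in counts.items()}

def flag_positivity_problems(shares, eps=0.05):
    """Return strata whose treated share is within eps of 0 or 1 (arbitrary cut-off)."""
    return sorted(s for s, p in shares.items() if p < eps or p > 1 - eps)

# Toy data: every patient in the 'high' severity stratum was treated.
records = (
    [("low", 0)] * 40 + [("low", 1)] * 10
    + [("mid", 0)] * 25 + [("mid", 1)] * 25
    + [("high", 1)] * 30
)

shares = treatment_share_by_stratum(records)
print(shares)                            # {'low': 0.2, 'mid': 0.5, 'high': 1.0}
print(flag_positivity_problems(shares))  # ['high']
```

A flagged stratum means there is no untreated comparison group there, so any effect estimate for that stratum rests on model extrapolation rather than data.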

7 Examples

Question: Does a loading dose of Drug X reduce 30-day mortality in septic shock?

Estimand: ATE of Drug X versus no Drug X among adult ICU patients with septic shock.
Assumptions & design aids:
- Exchangeability → randomise, or adjust for APACHE II, infection source, early vasopressor use.
- Positivity → exclude patients with absolute contraindications to Drug X from the target population before analysis.
- Consistency → one loading protocol (2 g IV over 30 min).
- No-interference → mortality is individual, plausible.
Design / Analysis paths:
- A. Pragmatic RCT → the unadjusted risk difference estimates the ATE.
- B. Target-trial emulation in a registry → propensity-score weighting plus E-value for residual confounding.
Sensitivity check: Vary the strength of an unmeasured confounder until the risk difference crosses zero.
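The sensitivity check above can be sketched with a simple bias formula. The assumption, stated loudly: there is one binary unmeasured confounder U whose effect on mortality (gamma, a risk difference) is the same in both arms, and whose prevalence differs between treated and untreated by delta. Under that additive model the observed risk difference equals the true one plus gamma × delta. The observed value of -0.05 and the delta of -0.25 below are invented numbers, not results for Drug X.

```python
def adjusted_risk_difference(observed_rd, gamma, delta):
    """Bias-corrected risk difference under the simple additive model.

    gamma: effect of U on the outcome risk (assumed equal in both arms)
    delta: P(U = 1 | treated) - P(U = 1 | untreated)
    """
    return observed_rd - gamma * delta

def tipping_point_gamma(observed_rd, delta):
    """Confounder-outcome effect at which the corrected estimate reaches zero."""
    if delta == 0:
        raise ValueError("delta = 0 means U is balanced and cannot bias the estimate")
    return observed_rd / delta

observed_rd = -0.05   # hypothetical observed effect on 30-day mortality
delta = -0.25         # hypothetical: U is 25 points less common among the treated

for gamma in (0.0, 0.1, 0.2, 0.3):
    corrected = adjusted_risk_difference(observed_rd, gamma, delta)
    print(f"gamma={gamma:.1f} -> corrected risk difference {corrected:+.3f}")

print("tipping point gamma:", tipping_point_gamma(observed_rd, delta))
```

Reading the output: if a confounder that imbalanced would need to raise mortality by about 20 percentage points to erase the effect, the analyst can then judge whether such a confounder is clinically plausible.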

8 Common Pitfalls & Fixes

Pitfall	Why it hurts	Quick fix
Declaring an adjusted odds ratio “causal”	Odds ratios are non-collapsible: the covariate-conditional OR differs from the marginal OR even when there is no confounding	Report risk difference or risk ratio from marginal models (g-formula, IPW)
Poor overlap of propensity scores	Violates positivity	Trim to common support, match, or use overlap weights
Mixing multiple treatment versions	Breaks consistency	Define exposure more tightly or stratify by version
Ignoring interference (e.g., with vaccines)	Violates SUTVA	Use cluster designs or models that allow spill-over
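The first pitfall is easy to verify by hand. The sketch below uses made-up risks in two equally sized strata, with treatment assumed to be randomised (so there is no confounding). The odds ratio is 3 within each stratum, yet the marginal odds ratio is smaller, while the risk difference averages cleanly.

```python
def odds(p):
    return p / (1 - p)

def odds_ratio(risk_treated, risk_untreated):
    return odds(risk_treated) / odds(risk_untreated)

# Hypothetical strata of equal size; treatment is independent of stratum.
strata = [
    {"weight": 0.5, "risk_untreated": 0.10, "risk_treated": 0.25},
    {"weight": 0.5, "risk_untreated": 0.50, "risk_treated": 0.75},
]

conditional_ors = [odds_ratio(s["risk_treated"], s["risk_untreated"]) for s in strata]

# Marginal risks: average each arm's risk over the strata.
marginal_treated = sum(s["weight"] * s["risk_treated"] for s in strata)
marginal_untreated = sum(s["weight"] * s["risk_untreated"] for s in strata)

marginal_or = odds_ratio(marginal_treated, marginal_untreated)
marginal_rd = marginal_treated - marginal_untreated
average_conditional_rd = sum(
    s["weight"] * (s["risk_treated"] - s["risk_untreated"]) for s in strata
)

print("conditional ORs:", [round(x, 2) for x in conditional_ors])  # [3.0, 3.0]
print("marginal OR:    ", round(marginal_or, 2))                   # 2.33
print("marginal RD:    ", round(marginal_rd, 2))                   # 0.2
print("avg. conditional RD:", round(average_conditional_rd, 2))    # 0.2
```

Neither odds ratio is wrong; they answer different questions. That is why the estimand (conditional or marginal, and on which scale) should be fixed before the model is chosen.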

9 Self-Check Quiz

True / False: If exchangeability holds, positivity may still fail.

True – positivity must be checked separately; without it, causal effects cannot be identified in regions where treatment probability is 0 or 1.

List two sensitivity-analysis techniques for unmeasured confounding.

Examples: E-values, tipping-point analysis, Rosenbaum bounds, negative-control outcomes/exposures, Bayesian bias analysis.
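Of the techniques listed, the E-value is the quickest to compute. A minimal sketch, assuming the standard formula for a risk ratio, E = RR + sqrt(RR × (RR − 1)), with protective estimates (RR < 1) inverted first. The function name is ours, and the sketch covers only the point estimate, not the confidence limit.

```python
import math

def e_value(risk_ratio):
    """E-value for a point-estimate risk ratio.

    The minimum strength of association, on the risk-ratio scale, that an
    unmeasured confounder would need with both treatment and outcome to
    fully explain away the observed estimate.
    """
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    rr = risk_ratio if risk_ratio >= 1 else 1 / risk_ratio  # invert protective effects
    return rr + math.sqrt(rr * (rr - 1))

print(round(e_value(2.0), 2))   # 3.41
print(round(e_value(0.5), 2))   # 3.41 (same as RR = 2 after inversion)
print(e_value(1.0))             # 1.0: a null estimate needs no confounding to explain
```

A larger E-value means a stronger unmeasured confounder would be required, but whether such a confounder is plausible remains a subject-matter judgment.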

Write the algebraic expression for ATT of an “antibiotic stewardship consult”.

ATT = E[ Y(1) − Y(0) | D = 1 ].

Why do published papers rarely quote ITEs?

Because an individual’s counterfactual outcome is never observed; only population-level contrasts can be identified and estimated.

10 Key Takeaways

Start with the estimand, not the model.
The potential-outcomes framework gives the language; the four (plus MAR) assumptions give the bridge to data.
ATE, ATT, ATU let you target different policy or clinical decisions.
Always pair your main causal estimate with a robustness or sensitivity check.