Causal Inference in Observational Research: Strategies, Assumptions, and Stata Tools
- Mayta
- Jun 19
- 4 min read
1 Why Observational Data Are Harder Than RCTs
| RCT privilege | What you lose in an observational study |
| --- | --- |
| Investigator controls treatment → exchangeability holds by design | Treatment choice depends on prognosis, access, physician preference → systematic confounding |
| Random allocation → everyone has a shot at either arm (positivity) | Some covariate patterns receive only one treatment (structural zeros) |
| Protocol dictates a single treatment version (consistency) | Dose, timing, brand, adherence often vary across sites / time |
| Individual assignment rarely affects others (no-interference/SUTVA) | Spill-over and social or herd effects can occur (e.g., vaccination) |
Because none of these guarantees are automatic, every causal claim from observational data must argue for, and where possible check, the four identifiability assumptions:
Exchangeability (no unmeasured confounding)
Positivity (overlap)
Consistency (well-defined exposure)
No-interference (one person’s treatment doesn’t change another’s outcome)
Reality check: exchangeability can never be proven from data alone; it can only be argued through design (choosing confounders with a DAG) and supported with diagnostics.
2 Core Strategy: “Make treated and untreated look interchangeable”
2·1 Model-Based Regression
| Feature | Pros | Cons / Assumptions |
| --- | --- | --- |
| Adds treatment and measured confounders in one model (linear, log-binomial, logistic, etc.) | Simple; familiar; can incorporate interactions | Requires correct link/linearity; gives conditional odds ratios (must marginalise for causal OR); sensitive to extrapolation outside covariate range |
Marginalising a logistic model (e.g., Stata margins) converts conditional effects to the population-average scale.
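As a minimal sketch of that marginalising step (the variable names y, treat, age, and sex are hypothetical):

```stata
* Hypothetical variables: y (binary outcome), treat (0/1), age, sex
logit y i.treat c.age i.sex

* Average predicted risk under each treatment level,
* standardised over the observed covariate distribution:
margins treat

* Population-average (marginal) risk difference for treatment:
margins, dydx(treat)
```

The logit coefficient for treat is a conditional log-odds ratio; the margins output is on the population-average risk scale that the causal question usually asks about.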
2·2 Standardisation (G-Computation)
1. Stratify on categorical confounders
2. Estimate the treatment effect within each cell
3. Average over the covariate distribution
Benefits: permits different effects in different strata (relaxes “no effect modification”)
Limits: works only with categorical confounders; many strata → sparse data → positivity issues.
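A parametric sketch of the same logic in Stata (y, treat, and stage are hypothetical names; the interaction term is what relaxes “no effect modification”):

```stata
* Hypothetical variables: y (binary outcome), treat (0/1), stage (categorical confounder)
* The interaction lets the treatment effect differ across strata:
logit y i.treat##i.stage

* Standardised risks: predict under each treatment level for everyone,
* then average over the observed distribution of stage:
margins treat

* Standardised risk difference (the g-computation estimate):
margins, dydx(treat)
```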
2·3 Matching
| Idea | Typical tool | Diagnostic |
| --- | --- | --- |
| For every treated case, find one (or more) untreated with similar covariates → analyse matched pairs/sets | Mahalanobis distance, nearest-neighbour PS matching | Standardised Mean Difference (SMD) < 0.10 for every covariate |
Strengths: intuitive “mini-trial” feel; automatically respects common support. Limits: can discard many observations; quality depends on measured covariates only. (A Stata sketch follows the SMD explainer below.)
| Item | Explanation |
| --- | --- |
| Why “standardised”? | Dividing by the pooled SD removes the units of measurement, so you can compare imbalance across variables with different scales (e.g., age in years vs BMI in kg/m²). |
| Intuitive scale | • 0.00 = perfect balance (groups identical on that covariate) • 0.10 ≈ groups differ by one-tenth of a pooled SD (small) • 0.20 ≈ one-fifth of an SD (medium) • 0.50 ≈ half an SD (large imbalance) |
| Why the 0.10 rule-of-thumb? | Simulation and empirical studies show that an SMD below 0.10 (10% of an SD) typically translates to negligible bias in most treatment-effect estimates. It is strict enough to ensure balance but flexible enough to be achievable in real data. |
| Application | After matching, weighting, or any propensity-score method, compute the SMD for every confounder. → If all SMDs < 0.10, you can reasonably claim the groups are “balanced” on the observed covariates. → If some SMDs are ≥ 0.10, refine the PS model (add interactions, non-linear terms) or tighten the matching caliper, then re-check. |
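Putting matching and the SMD check together in Stata (a sketch; age and bmi are hypothetical covariates, and tebalance is assumed to be run immediately after teffects):

```stata
* Hypothetical variables: y (outcome), treat (0/1), age, bmi
* 1:1 nearest-neighbour matching on the Mahalanobis distance:
teffects nnmatch (y age bmi) (treat), metric(mahalanobis)

* Standardised differences and variance ratios, raw vs matched;
* aim for |SMD| < 0.10 on every covariate:
tebalance summarize
```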
2·4 Propensity-Score (PS) Methods
| Step | Goal / Comment |
| --- | --- |
| 1. Estimate the propensity score (PS) | Model Pr(Treatment = 1 \| X) with logistic regression or a machine-learning algorithm. |
| 2. Check the Region of Common Support (RCS) | Plot the PS distribution by treatment group; drop or trim observations where the two groups do not overlap. |
| 3. Apply the PS • Adjustment: include PS as a covariate in the outcome model. • Stratification: divide data into PS quintiles/deciles and compare within strata. • Matching: pair treated and untreated units with similar PS (e.g., 1:1, caliper). • Inverse Probability Weighting (IPW): weight each subject by 1/PS (treated) or 1/(1 − PS) (untreated). | All four techniques aim to balance the observed covariates; IPW usually retains more data than strict matching. |
| 4. Re-check balance | Use Standardised Mean Differences (SMDs) or Love plots before vs after applying the chosen PS method. If imbalance persists, refine the PS model (add interactions, non-linear terms, etc.). |
PS methods shine when you have many confounders and a moderate-to-large sample.
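A sketch of the full four-step workflow (treat, x1–x5, and y are hypothetical names; note that teffects ipw refits the treatment model internally, so the manually predicted score is only for the overlap check):

```stata
* Hypothetical variables: y (outcome), treat (0/1), confounders x1-x5
* Steps 1-2: estimate the PS and inspect the region of common support
logit treat x1 x2 x3 x4 x5
predict pscore, pr
twoway (kdensity pscore if treat == 1) (kdensity pscore if treat == 0), ///
    legend(order(1 "Treated" 2 "Untreated")) xtitle("Propensity score")

* Step 3: inverse-probability weighting
teffects ipw (y) (treat x1 x2 x3 x4 x5)

* Step 4: re-check balance after weighting
tebalance summarize
```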
3 Method Comparison – Big Picture
| Method | Handles continuous X | Avoids positivity problems | Scales to many X | Outputs marginal effect directly* | Key modelling risk |
| --- | --- | --- | --- | --- | --- |
| Regression / GLM | ✅ | ❌ (depends on extrapolation) | ⚠️ (over-fitting) | ✅ (for RD/RR) / ⚠️ (OR needs marginalising) | link & linearity |
| Standardisation | ❌ (categorical only) | ❌ (many cells) | ❌ | ✅ | cell sparsity |
| Matching | ✅ | ✅ (works inside RCS) | ⚠️ (drops data) | ✅ | poor matches |
| Propensity Score | ✅ | ✅ (trim/weight) | ✅ | ✅ | PS model mis-spec. |
*Risk difference and risk ratio are collapsible; the odds ratio is not (toy example below).
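A toy calculation (hypothetical numbers) makes the footnote concrete. Suppose a binary confounder splits the sample into two equal-sized strata:

| Stratum | Risk (treated) | Risk (untreated) | OR | Risk difference |
| --- | --- | --- | --- | --- |
| 1 | 0.8 | 0.6 | (0.8/0.2)/(0.6/0.4) ≈ 2.67 | 0.20 |
| 2 | 0.4 | 0.2 | (0.4/0.6)/(0.2/0.8) ≈ 2.67 | 0.20 |
| Pooled | 0.6 | 0.4 | (0.6/0.4)/(0.4/0.6) = 2.25 | 0.20 |

The risk difference collapses cleanly (0.20 everywhere), but the pooled OR (2.25) is smaller than the stratum-specific OR (2.67) even with equal-sized strata and no confounding. That is non-collapsibility, not bias, and it is why a conditional OR from a logistic model must be marginalised before being reported as a population-average effect.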
4 From Concept to Code – Stata Recipes
(Use only if you already grasp the logic above.)
| Goal | Core Stata commands |
| --- | --- |
| Outcome model (continuous) | regress y i.treat X …, then margins, dydx(treat) |
| Outcome model (binary, marginal OR) | logit death i.treat X … → margins treat, post → nlcom (worked sketch after this table) |
| Parametric G-computation (standardisation) | glm y i.treat##i.cat1##i.cat2, family(binomial) link(logit) → margins, dydx(treat) (avoid atmeans, which gives an effect at the covariate means rather than the standardised average) |
| Nearest-neighbour matching | teffects nnmatch (y X …) (treat), metric(mahalanobis) → tebalance summarize (pstest belongs to the user-written psmatch2 package and does not run after teffects) |
| Propensity-score IPW | logit treat X … → predict pscore, pr → visual overlap check → teffects ipw (y) (treat X …) (teffects re-estimates the PS internally; the manual score is only for diagnostics) |
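A worked version of the marginal-OR recipe (a sketch; death, treat, age, and sex are hypothetical names, and the coefficient names fed to nlcom should be confirmed against matrix list e(b)):

```stata
* Hypothetical variables: death (0/1), treat (0/1), age, sex
logit death i.treat c.age i.sex

* Marginal (standardised) risks under each arm; -post- stores them in e(b):
margins treat, post

* Confirm the stored coefficient names before the next step:
matrix list e(b)

* Marginal odds ratio built from the two marginal risks:
nlcom (_b[1.treat] / (1 - _b[1.treat])) / (_b[0bn.treat] / (1 - _b[0bn.treat]))
```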
5 Diagnostic Checklist — Don’t Skip
Draw a DAG to pick your confounders.
Check overlap / positivity visually.
Verify balance (SMD < 0.10) after weighting / matching.
Run a sensitivity analysis (e.g., E-value) for unmeasured confounding (a quick calculation follows this checklist).
Report the causal scale (risk diff / ratio or marginal OR), not just model coefficients.
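For the E-value item, the point-estimate formula is simple enough to compute by hand: for a risk ratio RR > 1, E-value = RR + √(RR × (RR − 1)). A sketch with a hypothetical RR of 1.8:

```stata
* Hypothetical estimate: RR = 1.8
scalar RR = 1.8
display "E-value = " RR + sqrt(RR * (RR - 1))   // 1.8 + sqrt(1.44) = 3.0
```

Read: an unmeasured confounder would need to be associated with both treatment and outcome by risk ratios of at least 3.0 to fully explain away an observed RR of 1.8. Community-contributed Stata implementations of the E-value (including confidence-interval versions) also exist.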
6 Key Takeaways
Observational causal inference is all about designing your analysis to recreate the comparability that randomisation would have given you.
Choose the simplest method that addresses your data’s weaknesses.
Always back claims of exchangeability with diagnostics and domain logic.
Balance first, estimate second, then stress-test your assumptions.