What is Marginalisation in Clinical Statistics?
- Mayta
- Jun 23
Marginalisation is the process of converting effect estimates that are conditional on covariates (such as age or sex) into a population-level average effect: what would happen, on average, across the entire population.
💡 Why Do We Need Marginalisation?
When fitting a standard logistic regression, for example:
logistic outcome treat age sex
This produces a conditional odds ratio (OR) for treat:
It estimates the effect of treatment with age and sex held fixed.
That is: “What is the effect of treatment among people who are the same in terms of age and sex?”
But often, our real-world question is:
“If we give the vaccine to everyone, how much will it reduce risk at the population level?”
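This calls for a marginal effect, such as a marginal OR or a marginal risk difference: an estimate of the average effect of treatment across the whole population. Under the usual causal assumptions (for example, no unmeasured confounding), this corresponds to the Average Treatment Effect (ATE).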
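To make the distinction concrete, the two quantities can be written out as follows. This is a sketch in notation that is not part of the original post: T stands for treat, X for the covariates (age, sex), and expit for the inverse logit.

```latex
% Conditional model (the logistic regression above); T = treat, X = (age, sex)
\operatorname{logit} P(Y = 1 \mid T, X) = \beta_0 + \beta_T T + \beta_X^\top X,
\qquad
\mathrm{OR}_{\text{cond}} = e^{\beta_T}

% Average predicted risk with everyone set to T = t,
% averaging over the n observed covariate vectors x_i
\bar{p}_t = \frac{1}{n} \sum_{i=1}^{n}
  \operatorname{expit}\!\left(\hat\beta_0 + \hat\beta_T t + \hat\beta_X^\top x_i\right),
\qquad t \in \{0, 1\}

% Marginal contrasts built from the two averages
\mathrm{RD}_{\text{marg}} = \bar{p}_1 - \bar{p}_0,
\qquad
\mathrm{OR}_{\text{marg}} = \frac{\bar{p}_1 / (1 - \bar{p}_1)}{\bar{p}_0 / (1 - \bar{p}_0)}
```

Because the odds ratio is non-collapsible, the marginal OR generally differs from the conditional OR even when there is no confounding, and it usually lies closer to 1.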
🧮 How to Perform Marginalisation in Stata
Start with your logistic model that includes covariates, entering treat as a factor variable (i.treat) so that margins can recognise it:
logistic outcome i.treat age sex
This model predicts each person’s probability of the outcome, given their treatment and covariates.
Then use the margins command:
margins treat, predict(pr)
This command:
Sets everyone to treated (treat = 1) and predicts each person's probability of the outcome,
Then sets everyone to untreated (treat = 0) and predicts again,
Then reports the average predicted probability under each scenario.
The difference between the two averages is the marginal risk difference; the ratio of their odds is the marginal OR. margins reports the two averages themselves, so the comparison is a further step.
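The steps above can be sketched end to end as follows. This is a minimal illustration, not a definitive workflow: the data are simulated, so every coefficient and sample size is made up, and the variable names simply mirror the ones used in this post. The label marginal_or is an arbitrary name chosen for the example.

```stata
* Simulated data for illustration only (all values are invented)
clear
set seed 12345
set obs 2000
generate age     = rnormal(50, 10)
generate sex     = runiform() < 0.5
generate treat   = runiform() < 0.5
generate outcome = runiform() < invlogit(-1 - 0.7*treat + 0.04*(age - 50) + 0.5*sex)

* Conditional model; i.treat declares treat as a factor variable,
* which margins needs in order to set it to each level
logistic outcome i.treat age i.sex

* Average predicted probability with everyone set to treat = 0, then treat = 1
margins treat

* Marginal risk difference: contrast of treat = 1 against treat = 0
margins r.treat

* Marginal OR: post the two averages, then take the ratio of their odds
margins treat, post
nlcom (marginal_or: (_b[1.treat] / (1 - _b[1.treat])) / (_b[0bn.treat] / (1 - _b[0bn.treat])))
```

The post option replaces the stored logistic results with the two averaged probabilities, which is why it comes last; nlcom then supplies a delta-method standard error for the ratio of odds.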
📊 Comparison Table: Conditional vs. Marginal OR
Feature | Conditional OR | Marginal OR |
Focused on specific strata (e.g., age = 60, sex = male) | ✅ Yes | ❌ No |
Applies to the full population | ❌ No | ✅ Yes |
Directly output from logistic model | ✅ Yes | ❌ No |
Requires margins command | ❌ No | ✅ Yes |
🎓 Definition Recap
Marginalisation is:
“The process of averaging model-based predictions across all levels of covariates to estimate population-level treatment effects.”
This helps translate regression outputs into clinically and policy-relevant quantities—especially for interpreting treatment effects across a diverse population.
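The definition can also be checked by hand, without margins. The sketch below, again on simulated data with invented values, predicts every person's risk under each treatment level and averages the predictions. The helper names treat_obs, p1, p0, pbar1 and pbar0 are hypothetical, chosen only for this example. It gives point estimates only, with no standard errors.

```stata
* Simulated data for illustration only (all values are invented)
clear
set seed 12345
set obs 2000
generate age     = rnormal(50, 10)
generate sex     = runiform() < 0.5
generate treat   = runiform() < 0.5
generate outcome = runiform() < invlogit(-1 - 0.7*treat + 0.04*(age - 50) + 0.5*sex)

logistic outcome i.treat age i.sex

* Keep the observed treatment, then predict with everyone treated and untreated
generate treat_obs = treat
replace treat = 1
predict p1, pr
replace treat = 0
predict p0, pr
replace treat = treat_obs

* Average the predictions over the observed covariate distribution
summarize p1, meanonly
scalar pbar1 = r(mean)
summarize p0, meanonly
scalar pbar0 = r(mean)

display "Marginal risk difference: " pbar1 - pbar0
display "Marginal OR: " (pbar1 / (1 - pbar1)) / (pbar0 / (1 - pbar0))
```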