Risk Difference vs Odds Ratio in Logistic Regression: A Quick Stata User Guide (with Stata Cheats)
- Mayta
- Jun 25
TL;DR: logit shows log-odds, logistic shows odds ratios. Risk ≠ Odds; Risk Difference (RD) ≠ Odds Ratio (OR). Model choice flows from outcome type and clinical question. Use margins for predicted risk and RD; use glm with family(binomial) link(identity) to model RD directly.
1. One Model, Two Faces
Stata command | What it really models | Default output | Typical use-case |
logit | log-odds (β) | β coefficients | building models, plotting |
logistic | same equation as logit | exp(β) (OR) | reporting interpretable effect sizes |
Think of them as identical twins wearing different name-tags.
* Same fit, different dress code
logit outcome exposure covariates // β (log-odds)
logistic outcome exposure covariates // OR
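A quick way to convince yourself the twins really are identical is to fit both and compare. The sketch below is only an illustration: it borrows Stata's bundled auto dataset, so foreign, mpg and weight are stand-ins rather than variables from this guide.

```stata
* Sketch: logit and logistic fit the same model (auto data used for illustration only)
sysuse auto, clear

logit foreign mpg weight
scalar or_logit = exp(_b[mpg])       // OR recovered from the log-odds coefficient

logistic foreign mpg weight
scalar or_logistic = exp(_b[mpg])    // logistic also stores coefficients on the log-odds scale

display "OR from logit: " or_logit "   OR from logistic: " or_logistic

* logit can also print odds ratios directly
logit foreign mpg weight, or
```

Both scalars should agree, because only the display differs between the two commands, not the stored estimates.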
2. Risk, Odds & Log-Odds – The Mental Map
Probability (Risk)  →  Odds = p / (1 − p)  →  Log-Odds = ln(odds)
      0 ⇢ 1                 0 ⇢ ∞               –∞ ⇢ +∞ (straight line)
Risk Difference (RD) = linear gap in probability (easy to grasp).
Odds Ratio (OR) = multiplicative change in odds (lies further from 1 than the risk ratio when the outcome is common, so it can look exaggerated).
Log-Odds = what the regression actually handles; linear in X, so estimation is easy.
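The three scales are just transformations of one another. As a minimal sketch, assuming a risk of 0.20 picked purely for illustration, you can walk along the map and back with display:

```stata
* Sketch: take one assumed risk along the map and back again
scalar risk    = 0.20                  // assumed probability
scalar odds    = risk / (1 - risk)     // 0.25
scalar logodds = ln(odds)              // about -1.386
display "risk = " risk "   odds = " odds "   log-odds = " %6.3f logodds
display "back to risk: " invlogit(logodds)
```

invlogit() is Stata's built-in inverse of the log-odds transform, which is what predict and margins apply behind the scenes to get back to probabilities.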
3. Quick Model Picker

Outcome | Preferred model | Stata verbs |
Binary (0/1) | Logistic regression (OR or RD) | logistic, logit, glm |
Continuous | Linear regression | regress |
Counts | Poisson / NB | poisson, nbreg |
Time-to-event | Survival | stcox, streg |
Rule-of-thumb:
If clinicians ask, “How many times higher?” → OR/RR.
If they care, “How much absolute risk change?” → RD.
4. Getting Risk & Risk Difference from a Logistic Fit
* Fit logistic model (any flavour)
logit died i.exposure age sex bp
* A. Individual predicted risk
predict p_death, pr // probability for each person
* B. Group risks + RD in one shot
margins exposure // mean risk (p) by exposure
margins, dydx(exposure) // RD = risk_exposed - risk_unexposed
* Output: risk_unexp, risk_exp, RD, 95% CI
Direct RD modelling (optional)
glm died i.exposure age sex bp, family(binomial) link(identity) vce(robust)
* β_exposure is the RD (difference in probability)
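Tip: Always add vce(robust) with the identity link. The model-based standard errors rely on the binomial variance being correctly specified, and robust (sandwich) standard errors guard against that assumption failing.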
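To check that the two routes to an RD agree, here is a hedged sketch on a made-up cohort. The counts (50 deaths among 500 unexposed, 100 among 500 exposed) are assumptions for illustration, and the model is left unadjusted so the answer is known in advance: RD = 0.20 - 0.10 = 0.10.

```stata
* Sketch: RD via margins after logit vs. RD from glm with identity link
* Counts below are invented for illustration
clear
input exposure died n
0 1  50
0 0 450
1 1 100
1 0 400
end
expand n                               // one row per person

logit died i.exposure
margins, dydx(exposure) post           // post puts the RD into e(b)
scalar rd_margins = _b[1.exposure]

glm died i.exposure, family(binomial) link(identity) vce(robust)
scalar rd_glm = _b[1.exposure]

display "RD (margins) = " %6.4f rd_margins "   RD (glm identity) = " %6.4f rd_glm
```

With covariates in the model the two estimates will usually be close rather than identical, since margins averages over the covariate distribution while the identity-link model assumes the RD is constant.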
5. Mini-Example
Imagine a cohort where treatment cut 28-day mortality from 12 % to 8 %.
. logit death i.treat age sex
. margins treat
------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       treat |
          0  |      .1200      .0078    15.40   0.000        .1048       .1352
          1  |      .0800      .0061    13.06   0.000        .0681       .0918
------------------------------------------------------------------------------
. margins, dydx(treat)
------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       treat |
   (1 vs 0)  |     -.0400      .0099    -4.04   0.000       -.0594      -.0206
------------------------------------------------------------------------------
RD = –4 percentage points (absolute benefit).
Convert to Number-Needed-to-Treat: NNT ≈ 1/0.04 = 25.
For comparison, logistic would show an OR ≈ 0.64 for the same effect.
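The arithmetic behind those three numbers can be reproduced by hand. This is a small sketch using only the two risks from the example; the scalar names are arbitrary, and the risk ratio is included for comparison even though the example does not report it.

```stata
* Sketch: effect measures implied by risks of 12% (untreated) and 8% (treated)
scalar risk0 = 0.12
scalar risk1 = 0.08
scalar rd    = risk1 - risk0                                     // -0.04
scalar nnt   = 1 / abs(rd)                                       // 25
scalar rr    = risk1 / risk0                                     // about 0.67
scalar oddsr = (risk1 / (1 - risk1)) / (risk0 / (1 - risk0))     // about 0.64
display "RD = " %6.3f rd "   NNT = " %4.1f nnt "   RR = " %5.3f rr "   OR = " %5.3f oddsr
```

Note that the OR (0.64) sits slightly further from 1 than the RR (0.67), even at a baseline risk of only 12%.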
6. When RD Beats OR (and vice-versa)
Scenario | Recommend |
Outcome frequency <10 % | OR or RR fine (they converge) |
Clinicians want absolute impact | RD (NNT pops right out) |
Meta-analysis of diverse settings | Log-OR (statistically handy) |
Policy / public-health framing | RD (also reported as ARR) or Risk Ratio |
Message: Match the metric to the decision.
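To see why the outcome frequency matters in the first row of the table, the sketch below holds the risk ratio fixed at 2/3 and varies the baseline risk. The three baseline risks are assumptions chosen for illustration.

```stata
* Sketch: same risk ratio (2/3) at three assumed baseline risks
foreach risk0 in 0.03 0.12 0.45 {
    local risk1 = `risk0' * 2 / 3
    local rr    = `risk1' / `risk0'
    local or    = (`risk1' / (1 - `risk1')) / (`risk0' / (1 - `risk0'))
    display "baseline risk " %4.2f `risk0' ":  RR = " %5.3f `rr' "   OR = " %5.3f `or'
}
```

At a 3% baseline the OR is about 0.66, nearly the same as the RR of 0.67; at 45% it drops to roughly 0.52, which would badly overstate the benefit if read as a risk ratio.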
7. A Decision Flow (20-second check)
Binary outcome?
└── Yes ──► Is absolute risk change what matters?
├─ Yes: model RD (glm identity or margins)
└─ No : report OR (logistic) or RR (modified Poisson)
Continuous outcome?
└── Use linear regression (regress)
Count outcome?
└── Poisson/NB (poisson, nbreg)
Time-to-event?
└── Cox / parametric survival
8. Common Pitfalls (Rapid Fire)
OR ≠ RR when the outcome is common: the OR lies further from 1, so do not read it as "times the risk".
glm, link(identity) needs robust SE, and it may fail to converge when fitted risks approach 0 or 1.
Perfect prediction? Combine sparse categories or use firthlogit (user-written; install it with ssc install firthlogit).
For adjusted risk or RD, always follow with margins – it standardises across covariates.
Report CI and sample size alongside any metric; NNT alone hides precision.
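On the last point, one common way to give an NNT its precision is to invert the confidence limits of the RD. The sketch below does this with the RD and 95% CI from the margins output in the Mini-Example; it only makes sense when the RD interval excludes zero.

```stata
* Sketch: NNT with an interval, from the RD and 95% CI in the Mini-Example
scalar rd     = -0.0400
scalar rd_lb  = -0.0594
scalar rd_ub  = -0.0206
scalar nnt    = 1 / abs(rd)        // 25
scalar nnt_lo = 1 / abs(rd_lb)     // about 16.8
scalar nnt_hi = 1 / abs(rd_ub)     // about 48.5
display "NNT = " %4.1f nnt " (95% CI " %4.1f nnt_lo " to " %4.1f nnt_hi ")"
```

An NNT of 25 that could plausibly be anywhere from about 17 to 49 tells a rather different story from a bare "25".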
9. Copy-Paste Stata Cheat Sheet
* Basic logistic with OR output
logistic death i.treat age sex comorbid // ORs are the default output; no option needed
* Predicted risk by treat group
margins treat
* Risk difference
margins, dydx(treat)
* Direct RD model
glm death i.treat age sex, family(binomial) link(identity) vce(robust)
* Modified Poisson for adjusted RR
glm death i.treat age sex, family(poisson) link(log) vce(robust) eform
*--- 2×2 cohort table: RISK RATIO & RISK DIFFERENCE --------------------------
cs outcome exposure // RR, RD, attributable fractions, each with 95% CI
cs outcome exposure, exact // adds Fisher's exact p-value
*--- 2×2 case-control table: ODDS RATIO --------------------------------------
cc case exposure, woolf // OR with Woolf CI (the default CI is exact)
cc case exposure, exact // OR with exact CI, plus Fisher's exact p-value
*--- Extra bells -------------------------------------------------------------
cs outcome exposure, by(strata) // stratified RR (M-H pooled)
cc case exposure, by(strata) // stratified OR (M-H pooled)
cs (cohort study) estimates the Risk Ratio, Risk Difference, and attributable fractions from a 2×2 table.
cc (case-control) estimates Odds Ratio from a 2×2 table.
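If you only have the four cell counts, the immediate forms csi and cci take numbers instead of variables. The sketch below assumes a cohort of 1,000 per arm with 80 and 120 deaths, which is an invented sample size consistent with the 8% vs 12% Mini-Example.

```stata
* Sketch: immediate forms of cs and cc, fed counts instead of variables
* Assumed cohort: 80 deaths among 1,000 treated, 120 among 1,000 untreated
* Argument order: exposed cases, unexposed cases, exposed noncases, unexposed noncases
csi 80 120 920 880
scalar rd_csi = r(rd)
scalar rr_csi = r(rr)

cci 80 120 920 880
scalar or_cci = r(or)

display "RD = " %6.3f rd_csi "   RR = " %5.3f rr_csi "   OR = " %5.3f or_cci
```

The results line up with the Mini-Example: RD of -0.04, and an OR of about 0.64.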
10. Wrap-Up
Same equation, different lens: choose logit for model development, logistic for quick ORs.
Translate coefficients into risk (with margins) whenever the audience thinks in probabilities.
Risk Difference speaks the language of clinicians and guidelines; get it either via margins ... dydx() or a GLM with identity link.
Let the clinical question drive the statistic, not the other way around.
Happy modelling – and may your odds be ever in your favour (or your risks differ in the right direction).