← All posts

Step-by-Step External Validation and Recalibration in Stata (Step 2 in Debray Framework)

Clinical Epidemiology ResearchUniqcret doctor knowledgesMethodology and Research DesignDiagnosis [Methodology]Prognosis [Methodology]Data Analytics or StatisticsStata [Data Analytics]

Step 0 – Load the external validation cohort

clear
cd "C:\WORK\Location\External validation"

use "validation.dta", clear
summarize

Why? We start with the validation dataset only. Step 2 is about how the original model behaves in a new population – no re-fitting from scratch, just evaluation and recalibration.


Step 1 – Reconstruct the original linear predictor and risk

* 1.1 Generate the linear predictor (log-odds) using original coefficients
gen logodds = -4.4415 ///
    + 0.0084*(age - 74.3651) ///
    + 0.6452*(bt  - 37.1226) ///
    + 0.8591*((map/100)^-2 - 1.0545) ///
    + 2.15*ettyn ///
    - 0.0473*(na  - 138.6417) ///
    + 0.0199*(bun - 18.7486) ///
    - 1.0573*(alb - 3.8174)

* 1.2 Convert log-odds to predicted probability
gen prob = invlogit(logodds)

* 1.3 Quick check of distribution
summarize logodds prob

Why?


Step 2 – Overall calibration (O:E and CITL)

2.1 Observed / Expected (O:E) ratio

* Expected number of events based on model
summ prob
local expected = r(sum)

* Observed number of events
summ death
local observed = r(sum)

display "Observed = "  `observed'
display "Expected = " `expected'
display "O:E = " `observed' / `expected'

Why?

Debray calls this part of “calibration in the large” — is the overall incidence aligned?

2.2 Calibration-in-the-large (CITL) via offset

This is the TRIPOD / Debray / Steyerberg recommended way.

* Fit calibration model with LP as offset
logit death, offset(logodds)

* CITL is the intercept of this model
display "CITL (system) = " _b[_cons]

What this model is:

logit { P ( Yi =1 ) } = α + 1 × logoddsi

Why not just use logit(mean(observed)) − logit(mean(predicted))? Because logit is nonlinear, the logistic regression intercept with an offset is the true maximum-likelihood CITL, accounting for the full distribution of predicted risks rather than just their mean.

You can still show the “manual” approximation for teaching:

* Manual approximation (for illustration; not for reporting)
summ death
local p_obs = r(mean)

summ prob
local p_exp = r(mean)

display "Manual CITL = " logit(`p_obs') - logit(`p_exp')

But for publications: use b[cons] from the offset model.


Step 3 – Visual calibration: calibration plot

pmcalplot prob death, ci legend(off) ///
    title("Calibration of Original _____ Model") ///
    name(cal_orig, replace)

Why?

This matches Debray’s Step 2 “visual assessment”.


Step 4 – Recalibration: intercept only (keep original slopes)

To update the intercept for the validation cohort, we need the LP without the original intercept.

4.1 Remove original intercept from LP

* Original intercept was -4.4415, so remove it
gen logodds_noint = logodds + 4.4415

Now:

logodds_nointi = βj Xji

(no intercept term).

4.2 Fit intercept-only recalibration (offset with no intercept inside LP)

* Recalibrate intercept only: slopes fixed, new intercept
logit death, offset(logodds_noint)

display "Recalibrated intercept (validation) = " _b[_cons]

Model:

logit { P ( Yi =1 ) } = αval + 1 × logodds_nointi

Why remove the intercept?

4.3 Get recalibrated probabilities (intercept-only)

* New linear predictor and probability after intercept-only update
predict lp_intonly, xb ///linear predictor
gen prob_intonly = invlogit(lp_intonly)

summarize lp_intonly prob_intonly death

* Calibration plot for intercept-only recalibrated model
pmcalplot prob_intonly death, ci legend(off) ///
    title("Calibration: Intercept-only Recalibrated _____ Model) ///
    name(cal_intonly, replace)

This step generates new probabilities from the recalibrated model you just adjusted.

Interpretation:


Step 5 – Full recalibration: intercept + slope (calibration slope)

Now we check whether the strength of predictor effects is the same in the new data.

5.1 Fit calibration model with logodds as a predictor

* Full recalibration: estimate both CITL and calibration slope
logit death logodds

display "New intercept (CITL*) = " _b[_cons]
display "Calibration slope = " _b[logodds]

Model:

logit { P ( Yi =1 ) } = α + β × logoddsi

Interpretation of slope β:

This is exactly Debray’s “calibration slope” metric.

5.2 Build fully recalibrated predictions from slope + intercept

* Recalibrated linear predictor using new intercept and slope
gen lp_full = _b[_cons] + _b[logodds] * logodds

* Recalibrated probabilities
gen prob_full = invlogit(lp_full)

summarize lp_full prob_full death

* Calibration plot for fully recalibrated model
pmcalplot prob_full death, ci legend(off) ///
    title("Calibration: Intercept + Slope Recalibrated _____ Model") ///
    name(cal_full, replace)

Why?

logit ( pi new ) = αnew + βnew × LPorig,i

Step 6 – Discrimination: c-statistic / AUROC

Debray Step 2 also requires discrimination assessment (how well the model ranks risk).

6.1 ROC for original model

* Discrimination of original model
roctab death prob

6.2 ROC for recalibrated models (optional)

* Discrimination of fully recalibrated model
roctab death prob_full

* Or use lroc after the logit fit
logit death logodds   // last model used for discrimination
lroc, ///
    title("ROC Curve for _____ Model") ///
    xtitle("1 - Specificity") ///
    ytitle("Sensitivity") ///
    name(roc______ Model, replace)

Why?


Step 7 – How all this maps to Debray Step 2

Putting it together:

  1. Construct original LP & risk (logodds, prob)→ evaluate the original model as is in the validation cohort.
  2. O:E ratio→ quick sense of global calibration: does the model over- or under-predict total events?
  3. CITL via offset (logit death, offset(logodds))→ exact intercept shift needed to match average risk.
  4. Calibration plot (pmcalplot)→ visualize local miscalibration across the risk spectrum.
  5. Intercept-only recalibration→ adjust baseline risk for the new cohort, keep original associations.
  6. Full recalibration with slope (logit death logodds)→ check if associations are weaker/stronger in validation; update both intercept and slope.
  7. C-statistic (roctab / lroc)→ quantify discrimination; interpret alongside LP SD and case-mix from Step 1.

That’s exactly Debray Step 2: quantitative metrics (CITL, slope, c-statistic) plus visual calibration, with clear interpretation of what is off — baseline risk vs predictor effects vs case-mix.


1 Intercept-only recalibration → new model

From the development model:

logoddsi = αdev + j βj Xji

e.g. α'dev ​= −4.4415, and the β’s are your age, BT, MAP, etc.

You then do:

gen logodds_noint = logodds + 4.4415   // remove old intercept
logit death, offset(logodds_noint)
display _b[_cons]

This fits:

logit (pi) = αval + 1 × j βj Xji

👉 New model (intercept-only recalibrated):

αnew = αval × (from _b[_cons])
βjnew = βjold

So the formula to use in this setting is:

logit ( pinew ) = αval + j βjold Xji

In practice you already did the key step:

predict lp_intonly, xb
gen prob_intonly = invlogit(lp_intonly)

✅ For clinical use in this population → you can report same β’s, new intercept (α_val) as the recalibrated model.

2 Intercept + slope recalibration (calibration slope)

Now you do:

logit death logodds
display _b[_cons]
display _b[logodds]

Stata fits the calibration model:

logit ( pi ) = αcal + βcal × logoddsi

Where:

You then create:

gen lp_full  = _b[_cons] + _b[logodds]*logodds
gen prob_full = invlogit(lp_full)

This is the recalibrated logit:

logit ( pinew ) = αcal + βcal × logoddsi

You can use this as-is in practice:

That’s already a valid updated model.

If you want explicit new α and new β for each predictor

Sometimes you want the expanded form in terms of the original X’s:

logit ( pinew ) = αcal + βcal ( αdev + j βj Xji ) = ( αcal + βcal αdev ) αnew + j ( βcal βj ) βjnew Xji

So:

αnew = αcal + βcal αdev
βj,new = βcal × βj

In Stata (conceptually):

* After: logit death logodds
scalar alpha_dev  = -4.4415               // from original model
scalar alpha_cal  = _b[_cons]             // from calibration model
scalar slope_cal  = _b[logodds]           // calibration slope β_cal

scalar alpha_new  = alpha_cal + slope_cal*alpha_dev
scalar beta_age_new = slope_cal*0.0084    // and so on for each β_j

✅ That gives you a new full set of coefficients (α_new's, β_new’s) that you can publish/put in a calculator if you want everything expressed in terms of the original predictors.

3 Is this what we call “model calibration”?

Yes, in the Debray / Steyerberg terminology:

So your summary is essentially correct:

If you tell me your actual α_cal and β_cal, I can write out the exact new coefficients for Age, BT, MAP, etc., using your MAGENTA numbers.