Step 1 of the Debray Framework: Investigating Relatedness in External Validation of Clinical Prediction Models
- Mayta

Introduction
Before evaluating the predictive performance of a clinical prediction model in a new dataset, a critical prerequisite is determining how similar the validation population is to the development population. This first step—Investigating Relatedness—forms the foundation of the Debray 3-Step Framework for external validation. It clarifies what kind of external validity is being assessed: reproducibility or transportability.
Why Relatedness Matters
External validation is not a single concept. Its interpretation depends on how the validation data relate to the original development data.
Reproducibility
Validation is conducted in a population that is highly similar to the model’s development sample.
Case mix, disease prevalence, and predictor distributions are closely aligned.
Aim: Show that model performance is consistent when applied to new but equivalent samples from the same target population.
Transportability
Validation is performed in a population that is different from the development sample.
Differences may include patient demographics, disease spectrum, clinical setting, or diagnostic workup.
Aim: Assess whether the model generalizes to different yet related clinical contexts.
Because real-world validation datasets almost never perfectly match the development population, Debray et al. emphasize viewing relatedness as a continuum, not a binary classification. Understanding where a validation study lies on this continuum prevents misinterpretation—especially when lower performance results simply from population differences rather than model failure.
How Relatedness Is Quantified
Debray et al. propose two complementary quantitative approaches to assess population relatedness:
Approach 1 — Membership Model Analysis
This method evaluates whether individuals can be statistically distinguished as coming from the development or validation dataset.
How it works
A logistic regression model is constructed where:
Outcome = indicator of dataset membership (0 = development, 1 = validation)
Predictors = all variables used in the original prediction model, including the outcome that the model predicted (or all key case-mix variables)
Interpretation
High discrimination (c-statistic close to 1.0): the model can easily distinguish between samples → populations differ substantially.
Low discrimination (c-statistic close to 0.5): the model cannot distinguish them → populations are highly similar.
Why this matters
Membership modeling provides a single summary measure of relatedness and accommodates:
continuous variables
categorical predictors
nonlinearities (if specified)
This approach directly quantifies whether the two populations share the same case-mix structure.
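To make Approach 1 concrete, here is a minimal Python sketch using scikit-learn. The cohorts and predictor names (age, d_dimer) are simulated placeholders for illustration, not Debray et al.'s actual DVT data; in practice you would substitute your real development and validation datasets.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)

# Simulated stand-ins for the two cohorts, restricted to the original
# model's predictors (the model's own outcome could be added as well).
predictors = ["age", "d_dimer"]  # hypothetical predictor names
dev = pd.DataFrame({"age": rng.normal(60, 10, 500),
                    "d_dimer": rng.normal(1.0, 0.3, 500)})
val = pd.DataFrame({"age": rng.normal(65, 12, 500),   # shifted case mix
                    "d_dimer": rng.normal(1.2, 0.4, 500)})

# Membership outcome: 0 = development, 1 = validation
dev["source"], val["source"] = 0, 1
combined = pd.concat([dev, val], ignore_index=True)

# Fit the membership model
membership = LogisticRegression(max_iter=1000)
membership.fit(combined[predictors], combined["source"])

# c-statistic near 0.5 -> populations indistinguishable (reproducibility);
# c-statistic near 1.0 -> populations clearly differ (transportability)
prob = membership.predict_proba(combined[predictors])[:, 1]
print(f"Membership c-statistic: {roc_auc_score(combined['source'], prob):.2f}")
```

Note that scoring the membership model on the same data used to fit it is slightly optimistic; with real datasets, cross-validating the c-statistic gives a fairer estimate.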
Approach 2 — Comparing Linear Predictor (LP) Distributions
The second method examines differences in the distribution of the Linear Predictor (LP)—the weighted sum of predictor values used in the original model.
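Concretely, for an original model with intercept β0 and predictor coefficients β1, …, βp, each patient i has

$$\mathrm{LP}_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip}$$

and, for a logistic regression model, the predicted risk is 1 / (1 + e^(−LPᵢ)).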
Interpretation Dimensions
1. LP Mean — Baseline Risk
LP mean = average risk profile in the population
Differences in mean LP reflect differences in:
baseline risk
disease prevalence
average severity or comorbidity burden
2. LP Standard Deviation — Case-Mix Heterogeneity
LP SD = spread of risk profiles
A wider LP SD indicates:
greater diversity of patient characteristics
broader spectrum of disease severity
greater potential for discrimination (higher c-statistic)
A narrower LP SD indicates a homogeneous population where discrimination may naturally decline.
Why this approach is powerful
The LP summarizes all predictor information into a single metric, allowing:
simple visual comparison using density curves
direct quantification of how the populations differ
linkage to expected performance (e.g., discrimination depends largely on LP SD)
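As a minimal illustration, the sketch below reuses the same simulated case mix as the membership-model example; the coefficients are made-up placeholders standing in for the original model's published weights.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative coefficients (intercept, then one beta per predictor);
# a real analysis would use the original model's published weights.
intercept = -2.0
betas = np.array([0.03, 1.5])

# Simulated predictor matrices (columns: age, d_dimer) for each cohort
X_dev = np.column_stack([rng.normal(60, 10, 500), rng.normal(1.0, 0.3, 500)])
X_val = np.column_stack([rng.normal(65, 12, 500), rng.normal(1.2, 0.4, 500)])

# LP_i = b0 + b1*x_i1 + ... + bp*x_ip
lp_dev = intercept + X_dev @ betas
lp_val = intercept + X_val @ betas

# Mean LP gauges baseline risk; LP SD gauges case-mix heterogeneity
print(f"Development: mean LP {lp_dev.mean():.2f}, SD {lp_dev.std():.2f}")
print(f"Validation:  mean LP {lp_val.mean():.2f}, SD {lp_val.std():.2f}")
```

Overlaying density curves of lp_dev and lp_val (e.g., with seaborn.kdeplot) gives the visual comparison described above; a validation cohort with a shifted mean and wider SD is exactly the pattern that points toward transportability rather than reproducibility.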
Empirical Example from Debray et al.
In their DVT (deep venous thrombosis) study, Debray and colleagues applied both approaches across four validation datasets:
Validation Study 1
LP means nearly identical
LP SDs nearly identical
Membership model c-statistic ≈ 0.5 → populations highly similar → assessing reproducibility
Validation Studies 2 and 3
LP distributions shifted and widened
Membership model showed clear separability → populations clearly different → assessing transportability
Interpretation
Differences in performance across these datasets were not simply "model failure."
They reflected genuine population differences, which is essential context for correct interpretation and for deciding whether to update the model.
Why Step 1 Must Come First
Evaluating a model’s calibration or discrimination without understanding population relatedness can lead to:
false assumptions about model generalizability
unnecessary or incorrect model updating
misleading clinical implementation decisions
Debray’s Step 1 ensures that performance metrics in Step 2 are interpreted in context, not in isolation.
Summary
The Debray framework transforms external validation into a structured diagnostic process. Step 1—Investigating Relatedness—is foundational and provides:
clarity on whether a study assesses reproducibility or transportability
quantitative evidence using membership models and LP distribution comparisons
essential context for correctly interpreting calibration and discrimination
guidance for deciding if model updating is appropriate





