
Linear Predictor (LP): Foundation of Debray’s Relatedness Assessment in External Validation

  • Writer: Mayta
  • 7 days ago
  • 2 min read

Updated: 6 days ago

Introduction

Where It Comes From and How Its Distribution Is Formed

The Linear Predictor (LP) is a central component in clinical prediction modeling and a key element of the Debray Step 1 approach to assessing relatedness between development and validation populations.

This article shows:

  1. Where LP values come from

  2. How LP is computed

  3. How LP becomes a distribution

  4. Why Debray uses LP to evaluate relatedness

  5. Four step-by-step images demonstrating LP creation

1 What Is the LP (Linear Predictor)?

The LP is the raw score produced by a regression model before it is converted to a probability.

For a logistic regression model:

\[\text{LP} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p\]

Where:

  • β0 = intercept

  • βi = coefficient of predictor i

  • Xi = value of predictor i for a specific patient

LP is the foundation of the model’s structure.
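
As a minimal sketch of this definition, the function below computes an LP as the intercept plus the sum of coefficient-times-value terms. The function name `linear_predictor` and the dictionary layout are illustrative assumptions; the coefficients are those of the example diabetes model used later in this article.

```python
def linear_predictor(intercept, coefficients, patient):
    """Return LP = intercept + sum of (coefficient * predictor value).

    `coefficients` and `patient` are dicts keyed by predictor name
    (a hypothetical layout chosen for this sketch).
    """
    return intercept + sum(
        beta * patient[name] for name, beta in coefficients.items()
    )


# Coefficients of the article's example model (Age in years, Obese as 0/1).
coefs = {"age": 0.04, "obese": 1.5}

lp_a = linear_predictor(-6.0, coefs, {"age": 60, "obese": 1})
lp_b = linear_predictor(-6.0, coefs, {"age": 30, "obese": 0})
print(round(lp_a, 2), round(lp_b, 2))  # prints: -2.1 -4.8
```

Nothing about the outcome enters the calculation: the LP needs only the fitted coefficients and one patient's predictor values.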

2 LP Comes Directly From the Model Equation

LP values do not come from a histogram. They do not come from probability. They do not come from bins.

They come directly from:

  • Model coefficients

  • Patient predictor values

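💡 LP exists BEFORE probability. The predicted probability is:

\[p = \frac{1}{1 + e^{-\text{LP}}}\]

But Debray’s method uses LP itself, because LP depends only on the model coefficients and the patients’ predictor values, not on the outcome prevalence observed in a dataset.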
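
The relationship between LP and probability can be sketched in a few lines. This is an illustrative helper rather than part of Debray's method; the names `inv_logit` and `logit` are assumptions made for the example.

```python
import math


def inv_logit(lp):
    """Convert a linear predictor (log-odds) to a probability."""
    # Two algebraically equal forms, chosen to avoid overflow in exp().
    if lp >= 0:
        return 1.0 / (1.0 + math.exp(-lp))
    z = math.exp(lp)
    return z / (1.0 + z)


def logit(p):
    """Inverse of inv_logit: recover the LP from a probability."""
    return math.log(p / (1.0 - p))


for lp in (-4.0, -2.0, 0.0, 2.0):
    print(lp, round(inv_logit(lp), 3))
```

Because the transformation is monotonic and invertible, no information is lost by working on the LP scale, and an LP of 0 always corresponds to a probability of 0.5.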

3 Concrete Example (Simple Diabetes Model)

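Imagine this model:

\[\text{LP} = -6 + 0.04 \cdot \text{Age} + 1.5 \cdot \text{Obese}\]

Example Patients:

Patient A: Age = 60, Obese = 1

\[\text{LP}_A = -6 + 0.04 \cdot 60 + 1.5 \cdot 1 = -2.1\]

Patient B: Age = 30, Obese = 0

\[\text{LP}_B = -6 + 0.04 \cdot 30 + 1.5 \cdot 0 = -4.8\]

These values (–2.1, –4.8…) are LP values.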
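
To connect these two LP values back to risk, the logistic transformation can be applied to each one. This worked step is an added illustration, and the probabilities are rounded:

\[p_A = \frac{1}{1 + e^{-(-2.1)}} = \frac{1}{1 + e^{2.1}} \approx 0.109\]

\[p_B = \frac{1}{1 + e^{-(-4.8)}} = \frac{1}{1 + e^{4.8}} \approx 0.008\]

So Patient A's predicted risk is roughly 11% and Patient B's is under 1%, although the relatedness assessment itself works with the LP values rather than these probabilities.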

4 How LP Becomes a Distribution (Debray Approach 2)

Once you compute LP for all patients, you:

  1. Collect the LP values

  2. Visualize them in a histogram

  3. Convert the histogram into a density curve (LP distribution)

  4. Compare LP mean & LP SD between development and validation datasets

Differences in:

  • LP Mean → baseline risk difference

  • LP SD → case-mix heterogeneity difference

These determine relatedness.
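
A minimal sketch of the comparison in step 4, using only the Python standard library. The two cohorts are hypothetical numbers invented for illustration, and reporting a mean difference and an SD ratio is one simple choice of summary, not a rule taken from Debray's paper.

```python
import statistics


def lp_summary(lp_values):
    """Return (mean, sample SD) of a list of LP values."""
    return statistics.mean(lp_values), statistics.stdev(lp_values)


def compare_lp(lp_dev, lp_val):
    """Compare LP mean and SD between development and validation cohorts."""
    mean_dev, sd_dev = lp_summary(lp_dev)
    mean_val, sd_val = lp_summary(lp_val)
    return {
        "mean_difference": mean_val - mean_dev,  # shift in the center
        "sd_ratio": sd_val / sd_dev,             # relative spread
    }


# Hypothetical LP values, invented purely for illustration.
lp_dev = [-4.8, -4.1, -3.7, -3.3, -2.9, -2.1]
lp_val = [-4.0, -3.6, -3.2, -3.0, -2.6, -2.2]

result = compare_lp(lp_dev, lp_val)
print(result)
```

In this made-up example the validation cohort has a higher mean LP and a smaller SD, that is, higher predicted risk on average and a narrower case mix than the development cohort.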

Step-by-Step Visual Explanation

The four steps below show how the LP distribution is built.

🔵 Step 1 — LP Value for Each Patient

Model coefficients + patient predictor values → One LP per patient

You see one dot per patient.


🟠 Step 2 — First 20 LP Values (Example Raw Values)

These are computed directly from:

\[\text{LP} = -6 + 0.04 \cdot \text{Age} + 1.5 \cdot \text{Obese}\]

Example printed values (first 20 patients):

[-3.72, -1.98, -4.72, -4.88, -3.34, -3.26, -3.5 , -3.1 , -5.2 ,
 -3.06, -3.66, -4.72, -4.92, -3.4 , -4.96, -4.2 , -3.2 , -2.9 ,
 -3.72, -4.48]

These are the exact LP values, before any binning or smoothing.

🔶 Step 3 — Histogram of LP Values

Groups the LP values into bins (ranges).

This shows how many patients fall into each LP range.
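
The binning can be sketched without any plotting library. The block below counts the 20 example LP values from Step 2 into equal-width bins; the function name `histogram` and the bin edges are assumptions made for this illustration, not the settings behind the original figure.

```python
import math

# The 20 example LP values printed in Step 2 of this article.
lp_values = [
    -3.72, -1.98, -4.72, -4.88, -3.34, -3.26, -3.5, -3.1, -5.2, -3.06,
    -3.66, -4.72, -4.92, -3.4, -4.96, -4.2, -3.2, -2.9, -3.72, -4.48,
]


def histogram(values, start, width, n_bins):
    """Count values falling in n_bins equal-width bins beginning at `start`."""
    counts = [0] * n_bins
    for v in values:
        k = math.floor((v - start) / width)
        if 0 <= k < n_bins:  # values outside the covered range are ignored
            counts[k] += 1
    return counts


# Bin edges at -6, -5, ..., -1 are an arbitrary choice for this sketch.
counts = histogram(lp_values, start=-6.0, width=1.0, n_bins=5)
for k, c in enumerate(counts):
    lo = -6.0 + k
    print(f"[{lo:5.1f}, {lo + 1:5.1f})  {'#' * c}  {c}")
```

With these edits most of the 20 patients land between -5 and -3, which is the bulk that a plotted histogram of the same values would show.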


🟨 Step 4 — Smooth LP Distribution Curve

This is the LP distribution used in Debray’s Approach 2.


This representation makes it easy to compare:

  • LP Mean (center)

  • LP SD (spread)

Between development & validation datasets.

These are the two core measures of relatedness in Debray’s framework.
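
One common way to obtain a smooth curve like the one in Step 4 is a Gaussian kernel density estimate. The sketch below is a swapped-in illustration, not the code behind the original figure: the helper name `kde_gaussian` is hypothetical, and the bandwidth uses the normal-reference rule of thumb (1.06 × SD × n^(-1/5)), which is only one possible choice.

```python
import math
import statistics

# The 20 example LP values printed in Step 2 of this article.
lp_values = [
    -3.72, -1.98, -4.72, -4.88, -3.34, -3.26, -3.5, -3.1, -5.2, -3.06,
    -3.66, -4.72, -4.92, -3.4, -4.96, -4.2, -3.2, -2.9, -3.72, -4.48,
]


def kde_gaussian(values, bandwidth):
    """Return a function x -> estimated density at x (Gaussian kernels)."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density


lp_mean = statistics.mean(lp_values)
lp_sd = statistics.stdev(lp_values)
bandwidth = 1.06 * lp_sd * len(lp_values) ** (-1 / 5)  # assumed rule of thumb
density = kde_gaussian(lp_values, bandwidth)

# Evaluate the curve on an evenly spaced grid wide enough to cover both tails.
lo = min(lp_values) - 5 * bandwidth
hi = max(lp_values) + 5 * bandwidth
n_points = 400
step = (hi - lo) / (n_points - 1)
grid = [lo + i * step for i in range(n_points)]
curve = [density(x) for x in grid]

# Trapezoidal area under the curve; a proper density integrates to about 1.
area = step * (sum(curve) - 0.5 * (curve[0] + curve[-1]))
print(round(lp_mean, 3), round(lp_sd, 3), round(area, 3))
```

Running the same steps on a validation cohort and overlaying the two curves gives a visual version of the mean and SD comparison described above.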

Summary

LP values come from:

  • The model formula (coefficients)

  • The patient data (predictor values)

LP distribution comes from:

  • Collecting all LP values

  • Plotting their histogram

  • Converting into a density curve

Debray Step 1 uses LP because:

  • LP reflects the model’s linear structure

  • LP Mean shows baseline risk differences

  • LP SD shows case-mix heterogeneity

  • Comparing LP distributions reveals whether the two populations are similar (so the validation tests reproducibility) or different (so it tests transportability)

