Classic MAPE: Mean Absolute Prediction Error and Bootstrap Internal Validation
- Mayta

Apparent Performance and Internal Validation Using Bootstrap

1. Introduction
In clinical prediction models, performance is commonly evaluated using metrics such as:
AUROC → discrimination (ranking ability)
Calibration slope & intercept → agreement between predicted and observed risk
Brier score → overall accuracy
However, an additional intuitive metric is:
Mean Absolute Prediction Error (MAPE)
This metric directly quantifies how far predicted probabilities are from actual outcomes.
2. Definition of MAPE (classic form)
Here MAPE means the mean absolute error between predicted probabilities and observed binary outcomes, on the probability scale (0–1), not the textbook "percentage error" formula often also called MAPE in other fields.
MAPE = (1/n) · Σᵢ | p̂ᵢ − yᵢ |   (sum over i = 1 … n)
Where:
p̂ᵢ = predicted probability for patient i (0–1)
yᵢ = observed outcome for patient i (0 or 1)
n = number of patients
👉 This is mathematically equivalent to Mean Absolute Error (MAE) applied to predicted probabilities.
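The definition above is short enough to write directly as code. A minimal sketch in Python with NumPy (the function name `mape` is our own label, not a standard library function):

```python
import numpy as np

def mape(p_hat, y):
    """Mean absolute prediction error on the probability scale:
    the average of |p_hat_i - y_i| across all patients."""
    p_hat = np.asarray(p_hat, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.mean(np.abs(p_hat - y)))

# Two patients: predicted 0.65 with outcome 1, predicted 0.20 with outcome 0
print(round(mape([0.65, 0.20], [1, 0]), 3))  # 0.275
```

Because the outcomes are 0/1 and the predictions live in (0, 1), the result is exactly the MAE of the predicted probabilities, as noted above.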

3. Interpretation of MAPE
MAPE represents:
Average absolute difference between predicted risk and actual outcome
Example:
Patient 1: predicted = 0.65, outcome = 1 → error = 0.35
Patient 2: predicted = 0.20, outcome = 0 → error = 0.20
If the average across all patients is 0.25 → on average, predictions are 0.25 away from the observed outcome (25 percentage points on the probability scale)
4. Apparent MAPE
Definition
Apparent MAPE is calculated using:
The final model
Evaluated on the same dataset used for model development
Key Insight (Important)
Even on training data:
MAPE ≠ 0
Why?
Logistic regression predicts probabilities, not exact outcomes
Patients with similar predictors may have different outcomes
The model estimates average risk, not individual truth
👉 Therefore, error always exists, even in training data
Interpretation
Apparent MAPE is optimistically low
Because the model is evaluated on data it has already “seen”
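The point that MAPE stays well above zero even in-sample can be illustrated with simulated data. A sketch assuming scikit-learn; the cohort below is synthetic, not real clinical data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical cohort: one predictor, outcome drawn from a true logistic model
n = 500
x = rng.normal(size=(n, 1))
p_true = 1 / (1 + np.exp(-(0.5 + 1.2 * x[:, 0])))
y = rng.binomial(1, p_true)

model = LogisticRegression().fit(x, y)
p_hat = model.predict_proba(x)[:, 1]        # predictions on the SAME data

# Apparent (in-sample) MAPE: still far from 0, because the model
# outputs average risks while each observed outcome is exactly 0 or 1
apparent_mape = float(np.mean(np.abs(p_hat - y)))
print(f"apparent MAPE = {apparent_mape:.3f}")
```

Even a perfectly specified model would show a similar error here: for a patient with true risk p, the expected absolute error is 2·p·(1 − p), which is zero only when risk is exactly 0 or 1.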

5. Internal Validation Using Bootstrap
To correct for optimism, bootstrap resampling is used.
Algorithm (e.g., 500 iterations)
For each bootstrap iteration (b):
Step 1 – Resample
Draw a bootstrap sample (with replacement) from original data
Step 2 – Fit model
Fit logistic regression on bootstrap sample
Step 3 – Apparent performance (app_b)
Predict on the bootstrap sample
Compute:
app_b = (1/n) · Σᵢ | p̂ᵢ⁽ᵇ⁾ − yᵢ⁽ᵇ⁾ |   (over the bootstrap sample)
(This is optimistic)
Step 4 – Test performance (test_b)
Use the same bootstrap model
Predict on the original dataset
Compute:
test_b = (1/n) · Σᵢ | p̂ᵢ⁽ᵇ⁾ − yᵢ |   (over the original data)
(This is more realistic)
Step 5 – Optimism
optimism_b = app_b − test_b
Because training error is lower (app_b < test_b):
👉 optimism is typically negative
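The five steps above, together with the aggregation described in the next section, can be sketched end to end. This assumes scikit-learn and a simulated one-predictor cohort; all names are illustrative:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def mape(p_hat, y):
    return float(np.mean(np.abs(p_hat - y)))

rng = np.random.default_rng(1)

# Hypothetical development data
n = 300
x = rng.normal(size=(n, 1))
y = rng.binomial(1, 1 / (1 + np.exp(-(0.4 + x[:, 0]))))

# Apparent MAPE: final model evaluated on its own development data
final = LogisticRegression().fit(x, y)
apparent = mape(final.predict_proba(x)[:, 1], y)

B = 500
optimisms = []
for b in range(B):
    idx = rng.integers(0, n, n)                          # Step 1: resample with replacement
    m = LogisticRegression().fit(x[idx], y[idx])         # Step 2: fit on bootstrap sample
    app_b = mape(m.predict_proba(x[idx])[:, 1], y[idx])  # Step 3: apparent on bootstrap sample
    test_b = mape(m.predict_proba(x)[:, 1], y)           # Step 4: test on original data
    optimisms.append(app_b - test_b)                     # Step 5: typically negative

# Subtracting a negative optimism pushes the corrected MAPE above the apparent one
corrected = apparent - float(np.mean(optimisms))
print(f"apparent = {apparent:.3f}, corrected = {corrected:.3f}")
```

With only one predictor the model barely overfits, so the correction here is small; with many candidate predictors and a small sample, the gap between apparent and corrected MAPE grows.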
6. Optimism-Corrected MAPE
After all bootstrap iterations:
optimism = (1/B) · Σ optimism_b   (average over the B iterations)
Then:
MAPE_corrected = MAPE_apparent − optimism
Key Property
Since:
apparent MAPE < test MAPE
optimism < 0
Then:
Corrected MAPE > Apparent MAPE
Interpretation
Corrected MAPE represents:
Expected prediction error in new patients from the same population

7. Comparison with AUROC
Important Insight
A model can have:
High AUROC → good ranking
High MAPE → poor probability accuracy
👉 Therefore, MAPE adds complementary information
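This dissociation is easy to demonstrate: applying a monotone distortion to well-calibrated predictions leaves the ranking (and hence AUROC) unchanged while probability accuracy collapses. A simulated sketch, assuming scikit-learn's `roc_auc_score`:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)

# Hypothetical well-calibrated predictions: outcomes drawn from the stated risks
n = 2000
p = rng.uniform(0.05, 0.95, n)
y = rng.binomial(1, p)

# Monotone distortion: order of patients is preserved, probabilities are badly off
p_shrunk = p / 3

print("AUROC calibrated:", round(roc_auc_score(y, p), 3))
print("AUROC distorted :", round(roc_auc_score(y, p_shrunk), 3))  # identical ranking
print("MAPE  calibrated:", round(float(np.mean(np.abs(p - y))), 3))
print("MAPE  distorted :", round(float(np.mean(np.abs(p_shrunk - y))), 3))
```

The two AUROC values are identical because AUROC depends only on how patients are ordered, while the distorted MAPE is clearly worse, which is exactly the complementary information described above.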

8. Role of MAPE in Clinical Research
MAPE is useful when:
You care about the accuracy of predicted probabilities
You want a simple, interpretable error metric
However, it should be supplementary, not primary.
Recommended reporting:
AUROC
Calibration slope & intercept
Brier score
MAPE (optional but informative)
9. Key Takeaways
MAPE = mean absolute difference between predicted probability and outcome
Apparent MAPE is optimistically low
Bootstrap estimates and corrects this optimism
Corrected MAPE reflects real-world expected error
MAPE complements (but does not replace) AUROC


