Accuracy, Precision, Reliability, and Validity: Clinical Epidemiology and Clinical Statistics Target-Board Explained
- Mayta

- 1 day ago
1. Accuracy & Precision: The Target-Board Metaphor as the Intuitive Foundation
Imagine a classic shooting target. The bullseye = the true value or true construct. Each shot = one measurement.
This metaphor perfectly illustrates how four fundamental measurement concepts differ.
Accuracy
Definition: How close the average of your measurements is to the true value, typically assessed from measurements taken at a single point in time (cross-sectionally).
On the target:
High accuracy → cluster centered around the bullseye.
Low accuracy → cluster systematically shifted away (bias).
Precision
Definition: How close repeated measurements are to one another, typically assessed from measurements taken at a single point in time (cross-sectionally).
On the target:
High precision → tight grouping.
Low precision → scattered shots.
Precision says nothing about correctness—only consistency.
Putting Accuracy + Precision Together
| Accuracy | Precision | Visual Meaning |
| --- | --- | --- |
| High | High | Tight cluster on bullseye |
| High | Low | Scattered but centered |
| Low | High | Tight cluster off-center |
| Low | Low | Scattered and off-center |
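The two axes of this table can also be read numerically. The sketch below is only an illustration with hypothetical readings (the values and the name `error_components` are assumptions, not from the article): it splits the mean squared error of repeated readings into squared bias (the accuracy part) plus variance (the precision part).

```python
from statistics import fmean, pvariance

def error_components(readings, true_value):
    """Return (bias, variance, mse) for repeated readings of a known true value."""
    bias = fmean(readings) - true_value        # systematic offset: accuracy
    variance = pvariance(readings)             # spread around own mean: precision
    mse = fmean([(x - true_value) ** 2 for x in readings])
    return bias, variance, mse

TRUE_VALUE = 100                               # hypothetical bullseye
tight_off_center = [109, 110, 111, 110]        # low accuracy, high precision
scattered_centered = [90, 110, 95, 105]        # high accuracy, low precision

for name, shots in [("tight, off-center", tight_off_center),
                    ("scattered, centered", scattered_centered)]:
    bias, variance, mse = error_components(shots, TRUE_VALUE)
    # Total error splits into the two components: mse == bias**2 + variance
    print(f"{name}: bias={bias:.1f}, variance={variance:.1f}, mse={mse:.1f}")
```

Using the population variance (`pvariance`) is what makes the identity exact here; with the sample variance it would hold only approximately.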

2. Reliability & Validity: Are We Consistent? Are We Measuring the Right Thing?
While accuracy and precision describe the numerical behavior of measurements, reliability and validity describe the quality of the measurement method and whether it captures the intended construct.
Reliability
Definition: How consistently the measurement method produces the same results under the same conditions.
On the target:
Reliable → tight cluster (regardless of location).
Unreliable → scattered shots.
Clinical perspective: If you repeat the test on the same patient in the same state, you should get similar values.
Types include:
Test–retest reliability
Inter-rater reliability
Internal consistency (scales)


Validity
Definition: Whether the measurement truly captures the phenomenon it claims to measure.
On the target:
Valid → shots centered on the correct bullseye.
Invalid → cluster on a wrong target.
Validity fundamentally asks:
“Are we measuring the right thing?”


Reliability vs Validity: Critical Logic
You can be reliable without being valid (tight cluster, wrong bullseye).
You cannot be valid with very poor reliability (if shots are everywhere, you cannot claim they represent the true value).
3. Clinical Research Examples for Each Concept
3.1 Accuracy Example – Blood Pressure Measurement
Reference standard: Arterial line = 130 mmHg
Cuff A (readings: 129, 131, 128, 132)
Mean ≈ 130 → high accuracy (low bias)
Cuff B (readings: 140, 142, 141, 143)
Mean = 141.5 → low accuracy (systematic overestimation of about 11.5 mmHg)
Key metric: Mean difference (bias)
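As a quick check of the arithmetic, the mean difference for each cuff can be computed directly from the readings listed above. This is a minimal sketch; the function name `mean_bias` is an illustrative choice, not a standard API.

```python
from statistics import fmean

REFERENCE = 130                    # arterial line reading (mmHg), from the example
cuff_a = [129, 131, 128, 132]
cuff_b = [140, 142, 141, 143]

def mean_bias(readings, reference):
    """Mean difference between device readings and the reference standard."""
    return fmean(readings) - reference

bias_a = mean_bias(cuff_a, REFERENCE)
bias_b = mean_bias(cuff_b, REFERENCE)
print(f"Cuff A bias: {bias_a:+.1f} mmHg")   # +0.0
print(f"Cuff B bias: {bias_b:+.1f} mmHg")   # +11.5
```

A bias near zero indicates high accuracy; the sign shows the direction of any systematic error.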
3.2 Precision Example – Repeat BP Measurements
Same patient, repeated 5 times:
Device A: 144, 145, 143, 145, 144 → high precision
Device B: 130, 145, 120, 150, 135 → low precision
Key metric: Standard deviation or width of confidence intervals
Reminder: A device can be precise but inaccurate.
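The spread of the two devices can be quantified with the sample standard deviation of the readings listed above. This is a small sketch using only the Python standard library.

```python
from statistics import stdev

device_a = [144, 145, 143, 145, 144]
device_b = [130, 145, 120, 150, 135]

# Sample standard deviation (n - 1 denominator) of the repeated readings
sd_a = stdev(device_a)
sd_b = stdev(device_b)
print(f"Device A SD: {sd_a:.2f} mmHg")   # about 0.84
print(f"Device B SD: {sd_b:.2f} mmHg")   # about 11.94
```

Note that the standard deviation says nothing about whether either device is centered on the patient's true blood pressure; it only describes how tightly the readings cluster.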
3.3 Reliability Examples
a) Continuous Measures – ICC
Example: Handgrip dynamometer, two measurements per patient
ICC = 0.93 → excellent reliability
Reflects low measurement error relative to patient-to-patient variation.
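To make the idea of "measurement error relative to patient-to-patient variation" concrete, here is a rough sketch of one ICC form, the one-way random-effects ICC(1,1), computed from ANOVA mean squares. The article does not say which ICC form was used, so this choice is an assumption, and the grip-strength values are hypothetical.

```python
def icc_oneway(scores):
    """One-way random-effects ICC(1,1): n subjects, k measurements each."""
    n = len(scores)
    k = len(scores[0])
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    subject_means = [sum(row) / k for row in scores]

    # Between-subject and within-subject sums of squares
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum((x - m) ** 2
                    for row, m in zip(scores, subject_means)
                    for x in row)

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical handgrip strength (kg): two measurements per patient
grip = [(30, 31), (42, 41), (25, 27), (50, 49), (36, 36)]
icc = icc_oneway(grip)
print(f"ICC(1,1) = {icc:.3f}")
```

The ICC approaches 1 when repeated measurements on the same patient differ little compared with the differences between patients. Other ICC forms (two-way models, average measures) can give different values on the same data.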
b) Categorical Measures – Cohen’s Kappa
Example: Two radiologists reading chest X-rays
Kappa = 0.80 → substantial agreement
Accounts for chance agreement.
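The chance correction can be illustrated with a small sketch. The 2×2 counts below are hypothetical (the article gives no raw data) and were chosen so that kappa lands near 0.80; `cohens_kappa` is an illustrative helper, not a library function.

```python
def cohens_kappa(table):
    """Cohen's kappa from a square agreement table (rows: rater 1, columns: rater 2)."""
    total = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / total

    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Agreement expected by chance, from each rater's marginal proportions
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2

    return (observed - expected) / (1 - expected)

# Hypothetical chest X-ray reads: [abnormal, normal] for each radiologist
reads = [[40, 5],
         [5, 50]]
kappa = cohens_kappa(reads)
print(f"kappa = {kappa:.2f}")
```

In this made-up table the raw agreement is 90%, but about half of that would be expected by chance alone, which is why kappa is lower than the raw percentage.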
3.4 Validity Examples
a) Criterion Validity – Diagnostic Test
Rapid antigen test vs RT-PCR for COVID
High sensitivity, specificity, and LR+, together with a low LR– → strong criterion validity (shots land around the correct “disease” bullseye)
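These four metrics all come from the same 2×2 table against the reference standard. The sketch below uses hypothetical counts (not real rapid antigen test data) to show how they are derived; a good test has a high LR+ and a low LR–.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and likelihood ratios from 2x2 counts."""
    sensitivity = tp / (tp + fn)          # positives among the truly diseased
    specificity = tn / (tn + fp)          # negatives among the truly non-diseased
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }

# Hypothetical counts: index test result vs reference standard
metrics = diagnostic_metrics(tp=90, fn=10, fp=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The function assumes specificity is strictly between 0 and 1; a table with no false positives would need separate handling because LR+ is then undefined.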
b) Construct Validity – Psychological Scale
Example: 15-item depression questionnaire
Evidence needed:
Covers all relevant domains → content validity
Correlates with known depression scales → convergent validity
Does not correlate with unrelated constructs → discriminant validity
Differentiates clinically depressed from controls → known-group validity
Potential pitfall:
Patients answer consistently → high reliability
But the tool measures fatigue, not depression → poor validity
This is the classic reliable but not valid scenario.
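Convergent and discriminant validity are often examined through correlations. The sketch below uses entirely hypothetical scores for six patients: the new questionnaire, an established depression scale, and an unrelated variable (height). It is only meant to show the pattern one hopes to see, not a validation procedure.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data for six patients
new_scale = [5, 12, 8, 20, 15, 3]
established_scale = [6, 14, 9, 22, 13, 4]
height_cm = [170, 178, 165, 172, 168, 175]

r_convergent = pearson_r(new_scale, established_scale)
r_discriminant = pearson_r(new_scale, height_cm)
print(f"convergent r = {r_convergent:.2f}")      # expected to be high
print(f"discriminant r = {r_discriminant:.2f}")  # expected to be near zero
```

A high correlation with an established scale alone would not rule out the fatigue pitfall above if that scale is itself contaminated by fatigue, which is why several lines of evidence are needed.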
4. Integrating the Four Concepts: The CECS Logic Map
Accuracy
Truthfulness of the average measurement
Reduced by systematic bias
Precision
Tightness of repeated measurements
Reduced by random error
Reliability
Consistency of measurement
Largely depends on precision
Key for reproducible clinical assessment
Validity
Correctness of the construct being measured
Requires conceptual correctness + adequate reliability
Key Takeaway Logic
Precision → Reliability (small random error gives consistent, reproducible results)
Reliability is necessary but not sufficient for validity (a perfectly consistent tool can still measure the wrong construct)
Validity builds on the other three concepts (the right construct, measured with small systematic and random error)
Think:
Accuracy / Validity = closeness to the true bullseye
Precision / Reliability = tightness and consistency of shots
5. Summary
Accuracy = closeness to the true value
Precision = closeness of measurements to each other
Reliability = consistency of measurement (statistical reproducibility)
Validity = correctness of what you are measuring (construct truth)
Errors:
Systematic error → reduces accuracy & validity
Random error → reduces precision & reliability
Clinical practice depends on all four:
Accurate enough to reflect truth
Precise enough to be trusted
Reliable enough to reproduce
Valid enough to matter





