Accuracy, Precision, Reliability, and Validity: Clinical Epidemiology and Clinical Statistics Target-Board Explained

Mayta
Nov 19, 2025

1. Accuracy & Precision: The Target-Board Metaphor as the Intuitive Foundation

Imagine a classic shooting target. The bullseye = the true value or true construct. Each shot = one measurement.

This metaphor perfectly illustrates how four fundamental measurement concepts differ.

Accuracy

Definition: How close the average of your measurements is to the true value, usually in a cross-sectional study.

On the target:

High accuracy → cluster centered around the bullseye.
Low accuracy → cluster systematically shifted away (bias).

Precision

Definition: How close measurements are to one another, usually in a cross-sectional study.

On the target:

High precision → tight grouping.
Low precision → scattered shots.

Precision says nothing about correctness—only consistency.

Putting Accuracy + Precision Together

Accuracy	Precision	Visual Meaning
High	High	Tight cluster on bullseye
High	Low	Scattered but centered
Low	High	Tight cluster off-center
Low	Low	Scattered and off-center

2. Reliability & Validity: Are We Consistent? Are We Measuring the Right Thing?

While accuracy and precision are about numerical measurement, reliability and validity describe measurement quality and construct truth.

Reliability

Definition: How consistently the measurement method produces the same results under the same conditions.

On the target:

Reliable → tight cluster (regardless of location).
Unreliable → scattered shots.

Clinical perspective: If you repeat the test on the same patient in the same state, you should get similar values.

Types include:

Test–retest reliability
Inter-rater reliability
Internal consistency (scales)

Validity

Definition: Whether the measurement truly captures the phenomenon it claims to measure.

On the target:

Valid → shots centered on the correct bullseye.
Invalid → cluster on a wrong target.

Validity fundamentally asks:

“Are we measuring the right thing?”

Reliability vs Validity: Critical Logic

You can be reliable without being valid (tight cluster, wrong bullseye).
You cannot be valid with very poor reliability (if shots are everywhere, you cannot claim they represent the true value).

3. Clinical Research Examples for Each Concept

3.1 Accuracy Example – Blood Pressure Measurement

Reference standard: Arterial line = 130 mmHg

Cuff A (readings: 129, 131, 128, 132)

Mean ≈ 130 → high accuracy (low bias)

Cuff B (readings: 140, 142, 141, 143)

Mean = 141.5 → low accuracy (systematic overestimation of about 11.5 mmHg)

Key metric: Mean difference (bias)
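As a minimal sketch, the bias of each cuff can be computed directly from the readings above. The function name `bias` is just an illustrative choice, and the single arterial-line value of 130 mmHg is treated as the truth for every reading, which is a simplification.

```python
from statistics import mean

REFERENCE = 130  # arterial-line value from the example, in mmHg

cuff_a = [129, 131, 128, 132]
cuff_b = [140, 142, 141, 143]


def bias(readings, reference):
    """Mean difference between the readings and the reference value."""
    return mean(readings) - reference


bias_a = bias(cuff_a, REFERENCE)
bias_b = bias(cuff_b, REFERENCE)

print(f"Cuff A bias: {bias_a:+.1f} mmHg")  # +0.0
print(f"Cuff B bias: {bias_b:+.1f} mmHg")  # +11.5
```

A bias near zero corresponds to a cluster centered on the bullseye; a large positive bias corresponds to a cluster shifted consistently to one side.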

3.2 Precision Example – Repeat BP Measurements

Same patient, repeated 5 times:

Device A: 144, 145, 143, 145, 144 → high precision
Device B: 130, 145, 120, 150, 135 → low precision

Key metric: Standard deviation or width of confidence intervals

Reminder: A device can be precise but inaccurate.
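The spread of each device can be checked with the sample standard deviation. This is a small sketch using the five readings listed above; it says nothing about accuracy, because no reference value is involved.

```python
from statistics import stdev

device_a = [144, 145, 143, 145, 144]
device_b = [130, 145, 120, 150, 135]

# Sample standard deviation (n - 1 in the denominator) of repeated readings.
sd_a = stdev(device_a)
sd_b = stdev(device_b)

print(f"Device A SD: {sd_a:.2f} mmHg")
print(f"Device B SD: {sd_b:.2f} mmHg")
```

Device A comes out at roughly 0.8 mmHg and Device B at roughly 11.9 mmHg, which matches the tight versus scattered grouping on the target.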

3.3 Reliability Examples

a) Continuous Measures – ICC

Example: Handgrip dynamometer, two measurements per patient

ICC = 0.93 → excellent reliability. This reflects low measurement error relative to patient-to-patient variation.
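To make the idea of "measurement error relative to patient-to-patient variation" concrete, here is a hedged sketch of a one-way random-effects ICC, often written ICC(1,1). The article does not say which ICC model was used, so the one-way form is an assumption, and the grip-strength values below are invented purely for illustration. They are not the data behind the 0.93 figure.

```python
from statistics import mean


def icc_oneway(scores):
    """One-way random-effects ICC(1,1).

    scores: list of rows, one row per patient, each row holding
    the same number (k) of repeated measurements.
    """
    n = len(scores)        # number of patients
    k = len(scores[0])     # measurements per patient
    grand_mean = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]

    # Between-patient mean square: how much patients differ from each other.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-patient mean square: how much repeats differ for the same patient.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical grip strength in kg: five patients, two measurements each.
grip = [(30, 31), (42, 40), (25, 26), (50, 49), (36, 37)]
icc = icc_oneway(grip)
print(f"ICC(1,1) = {icc:.3f}")
```

In this made-up data the two readings per patient differ by only 1 to 2 kg while patients differ from each other by tens of kg, so the ICC lands close to 1. For real analyses, a dedicated statistics package and an explicitly chosen ICC model are the safer route.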

b) Categorical Measures – Cohen’s Kappa

Example: Two radiologists reading chest X-rays

Kappa = 0.80 → substantial agreement. Kappa accounts for chance agreement.
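The chance correction can be seen in a short sketch that computes Cohen's kappa from an agreement table. The counts are hypothetical (100 chest X-rays rated "abnormal" or "normal" by two radiologists) and were chosen only so that the result lands near the value quoted above.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of cases rater A put in category i
    and rater B put in category j.
    """
    total = sum(sum(row) for row in table)
    # Observed agreement: proportion of cases on the diagonal.
    observed = sum(table[i][i] for i in range(len(table))) / total
    # Expected agreement by chance, from each rater's marginal totals.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (observed - expected) / (1 - expected)


# Hypothetical counts: rows = radiologist A, columns = radiologist B.
#                 B: abnormal  B: normal
ratings = [[40, 5],    # A: abnormal
           [5, 50]]    # A: normal

kappa = cohens_kappa(ratings)
print(f"Kappa = {kappa:.2f}")
```

Here the raters agree on 90% of films, but about 50% agreement would be expected by chance alone, so kappa is about 0.80 rather than 0.90.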

3.4 Validity Examples

a) Criterion Validity – Diagnostic Test

Rapid antigen test vs RT-PCR for COVID-19

High sensitivity and specificity, a high LR+, and a low LR– → strong criterion validity (shots land around the correct “disease” bullseye)
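These four metrics all come from the same 2×2 table against the reference standard. The sketch below uses invented counts, not real rapid-antigen data, and the function name `diagnostic_metrics` is only illustrative. Note that a useful test has a high LR+ and a low LR–.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, and likelihood ratios from a 2x2 table.

    Assumes specificity is strictly between 0 and 1; otherwise one of
    the likelihood ratios is undefined (division by zero).
    """
    sensitivity = tp / (tp + fn)   # true positives among the diseased
    specificity = tn / (tn + fp)   # true negatives among the non-diseased
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Hypothetical counts: 100 RT-PCR-positive and 200 RT-PCR-negative patients.
metrics = diagnostic_metrics(tp=85, fn=15, fp=5, tn=195)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, sensitivity is 0.85, specificity is 0.975, LR+ is 34, and LR– is about 0.15.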

b) Construct Validity – Psychological Scale

Example: 15-item depression questionnaire

Evidence needed:

Covers all relevant domains → content validity
Correlates with known depression scales → convergent validity
Does not correlate with unrelated constructs → discriminant validity
Differentiates clinically depressed from controls → known-group validity
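Convergent and discriminant validity are usually examined with correlations. As a rough sketch, the snippet below computes Pearson's r by hand for six hypothetical patients. All scores are made up, and "height" simply stands in for any construct that should be unrelated to depression.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical data for six patients.
new_scale = [5, 12, 8, 20, 15, 3]              # new 15-item questionnaire
established = [6, 14, 9, 22, 13, 4]            # existing depression scale
height_cm = [170, 175, 165, 173, 168, 174]     # unrelated construct

r_convergent = pearson_r(new_scale, established)
r_discriminant = pearson_r(new_scale, height_cm)

print(f"Convergent r:   {r_convergent:.2f}")
print(f"Discriminant r: {r_discriminant:.2f}")
```

In this toy data the new scale tracks the established scale closely and is almost uncorrelated with height, which is the pattern that supports construct validity. Real validation studies need far larger samples than six patients.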

Potential pitfall:

Patients answer consistently → high reliability
But the tool measures fatigue, not depression → poor validity

This is the classic reliable but not valid scenario.