
Types of ICC (intraclass correlation coefficient): One-Way, Two-Way Random, and Mixed Effects Explained


1 One-way Random Effects: ICC(1) / ICC(1,1) / ICC(1,k)

Concept

When to use

Typical examples

What problem it solves

Stata code (built-in icc)

* One-way random effects: supply only the subject (target) variable.
* The Individual row of the output is ICC(1,1), a single measurement;
* the Average row is ICC(1,k), the average of k ratings per subject.
icc measure id
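As a rough guide to what the one-way coefficients estimate, the usual ANOVA mean-square forms are sketched below. The symbols are assumptions introduced here, not taken from the post: \(MS_B\) is the between-subjects mean square, \(MS_W\) the within-subjects mean square, and \(k\) the number of ratings per subject.

\[
\text{ICC}(1,1) = \frac{MS_B - MS_W}{MS_B + (k-1)\,MS_W},
\qquad
\text{ICC}(1,k) = \frac{MS_B - MS_W}{MS_B}
\]

Because the one-way model cannot separate rater effects from random error, both end up inside \(MS_W\).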

2 Two-way Random Effects: ICC(2,1) / ICC(2,k)

Concept

When to use

Typical examples

What problem it solves

Interpretation

“If a different clinician/GP/radiologist were to rate these patients, how reliable would the scores be?”

Stata code (built-in icc)

* Two-way random effects, absolute agreement (the default when a rater variable is given).
* The Individual row of the output is ICC(2,1), a single rating;
* the Average row is ICC(2,k), the average of k raters.
icc measure id rater, absolute
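For orientation, the two-way random, absolute-agreement coefficients are commonly written in terms of two-way ANOVA mean squares. The notation is an assumption introduced here: \(MS_B\) between subjects, \(MS_R\) between raters, \(MS_E\) residual, \(n\) subjects, and \(k\) raters.

\[
\text{ICC}(2,1) = \frac{MS_B - MS_E}{MS_B + (k-1)\,MS_E + \frac{k}{n}\,(MS_R - MS_E)},
\qquad
\text{ICC}(2,k) = \frac{MS_B - MS_E}{MS_B + \frac{1}{n}\,(MS_R - MS_E)}
\]

The \(MS_R - MS_E\) term is where systematic differences between raters enter the denominator as error.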

3 Two-way Mixed Effects: ICC(3,1) / ICC(3,k)

Concept

When to use

Typical examples

What problem it solves

Interpretation

“Given these exact raters, how reproducible are their scores?”

Not: “What happens if a different rater reads the images?”

Stata code (built-in icc)

* Two-way mixed effects, absolute agreement.
* The Individual row of the output is ICC(3,1), a single rating;
* the Average row is ICC(3,k), the mean of k raters.
icc measure id rater, mixed absolute

* Consistency version (the default under mixed):
icc measure id rater, mixed consistency
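A sketch of the consistency form of the two-way mixed coefficients, using ANOVA mean squares. The notation is an assumption introduced here: \(MS_B\) between subjects, \(MS_E\) residual, and \(k\) raters.

\[
\text{ICC}(3,1) = \frac{MS_B - MS_E}{MS_B + (k-1)\,MS_E},
\qquad
\text{ICC}(3,k) = \frac{MS_B - MS_E}{MS_B}
\]

The between-raters mean square does not appear, which is why systematic rater offsets leave these values unchanged. The absolute-agreement variant under the mixed model is usually computed with the same estimator as the two-way random absolute-agreement coefficient; only the interpretation (fixed versus sampled raters) differs.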

🔁 Single vs Average: (1,1) vs (1,k); (2,1) vs (2,k); (3,1) vs (3,k)

Example:
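One way to make the single versus average distinction concrete is the Spearman–Brown relation, which links the single-rating coefficient to the coefficient for the mean of \(k\) ratings within the same model. The numbers below are hypothetical and chosen only for illustration.

\[
\text{ICC}(\cdot,k) = \frac{k\,\text{ICC}(\cdot,1)}{1 + (k-1)\,\text{ICC}(\cdot,1)}
\]

For instance, if a single rating has \(\text{ICC} = 0.60\) and the final score is the mean of \(k = 3\) raters:

\[
\text{ICC}(\cdot,3) = \frac{3 \times 0.60}{1 + 2 \times 0.60} = \frac{1.80}{2.20} \approx 0.82
\]

The averaged coefficient only applies if the mean of \(k\) ratings is what will actually be used in practice.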

📌 Absolute Agreement vs Consistency

These are about what counts as “error” in the denominator:

🔹 Absolute Agreement (aka ICC_agreement, like in de Vet’s paper)

🔹 Consistency (aka ICC_consistency)

In de Vet et al., the formulas are:

\[\text{ICC}_{agreement} = \frac{\sigma_p^2}{\sigma_p^2 + \sigma_{pt}^2 + \sigma_{residual}^2}\]

\[\text{ICC}_{consistency} = \frac{\sigma_p^2}{\sigma_p^2 + \sigma_{residual}^2}\]

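where \(\sigma_p^2\) is the variance between persons (subjects), \(\sigma_{pt}^2\) is the variance due to systematic differences between raters, and \(\sigma_{residual}^2\) is the residual (random error) variance.

Absolute agreement includes \(\sigma_{pt}^2\) as error; consistency does not.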
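A small worked example may help. Assume, purely for illustration, variance components of \(\sigma_p^2 = 8\), \(\sigma_{pt}^2 = 2\), and \(\sigma_{residual}^2 = 2\) (hypothetical values, not from de Vet et al.):

\[
\text{ICC}_{agreement} = \frac{8}{8 + 2 + 2} \approx 0.67,
\qquad
\text{ICC}_{consistency} = \frac{8}{8 + 2} = 0.80
\]

The same data give a lower agreement coefficient because the systematic rater variance is counted as error there and ignored in the consistency version.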

🎯 ICC vs Agreement: What Question Are You Answering?

From de Vet et al.:

Key implications:

🧮 Agreement Side: SEM, SDC, Limits of Agreement

These are not ICC but frequently paired with it:

These are agreement measures, not reliability per se, and are recommended when the research focus is evaluating change (e.g., pre–post treatment).
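The usual definitions of these agreement parameters can be sketched as follows, using the same variance-component notation as the ICC formulas. \(\bar{d}\) and \(SD_{diff}\) are assumptions introduced here for the mean and standard deviation of the paired differences between two measurements.

\[
\text{SEM}_{agreement} = \sqrt{\sigma_{pt}^2 + \sigma_{residual}^2},
\qquad
\text{SEM}_{consistency} = \sqrt{\sigma_{residual}^2}
\]

\[
\text{SDC} = 1.96 \times \sqrt{2} \times \text{SEM},
\qquad
\text{Limits of agreement} = \bar{d} \pm 1.96 \times SD_{diff}
\]

With hypothetical values \(\sigma_{pt}^2 = 2\) and \(\sigma_{residual}^2 = 2\), \(\text{SEM}_{agreement} = \sqrt{4} = 2\) and \(\text{SDC} = 1.96 \times \sqrt{2} \times 2 \approx 5.54\), so a change smaller than about 5.5 units could not be distinguished from measurement error.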

🛠 Stata Implementation Summary

Built-in icc (for continuous outcomes):

* One-way random: Individual row = ICC(1,1); Average row = ICC(1,k)
icc measure id

* Two-way random, absolute agreement (the default): ICC(2,1); ICC(2,k)
icc measure id rater, absolute

* Two-way mixed, absolute agreement: ICC(3,1); ICC(3,k)
icc measure id rater, mixed absolute

* Consistency versions (exclude systematic rater differences)
icc measure id rater, consistency
icc measure id rater, mixed consistency
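The following is a minimal sketch of how the absolute-agreement and consistency versions can be compared on one dataset. The ratings are illustrative values only (not from the post), and the variable names `score1`–`score4` are assumptions. The stored results `r(icc_i)` and `r(icc_avg)` are the names documented for `icc`; running `return list` after the command confirms what your Stata version provides.

```stata
* Illustrative data only: 6 subjects, each scored by the same 4 raters (wide form).
clear
input id score1 score2 score3 score4
1  9 2 5 8
2  6 1 3 2
3  8 4 6 8
4  7 1 2 6
5 10 5 6 9
6  6 2 4 7
end

* icc needs long form: one row per subject-rater pair.
reshape long score, i(id) j(rater)
rename score measure

* Two-way random effects, absolute agreement (rater offsets count as error).
quietly icc measure id rater, absolute
scalar icc_abs_single = r(icc_i)
scalar icc_abs_avg    = r(icc_avg)

* Two-way mixed effects, consistency (rater offsets are ignored).
quietly icc measure id rater, mixed consistency
scalar icc_con_single = r(icc_i)
scalar icc_con_avg    = r(icc_avg)

display "Absolute agreement: single = " %5.3f icc_abs_single "  average = " %5.3f icc_abs_avg
display "Consistency:        single = " %5.3f icc_con_single "  average = " %5.3f icc_con_avg
```

In these illustrative ratings the rater means differ markedly, so the consistency coefficients come out higher than the absolute-agreement ones.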

🔍 kappaetc vs icc

So:


Summary Table with Rationale + Fixes

| ICC Model | Use When | Why Use (Statistical Issue It Fixes) | Example |
|---|---|---|---|
| ICC(1,1) One-way random | Different raters rate different subjects; raters random | Handles unbalanced ratings; generalizable to population of raters | Rotating lab techs |
| ICC(2,1) Two-way random, absolute agreement | All raters rate all subjects; raters random & interchangeable | Fixes bias by including rater mean differences; reliability generalizes to other raters | Multiple radiologists |
| ICC(3,1) Two-way mixed, consistency | All raters rate all subjects; raters fixed | Fixes inflation from rater differences; focuses on consistency unique to these raters | Two cardiologists in RCT |


What Each ICC Model Fixes

ICC(1) — Fixes problems with:

ICC(2) — Fixes problems with:

ICC(3) — Fixes problems with:


ICC Note (translated from Thai)

1 One-way Random Effects → ICC(1) / ICC(1,1)

2 Two-way Random Effects → ICC(2,1)

Suited to clinical measurements that “anyone can read” while the results remain reliable.

3 Two-way Mixed Effects → ICC(3,1)

Suited to work in which only this set of raters is used, both in the study and in actual practice.


Clinical Rule of Thumb (from COSMIN + de Vet)

  • Inter-rater reliability (generalizable) → ICC(2)
  • Test–retest reliability (same assessor) → ICC(3)
  • Multi-center clinical measurement studies → ICC(2)
  • When only designated raters will ever perform scoring → ICC(3)
  • When raters vary randomly → ICC(1)


Reference: Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016 Jun;15(2):155-63. doi: 10.1016/j.jcm.2016.02.012.

🧭 Step 1 – What kind of reliability?

  1. Test–retest / Intra-rater reliability
    • Same rater (or fixed team) measuring the same subjects at different times.
      ➜ Go to Step 2A (Two-way mixed).
  2. Inter-rater reliability
    • Different raters assessing the same subjects (usually at one time).
      ➜ Go to Step 2B.

🧭 Step 2A – Test–retest / Intra-rater → usually Two-way mixed

Here the rater is specific, not a random sample.

2A-1. Model

2A-2. Intended measurement protocol?

2A-3. Definition

Stata examples

* Test–retest: the third variable identifies the measurement occasion
* (still named rater here to match the other examples).
* Individual row = ICC(3,1), single measurement; Average row = ICC(3,k), mean of k measurements.
icc measure id rater, mixed absolute
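For test–retest data, the reliability and agreement sides can be computed from the same file. The sketch below assumes a hypothetical wide layout (`time1`, `time2`) with made-up values; the scalar names are also illustrative. The SEM shown is the consistency form derived from the paired differences, which is one of several ways to estimate it.

```stata
* Hypothetical test-retest data: 6 subjects measured on two occasions.
clear
input id time1 time2
1 10 11
2 14 13
3  9 11
4 16 16
5 12 14
6 13 12
end

* Agreement side, from the paired differences.
generate diff = time2 - time1
quietly summarize diff
scalar mean_diff = r(mean)
scalar sd_diff   = r(sd)
scalar sem_c     = sd_diff / sqrt(2)            // SEM, consistency form
scalar sdc       = 1.96 * sqrt(2) * sem_c       // smallest detectable change
scalar loa_lo    = mean_diff - 1.96 * sd_diff   // lower limit of agreement
scalar loa_hi    = mean_diff + 1.96 * sd_diff   // upper limit of agreement
display "Limits of agreement: " %6.2f loa_lo " to " %6.2f loa_hi
display "SEM = " %6.2f sem_c "   SDC = " %6.2f sdc

* Reliability side: reshape so that occasion takes the place of the rater variable.
drop diff
reshape long time, i(id) j(occasion)
rename time measure
icc measure id occasion, mixed absolute
```

Reporting the ICC together with SEM or SDC shows both how well subjects are distinguished and how large a change must be to exceed measurement error.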

🧭 Step 2B – Inter-rater reliability

2B-1. Did the same set of raters rate all subjects?

✅ YES → Go to 2B-2

❌ NO → One-way random effects → ICC(1,·)

If NO (raters differ across subjects; unbalanced design):

Stata

* One-way random: supply only the subject variable.
* Individual row = ICC(1,1), single rating; Average row = ICC(1,k), average of k ratings.
icc measure id
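As a sketch of the data layout for this branch: when raters differ across subjects there is no usable rater variable, so the file only needs a subject identifier and the measurement, in long form. The values below are made up for illustration.

```stata
* Made-up data: 4 subjects, 3 ratings each, from whichever raters were available.
clear
input id measure
1 10
1 11
1 12
2 20
2 19
2 21
3 15
3 16
3 14
4 25
4 24
4 26
end

* One-way random effects: no rater variable is supplied.
quietly icc measure id
scalar icc_single = r(icc_i)     // single rating
scalar icc_avg    = r(icc_avg)   // average of the 3 ratings per subject
display "ICC(1,1) = " %5.3f icc_single "   ICC(1,k) = " %5.3f icc_avg
```

Each subject has the same number of ratings here, which keeps the example balanced.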

2B-2. If YES: same raters for all subjects

Ask: How do you conceptualize the raters?

  1. Randomized / interchangeable raters? (You want to generalize to other similar raters, e.g. any GP or radiologist.)
    ➜ Two-way random effects → ICC(2,·)
  2. Specific raters only? (Only this panel or these named experts will ever rate.)
    ➜ Two-way mixed effects → ICC(3,·)

🧭 Step 3 – Choose Random vs Mixed branch

🔹 Two-way random effects → ICC(2,·)

Use when:

Protocol:

Agreement vs consistency?

Stata

* Two-way random, absolute agreement
* Individual row = ICC(2,1), single rating; Average row = ICC(2,k), mean of k raters
icc measure id rater, absolute

* Consistency version (if rank order matters more than exact values)
icc measure id rater, consistency

🔹 Two-way mixed effects → ICC(3,·)

Use when:

Protocol:

Agreement vs consistency?

Stata

* Two-way mixed, absolute agreement
* Individual row = ICC(3,1), single rating; Average row = ICC(3,k), mean of k raters
icc measure id rater, mixed absolute

* Consistency version
icc measure id rater, mixed consistency

🔚 Ultra-Short Cheat Table

| Situation | Model | ICC Type |
|---|---|---|
| Test–retest / intra-rater | Two-way mixed | ICC(3,1) or ICC(3,k), usually absolute |
| Inter-rater, raters differ across subjects | One-way random | ICC(1,1) or ICC(1,k), absolute |
| Inter-rater, same raters, want to generalize to other raters | Two-way random | ICC(2,1) or ICC(2,k), absolute or consistency |
| Inter-rater, same fixed raters only | Two-way mixed | ICC(3,1) or ICC(3,k), absolute or consistency |