Incidence Rate [IR] and Incidence Rate Ratio [IRR]: Analysis of Rates in Clinical Epidemiology Using Stata
- Mayta

- Feb 8
1. Introduction
In clinical epidemiology, outcomes often occur over time, and individuals may contribute different lengths of follow-up. In such settings, simple risks or proportions can mislead because they ignore how long each person was observed. Incidence rates (IRs) and incidence rate ratios (IRRs) are the appropriate measures of disease incidence and exposure effects.
Stata’s ir command is designed for this purpose and is widely used in cohort studies, occupational epidemiology, pharmacoepidemiology, and registry-based research.
2. Incidence Rate (IR)
Definition
The incidence rate quantifies how rapidly new events occur in a population:
IR = number of incident events ÷ total person-time at risk
Numerator: number of incident events (e.g., deaths, infections)
Denominator: accumulated person-time (person-years, person-months)
Key property (dominant concept)
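IR measures the speed of occurrence (events per unit of person-time), not a probability. Unlike a risk, it has units of 1/time and is not bounded by 1.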
Two groups may have the same number of events, but if their follow-up times differ, their incidence rates differ.
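To make this concrete, here is a minimal sketch using made-up counts (10 events in each group, with 100 versus 200 person-years of follow-up). The numbers and scalar names are purely illustrative and not taken from any study.

```stata
* Hypothetical counts: same number of events, different follow-up
scalar events_a = 10
scalar pt_a     = 100
scalar events_b = 10
scalar pt_b     = 200

* Incidence rate = events / person-time
scalar ir_a = events_a / pt_a
scalar ir_b = events_b / pt_b

display "IR, group A: " ir_a " per person-year"
display "IR, group B: " ir_b " per person-year"
```

Group A's rate (10 ÷ 100) is twice group B's (10 ÷ 200), even though the event counts are identical.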
3. Person-Time: the Role of Time
Each individual contributes person-time from cohort entry until the first of:
the event occurs,
the individual is lost to follow-up,
or the study ends.
Example
Subject | Event | Follow-up (years) |
A | Yes | 1.0 |
B | No | 2.5 |
C | Yes | 0.5 |
Total events = 2
Total person-time = 1.0 + 2.5 + 0.5 = 4.0 person-years
Incidence rate = 2 ÷ 4.0 = 0.5 events per person-year
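The same totals can be reproduced in Stata. This is a small sketch; the variable names subject, event, and futime are my own choices for illustration.

```stata
clear
input str1 subject event futime
"A" 1 1.0
"B" 0 2.5
"C" 1 0.5
end

* Sum events and follow-up time across subjects
quietly summarize event
scalar total_events = r(sum)
quietly summarize futime
scalar total_pt = r(sum)

* Incidence rate = total events / total person-time
display "Total events      = " total_events
display "Total person-time = " total_pt
display "Incidence rate    = " total_events / total_pt " per person-year"
```

With these three subjects the rate works out to 2 ÷ 4.0 = 0.5 events per person-year.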
4. Incidence Rate Ratio (IRR)
Definition
The incidence rate ratio compares incidence rates between two groups:
IRR = IR_exposed ÷ IR_unexposed
Interpretation
IRR = 1 → no association
IRR > 1 → higher event rate in the exposed group
IRR < 1 → lower event rate in the exposed group (a protective association)
Example:
IRR = 2.0 → events occur at twice the rate in the exposed group as in the unexposed group
5. Why IR / IRR are Dominant Measures
IR and IRR are dominant when:
Follow-up time varies between individuals
Entry into cohort is staggered
Loss to follow-up occurs
Population is dynamic
Outcome is recurrent or count-based
Interest is in rate, not cumulative risk
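In these settings they are preferable to:
Risk ratios, which assume everyone was followed for the same period
Odds ratios, which ignore person-time and approximate the risk ratio only when the outcome is rare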
6. Data Structure Required for ir
Each observation must contain:
Variable | Meaning |
cases | Number of events |
exposed | Exposure indicator (0/1) |
time | Person-time contribution |
Data may be individual-level (one row per person) or aggregated (one row per exposure group or stratum, with summed events and person-time).
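As a sketch of the aggregated layout, the two-row dataset below holds one record per exposure group. The counts are illustrative only; the variable names follow the table above.

```stata
clear
* One row per exposure group: summed events and person-time
input exposed cases time
1 12 240
0  8 400
end

list

* ir totals cases and person-time within each exposure group
ir cases exposed time
```

Because ir sums cases and person-time within each level of exposed, the same command works whether each row is a person or a group.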
7. Stata ir Command
Basic syntax
ir cases exposed time
What Stata does internally:
1. Splits data by exposed
2. Computes the incidence rate in each group: IR = total events ÷ total person-time
3. Calculates the IRR and its confidence interval
8. Worked Example (Individual-Level Data)
Example dataset
clear
input id exposed events time
1 1 1 1.2
2 1 0 2.0
3 1 1 0.8
4 0 0 3.0
5 0 1 1.5
6 0 0 2.5
end
Run incidence rate analysis
ir events exposed time
Interpretation
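Stata reports:
IR_exposed
IR_unexposed
IRR with 95% CI
The output also includes the incidence rate difference and attributable fractions.
The third variable (time) supplies the person-time denominator.
For these data: IR_exposed = 2 ÷ 4.0 = 0.50, IR_unexposed = 1 ÷ 7.0 ≈ 0.14, so IRR ≈ 3.5.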
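If you want to reuse the estimates rather than read them off the screen, ir leaves results in r(). The sketch below assumes the stored names are r(irr), r(lb_irr), and r(ub_irr); return list shows exactly what your Stata version stores, so check it before relying on those names.

```stata
clear
input id exposed events time
1 1 1 1.2
2 1 0 2.0
3 1 1 0.8
4 0 0 3.0
5 0 1 1.5
6 0 0 2.5
end

* Run the analysis without printing the table
quietly ir events exposed time

* Inspect everything ir stored
return list

* Assumed result names; confirm them against the return list output
display "IRR = " r(irr) "  (95% CI " r(lb_irr) " to " r(ub_irr) ")"
```

With only three events in six subjects, expect a very wide confidence interval; the example is for mechanics, not inference.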
9. Manual Verification (Recommended for Understanding)
* collapse replaces the data in memory, so preserve first
preserve
collapse (sum) events time, by(exposed)
gen ir = events / time
list
restore
This produces two rows:
exposed = 1 → IR_exposed
exposed = 0 → IR_unexposed
IRR = ratio of these two IRs.
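The final ratio can be computed from the collapsed rows as well. This sketch re-enters the worked example so it runs on its own, and stores the rate as a double to limit rounding.

```stata
clear
input id exposed events time
1 1 1 1.2
2 1 0 2.0
3 1 1 0.8
4 0 0 3.0
5 0 1 1.5
6 0 0 2.5
end

* One row per exposure group, with summed events and person-time
collapse (sum) events time, by(exposed)
gen double ir = events / time

* After sorting, row 1 is unexposed (0) and row 2 is exposed (1)
sort exposed
list
display "IRR = " ir[2] / ir[1]
```

The result should be approximately 3.5 (2 ÷ 4.0 divided by 1 ÷ 7.0). It may not print as exactly 3.5, because the follow-up times are stored as floats.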
10. Immediate Form: iri (Aggregated Data)
When you already have totals:
iri events_exposed events_unexposed time_exposed time_unexposed
Example
iri 12 8 240 400
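This gives the same results as running ir on a dataset with these totals. Here IR_exposed = 12 ÷ 240 = 0.05, IR_unexposed = 8 ÷ 400 = 0.02, and IRR = 2.5.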
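A quick way to confirm the argument order (exposed events, unexposed events, exposed person-time, unexposed person-time) is to redo the arithmetic by hand next to the command:

```stata
* Immediate form: events and person-time typed directly
iri 12 8 240 400

* Hand check of the same quantities
display "IR exposed   = " 12/240
display "IR unexposed = " 8/400
display "IRR          = " (12/240) / (8/400)
```

If the hand-computed IRR does not match the iri output, the four numbers were most likely entered in the wrong order.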
11. Stratified Analysis
ir events exposed time, by(agegroup)
Computes stratum-specific IRRs
Combines them using Mantel–Haenszel weights and reports a test of homogeneity across strata
Optional standardization (istandard, estandard)
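A minimal sketch of a stratified run, using a small invented dataset with two age groups. The counts are hypothetical and chosen only so the commands have something to work on.

```stata
clear
* Hypothetical aggregated data: one row per age group and exposure level
input agegroup exposed events time
1 1 4 100
1 0 2 150
2 1 8 140
2 0 6 250
end

* Stratum-specific IRRs plus the Mantel-Haenszel combined estimate
ir events exposed time, by(agegroup)

* Same strata, combined with internal standardization instead
ir events exposed time, by(agegroup) istandard
```

Comparing the crude and combined IRRs in the first table gives a quick impression of whether age confounds the association.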
12. Relationship to Poisson Regression
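For a single binary exposure, ir yields the same IRR point estimate as a Poisson model with log person-time as the offset:
poisson events exposed, exposure(time) irr
The confidence intervals can differ slightly because the two commands compute them differently.
Differences: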
ir | poisson |
Table-based | Model-based |
No covariates | Multiple covariates |
Teaching & descriptive | Multivariable inference |
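Running both commands on the worked example from Section 8 is a simple way to see the correspondence. This sketch re-enters the data so it is self-contained.

```stata
clear
input id exposed events time
1 1 1 1.2
2 1 0 2.0
3 1 1 0.8
4 0 0 3.0
5 0 1 1.5
6 0 0 2.5
end

* Table-based estimate
ir events exposed time

* Model-based estimate: log(time) enters as an offset, irr reports exp(b)
poisson events exposed, exposure(time) irr
```

The IRR for exposed in the Poisson output should match the ir point estimate (about 3.5), while the confidence limits need not agree exactly. Adding further covariates to the poisson call is where the model-based approach goes beyond what ir offers.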
13. Common Misinterpretations (Important)
❌ IR is not a proportion
❌ IRR is not an odds ratio
❌ IRR is not a hazard ratio
✔ IRR compares rates per time
✔ HR compares instantaneous hazards
14. How to Report in a Paper
“Incidence rates were calculated as events per person-year. Incidence rate ratios (IRRs) with 95% confidence intervals were estimated using person-time denominators.”
15. Take-Home Message
Incidence Rate (IR) = events ÷ person-time
Incidence Rate Ratio (IRR) = ratio of two IRs
Time is central, not optional
ir provides clean, transparent rate comparisons
Best used when follow-up is unequal or dynamic





