Reporting Multiple Imputation in Clinical Research: Standards, Misconceptions, and Best Practices
- Mayta
- Jun 1
Introduction
Multiple Imputation (MI) has become a cornerstone technique for managing missing data, but its power is matched by the responsibility to report it transparently. Poorly documented MI can undermine the credibility of otherwise valid results and hinder reproducibility. Furthermore, widespread misconceptions continue to limit its adoption or lead to misuse.
This article offers a comprehensive framework for reporting MI in research publications and clarifies common misunderstandings, including guidance on choosing the number of imputations and acceptable levels of missingness.
What Must Be Reported: Transparency Frameworks
A. Reporting Missing Data Characteristics
Any article involving MI should describe the nature, pattern, and context of the missing data. This includes:
Rates of missingness per variable (preferably in a table).
Casewise missingness rates (what proportion of cases are incomplete).
Patterns of missingness—monotone or non-monotone; dropout vs intermittent.
Reasons for missing data, if known (e.g., dropout, refusal, procedural error).
Assumed missingness mechanism—MCAR, MAR, or MNAR—with justification.
🔍 Practical Insight: Explicitly stating “We assume MAR due to auxiliary variable availability and predictable dropout patterns” is far more informative than a generic claim.
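In practice, these figures can be tabulated directly before writing them up. A minimal sketch in Stata, assuming hypothetical variables sbp, bmi, and smoker:

```stata
* Per-variable missingness rates (suitable for the reporting table)
misstable summarize sbp bmi smoker

* Missingness patterns: monotone vs non-monotone
misstable patterns sbp bmi smoker, frequency

* Casewise missingness: how many cases are incomplete on these variables
egen nmiss = rowmiss(sbp bmi smoker)
count if nmiss > 0
```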
B. Describing the Imputation Model
The imputation model is a distinct statistical engine from the substantive analysis model and must be clearly documented. Key elements to report include:
Which variables were imputed (and which were predictors).
Auxiliary variables used—with justification for inclusion.
Non-linear terms and interactions, and how they were handled.
Statistical method used (e.g., linear regression, logistic, predictive mean matching).
Software and package/version—e.g., “Stata v.17, mi impute chained”.
💡 Always explain how passive imputation or transformations were handled, especially for derived quantities such as age², BMI (computed from weight and height), or composite scores, as in the sketch below.
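A minimal sketch of such a declaration in Stata, with hypothetical variables (hdl stands in for an auxiliary variable, age2 for a passively derived term):

```stata
* Declare the MI style and register imputed vs complete variables
mi set flong
mi register imputed sbp smoker
mi register regular age sex hdl        // hdl: auxiliary variable

* Chained equations: PMM for continuous SBP, logit for binary smoking;
* each conditional model draws on age, sex, and the auxiliary hdl
mi impute chained (pmm, knn(5)) sbp (logit) smoker = age i.sex hdl, ///
    add(20) rseed(20240601)

* Derived terms are generated passively in each completed dataset
mi passive: generate age2 = age^2
```

Reporting a block like this (or its equivalent) answers most of the checklist above in a few lines.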
C. Imputation Procedure Specifics
This includes the operational aspects of MI execution:
Algorithm type: Chained Equations (MICE), Multivariate Normal (MVN), etc.
Number of imputed datasets (m) and rationale.
Number of iterations per chain (if applicable).
Order of imputation (if set manually).
Diagnostics performed: convergence checks, trace plots, density plots.
📌 Best Practice: Report both the imputed and observed distributions of key variables to assure readers of imputation plausibility.
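A sketch of how these diagnostics might be produced in Stata: savetrace() stores per-iteration chain summaries for convergence checking, and mi xeq contrasts observed with completed data (variable names hypothetical; assumes the data are already mi set and registered):

```stata
* Save chain means/SDs at each iteration for convergence diagnostics
mi impute chained (pmm, knn(5)) sbp = age i.sex, add(20) burnin(20) ///
    rseed(20240601) savetrace(impute_trace, replace)

* Inspect the trace file (its variable naming follows Stata's savetrace
* convention; run -describe- to confirm before drawing trace plots)
preserve
use impute_trace, clear
describe
restore

* Compare observed vs completed-data summaries for a key variable
mi xeq 0: summarize sbp        // observed cases only (m = 0)
mi xeq 1 2: summarize sbp      // first two completed datasets
```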
D. Pooling and Analysis Integration
Explain:
Pooling rules used (typically Rubin’s Rules).
How standard errors were combined across imputed datasets (and by which software routine).
If more than one analysis was performed, whether all analyses used the same imputed datasets, and the rationale if they did not.
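For reference, Rubin's Rules combine the m point estimates and their variances as follows; routines such as Stata's mi estimate apply them automatically:

```latex
\bar{\theta} = \frac{1}{m}\sum_{j=1}^{m}\hat{\theta}_j \qquad
\bar{W} = \frac{1}{m}\sum_{j=1}^{m}\widehat{\mathrm{Var}}(\hat{\theta}_j) \qquad
B = \frac{1}{m-1}\sum_{j=1}^{m}\bigl(\hat{\theta}_j - \bar{\theta}\bigr)^2 \qquad
T = \bar{W} + \Bigl(1 + \frac{1}{m}\Bigr)B
```

The total variance T is what pooled standard errors are built from; the between-imputation component B is where the uncertainty due to missingness enters.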
Correcting Misconceptions About Multiple Imputation
Misconception 1: MI is only for MAR
While MAR is a core assumption for many MI methods, MI can also support sensitivity analyses under MNAR by modeling deviations from MAR (e.g., Jump-to-Reference). Moreover, in borderline or unknown cases, carefully structured MI is still more defensible than complete-case analysis.
Misconception 2: MI is unnecessary when missingness is low
Even modest amounts of missingness in key covariates or outcomes can bias estimates, especially if data are not MCAR. MI remains valuable even when <10% of cases are incomplete.
Misconception 3: Certain variables must never be imputed
There is no hard rule against imputing outcomes or predictors. What matters is how and why. For repeated outcomes or time series, outcome imputation may be necessary, provided it is model-aligned (e.g., multilevel structure respected).
Misconception 4: MI “makes up” data
MI does not fabricate certainty. It generates multiple plausible values from a statistical distribution, explicitly modeling uncertainty and incorporating it into final estimates. It’s the opposite of “data fabrication”—it’s probabilistic honesty.
Misconception 5: MI is too complex or computationally heavy
Modern statistical packages have streamlined MI workflows. With clear design and modular code, MI can be implemented in minutes, even on large datasets.
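To make the point concrete, a minimal end-to-end sketch in Stata (hypothetical variables; the reporting and diagnostics discussed above still apply):

```stata
* The whole MI pipeline in four commands: declare, register, impute, analyze
mi set wide
mi register imputed sbp bmi
mi impute chained (pmm, knn(5)) sbp bmi = age i.sex, add(20) rseed(1)
mi estimate: logistic died sbp bmi age i.sex   // pooled via Rubin's Rules
```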
How Many Imputations Are Enough?
The Number (m) is Contextual
Older rules (e.g., m = 5–10) are now seen as overly simplistic. Recommended practices include:
m at least equal to the percentage of incomplete cases (e.g., 20% incomplete → m = 20).
m tied to FMI (Fraction of Missing Information):
FMI 0.05 → m ≥ 3
FMI 0.20 → m ≥ 12
FMI 0.50 → m ≥ 59
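FMI need not be guessed: after pooling, Stata reports it per coefficient. A sketch with a hypothetical model:

```stata
* vartable displays within/between-imputation variance and FMI per coefficient
mi estimate, vartable: regress outcome sbp age i.sex
```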
Quadratic Rule and von Hippel’s Two-Stage Procedure
Use preliminary results to estimate Monte Carlo error, then refine m:
Run MI with provisional m.
Apply how_many_imputations to estimate the m required for stable standard errors (see the sketch below).
If instability is detected, increase m and rerun.
🔁 Iterative tuning of m ensures robustness without over-computation.
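A sketch of that loop in Stata, using von Hippel's user-written how_many_imputations command (available from SSC; model and variables hypothetical). The underlying quadratic rule chooses m so the Monte Carlo error in the pooled standard error stays small, roughly m ≈ 1 + ½(FMI/CV)² for a target coefficient of variation CV:

```stata
* Stage 1: pilot run with a provisional m (assumes mi set and registered data)
mi impute chained (pmm, knn(5)) sbp = age i.sex, add(20) rseed(1)
mi estimate: regress outcome sbp age i.sex

* Stage 2: from the pilot's FMI, estimate the m needed for stable SEs
* (ssc install how_many_imputations)
how_many_imputations

* If the suggested m exceeds the pilot's 20, impute the shortfall with
* add() and rerun mi estimate
```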
How Much Missingness Is Too Much?
Simulation studies suggest MI can perform well even with up to 50% missingness in key variables, provided:
MAR assumption holds reasonably,
The imputation model is rich,
Enough imputations are used (high m for high FMI),
Proper diagnostics are performed.
⚠️ Extreme missingness in small samples is riskier than in large, well-structured datasets.
Conclusion
Multiple Imputation is a powerful method—but only when applied and reported with transparency and rigor. The burden is not on the reader to guess what you did—it is on the researcher to document clearly:
What was missing?
What assumptions were made?
How were imputations executed and checked?
How were final results derived?
Proper reporting isn’t bureaucracy—it’s the bedrock of reproducible science.
Key Takeaways
Report missing data rates, assumptions, imputation logic, software, and diagnostics.
Include auxiliary variables and justify their use.
Use more imputations (m) for higher FMI—tune iteratively if needed.
Don’t fear imputing outcomes or using MI with moderate-to-high missingness.
MI is transparent modeling, not “data invention.”