
Step-by-Step Guide to Categorical Data and Effect Measures in Network Meta-Analysis (NMA)

  • Writer: Mayta

0) Frame the clinical question & endpoint

What it is: Define your PICO/PICOT and the binary outcome (event vs no event), its direction (“good” or “bad”), and time window.

Why we do it: Clear framing prevents downstream mixing of incomparable endpoints or time horizons, and anchors interpretation (e.g., OR < 1 means benefit when the outcome is adverse).

Core focus

  • PICO/PICOT scope and eligibility criteria

  • Exact binary endpoint definition across trials

  • Direction of benefit (which side of 1.0 is “better”)

Typical outputs

  • Protocolized question statement and eligibility table

  • Outcome dictionary (definitions, windows, handling of competing risks)

  • Pre‑specified primary effect measure (often OR) and secondary measures

In your biologics manuscript, the team predefined outcomes and then selected effect measures per endpoint (ORs for oral corticosteroid [OCS] reduction, IRRs for exacerbations).

1) Choose the effect measure

What it is: Select a single primary scale for synthesis: the Odds Ratio (OR) is most common for binary meta-analysis/NMA; RR or RD may be secondary.

Why we do it: Consistent scaling avoids incoherence and facilitates network modeling and ranking; ORs behave consistently across baseline risks.

Core focus

  • Primary: OR on the log scale (analysis happens on log(OR))

  • Secondary (optional): RR, RD for absolute impact/NNT

Typical outputs

  • Rationale for measure choice

  • Conversion rules (if some trials report different measures)

  • Back‑transformed pooled estimates for clinical reading

Your NMA examples pooled ORs on the log scale for binary/ordinal dose‑reduction outcomes before ranking.
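For the “conversion rules” and “back‑transformed estimates” outputs above, here is a minimal R sketch with hypothetical numbers; the OR‑to‑RR conversion uses the standard identity RR = OR / (1 - p0 + p0*OR) at an assumed baseline risk p0:

  # Hypothetical pooled estimate for an adverse outcome: log(OR) = -0.40
  log_or <- -0.40
  or <- exp(log_or)                    # back-transform for clinical reading

  p0 <- 0.30                           # assumed baseline (control-group) risk
  p1 <- (or * p0) / (1 - p0 + or * p0) # implied risk under treatment
  rr <- p1 / p0                        # relative risk at this baseline risk
  rd <- p1 - p0                        # risk difference (absolute impact)
  nnt <- 1 / abs(rd)                   # number needed to treat

  round(c(OR = or, RR = rr, RD = rd, NNT = nnt), 3)

Note that the RR and RD implied by a fixed OR change with the baseline risk, which is exactly why the OR is kept as the synthesis scale and absolute measures are derived afterward.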

2) Build analyzable contrasts from trial data

What it is: Create study‑level contrasts (log(OR), SE[log(OR)]) from arm‑level counts (r, n), or use reported contrasts consistently.

Why we do it: Contrast‑based data are the lingua franca for synthesis, handle multi‑arm trials correctly, and feed the network model.

Core focus

  • Arm‑level → contrast‑level transformation

  • Handling zero cells (continuity corrections or robust methods)

  • Multi‑arm correlation (avoid double‑counting shared controls)

Typical outputs

  • A “contrast sheet” listing each comparison’s log(OR), SE, treat1, treat2, study label

Example: In the classic antihypertensive–diabetes NMA, a separate spreadsheet of 45 two‑by‑two contrasts (log ORs + SEs) was prepared specifically to feed the network model.
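In R, the pairwise() helper in the netmeta package performs the arm‑level → contrast‑level step. A minimal sketch with an invented data frame (study names and column names are hypothetical); netmeta() later uses the shared study label to handle multi‑arm correlation:

  library(netmeta)

  # Hypothetical arm-level data: one row per treatment arm (r = events, n = total)
  arms <- data.frame(
    study     = c("Smith 2018", "Smith 2018", "Lee 2020", "Lee 2020", "Lee 2020"),
    treatment = c("A", "Placebo", "A", "B", "Placebo"),
    events    = c(12, 20, 8, 10, 18),
    total     = c(100, 100, 90, 92, 95)
  )

  # Arm-level counts -> contrast-level log(OR) and SE; incr = 0.5 applies a
  # continuity correction only where a cell count is zero
  contrasts <- pairwise(treat = treatment, event = events, n = total,
                        studlab = study, data = arms, sm = "OR", incr = 0.5)

  contrasts[, c("studlab", "treat1", "treat2", "TE", "seTE")]  # the "contrast sheet"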

3) Fit the synthesis model (start with random‑effects)

What it is: Combine study contrasts using a random‑effects model; for multiple treatments, fit a frequentist NMA (e.g., netmeta).

Why we do it: Random‑effects modeling acknowledges real‑world between‑study variability. NMA integrates direct + indirect evidence across a network, enabling all pairwise comparisons.

Core focus

  • Random‑effects variance (τ²) estimator (REML / Paule–Mandel)

  • Common reference (for presentation only; the network itself is reference‑free)

  • Model convergence and plausibility checks

Typical outputs

  • Pooled log(OR) estimates and 95% CIs per treatment vs reference

  • τ² estimate and model diagnostics

  • Forest plot vs reference (interpret on OR scale)

Example: My team’s NMA used a frequentist random‑effects approach (R netmeta) and then produced pooled estimates and ranks.
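A minimal netmeta sketch, continuing the hypothetical contrast sheet from Step 2 (argument names follow recent netmeta releases; older versions use comb.fixed/comb.random instead of common/random, so treat this as a template rather than the paper’s exact code):

  # Frequentist random-effects NMA on the log(OR) scale
  net <- netmeta(TE = TE, seTE = seTE, treat1 = treat1, treat2 = treat2,
                 studlab = studlab, data = contrasts, sm = "OR",
                 common = FALSE, random = TRUE,
                 reference.group = "Placebo",  # for presentation only
                 method.tau = "REML")          # between-study variance estimator

  summary(net)   # pooled ORs vs reference with 95% CIs, plus tau^2
  forest(net)    # forest plot vs the reference group (OR scale)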

4) Assess heterogeneity (within‑comparison variability)

What it is: Quantify how much the true effects differ across studies that assess the same contrast.

Why we do it: High heterogeneity weakens a single pooled summary and signals effect modification, design differences, or quality issues.

Core focus

  • Cochran’s Q, I² (%), and τ² on the log(OR) scale

  • Visual check: study‑level forest plot

  • Pre‑planned exploration of sources (population, dose, follow‑up, risk of bias)

Typical outputs

  • Q test p‑value; I² bands (≈25/50/75%); τ² magnitude

  • Narrative on likely drivers; plan for subgroup/meta‑regression if warranted

Heterogeneity in your biologics NMA was explicitly assessed using Cochran’s Q and I² before moving to network‑level checks.
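The heterogeneity statistics can be read off the fitted netmeta object; the slot names below follow the current netmeta documentation, so verify them against your installed version:

  # Heterogeneity statistics on the log(OR) scale
  net$Q        # Cochran's Q for the network
  net$pval.Q   # p-value of the Q test
  net$I2       # I^2: proportion of variability beyond chance
  net$tau2     # tau^2: the random-effects variance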

5) Check transitivity & consistency (network validity)

What it is: Transitivity = comparability of studies across treatment comparisons (similar distributions of effect modifiers). Consistency = agreement of direct and indirect evidence.

Why we do it: NMA’s core promise is valid indirect inference; without transitivity and consistency, ranks and league tables are unreliable.

Core focus

  • Transitivity: compare effect‑modifier distributions across comparisons (biomarkers, disease severity, background therapy)

  • Consistency:

    • Global (design‑by‑treatment test / model incoherence)

    • Local (node‑splitting: direct vs indirect for a given pair)

Typical outputs

  • Transitivity table/figure

  • Global test/incoherence metric and p‑value; node‑split estimates and p‑values

  • Action if violated (stratified networks, meta‑regression, or cautious interpretation)

Example: The biologics paper prespecified transitivity, tested global and node‑level consistency, and used a comparison‑adjusted funnel plot when appropriate. In the diabetes NMA, incoherence (ω) was very low (≈1.7×10⁻⁵), supporting internal agreement of the network model.
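In netmeta, the global and local checks map onto two calls; a minimal sketch, continuing with the hypothetical net object from Step 3:

  # Global: design-by-treatment interaction (Q decomposed into
  # within-design heterogeneity and between-design inconsistency)
  decomp.design(net)

  # Local: node-splitting -- direct vs indirect estimate for every
  # comparison that has both sources, with a p-value for their difference
  ns <- netsplit(net)
  print(ns)
  forest(ns)   # forest plot of direct vs indirect estimates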

6) Rank treatments (SUCRA / P‑score) & visualize uncertainty (rankograms)

What it is: Compute rank probabilities for each treatment (1st, 2nd, …, kth); summarize as SUCRA (0–1) or frequentist P‑scores; show the full rank distribution via rankograms.

Why we do it: Clinicians need a synthesized hierarchy, but we must also show uncertainty, not just a single rank.

Core focus

  • Rank probabilities derived from the NMA estimates and their uncertainty

  • SUCRA = surface under the cumulative rank curve (higher = better rank)

  • Interpret ranks alongside effect sizes, not instead of them

Typical outputs

  • Table of SUCRA/P‑scores with CIs if available

  • Rankograms (bar/line) per treatment, plus cumulative rankograms

Your team explicitly computed rank probabilities and SUCRA, and presented rankograms as the visual counterpart.
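A sketch of the ranking calls in netmeta; for an adverse outcome, lower ORs should rank higher, which is what small.values encodes (its accepted spellings vary slightly across netmeta versions):

  # P-scores: the frequentist analogue of SUCRA (higher = better rank)
  netrank(net, small.values = "desirable")

  # Rank probabilities and SUCRAs via resampling, plus rankograms
  rg <- rankogram(net, nsim = 1000)
  print(rg)   # P(rank = 1), P(rank = 2), ... per treatment, and SUCRA
  plot(rg)    # rankograms: the full rank distribution per treatment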

7) Present comparative results (league table, network plot, forest vs reference)

What it is: Translate the network into decision‑ready displays:

  • League table: all pairwise ORs with 95% CIs

  • Network plot: nodes (treatments) and edges (direct trials), node size ∝ n, edge width ∝ evidence

  • Forest vs reference: quick clinical read of each treatment vs chosen anchor

Why we do it: Stakeholders must answer “A or B?” quickly, understand the evidence structure, and see where indirect evidence dominates.

Core focus

  • Clear directionality in the league (column vs row convention)

  • Ordering by SUCRA/P‑score (with a warning that ranks ≠ certainty)

  • Network connectivity and balance (star vs richly connected)

Typical outputs

  • League table arranged by rank

  • Network graph (node/edge‑weighted)

  • Forest plot vs reference (e.g., placebo or standard care)

Your biologics manuscript built league tables ordered by SUCRA and included a network graph with node/edge encodings; these are the standard outputs your professor emphasizes. The diabetes NMA also demonstrated stable rank ordering even when the reference was switched (from diuretic to placebo), a key interpretability point.
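The three displays correspond to three netmeta calls; a minimal sketch (the reference treatment name is hypothetical):

  # League table: all pairwise ORs with 95% CIs
  # (use the 'seq' argument to order rows/columns, e.g., by P-score)
  netleague(net, digits = 2)

  # Network plot: nodes = treatments, edge width ~ evidence;
  # number.of.studies = TRUE labels each edge with its trial count
  netgraph(net, number.of.studies = TRUE)

  # Forest plot vs the chosen anchor
  forest(net, reference.group = "Placebo")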

8) Assess small‑study effects / publication bias

What it is: In networks with more than 10 studies (k > 10), use a comparison‑adjusted funnel plot to assess asymmetry suggestive of small‑study effects or publication bias.

Why we do it: Differential reporting or small‑study inflation can distort pooled effects and ranks.

Core focus

  • Visual asymmetry tests (caution with low k)

  • Narrative synthesis; consider study size, setting, and risk‑of‑bias domains

Typical outputs

  • Comparison‑adjusted funnel plot (and, if appropriate, a brief statistical test)

  • A reasoned statement on likely small‑study effects and their impact on conclusions

Your biologics NMA specified comparison‑adjusted funnels for networks with >10 studies—a prudent standard.
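A sketch of the comparison‑adjusted funnel plot in netmeta; order defines which treatment in each pair counts as “newer” (the ordering here is invented), and method.bias requests an asymmetry test:

  # Comparison-adjusted funnel plot (only meaningful with >10 studies)
  funnel(net,
         order = c("Placebo", "A", "B"),  # hypothetical treatment ordering
         method.bias = "Egger",           # linear-regression asymmetry test
         legend = TRUE)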

9) Contributions, sensitivity, and certainty of evidence

What it is: Make the synthesis auditable and robust: show which direct comparisons contribute to which network estimates, probe robustness with sensitivity analyses, and appraise certainty (e.g., with CINeMA: Confidence in Network Meta‑Analysis domains).

Why we do it: Stakeholders need to know which comparisons drive the estimates, how results change under perturbations, and how much confidence they can place in the conclusions.

Core focus

  • Contribution matrix (evidence flow)

  • Sensitivity ladders (exclude high‑risk‑of‑bias trials; remove small studies; alternative τ²; population or biomarker strata)

  • Certainty/credibility across risk of bias, imprecision, inconsistency, indirectness, publication bias

Typical outputs

  • Contribution heatmap/table to target sensitivity checks

  • Sensitivity results (narrative + key re‑estimates)

  • Certainty summary (e.g., high/moderate/low/very low with reasons)

The diabetes NMA ran multiple one‑way sensitivity analyses (removing specific trial types, reassigning drug classes) and found the estimates robust; this is exemplary practice. Your biologics paper applied a confidence‑in‑NMA (CINeMA) framework to rate evidence across domains after the quantitative synthesis.
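A minimal sketch of the contribution matrix and one rung of a sensitivity ladder, continuing the hypothetical objects from earlier steps (the exclusion criterion below is invented for illustration):

  # Contribution matrix: how much each direct comparison
  # contributes to each network estimate (evidence flow)
  netcontrib(net)

  # One rung of a sensitivity ladder: refit without a hypothetical
  # high-risk-of-bias study and compare the re-estimates
  keep <- contrasts$studlab != "Lee 2020"   # invented exclusion
  net_sens <- netmeta(TE, seTE, treat1, treat2, studlab,
                      data = contrasts[keep, ], sm = "OR",
                      common = FALSE, random = TRUE, method.tau = "REML")
  summary(net_sens)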

At‑a‑glance mapping to the professor’s pattern

  • Random‑effects frequentist NMA (netmeta) → pooled ORs/log(OR) and τ².

  • Heterogeneity quantified with Q, I², τ² (Step 4).

  • Consistency checked globally (design‑by‑treatment/incoherence) and locally (node‑splitting) (Step 5).

  • Ranking via SUCRA/P‑score with rankograms (Step 6).

  • Comparative displays: league table, network plot, forest vs reference (Step 7).

  • Small‑study effects: comparison‑adjusted funnel when k > 10 (Step 8).

  • Contribution & certainty: contribution matrix + confidence in NMA (Step 9).


Final note on interpretation

Always lead with effect sizes and their CIs, not ranks alone. Read ranks with consistency diagnostics, heterogeneity, and certainty judgments; your own examples model this restraint well.
