Sample Size for Hypothesis Testing: Understanding the BRAVES Method

Clinical Epidemiology Research | Uniqcret doctor knowledges | Methodology and Research Design

Introduction

Sample size calculation is one of the most misunderstood aspects of medical research because there is no single universal rule. The correct approach depends entirely on what the study is trying to achieve.

Designing a study is not merely about enrolling participants and running analyses. It is about anticipating the interplay between clinical importance, statistical rigor, ethical responsibility, and resource constraints. Sample size sits at the center of this balance.

The BRAVES method provides a structured way to think about sample size when the objective is hypothesis testing, while recognizing that other research objectives require entirely different logic.

Sample Size Depends on the Research Objective

Before calculating anything, the first question must be:

What is the objective of this research?

A medical study may aim to:

Each objective demands different assumptions, criteria, and stopping rules. Applying hypothesis-testing logic to all studies is a common and costly mistake.

This article focuses first on hypothesis testing, where the BRAVES method applies most directly.


Objective: Hypothesis Testing

“Is there a real effect or difference?”

Purpose

To determine whether an intervention, exposure, or factor has a statistically detectable effect that is clinically meaningful, not merely statistically non-zero.

Typical examples include:

Sample Size Logic

For hypothesis testing, sample size is chosen to ensure adequate statistical power to detect a predefined, clinically relevant effect if it truly exists.

The governing logic is error control: balancing false positives against false negatives.
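
For the common case of comparing two group means, this error-control logic can be written as a closed-form approximation. The formula below is a sketch under assumptions the text does not specify: a continuous outcome, a two-sided test, equal allocation, a common standard deviation σ, and a normal approximation. Here Δ is the clinically relevant difference and n is the number of subjects per group.

```latex
n \;\approx\; \frac{2\,\sigma^{2}\,\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2}}{\Delta^{2}}
```

Tightening α or β enlarges the z terms and therefore n, while a larger Δ or a smaller σ reduces it.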

The BRAVES Framework

BRAVES summarizes the five core design inputs that determine sample size, plus the operational layer that implements them.

| Component | Role in Sample Size | Clinical Implication |
| --- | --- | --- |
| B – Beta (β) | Controls power (1 − β) | Risk of missing a true effect |
| R – Ratio | Allocation ratio | Imbalance reduces efficiency |
| A – Alpha (α) | Type I error | Risk of false discovery |
| V – Variability | Drives standard error | More noise → larger N |
| E – Effect Size | Target difference | Must be clinically meaningful |
| S – Software | Computes N | Only as good as assumptions |

Key Criterion

Power, typically 80–90%, depending on clinical stakes.

Main question: How many subjects are needed to detect this effect with acceptable error?
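
As an illustration of how the five inputs combine, the sketch below computes group sizes for a two-sided comparison of two means using a normal approximation. The function name `sample_size_two_means` and the example numbers are hypothetical, and dedicated software will usually apply a t-distribution correction that adds a subject or two per group.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_means(alpha, beta, sigma, delta, ratio=1.0):
    """Approximate group sizes for a two-sided comparison of two means.

    alpha : Type I error rate (A)
    beta  : Type II error rate (B); power is 1 - beta
    sigma : assumed common standard deviation (V)
    delta : clinically meaningful difference to detect (E)
    ratio : allocation ratio n2 / n1 (R)
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(1 - beta)
    # Normal-approximation formula; assumes equal variance in both groups.
    n1 = ceil((1 + 1 / ratio) * (sigma * (z_alpha + z_beta) / delta) ** 2)
    n2 = ceil(ratio * n1)
    return n1, n2


# Hypothetical inputs: alpha 0.05, power 80%, SD 10, target difference 5.
print(sample_size_two_means(alpha=0.05, beta=0.20, sigma=10, delta=5))
print(sample_size_two_means(alpha=0.05, beta=0.20, sigma=10, delta=5, ratio=2))
```

With these illustrative inputs, the 1:1 design needs 63 subjects per group, while the 2:1 design needs 48 and 96 for the same power.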


Hypothesis Testing Outcome Matrix

Every hypothesis-driven study falls into one of four logical outcomes, depending on the true state of nature and the statistical decision.

| Trial Result | Truth: Effect Exists | Truth: Effect Does Not Exist |
| --- | --- | --- |
| Positive result | True positive | Type I error (α) |
| Negative result | Type II error (β) | True negative |
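
In probability terms, the four cells can be written as conditional probabilities fixed at the design stage. This restates the matrix and assumes that the design assumptions hold and that the true effect, when it exists, equals the target effect size.

```latex
\begin{aligned}
P(\text{positive result} \mid \text{effect does not exist}) &= \alpha \\
P(\text{negative result} \mid \text{effect does not exist}) &= 1 - \alpha \\
P(\text{negative result} \mid \text{effect exists}) &= \beta \\
P(\text{positive result} \mid \text{effect exists}) &= 1 - \beta \quad \text{(power)}
\end{aligned}
```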

Interpretation:


How BRAVES Shapes This Matrix

Each BRAVES component directly influences which quadrant your study is likely to fall into.

Beta (β)

Controls Type II error. A β of 0.2 accepts a 20% chance of missing a true effect of the target size. This may be unacceptable for life-saving interventions.
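
To see what a lower β costs, the sketch below compares 80% and 90% power for two equal groups. It assumes a two-sided test, a normal approximation, and a hypothetical standardized effect of 0.5 (difference divided by standard deviation); `n_per_group` is an illustrative helper, not a library function.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(beta, alpha=0.05, d=0.5):
    """Per-group n for two equal groups, two-sided test, normal approximation.

    d is the standardized effect (difference / standard deviation).
    """
    z = NormalDist()
    return ceil(2 * ((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(1 - beta)) / d) ** 2)


for beta in (0.20, 0.10):
    print(f"beta={beta:.2f}  power={1 - beta:.0%}  n per group={n_per_group(beta)}")
```

Under these assumptions, moving from 80% to 90% power raises the requirement from 63 to 85 subjects per group, roughly a one-third increase.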

Alpha (α)

Controls Type I error. Standard α = 0.05 is a convention, not a law. Stricter thresholds may be warranted in high-stakes or multiplicity-heavy trials.
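
A stricter α has a direct sample size cost. The sketch below uses a Bonferroni split of α = 0.05 across five comparisons as one possible example of a stricter threshold. The adjustment method, the number of comparisons, and the standardized effect of 0.5 are assumptions for illustration only.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(alpha, power=0.80, d=0.5):
    """Per-group n for two equal groups, two-sided test, normal approximation."""
    z = NormalDist()
    return ceil(2 * ((z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / d) ** 2)


n_single = n_per_group(alpha=0.05)        # one primary comparison
n_adjusted = n_per_group(alpha=0.05 / 5)  # Bonferroni across 5 comparisons (assumed)

print(n_single, n_adjusted)
```

With these inputs the per-group requirement rises from 63 to 94.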

Effect Size

Defines what matters clinically. Smaller target effects require larger samples. Planning around an unrealistically large effect size produces a sample too small to detect the true, smaller effect, leaving the study underpowered.
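
The cost of an optimistic effect size can be made concrete. The sketch below sizes a study for a hypothetical standardized effect of 0.5 and then asks what power that sample actually delivers if the true effect is 0.25. It assumes two equal groups, a two-sided α of 0.05, and a normal approximation; both helper functions are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def n_per_group(d, alpha=0.05, power=0.80):
    """Per-group n needed to detect standardized effect d."""
    return ceil(2 * ((Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(power)) / d) ** 2)


def achieved_power(n, d, alpha=0.05):
    """Approximate power with n per group when the true standardized effect is d.

    Ignores the negligible chance of rejecting in the wrong direction.
    """
    return Z.cdf(d / sqrt(2 / n) - Z.inv_cdf(1 - alpha / 2))


n_planned = n_per_group(d=0.5)                 # sized for the optimistic effect
power_if_smaller = achieved_power(n_planned, d=0.25)
n_needed = n_per_group(d=0.25)                 # what the smaller effect requires

print(n_planned, round(power_if_smaller, 2), n_needed)
```

Halving the target effect roughly quadruples the required sample (63 versus 252 per group here), and the study sized for the larger effect has under 30% power against the smaller one.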

Variability

Higher variability dilutes the signal. Underestimating variability is one of the most common causes of failed trials.
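
Because required sample size scales with the variance, a modest underestimate of the standard deviation has a large effect. The sketch below plans with a hypothetical SD of 10 and a target difference of 5, then checks what happens if the true SD is 12. It assumes two equal groups, a two-sided α of 0.05, and a normal approximation.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def n_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Per-group n to detect a mean difference delta with common SD sigma."""
    return ceil(2 * (sigma * (Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(power)) / delta) ** 2)


def achieved_power(n, sigma, delta, alpha=0.05):
    """Approximate power with n per group; ignores the far rejection tail."""
    se = sigma * sqrt(2 / n)  # standard error of the difference in means
    return Z.cdf(delta / se - Z.inv_cdf(1 - alpha / 2))


n_planned = n_per_group(sigma=10, delta=5)   # planning assumption
n_required = n_per_group(sigma=12, delta=5)  # if the true SD is 20% higher
power_actual = achieved_power(n_planned, sigma=12, delta=5)

print(n_planned, n_required, round(power_actual, 2))
```

A 20% underestimate of the SD means the study needed 91 rather than 63 subjects per group, and the planned sample delivers roughly 65% power instead of 80%.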

Ratio

Unequal allocation may be ethically or logistically justified, but for a fixed total N it reduces power; matching the power of a 1:1 design requires a larger total N.
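
The size of that compensation follows from the standard error of a difference in means. Under the assumptions of equal variance in both groups and a normal approximation, a k:1 allocation needs (1 + k)² / (4k) times the total N of a 1:1 design to keep the same power. The helper name below is hypothetical.

```python
def total_n_inflation(k):
    """Factor by which total N must grow under k:1 allocation to match 1:1 power.

    Assumes equal variance in both groups and a normal approximation.
    """
    return (1 + k) ** 2 / (4 * k)


for k in (1, 2, 3):
    print(f"{k}:1 allocation -> total N x {total_n_inflation(k):.3f}")
```

A 2:1 design therefore needs about 12.5% more subjects in total, and a 3:1 design about 33% more.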

Software

Tools automate calculation but cannot justify assumptions. Inputs must reflect clinical reality, not convenience.


A Critical Insight: Power Is a Clinical Decision

Power is often treated as a statistical default rather than a clinical judgment.

Missing a modest benefit in oncology is not equivalent to missing a modest benefit in allergic rhinitis. The acceptable risk of Type II error must reflect:

Sample size is, therefore, not just mathematics: it is ethics, economics, and epistemology combined.


Key Takeaways