How to Think Causally in Clinical Research — The Counterfactual + DAG Blueprint

Mayta
2 days ago

“Causation isn’t just about what happened—it’s about what would have happened instead.”

🎯 The Clinical Dilemma

Your patient, Mr. G, age 50, has stage IV lung cancer. He receives an herbal treatment and survives.

Question: Did the herb save him?

Or would he have survived anyway?

That’s the counterfactual question, and it sits at the core of causal inference in medicine.

🔄 1. Association ≠ Causation

Let’s separate noisy thinking from causal clarity.

Concept	Meaning
Association	Knowing X helps predict Y
Causation	Changing X will change Y

Example:

“Patients who take Vitamin D have better COVID outcomes.” That’s association.

“Vitamin D improves COVID outcomes.” That’s a causal claim, and it needs a counterfactual comparison to back it up.
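To make the gap between the two statements concrete, here is a minimal Python sketch. The counts are invented for illustration and are not from any study. It assumes healthier patients are more likely to take Vitamin D and also more likely to do well, while Vitamin D itself does nothing within each health group.

```python
# Hypothetical counts, invented for illustration only:
# (baseline_health, takes_vitamin_d) -> (n_patients, n_good_outcomes)
cohort = {
    ("healthy", True): (80, 72),
    ("healthy", False): (20, 18),
    ("frail", True): (20, 10),
    ("frail", False): (80, 40),
}

def good_rate(cells):
    """Share of good outcomes across a list of (n_patients, n_good) cells."""
    return sum(good for _, good in cells) / sum(n for n, _ in cells)

def crude_difference(cohort):
    """Takers vs non-takers, ignoring baseline health."""
    takers = [cell for (_, takes), cell in cohort.items() if takes]
    non_takers = [cell for (_, takes), cell in cohort.items() if not takes]
    return good_rate(takers) - good_rate(non_takers)

def stratum_difference(cohort, health):
    """Takers vs non-takers within one baseline-health group."""
    return good_rate([cohort[(health, True)]]) - good_rate([cohort[(health, False)]])

crude = crude_difference(cohort)
within_healthy = stratum_difference(cohort, "healthy")
within_frail = stratum_difference(cohort, "frail")

print(f"crude difference:   {crude:.2f}")
print(f"within healthy:     {within_healthy:.2f}")
print(f"within frail:       {within_frail:.2f}")
```

In this made-up cohort the crude gap is 0.24 in favor of Vitamin D, yet the gap is zero inside each health group. Knowing X helps predict Y, but changing X would not change Y.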

🧬 2. The Counterfactual Logic

Key Concept:

A causal effect = Observed outcome minus Counterfactual outcome

But we never observe both for the same person. You only see one timeline: treated or untreated.
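A small Python sketch can show why this is hard. It is only possible in a simulation, where we get to write down both outcomes for each patient; the four patients and the `Patient` class below are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Patient:
    name: str
    y_treated: int    # survival (1 = yes, 0 = no) if treated
    y_untreated: int  # survival if NOT treated
    treated: bool     # what actually happened

    @property
    def individual_effect(self) -> int:
        # Causal effect for this person: needs BOTH outcomes.
        return self.y_treated - self.y_untreated

    @property
    def observed(self) -> int:
        # In real data we only ever see this one number.
        return self.y_treated if self.treated else self.y_untreated

# Hypothetical patients. "A" survives either way, so treatment did nothing for A.
patients = [
    Patient("A", y_treated=1, y_untreated=1, treated=True),
    Patient("B", y_treated=1, y_untreated=0, treated=True),
    Patient("C", y_treated=0, y_untreated=0, treated=False),
    Patient("D", y_treated=1, y_untreated=0, treated=False),
]

average_effect = sum(p.individual_effect for p in patients) / len(patients)

for p in patients:
    print(p.name, "observed:", p.observed, "true effect:", p.individual_effect)
print("average causal effect:", average_effect)
```

Patients A and B look identical in the observed data (both treated, both survived), but only B was actually helped. That difference lives entirely in the column we never get to see.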

So, how do we estimate the missing outcome?

🧪 3. Approximating the Counterfactual: Your Toolbox

We create “what-if” worlds using:

Tool	Function
RCTs	Create two comparable worlds via randomization
Matching	Pair treated patients with similar untreated ones
Stratification	Compare within subgroups (e.g., ECOG = 2)
Regression	Adjust for confounders statistically
Propensity Scores	Balance groups based on treatment likelihood
DAGs	Show where bias might sneak in (more below)
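As a rough illustration of the first row of the table, the Python sketch below compares two simulated worlds: one where a coin flip decides treatment, and one where older patients are more likely to choose it. The age range, sample size, and self-selection rule are all assumptions made for the example.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible
ages = [rng.uniform(40, 80) for _ in range(10_000)]

def mean(values):
    return sum(values) / len(values)

def age_gap(ages, treated_flags):
    """Mean age of treated minus mean age of untreated patients."""
    treated = [a for a, t in zip(ages, treated_flags) if t]
    untreated = [a for a, t in zip(ages, treated_flags) if not t]
    return mean(treated) - mean(untreated)

# World 1: a coin flip decides treatment (what an RCT does).
randomized = [rng.random() < 0.5 for _ in ages]

# World 2: the older the patient, the more likely they choose treatment
# (an assumed rule, purely for illustration).
self_selected = [rng.random() < (a - 40) / 40 for a in ages]

randomized_gap = age_gap(ages, randomized)
self_selected_gap = age_gap(ages, self_selected)

print(f"age gap under randomization:  {randomized_gap:.2f} years")
print(f"age gap under self-selection: {self_selected_gap:.2f} years")
```

Under randomization the two arms end up with nearly the same mean age, so age cannot explain an outcome difference. Under self-selection the treated group is clearly older, and the other tools in the table exist to repair that kind of imbalance.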

🔁 4. The DAG: Your Bias-Detecting Blueprint

What is a DAG?

A Directed Acyclic Graph is a visual map of your assumptions:

Arrows = causal directions
Nodes = variables
Helps decide what to adjust for

✍️ Example DAG: Does the Herbal Drug Help?

Age → Herbal Treatment → Survival
Age → Survival

Age affects both treatment and survival → confounder
Must adjust to isolate the drug’s effect.
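One way to make this DAG machine-readable is a plain Python dictionary of arrows. The sketch below finds variables that cause the treatment and also cause the outcome through a path that does not pass through the treatment. It is a simplified check for common causes, not the full back-door criterion, and the function names are made up for this example.

```python
# Each key points to the variables it directly causes.
dag = {
    "Age": ["Herbal Treatment", "Survival"],
    "Herbal Treatment": ["Survival"],
    "Survival": [],
}

def ancestors(dag, node):
    """All variables with a directed path into `node`."""
    parents = {p for p, children in dag.items() if node in children}
    found = set(parents)
    for parent in parents:
        found |= ancestors(dag, parent)
    return found

def without_node(dag, node):
    """Copy of the graph with `node` and its arrows removed."""
    return {
        p: [c for c in children if c != node]
        for p, children in dag.items()
        if p != node
    }

def common_causes(dag, treatment, outcome):
    """Causes of treatment that also reach the outcome NOT via treatment."""
    causes_of_treatment = ancestors(dag, treatment)
    causes_of_outcome = ancestors(without_node(dag, treatment), outcome)
    return causes_of_treatment & causes_of_outcome

print(common_causes(dag, "Herbal Treatment", "Survival"))  # {'Age'}
```

For larger graphs, dedicated tools handle the full set of rules; this sketch only shows that a drawn DAG is a data structure you can query before touching the outcome data.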

🛑 What to Adjust For (and What to Leave Alone):

Role	Rule	Example
Confounder	Adjust ✅	Age, ECOG
Mediator	Don’t adjust if estimating total effect ❌	Side effects
Collider	Never adjust ❌	ER admission caused by both severity and treatment

🔍 Secret Insight: Adjusting for colliders introduces false associations. DAGs protect you from this silent sabotage.
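Collider bias is easier to believe once you see it. The Python simulation below uses the ER admission example from the table: severity and treatment are generated independently, and admission happens if either is present. The 50% probabilities and the "either one triggers admission" rule are assumptions chosen to keep the sketch short.

```python
import random

rng = random.Random(42)  # fixed seed for reproducibility

patients = []
for _ in range(20_000):
    severe = rng.random() < 0.5
    treated = rng.random() < 0.5   # independent of severity by construction
    admitted = severe or treated   # collider: caused by both
    patients.append((severe, treated, admitted))

def treated_share(rows):
    return sum(treated for _, treated, _ in rows) / len(rows)

def treatment_gap(rows):
    """Share treated among severe patients minus share among mild patients."""
    severe_rows = [r for r in rows if r[0]]
    mild_rows = [r for r in rows if not r[0]]
    return treated_share(severe_rows) - treated_share(mild_rows)

overall_gap = treatment_gap(patients)
admitted_gap = treatment_gap([r for r in patients if r[2]])

print(f"gap in everyone:          {overall_gap:+.2f}")
print(f"gap among ER admissions:  {admitted_gap:+.2f}")
```

In the full population the gap is close to zero, as it should be. Restricting to admitted patients (which is what adjusting for the collider amounts to) produces a strong negative association between severity and treatment that exists nowhere in the data-generating process.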

🎲 5. Mr. G’s Real-World Counterfactual

You want to answer:

Would Mr. G have survived without the herbal treatment?

Since we can’t observe that:

Define precise inclusion/exclusion: age, cancer stage, ECOG.
Find matched patients who didn’t receive the drug.
Adjust for known confounders (e.g., smoking).

If survival still differs after these steps → evidence consistent with a causal effect, provided no important confounder is left unmeasured.
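The steps above can be sketched as exact 1:1 matching in Python. The eight records, the field names, and the choice of matching variables are hypothetical; real studies usually need looser matching rules and far more patients.

```python
from collections import defaultdict

def record(pid, age_band, stage, ecog, smoker, treated, survived):
    return {"id": pid, "age_band": age_band, "stage": stage, "ecog": ecog,
            "smoker": smoker, "treated": treated, "survived": survived}

# Hypothetical patients, invented for illustration.
records = [
    record("T1", "50-59", "IV", 1, True,  True,  1),
    record("T2", "50-59", "IV", 1, False, True,  1),
    record("T3", "50-59", "IV", 2, True,  True,  0),
    record("T4", "60-69", "IV", 1, True,  True,  1),  # no comparable control
    record("U1", "50-59", "IV", 1, True,  False, 1),
    record("U2", "50-59", "IV", 1, False, False, 0),
    record("U3", "50-59", "IV", 2, True,  False, 0),
    record("U4", "50-59", "IV", 3, True,  False, 0),  # matches nobody
]

MATCH_ON = ("age_band", "stage", "ecog", "smoker")

def match_pairs(records, keys):
    """Pair each treated patient with one untreated patient sharing all keys."""
    pool = defaultdict(list)
    for r in records:
        if not r["treated"]:
            pool[tuple(r[k] for k in keys)].append(r)
    pairs = []
    for r in records:
        if r["treated"]:
            candidates = pool[tuple(r[k] for k in keys)]
            if candidates:  # unmatched treated patients are dropped
                pairs.append((r, candidates.pop()))
    return pairs

pairs = match_pairs(records, MATCH_ON)
matched_diff = sum(t["survived"] - u["survived"] for t, u in pairs) / len(pairs)

print("matched pairs:", [(t["id"], u["id"]) for t, u in pairs])
print(f"survival difference in matched pairs: {matched_diff:.2f}")
```

Note that T4 is dropped because nobody comparable went untreated. That is a feature, not a bug: without a plausible stand-in, there is no counterfactual to estimate for that patient.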

🔄 6. Time Matters: Temporality

Causal claims demand:

Cause precedes effect
No reverse causation (the outcome must not be what drives the exposure)

If treatment happens after improvement, it can’t be the cause.
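In a dataset, this check can be as simple as comparing two dates. Here is a minimal Python sketch with hypothetical records and field names; it flags patients whose improvement was recorded before treatment started.

```python
from datetime import date

# Hypothetical records, invented for illustration.
records = [
    {"id": "P1", "treatment_start": date(2024, 1, 10), "improvement": date(2024, 2, 1)},
    {"id": "P2", "treatment_start": date(2024, 3, 5),  "improvement": date(2024, 2, 20)},
    {"id": "P3", "treatment_start": date(2024, 4, 1),  "improvement": None},  # no improvement recorded
]

def violates_temporality(rec):
    """True when improvement is dated before treatment began."""
    improved_on = rec["improvement"]
    return improved_on is not None and improved_on < rec["treatment_start"]

flagged = [rec["id"] for rec in records if violates_temporality(rec)]
print(flagged)  # ['P2']
```

Flagged patients cannot count as evidence that the treatment caused the improvement, whatever the rest of the analysis says.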

⚔️ 7. Confounding, Indication Bias & Comparability

Real-world mess:

Doctors prescribe based on prognosis = confounding by indication
Sickest avoid treatment = confounding by contraindication
Self-selection = selection bias

How to beat this:

Design with comparability in mind
Use DAGs to plan adjustments
Use propensity scores, matching, or inverse probability weighting
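As one example of the last option, here is a minimal Python sketch of inverse probability weighting under confounding by indication. The counts are invented: doctors in this made-up cohort treat poor-prognosis patients more often, and the treatment adds 10 percentage points of survival within each prognosis group. The propensity score is simply the observed share treated in each group, which is a simplification of how it is usually modeled.

```python
# Hypothetical counts: (prognosis, treated) -> (n_patients, n_survived)
cohort = {
    ("good", True): (20, 18),
    ("good", False): (80, 64),
    ("poor", True): (80, 40),
    ("poor", False): (20, 8),
}

def propensity(cohort, prognosis):
    """Probability of being treated, given prognosis."""
    n_treated = cohort[(prognosis, True)][0]
    n_untreated = cohort[(prognosis, False)][0]
    return n_treated / (n_treated + n_untreated)

def crude_survival(cohort, treated):
    cells = [cell for (_, arm), cell in cohort.items() if arm == treated]
    return sum(s for _, s in cells) / sum(n for n, _ in cells)

def weighted_survival(cohort, treated):
    """Survival in one arm, weighting each patient by 1 / P(their own arm)."""
    total_weight = 0.0
    weighted_survivors = 0.0
    for (prognosis, arm), (n, survived) in cohort.items():
        if arm != treated:
            continue
        p = propensity(cohort, prognosis)
        weight = 1 / p if treated else 1 / (1 - p)
        total_weight += weight * n
        weighted_survivors += weight * survived
    return weighted_survivors / total_weight

crude_diff = crude_survival(cohort, True) - crude_survival(cohort, False)
ipw_diff = weighted_survival(cohort, True) - weighted_survival(cohort, False)

print(f"crude difference: {crude_diff:+.2f}")
print(f"IPW difference:   {ipw_diff:+.2f}")
```

The crude comparison makes the treatment look harmful (-0.14) because the treated group started out sicker. After weighting, both arms resemble the whole cohort and the +0.10 benefit built into the data reappears. This only works because prognosis, the one confounder here, was measured.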

📘 Occurrence Equation for Causal Studies

🧠 Summary Table

Element	Function	Caution
Counterfactual	The unseen alternative	Estimate, don’t assume
DAG	Bias map	Build before analyzing
Confounder	Adjust ✅	Blocks bias
Mediator	Adjust ❌ if estimating the total effect	Adjusting removes the part of the effect that flows through it
Collider	Never adjust ❌	Causes a spurious association
RCT	Gold standard	Randomization makes the groups comparable on average

✅ Final Takeaways

Causal inference = imagining alternate realities and estimating them scientifically.
Counterfactual thinking is not optional—it’s the foundation.
DAGs are essential for identifying what’s safe (and unsafe) to adjust.
Comparability is key, and it can be achieved through design or statistics.
Always clarify the variable roles: confounder, mediator, and collider.