Age and Cancer Risk: From Log-Odds to Probability in Logistic Regression (Image)

  • Writer: Mayta
  • 17 hours ago
  • 2 min read

🚦 Three Panels, One Story: Modeling Binary Outcomes

Imagine we're studying how the probability of developing cancer increases with age. The question is: "How does age affect risk?" The data are age (in years) and cancer status (yes/no).

Panel 1: Log-Odds (log(odds of cancer))

  • What you see: A perfectly straight, upward-sloping line.

  • Why: Logistic regression models the relationship between age and the log-odds of developing cancer.

  • Interpretation:

    • For each year older, the log-odds of developing cancer increases by the same amount (the slope, β₁).

    • This is what the logit command in Stata is fitting.

  • But: Log-odds are hard for most people to interpret directly!
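
A minimal Python sketch of Panel 1 (Python is used here only for illustration; the post itself refers to Stata). The coefficients β₀ = −6.0 and β₁ = 0.10 are made-up values, not estimates from any data. It shows that each extra decade of age adds the same amount to the log-odds:

```python
# Hypothetical coefficients chosen only for illustration (not estimated from data).
BETA_0 = -6.0   # intercept: log-odds of cancer at age 0
BETA_1 = 0.10   # slope: change in log-odds per extra year of age


def log_odds(age):
    """Linear predictor: log-odds of cancer at a given age."""
    return BETA_0 + BETA_1 * age


ages = [40, 50, 60, 70]
values = [log_odds(a) for a in ages]

# Difference in log-odds between consecutive ages (10 years apart).
steps = [later - earlier for earlier, later in zip(values, values[1:])]

for age, value in zip(ages, values):
    print(f"age {age}: log-odds = {value:.2f}")
print("10-year steps:", [round(s, 6) for s in steps])
```

Every 10-year step equals 10 × β₁, which is what "a straight line" means on this scale.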

Panel 2: Odds of Cancer

  • What you see: A curve that starts flat, then rises very fast.

  • Why:

    • Odds are the exponentiated value of log-odds (odds = exp(log-odds)).

    • They always stay positive and can get very large.

  • Interpretation:

    • If odds = 1, chance is 50:50.

    • Odds >1 means it’s more likely to happen than not; odds <1 means less likely.

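  • Clinically: Odds are easier to read than log-odds, but they are still unintuitive for common outcomes, where the odds run well above the probability (a probability of 0.8 corresponds to odds of 4).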
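
The relationships in Panel 2 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the slope β₁ = 0.10 is an assumed value, not an estimate.

```python
import math


def odds_from_log_odds(log_odds):
    """Odds are the exponentiated log-odds, so they are always positive."""
    return math.exp(log_odds)


def odds_from_probability(p):
    """Odds compare the chance of the event to the chance of no event."""
    return p / (1.0 - p)


# Log-odds of 0 correspond to odds of 1, i.e. a 50:50 chance.
print(odds_from_log_odds(0.0), odds_from_probability(0.5))

# For rare outcomes odds and probability are close; for common ones they diverge.
for p in (0.01, 0.10, 0.50, 0.80):
    print(f"probability {p:.2f} -> odds {odds_from_probability(p):.3f}")

# With an assumed slope, each extra year multiplies the odds by the same factor.
BETA_1 = 0.10  # hypothetical slope on the log-odds scale
odds_multiplier_per_year = math.exp(BETA_1)
print(f"odds multiply by {odds_multiplier_per_year:.3f} per year of age")
```

The constant multiplier is the reason the odds curve starts flat and then climbs steeply: equal additions on the log-odds scale become repeated multiplication on the odds scale.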

Panel 3: Probability of Cancer

  • What you see: The classic S-shaped ("sigmoid") curve.

  • Why:

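    • Logistic regression uses the log-odds line to calculate the probability: probability = 1 / (1 + exp(−(β₀ + β₁ × age))), where β₀ is the intercept and β₁ is the slope of the line in Panel 1.

    • This maps any value of age onto a probability between 0 and 1.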

  • Interpretation:

    • When age is low, probability is near zero.

    • As age rises, probability increases rapidly in the middle years, then levels off as it approaches 1.

  • This is what most clinicians/patients care about: “Given this age, what is the chance of developing cancer?”
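
As a rough sketch of Panel 3, the snippet below converts the log-odds line into probabilities with the logistic (sigmoid) function. The coefficients are the same made-up values as elsewhere (β₀ = −6.0, β₁ = 0.10), so the numbers are illustrative, not clinical estimates.

```python
import math

# Hypothetical coefficients chosen only for illustration (not estimated from data).
BETA_0 = -6.0
BETA_1 = 0.10


def probability(age):
    """Probability of cancer at a given age under the assumed model."""
    log_odds = BETA_0 + BETA_1 * age
    return 1.0 / (1.0 + math.exp(-log_odds))


for age in (20, 40, 60, 80, 100):
    print(f"age {age:>3}: probability = {probability(age):.3f}")
```

With these assumed values the curve crosses 50% at age 60 (where the log-odds are 0), stays close to 0 at young ages, and flattens out toward 1 at old ages.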

Connecting the Panels:

  • Logistic regression fits a straight line to log-odds (Panel 1).

  • That line translates to a sharply rising curve for odds (Panel 2).

  • Which then transforms to a smooth S-shaped probability curve (Panel 3).
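
To see the three panels side by side, here is a small Python sketch that computes all three quantities for the same ages. The helper `panels` and the coefficients (β₀ = −6.0, β₁ = 0.10) are assumptions made for illustration only.

```python
import math

# Hypothetical coefficients chosen only for illustration (not estimated from data).
BETA_0 = -6.0
BETA_1 = 0.10


def panels(age):
    """Return (log-odds, odds, probability) for one age."""
    log_odds = BETA_0 + BETA_1 * age      # Panel 1: straight line
    odds = math.exp(log_odds)             # Panel 2: exponentiate
    probability = odds / (1.0 + odds)     # Panel 3: squeeze into (0, 1)
    return log_odds, odds, probability


print(f"{'age':>4} {'log-odds':>9} {'odds':>8} {'prob':>6}")
for age in range(20, 101, 20):
    lo, od, pr = panels(age)
    print(f"{age:>4} {lo:>9.2f} {od:>8.3f} {pr:>6.3f}")

# The chain is reversible: the logit of the probability recovers the line.
lo_475, _, pr_475 = panels(47.5)
recovered_log_odds = math.log(pr_475 / (1.0 - pr_475))

# Same 10-year step in log-odds, different step in risk.
risk_step_young = panels(30)[2] - panels(20)[2]
risk_step_middle = panels(60)[2] - panels(50)[2]
print(f"risk change 20->30: {risk_step_young:.3f}, 50->60: {risk_step_middle:.3f}")
```

The last two lines illustrate the practical consequence: a fixed change on the log-odds scale produces a small change in risk where the curve is flat and a large one where it is steep.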

Why this matters in clinical research:

  • The model is linear only on the log-odds scale, which is why the risk curve is S-shaped even though the log-odds line is perfectly straight.

  • It lets you make predictions for any age—even if nobody in your data was exactly 47.5 years old.

  • You can use this for any binary outcome: disease/no disease, event/no event, mortality, admission, etc.
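
The prediction point can be illustrated with a self-contained Python sketch. Everything here is an assumption for demonstration: the data are simulated from made-up "true" coefficients, and the hypothetical `fit_logistic` helper uses plain Newton-Raphson maximum likelihood. It is not meant to reproduce the internals of Stata's logit command.

```python
import math
import random


def sigmoid(z):
    """Numerically stable inverse logit (log-odds -> probability)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(ages, outcomes, max_iter=25, tol=1e-10):
    """Fit log-odds = b0 + b1 * age by Newton-Raphson maximum likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # information matrix (negative Hessian)
        for x, y in zip(ages, outcomes):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (information matrix) * step = gradient.
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) < tol and abs(step1) < tol:
            break
    return b0, b1


# Simulated data: the "true" coefficients are made up for this example.
random.seed(0)
TRUE_B0, TRUE_B1 = -6.0, 0.10
ages = [random.uniform(20, 90) for _ in range(2000)]
outcomes = [1 if random.random() < sigmoid(TRUE_B0 + TRUE_B1 * a) else 0
            for a in ages]

b0_hat, b1_hat = fit_logistic(ages, outcomes)

# Predict at an age that need not appear anywhere in the data.
p_47_5 = sigmoid(b0_hat + b1_hat * 47.5)
print(f"estimated slope: {b1_hat:.3f}")
print(f"predicted probability at age 47.5: {p_47_5:.3f}")
```

Because the fitted line is defined for every age, the prediction at 47.5 comes from the estimated coefficients rather than from any individual observation at that exact age.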

Summary: Logistic regression is the bridge between the “linear world” (log-odds) and the “real world” (probabilities). You model in the first panel, but you interpret using the third.

Let me know if you want:

  • A real-data example with two groups (e.g., smokers vs non-smokers)

  • How to use these ideas in Stata code or a clinical paper

  • Or a deeper dive into odds ratios or margins plots!
