From Regression to Neural Networks: A Conceptual Bridge for Clinical Researchers

Abstract

Logistic regression and neural networks share a deep mathematical and conceptual structure. Both compute weighted sums of predictors, pass them through activation functions, and produce structured prediction surfaces. What neural networks automate through layers, regression achieves explicitly through polynomial or spline transformations. Understanding this bridge allows clinical researchers to visualize how each term in a regression equation corresponds to a neuron’s operation—and why including both X and X² terms is essential to capture direction and curvature simultaneously.


1. The Shared Foundation: Weighted Summation as a Neural Operation

Every neural network starts with a summation node—a neuron that aggregates weighted inputs and adds a bias term:

$$z = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n$$

For logistic regression, this is identical. The output is then transformed through a logistic activation:

$$p = \frac{1}{1 + e^{-z}} = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \cdots + \beta_n X_n)}}$$

This produces the familiar S-shaped probability curve.
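The shared computation can be sketched in a few lines of Python. This is only a minimal illustration: the function name `logistic_neuron` and the coefficient values are hypothetical, chosen so that the weighted sum comes out to exactly zero.

```python
import numpy as np

def logistic_neuron(x, beta0, betas):
    """Weighted sum of inputs plus a bias, passed through a logistic activation."""
    z = beta0 + np.dot(x, betas)        # linear predictor (pre-activation)
    return 1.0 / (1.0 + np.exp(-z))     # logistic (sigmoid) activation

# Hypothetical coefficients and predictor values, for illustration only
x = np.array([1.0, 2.0])
p = logistic_neuron(x, beta0=-1.0, betas=np.array([0.5, 0.25]))
print(p)  # z = -1 + 0.5 + 0.5 = 0, so p = 0.5
```

Read as a regression, `beta0` and `betas` are the fitted coefficients; read as a network, they are the bias and weights of a single output neuron.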


2. Building Curvature: Each Term as a “Node” with Its Own Shape

When you add a quadratic term, the equation becomes:

$$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X + \beta_2 X^2)}}$$

Each component forms its own subgraph:

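| Term | Operation | Graph Shape | Neural Analogy |
|---|---|---|---|
| β₀ | Bias | Constant offset (vertical shift of the summed curve) | Node bias |
| β₁X | Linear node | Straight line (direction) | First neuron |
| β₂X² | Quadratic node | Parabolic curve (bend) | Second neuron |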

Each of these nodes produces its own shape before being combined. The logistic function then compresses that combined curve into probabilities (0–1).

Therefore, the command

logit CHFS_bincutoff2B c.MREkPa c.MREkPa2

creates a composite decision surface: the logistic activation of a weighted sum of two distinct shapes, one linear and one curved.

That is why you cannot fit c.MREkPa2 alone. With only the squared term, the turning point of the logit is fixed at MREkPa = 0, so the fitted curve is forced to be symmetric about zero and cannot carry an independent trend direction. Including both terms frees the turning point to sit at −β1/(2β2), which lets the output graph “lean”: rise, bend, then flatten, as is typical of real biomarkers.
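A small numerical sketch makes the point concrete. The coefficients are hypothetical (not estimates from any dataset), and the helper names `linear_predictor` and `vertex` are assumptions introduced here. Completing the square shows that β0 + β1X + β2X² turns at X = −β1/(2β2), so dropping the linear term pins the turning point at X = 0.

```python
import numpy as np

def linear_predictor(x, b0, b1, b2):
    """Quadratic logit: b0 + b1*x + b2*x^2."""
    return b0 + b1 * x + b2 * x ** 2

def vertex(b1, b2):
    """Turning point of the quadratic logit, from completing the square."""
    return -b1 / (2.0 * b2)

# Hypothetical coefficients, for illustration only
b0, b1, b2 = -6.0, 2.0, -0.125

print(vertex(b1, b2))    # 8.0: with both terms, the peak can sit away from zero
print(vertex(0.0, b2))   # turning point when the linear term is dropped

# Without the linear term the logit is an even function: f(-x) equals f(x)
x = np.linspace(0.5, 10.0, 20)
sym_gap = np.max(np.abs(linear_predictor(x, b0, 0.0, b2)
                        - linear_predictor(-x, b0, 0.0, b2)))
print(sym_gap)
```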


3. The Graph-Building Logic: From Node to Outcome

Let’s visualize the operation conceptually:

Step 1. Compute individual node contributions

1️⃣ Linear node (X)

$$z_1 = \beta_1 X$$

→ Graph: Straight increasing or decreasing line.

2️⃣ Quadratic node (X²)

$$z_2 = \beta_2 X^2$$

→ Graph: U- or inverted-U shape (curved).

Step 2. Combine into a single pre-activation layer

$$Z = \beta_0 + z_1 + z_2 = \beta_0 + \beta_1 X + \beta_2 X^2$$

This “summation graph” is the raw decision surface on the log-odds scale: a parabola that opens upward when β2 > 0 and downward when β2 < 0, tilted and shifted by β1 and β0.

Step 3. Apply the activation (logistic link)

$$p = \frac{1}{1 + e^{-Z}}$$

Now the curve becomes a bounded, clinically interpretable probability — capturing both the general trend (from X) and the curvature (from X²).

This parallels what a shallow neural network does: each unit contributes a shape, and the activation compresses the combined result into a pattern that matches the observed data, except that here the shapes (X and X²) are chosen by the analyst rather than learned.
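The three steps can be traced numerically. The sketch below uses hypothetical coefficients and an illustrative grid of X values; nothing here comes from real data.

```python
import numpy as np

# Hypothetical coefficients and an illustrative grid of X values
b0, b1, b2 = -6.0, 2.0, -0.125
x = np.linspace(0.0, 8.0, 9)      # 0, 1, ..., 8

# Step 1: individual node contributions
z1 = b1 * x                       # linear node: straight line
z2 = b2 * x ** 2                  # quadratic node: parabola

# Step 2: combine into a single pre-activation value
Z = b0 + z1 + z2

# Step 3: apply the logistic activation
p = 1.0 / (1.0 + np.exp(-Z))

for xi, Zi, pi in zip(x, Z, p):
    print(f"X={xi:.0f}  Z={Zi:+.3f}  p={pi:.3f}")
```

With these particular values the log-odds cross zero at X = 4 (so p = 0.5 there) and the rise in Z slows as X approaches 8, which is where this parabola peaks.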


4. Polynomial Regression as “Manual Feature Learning”

| Modeling Level | Regression Equation | Neural Layer Analogy | Output Pattern |
|---|---|---|---|
| Linear | β₀ + β₁X | One neuron | Monotonic (↑ or ↓) |
| Quadratic | β₀ + β₁X + β₂X² | Two neurons (one linear, one curved) | Sigmoid with bend or plateau |
| Spline | β₀ + Σ β_k f_k(X) | Multi-node hidden layer | Smooth flexible curve |
| Deep NN | Learned nonlinear functions | Multiple hidden layers | Complex, multi-peak surface |

This “manual feature learning” in regression explicitly mirrors the automated hidden layer learning in neural networks. The difference is transparency: regression tells you exactly which shape each feature contributes.
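As a sketch of what “manual feature learning” means in practice, the functions below build the design-matrix columns by hand. The function names and knot positions are hypothetical. The spline here is a simple linear spline in the truncated-power basis, picked for brevity rather than because the text prescribes it; each hinge max(0, X − k) has the same form as a ReLU unit whose threshold is fixed by the analyst instead of learned.

```python
import numpy as np

def quadratic_features(x):
    """Columns [1, X, X^2]: the analyst chooses the shapes by hand."""
    return np.column_stack([np.ones_like(x), x, x ** 2])

def linear_spline_features(x, knots):
    """Columns [1, X, max(0, X - k) for each knot k] (truncated-power basis)."""
    hinges = [np.maximum(0.0, x - k) for k in knots]
    return np.column_stack([np.ones_like(x), x] + hinges)

x = np.array([1.0, 2.0, 4.0, 6.0])
print(quadratic_features(x))
print(linear_spline_features(x, knots=[3.0, 5.0]))   # knot positions are hypothetical
```

Either matrix can then be passed to any logistic or linear regression routine; the coefficients it returns are the weights on these fixed shapes.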


5. Why Including Both X and X² Is Clinically and Mathematically Correct

| Scenario | Model | Result | Interpretation |
|---|---|---|---|
| Only X² | logit Y c.X2 | Symmetric U-shape centered at X = 0 | No directionality; biologically implausible |
| X + X² | logit Y c.X c.X2 | Asymmetric curve with slope and bend | Captures both baseline trend and saturation |

The linear term defines direction (does risk rise or fall?), the quadratic term defines curvature (does it plateau or bend?). Together, they form the biologically realistic S-shaped or saturating response seen in continuous biomarkers (MRE, AST, ALT, FIB-4).
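The comparison can be tried on synthetic data. Everything in the sketch below is an assumption made for illustration: the data are simulated, the “true” coefficients are made up, and `fit_logit` is a bare-bones Newton-Raphson fitter with no convergence checks, standing in for what a statistics package does internally.

```python
import numpy as np

def fit_logit(X, y, n_iter=30):
    """Logistic regression by Newton-Raphson (a sketch, no safeguards)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

def log_likelihood(X, y, beta):
    """Bernoulli log-likelihood: sum of y*z - log(1 + exp(z))."""
    z = X @ beta
    return np.sum(y * z - np.logaddexp(0.0, z))

# Synthetic data: every number here is made up for illustration
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 8.0, size=2000)
true_logit = -6.0 + 2.0 * x - 0.125 * x ** 2
y = (rng.uniform(size=x.size) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

X_sq_only = np.column_stack([np.ones_like(x), x ** 2])   # like: logit Y c.X2
X_full = np.column_stack([np.ones_like(x), x, x ** 2])   # like: logit Y c.X c.X2

ll_sq_only = log_likelihood(X_sq_only, y, fit_logit(X_sq_only, y))
ll_full = log_likelihood(X_full, y, fit_logit(X_full, y))
print(ll_sq_only, ll_full)
```

Because the X²-only model is nested inside the full model, the full model's maximized log-likelihood can never be lower; how much higher it is depends on the data.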


6. Clinical Illustration: Fibrosis Probability by MREkPa

$$\operatorname{logit}(p) = \beta_0 + \beta_1\,\mathrm{MREkPa} + \beta_2\,\mathrm{MREkPa}^2$$

| Term | Interpretation |
|---|---|
| β₁ (linear) | Overall direction: higher MRE increases fibrosis probability |
| β₂ (quadratic) | Adjustment for curvature: captures flattening or downturn |
| Combined | Clinically realistic sigmoid-type relationship: rises fast at first, then levels off |

Graphically: Each term draws its own subcurve. Their combination forms a “master curve.” The logistic transformation then compresses it to 0–1, producing the final probability graph familiar to clinicians.

Hence, the logit model’s geometry is neural-like: a sum of shapes transformed into a bounded outcome.
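One way to read the two coefficients together is through the slope of the probability curve. By the chain rule, dp/dX = p(1 − p)(β1 + 2β2X), so the curve stops rising where β1 + 2β2X = 0. The sketch below uses hypothetical coefficients and illustrative MREkPa values, not estimates from patient data.

```python
import numpy as np

def prob(x, b0, b1, b2):
    """Predicted probability from the quadratic logit."""
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * x + b2 * x ** 2)))

def slope(x, b0, b1, b2):
    """dp/dX by the chain rule: p * (1 - p) * (b1 + 2 * b2 * X)."""
    p = prob(x, b0, b1, b2)
    return p * (1.0 - p) * (b1 + 2.0 * b2 * x)

# Hypothetical coefficients; not estimates from any patient data
b0, b1, b2 = -6.0, 2.0, -0.125
mre = np.array([2.0, 4.0, 6.0, 8.0])   # illustrative stiffness values in kPa

print(prob(mre, b0, b1, b2))
print(slope(mre, b0, b1, b2))           # slope reaches 0 at X = -b1 / (2 * b2) = 8
```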


7. If Y Is Continuous — The Same Logic Applies

For continuous outcomes, the neural analogy still holds, but the activation is identity (no sigmoid compression).

| Feature | Linear Regression | Logistic Regression | Neural Network Equivalent |
|---|---|---|---|
| Input layer | Predictors (X) | Predictors (X) | Inputs |
| Weights | β₁, β₂, … | β₁, β₂, … | Learned weights |
| Bias | β₀ | β₀ | Bias node |
| Activation | Identity | Sigmoid (inverse logit) | Nonlinear activation |
| Output | Continuous | Probability (0–1) | Output neuron |

So whether you run:

regress ALT c.MREkPa c.MREkPa2

or

logit CHFS_bincutoff2B c.MREkPa c.MREkPa2

you are performing the same neural operation, summing nodes and shaping outputs; the two differ only in the activation applied to the summed linear predictor (identity for regress, sigmoid for logit).
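The parallel can be written as one function with a swappable activation. This is a sketch with hypothetical coefficients; in a real analysis the two commands would be fitted to different outcomes and would return different coefficient values.

```python
import numpy as np

def identity(z):
    return z

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, b0, b1, b2, activation):
    """Same summation for both models; only the output activation differs."""
    z = b0 + b1 * x + b2 * x ** 2
    return activation(z)

x = np.array([2.0, 4.0, 6.0])
b0, b1, b2 = -6.0, 2.0, -0.125           # hypothetical coefficients

y_hat = predict(x, b0, b1, b2, identity)  # continuous scale, as in regress
p_hat = predict(x, b0, b1, b2, sigmoid)   # probability scale, as in logit
print(y_hat)
print(p_hat)
```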


8. Summary: Seeing the Neural Pattern Inside Every Regression

| Concept | Regression Term | Neural Analogy | Graph Effect |
|---|---|---|---|
| Intercept (β₀) | Bias | Bias node | Shifts curve vertically |
| Linear term (β₁X) | Weighted input | Node 1 | Sets direction |
| Quadratic term (β₂X²) | Nonlinear input | Node 2 | Creates curvature |
| Logistic link | Sigmoid activation | Output neuron | Compresses to 0–1 |
| Combined output | Predicted p(Y=1) | Neural output | Clinical probability |

Key Takeaway

In regression, each term is an operation that creates a graph. The model combines these shapes systematically, applies a link function, and produces a final patterned output, much as a neural network does through its layered structure. That is why the correct model is

logit CHFS_bincutoff2B c.MREkPa c.MREkPa2

and not c.MREkPa2 alone: the model needs both terms to represent direction and curvature and so form a coherent, interpretable clinical pattern.
