From Regression to Neural Networks: A Conceptual Bridge for Clinical Researchers
- Mayta

- Oct 10
- 4 min read
Abstract
Logistic regression and neural networks share a deep mathematical and conceptual structure. Both compute weighted sums of predictors, pass them through activation functions, and produce structured prediction surfaces. What neural networks automate through layers, regression achieves explicitly through polynomial or spline transformations. Understanding this bridge allows clinical researchers to visualize how each term in a regression equation corresponds to a neuron’s operation—and why including both X and X² terms is essential to capture direction and curvature simultaneously.
1. The Shared Foundation: Weighted Summation as a Neural Operation
Every neural network starts with a summation node: a neuron that aggregates weighted inputs and adds a bias term,

z = β₀ + β₁X₁ + β₂X₂ + … + βₚXₚ

For logistic regression, this weighted sum is identical. The output is then transformed through a logistic activation,

P(Y = 1) = 1 / (1 + e^(−z))

This produces the familiar S-shaped probability curve.
In Stata,
logit CHFS_bincutoff2B c.MREkPa
creates exactly one neuron: one linear input (MREkPa) passed through a sigmoid activation. The resulting graph is a smooth, monotonic S-curve, increasing or decreasing throughout depending on the sign of the coefficient.
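To see the two stages of this single "neuron" explicitly, you can recover the pre-activation and its sigmoid transform after fitting. A minimal sketch, assuming the analysis dataset with CHFS_bincutoff2B and MREkPa is already in memory (zhat, phat, and phat_check are illustrative variable names):

logit CHFS_bincutoff2B c.MREkPa
predict zhat, xb                 // pre-activation: the weighted sum b0 + b1*MREkPa
generate phat = invlogit(zhat)   // activation: 1/(1 + exp(-zhat))
predict phat_check, pr           // the same probabilities, computed directly by Stata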
2. Building Curvature: Each Term as a “Node” with Its Own Shape
When you add a quadratic term, the pre-activation becomes

z = β₀ + β₁X + β₂X²

Each component (the linear term β₁X and the quadratic term β₂X²) forms its own subgraph.
Each of these nodes produces its own activation shape before being combined. The logistic function then compresses that combined curve into probabilities (0–1).
Therefore:
logit CHFS_bincutoff2B c.MREkPa c.MREkPa2
creates a composite decision surface: the logistic activation of a weighted sum of two distinct shapes (a straight line plus a curve).
That is why you should not fit c.MREkPa2 alone:
c.MREkPa2 alone would force the model to build a symmetric U-shaped probability curve, losing the main trend direction.
Including both terms allows the output graph to “lean” — rise, bend, then plateau — exactly as seen in real biomarkers.
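In practice, the squared predictor has to exist as a variable before it can enter the model, or be requested through factor-variable notation. A minimal sketch, assuming MREkPa is in memory (MREkPa2 is simply MREkPa squared, matching the name used above):

generate MREkPa2 = MREkPa^2
logit CHFS_bincutoff2B c.MREkPa c.MREkPa2

* Equivalent model using factor-variable notation, with no generated variable:
logit CHFS_bincutoff2B c.MREkPa##c.MREkPa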
3. The Graph-Building Logic: From Node to Outcome
Let’s visualize the operation conceptually:
Step 1. Compute individual node contributions
1️⃣ Linear node (X)
→ Graph: Straight increasing or decreasing line.
2️⃣ Quadratic node (X²)
→ Graph: U- or inverted-U shape (curved).
Step 2. Combine into a single pre-activation layer
This “summation graph” is the raw decision surface: often a smooth hump or a sigmoid-like curve, depending on the signs of the β coefficients.
Step 3. Apply the activation (logistic link)
Now the curve becomes a bounded, clinically interpretable probability — capturing both the general trend (from X) and the curvature (from X²).
This is precisely what a shallow neural network does:
Each neuron creates a shape, then the activation compresses and fuses them into a pattern that matches the observed data.
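To make the parallel concrete, compare the two prediction formulas, where σ is the logistic function 1 / (1 + e^(−z)) from Section 1 and aⱼ, bⱼ, wⱼ are weights the network learns:

One-hidden-layer network with activation h:  P(Y = 1) = σ( w₀ + w₁·h(a₁ + b₁X) + w₂·h(a₂ + b₂X) + … )

Quadratic logistic model:  P(Y = 1) = σ( β₀ + β₁X + β₂X² )

The hidden units h(aⱼ + bⱼX) are shapes the network learns; X and X² are shapes of the same kind, specified by hand.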
4. Polynomial Regression as “Manual Feature Learning”
Hand-building transformed terms such as X² (or spline basis terms) is “manual feature learning”: it explicitly mirrors the automated feature learning that a neural network performs in its hidden layers.
The difference is transparency: regression tells you exactly which shape each feature contributes.
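The same idea extends to splines. A minimal sketch using Stata's mkspline to build restricted cubic spline terms (the stub name mre_s and the choice of 4 knots are illustrative), whose generated basis variables play the same role as X and X²:

mkspline mre_s = MREkPa, cubic nknots(4)      // creates mre_s1, mre_s2, mre_s3
logit CHFS_bincutoff2B c.mre_s1 c.mre_s2 c.mre_s3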
5. Why Including Both X and X² Is Clinically and Mathematically Correct
The linear term defines direction (does risk rise or fall?), while the quadratic term defines curvature (does it plateau or bend?).
Together, they form the biologically realistic S-shaped or saturating response seen in continuous biomarkers (MRE, AST, ALT, FIB-4).
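One way to see this: for z = β₀ + β₁X + β₂X², the slope is dz/dX = β₁ + 2β₂X and the curvature is d²z/dX² = 2β₂. The linear coefficient anchors the direction of the trend, the quadratic coefficient controls how it bends, and dropping β₁X forces the slope to change sign exactly at X = 0, which is the symmetric U shape described earlier.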
6. Clinical Illustration: Fibrosis Probability by MREkPa
Graphically: Each term draws its own subcurve. Their combination forms a “master curve.” The logistic transformation then compresses it to 0–1, producing the final probability graph familiar to clinicians.
Hence, the logit model’s geometry is neural-like: a sum of shapes transformed into a bounded outcome.
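To draw that probability graph from the fitted model, margins and marginsplot can trace the predicted probability across a grid of MREkPa values. A minimal sketch; the factor-variable form is used so that margins recomputes the squared term automatically, and the 1 to 10 kPa grid is an illustrative choice, not from the original analysis:

logit CHFS_bincutoff2B c.MREkPa##c.MREkPa
margins, at(MREkPa=(1(0.5)10))
marginsplot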
7. If Y Is Continuous — The Same Logic Applies
For continuous outcomes, the neural analogy still holds, but the activation is identity (no sigmoid compression).
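Concretely, with the identity activation the prediction is the pre-activation itself, E[Y | X] = β₀ + β₁X + β₂X², so the fitted curve keeps its line-plus-parabola shape instead of being compressed into the 0–1 range.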
So whether you run:
regress ALT c.MREkPa c.MREkPa2
or
logit CHFS_bincutoff2B c.MREkPa c.MREkPa2
you are performing the same neural operation (summing nodes and shaping outputs); the two models differ only in the activation applied to Y.
8. Summary: Seeing the Neural Pattern Inside Every Regression
Key Takeaway
In regression, each term is an operation that creates a graph. The model combines these shapes systematically, applies a link function, and produces a final patterned output, exactly as a neural network does through its layered structure. That is why the correct model is
logit CHFS_bincutoff2B c.MREkPa c.MREkPa2
and not c.MREkPa2 alone: every neural-like model must preserve both direction and curvature to form a coherent and interpretable clinical pattern.




