What Is the Split Rule (Discrimination Rule) in Random Forest? Gini vs Extra Trees Explained
- Mayta

What is the Split Rule?
At each node in a decision tree, the algorithm must decide:
“Where should I split this feature to best separate the outcome?”
This decision is governed by the split rule (criterion).
In Random Forest, the most common split rules are:
- Gini impurity with an exhaustive search over thresholds (standard Random Forest)
- Random thresholds scored by the same impurity measure (Extremely Randomized Trees, or Extra Trees)
The key difference lies in how the split threshold is chosen.
The Core Difference: How a Split Point is Chosen
Consider a single feature:
- Feature: Age
- Values: 22, 35, 41, 55, 63, 70, 78
Standard Random Forest (Gini impurity)
Process:
| Candidate split point | Left group | Right group | Result |
| --- | --- | --- | --- |
| Between 22–35 | [22] | [35, 41, 55, 63, 70, 78] | Evaluate impurity |
| Between 35–41 | [22, 35] | [41, 55, 63, 70, 78] | Evaluate impurity |
| Between 41–55 | [22, 35, 41] | [55, 63, 70, 78] | Evaluate impurity |
| Between 55–63 | [22, 35, 41, 55] | [63, 70, 78] | Evaluate impurity |
The algorithm:
- Evaluates all possible split points
- Computes the weighted impurity (e.g., Gini) of the resulting child nodes for each
- Selects the split with the lowest impurity (best separation)
Interpretation:
- Exhaustive search
- Always selects the optimal split
- Deterministic given the data
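The exhaustive search above can be sketched in a few lines of Python. The binary outcome labels here are hypothetical, invented only to make the Age example runnable:

```python
import numpy as np

def gini(labels):
    """Gini impurity of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(values, labels):
    """Exhaustive search: score every midpoint between adjacent sorted values."""
    order = np.argsort(values)
    values, labels = values[order], labels[order]
    best_thr, best_imp = None, np.inf
    for i in range(1, len(values)):
        thr = (values[i - 1] + values[i]) / 2
        left, right = labels[:i], labels[i:]
        # weighted average impurity of the two child nodes
        imp = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if imp < best_imp:
            best_thr, best_imp = thr, imp
    return best_thr, best_imp

age = np.array([22, 35, 41, 55, 63, 70, 78])
outcome = np.array([0, 0, 0, 1, 1, 1, 1])  # hypothetical binary labels
thr, imp = best_split(age, outcome)
# with these labels the split between 41 and 55 separates the classes perfectly
```

With the hypothetical labels, the search lands on the threshold 48.0 (the midpoint of 41 and 55), where the weighted child impurity is 0.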
Extra Trees (Extremely Randomized Trees)
Process:
| Step | Action |
| --- | --- |
| 1 | Randomly draw one split point within the feature's range |
| 2 | Apply that split directly |
| 3 | Do not search for a better threshold on that feature (the best among the randomly thresholded candidate features is still kept) |
Example:
Random split generated: Age < 52
- Left: [22, 35, 41]
- Right: [55, 63, 70, 78]
Interpretation:
- No search for the best split
- Uses one random threshold
- Stochastic (random) decision
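The single-feature step can be sketched as below. Note that in the full Extra Trees algorithm one random threshold is drawn per candidate feature and the best of those candidates is kept; this sketch shows only the per-feature step the table describes:

```python
import random

def random_split(values, rng):
    """Extra-Trees-style step: draw one uniform threshold in the feature's range."""
    thr = rng.uniform(min(values), max(values))
    left = [v for v in values if v < thr]
    right = [v for v in values if v >= thr]
    return thr, left, right

age = [22, 35, 41, 55, 63, 70, 78]
thr, left, right = random_split(age, random.Random(0))
# thr is random; no impurity comparison against other thresholds on Age is made
```

Re-running with a different seed yields a different threshold, which is exactly the source of the extra tree diversity discussed below.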
Visual Analogy
Target analogy
| Method | Strategy |
| --- | --- |
| Gini (Standard RF) | Tests many positions and selects the best |
| Extra Trees | Picks one random position |
What Happens Across the Forest
Standard Random Forest
| Property | Behavior |
| --- | --- |
| Feature selection | Random (controlled by mtry) |
| Data sampling | Bootstrap |
| Split selection | Optimal (deterministic) |
Result:
- Trees are strong (high-quality splits)
- Trees are more similar (correlated)
Extra Trees
| Property | Behavior |
| --- | --- |
| Feature selection | Random |
| Data sampling | Whole training set by default (no bootstrap), though bootstrapping can be enabled |
| Split selection | Random threshold |
Result:
- Trees are weaker individually
- Trees are more different (less correlated)
Bias–Variance Trade-off
| Method | Bias | Variance | Explanation |
| --- | --- | --- | --- |
| Single decision tree | Low | High | Overfits the training data |
| Standard Random Forest | Moderate | Moderate | Balanced |
| Extra Trees | Slightly higher | Lower | Extra randomness reduces variance |
Interpretation
Gini (standard RF):
- Lower bias
- Higher correlation between trees

Extra Trees:
- Slightly higher bias
- Lower variance due to greater diversity
Effect on Individual Trees
Standard Random Forest
| Tree 1 | Tree 2 |
| --- | --- |
| Age < 48 | Age < 48 |
| SBP < 120 | SBP < 125 |
Pattern:
- Similar splits across trees
- Trees are correlated
Extra Trees
| Tree 1 | Tree 2 |
| --- | --- |
| Age < 52 | Age < 37 |
| SBP < 108 | SBP < 135 |
Pattern:
- Different splits across trees
- Trees are less correlated
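This spread in thresholds can be checked empirically. The sketch below uses scikit-learn on a synthetic one-feature dataset (not the clinical example above) so every tree's root split is directly comparable; with one informative feature, optimal RF splits cluster near the true class boundary while Extra Trees thresholds scatter across the feature's range:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

# One informative feature so every tree's root threshold measures the same thing
X, y = make_classification(n_samples=400, n_features=1, n_informative=1,
                           n_redundant=0, n_clusters_per_class=1,
                           random_state=0)

def root_thresholds(forest):
    """Root-node split threshold of each tree in a fitted ensemble."""
    return np.array([t.tree_.threshold[0] for t in forest.estimators_])

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
et = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(X, y)

rf_spread = root_thresholds(rf).std()  # optimal splits: tight cluster
et_spread = root_thresholds(et).std()  # random thresholds: wide spread
```

The larger standard deviation of the Extra Trees root thresholds is a direct measure of the decorrelation the tables above describe.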
When Each Approach Performs Better
| Scenario | Standard RF (Gini) | Extra Trees |
| --- | --- | --- |
| Small dataset | Better | Acceptable |
| Large dataset | Good | Better |
| Few predictors | Better | Acceptable |
| Many predictors | Good | Better |
| Training speed | Slower | Faster |
Practical Impact on Model Performance
In most real-world clinical prediction settings:
| Model | Typical AUROC |
| --- | --- |
| Standard Random Forest | ~0.78 |
| Extra Trees | ~0.77 |
Difference:
- Usually 0.5–1%
- Often not clinically meaningful
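This small-difference claim is easy to check with scikit-learn's implementations. The sketch below uses an illustrative synthetic dataset, not the clinical data behind the table above:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)

rf_auc = cross_val_score(RandomForestClassifier(n_estimators=200, random_state=0),
                         X, y, cv=5, scoring="roc_auc").mean()
et_auc = cross_val_score(ExtraTreesClassifier(n_estimators=200, random_state=0),
                         X, y, cv=5, scoring="roc_auc").mean()
# on most datasets the two cross-validated AUROCs land close together
```

Swapping in a different dataset changes the absolute numbers, but the gap between the two ensembles typically stays small, consistent with the table above.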
From a prediction modeling perspective, model performance is driven more by:
- Feature selection
- Sample size
- mtry (features tried per split)
- Minimum node size

than by the split rule itself.
Interpretation for Clinical Prediction Models
From a methodological standpoint:
- The split rule affects the bias–variance balance
- But it has minor influence on overall discrimination (AUROC)
- Calibration and clinical usefulness are largely unaffected
Practical Recommendation
- Use Gini impurity as the default
- Do not prioritize tuning the split rule
- Focus on:
  - Features per split
  - Minimum node size
  - Validation strategy
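Following that advice, the tuning effort goes into `max_features` (mtry) and `min_samples_leaf` (minimum node size) rather than the criterion. A minimal sketch with scikit-learn, on an illustrative synthetic dataset:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, n_features=15, n_informative=4,
                           random_state=0)

# Tune the parameters that actually move performance, not the split rule
param_grid = {
    "max_features": ["sqrt", 0.5],   # mtry: features tried per split
    "min_samples_leaf": [1, 5, 10],  # minimum node size
}
search = GridSearchCV(
    RandomForestClassifier(n_estimators=100, random_state=0),
    param_grid, cv=3, scoring="roc_auc",
)
search.fit(X, y)
```

After fitting, `search.best_params_` reports the winning combination and `search.best_score_` its cross-validated AUROC.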
Key Takeaways
- The split rule determines how thresholds are chosen at each node
- Gini evaluates all possible splits and selects the best
- Extra Trees uses random splits, increasing tree diversity
- Extra Trees reduces variance but slightly increases bias
- In practice, the effect on AUROC is small compared to other parameters


