How Random Forest Hyperparameters Affect Model Performance
- Mayta

Random Forest performance is driven by three core mechanisms:
- Tree strength (how well each individual tree fits the data)
- Tree diversity (how different the trees are from each other)
- Ensemble averaging (how predictions stabilize across trees)

Each parameter influences one or more of these mechanisms.
Category 1: Tree Structure Parameters (Most Important for Performance)
These parameters control how each individual tree grows and directly affect the bias–variance trade-off.
1. Features per split (mtry / maximum features)
| Low number of features | High number of features |
| --- | --- |
| Few variables considered at each split | Many variables considered |
| High randomness | Low randomness |
| Trees very different (low correlation) | Trees similar (high correlation) |
| Higher bias | Lower bias |
| Lower variance (better ensemble effect) | Higher variance (less benefit from averaging) |
Interpretation: This parameter controls how similar the trees are to each other. It is the most important tuning parameter because reducing correlation between trees greatly improves ensemble performance.
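As a sketch of this effect in scikit-learn (where the parameter is called `max_features`; the synthetic dataset and settings below are illustrative assumptions, not from the article):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data: 20 features, only 5 informative.
X, y = make_classification(n_samples=500, n_features=20,
                           n_informative=5, random_state=0)

# Few features per split -> decorrelated trees; all features -> correlated trees.
for max_features in ("sqrt", None):  # sqrt(20) ~ 4 features vs. all 20
    rf = RandomForestClassifier(n_estimators=200, max_features=max_features,
                                random_state=0)
    score = cross_val_score(rf, X, y, cv=5).mean()
    print(f"max_features={max_features}: CV accuracy = {score:.3f}")
```

On datasets with many noisy features, the restricted setting often wins because the trees disagree in useful ways.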
2. Minimum node size (min.node.size / minimum samples per leaf)
| Small minimum node size | Large minimum node size |
| --- | --- |
| Very small terminal nodes | Larger terminal nodes |
| Highly complex trees | Simpler trees |
| Captures fine details (including noise) | Produces smoother predictions |
| Lower bias | Higher bias |
| Higher variance (overfitting risk) | Lower variance (underfitting risk) |
Interpretation: This is the primary parameter that controls overfitting. Smaller values allow the model to memorize data; larger values force it to generalize.
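A minimal sketch of the memorization effect, assuming scikit-learn (parameter name `min_samples_leaf`) and deliberately noisy synthetic labels:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Noisy labels (flip_y) make memorization visible as a train/test gap.
X, y = make_classification(n_samples=600, n_features=15, n_informative=4,
                           flip_y=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

train_acc = {}
for leaf in (1, 25):
    rf = RandomForestClassifier(n_estimators=100, min_samples_leaf=leaf,
                                random_state=0).fit(X_tr, y_tr)
    train_acc[leaf] = rf.score(X_tr, y_tr)
    print(f"min_samples_leaf={leaf}: train={train_acc[leaf]:.2f}, "
          f"test={rf.score(X_te, y_te):.2f}")
```

With tiny leaves the forest fits the training set (noise included) almost perfectly; larger leaves close the train/test gap.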
3. Maximum tree depth (maximum depth)
| Shallow trees | Deep trees |
| --- | --- |
| Limited interactions captured | Complex interactions captured |
| Higher bias | Lower bias |
| Lower variance | Higher variance |
| Risk of underfitting | Potential overfitting |
Interpretation: In Random Forest, trees are usually allowed to grow fully because bagging already controls overfitting. Limiting depth is rarely necessary.
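A quick way to see this in practice (assuming scikit-learn, where the default `max_depth=None` grows each tree until its leaves are pure or too small to split):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=400, n_features=10, random_state=0)

# max_depth=None is the default: trees grow until leaves are pure.
rf = RandomForestClassifier(n_estimators=50, max_depth=None,
                            random_state=0).fit(X, y)
depths = [tree.get_depth() for tree in rf.estimators_]
print(f"fully grown tree depths: {min(depths)}-{max(depths)}")
```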
4. Minimum samples required to split (minimum samples to split)
| Small minimum split size | Large minimum split size |
| --- | --- |
| Splits occur easily | Splits occur less often |
| More complex trees | Simpler trees |
| Lower bias | Higher bias |
| Higher variance | Lower variance |
Interpretation: This parameter influences when splitting is allowed, but has less impact than minimum node size.
5. Maximum number of leaf nodes (maximum leaf nodes)
| Few leaf nodes | Many leaf nodes |
| --- | --- |
| Strong restriction on tree size | Minimal restriction |
| Simpler trees | More complex trees |
| Higher bias | Lower bias |
| Lower variance | Higher variance |
Interpretation: Another way to limit tree complexity; its role largely overlaps with minimum node size, so tuning both rarely pays off.
Category 2: Ensemble Parameters
These parameters control how multiple trees are built and combined.
6. Number of trees (number of estimators)
| Few trees | Many trees |
| --- | --- |
| Unstable predictions | Stable predictions |
| Higher variance | Lower variance |
| Sensitive to sampling noise | Robust averaging |
| Faster computation | Slower computation |
Interpretation: Increasing the number of trees reduces variance and stabilizes predictions. Performance improves until it plateaus (typically around 300–500 trees).
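The plateau can be sketched as follows (assuming scikit-learn, parameter name `n_estimators`; dataset and tree counts are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)

# Accuracy typically rises quickly, then flattens as trees are added.
scores = {}
for n in (10, 100, 500):
    rf = RandomForestClassifier(n_estimators=n, random_state=0)
    scores[n] = cross_val_score(rf, X, y, cv=5).mean()
    print(f"{n:>3} trees: CV accuracy = {scores[n]:.3f}")
```

Past the plateau, extra trees cost compute without hurting accuracy, so erring on the large side is safe.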
7. Sample fraction (fraction of data per tree)
| Small sample fraction | Large sample fraction |
| --- | --- |
| Each tree sees less data | Each tree sees more data |
| Higher diversity between trees | Lower diversity |
| Higher bias | Lower bias |
| Lower variance (better ensemble effect) | Higher variance (trees are more similar) |
Interpretation: Controls how similar trees are. Smaller fractions increase diversity and can reduce overfitting.
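A sketch assuming scikit-learn 0.22+ (parameter name `max_samples`; `None`, the default, bootstraps as many rows as the training set has):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)

# A float draws that fraction of rows per tree, increasing tree diversity;
# None (the default) draws n_samples rows.
for frac in (0.3, None):
    rf = RandomForestClassifier(n_estimators=200, max_samples=frac,
                                random_state=0)
    score = cross_val_score(rf, X, y, cv=5).mean()
    print(f"max_samples={frac}: CV accuracy = {score:.3f}")
```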
8. Bootstrap sampling (sampling with replacement)
| Without replacement | With replacement |
| --- | --- |
| More unique observations per tree | Some observations repeated |
| Trees more similar | Trees more diverse |
| Slightly higher variance | Lower variance |
Interpretation: Standard Random Forest uses sampling with replacement to increase variability across trees.
9. Class weighting (class weights)
| No class weighting | Balanced class weighting |
| --- | --- |
| Majority class dominates learning | Minority class emphasized |
| Higher overall accuracy | Better minority detection |
| Lower sensitivity for rare events | Higher sensitivity for rare events |
| Better probability calibration | Possible distortion of predicted probabilities |
Interpretation: This parameter changes how the model prioritizes errors. In clinical prediction, recalibration is often preferred over weighting when probability estimates are important.
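The trade-off can be sketched on an imbalanced synthetic dataset (assuming scikit-learn, parameter name `class_weight`; the 95/5 split below is an illustrative assumption):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Imbalanced data: roughly 5% positives.
X, y = make_classification(n_samples=2000, n_features=10, weights=[0.95],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

recalls = {}
for cw in (None, "balanced"):
    rf = RandomForestClassifier(n_estimators=200, class_weight=cw,
                                random_state=0).fit(X_tr, y_tr)
    recalls[cw] = recall_score(y_te, rf.predict(X_te))
    print(f"class_weight={cw}: minority-class recall = {recalls[cw]:.2f}")
```

Weighting typically lifts minority recall at the cost of calibrated probabilities, which is why recalibration is often preferred in clinical settings.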
Category 3: Feature Handling Parameters
These parameters control how features are evaluated during splitting.
10. Split rule (criterion for split quality)
| Standard split rule (e.g., Gini) | More random split rule (e.g., Extra Trees) |
| --- | --- |
| Deterministic optimal splits | Randomized split points |
| Lower bias | Slightly higher bias |
| Higher variance | Lower variance |
| Stable performance | More randomness |
Interpretation: The choice of the split rule has a relatively small impact. Standard methods such as Gini impurity work well in most situations.
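In scikit-learn, the fully randomized variant is a separate estimator rather than a criterion setting; a comparison sketch (dataset is an illustrative assumption):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)

# RandomForest searches for the best split threshold; ExtraTrees draws
# split points at random, trading a little bias for lower variance.
for Model in (RandomForestClassifier, ExtraTreesClassifier):
    clf = Model(n_estimators=200, random_state=0)
    score = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{Model.__name__}: CV accuracy = {score:.3f}")
```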
11. Feature importance method
| Impurity-based importance | Permutation-based importance |
| --- | --- |
| Fast to compute | Slower to compute |
| Biased toward variables with many categories | Much less biased |
| Less reliable for interpretation | More reliable for interpretation |
Interpretation: This does not affect model performance but is critical for interpreting predictors.
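Both importance types can be computed from the same fitted model (assuming scikit-learn, whose `permutation_importance` lives in `sklearn.inspection`; the synthetic data is illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=400, n_features=8, n_informative=3,
                           random_state=0)
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

impurity = rf.feature_importances_  # fast, computed during training
# Permutation importance: drop in score when one feature is shuffled
# (ideally computed on held-out data rather than the training set, as here).
perm = permutation_importance(rf, X, y, n_repeats=10, random_state=0)
print("impurity-based:", impurity.round(2))
print("permutation   :", perm.importances_mean.round(2))
```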
Category 4: Computational / Reproducibility Parameters
These parameters do not affect model performance but ensure reproducibility and efficiency.
12. Random seed (random state)
| Not fixed | Fixed |
| --- | --- |
| Results vary between runs | Results reproducible |
| Hard to debug | Consistent outputs |
Interpretation: Always fix a random seed to ensure reproducibility.
13. Number of parallel threads (number of jobs)
| Single thread | Multiple threads |
| --- | --- |
| Slower training | Faster training |
| Lower resource use | Higher CPU usage |
Interpretation: Controls computation speed only.
14. Verbose output (verbosity level)
| Low verbosity | High verbosity |
| --- | --- |
| Minimal output | Detailed progress logs |
| Cleaner console | More transparency |
Interpretation: Useful for monitoring training progress.
15. Warm start (incremental tree building)
| Disabled | Enabled |
| --- | --- |
| Model trained from scratch | Trees added incrementally |
| Simpler workflow | Flexible model expansion |
Interpretation: Allows adding more trees without retraining from the beginning.
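A minimal sketch assuming scikit-learn (parameter name `warm_start`):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

rf = RandomForestClassifier(n_estimators=100, warm_start=True,
                            random_state=0).fit(X, y)
rf.set_params(n_estimators=150)  # request 50 additional trees
rf.fit(X, y)                     # only the new trees are trained
print(len(rf.estimators_))       # 150
```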
16. Out-of-bag error estimation (OOB score)
| Disabled | Enabled |
| --- | --- |
| No internal validation | Built-in validation using unused samples |
| Requires external validation | Quick performance estimate |
Interpretation: Provides an internal estimate of model performance without separate validation data.
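A sketch assuming scikit-learn (parameter name `oob_score`, which requires bootstrap sampling to be enabled, as it is by default):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

rf = RandomForestClassifier(n_estimators=200, oob_score=True,
                            random_state=0).fit(X, y)
# Each sample is scored only by the trees whose bootstrap missed it.
print(f"OOB accuracy estimate: {rf.oob_score_:.3f}")
```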
Final Conceptual Summary
Random Forest performance depends on balancing:
- Strong trees (deep, small leaf size, high feature usage)
- Diverse trees (low feature usage, smaller sample fraction, bootstrapping)
- Stable averaging (a large number of trees)

The most influential parameters are:
- Features per split (controls correlation between trees)
- Minimum node size (controls overfitting)
- Sample fraction (secondary diversity control)
Key Takeaways
- Features per split is the most important parameter because it controls tree correlation.
- Minimum node size is the main control for overfitting.
- Number of trees stabilizes predictions and should be set sufficiently large.
- Sample fraction can further improve diversity between trees.
- Most other parameters have smaller or redundant effects.
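Putting the takeaways together, a tuning loop over the two most influential parameters might look like this (assuming scikit-learn; the grid values and dataset are illustrative assumptions, not recommendations from the article):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)

grid = {
    "max_features": ["sqrt", 0.5, None],  # tree correlation
    "min_samples_leaf": [1, 5, 20],       # overfitting control
}
search = GridSearchCV(
    RandomForestClassifier(n_estimators=300, random_state=0),
    grid, cv=3, n_jobs=-1,
).fit(X, y)
print("best parameters:", search.best_params_)
print(f"best CV accuracy: {search.best_score_:.3f}")
```

The number of trees is fixed at a generous value rather than tuned, since more trees only stabilize the ensemble.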