Decision Trees are considered fragile because small changes in the training data can lead to large changes in the tree structure and predictions.
Why this happens
Greedy splitting
- Trees are built by making locally optimal splits at each node.
- A slight change in the data (e.g., adding one sample or a little noise) can change which split is chosen, altering the entire subtree downstream.
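A minimal sketch of greedy threshold selection makes this concrete (Gini impurity on a single 1-D feature; the data and function names are invented for illustration, not from any library): flipping a single label moves the chosen split, and everything below that node would change with it.

```python
# Hypothetical sketch of the greedy criterion a CART-style tree applies at
# each node: pick the threshold minimizing weighted child Gini impurity.

def gini(labels):
    """Gini impurity for binary 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of class 1
    return 1.0 - p * p - (1 - p) * (1 - p)

def best_split(xs, ys):
    """Return the threshold with the lowest weighted child impurity."""
    best_t, best_score = None, float("inf")
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t

xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
print(best_split(xs, ys))   # → 3 (clean separation)

# Flip one label: the greedy choice moves, and in a real tree the whole
# subtree under this node would be rebuilt differently.
ys_flipped = [0, 0, 1, 1, 1, 1]
print(best_split(xs, ys_flipped))  # → 2
```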
High variance
- Decision trees have low bias but high variance.
- They overfit easily to noise or outliers unless pruned or regularized.
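The overfitting half of this is easy to see in a sketch (purely illustrative data): an unpruned tree on distinct feature values can keep splitting until every leaf holds one sample, which on a 1-D feature is equivalent to a lookup table, so it fits even pure noise perfectly.

```python
import random

# Hypothetical sketch: labels here are pure noise, yet a fully grown tree
# (one leaf per unique x, modeled as a dict) reaches 100% training accuracy.

random.seed(1)
xs = list(range(20))
ys = [random.randint(0, 1) for _ in xs]  # random labels, no signal at all

leaves = dict(zip(xs, ys))  # fully grown 1-D tree == memorization
train_acc = sum(leaves[x] == y for x, y in zip(xs, ys)) / len(xs)
print(train_acc)  # → 1.0: zero bias on the training set, nothing generalizes
```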
Hierarchical structure
- Early splits strongly influence the rest of the tree.
- A different root split due to minor changes cascades into a completely different structure.
Sensitivity to outliers
- Extreme values can dominate split selection, further increasing instability.
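For regression targets the same effect can be sketched with a sum-of-squared-errors split criterion (data and names invented for illustration): a single extreme target value drags the chosen threshold toward isolating that one point.

```python
# Hypothetical sketch: variance-reduction (SSE) splitting for regression.

def sse(ys):
    """Sum of squared errors around the mean."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(xs, ys):
    """Threshold minimizing total child SSE (greedy, as in regression trees)."""
    best_t, best = None, float("inf")
    for t in sorted(set(xs))[:-1]:  # skip the split that leaves one side empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = sse(left) + sse(right)
        if score < best:
            best_t, best = t, score
    return best_t

xs = [1, 2, 3, 4, 5]
ys = [1.0, 1.1, 0.9, 5.0, 5.2]
print(best_split(xs, ys))   # → 3: separates the low and high groups

# Replace one target with an extreme outlier: the split now chases it.
ys_outlier = [1.0, 1.1, 0.9, 5.0, 100.0]
print(best_split(xs, ys_outlier))  # → 4: the outlier dominates the criterion
```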
Consequence
- Different training sets produce very different trees → poor generalization.
- This is why ensemble methods like Random Forest and Gradient Boosted Trees are widely used; they reduce variance by combining many trees.
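The variance-reduction idea behind bagging (and, with the addition of per-split feature subsampling, Random Forests) can be sketched in a few lines: train many shallow trees on bootstrap resamples and majority-vote their predictions. Everything below (the depth-1 "stump" learner, the data) is a made-up illustration, not a library API.

```python
import random

def stump_fit(xs, ys):
    """Depth-1 tree: a threshold plus a majority class on each side."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lp = round(sum(left) / len(left)) if left else 0
        rp = round(sum(right) / len(right)) if right else 1
        err = sum(y != lp for y in left) + sum(y != rp for y in right)
        if best is None or err < best[0]:
            best = (err, t, lp, rp)
    _, t, lp, rp = best
    return lambda x: lp if x <= t else rp

def bagged_predict(stumps, x):
    """Majority vote over the ensemble; ties go to class 1."""
    votes = sum(s(x) for s in stumps)
    return 1 if votes * 2 >= len(stumps) else 0

random.seed(0)
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

# Bagging: each stump sees a different bootstrap resample, so individual
# stumps vary, but averaging their votes smooths that variance out.
stumps = []
for _ in range(25):
    idx = [random.randrange(len(xs)) for _ in range(len(xs))]
    stumps.append(stump_fit([xs[i] for i in idx], [ys[i] for i in idx]))

print([bagged_predict(stumps, x) for x in [2, 7]])  # → [0, 1]
```

Any single stump here can be skewed by its resample; the ensemble's vote is far more stable, which is exactly the variance reduction the bullet above describes.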