BaggingClassifier

  • Implements bootstrap aggregating.
  • Builds multiple independent copies of a chosen base estimator using bootstrapped samples of the training data.
  • Final predictions are obtained through majority voting for classification (the regression counterpart, BaggingRegressor, averages instead).
  • When to use:
    • You want to ensemble models other than trees.
    • Example: ensembling multiple KNN classifiers or SVMs (see the sketch below).
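
A minimal sketch of bagging a non-tree model, assuming scikit-learn >= 1.2 (older versions name the first argument base_estimator); the dataset and hyperparameter values are illustrative, not from the text:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Illustrative synthetic data (not from the text).
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 25 KNN models, each fit on a bootstrap sample of the training set;
# final predictions come from majority voting across the ensemble.
bag = BaggingClassifier(
    estimator=KNeighborsClassifier(n_neighbors=5),  # any estimator works here
    n_estimators=25,
    bootstrap=True,
    random_state=42,
)
bag.fit(X_train, y_train)
print(f"Test accuracy: {bag.score(X_test, y_test):.3f}")
```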

RandomForestClassifier (Sklearn)

  • A specialised form of bagging that always uses decision trees as base estimators.
  • Adds extra randomness by selecting a random subset of features at each split, controlled by max_features (see the sketch after this list).
  • When to use:
    • You need an ensemble of decision trees with feature-level randomness.
    • You want strong generalisation and minimal hyperparameter tuning.
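
A minimal sketch of the feature-level randomness, assuming an illustrative synthetic dataset; "sqrt" is scikit-learn's default max_features for classification:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Illustrative synthetic data (not from the text).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# max_features="sqrt" (the classification default) means each split
# considers only sqrt(n_features) randomly chosen candidate features.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
rf.fit(X, y)

# Tree-specific interpretability tool mentioned in the comparison below.
print(rf.feature_importances_)
```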

Key Differences

Random forests often outperform standard bagging with trees because the additional feature randomness decorrelates the trees, improving generalisation.

| Feature | BaggingClassifier | RandomForestClassifier |
| --- | --- | --- |
| Base model | Any estimator | Always a decision tree |
| Feature randomness | None by default | Random subset of features at each split (max_features) |
| Correlation reduction | From bootstrapped samples only | From bootstrapping and random feature selection |
| Bias–variance behaviour | Reduces variance | Reduces variance more effectively |
| Out-of-bag scoring | Supported | Supported |
| Speed and tuning | Depends on base model | Optimised for trees; typically faster and simpler to tune |
| Interpretability | Depends on estimator | Tree-specific tools available (feature importance, etc.) |
| API scope | General-purpose | Specialised and efficient for tree ensembles |
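
A hedged sketch of the out-of-bag scoring both classes support, comparing bagged full-feature trees against a random forest on the same data; the synthetic dataset and settings are illustrative, it assumes scikit-learn >= 1.2, and exact scores will vary:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

# Illustrative synthetic data (not from the text).
X, y = make_classification(n_samples=2000, n_features=30, n_informative=10,
                           random_state=1)

# Bagged trees: decorrelation comes from bootstrapped samples only.
bag = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=200,
    oob_score=True,  # score each sample with the trees that never saw it
    random_state=1,
)
bag.fit(X, y)

# Random forest: bootstrapping plus a random feature subset at each split.
rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=1)
rf.fit(X, y)

print(f"Bagged trees OOB score:  {bag.oob_score_:.3f}")
print(f"Random forest OOB score: {rf.oob_score_:.3f}")
```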