Ensemble Methods for Machine Learning (ISBN 9781617297137)

In Ensemble Methods for Machine Learning you'll learn to implement the most important ensemble machine learning methods.

English, 339 pages, 2023

  • Commentary: True EPUB, MOBI

Table of contents:
contents
front matter
preface
acknowledgments
about this book
about the author
about the cover illustration
Part 1 The basics of ensembles
1 Ensemble methods: Hype or hallelujah?
1.1 Ensemble methods: The wisdom of the crowds
1.2 Why you should care about ensemble learning
1.3 Fit vs. complexity in individual models
Regression with decision trees
Regression with support vector machines
1.4 Our first ensemble
1.5 Terminology and taxonomy for ensemble methods
Part 2 Essential ensemble methods
2 Homogeneous parallel ensembles: Bagging and random forests
2.1 Parallel ensembles
2.2 Bagging: Bootstrap aggregating
Intuition: Resampling and model aggregation
Implementing bagging
Bagging with scikit-learn
Faster training with parallelization
2.3 Random forests
Randomized decision trees
Random forests with scikit-learn
Feature importances
2.4 More homogeneous parallel ensembles
Pasting
Random subspaces and random patches
Extra Trees
2.5 Case study: Breast cancer diagnosis
Loading and preprocessing
Bagging, random forests, and Extra Trees
Feature importances with random forests
3 Heterogeneous parallel ensembles: Combining strong learners
3.1 Base estimators for heterogeneous ensembles
Fitting base estimators
Individual predictions of base estimators
3.2 Combining predictions by weighting
Majority vote
Accuracy weighting
Entropy weighting
Dempster-Shafer combination
3.3 Combining predictions by meta-learning
Stacking
Stacking with cross validation
3.4 Case study: Sentiment analysis
Preprocessing
Dimensionality reduction
Blending classifiers
4 Sequential ensembles: Adaptive boosting
4.1 Sequential ensembles of weak learners
4.2 AdaBoost: Adaptive boosting
Intuition: Learning with weighted examples
Implementing AdaBoost
AdaBoost with scikit-learn
4.3 AdaBoost in practice
Learning rate
Early stopping and pruning
4.4 Case study: Handwritten digit classification
Dimensionality reduction with t-SNE
Boosting
4.5 LogitBoost: Boosting with the logistic loss
Logistic vs. exponential loss functions
Regression as a weak learning algorithm for classification
Implementing LogitBoost
5 Sequential ensembles: Gradient boosting
5.1 Gradient descent for minimization
Gradient descent with an illustrative example
Gradient descent over loss functions for training
5.2 Gradient boosting: Gradient descent + boosting
Intuition: Learning with residuals
Implementing gradient boosting
Gradient boosting with scikit-learn
Histogram-based gradient boosting
5.3 LightGBM: A framework for gradient boosting
What makes LightGBM “light”?
Gradient boosting with LightGBM
5.4 LightGBM in practice
Learning rate
Early stopping
Custom loss functions
5.5 Case study: Document retrieval
The LETOR data set
Document retrieval with LightGBM
6 Sequential ensembles: Newton boosting
6.1 Newton’s method for minimization
Newton’s method with an illustrative example
Newton’s descent over loss functions for training
6.2 Newton boosting: Newton’s method + boosting
Intuition: Learning with weighted residuals
Intuition: Learning with regularized loss functions
Implementing Newton boosting
6.3 XGBoost: A framework for Newton boosting
What makes XGBoost “extreme”?
Newton boosting with XGBoost
6.4 XGBoost in practice
Learning rate
Early stopping
6.5 Case study redux: Document retrieval
The LETOR data set
Document retrieval with XGBoost
Part 3 Ensembles in the wild: Adapting ensemble methods to your data
7 Learning with continuous and count labels
7.1 A brief review of regression
Linear regression for continuous labels
Poisson regression for count labels
Logistic regression for classification labels
Generalized linear models
Nonlinear regression
7.2 Parallel ensembles for regression
Random forests and Extra Trees
Combining regression models
Stacking regression models
7.3 Sequential ensembles for regression
Loss and likelihood functions for regression
Gradient boosting with LightGBM and XGBoost
7.4 Case study: Demand forecasting
The UCI Bike Sharing data set
GLMs and stacking
Random forest and Extra Trees
XGBoost and LightGBM
8 Learning with categorical features
8.1 Encoding categorical features
Types of categorical features
Ordinal and one-hot encoding
Encoding with target statistics
The category_encoders package
8.2 CatBoost: A framework for ordered boosting
Ordered target statistics and ordered boosting
Oblivious decision trees
CatBoost in practice
8.3 Case study: Income prediction
Adult Data Set
Creating preprocessing and modeling pipelines
Category encoding and ensembling
Ordered encoding and boosting with CatBoost
8.4 Encoding high-cardinality string features
9 Explaining your ensembles
9.1 What is interpretability?
Black-box vs. glass-box models
Decision trees (and decision rules)
Generalized linear models
9.2 Case study: Data-driven marketing
Bank Marketing data set
Training ensembles
Feature importances in tree ensembles
9.3 Black-box methods for global explainability
Permutation feature importance
Partial dependence plots
Global surrogate models
9.4 Black-box methods for local explainability
Local surrogate models with LIME
Local interpretability with SHAP
9.5 Glass-box ensembles: Training for interpretability
Explainable boosting machines
EBMs in practice
epilogue
E.1 Further reading
Practical ensemble methods
Theory and foundations of ensemble methods
E.2 A few more advanced topics
Ensemble methods for statistical relational learning
Ensemble methods for deep learning
E.3 Thank you!
index
