
Machine learning: a quick review (part 6)

6- Ensemble techniques

6-1- What are ensemble techniques

In ensemble learning, you combine multiple learning algorithms, or multiple instances of the same algorithm, to build a model that is more powerful than any of the individual models.
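
For a concrete illustration, the sketch below combines three different algorithms into one voting ensemble with scikit-learn; the choice of base learners and the iris toy dataset are illustrative assumptions, not part of any particular method discussed here.

```python
# A minimal sketch of combining several algorithms into one ensemble,
# assuming scikit-learn is available; the base learners are illustrative.
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Three different base learners vote on the final class.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier()),
        ("svm", SVC()),
    ],
    voting="hard",
)
ensemble.fit(X_train, y_train)
print("ensemble accuracy:", ensemble.score(X_test, y_test))
```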

Bagging method:

This is a four-step process (a code sketch follows the list):

  1. Pick K random data points from the training set.
  2. Build a decision tree associated with these K data points.
  3. Choose the number N of trees you want to build and repeat steps 1 and 2.
  4. For a new data point, make each of your N trees predict the value of Y for the data point in question, and assign the new data point the average of all the predicted Y values.
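
To make these steps concrete, here is a minimal hand-rolled sketch assuming scikit-learn and NumPy; the synthetic data, the value of K, and the number of trees are illustrative choices.

```python
# A hand-rolled version of the four steps above (illustrative values only).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, size=500)

K = 200        # step 1: how many random points each tree sees
N_TREES = 50   # step 3: number of trees to build

trees = []
for _ in range(N_TREES):
    idx = rng.integers(0, len(X), size=K)                # step 1: pick K random data points
    tree = DecisionTreeRegressor().fit(X[idx], y[idx])   # step 2: build a tree on them
    trees.append(tree)

# Step 4: predict a new point with every tree and average the results.
x_new = np.array([[2.5]])
y_pred = np.mean([tree.predict(x_new)[0] for tree in trees])
print("averaged prediction:", y_pred)
```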

6-2- Random Forest

The random forest algorithm is an extension of the bagging method as it utilizes both bagging and feature randomness to create an uncorrelated forest of decision trees. Feature randomness, also known as feature bagging or “the random subspace method”, generates a random subset of features, which ensures low correlation among decision trees. This is a key difference between decision trees and random forests. While decision trees consider all the possible feature splits, random forests only select a subset of those features.
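
As a minimal sketch with scikit-learn (the generated dataset and hyperparameters are illustrative assumptions), the max_features argument is what controls the feature randomness described above:

```python
# Random forest with both bagging and feature randomness (illustrative settings).
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=1000, n_features=20, noise=0.5, random_state=0)

forest = RandomForestRegressor(
    n_estimators=100,      # number of bagged trees
    max_features="sqrt",   # random subset of features considered at each split
    bootstrap=True,        # bagging: sample training rows with replacement
    random_state=0,
)
forest.fit(X, y)
print("R^2 on training data:", forest.score(X, y))
```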

6-3- Boosting

AdaBoost

AdaBoost, short for Adaptive Boosting, is a machine learning meta-algorithm formulated by Yoav Freund and Robert Schapire, who won the 2003 Gödel Prize for their work.

An AdaBoost regressor is a meta-estimator that begins by fitting a regressor on the original dataset and then fits additional copies of the regressor on the same dataset but where the weights of instances are adjusted according to the error of the current prediction. As such, subsequent regressors focus more on difficult cases.

One way for a new predictor to correct its predecessor is to pay a bit more attention to the training instances that the predecessor has incorrectly classified. This results in new predictors focusing more and more on the hard cases. This is the technique used by AdaBoost.
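
A minimal sketch of such an AdaBoost regressor with scikit-learn is shown below; the shallow-tree base learner and the hyperparameters are illustrative assumptions, and note that older scikit-learn releases name the keyword base_estimator instead of estimator.

```python
# AdaBoost regression with a weak decision tree as the base learner.
from sklearn.datasets import make_regression
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=1.0, random_state=0)

ada = AdaBoostRegressor(
    estimator=DecisionTreeRegressor(max_depth=3),  # weak learner copied each round
    n_estimators=100,
    learning_rate=0.5,
    random_state=0,
)
ada.fit(X, y)
print("R^2 on training data:", ada.score(X, y))
```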

Gradient boosting:

Another very popular Boosting algorithm is Gradient Boosting. Just like AdaBoost, Gradient Boosting works by sequentially adding predictors to an ensemble, each one correcting its predecessor. However, instead of tweaking the instance weights at every iteration like AdaBoost does, this method tries to fit the new predictor to the residual errors made by the previous predictor.
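
To make the residual-fitting idea concrete, here is a minimal hand-rolled sketch using scikit-learn decision trees; the synthetic data and the tree depth are illustrative assumptions.

```python
# Each new tree is fit to the residuals left by the previous trees.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(42)
X = rng.uniform(-0.5, 0.5, size=(100, 1))
y = 3 * X[:, 0] ** 2 + 0.05 * rng.normal(size=100)

# First tree fits the target; each later tree fits what is still unexplained.
tree1 = DecisionTreeRegressor(max_depth=2).fit(X, y)
residuals1 = y - tree1.predict(X)

tree2 = DecisionTreeRegressor(max_depth=2).fit(X, residuals1)
residuals2 = residuals1 - tree2.predict(X)

tree3 = DecisionTreeRegressor(max_depth=2).fit(X, residuals2)

# The ensemble's prediction is the sum of all three trees' predictions.
x_new = np.array([[0.2]])
y_pred = sum(tree.predict(x_new)[0] for tree in (tree1, tree2, tree3))
print("boosted prediction:", y_pred)
```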

Extreme Gradient Boosting (XGBoost):

Gradient boosting is an ensemble method that sequentially adds trained predictors and assigns them a weight. However, instead of assigning different weights to the classifiers after every iteration, this method fits the new model to the residuals of the previous prediction and then minimizes the loss when adding the latest prediction. So, in the end, you are updating your model using gradient descent, hence the name gradient boosting. It supports both regression and classification problems. XGBoost, specifically, implements this algorithm for decision tree boosting with an additional custom regularization term in the objective function.
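
Below is a minimal sketch, assuming the xgboost Python package is installed; the hyperparameters (n_estimators, learning_rate, max_depth, reg_lambda) are illustrative choices rather than recommendations, with reg_lambda corresponding to the L2 regularization term mentioned above.

```python
# XGBoost regression with an explicit L2 regularization term (illustrative settings).
from sklearn.datasets import make_regression
from xgboost import XGBRegressor

X, y = make_regression(n_samples=1000, n_features=20, noise=0.5, random_state=0)

model = XGBRegressor(
    n_estimators=200,
    learning_rate=0.1,
    max_depth=4,
    reg_lambda=1.0,                  # L2 regularization added to the objective
    objective="reg:squarederror",
)
model.fit(X, y)
print("R^2 on training data:", model.score(X, y))
```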
