Ensemble Learning (2), Tree and Forest

Classification and Regression Tree

Bagging of trees

Random Forest



Motivation

  • To estimate complex response surfaces and class boundaries.

  • Reminder:

  • SVM achieves this goal by the kernel trick;

  • Boosting achieves this goal by combining weak classifiers;

  • A classification tree achieves this goal by generating a complex surface/boundary from a single, highly flexible model.

  • Bagged classification trees and random forests achieve this goal in a manner similar to boosting.



Classification Tree



Classification Tree

An example classification tree.



Classification Tree

Issues:

How many splits should be allowed at a node?

Which property to use at a node?

When to stop splitting a node and declare it a “leaf”?

How to adjust the size of the tree?

Tree size <-> model complexity.

Too large a tree – overfitting;

Too small a tree – fails to capture the underlying structure.

How to assign the classification decision at each leaf?

How to handle missing data?



Classification Tree

Binary split.



Classification Tree

To decide what split criteria to use, need to establish the measurement of node impurity.

Entropy impurity: $i(N) = -\sum_j P(\omega_j)\,\log_2 P(\omega_j)$

Misclassification impurity: $i(N) = 1 - \max_j P(\omega_j)$

Gini impurity: $i(N) = \sum_{i \neq j} P(\omega_i)\,P(\omega_j) = 1 - \sum_j P(\omega_j)^2$

(Expected error rate if the class label is assigned at random according to the class distribution at the node.)
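As a concrete illustration (not part of the original slides; it assumes Python with numpy and a toy label vector), the three impurity measures can be computed from the class proportions at a node:

```python
import numpy as np

def impurities(labels):
    """Entropy, misclassification, and Gini impurity of the labels at a node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()            # class proportions P(w_j) at the node
    entropy = -np.sum(p * np.log2(p))    # -sum_j P(w_j) log2 P(w_j)
    misclass = 1.0 - p.max()             # 1 - max_j P(w_j)
    gini = 1.0 - np.sum(p ** 2)          # 1 - sum_j P(w_j)^2
    return entropy, misclass, gini

# Toy node with class counts 2 / 3 / 1.
print(impurities(np.array(["a", "a", "b", "b", "b", "c"])))
```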





Classification Tree

Growing the tree.

Greedy search: at every step, choose the query that decreases the impurity as much as possible (a brute-force version of this search is sketched after the stopping rules below).

For a real-valued predictor, gradient descent may be used to find the optimal cut value.

When to stop?

- Stop when reduction in impurity is smaller than a threshold.

- Stop when the leaf node is too small.

- Stop when a global criterion is met.

- Hypothesis testing.

- Cross-validation.

- Fully grow and then prune.
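The following sketch of the greedy split search is illustrative only (it assumes Python/numpy and a tiny made-up dataset; a real CART implementation would also apply the stopping rules and recurse). It scans every feature and every observed cut value rather than using gradient descent, and keeps the split with the largest Gini decrease:

```python
import numpy as np

def gini(y):
    """Gini impurity of a set of class labels."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(X, y):
    """Greedy search: return the (feature, threshold) pair whose binary
    split gives the largest decrease in weighted Gini impurity."""
    n, d = X.shape
    parent = gini(y)
    best = (None, None, 0.0)                 # (feature j, cut s, impurity decrease)
    for j in range(d):
        for s in np.unique(X[:, j]):
            left = y[X[:, j] <= s]
            right = y[X[:, j] > s]
            if len(left) == 0 or len(right) == 0:
                continue
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            if parent - child > best[2]:
                best = (j, s, parent - child)
    return best

X = np.array([[1.0, 5.0], [2.0, 4.0], [3.0, 1.0], [4.0, 2.0]])
y = np.array([0, 0, 1, 1])
print(best_split(X, y))   # e.g. (0, 2.0, 0.5): split on feature 0 at 2.0
```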



Classification Tree

Pruning the tree.

- Merge leaves when the resulting increase in impurity is small.

- Cost-complexity pruning allows elimination of a whole branch in a single step (see the sketch below).

When priors and costs are present, adjust training by adjusting the Gini impurity accordingly.

Assigning a class label to a leaf:

- No prior: take the class with the highest frequency at the node.

- With a prior: weight the frequencies by the priors.

- With a loss function: …

In each case, assign the label that minimizes the Bayes error.
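In practice, cost-complexity pruning is available off the shelf. The sketch below is an assumption, not part of the slides: it uses scikit-learn's cost_complexity_pruning_path and the ccp_alpha parameter on a built-in dataset to show how larger penalties prune the tree down:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Grow a full tree, then compute the cost-complexity pruning path.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_tr, y_tr)

# Refit with increasing alpha: larger alpha prunes away more branches.
for alpha in path.ccp_alphas[::5]:
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X_tr, y_tr)
    print(f"alpha={alpha:.4f}  leaves={tree.get_n_leaves()}  test acc={tree.score(X_te, y_te):.3f}")
```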



Classification Tree

Choice of features.



Classification Tree

Multivariate tree.



Classification Tree

Example of error rate vs. tree size.



Regression tree

Example: complex surface by CART.



Regression tree

Model the response as region-wise constant: $f(x) = \sum_{m} c_m \, I(x \in R_m)$.

Due to computational difficulty, a greedy algorithm:

Consider splitting variable j and split point s, with $R_1(j,s) = \{x \mid x_j \le s\}$ and $R_2(j,s) = \{x \mid x_j > s\}$.

Seek j and s that minimize the RSS: $\min_{j,\,s}\Big[\min_{c_1}\sum_{x_i \in R_1(j,s)} (y_i - c_1)^2 + \min_{c_2}\sum_{x_i \in R_2(j,s)} (y_i - c_2)^2\Big]$, where the inner minima are attained by the region means.

Simply scan through all j and the range of $x_j$ to find s.

After partitioning, treat each region as separate data and iterate.
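A minimal sketch of this scan (assuming Python/numpy and a synthetic dataset, not from the slides) that returns the RSS-minimizing variable and split point:

```python
import numpy as np

def best_regression_split(X, y):
    """Scan all variables j and all observed cut points s; return the
    (j, s) minimizing RSS with region-wise constant fits (the means)."""
    best_j, best_s, best_rss = None, None, np.inf
    for j in range(X.shape[1]):
        for s in np.unique(X[:, j]):
            left, right = y[X[:, j] <= s], y[X[:, j] > s]
            if len(left) == 0 or len(right) == 0:
                continue
            rss = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
            if rss < best_rss:
                best_j, best_s, best_rss = j, s, rss
    return best_j, best_s, best_rss

rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 2))
y = np.where(X[:, 0] > 0.5, 2.0, -2.0) + 0.1 * rng.normal(size=50)
print(best_regression_split(X, y))   # should pick j=0 with s near 0.5
```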



Regression tree

Tree size <-> model complexity.

Too large a tree – overfitting;

Too small a tree – fails to capture the underlying structure.

How to tune?

- Grow the tree until the RSS reduction becomes too small.

This is too greedy and “short-sighted”.

- Grow until the leaves are too small, then prune the tree using cost-complexity pruning.



Bootstrapping

  • Directly assess uncertainty from the training data

Basic thinking:

assuming the empirical distribution of the data approaches the true underlying density, re-sampling from it gives us an idea of the uncertainty caused by sampling.
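A minimal sketch of the idea (assuming Python/numpy; the exponential toy data and the median statistic are illustrative choices, not from the slides): resample the data with replacement many times and observe how the statistic varies:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=100)      # original sample

B = 1000
boot_medians = np.empty(B)
for b in range(B):
    # Resample N points with replacement from the empirical distribution.
    resample = rng.choice(data, size=len(data), replace=True)
    boot_medians[b] = np.median(resample)

print("median:", np.median(data))
print("bootstrap SE of the median:", boot_medians.std(ddof=1))
```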





Bagging

“Bootstrap aggregation.”

Resample the training dataset.

Build a prediction model on each resampled dataset.

Average the predictions.

It is a Monte Carlo estimate of $E_{\hat{P}}[\hat{f}^{*}(x)]$, where $\hat{P}$ is the empirical distribution putting equal probability 1/N on each of the data points.

Bagging only differs from the original estimate when $\hat{f}(\cdot)$ is a non-linear or adaptive function of the data! When $\hat{f}(\cdot)$ is a linear function of the data, the bagged estimate equals $\hat{f}(x)$ as $B \to \infty$.

Trees are perfect candidates for bagging – each bootstrap tree will differ in structure.
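A minimal sketch of bagging classification trees by majority vote (an illustration assuming Python with numpy and scikit-learn; the synthetic dataset and B = 100 are arbitrary choices, not from the slides):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
B = 100
votes = np.zeros((len(X_te), 2))                 # accumulated class votes

for b in range(B):
    # Bootstrap resample of the training set.
    idx = rng.integers(0, len(X_tr), size=len(X_tr))
    tree = DecisionTreeClassifier(random_state=b).fit(X_tr[idx], y_tr[idx])
    pred = tree.predict(X_te)
    votes[np.arange(len(X_te)), pred] += 1       # each tree casts one vote

bagged_pred = votes.argmax(axis=1)               # majority vote ("averaging" for classification)
print("bagged accuracy:", (bagged_pred == y_te).mean())
```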



Bagging trees

The bagged trees have different structures.



Bagging trees

Error curves.



Bagging trees

Failure in bagging a single-level tree.



Random Forest

Bagging can be seen as a method to reduce variance of an estimated prediction function. It mostly helps high-variance, low-bias classifiers.

In comparison, boosting builds weak classifiers one by one, allowing the collection to evolve in the right direction.

Random forest is a substantial modification of bagging – it builds a collection of de-correlated trees.

- Similar performance to boosting

- Simpler to train and tune than boosting



Random Forest

The intuition – the average of random variables.

B i.i.d. random variables, each with variance $\sigma^2$:

the mean has variance $\sigma^2 / B$.

B i.d. (identically distributed but not independent) random variables, each with variance $\sigma^2$ and pairwise correlation $\rho$:

the mean has variance $\rho\,\sigma^2 + \dfrac{1-\rho}{B}\,\sigma^2$.

-------------------------------------------------------------------------------------

Bagged trees are i.d. samples.

Random forest aims at reducing the correlation $\rho$ to reduce variance. This is achieved by random selection of variables at each split.
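A quick simulation of the second formula (not from the slides; it assumes Python/numpy and arbitrary values B = 50, $\rho$ = 0.3, $\sigma^2$ = 1): the empirical variance of the mean of correlated variables should match $\rho\sigma^2 + \frac{1-\rho}{B}\sigma^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
B, sigma2, rho, trials = 50, 1.0, 0.3, 20000

# Covariance matrix of B identically distributed variables with
# variance sigma^2 and pairwise correlation rho.
cov = np.full((B, B), rho * sigma2)
np.fill_diagonal(cov, sigma2)

samples = rng.multivariate_normal(np.zeros(B), cov, size=trials)
empirical = samples.mean(axis=1).var()
theoretical = rho * sigma2 + (1 - rho) / B * sigma2

print(f"empirical variance of the mean:    {empirical:.4f}")
print(f"rho*s2 + (1 - rho)/B * s2:         {theoretical:.4f}")
```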





Random Forest

Example comparing RF to boosted trees.





Random Forest

Benefit of RF – the out-of-bag (OOB) samples provide a built-in estimate of the cross-validation error.

For sample i, compute its RF error using only the trees built from bootstrap samples in which sample i did not appear.

The OOB error rate is close to the N-fold cross-validation error rate.

Unlike many other nonlinear estimators, RF can be fit in a single sequence. Stop growing the forest when the OOB error stabilizes.
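For illustration (assuming Python with scikit-learn and a synthetic dataset; not part of the slides), a random forest can report this OOB estimate directly via oob_score=True:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# oob_score=True scores each sample using only the trees whose
# bootstrap sample did not contain it.
rf = RandomForestClassifier(n_estimators=300, oob_score=True, random_state=0)
rf.fit(X, y)

print("OOB accuracy estimate:", rf.oob_score_)
print("OOB error estimate:   ", 1 - rf.oob_score_)
```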



Random Forest

Variable importance – find the most relevant predictors.

At every split of every tree, one variable contributes an improvement to the impurity measure.

Accumulating the reduction in i(N) over all splits for each variable gives a measure of the relative importance of the variables.

The predictors that appear most often at split points, and lead to the largest reductions in impurity, are the important ones.

------------------

Another method – permute the values of a predictor in the OOB samples of each tree; the resulting decrease in prediction accuracy is also a measure of importance. Accumulate it over all trees.
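Both flavors of importance have standard implementations. The sketch below (assuming Python with scikit-learn and a synthetic dataset, not from the slides) shows the impurity-based importances and a permutation importance computed on a held-out set, a close relative of the per-tree OOB permutation described above:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, n_informative=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)

# (1) Impurity-based importance: accumulated reduction of i(N) per variable.
print("impurity-based importances:", rf.feature_importances_.round(3))

# (2) Permutation importance: drop in accuracy when one predictor is shuffled
#     (computed here on a held-out set rather than per-tree OOB samples).
perm = permutation_importance(rf, X_te, y_te, n_repeats=20, random_state=0)
print("permutation importances:   ", perm.importances_mean.round(3))
```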

