cart classification and regression trees n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
CART: Classification and Regression Trees PowerPoint Presentation
Download Presentation
CART: Classification and Regression Trees

Loading in 2 Seconds...

play fullscreen
1 / 11

CART: Classification and Regression Trees - PowerPoint PPT Presentation


  • 145 Views
  • Uploaded on

CART: Classification and Regression Trees. By Alyssa Dalton. Outline. Purpose of CART Background Why use CART? Splitting Rules Developing a Classification or Regression Tree Missing values. Purpose of CART.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'CART: Classification and Regression Trees' - burt


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
outline
Outline
  • Purpose of CART
  • Background
  • Why use CART?
  • Splitting Rules
  • Developing a Classification or Regression Tree
  • Missing values
purpose of cart
Purpose of CART
  • “The CART decision tree is a binary recursive partitioning procedure capable of processing continuous [regression] and nominal [classification] attributes as targets and predictors.” (Top ten algorithms)
  • Recursive: using a rule or procedure that can be applied repeatedly
background
Background
  • Classification and Regression Trees
  • Leo Breimen
  • Richard A. Olshen
  • Charles J. Stone
  • Jerome H. Friedman
  • UCSD Heart Disease Study
why use cart
Why use CART?
  • Originally designed to mimic thought process of a doctor when diagnosing a patient
  • Easy to create tree (software creates tree using given data)
  • Easy to understand result
splitting rules
Splitting Rules
  • The Gini coefficient determines which variable to split node on.
  • G(t)= 1- p(t)2 – (1- p(t)) 2
    • p(t) is the possibility of having 1 “class” in a node t.
  • The Gini coefficient ranges between 0 and 1.
  •  Reaches a value of zero when only one class is present at a node
  • An instance goes left if CONDITION, and goes right otherwise.
developing a classification or regression tree
Developing a Classification or Regression Tree
  • Divide data into test and training sets
  • Root Node
  • Binary split using Gini- value closest to zero.
    • Entropy is an alternative
  • Continue to split until no other data is available.
    • Gives maximal tree.
  • Prune- using cross validation (test data)
  • gets rid of nodes that do not contribute to lowering error
    • Avoids overfitting
    • Gives optimal tree
missing values
Missing values
  • Surrogate split- if optimal splitting value is missing, the value with the “highest measure of predictive association” with the optimal value is used
re cap
Re-cap
  • Purpose of CART
  • Background
  • Why use CART?
  • Splitting Rules
  • Developing a Classification or Regression Tree
  • Missing values
sources
sources
  • http://people.hofstra.edu/geotrans/eng/ch4en/meth4en/ch4m1en.html
  • http://www.statsoft.com/textbook/classification-and-regression-trees/#details
  • http://filipspagnoli.wordpress.com/2008/05/05/human-rights-facts-4/
  • Dictionary.reference.com