Decision Tree
By Wang Rui, State Key Lab of CAD&CG, 2004-03-17

Review
Concept learning: induce a Boolean function from a sample of positive/negative training examples. Concept learning can be cast as searching through a predefined hypothesis space. Search algorithm: FIND-S.
Decision tree learning is a method for approximating discrete-valued target functions (classifiers), in which the learned function is represented by a decision tree.
The decision tree algorithm induces concepts from examples.
Classify instances by sorting them down the tree from the root to some leaf node
Each path from the tree root to a leaf corresponds to a conjunction of attribute tests
e.g. the path (Outlook = Sunny) ^ (Humidity = Normal)
The tree as a whole is the disjunction of its paths:
(Outlook = Sunny ^ Humidity = Normal)
V (Outlook = Overcast)
V (Outlook = Rain ^ Wind = Weak)
1. A ← the “best” decision attribute for the next node
2. Assign A as the decision attribute for the node
3. For each value of A, create a new descendant of the node
4. Sort the training examples to the leaf nodes
5. If the training examples are perfectly classified, then STOP; else iterate over the new leaf nodes
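The loop above can be sketched in Python. This is a minimal ID3-style implementation under assumed data structures (examples as dicts of attribute values, labels as a parallel list); the helper names are illustrative, not part of the slides.

```python
# Minimal ID3 sketch: pick the highest-gain attribute, branch on its
# values, recurse until each leaf is pure or attributes run out.
import math
from collections import Counter

def entropy(labels):
    """H(S) = -sum_i p_i * log2(p_i) over the class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(examples, labels, attr):
    """Gain(S, A) = H(S) - sum_v |S_v|/|S| * H(S_v)."""
    n = len(labels)
    rem = 0.0
    for v in set(ex[attr] for ex in examples):
        sub = [lab for ex, lab in zip(examples, labels) if ex[attr] == v]
        rem += len(sub) / n * entropy(sub)
    return entropy(labels) - rem

def id3(examples, labels, attrs):
    # Step 5: stop when the examples are perfectly classified.
    if len(set(labels)) == 1:
        return labels[0]
    if not attrs:
        return Counter(labels).most_common(1)[0][0]
    # Steps 1-2: the "best" attribute becomes the decision attribute.
    best = max(attrs, key=lambda a: info_gain(examples, labels, a))
    tree = {best: {}}
    # Steps 3-4: one descendant per value; sort examples to it and recurse.
    for v in set(ex[best] for ex in examples):
        sub = [(ex, lab) for ex, lab in zip(examples, labels) if ex[best] == v]
        sub_ex, sub_lab = [e for e, _ in sub], [l for _, l in sub]
        tree[best][v] = id3(sub_ex, sub_lab, [a for a in attrs if a != best])
    return tree
```

On a toy weather sample this yields a nested dict whose root is the highest-gain attribute, mirroring the printed trees later in these notes.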
Shorter trees are preferred over larger trees (Occam's razor).
Idea: we want attributes that classify the examples well; the best attribute is selected at each node.
How well does an attribute alone classify the training data?
Simple is beauty
Information
The information of a single symbol in a message drawn from n symbols:

What if a symbol occurs with probability 1?
Answer: it is transmitted for sure; therefore, it carries no information.

Consider a symbol s_i that occurs with probability p_i; then the received information is log2(1/p_i) bits.
So the amount of information, or information content, of the symbol s_i is -log2(p_i).

s_i will occur, on average, M * p_i times for a message of M symbols (M large).
Therefore, the total information of the M-symbol message is sum_i M * p_i * log2(1/p_i).
The average information per symbol is H = -sum_i p_i * log2(p_i), the entropy of the source.
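A quick numeric check of the derivation above (the probabilities and message length are illustrative assumptions):

```python
# In an M-symbol message, symbol i occurs ~M*p_i times, each occurrence
# carrying log2(1/p_i) bits, so the average per symbol equals the entropy H.
import math

p = [0.5, 0.25, 0.25]   # assumed symbol probabilities
M = 1000                # assumed message length

per_symbol = [math.log2(1 / pi) for pi in p]            # bits per occurrence
total = sum(M * pi * b for pi, b in zip(p, per_symbol))  # total message information
H = -sum(pi * math.log2(pi) for pi in p)                 # entropy of the source

print(total / M)   # average information per symbol
print(H)           # the same value: 1.5 bits for these probabilities
```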
A1 = overcast: + (4.0)
A1 = sunny:
| A3 = high: - (3.0)
| A3 = normal: + (2.0)
A1 = rain:
| A4 = weak: + (3.0)
| A4 = strong: - (2.0)
Not successful in practice
its estimated accuracy
Most of the material in this part comes from: http://sifaka.cs.uiuc.edu/taotao/stat/chap10.ppt
Cross Validation
Previous methods: the individual classifiers were trained independently of one another.
ε_k = [ ∑_i W(x_i) ∙ I(y_i ≠ C_k(x_i)) ] / [ ∑_i W(x_i) ]
L(y, f(x)) = exp(-y ∙ f(x)), the exponential loss function
Original training set: equal weights for all training samples
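One boosting round under the formulas above can be sketched as follows. This is a minimal AdaBoost-style sketch: the weighted error ε_k, the classifier weight α_k = ½ ln((1−ε)/ε), and the exponential-loss reweighting w_i ← w_i · exp(−α · y_i · C_k(x_i)). The decision stump and data are illustrative assumptions, not from the slides.

```python
# One AdaBoost round: compute the weighted error of a weak classifier,
# then reweight so misclassified samples gain weight and correct ones lose it.
import math

def boost_round(xs, ys, weights, clf):
    total = sum(weights)
    # eps_k = sum_i W(x_i) * I(y_i != C_k(x_i)) / sum_i W(x_i)
    eps = sum(w for x, y, w in zip(xs, ys, weights) if clf(x) != y) / total
    alpha = 0.5 * math.log((1 - eps) / eps)
    # w_i <- w_i * exp(-alpha * y_i * C_k(x_i)); labels and outputs are +/-1
    new_w = [w * math.exp(-alpha * y * clf(x))
             for x, y, w in zip(xs, ys, weights)]
    z = sum(new_w)                      # normalize so the weights sum to 1
    return eps, alpha, [w / z for w in new_w]

xs = [0.0, 1.0, 2.0, 3.0]
ys = [-1, -1, +1, +1]
stump = lambda x: -1 if x < 2.5 else +1   # misclassifies x = 2.0 only
w0 = [0.25] * 4                           # equal initial weights
eps, alpha, w1 = boost_round(xs, ys, w0, stump)
```

After the update the single misclassified sample carries half of the total weight, which is what forces the next weak classifier to focus on it.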
Definition: the integral image at location (x,y) contains the sum of the pixels above and to the left of (x,y), inclusive:
ii(x,y) = ∑_{x'≤x, y'≤y} i(x',y')
It can be computed in one pass using the following pair of recurrences:
s(x,y) = s(x,y-1) + i(x,y)
ii(x,y) = ii(x-1,y) + s(x,y)
(where s(x,y) is the cumulative column sum, with s(x,-1) = 0 and ii(-1,y) = 0)
Using the integral image, any rectangular sum can be computed in four array references: with corner points 1 (top-left), 2 (top-right), 3 (bottom-left), and 4 (bottom-right) of the rectangle, the sum is
ii(4) + ii(1) - ii(2) - ii(3)