PUBLIC: A Decision Tree Classifier that Integrates Building and Pruning
Rajeev Rastogi, Kyuseok Shim
Bell Laboratories
Murray Hill, NJ 07974
24th VLDB Conference, New York, USA, 1998
P76021140 郭育婷
P76021336 林吾軒
P76021043 黃喻豐
P76014339 李聲彥
P76021213 顏孝軒
Decision Tree
Training Data
Record
Attribute
Class
Prune subtree S if cost(S as a single leaf) < sum of cost(each leaf of S)
[Figure: example decision tree over records A,B,C,D,E,F, with nodes {E,F}, {A}, {B,C,D}, {B,C}, {D}, {E}, {F}, {B}, {C}]
If cost(B,C) < cost(B) + cost(C) => prune!
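The pruning rule on this slide can be sketched as a single comparison. This is a minimal illustration, not the authors' code: the function name `should_prune` and the optional `split_cost` parameter are hypothetical, and the cost values are assumed to be precomputed encoding costs.

```python
from typing import Sequence


def should_prune(cost_as_leaf: float,
                 child_costs: Sequence[float],
                 split_cost: float = 0.0) -> bool:
    """Return True when one merged leaf is cheaper than keeping the children.

    cost_as_leaf: cost of encoding all records of the node in a single leaf,
                  e.g. cost(B,C) on the slide.
    child_costs:  costs of the node's children, e.g. [cost(B), cost(C)].
    split_cost:   hypothetical extra cost of describing the split itself
                  (0.0 reproduces the slide's rule exactly).
    """
    return cost_as_leaf < split_cost + sum(child_costs)


# Example: merging {B} and {C} into one leaf {B,C} costs 5, keeping them costs 3 + 4.
print(should_prune(5.0, [3.0, 4.0]))
```

With `split_cost` left at its default, the check is exactly "cost(B,C) < cost(B) + cost(C)".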
(PrUning and BuiLding Integrated in Classification)
A < vi
A ∈ V = {v1, v2, v3, … vm}
N
Y
Root Attribute lists : Z
Attribute lists : Y
Attribute lists : X
S : n records
S1 : n1 records
S2 : n2 records
Left
Right
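The split tests and the partition of S into S1 and S2 can be sketched as follows. This is an illustration under assumptions: records are modeled as plain dictionaries, and the helper names `numeric_test`, `categorical_test`, and `partition` are hypothetical. Sending records that satisfy the test to the left partition is also an assumption about the diagram.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Record = Dict[str, Any]
Test = Callable[[Record], bool]


def numeric_test(attr: str, threshold: float) -> Test:
    """Split test of the form A < v for a numeric attribute."""
    return lambda record: record[attr] < threshold


def categorical_test(attr: str, values: Iterable[Any]) -> Test:
    """Split test of the form A in V for a categorical attribute."""
    value_set = frozenset(values)
    return lambda record: record[attr] in value_set


def partition(records: List[Record], test: Test) -> Tuple[List[Record], List[Record]]:
    """Split S into S1 (test is true) and S2 (test is false)."""
    s1 = [r for r in records if test(r)]
    s2 = [r for r in records if not test(r)]
    return s1, s2


records = [
    {"age": 23, "car": "sports", "risk": "high"},
    {"age": 45, "car": "family", "risk": "low"},
    {"age": 31, "car": "truck", "risk": "low"},
]
s1, s2 = partition(records, numeric_test("age", 30))
print(len(s1), len(s2))
```

Every record lands in exactly one partition, so n1 + n2 = n.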
Init & breadth-first
Compute entropy
split
leaf
node
prune
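The "Compute entropy" step of the flow above can be sketched as follows. This is a minimal version that assumes class labels are given as plain lists and uses base-2 logarithms; the function names `entropy` and `split_entropy` are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the class distribution in `labels`."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def split_entropy(left: Sequence[Hashable], right: Sequence[Hashable]) -> float:
    """Entropy of a binary split: size-weighted average over both partitions."""
    n = len(left) + len(right)
    if n == 0:
        return 0.0
    return (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)


# A candidate split is better when its weighted entropy is lower.
print(entropy(["high", "high", "low", "low"]))
print(split_entropy(["high", "high"], ["low", "low"]))
```

The builder would evaluate `split_entropy` for each candidate split and keep the one with the lowest value.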
C(S)+1
Less than C(S)+1
/* n1, …, nk are sorted in decreasing order */
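A lower bound on subtree cost that is compared against C(S)+1 can be sketched as below. The slides give only the comment about sorted class counts, so everything else is an assumption: the bound is taken to have the form 2s + 1 + s·log2(a) + (n_{s+2} + … + n_k) for s splits and a attributes, and the names `subtree_cost_lower_bound`, `num_attributes`, and `leaf_cost` (standing for a precomputed C(S)) are hypothetical.

```python
import math
from typing import Sequence


def subtree_cost_lower_bound(class_counts: Sequence[int],
                             num_attributes: int,
                             leaf_cost: float) -> float:
    """Return min(C(S) + 1, assumed lower bound over numbers of splits s)."""
    # n1, ..., nk sorted in decreasing order, as in the slide's comment.
    counts = sorted(class_counts, reverse=True)
    k = len(counts)
    if k <= 1:
        return leaf_cost + 1  # a pure node is cheapest as a leaf

    log_a = math.log2(num_attributes)
    s = 1
    # counts[s + 1:] corresponds to n_{s+2}, ..., n_k in 1-based notation.
    bound = 2 * s + 1 + s * log_a + sum(counts[s + 1:])
    # One more split is worthwhile while it removes more than it costs.
    while s + 1 < k and counts[s + 1] > 2 + log_a:
        bound += 2 + log_a - counts[s + 1]
        s += 1
    return min(leaf_cost + 1, bound)


print(subtree_cost_lower_bound([10, 8, 7], num_attributes=4, leaf_cost=100.0))
```

If the returned value equals C(S)+1, no subtree can be cheaper than a leaf under this bound, so the node does not need to be expanded.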
Thanks for your attention!