
Decision Tree Learning

Kelby Lee



Overview

  • What is a Decision Tree

  • ID3

  • REP

  • IREP

  • RIPPER

  • Application



What is Decision Tree

  • Select best attribute that classifies examples

  • Top Down

    • Start with a concept at the root that covers all the training examples

  • Greedy Algorithm

    • Select attribute that classifies maximum examples

  • Does not backtrack

  • ID3



ID3 Algorithm

  • ID3(Examples, Target_attribute, Attributes)

  • Create a Root node for the tree

  • If all Examples are positive

    • Return the single-node tree Root, with label = +

  • If all Examples are negative

    • Return the single-node tree Root, with label = -

  • If Attributes is empty

    • Return the single-node tree Root, with label = most common value of Target_attribute in Examples



ID3 Algorithm

  • Otherwise

    • A ← Best_Attribute(Attributes, Examples)

    • Root ← A

      • For each value vi of A

        • Add a new tree branch for A = vi

        • Let Examples_vi be the subset of Examples with value vi for A

          • If Examples_vi is empty, add a leaf node with label = most common value of Target_attribute in Examples

          • Otherwise, add the subtree ID3(Examples_vi, Target_attribute, Attributes – {A})
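As a concrete illustration of the two slides above, here is a minimal Python sketch of the same recursion. It is not the presenter's code: examples are assumed to be dicts keyed by attribute name plus a target key, and the gain measure anticipates the Information Gain slides that follow.

import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a list of class labels.
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, attribute, target):
    # Parent entropy minus the size-weighted entropy of each child subset.
    parent = entropy([ex[target] for ex in examples])
    remainder = 0.0
    for value in set(ex[attribute] for ex in examples):
        subset = [ex[target] for ex in examples if ex[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return parent - remainder

def id3(examples, target, attributes):
    # Greedy, top-down, no backtracking -- mirrors the pseudocode above.
    labels = [ex[target] for ex in examples]
    if len(set(labels)) == 1:                      # all positive or all negative
        return labels[0]
    if not attributes:                             # no attributes left to split on
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    tree = {best: {}}
    for value in set(ex[best] for ex in examples):
        subset = [ex for ex in examples if ex[best] == value]
        tree[best][value] = id3(subset, target, [a for a in attributes if a != best])
    return tree

Calling id3 on a list of such dicts returns a nested dict whose inner keys are attribute values and whose leaves are class labels.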



Selecting Best Attribute

  • Candidate attributes are ranked by a new measure: Information Gain

  • Information Gain: Measures how well a given attribute separates the training examples according to their target classification


Information Gain

  • Example: training examples {E1+, E2+, E3-, E4-}

  • Splitting on att1 gives {E1+, E2+} and {E3-, E4-}

  • Splitting on att2 gives {E1+, E3-} and {E2+, E4-}

  • Gain: att1 = 1, att2 = 0.5
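The comparison can be checked numerically. A small sketch using the entropy-based gain formula; with that standard measure att1 scores 1.0 and att2 scores 0.0, so the slide's 0.5 for att2 may reflect a simpler score (for example, the fraction of examples a majority vote in each branch classifies correctly).

import math
from collections import Counter

def entropy(labels):
    counts = Counter(labels)
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def gain(parent, children):
    # Information gain = parent entropy minus size-weighted child entropy.
    n = len(parent)
    return entropy(parent) - sum(len(c) / n * entropy(c) for c in children)

parent = ['+', '+', '-', '-']                    # {E1+, E2+, E3-, E4-}
print(gain(parent, [['+', '+'], ['-', '-']]))    # att1: pure children  -> 1.0
print(gain(parent, [['+', '-'], ['+', '-']]))    # att2: mixed children -> 0.0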



Tree Pruning

  • Grow a tree that may overfit the training data, then simplify it

  • Simplifying (pruning) the tree reduces overfitting

  • In most cases pruning improves accuracy on unseen examples



REP

  • Reduced Error Pruning

  • Deletes Single Conditions or Single Rules

  • Improves accuracy on noisy data

  • O(n⁴) on large data sets
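One way to picture the single-condition deletions is the sketch below. The rule representation (a list of (attribute, value) equality tests) and the pruning-set error measure are assumptions made for illustration, not part of the original slides.

def covers(rule, example):
    # A rule covers an example when every (attribute, value) condition matches.
    return all(example.get(attr) == value for attr, value in rule)

def error_rate(rule, prune_pos, prune_neg):
    # Mistakes on the pruning set: uncovered positives plus covered negatives.
    mistakes = sum(not covers(rule, ex) for ex in prune_pos)
    mistakes += sum(covers(rule, ex) for ex in prune_neg)
    return mistakes / max(1, len(prune_pos) + len(prune_neg))

def reduced_error_prune(rule, prune_pos, prune_neg):
    # Greedily drop single conditions while accuracy on the pruning set does not get worse.
    improved = True
    while improved and len(rule) > 1:
        improved = False
        base = error_rate(rule, prune_pos, prune_neg)
        for i in range(len(rule)):
            candidate = rule[:i] + rule[i + 1:]
            if error_rate(candidate, prune_pos, prune_neg) <= base:
                rule, improved = candidate, True
                break
    return rule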



IREP

  • Incremental Reduced Error Pruning

  • Produces one rule at a time and eliminates all examples covered by that rule

  • Stops when no positive examples remain or the pruned rule's error on the pruning data exceeds 50%



IREP Algorithm

PROCEDURE IREP(Pos, Neg)

BEGIN

Ruleset := ∅

WHILE Pos ≠ ∅ DO

/* Grow and Prune a New Rule */

split (Pos, Neg) into (GrowPos, GrowNeg) and (PrunePos, PruneNeg)

Rule := GrowRule( GrowPos, GrowNeg )

Rule := PruneRule( Rule, PrunePos, PruneNeg )



IREP Algorithm

IF error rate of Rule on ( PrunePos, PruneNeg ) exceeds 50% THEN

RETURN Ruleset

ELSE

Add Rule to Ruleset

Remove examples covered by Rule from ( Pos, Neg )

ENDIF

ENDWHILE

RETURN Ruleset

END
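Read end to end, the procedure can be sketched in Python as below. The two-thirds/one-third grow/prune split and the idea of passing GrowRule, PruneRule and the error measure in as functions are assumptions to keep the sketch generic; only the control flow follows the pseudocode.

import random

def covers(rule, example):
    # Assumed representation: a rule is a list of (attribute, value) tests.
    return all(example.get(a) == v for a, v in rule)

def irep(pos, neg, grow_rule, prune_rule, error_rate):
    # Sketch of the IREP loop: grow one rule, prune it, then stop or keep it.
    ruleset = []
    pos, neg = list(pos), list(neg)
    while pos:
        # Split the remaining data into a growing set and a pruning set.
        random.shuffle(pos)
        random.shuffle(neg)
        gp, gn = 2 * len(pos) // 3, 2 * len(neg) // 3
        grow_pos, prune_pos = pos[:gp], pos[gp:]
        grow_neg, prune_neg = neg[:gn], neg[gn:]
        # Grow a single rule, then prune it against the held-out split.
        rule = grow_rule(grow_pos, grow_neg)
        rule = prune_rule(rule, prune_pos, prune_neg)
        # Stop when the pruned rule does worse than chance on the pruning set.
        if error_rate(rule, prune_pos, prune_neg) > 0.5:
            return ruleset
        ruleset.append(rule)
        # Remove every example the new rule covers, then continue.
        pos = [ex for ex in pos if not covers(rule, ex)]
        neg = [ex for ex in neg if not covers(rule, ex)]
    return ruleset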



RIPPER

  • Repeated grow-and-simplify can produce rule sets quite different from those found by REP

  • Repeatedly prune the rule set to minimize the error

  • Repeated Incremental Pruning to Produce Error Reduction (RIPPER)



RIPPER Algorithm

PROCEDURE RIPPERk (Pos, Neg)

BEGIN

Ruleset := IREP(Pos, Neg)

REPEAT k TIMES

Ruleset := Optimize(Ruleset, Pos, Neg)

UncovPos := Pos \ {data covered by Ruleset}

UncovNeg := Neg \ {data covered by Ruleset}

Ruleset := Ruleset ∪ IREP(UncovPos, UncovNeg)

ENDREPEAT

RETURN Ruleset

END
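The outer loop is essentially a wrapper around IREP plus the Optimize pass described on the next slides. A sketch, assuming irep, optimize and covers helpers shaped like the pseudocode:

def ripper_k(pos, neg, k, irep, optimize, covers):
    # Build an initial ruleset with IREP, then k times optimize it and
    # patch any still-uncovered positives with additional IREP rules.
    ruleset = irep(pos, neg)
    for _ in range(k):
        ruleset = optimize(ruleset, pos, neg)
        uncov_pos = [ex for ex in pos if not any(covers(r, ex) for r in ruleset)]
        uncov_neg = [ex for ex in neg if not any(covers(r, ex) for r in ruleset)]
        ruleset = ruleset + irep(uncov_pos, uncov_neg)
    return ruleset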



Optimization Function

FUNCTION Optimize (Ruleset, Pos, Neg)

BEGIN

FOR each rule r ∈ Ruleset DO

split ( Pos, Neg ) into (GrowPos, GrowNeg) and (PrunePos, PruneNeg)

/* Compute Replacement for r */

r' := GrowRule( GrowPos, GrowNeg )

r' := PruneRule( r', PrunePos, PruneNeg ),

guided by the error of Ruleset \ {r} ∪ {r'}



Optimization Function

/* Compute Revision of r */

r'' := GrowRule( GrowPos, GrowNeg ), starting from r

r'' := PruneRule( r'', PrunePos, PruneNeg ),

guided by the error of Ruleset \ {r} ∪ {r''}

Replace r in Ruleset with the best of r, r', r'', guided by the description length of

Compress( Ruleset \ {r} ∪ {x} ), for each candidate x

ENDFOR

RETURN Ruleset

END
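A simplified sketch of that replacement step. RIPPER proper scores candidates by the description length of the compressed ruleset; the version below substitutes error on the pruning split purely to keep the example short, and every helper function here is an assumed stand-in.

def optimize(ruleset, pos, neg, split, grow_rule, revise_rule, prune_rule, ruleset_error):
    # For each rule, build a replacement (grown from scratch) and a revision
    # (grown by extending the current rule), then keep whichever of the three scores best.
    for i, rule in enumerate(ruleset):
        grow_pos, grow_neg, prune_pos, prune_neg = split(pos, neg)
        replacement = prune_rule(grow_rule(grow_pos, grow_neg), prune_pos, prune_neg)
        revision = prune_rule(revise_rule(rule, grow_pos, grow_neg), prune_pos, prune_neg)
        candidates = [rule, replacement, revision]
        best = min(candidates,
                   key=lambda c: ruleset_error(ruleset[:i] + [c] + ruleset[i + 1:],
                                               prune_pos, prune_neg))
        ruleset[i] = best
    return ruleset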



RIPPER Data

3,6.0E+00,6.0E+00,4.0E+00,none,35,empl_contr,7.444444444444445E+00,14,false,9,gnr,true,full,true,full,good.

2,4.5E+00,4.0E+00,3.913333333333334E+00,none,40,empl_contr,7.444444444444445E+00,4,false,10,gnr,true,half,true,full,good.

3,5.0E+00,5.0E+00,5.0E+00,none,40,empl_contr,7.444444444444445E+00,4.870967741935484E+00,false,12,avg,true,half,true,half,good.

2,4.6E+00,4.6E+00,3.913333333333334E+00,tcf,38,empl_contr,7.444444444444445E+00,4.870967741935484E+00,false,1.109433962264151E+01,ba,true,half,true,half,good.



RIPPER Names file

good,bad.

dur:continuous.

wage1:continuous.

wage2:continuous.

wage3:continuous.

cola:none, tcf, tc.

hours:continuous.

pension:none, ret_allw, empl_contr.

stby_pay:continuous.

shift_diff:continuous.

educ_allw:false, true.

holidays:continuous.

vacation:ba, avg, gnr.

lngtrm_disabil:false, true.

dntl_ins:none, half, full.

bereavement:false, true.

empl_hplan:none, half, full.
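The names file is the schema used to interpret each comma-separated column of the data file: the first line lists the class labels, and every following line declares one column as continuous or as a fixed set of symbols. Below is a sketch of loading the labor data under that schema; the file path and parsing details are assumptions for illustration.

import csv

# Column declarations transcribed from the names file above:
# name -> 'continuous' or a tuple of allowed symbols.
SCHEMA = [
    ('dur', 'continuous'), ('wage1', 'continuous'), ('wage2', 'continuous'),
    ('wage3', 'continuous'), ('cola', ('none', 'tcf', 'tc')),
    ('hours', 'continuous'), ('pension', ('none', 'ret_allw', 'empl_contr')),
    ('stby_pay', 'continuous'), ('shift_diff', 'continuous'),
    ('educ_allw', ('false', 'true')), ('holidays', 'continuous'),
    ('vacation', ('ba', 'avg', 'gnr')), ('lngtrm_disabil', ('false', 'true')),
    ('dntl_ins', ('none', 'half', 'full')), ('bereavement', ('false', 'true')),
    ('empl_hplan', ('none', 'half', 'full')),
]

def load_examples(path):
    # Each row is 16 attribute values followed by the class ('good' or 'bad');
    # continuous columns become floats, the trailing '.' is stripped from the label.
    examples = []
    with open(path) as f:
        for row in csv.reader(f):
            if not row:
                continue
            values, label = row[:-1], row[-1].rstrip('.')
            ex = {'class': label}
            for (name, kind), value in zip(SCHEMA, values):
                ex[name] = float(value) if kind == 'continuous' else value
            examples.append(ex)
    return examples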



RIPPER Output

Final hypothesis is:

bad :- wage1<=2.8 (14/3).

bad :- lngtrm_disabil=false (5/0).

default good (34/1).

=====================summary==================

Train error rate: 7.02% +/- 3.41% (57 datapoints) <<

Hypothesis size: 2 rules, 4 conditions

Learning time: 0.01 sec



RIPPER Hypothesis

bad 14 3 IF wage1 <= 2.8 .

bad 5 0 IF lngtrm_disabil = false .

good 34 1 IF .

.
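Read top to bottom, the first matching rule fires and the final rule is the default. A sketch of applying the three learned rules to a single record; the record and the dict encoding are illustrative only.

def classify(record):
    # Apply the learned labor-negotiation rules in order; the last rule is the default.
    if record.get('wage1', float('inf')) <= 2.8:          # bad :- wage1<=2.8 (14/3)
        return 'bad'
    if record.get('lngtrm_disabil') == 'false':           # bad :- lngtrm_disabil=false (5/0)
        return 'bad'
    return 'good'                                          # default good (34/1)

# A contract with a low first-year wage increase falls under the first rule.
print(classify({'wage1': 2.5, 'lngtrm_disabil': 'true'}))   # -> bad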



IDS

  • Intrusion Detection System



IDS

  • Use data mining to detect anomalies

  • Better than pattern matching, since it may detect previously unknown attacks



RIPPER IDS data

86,543520084,192168000120,2698,192168000190,22,6,17,40,2096,158723779,14054,normal.

87,543520084,192168000190,22,192168000120,2698,6,16,40,58387,39130843,46725,normal.

...........................

11,543520084,192168000190,80,192168000120,2703,6,16,40,58400,39162494,46738,anomaly.

12,543520084,192168000190,80,192168000120,2703,6,16,1500,58400,39162494,45277,anomaly.



RIPPER IDS names

normal,anomaly.

recID: ignore.

timestamp: symbolic.

sourceIP: set.

sourcePORT: symbolic.

destIP: set.

destPORT: symbolic.

protocol: symbolic.

flags: symbolic.

length: symbolic.

winsize: symbolic.

ack: symbolic.

checksum: symbolic.



RIPPER Output

Final hypothesis is:

anomaly :- sourcePORT='80' (33/0).

anomaly :- destPORT='80' (35/0).

anomaly :- ack='7.01238e+07' (3/0).

anomaly :- ack='7.03859e+07' (2/0).

default normal (87/0).

=================summary=====================

Train error rate: 0.00% +/- 0.00% (160 datapoints) <<

Hypothesis size: 4 rules, 8 conditions

Learning time: 0.01 sec



RIPPER Output

anomaly 33 0 IF sourcePORT = 80 .

anomaly 35 0 IF destPORT = 80 .

anomaly 3 0 IF ack = 7.01238e+07 .

anomaly 2 0 IF ack = 7.03859e+07 .

normal 87 0 IF .

.


IDS Output



Conclusion

  • What is a Decision Tree

  • ID3

  • REP

  • IREP

  • RIPPER

  • Application

