- By **abiba**

## PowerPoint Slideshow about 'Decision Tree Learning' - abiba

Presentation Transcript

### Decision Tree Learning

Kelby Lee

Overview

- What is a Decision Tree
- ID3
- REP
- IREP
- RIPPER
- Application

What is a Decision Tree

- Select the best attribute that classifies the examples
- Top down
  - Start with a concept that represents all examples
- Greedy algorithm
  - Select the attribute that classifies the most examples
  - Does not backtrack
- ID3

ID3 Algorithm

- ID3(Examples, Target_attribute, Attributes)
  - Create a Root node for the tree
  - If all Examples are positive, return the single-node tree Root with label = +
  - If all Examples are negative, return the single-node tree Root with label = -
  - If Attributes is empty, return the single-node tree Root with label = most common value of Target_attribute in Examples
  - Otherwise
    - A ← Best_Attribute(Attributes, Examples)
    - Root ← A (A becomes the decision attribute for Root)
    - For each value vi of A
      - Add a new tree branch below Root for the test A = vi
      - Examples_vi is the subset of Examples that have value vi for A
      - If Examples_vi is empty, add a leaf node with label = most common value of Target_attribute in Examples
      - Otherwise, add a new subtree: ID3(Examples_vi, Target_attribute, Attributes – {A})
  - Return Root
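The recursion above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the dict-based row format, the `domains` argument (the possible values of each attribute) and the nested-dict tree representation are assumptions made for the example, and `best_attribute` is assumed to pick the attribute with the highest entropy-based information gain.

```python
from collections import Counter
from math import log2

def entropy(rows, target):
    """Shannon entropy (bits) of the target labels in rows."""
    counts = Counter(r[target] for r in rows)
    return -sum(n / len(rows) * log2(n / len(rows)) for n in counts.values())

def best_attribute(rows, target, attributes):
    """Assumed criterion: highest entropy-based information gain."""
    def gain(attr):
        remainder = 0.0
        for value in {r[attr] for r in rows}:
            subset = [r for r in rows if r[attr] == value]
            remainder += len(subset) / len(rows) * entropy(subset, target)
        return entropy(rows, target) - remainder
    return max(attributes, key=gain)

def id3(rows, target, attributes, domains):
    """Return a leaf label, or a nested dict {attribute: {value: subtree}}."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # All examples share one label, or no attributes are left: make a leaf.
    if len(set(labels)) == 1 or not attributes:
        return majority
    a = best_attribute(rows, target, attributes)
    remaining = [x for x in attributes if x != a]
    node = {a: {}}
    for value in domains[a]:
        subset = [r for r in rows if r[a] == value]
        # Empty branch: fall back to the most common label at this node.
        node[a][value] = id3(subset, target, remaining, domains) if subset else majority
    return node

# Hypothetical four-example data set (two positive, two negative).
rows = [
    {"att1": "a", "att2": "x", "label": "+"},
    {"att1": "a", "att2": "y", "label": "+"},
    {"att1": "b", "att2": "x", "label": "-"},
    {"att1": "b", "att2": "y", "label": "-"},
]
domains = {"att1": ["a", "b"], "att2": ["x", "y"]}
tree = id3(rows, "label", ["att1", "att2"], domains)
print(tree)  # {'att1': {'a': '+', 'b': '-'}}
```

Because `att1` separates the classes perfectly, the recursion stops after one split and `att2` is never tested.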

Selecting Best Attribute

- New property of Attribute: Information Gain
- Information Gain: Measures how well a given attribute separates the training examples according to their target classification

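Example: splitting {E1+, E2+, E3-, E4-}

- att1 splits the examples into {E1+, E2+} and {E3-, E4-}
- att2 splits the examples into {E1+, E3-} and {E2+, E4-}

Information Gain: att1 = 1 (both branches are pure), att2 = 0 (each branch is still half positive and half negative)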
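As a worked check, here is a small sketch of the entropy-based definition of information gain that ID3 uses, applied to four examples with two positives and two negatives. The function names are illustrative. A split into pure branches yields a gain of 1 bit, while a split whose branches each stay half positive and half negative yields 0.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(parent, branches):
    """Entropy of the parent minus the size-weighted entropy of its branches."""
    total = len(parent)
    remainder = sum(len(b) / total * entropy(b) for b in branches)
    return entropy(parent) - remainder

parent = ["+", "+", "-", "-"]  # E1+, E2+, E3-, E4-

# Split into {E1+, E2+} and {E3-, E4-}: both branches are pure.
gain_pure = information_gain(parent, [["+", "+"], ["-", "-"]])

# Split into {E1+, E3-} and {E2+, E4-}: both branches stay 50/50.
gain_mixed = information_gain(parent, [["+", "-"], ["+", "-"]])

print(gain_pure, gain_mixed)
```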

Tree Pruning

- Overfit and simplify: grow a tree that overfits the training data, then simplify it
- Simplifying the tree improves accuracy on unseen data in most cases

REP

- Reduced Error Pruning
- Deletes Single Conditions or Single Rules
- Improves on Noisy Data
- O(n^4) on large data sets
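The pruning loop described above can be sketched as follows. This is an illustrative simplification, assuming a rule is a list of `(attribute, value)` equality tests, a rule set predicts the positive class when any rule matches, and pruning continues as long as the error on a held-out pruning set does not increase. Each pass re-evaluates every candidate deletion against the whole pruning set, which hints at why the cost grows so quickly on large data sets.

```python
def covers(rule, example):
    """A rule matches when all of its (attribute, value) tests hold."""
    return all(example[a] == v for a, v in rule)

def error(ruleset, pos, neg):
    """Fraction of pruning examples misclassified by the rule set."""
    fn = sum(not any(covers(r, e) for r in ruleset) for e in pos)
    fp = sum(any(covers(r, e) for r in ruleset) for e in neg)
    return (fn + fp) / (len(pos) + len(neg))

def simplifications(ruleset):
    """Yield every rule set obtained by deleting one rule or one condition."""
    for i, rule in enumerate(ruleset):
        yield ruleset[:i] + ruleset[i + 1:]
        if len(rule) > 1:
            for j in range(len(rule)):
                yield ruleset[:i] + [rule[:j] + rule[j + 1:]] + ruleset[i + 1:]

def reduced_error_prune(ruleset, prune_pos, prune_neg):
    """Greedily apply the best deletion until every deletion raises the error."""
    current = error(ruleset, prune_pos, prune_neg)
    while True:
        candidates = list(simplifications(ruleset))
        if not candidates:
            return ruleset
        best = min(candidates, key=lambda rs: error(rs, prune_pos, prune_neg))
        best_err = error(best, prune_pos, prune_neg)
        if best_err > current:
            return ruleset
        ruleset, current = best, best_err

# Hypothetical overfitted rule set and pruning data.
overfit = [[("port", "80"), ("flags", "16")], [("ack", "123")]]
prune_pos = [{"port": "80", "flags": "16", "ack": "1"},
             {"port": "80", "flags": "17", "ack": "2"}]
prune_neg = [{"port": "22", "flags": "16", "ack": "123"},
             {"port": "22", "flags": "17", "ack": "3"}]
pruned = reduced_error_prune(overfit, prune_pos, prune_neg)
print(pruned)  # [[('port', '80')]]
```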

IREP

- Incremental Reduced Error Pruning
- Produces one rule at a time and eliminates all examples covered by that rule
- Stops when no positive examples remain or when the pruned rule has an unacceptable error rate

IREP Algorithm

PROCEDURE IREP(Pos, Neg)

BEGIN

Ruleset := ∅

WHILE Pos ≠ ∅ DO

/* Grow and Prune a New Rule */

split (Pos, Neg) into (GrowPos, GrowNeg) and (PrunePos, PruneNeg)

Rule := GrowRule(GrowPos, GrowNeg)

Rule := PruneRule(Rule, PrunePos, PruneNeg)

IF error rate of Rule on (PrunePos, PruneNeg) exceeds 50% THEN

RETURN Ruleset

ELSE

Add Rule to Ruleset

Remove examples covered by Rule from (Pos, Neg)

ENDIF

ENDWHILE

RETURN Ruleset

END
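A runnable sketch of the procedure follows. The slides do not define GrowRule, PruneRule or the split, so the versions here are simplifying assumptions: rules are lists of `(attribute, value)` equality tests, growing greedily adds the most precise test until no negative growing example is covered, pruning drops trailing tests while the pruning-set score does not fall, two thirds of the data go to the growing set, and the "error rate" of a rule is taken to be the fraction of covered pruning examples that are negative.

```python
import random

def covers(rule, example):
    """A rule matches when all of its (attribute, value) tests hold."""
    return all(example[a] == v for a, v in rule)

def grow_rule(pos, neg):
    """Add the most precise test until no negative example is covered."""
    rule = []
    while any(covers(rule, e) for e in neg):
        p_cov = [e for e in pos if covers(rule, e)]
        n_cov = [e for e in neg if covers(rule, e)]
        used = {a for a, _ in rule}
        candidates = {(a, v) for e in p_cov for a, v in e.items() if a not in used}
        if not candidates:
            break
        def precision(test):
            p = sum(covers([test], e) for e in p_cov)
            n = sum(covers([test], e) for e in n_cov)
            return p / (p + n), p
        rule.append(max(sorted(candidates), key=precision))
    return rule

def rule_value(rule, pos, neg):
    """Fraction of pruning examples the rule classifies correctly."""
    total = len(pos) + len(neg)
    p = sum(covers(rule, e) for e in pos)
    n = sum(covers(rule, e) for e in neg)
    return (p + len(neg) - n) / total if total else 0.0

def prune_rule(rule, pos, neg):
    """Drop trailing tests while the pruning-set score does not decrease."""
    best = rule
    for k in range(len(rule) - 1, 0, -1):
        if rule_value(rule[:k], pos, neg) >= rule_value(best, pos, neg):
            best = rule[:k]
    return best

def split(examples, rng, frac=2 / 3):
    """Shuffle and split into (growing set, pruning set)."""
    shuffled = examples[:]
    rng.shuffle(shuffled)
    k = max(1, round(len(shuffled) * frac)) if shuffled else 0
    return shuffled[:k], shuffled[k:]

def irep(pos, neg, seed=0):
    rng = random.Random(seed)
    pos, neg, ruleset = list(pos), list(neg), []
    while pos:
        grow_pos, prune_pos = split(pos, rng)
        grow_neg, prune_neg = split(neg, rng)
        rule = prune_rule(grow_rule(grow_pos, grow_neg), prune_pos, prune_neg)
        p = sum(covers(rule, e) for e in prune_pos)
        n = sum(covers(rule, e) for e in prune_neg)
        # Stop when the rule covers nothing or is wrong more than half the time.
        if p + n == 0 or n / (p + n) > 0.5:
            return ruleset
        ruleset.append(rule)
        pos = [e for e in pos if not covers(rule, e)]
        neg = [e for e in neg if not covers(rule, e)]
    return ruleset

# Hypothetical data: the port alone separates the two classes.
pos = [{"port": "80", "proto": "tcp"} for _ in range(6)]
neg = [{"port": "22", "proto": "tcp"} for _ in range(6)]
rules = irep(pos, neg)
print(rules)  # [[('port', '80')]]
```

Every accepted rule covers at least one remaining positive example, so `pos` shrinks on each pass and the loop terminates.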

RIPPER

- Repeated Grow and Simplify produces quite different results than REP
- Repeatedly prune the rule set to minimize the error
- Repeated Incremental Pruning to Produce Error Reduction (RIPPER)

RIPPER Algorithm

PROCEDURE RIPPERk (Pos, Neg)

BEGIN

Ruleset := IREP(Pos, Neg)

REPEAT k TIMES

Ruleset := Optimize(Ruleset, Pos, Neg)

UncovPos := Pos \ {data covered by Ruleset}

UncovNeg := Neg \ {data covered by Ruleset}

Ruleset := Ruleset ∪ IREP(UncovPos, UncovNeg)

ENDREPEAT

RETURN Ruleset

END
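The control flow of this outer loop is easy to express in code once IREP, Optimize and a coverage test are available. The sketch below takes all three as parameters; the toy stand-ins used to exercise it (examples as integers, a rule as the set of examples it covers) are purely hypothetical and do no real learning.

```python
def ripper_k(pos, neg, k, irep, optimize, covers):
    """Run IREP once, then k rounds of optimize-and-cover-the-rest."""
    ruleset = irep(pos, neg)
    for _ in range(k):
        ruleset = optimize(ruleset, pos, neg)
        # Examples not matched by any rule in the current rule set.
        uncov_pos = [e for e in pos if not any(covers(r, e) for r in ruleset)]
        uncov_neg = [e for e in neg if not any(covers(r, e) for r in ruleset)]
        # Learn extra rules for what is still uncovered and append them.
        ruleset = ruleset + irep(uncov_pos, uncov_neg)
    return ruleset

# Toy stand-ins, only to show the data flow.
def toy_irep(pos, neg):
    return [{pos[0]}] if pos else []   # one rule covering the first positive

def toy_optimize(ruleset, pos, neg):
    return ruleset                     # identity: no optimisation

def toy_covers(rule, example):
    return example in rule

result = ripper_k([1, 2, 3], [10], 2, toy_irep, toy_optimize, toy_covers)
print(result)  # [{1}, {2}, {3}]
```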

Optimization Function

FUNCTION Optimize (Ruleset, Pos, Neg)

BEGIN

FOR each rule r ∈ Ruleset DO

split (Pos, Neg) into (GrowPos, GrowNeg) and (PrunePos, PruneNeg)

/* Compute Replacement for r */

r’ := GrowRule (GrowPos, GrowNeg)

r’ := PruneRule (r’, PrunePos, PruneNeg) guided by error of Ruleset \ {r} ∪ {r’}

/* Compute Revision of r */

r’’ := GrowRule (r, GrowPos, GrowNeg)

r’’ := PruneRule (r’’, PrunePos, PruneNeg) guided by error of Ruleset \ {r} ∪ {r’’}

Replace r in Ruleset with the best x ∈ {r, r’, r’’} guided by description length of Compress(Ruleset \ {r} ∪ {x})

ENDFOR

RETURN Ruleset

END

RIPPER Data

3,6.0E+00,6.0E+00,4.0E+00,none,35,empl_contr,7.444444444444445E+00,14,false,9,gnr,true,full,true,full,good.

2,4.5E+00,4.0E+00,3.913333333333334E+00,none,40,empl_contr,7.444444444444445E+00,4,false,10,gnr,true,half,true,full,good.

3,5.0E+00,5.0E+00,5.0E+00,none,40,empl_contr,7.444444444444445E+00,4.870967741935484E+00,false,12,avg,true,half,true,half,good.

2,4.6E+00,4.6E+00,3.913333333333334E+00,tcf,38,empl_contr,7.444444444444445E+00,4.870967741935484E+00,false,1.109433962264151E+01,ba,true,half,true,half,good.

RIPPER Names file

good,bad.

dur: continuous.

wage1: continuous.

wage2: continuous.

wage3: continuous.

cola: none, tcf, tc.

hours: continuous.

pension: none, ret_allw, empl_contr.

stby_pay: continuous.

shift_diff: continuous.

educ_allw: false, true.

holidays: continuous.

vacation: ba, avg, gnr.

lngtrm_disabil: false, true.

dntl_ins: none, half, full.

bereavement: false, true.

empl_hplan: none, half, full.
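To show how the names file and the data file fit together, here is a small parsing sketch. It assumes only what the two listings show: the first names line lists the classes, each later line is `name: type-or-values.`, and each data row holds the attribute values in the same order followed by the class label and a trailing period. The function names are illustrative and are not part of RIPPER.

```python
KEYWORDS = {"continuous", "symbolic", "set", "ignore"}

NAMES = """
good,bad.
dur: continuous.
wage1: continuous.
wage2: continuous.
wage3: continuous.
cola: none, tcf, tc.
hours: continuous.
pension: none, ret_allw, empl_contr.
stby_pay: continuous.
shift_diff: continuous.
educ_allw: false, true.
holidays: continuous.
vacation: ba, avg, gnr.
lngtrm_disabil: false, true.
dntl_ins: none, half, full.
bereavement: false, true.
empl_hplan: none, half, full.
"""

def parse_names(text):
    """Return (classes, [(attribute, type keyword or list of values), ...])."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    classes = [c.strip() for c in lines[0].rstrip(".").split(",")]
    attributes = []
    for ln in lines[1:]:
        name, spec = ln.rstrip(".").split(":", 1)
        spec = spec.strip()
        kind = spec if spec in KEYWORDS else [v.strip() for v in spec.split(",")]
        attributes.append((name.strip(), kind))
    return classes, attributes

def parse_row(line, attributes):
    """Map one data row to ({attribute: value}, class label)."""
    *values, label = line.strip().rstrip(".").split(",")
    record = {}
    for (name, kind), raw in zip(attributes, values):
        # Continuous attributes become floats; everything else stays a string.
        record[name] = float(raw) if kind == "continuous" else raw
    return record, label

classes, attributes = parse_names(NAMES)
row = ("3,6.0E+00,6.0E+00,4.0E+00,none,35,empl_contr,7.444444444444445E+00,"
       "14,false,9,gnr,true,full,true,full,good.")
record, label = parse_row(row, attributes)
print(label, record["wage1"], record["pension"])  # good 6.0 empl_contr
```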

RIPPER Output

Final hypothesis is:

bad :- wage1<=2.8 (14/3).

bad :- lngtrm_disabil=false (5/0).

default good (34/1).

=====================summary==================

Train error rate: 7.02% +/- 3.41% (57 datapoints) <<

Hypothesis size: 2 rules, 4 conditions

Learning time: 0.01 sec
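The hypothesis reads as an ordered rule list: the first rule whose condition holds decides the class, and the default applies otherwise. The sketch below encodes the two printed conditions by hand to illustrate that reading. Treating each parenthesised pair as (examples matched correctly / incorrectly) is an assumption, but it agrees with the summary: (3 + 0 + 1) / 57 = 7.02%.

```python
import operator

# (class, [(attribute, comparison, value), ...]) in the order printed.
RULES = [
    ("bad", [("wage1", operator.le, 2.8)]),
    ("bad", [("lngtrm_disabil", operator.eq, "false")]),
]
DEFAULT = "good"

def classify(record, rules=RULES, default=DEFAULT):
    """Return the class of the first matching rule, else the default."""
    for label, conditions in rules:
        if all(op(record[attr], value) for attr, op, value in conditions):
            return label
    return default

# Counts copied from the output: (correct, incorrect) per rule and default.
counts = [(14, 3), (5, 0), (34, 1)]
total = sum(right + wrong for right, wrong in counts)
errors = sum(wrong for _, wrong in counts)
train_error = round(100 * errors / total, 2)

print(classify({"wage1": 2.0, "lngtrm_disabil": "true"}))   # bad
print(classify({"wage1": 6.0, "lngtrm_disabil": "true"}))   # good
print(total, train_error)                                   # 57 7.02
```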

IDS

- Intrusion Detection System
- Use data mining to detect anomalies
- Better than pattern matching, since it may detect previously unknown attacks

RIPPER IDS data

86,543520084,192168000120,2698,192168000190,22,6,17,40,2096,158723779,14054,normal.

87,543520084,192168000190,22,192168000120,2698,6,16,40,58387,39130843,46725,normal.

...........................

11,543520084,192168000190,80,192168000120,2703,6,16,40,58400,39162494,46738,anomaly.

12,543520084,192168000190,80,192168000120,2703,6,16,1500,58400,39162494,45277,anomaly.

RIPPER IDS names

normal,anomaly.

recID: ignore.

timestamp: symbolic.

sourceIP: set.

sourcePORT: symbolic.

destIP: set.

destPORT: symbolic.

protocol: symbolic.

flags: symbolic.

length: symbolic.

winsize: symbolic.

ack: symbolic.

checksum: symbolic.

RIPPER Output

Final hypothesis is:

anomaly :- sourcePORT='80' (33/0).

anomaly :- destPORT='80' (35/0).

anomaly :- ack='7.01238e+07' (3/0).

anomaly :- ack='7.03859e+07' (2/0).

default normal (87/0).

=================summary=====================

Train error rate: 0.00% +/- 0.00% (160 datapoints) <<

Hypothesis size: 4 rules, 8 conditions

Learning time: 0.01 sec

RIPPER Output

anomaly 33 0 IF sourcePORT = 80 .

anomaly 35 0 IF destPORT = 80 .

anomaly 3 0 IF ack = 7.01238e+07 .

anomaly 2 0 IF ack = 7.03859e+07 .

normal 87 0 IF .

.
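This second listing is easy to read back into a program. The sketch below is a guess at the layout based only on the lines shown: class, two counts, the keyword `IF`, at most one `attribute = value` condition, and a closing period. How rules with several conditions are written is not shown, so that case is not handled.

```python
HYPOTHESIS = """
anomaly 33 0 IF sourcePORT = 80 .
anomaly 35 0 IF destPORT = 80 .
anomaly 3 0 IF ack = 7.01238e+07 .
anomaly 2 0 IF ack = 7.03859e+07 .
normal 87 0 IF .
.
"""

def parse_hypothesis(text):
    """Return a list of (class, correct, incorrect, conditions) tuples."""
    rules = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4 or parts[3] != "IF":
            continue  # skip blank lines and the final lone period
        label, right, wrong = parts[0], int(parts[1]), int(parts[2])
        body = parts[4:-1] if parts[-1] == "." else parts[4:]
        # An empty body is the default rule; otherwise keep the single test.
        conditions = [tuple(body)] if body else []
        rules.append((label, right, wrong, conditions))
    return rules

rules = parse_hypothesis(HYPOTHESIS)
covered = sum(right + wrong for _, right, wrong, _ in rules)
print(len(rules), covered)  # 5 160
```

The counts add up to the 160 training datapoints reported in the summary, with no misclassified examples.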

Conclusion

- What is a Decision Tree
- ID3
- REP
- IREP
- RIPPER
- Application
