Advanced Covering Algorithms in Data Mining

Data Mining
CSCI 307, Spring 2019
Lecture 17: Covering algorithms II

Covering Example, continued
After the first rule is established, delete the instances it covers and start again on the "fresh" data set.

Example part 2: Contact Lens Data
Rule we seek: If ? then recommendation = hard
Possible tests:
• Age = Young
• Age = Pre-presbyopic
• Age = Presbyopic
• Spectacle prescription = Myope
• Spectacle prescription = Hypermetrope
• Astigmatism = no
• Astigmatism = yes
• Tear production rate = Reduced
• Tear production rate = Normal
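A minimal sketch of how the candidate tests can be scored on the "fresh" data, using the accuracy p/t (p = covered instances of the target class, t = all covered instances). It rests on two assumptions: that the rows below match the standard 24-instance contact lens data, and that the first rule from part 1 is the astigmatism / tear production rate / myope rule. The names `DATA`, `FIRST_RULE` and `score_tests` are illustrative only.

```python
from itertools import product

ATTRS = ["age", "spectacle", "astigmatism", "tear"]
VALUES = [
    ["young", "pre-presbyopic", "presbyopic"],
    ["myope", "hypermetrope"],
    ["no", "yes"],
    ["reduced", "normal"],
]
# Assumed class labels, one per attribute combination in product() order.
LABELS = ("none soft none hard none soft none hard "
          "none soft none hard none soft none none "
          "none none none hard none soft none none").split()
DATA = [dict(zip(ATTRS, combo), lens=label)
        for combo, label in zip(product(*VALUES), LABELS)]

# Assumed first rule for "hard" lenses (part 1 of the example).
FIRST_RULE = {"astigmatism": "yes", "tear": "normal", "spectacle": "myope"}


def covered_by(rule, inst):
    """True if the instance satisfies every attribute = value test in the rule."""
    return all(inst[a] == v for a, v in rule.items())


# "Fresh" data set: everything the first rule does not cover.
fresh = [x for x in DATA if not covered_by(FIRST_RULE, x)]


def score_tests(instances, target):
    """Return {(attribute, value): (p, t)} for every candidate test."""
    scores = {}
    for attr, values in zip(ATTRS, VALUES):
        for value in values:
            subset = [x for x in instances if x[attr] == value]
            p = sum(x["lens"] == target for x in subset)
            scores[(attr, value)] = (p, len(subset))
    return scores


scores = score_tests(fresh, "hard")
for (attr, value), (p, t) in scores.items():
    print(f"{attr} = {value}: {p}/{t}")
```

Under these assumptions, age = young scores 1/7, ahead of astigmatism = yes and tear production rate = normal at 1/9 each, so an age test is the one added first.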

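Example part 2: Modified Rule and Resulting Data
Rule with the best test added: If age = young then recommendation = hard
Instances covered by modified rule: the 7 remaining instances with age = young, of which 1 is hard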

Example part 2: Refine
Rule we seek: If age = young and ? then recommendation = hard
Possible tests:
• Spectacle prescription = Myope
• Spectacle prescription = Hypermetrope
• Astigmatism = no
• Astigmatism = yes
• Tear production rate = Reduced
• Tear production rate = Normal

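Example part 2: Modified Rule and Resulting Data
Rule with the best test added: If age = young and astigmatism = yes then recommendation = hard
Instances covered by modified rule: the 3 remaining instances with age = young and astigmatism = yes, of which 1 is hard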

Example part 2: Refine More
Rule we seek: If age = young and astigmatism = yes and ? then recommendation = hard
Possible tests:
• Spectacle prescription = Myope
• Spectacle prescription = Hypermetrope
• Tear production rate = Reduced
• Tear production rate = Normal

The Result
Final rules:
• First rule: If astigmatism = yes and tear production rate = normal and spectacle prescription = myope then recommendation = hard
• Second rule for recommending "hard lenses" (built from the instances not covered by the first rule): If age = young and astigmatism = yes and tear production rate = normal then recommendation = hard

The Rest of the PRISM Rules
• If astigmatism = no and tear-prod-rate = normal and spectacle-prescription = hypermetrope then soft
• If astigmatism = no and tear-prod-rate = normal and age = young then soft
• If age = pre-presbyopic and astigmatism = no and tear-prod-rate = normal then soft
• If tear-prod-rate = reduced then none
• If age = presbyopic and tear-prod-rate = normal and spectacle-prescription = myope and astigmatism = no then none
• If spectacle-prescription = hypermetrope and astigmatism = yes and age = pre-presbyopic then none
• If age = presbyopic and spectacle-prescription = hypermetrope and astigmatism = yes then none

Pseudo-Code for PRISM
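A minimal Python sketch of PRISM for a single class, assuming the usual formulation: start with an empty rule, repeatedly add the attribute = value test with the highest accuracy p/t (ties broken by the larger p), stop when the rule covers only the target class or no attributes remain, then remove the covered instances and repeat. The data rows are assumed to match the standard 24-instance contact lens data, and names such as `learn_rules_for_class` are illustrative, not taken from the lecture.

```python
from itertools import product

ATTRS = ["age", "spectacle", "astigmatism", "tear"]
VALUES = [
    ["young", "pre-presbyopic", "presbyopic"],
    ["myope", "hypermetrope"],
    ["no", "yes"],
    ["reduced", "normal"],
]
# Assumed class labels, one per attribute combination in product() order.
LABELS = ("none soft none hard none soft none hard "
          "none soft none hard none soft none none "
          "none none none hard none soft none none").split()
DATA = [dict(zip(ATTRS, combo), lens=label)
        for combo, label in zip(product(*VALUES), LABELS)]


def covers(rule, inst):
    """A rule is a list of (attribute, value) tests; all must hold."""
    return all(inst[a] == v for a, v in rule)


def learn_rules_for_class(instances, target, label="lens"):
    """Inner part of PRISM: build rules for one class until it is covered."""
    remaining = list(instances)
    rules = []
    while any(x[label] == target for x in remaining):
        rule, covered = [], remaining
        # Grow the rule until it is perfect or no attributes are left.
        while (any(x[label] != target for x in covered)
               and len(rule) < len(ATTRS)):
            used = {a for a, _ in rule}
            best, best_key = None, (-1.0, -1)
            for attr in ATTRS:
                if attr in used:
                    continue
                for value in dict.fromkeys(x[attr] for x in covered):
                    subset = [x for x in covered if x[attr] == value]
                    p = sum(x[label] == target for x in subset)
                    key = (p / len(subset), p)  # accuracy, then coverage
                    if key > best_key:  # strict: earlier test wins full ties
                        best, best_key = (attr, value), key
            rule.append(best)
            covered = [x for x in covered if covers([best], x)]
        rules.append(rule)
        # Separate out the instances the finished rule covers.
        remaining = [x for x in remaining if not covers(rule, x)]
    return rules


hard_rules = learn_rules_for_class(DATA, "hard")
for rule in hard_rules:
    tests = " and ".join(f"{a} = {v}" for a, v in rule)
    print(f"If {tests} then hard")
```

The outer loop of PRISM would simply call `learn_rules_for_class` once per class, each time starting from the full instance set.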

Rules versus Decision Lists
• PRISM with the outer loop removed generates a decision list for one class
• Subsequent rules are designed for instances that are not covered by previous rules
• But: order doesn't matter because all rules predict the same class
• The outer loop considers all classes separately
• No order dependence is implied
• Problems: overlapping rules, default rule required
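The points above can be illustrated with a small sketch that applies an unordered rule set. It uses one hard rule and the three soft rules from the slides; treating "none" as the default class is an assumption made for the illustration, and `predict` is a hypothetical helper name.

```python
# Each rule is (conditions, predicted class). Only hard and soft rules are
# listed here; "none" is used as the default class (an assumed choice).
RULES = [
    ({"astigmatism": "yes", "tear": "normal", "spectacle": "myope"}, "hard"),
    ({"astigmatism": "no", "tear": "normal", "spectacle": "hypermetrope"}, "soft"),
    ({"astigmatism": "no", "tear": "normal", "age": "young"}, "soft"),
    ({"age": "pre-presbyopic", "astigmatism": "no", "tear": "normal"}, "soft"),
]


def fired_rules(instance, rules):
    """Return every rule whose conditions all hold for the instance."""
    return [(cond, cls) for cond, cls in rules
            if all(instance.get(a) == v for a, v in cond.items())]


def predict(instance, rules, default):
    """Return the set of classes predicted by the rules that fire.

    An empty result falls back to the default rule; a set with more than
    one class would signal conflicting, overlapping rules.
    """
    classes = {cls for _, cls in fired_rules(instance, rules)}
    return classes or {default}


overlap = {"age": "young", "spectacle": "hypermetrope",
           "astigmatism": "no", "tear": "normal"}
uncovered = {"age": "young", "spectacle": "myope",
             "astigmatism": "no", "tear": "reduced"}

print(len(fired_rules(overlap, RULES)), predict(overlap, RULES, "none"))
print(predict(overlap, list(reversed(RULES)), "none"))  # order is irrelevant
print(predict(uncovered, RULES, "none"))                # default rule applies
```

Two soft rules fire for the first instance, which is harmless because they agree. The second instance matches no rule, which is why a default rule is needed.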

Separate and Conquer
• Methods like PRISM (for dealing with one class) are separate-and-conquer algorithms:
• First, identify a useful rule
• Then, separate out all the instances it covers
• Finally, "conquer" the remaining instances
