
Fast Effective Rule Induction



  1. Fast Effective Rule Induction By William W. Cohen

  2. Overview • Rule Based Learning • Rule Learning Algorithm • Pruning Techniques • Modifications to IREP • Evolution of Ripper • Conclusion

  3. Goal of the Paper • The goal of this paper is to develop a rule learning algorithm that performs efficiently on large, noisy datasets and is competitive in generalization performance with more mature symbolic learning methods, such as decision trees.

  4. Concepts to Refresh • Overfit-and-simplify strategy • Separate and Conquer • Pruning

  5. Separate and Conquer • General idea: 1. Learn one rule that covers a certain number of positive examples. 2. Remove the examples covered by the rule. 3. Repeat until no positive examples are left.

  6. Sequential Covering Algorithm • Sequential-Covering(class, attributes, examples, threshold T) • RuleSet = {} • Rule = Learn-one-rule(class, attributes, examples) • While performance(Rule) > T do • a. RuleSet += Rule • b. Examples = Examples \ {examples classified correctly by Rule} • c. Rule = Learn-one-rule(class, attributes, examples) • Sort RuleSet based on the performance of the rules • Return RuleSet
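A minimal Python sketch of the sequential covering loop above. The learn_one_rule and performance helpers, and the rule objects' covers method, are hypothetical stand-ins for whatever inner rule learner is used:

    def sequential_covering(examples, learn_one_rule, performance, threshold):
        # learn_one_rule(examples) -> rule object with a .covers(example) method
        # performance(rule, examples) -> numeric score for the rule
        scored_rules = []
        rule = learn_one_rule(examples)
        while performance(rule, examples) > threshold:
            scored_rules.append((performance(rule, examples), rule))
            # Remove the examples the new rule classifies correctly
            examples = [ex for ex in examples if not rule.covers(ex)]
            rule = learn_one_rule(examples)
        # Sort rules from best-performing to worst before returning
        scored_rules.sort(key=lambda pair: pair[0], reverse=True)
        return [rule for _, rule in scored_rules]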

  7. Pruning • Why do we need pruning? • Pruning techniques: 1. Reduced Error Pruning 2. Grow 3. Incremental Reduced Error Pruning

  8. IREP Algorithm

  9. How to build a rule in IREP? • First, the uncovered examples are randomly partitioned into two subsets, a growing set and a pruning set. • Next, a rule is grown. The GrowRule implementation is a propositional version of FOIL.
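A short sketch of the random grow/prune split, assuming the paper's roughly two-thirds/one-third proportions:

    import random

    def split_grow_prune(examples, grow_fraction=2/3, seed=None):
        # Randomly partition the uncovered examples into a growing set
        # (used to build the rule) and a pruning set (used to simplify it).
        rng = random.Random(seed)
        shuffled = examples[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * grow_fraction)
        return shuffled[:cut], shuffled[cut:]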

  10. Grow Rule • GrowRule begins with an empty conjunction of conditions and considers adding any condition of the form An = v, Ac <= θ, or Ac >= θ, where An is a nominal attribute and v is a legal value for An, or Ac is a continuous attribute and θ is some value for Ac that occurs in the training data.

  11. Grow Rule • GrowRule repeatedly adds the condition that maximizes FOIL’s information gain criterion until the rule covers no negative examples from the growing dataset.
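A sketch of the propositional FOIL gain used to score a candidate condition; all counts are taken over the growing set, and the rule is assumed to cover at least one positive example before the condition is added:

    import math

    def foil_gain(p0, n0, p1, n1):
        # p0, n0: positives/negatives covered by the rule before adding
        # the condition; p1, n1: positives/negatives covered after.
        if p1 == 0:
            return 0.0  # a condition that loses all positives gains nothing
        return p1 * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))

GrowRule evaluates this gain for every legal condition, adds the best one, and repeats until the rule covers no negatives.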

  12. Pruning • After growing, the rule is immediately pruned by deleting any final sequence of conditions from the rule, choosing the deletion that maximizes v(Rule, PrunePos, PruneNeg) = (p + (N - n)) / (P + N), where P and N are the numbers of positive and negative examples in the pruning set, and p and n are the numbers of those examples covered by the rule.
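A sketch of this pruning step; covered(conditions, example) is a hypothetical predicate testing whether a (possibly truncated) condition list matches an example:

    def prune_rule(conds, prune_pos, prune_neg, covered):
        # Evaluate every way of deleting a final sequence of conditions
        # and keep the prefix that maximizes v on the pruning set.
        P, N = len(prune_pos), len(prune_neg)
        def v(prefix):
            p = sum(covered(prefix, ex) for ex in prune_pos)
            n = sum(covered(prefix, ex) for ex in prune_neg)
            return (p + (N - n)) / (P + N)
        prefixes = [conds[:i] for i in range(len(conds), 0, -1)]
        return max(prefixes, key=v)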

  13. IREP • The IREP algorithm works for two-class problems and multiple classes, and it handles missing attributes.

  14. Experiments with IREP • (figure: the first graph)

  15. CPU times for C4.5, IREP, and RIPPER2

  16. Improvements to IREP • Improving IREP requires three modifications: 1. The rule-value metric 2. The stopping criterion 3. Rule optimization
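For the first modification, the paper's IREP* variant replaces IREP's pruning metric with (p - n) / (p + n). A one-function sketch (the revised MDL-based stopping criterion and the rule-optimization step are not shown):

    def irep_star_value(p, n):
        # IREP*'s revised rule-value metric, with p and n the positives
        # and negatives in the pruning set covered by the rule.
        return (p - n) / (p + n) if p + n else 0.0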

  17. Evolution of RIPPER • First, IREP* is used to obtain an initial rule set. This rule set is then optimized, and finally rules are added to cover any remaining positive examples using IREP*. This yields a new algorithm, namely RIPPER.
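A high-level sketch of that pipeline, assuming hypothetical irep_star and optimize helpers standing in for the paper's subroutines; repeating the optimize-and-cover step k times yields RIPPERk (k = 2 gives the RIPPER2 timed above):

    def ripper(pos, neg, irep_star, optimize, k=1):
        # 1. Build the initial rule set with IREP*.
        rule_set = irep_star(pos, neg)
        for _ in range(k):
            # 2. Optimize the rule set against the training data.
            rule_set = optimize(rule_set, pos, neg)
            # 3. Cover any remaining positive examples with IREP*.
            uncovered = [ex for ex in pos
                         if not any(r.covers(ex) for r in rule_set)]
            rule_set += irep_star(uncovered, neg)
        return rule_set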
