Presentation on theme: "Fast Effective Rule Induction"— Presentation transcript:

1 Fast Effective Rule Induction
By William W. Cohen

2 Overview
Rule Based Learning
Rule Learning Algorithm
Pruning Techniques
Modifications to IREP
Evolution of RIPPER
Conclusion

3 Goal of the Paper The goal of this paper is to develop a rule learning algorithm that performs efficiently on large noisy datasets and is competitive in generalization performance with more mature symbolic learning methods, such as decision trees.

4 Concepts to Refresh
- Overfit-and-simplify strategy
- Separate and Conquer
- Pruning

5 Separate and Conquer General Idea:
1. Learn one rule that covers a certain number of positive examples.
2. Remove the examples covered by that rule.
3. Repeat until no positive examples are left.

6 Sequential Covering Algorithm
Sequential-Covering(class, attributes, examples, threshold T)
  RuleSet = {}
  Rule = Learn-one-rule(class, attributes, examples)
  while performance(Rule) > T do
    a. RuleSet = RuleSet + Rule
    b. Examples = Examples \ {examples classified correctly by Rule}
    c. Rule = Learn-one-rule(class, attributes, examples)
  Sort RuleSet based on the performance of the rules
  Return RuleSet
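Expressed as a minimal Python sketch, assuming a hypothetical learn_one_rule() learner, a performance() scoring function, and rule objects that expose a covers() test (none of these names come from the slides):

def sequential_covering(examples, learn_one_rule, performance, threshold):
    # Learn rules one at a time, removing the examples each accepted
    # rule classifies correctly, until no new rule clears the threshold.
    scored_rules = []
    rule = learn_one_rule(examples)
    while rule is not None and performance(rule, examples) > threshold:
        scored_rules.append((performance(rule, examples), rule))
        examples = [ex for ex in examples if not rule.covers(ex)]
        rule = learn_one_rule(examples)
    # Sort the rule set by the performance each rule had when learned.
    scored_rules.sort(key=lambda sr: sr[0], reverse=True)
    return [r for _, r in scored_rules]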

7 Pruning Why do we need pruning? Techniques of pruning:
1. Reduced Error Pruning
2. Grow
3. Incremental Reduced Error Pruning

8 IREP Algorithm
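The algorithm figure on this slide is not reproduced in the transcript. Very roughly, and reusing the split_grow_prune() and covers() helpers sketched under the following slides, the loop it depicts looks like this (the stopping test, accuracy below 50% on the pruning set, is IREP's usual one):

def irep(pos, neg, grow_rule, prune_rule, rule_accuracy):
    # Repeatedly grow and prune one rule; stop once the pruned rule's
    # accuracy on the pruning set drops below 50% (error above 50%),
    # otherwise keep the rule and drop the examples it covers.
    # Positives and negatives are split separately here for simplicity.
    rule_set = []
    while pos:
        grow_pos, prune_pos = split_grow_prune(pos)   # sketched under slide 9
        grow_neg, prune_neg = split_grow_prune(neg)
        rule = grow_rule(grow_pos, grow_neg)
        rule = prune_rule(rule, prune_pos, prune_neg)
        if rule_accuracy(rule, prune_pos, prune_neg) < 0.5:
            break
        rule_set.append(rule)
        pos = [ex for ex in pos if not covers(rule, ex)]   # covers() sketched under slide 12
        neg = [ex for ex in neg if not covers(rule, ex)]
    return rule_set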

9 How to build a rule in IREP?
First the uncovered examples are randomly partitioned into two subsets, a growing set and a pruning set. Next a rule is grown. The implementation of GrowRule is a propositional version of FOIL.
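A small sketch of that partitioning step; the two-thirds/one-third split used below is the customary default rather than something this slide specifies:

import random

def split_grow_prune(uncovered, grow_fraction=2/3, seed=None):
    # Randomly partition the uncovered examples into a growing set
    # (used to build the rule) and a pruning set (used to simplify it).
    rng = random.Random(seed)
    shuffled = list(uncovered)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * grow_fraction)
    return shuffled[:cut], shuffled[cut:]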

10 Grow Rule It begins with an empty conjunction of lconditions and considers adding to this any condition of the form or where An is a nominal attribute and v is a legal value for An or Ac is a continuous attribute and 2 is some value for Ac that occurs in the training data.

11 Grow Rule
GrowRule repeatedly adds the condition that maximizes FOIL's information gain criterion until the rule covers no negative examples from the growing set.
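A sketch of the propositional form of FOIL's gain used to score each candidate condition; here p0 and n0 count the positive and negative growing examples covered before the condition is added, and p1 and n1 those covered after:

from math import log2

def foil_gain(p0, n0, p1, n1):
    # FOIL's information gain for adding one condition (assumes p0 > 0).
    # The score rewards conditions that keep many positives while raising
    # the fraction of positives among the covered examples.
    if p1 == 0:
        return float("-inf")   # the refined rule covers no positives
    return p1 * (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0)))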

12 Pruning
After growing, the rule is immediately pruned by considering the deletion of any final sequence of conditions from the rule and choosing the deletion that maximizes the function

v(Rule, PrunePos, PruneNeg) = (p + (N - n)) / (P + N)

where P and N are the numbers of positive and negative examples in the pruning set, and p and n are the numbers of those covered by the rule.
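A sketch of this pruning step, using the same illustrative condition encoding as above; covers() and the list-of-conditions rule representation are assumptions for the example:

def covers(conditions, example):
    # An example is covered when it satisfies every condition of the rule.
    ops = {"==": lambda a, b: a == b,
           "<=": lambda a, b: a <= b,
           ">=": lambda a, b: a >= b}
    return all(ops[op](example[attr], val) for attr, op, val in conditions)

def prune_value(conditions, prune_pos, prune_neg):
    # v(Rule, PrunePos, PruneNeg) = (p + (N - n)) / (P + N)
    p = sum(1 for ex in prune_pos if covers(conditions, ex))
    n = sum(1 for ex in prune_neg if covers(conditions, ex))
    return (p + (len(prune_neg) - n)) / (len(prune_pos) + len(prune_neg))

def prune_rule(conditions, prune_pos, prune_neg):
    # Consider deleting every final sequence of conditions (i.e. keep each
    # prefix of the rule, the full rule included) and return the prefix
    # that scores highest on the pruning set.
    best_k = max(range(len(conditions), -1, -1),
                 key=lambda k: prune_value(conditions[:k], prune_pos, prune_neg))
    return conditions[:best_k]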

13 IREP The IREP algorithm handles:
- Two-class problems
- Multiple classes
- Missing attribute values

14 Experiments with IREP
(the first results graph)

15 CPU times for C4.5, IREP, and RIPPER2

16 Improvements to IREP Improving IREP requires three modifications:
1. The rule-value metric
2. The stopping criterion
3. Rule optimization
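For the first of these, IREP* replaces the pruning metric of slide 12 with the rule-value metric v*(Rule, PrunePos, PruneNeg) = (p - n) / (p + n). A sketch of the two side by side, with p, n, P, N as defined on slide 12:

def irep_value(p, n, P, N):
    # Original IREP pruning metric: (p + (N - n)) / (P + N).
    return (p + (N - n)) / (P + N)

def irep_star_value(p, n):
    # IREP* rule-value metric: (p - n) / (p + n), which no longer depends
    # on the pruning-set totals P and N.
    return (p - n) / (p + n) if (p + n) else 0.0

For instance, with P = N = 5000, the original metric scores a rule covering p = 2000, n = 1000 slightly above one covering p = 1000, n = 1 (0.600 versus 0.5999), whereas the rule-value metric strongly prefers the second (0.333 versus roughly 0.998).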

17 Evolution of RIPPER
First, IREP* is used to obtain the initial rule set. This rule set is then optimized, and finally rules are added to cover any remaining positive examples using IREP*. This leads to a new algorithm, namely RIPPER.
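A high-level sketch of that pipeline; irep_star() and optimize() stand in for the procedures described on the earlier slides, and covers() is the helper sketched under slide 12:

def ripper(pos, neg, irep_star, optimize, k=1):
    # Build the initial rule set with IREP*, then optimize it and cover
    # any positives that remain uncovered, again with IREP*.  Repeating
    # the optimize-and-cover step k times gives RIPPERk (k = 2 is the
    # RIPPER2 variant timed on slide 15).
    rule_set = irep_star(pos, neg)
    for _ in range(k):
        rule_set = optimize(rule_set, pos, neg)
        remaining = [ex for ex in pos
                     if not any(covers(r, ex) for r in rule_set)]
        if remaining:
            rule_set = rule_set + irep_star(remaining, neg)
    return rule_set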

