Fuzzy Interpretation of Discretized Intervals • Author: Dr. Xindong Wu • IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 7, NO. 6, DECEMBER 1999 • Presented by: Gong Chen
Outline • Concepts Review • Overview • Problem • Solution • Related Techniques • Algorithms Design in HCV • Experimental Results • Conclusions • Answers for Final Exam
Concepts Review • Induction: Generalize rules from training data • Deduction: Apply generalized rules to testing data • Three possible results of Deduction: • Single match • No match • Multiple match
Concepts Review • Discretization of Continuous domains • Continuous numerical domains can be discretized into intervals • The discretized intervals can be treated as nominal values
Concepts Review • Using the Information Gain Heuristic (IGH) for discretization (employed by HCV): • Sort the n values of the attribute; the candidate cut points are the midpoints $x = (x_i + x_{i+1})/2$ for $i = 1, \dots, n-1$ • x is a possible cut point only if $x_i$ and $x_{i+1}$ belong to different classes • Use IGH to find the best x • Recursively split the left and right parts • Stop the recursive splitting when a stopping criterion is met
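The steps on this slide can be sketched in Python as follows. This is a minimal illustration, not HCV's implementation: the function names, the `min_gain` and `max_depth` stopping criteria, and the sample ages are all assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def best_cut_point(values, labels):
    """Return (cut, gain) for the class-boundary midpoint with the highest
    information gain, or None if no candidate cut point exists."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    base = entropy([lab for _, lab in pairs])
    best = None
    for (x_i, c_i), (x_next, c_next) in zip(pairs, pairs[1:]):
        # Only midpoints between neighbours of different classes are candidates.
        if c_i == c_next or x_i == x_next:
            continue
        cut = (x_i + x_next) / 2
        left = [lab for v, lab in pairs if v <= cut]
        right = [lab for v, lab in pairs if v > cut]
        remainder = (len(left) * entropy(left)
                     + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if best is None or gain > best[1]:
            best = (cut, gain)
    return best


def discretize(values, labels, min_gain=1e-9, max_depth=3):
    """Recursively collect cut points. min_gain and max_depth are assumed
    stopping criteria; the slide does not say which criteria HCV uses."""
    if max_depth == 0 or len(set(labels)) < 2:
        return []
    best = best_cut_point(values, labels)
    if best is None or best[1] <= min_gain:
        return []
    cut = best[0]
    left = [(v, c) for v, c in zip(values, labels) if v <= cut]
    right = [(v, c) for v, c in zip(values, labels) if v > cut]
    left_cuts = discretize([v for v, _ in left], [c for _, c in left],
                           min_gain, max_depth - 1)
    right_cuts = discretize([v for v, _ in right], [c for _, c in right],
                            min_gain, max_depth - 1)
    return left_cuts + [cut] + right_cuts


# Hypothetical training data: ages labelled young or old.
ages = [18, 25, 35, 48, 50, 60]
classes = ["young", "young", "young", "young", "old", "old"]
cuts = discretize(ages, classes)  # a single cut at the midpoint of 48 and 50
```

With this sample, the only class boundary lies between 48 and 50, so the single cut point is their midpoint.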
Overview • [Flow diagram: Training Data → Discretization → Induction → Rules; Testing Data and Rules → Deduction → Single match / No match / Multiple match; Fuzzy Borders]
Problem • Discretization of continuous domains does not always yield an accurate interpretation! • Recall that Information Gain is a heuristic measure computed on the training data, so the cut points it chooses cannot exactly fit “data in the real world”. • Example
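Problem • Heuristic 1 (e.g. Information Gain): age axis marked at 18, 35 and 49, with intervals young and old and the test value 49.49 • Heuristic 2 (e.g. Gain Ratio): age axis marked at 18, 35 and 50, with intervals young and old and the test value 49.49 • [Figure: the two heuristics place the young/old border at different points, 49 and 50]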
Problem • Suppose that after induction we get just one rule: • If (age = old) then Class = MORE_EXPERIENCE • Under Heuristic 2, the instance (age = 49.49) does not satisfy age = old: no match!
Solution • A safer way to describe age = 49.49 is to say: to some degree it is young; to some degree it is old. • This replaces a single assertion that it is definitely young or definitely old. • Thus, to some degree, the instance can still match a rule and get a classification result instead of no match. • No match → single match or multiple match, to some degree • This is the so-called fuzzy match!
Solution • “Fuzziness is a type of deterministic uncertainty. It describes the event class ambiguity.” • “Fuzziness works when there are the outcomes that belong to several event classes at the same time but to different degrees.” • “Fuzziness measures the degree to which an event occurs.” • Jim Bezdek, Didier Dubois, Bart Kosko, Henri Prade
Solution • What does “to some degree” mean? • A membership function describes the “degree”: it calculates to what degree an event belongs to a class. • Three widely used membership functions are employed by HCV: • Linear • Polynomial • Arctan
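Solution • [Figure: an interval of length l between $x_{left}$ and $x_{right}$, with a spread of sl at each end] • s is a user-specified parameter, e.g. s = 0.1 means the interval spreads out into the adjacent intervals by 10% of its original length l at each end. • Linear membership function: $k = 1/(2sl)$; $a = -k x_{left} + 1/2$; $b = k x_{right} + 1/2$ • $\mathrm{lin}_{left}(x) = kx + a$ • $\mathrm{lin}_{right}(x) = -kx + b$ • $\mathrm{lin}(x) = \max(0, \min(1, \mathrm{lin}_{left}(x), \mathrm{lin}_{right}(x)))$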
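The linear membership function can be written directly from these formulas. The sketch below is illustrative only; the function name, the interval bounds 18 and 70, and the choice of 50 as the young/old border are assumptions made for the example.

```python
def linear_membership(x, x_left, x_right, s=0.1):
    """Degree (0..1) to which x belongs to the interval [x_left, x_right].

    s is the user-specified spread: the interval reaches s * length into
    each neighbouring interval, and the degree is 0.5 exactly at a border.
    """
    length = x_right - x_left
    k = 1.0 / (2.0 * s * length)
    a = -k * x_left + 0.5
    b = k * x_right + 0.5
    lin_left = k * x + a       # rises through the left border
    lin_right = -k * x + b     # falls through the right border
    return max(0.0, min(1.0, lin_left, lin_right))


# Hypothetical intervals: young = [18, 50], old = [50, 70].
degree_young = linear_membership(49.49, 18, 50)  # about 0.58
degree_old = linear_membership(49.49, 50, 70)    # about 0.37
```

Under these assumed intervals, age = 49.49 belongs to both young and old to a nonzero degree, so a rule on age = old can still fire to some degree instead of producing no match.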
Solution • Polynomial membership function: uses a smoother curve instead of a linear function. • Arctan membership function • Experimental results show no significant difference between the three kinds of functions, so the polynomial membership function is chosen.
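Solution • Polynomial membership function, with $side \in \{left, right\}$: • $\mathrm{poly}_{side}(x) = a_{side}x^3 + b_{side}x^2 + c_{side}x + d_{side}$ • $a_{side} = 1/(4(ls)^3)$ • $b_{side} = -3a_{side}x_{side}$ • $c_{side} = 3a_{side}(x_{side}^2 - (ls)^2)$ • $d_{side} = -a_{side}(x_{side}^3 - 3x_{side}(ls)^2 + 2(ls)^3)$ • $\mathrm{poly}(x) = \mathrm{poly}_{left}(x)$ if $x_{left} - ls \le x \le x_{left} + ls$; $\mathrm{poly}_{right}(x)$ if $x_{right} - ls \le x \le x_{right} + ls$; $1$ if $x_{left} + ls \le x \le x_{right} - ls$; $0$ otherwise • poly(x) gives the degree to which x belongs to an interval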
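A piecewise cubic membership function of this shape can be sketched as below. The sketch is an assumption-laden illustration rather than the slide's exact coefficients: it uses a centered, normalized cubic that rises from 0 to 1 across each border with zero slope at both ends and value 0.5 at the border itself. The function names, the interval bounds, and the requirement s ≤ 0.5 (so the two border regions do not overlap) are all choices made for this example.

```python
def poly_membership(x, x_left, x_right, s=0.1):
    """Degree (0..1) to which x belongs to [x_left, x_right], using a smooth
    cubic transition of half-width s * length around each border.

    Assumes 0 < s <= 0.5 so the two transition regions do not overlap.
    """
    spread = s * (x_right - x_left)

    def rise(u):
        # Cubic in u that is 0 at u = -spread, 0.5 at u = 0, 1 at u = +spread,
        # with zero slope at both ends (assumed normalization for this sketch).
        t = u / spread
        return 0.5 + 0.75 * t - 0.25 * t ** 3

    if x_left - spread <= x <= x_left + spread:
        return rise(x - x_left)       # entering the interval from the left
    if x_right - spread <= x <= x_right + spread:
        return rise(x_right - x)      # leaving the interval on the right
    if x_left + spread <= x <= x_right - spread:
        return 1.0                    # fully inside
    return 0.0                        # fully outside


# Hypothetical interval old = [50, 70]; age 49.49 lies just outside the border.
degree_old = poly_membership(49.49, 50, 70)
```

Compared with the linear function, the cubic gives the same 0.5 at the border but approaches 0 and 1 smoothly at the edges of the spread.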
Related Techniques • No match • Largest Class • Assign all no-match examples to the largest class (the default class) • Multiple match • Largest Rule • Assign the example to the class of the matching rule that covers the largest number of examples • Estimate of Probability • Fuzzy borders can themselves produce multiple matches (conflicts), so a hybrid method is desired for the whole process
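The two simplest techniques, Largest Class and Largest Rule, can be sketched in a few lines. The data layout here (a list of training labels, and matching rules given as hypothetical `(class_label, covered_count)` pairs) is an assumption for illustration, not HCV's internal representation.

```python
from collections import Counter


def largest_class(training_labels):
    """No match: fall back to the most frequent class in the training set."""
    return Counter(training_labels).most_common(1)[0][0]


def largest_rule(matching_rules):
    """Multiple match: pick the class of the matching rule that covers the
    most training examples. Each rule is a (class_label, covered_count) pair."""
    return max(matching_rules, key=lambda rule: rule[1])[0]


# Hypothetical training labels and conflicting rules.
default_class = largest_class(["yes", "no", "yes", "yes"])
chosen_class = largest_rule([("no", 3), ("yes", 7)])
```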
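Related Techniques • Estimate of Probability • [Formula not recoverable from the slide: the probability that example e belongs to class $c_i$, computed from the number of examples in the training set covered by each conjunction] • Conj1 and Conj2 are two rules supporting that e belongs to $c_i$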
Algorithms Design in HCV • HCV(Large) • No match: Largest Class • Multiple match: Largest Rule • HCV(Fuzzy) • No match: Fuzzy Match • Multiple match: Fuzzy Match • HCV(Hybrid) • No match: Fuzzy Match • Multiple match: Estimate of Probability
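The three deduction strategies amount to a lookup from the match case to a conflict-resolution technique. The sketch below encodes the table above; the dictionary keys, function names, and the `"apply_rule"` label for the single-match case are hypothetical names chosen for this example.

```python
# Technique used by each HCV deduction strategy, per match case.
STRATEGIES = {
    "large": {"no_match": "largest_class", "multiple_match": "largest_rule"},
    "fuzzy": {"no_match": "fuzzy_match", "multiple_match": "fuzzy_match"},
    "hybrid": {"no_match": "fuzzy_match",
               "multiple_match": "estimate_of_probability"},
}


def match_case(n_matching_rules):
    """Classify a test example by how many rules it matches."""
    if n_matching_rules == 0:
        return "no_match"
    if n_matching_rules == 1:
        return "single_match"
    return "multiple_match"


def technique_for(strategy, n_matching_rules):
    """Return the technique a strategy applies to a test example."""
    case = match_case(n_matching_rules)
    if case == "single_match":
        return "apply_rule"  # a single matching rule needs no conflict resolution
    return STRATEGIES[strategy][case]
```

For example, `technique_for("hybrid", 0)` selects the fuzzy match, while `technique_for("hybrid", 2)` selects the estimate of probability.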
Experimental Results • Data: • 17 datasets from the UCI Machine Learning Repository • Why these were selected: 1) numerical data 2) situations where no rules clearly apply • Test conditions • The 68 parameters in HCV are all left at their defaults except the deduction strategy • Parameters for C4.5 and NewID are those recommended by their respective inventors
Experimental Results • Predictive accuracy • HCV (hybrid) outperforms the others on 9 datasets • HCV (large): 3 datasets • HCV (fuzzy): 2 datasets • C4.5 (R 8): 7 datasets • C4.5 (R 5): 6 datasets • NewID: 3 datasets • HCV (hybrid) clearly and significantly outperforms the other interpretation techniques (in HCV) for datasets with numerical data in the “no match” and “multiple match” cases. • C4.5 and NewID are included for reference, not for extensive comparison.
Conclusion • Fuzziness is strongly domain dependent, so HCV allows users to specify their own intervals and fuzzy functions. • This is an important direction to take with specific domains • The fuzzy borders design combined with probability estimation achieves better results in terms of predictive accuracy. • The approach is applicable to other machine learning and data mining algorithms
Answers for Final Exam Problems • Q1: When doing deduction on real-world data, what are the three possible cases for each test example? • Single match • No match • Multiple match • Q2: Of the three cases during deduction, for which ones does the HCV hybrid interpretation algorithm use fuzzy borders to classify? • No match • Q3: In the hybrid interpretation algorithm used in HCV, • when are sharp borders set up? • “Sharp borders are set up as usual during induction” • when are fuzzy borders defined? • In deduction, “only in the no match case, fuzzy borders are set up in order to find a rule which is closest to the test example in question”