Fuzzy Interpretation of Discretized Intervals Author: Dr. Xindong Wu IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 7, NO. 6, DECEMBER 1999 Presented by: Gong Chen.


Fuzzy Interpretation of Discretized Intervals Author: Dr. Xindong Wu IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 7, NO. 6, DECEMBER 1999 Presented by: Gong Chen

Outline Concepts Review Overview Problem Solution Related Techniques Algorithms Design in HCV Experimental Results Conclusions Answers for Final Exam

Concepts Review Induction: Generalize rules from training data Deduction: Apply generalized rules to testing data Three possible results of Deduction: –Single match –No match –Multiple match

Concepts Review Discretization of Continuous domains –Continuous numerical domains can be discretized into intervals –The discretized intervals can be treated as nominal values

Concepts Review Using the information gain heuristic for discretization (employed by HCV): –x = (x_i + x_{i+1})/2 for i = 1, …, n-1 –x is a candidate cut point if x_i and x_{i+1} belong to different classes –Use the information gain heuristic to find the best cut point x –Recursively split the left and right subintervals –Stop splitting when a stopping criterion is met
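The cut-point search described on this slide can be illustrated in Python. This is a minimal sketch, not HCV's actual code: `best_cut_point` is an illustrative name, and only a single binary split is shown (HCV applies this recursively).

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_cut_point(values, labels):
    """Pick the midpoint (x_i + x_{i+1}) / 2 that maximizes information gain.

    Only midpoints between consecutive sorted values belonging to
    different classes are candidate cut points.
    """
    pairs = sorted(zip(values, labels))
    base = entropy([l for _, l in pairs])
    best, best_gain = None, 0.0
    for i in range(len(pairs) - 1):
        if pairs[i][1] == pairs[i + 1][1]:
            continue  # only class boundaries can be cut points
        cut = (pairs[i][0] + pairs[i + 1][0]) / 2
        left = [l for v, l in pairs if v <= cut]
        right = [l for v, l in pairs if v > cut]
        # information gain = entropy before split - weighted entropy after
        gain = base - (len(left) * entropy(left)
                       + len(right) * entropy(right)) / len(pairs)
        if gain > best_gain:
            best, best_gain = cut, gain
    return best
```

Recursing on the examples to the left and right of the returned cut, with a stopping criterion such as a minimum gain, yields the full discretization.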

Outline Concepts Review Overview Problem Solution Related Techniques Algorithms Design in HCV Experimental Results Conclusion Answers for Final Exam

Overview Training data → discretization → induction → rules. Testing data + rules → deduction → no match / single match / multiple match. Fuzzy borders address the no match and multiple match cases.

Outline Concepts Review Overview Problem Solution Several Related Techniques Algorithms Design in HCV Experimental Results Conclusion Answers for Final Exam

Problem Discretization of continuous domains does not always yield an accurate interpretation! Recall that information gain is a heuristic measure applied to the training data, so the cut points it produces cannot be expected to fit real-world data exactly. Example:

Problem Heuristic 1 (e.g., information gain) cuts the age domain at 49: young up to 49, old above. Heuristic 2 (e.g., gain ratio) cuts it at 50: young up to 50, old above. Where does an instance with age = 49.49 fall?

Problem Suppose after induction we get just one rule: If (age = old) then Class = MORE_EXPERIENCE. Under Heuristic 2, the instance (age = 49.49) is young, so no rule applies: no match!

Outline Concepts Review Overview Problem Solution Related Techniques Algorithms Design in HCV Experimental Results Conclusion Answers for Final Exam

Solution A safer way to describe age = 49.49 is to say: to some degree it is young, and to some degree it is old, rather than asserting definitely that it is one or the other. Thus, to some degree, the instance can still match a rule and receive a classification instead of no match. –No match  single match or multiple match, with some degree This is the so-called fuzzy match!

Solution “Fuzziness is a type of deterministic uncertainty. It describes event class ambiguity.” “Fuzziness works when there are outcomes that belong to several event classes at the same time but to different degrees.” “Fuzziness measures the degree to which an event occurs.” –Jim Bezdek, Didier Dubois, Bart Kosko, Henri Prade

Solution What does “to some degree” mean? –A membership function describes the “degree” –A membership function tells you to what degree an event belongs to a class –A membership function calculates this degree Three widely used membership functions are employed by HCV: –Linear –Polynomial –Arctan

Solution Linear membership function, for an interval [x_left, x_right] of length l with spread parameter s: k = 1/(2sl); a = -k·x_left + 1/2; b = k·x_right + 1/2 lin_left(x) = kx + a lin_right(x) = -kx + b lin(x) = MAX(0, MIN(1, lin_left(x), lin_right(x))) s is a user-specified parameter; e.g., s = 0.1 indicates the interval spreads out into the adjacent intervals for 10% of its original length at each end.
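The linear membership function on this slide can be written directly from its formulas. A minimal sketch (function name is illustrative, not from HCV): membership is 0.5 exactly at a border, 1 well inside the interval, and 0 one spread-width outside it.

```python
def linear_membership(x, x_left, x_right, s=0.1):
    """Degree to which x belongs to the interval [x_left, x_right].

    Borders are fuzzified over s * length at each end, following
    lin(x) = max(0, min(1, lin_left(x), lin_right(x))).
    """
    l = x_right - x_left           # interval length
    k = 1.0 / (2 * s * l)          # slope of the fuzzy border
    lin_left = k * x - k * x_left + 0.5    # kx + a, a = -k*x_left + 1/2
    lin_right = -k * x + k * x_right + 0.5  # -kx + b, b = k*x_right + 1/2
    return max(0.0, min(1.0, lin_left, lin_right))
```

For the interval [50, 60] with s = 0.1, an instance at the border x = 50 gets degree 0.5, an interior point like x = 55 gets degree 1, and a point one spread-width outside, x = 49, gets degree 0.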

Solution Polynomial membership function—uses a smoother curve in place of the linear function. Arctan membership function—smoother still. Experimental results show no significant difference among the three kinds of functions—so the polynomial membership function is chosen.

Solution poly side (x) = a side x 3 + b side x 2 + c side x + d side a side = 1/(4(ls) 3 ) b side = -3a side x side side  {left,right} c side = 3a side (x side 2 - (ls) 2 ) d side = -a(x side 3 -3x side (ls) 2 + 2(ls) 3 ) poly left (x),if x left -ls  x  x left + ls poly(x) = poly right (x),if x right -ls  x  x right +ls 1,if x left +ls  x  x right -ls 0,otherwise To what degree, x belongs to one interval

Outline Concepts Review Overview Problem Solution Related Techniques Algorithms Design in HCV Experimental Results Conclusion Answers for Final Exam Problems

Related Techniques –No match: Largest Class –Assign all no-match examples to the largest class, the default class –Multiple match: Largest Rule –Assign examples to the rule that covers the largest number of training examples –Estimate of Probability –Fuzzy borders can themselves bring multiple-match conflicts, so a hybrid method is desired for the whole process

Related Techniques Estimate of Probability: the probability that example e belongs to class c_i is estimated from the number of training examples covered by each conjunction (rule); here Conj1 and Conj2 are two rules supporting that e belongs to c_i.

Outline Concepts Review Overview Problem Solution Related Techniques Algorithms Design in HCV Experimental Results Conclusion Answers for Final Exam Problems

Algorithms Design in HCV HCV(Large) –No match: Largest Class –Multiple match: Largest Rule HCV(Fuzzy) –No match: Fuzzy Match –Multiple match: Fuzzy Match HCV(Hybrid) –No match: Fuzzy Match –Multiple match: Estimate of Probability
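The three deduction strategies above can be contrasted with a small dispatch sketch of the hybrid variant. This is illustrative only, not HCV's implementation: the `Rule` interface, `membership`, and `prob_estimate` are assumed stand-ins for fuzzy matching and the probability estimate.

```python
def classify_hybrid(example, rules, membership, prob_estimate, default_class):
    """Sketch of HCV(Hybrid)-style deduction (names are illustrative).

    - single match:   use the one matching rule
    - multiple match: pick the rule with the highest probability estimate
    - no match:       fall back to fuzzy borders and take the rule with
                      the highest membership degree
    """
    matches = [r for r in rules if r.covers(example)]
    if len(matches) == 1:
        return matches[0].cls
    if len(matches) > 1:
        return max(matches, key=lambda r: prob_estimate(r, example)).cls
    # no match: set up fuzzy borders and find the closest rule
    degrees = [(membership(r, example), r) for r in rules]
    best_degree, best_rule = max(degrees, key=lambda t: t[0])
    return best_rule.cls if best_degree > 0 else default_class
```

HCV(Large) would replace the no-match branch with the default class and the multiple-match branch with the largest rule; HCV(Fuzzy) would use membership degrees in both branches.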

Outline Concepts Review Overview Problem Solution Related Techniques Algorithms Design in HCV Experimental Results Conclusion Answers for Final Exam Problems

Experimental Results Data: –17 datasets from the UCI Machine Learning Repository –Why these: 1) numerical data 2) situations where no rule clearly applies Test conditions –All 68 HCV parameters are left at their defaults except the deduction strategy –Parameters for C4.5 and NewID are set as recommended by their respective authors

Experimental Results DatasetHCVHCV (large)HCVC4.5 NewID (hybrid)(fuzzy)(R 8)(R 5) Anneal98.00%93.00% 95.00%93.00%81.00% Bupa57.60%55.90% 71.20%61.00%73.00% Cleveland %68.10%73.60%71.40%76.90%67.00% Cleveland %56.00%52.70%51.60%56.00%47.30%  CRX 82.50%72.50%82.00%83.00%80.00%79.00% Glass (w/out ID)72.30%60.00% 71.50%64.60%66.00% Hungarian %85.00% 81.20%80.00%78.00% Hypothroid97.80%86.30%96.30%99.40% 92.00% Imports %59.30%61.00% 67.80%61.00% Ionosphere88.00%81.20% 86.30%85.50%82.00% Labor Neg76.50% 82.40% 65.00% Pima73.90%69.10% 73.50%75.50%73.00% Swiss % 97.00% Swiss %25.00%28.10%40.60%31.20%22.00% Va % 77.50%70.40%77.00% Va %25.40%29.60%31.00%26.80%20.00% Wine90.40%76.90% 90.40%90.00%90.40%

Experimental Results Predictive accuracy –HCV (hybrid) outperforms the others on 9 datasets –HCV (large): 3 datasets –HCV (fuzzy): 2 datasets –C4.5 (R8): 7 datasets –C4.5 (R5): 6 datasets –NewID: 3 datasets –HCV (hybrid) clearly and significantly outperforms the other interpretation techniques (in HCV) for datasets with numerical data in the “no match” and “multiple match” cases. C4.5 and NewID are included for reference, not for extensive comparison.

Outline Concepts Review Overview Problem Solution Related Techniques Algorithms Design in HCV Experimental Results Conclusion Answers for Final Exam Problems

Conclusion Fuzziness is strongly domain dependent, so HCV allows users to specify their own intervals and fuzzy functions. –An important direction to take with specific domains The fuzzy-borders design, combined with probability estimation, achieves better results in terms of predictive accuracy. –Applicable to other machine learning and data mining algorithms

Outline Concepts Review Overview Problem Solution Related Techniques Algorithms Design in HCV Experimental Results Conclusion Answers for Final Exam Problems

Q1: When doing deduction on real-world data, what are the three possible cases for each test example? –Single match –No match –Multiple match Q2: Of the three cases during deduction, in which ones does the HCV hybrid interpretation algorithm use fuzzy borders to classify? –No match Q3: In the hybrid interpretation algorithm used in HCV, –when are sharp borders set up? “Sharp borders are set up as usual during induction” –when are fuzzy borders defined? In deduction, “only in the no match case, fuzzy borders are set up in order to find a rule which is closest to the test example in question”

Thank You!