Data mining II The fuzzy way Włodzisław Duch Dept. of Informatics, Nicholas Copernicus University, Toruń, Poland ISEP.

Presentation transcript:

Data mining II The fuzzy way Włodzisław Duch Dept. of Informatics, Nicholas Copernicus University, Toruń, Poland ISEP Porto, 8-12 July 2002

Basic ideas
Complex problems cannot be analyzed precisely.
Knowledge of an expert may be approximated using imprecise concepts. Example: if the weather is nice and the place is attractive, then not many participants stay at the school.
Fuzzy logic/systems include:
mathematics of fuzzy sets/systems, fuzzy logics;
fuzzy knowledge representation for clustering, classification and regression;
extraction of fuzzy concepts and rules from data;
fuzzy control theory.

Types of uncertainty
Stochastic uncertainty: rolling dice, accident, insurance risk ... (probability theory).
Measurement uncertainty: about 3 cm; 20 degrees (statistics).
Information uncertainty: trustworthy client, known constraints (data mining).
Linguistic uncertainty: small, fast, low price (fuzzy logic).

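Crisp sets
Membership function $\mu_{young}(x)$ of the crisp set $young = \{ x \in M \mid age(x) \le 20 \}$:
$\mu_{young}(x) = 1$ if $age(x) \le 20$, and $\mu_{young}(x) = 0$ if $age(x) > 20$.
[Figure: step-shaped membership function of A = "young" over x in years, dropping from 1 to 0.]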

Fuzzy sets
$X$: universe, space; $x \in X$.
$A$: linguistic variable, concept, fuzzy set.
$\mu_A$: a Membership Function (MF), determining the degree to which x belongs to A.
Linguistic variables and concepts are sums (unions) of fuzzy sets; they act as logical predicate functions with continuous values.
A membership value is different from a probability: $\mu(bald) = 0.8$ does not mean bald in 4 out of 5 cases.
Probabilities are normalized to 1, MFs are not.
Fuzzy concepts are subjective and context-dependent.

Fuzzy examples
Crisp and fuzzy concept "young men".
"Boiling temperature" has a value around 100 degrees (depending on pressure, chemistry).
[Figure: crisp and fuzzy membership functions of A = "young" over x in years; x = 20 is marked, and the fuzzy MF gives $\mu = 0.8$ at x = 23.]
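The contrast between the crisp and the fuzzy reading of "young" can be sketched in a few lines of Python. This is only an illustration: the linear descent and the upper age limit of 35 years are assumptions, chosen so that the membership at age 23 comes out as the 0.8 shown in the figure; the slide does not state the shape of the MF.

```python
def crisp_young(age: float) -> float:
    """Crisp set: full membership up to 20 years, none above."""
    return 1.0 if age <= 20 else 0.0


def fuzzy_young(age: float, full_until: float = 20.0, zero_from: float = 35.0) -> float:
    """Fuzzy set: membership falls linearly from 1 at full_until to 0 at zero_from.

    The breakpoints are illustrative assumptions, not values from the slides.
    """
    if age <= full_until:
        return 1.0
    if age >= zero_from:
        return 0.0
    return (zero_from - age) / (zero_from - full_until)


for age in (18, 20, 23, 30, 40):
    # The crisp set jumps from 1 to 0 at 20; the fuzzy set decreases gradually.
    print(age, crisp_young(age), round(fuzzy_young(age), 2))
```

A 23-year-old is simply "not young" in the crisp set, while the fuzzy set still assigns a membership of 0.8.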

Few definitions
Support of a fuzzy set A: $supp(A) = \{ x \in X : \mu_A(x) > 0 \}$
Core of a fuzzy set A: $core(A) = \{ x \in X : \mu_A(x) = 1 \}$
$\alpha$-cut of a fuzzy set A: $A_\alpha = \{ x \in X : \mu_A(x) > \alpha \}$, for example $\alpha = 0.6$
Height: $\max_x \mu_A(x) \le 1$
Normal fuzzy set: $\sup_{x \in X} \mu_A(x) = 1$
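These definitions translate directly into set comprehensions when a fuzzy set is sampled at a finite number of points. The sketch below assumes such a discrete representation (a dictionary from x to its membership degree); the sample values for "young" are made up for illustration.

```python
def support(mu: dict) -> set:
    """All points with non-zero membership."""
    return {x for x, m in mu.items() if m > 0}


def core(mu: dict) -> set:
    """All points with full membership."""
    return {x for x, m in mu.items() if m == 1}


def alpha_cut(mu: dict, alpha: float) -> set:
    """All points with membership strictly above alpha (as in the definition above)."""
    return {x for x, m in mu.items() if m > alpha}


def height(mu: dict) -> float:
    """Largest membership degree; a normal fuzzy set has height 1."""
    return max(mu.values())


# Hypothetical sampled fuzzy set "young": age in years -> membership degree.
young = {15: 1.0, 20: 1.0, 23: 0.8, 26: 0.6, 29: 0.4, 32: 0.2, 35: 0.0}

print(sorted(support(young)))
print(sorted(core(young)))
print(sorted(alpha_cut(young, 0.6)))
print(height(young))
```

Because the cut is defined with a strict inequality, the point with membership exactly 0.6 is not part of the 0.6-cut.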

Definitions illustrated
[Figure: a membership function over X with its core, crossover points, support and $\alpha$-cut marked.]

Types of MF
Trapezoid: defined by four points (a, b, c, d).
Gauss/Bell: N(m, s).
[Figure: trapezoidal and Gaussian/bell-shaped membership functions $\mu(x)$ over x.]

MF example
Singleton: (a, 1) and (b, 0.5).
Triangular: defined by three points (a, b, c).
[Figure: singleton and triangular membership functions $\mu(x)$ over x.]
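The membership function shapes named on these two slides can be written as small Python functions. This is a minimal sketch that assumes strictly increasing breakpoints (a < b <= c < d for the trapezoid, a < b < c for the triangle); the function names are illustrative.

```python
import math


def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Rises on [a, b], equals 1 on [b, c], falls on [c, d]; assumes a < b <= c < d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def triangle(x: float, a: float, b: float, c: float) -> float:
    """Rises on [a, b], peaks at b, falls on [b, c]; assumes a < b < c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def gaussian(x: float, m: float, s: float) -> float:
    """Bell-shaped MF with center m and width s, with peak value 1."""
    return math.exp(-0.5 * ((x - m) / s) ** 2)


def singleton(x: float, points: dict) -> float:
    """Non-zero only at the listed points, e.g. {a: 1.0, b: 0.5}."""
    return points.get(x, 0.0)


print(trapezoid(1.5, 1, 2, 3, 4), triangle(2, 1, 2, 3), gaussian(0.0, 0.0, 1.0))
print(singleton(2.0, {1.0: 1.0, 2.0: 0.5}))
```

Note that the Gaussian MF is scaled to a peak of 1, unlike a probability density, in line with the remark that membership functions are not normalized to unit area.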

Linguistic variables
Age = 20 => Age = young: L. variable = L. value.
L. variable: temperature; terms (fuzzy sets): {cold, warm, hot}.
[Figure: membership functions $\mu_{cold}$, $\mu_{warm}$, $\mu_{hot}$ over x in °C.]

Fuzzy numbers
MFs are usually convex, with a single maximum.
MFs for similar numbers overlap.
Numbers: the core is a single point, $\mu_x(x) = 1$.
MFs decrease monotonically on both sides of the core.
Typically: triangular functions (a, b, c) or singletons.

Fuzzy rules
Commonsense knowledge may sometimes be captured in a natural way using fuzzy rules.
IF L-variable-1 = term-1 and L-variable-2 = term-2 THEN L-variable-3 = term-3
IF Temperature = hot and air-conditioning price = low THEN cooling = strong
What does a fuzzy rule of the form "IF x is A then y is B" mean?

Fuzzy implication
If => means correlation, a T-norm T(A,B) is sufficient.
The implication A=>B has many realizations.

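Interpretation of implication
If x is A then y is B: correlation or implication.
Implication: A=>B $\equiv$ not A or B (A entails B).
Correlation: A=>B $\equiv$ A and B.
[Figure: regions in the (x, y) plane covered by A and B under each interpretation.]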
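The two readings give different truth degrees for the same pair of membership values. The sketch below uses two common realizations as assumptions, since the slide does not fix them: max(1 - a, b) for "not A or B" (often called the Kleene-Dienes implication) and the MIN T-norm for "A and B" (the choice used in Mamdani-type systems).

```python
def implication_not_a_or_b(a: float, b: float) -> float:
    """'not A or B' with negation 1 - a and OR realized as max."""
    return max(1.0 - a, b)


def correlation_a_and_b(a: float, b: float) -> float:
    """'A and B' with AND realized as the MIN T-norm."""
    return min(a, b)


for a, b in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.3, 0.8)]:
    # For crisp inputs both functions reduce to the Boolean truth tables.
    print(a, b, implication_not_a_or_b(a, b), correlation_a_and_b(a, b))
```

The readings disagree most clearly when the condition is false: the implication is then fully true, while the correlation is fully false, so a rule that does not fire contributes nothing to the conclusion.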

Types of rules
FMR, Fuzzy Mapping Rules: functional dependencies, fuzzy graphs, approximation problems.
Mamdani type: IF $MF_A(x)$ = high then $MF_B(y)$ = medium.
Takagi-Sugeno type: IF $MF_A(x)$ = high then $y = f_A(x)$. Linear $f_A(x)$: first-order Sugeno type.
FIR, Fuzzy Implication Rules: logic of implications between fuzzy facts.
FIS, Fuzzy Inference Systems: combine fuzzy rules to calculate final decisions.

Fuzzy approximation
Fuzzy systems $F: \mathbb{R}^n \to \mathbb{R}^p$ use m rules to map a vector x to the output F(x), a vector or a scalar.
Singleton model: $R_i$: IF x is $A_i$ THEN y is $b_i$

Rule base
IF Temperature=chilly and Heating-price=expensive THEN heating=no
IF Temperature=freezing and Heating-price=cheap THEN heating=full

Heating as a function of Temperature (columns) and Price (rows):

Price \ Temperature   freezing   cold     chilly
cheap                 full       full     medium
so-so                 full       medium   weak
expensive             medium     weak     no

1. Fuzzification
Fuzzification: from measured values to MFs. Determine membership degrees for all fuzzy sets (linguistic variables).
Temperature: T = 15 °C gives $\mu_{chilly}(T) = 0.5$ for the condition IF Temperature = chilly.
Heating-price: p = 48 Euro/MBtu gives $\mu_{cheap}(p) = 0.3$ for the condition and Heating-price = cheap.

2. Term composition
Calculate the degree of rule fulfillment for all conditions, combining terms using fuzzy AND, e.g. the MIN operator.
For rules $R_A$: $\mu_A(X) = \mu_{A1}(X_1) \wedge \mu_{A2}(X_2) \wedge \dots \wedge \mu_{AN}(X_N)$
$\mu_{all}(X) = \min\{\mu_{chilly}(t), \mu_{cheap}(p)\} = \min\{0.5, 0.3\} = 0.3$
for IF Temperature = chilly (T = 15 °C, $\mu_{chilly}(T) = 0.5$) and Heat-price = cheap ($\mu_{cheap}(p) = 0.3$).
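Steps 1 and 2 can be sketched together in Python. The shapes of the two membership functions are assumptions (the slides only show them as figures); the breakpoints were picked so that the degrees at T = 15 °C and p = 48 Euro/MBtu reproduce the 0.5 and 0.3 quoted on the slides.

```python
def chilly(t: float) -> float:
    """Assumed triangular MF for 'chilly': peak at 20 C, zero at 10 C and 30 C."""
    return max(0.0, 1.0 - abs(t - 20.0) / 10.0)


def cheap(p: float) -> float:
    """Assumed shoulder MF for 'cheap': 1 up to 27 Euro/MBtu, falling to 0 at 57."""
    return min(1.0, max(0.0, (57.0 - p) / 30.0))


def rule_fulfillment(*degrees: float) -> float:
    """Fuzzy AND of all condition terms, realized with the MIN operator."""
    return min(degrees)


# Step 1: fuzzification of the measured values.
mu_chilly = chilly(15.0)
mu_cheap = cheap(48.0)

# Step 2: term composition for "IF Temperature = chilly and Heating-price = cheap".
mu_cond = rule_fulfillment(mu_chilly, mu_cheap)
print(mu_chilly, round(mu_cheap, 3), round(mu_cond, 3))
```

Replacing `min` with a product in `rule_fulfillment` would give another valid T-norm; MIN is used here because it is the operator named on the slide.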

3. Inference
Calculate the degree of truth of the rule conclusion: use T-norms such as MIN or product to combine the degree of fulfillment of the conditions, $\mu_{cond}$, with the MF of the conclusion, here $\mu_{full}(h)$ for THEN Heating=full.
MIN inference: $\mu_{concl}(h) = \min\{\mu_{cond}, \mu_{full}(h)\}$
Product inference: $\mu_{concl}(h) = \mu_{cond} \cdot \mu_{full}(h)$
[Figure: the conclusion MF $\mu_{full}(h)$ clipped at $\mu_{cond}$ (MIN) and scaled by $\mu_{cond}$ (product).]
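The difference between the two inference operators is easiest to see by applying both to the same conclusion MF. In this sketch the ramp used for "full" heating on a 0..100 scale is an assumption, as is the function naming; only the MIN and product operators come from the slide.

```python
def full(h: float) -> float:
    """Assumed MF for 'full' heating: 0 up to h = 50, rising linearly to 1 at h = 100."""
    return min(1.0, max(0.0, (h - 50.0) / 50.0))


def infer_min(mu_cond: float, conclusion_mf):
    """MIN inference: the conclusion MF is clipped at the rule fulfillment degree."""
    return lambda h: min(mu_cond, conclusion_mf(h))


def infer_product(mu_cond: float, conclusion_mf):
    """Product inference: the conclusion MF is scaled by the rule fulfillment degree."""
    return lambda h: mu_cond * conclusion_mf(h)


clipped = infer_min(0.3, full)
scaled = infer_product(0.3, full)

for h in (50.0, 60.0, 75.0, 100.0):
    print(h, round(clipped(h), 3), round(scaled(h), 3))
```

Clipping keeps the original slope and cuts the top flat, while scaling preserves the shape and shrinks it; both never exceed the degree of rule fulfillment.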

4. Aggregation
Aggregate all possible rule conclusions (THEN Heating=full, THEN Heating=medium, THEN Heating=no) using the MAX operator to calculate the sum (union) of the conclusion MFs.
[Figure: aggregated conclusion MF over h.]

5. Defuzzification
Calculate a crisp value/decision using, for example, the "Center of Gravity" (COG) method.
For discrete sets use a "center of singletons"; for continuous sets:
$h = \frac{\sum_i \mu_i A_i c_i}{\sum_i \mu_i A_i}$
$\mu_i$ = degree of membership in set i
$A_i$ = area under the MF for set i
$c_i$ = center of gravity for set i.
[Figure: aggregated conclusion MF $\mu_{concl}(h)$ with the COG marked at h = 73.]
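Aggregation and defuzzification can be sketched together. Note the substitution: instead of the area-weighted sum over sets, the code below integrates the MAX-aggregated curve numerically on a grid, which is another common way to obtain a center of gravity. The three triangular output sets on a 0..100 heating scale, the grid size and all names are assumptions for illustration.

```python
def tri(x: float, a: float, b: float, c: float) -> float:
    """Triangular MF with feet a and c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


# Assumed output sets for "Heating" on a 0..100 scale.
OUTPUT_SETS = {"no": (-50.0, 0.0, 50.0), "medium": (0.0, 50.0, 100.0), "full": (50.0, 100.0, 150.0)}


def aggregate(activations: dict, h: float) -> float:
    """MAX over all rule conclusions, each clipped (MIN) at its activation degree."""
    return max(min(deg, tri(h, *OUTPUT_SETS[name])) for name, deg in activations.items())


def center_of_gravity(activations: dict, steps: int = 1000) -> float:
    """Numerical COG of the aggregated MF on an evenly spaced grid over [0, 100]."""
    grid = [100.0 * i / steps for i in range(steps + 1)]
    weights = [aggregate(activations, h) for h in grid]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no rule fired, the output is undefined")
    return sum(h * w for h, w in zip(grid, weights)) / total


print(round(center_of_gravity({"medium": 0.3, "full": 0.7}), 1))
```

When only the symmetric "medium" set is active the result is the middle of the scale; shifting activation towards "full" moves the crisp output to higher heating values.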

FIS for heating
Fuzzification: the measured temperature T is mapped to membership degrees, e.g. $\mu_{freeze} = 0.7$, $\mu_{cold} = 0.2$, $\mu_{warm} = 0.0$.
Inference with the rule base:
if temp=freezing then valve=open
if temp=cold then valve=half open
if temp=warm then valve=closed
Defuzzification: the output sets $\mu_{full}$, $\mu_{half}$, $\mu_{closed}$ over v give the crisp output that controls the valve position.
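With singleton conclusions, the whole chain for this controller reduces to a weighted average ("center of singletons"). The sketch below assumes numeric valve positions of 1.0, 0.5 and 0.0 for open, half open and closed; those numbers are not given on the slide.

```python
# Assumed crisp valve positions for the three conclusions (fraction of full opening).
VALVE_POSITION = {"open": 1.0, "half open": 0.5, "closed": 0.0}

# Rule base from the slide: temperature term -> valve term.
RULES = {"freezing": "open", "cold": "half open", "warm": "closed"}


def valve_setting(memberships: dict) -> float:
    """Center of singletons: membership-weighted average of the rule conclusions."""
    total = sum(memberships.values())
    if total == 0.0:
        raise ValueError("no rule fired, the output is undefined")
    weighted = sum(mu * VALVE_POSITION[RULES[term]] for term, mu in memberships.items())
    return weighted / total


setting = valve_setting({"freezing": 0.7, "cold": 0.2, "warm": 0.0})
print(round(setting, 3))
```

For the membership degrees of the example the valve ends up mostly open, pulled slightly towards the half-open position by the weaker "cold" rule.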

Takagi-Sugeno rules
Mamdani rules conclude a fuzzy term:
IF $X_1 = A_1$ and $X_2 = A_2$ ... $X_n = A_n$ THEN $Y = B$
TS rules conclude some functional dependence $f(x_i)$:
IF $X_1 = A_1$ and $X_2 = A_2$ ... $X_n = A_n$ THEN $Y = f(x_1, x_2, \dots, x_n)$
TS rules are usually based on piecewise linear functions (equivalent to a linear spline approximation):
IF $X_1 = A_1$ and $X_2 = A_2$ ... $X_n = A_n$ THEN $Y = a_0 + a_1 x_1 + \dots + a_n x_n$

Fuzzy system in Matlab
rulelist=[ ];
fis=addrule(fis,rulelist);
showrule(fis)
gensurf(fis);
surfview(fis);
Columns of each rulelist row: first input, second input, output, rule weight, operator (1=AND, 2=OR).
1. If (temperature is cold) and (oilprice is normal) then (heating is high) (1)
2. If (temperature is cold) and (oilprice is expensive) then (heating is medium) (1)
3. If (temperature is warm) and (oilprice is cheap) then (heating is high) (1)
4. If (temperature is warm) and (oilprice is normal) then (heating is medium) (1)
5. If (temperature is cold) and (oilprice is cheap) then (heating is high) (1)
6. If (temperature is warm) and (oilprice is expensive) then (heating is low) (1)
7. If (temperature is hot) and (oilprice is cheap) then (heating is medium) (1)
8. If (temperature is hot) and (oilprice is normal) then (heating is low) (1)
9. If (temperature is hot) and (oilprice is expensive) then (heating is low) (1)

Fuzzy Inference System (FIS)
IF speed is slow then brake = 2
IF speed is medium then brake = 4*speed
IF speed is high then brake = 8*speed
For speed = 2 the weights are read from MF(speed) for the sets slow, medium and high:
R1: $w_1 = 0.3$; $r_1 = 2$
R2: $w_2 = 0.8$; $r_2 = 4 \cdot 2$
R3: $w_3 = 0.1$; $r_3 = 8 \cdot 2$
Brake $= \sum_i (w_i r_i) / \sum_i w_i = 8.6 / 1.2 \approx 7.17$
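The weighted average in this example is easy to check numerically. The sketch below takes the three firing weights and rule outputs as printed on the slide, with speed = 2 read off from the rule outputs; the function name is illustrative. With these numbers the weighted average evaluates to about 7.17.

```python
def sugeno_output(weights, outputs) -> float:
    """Weighted average of rule outputs, with the firing strengths as weights."""
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no rule fired, the output is undefined")
    return sum(w * r for w, r in zip(weights, outputs)) / total


speed = 2.0
weights = [0.3, 0.8, 0.1]                   # firing strengths for slow, medium, high
outputs = [2.0, 4.0 * speed, 8.0 * speed]   # rule conclusions evaluated at this speed

brake = sugeno_output(weights, outputs)
print(round(brake, 2))  # 7.17
```

The weights do not need to sum to 1, because the division by their sum normalizes the result.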

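First-order TS FIS
Rules:
IF X is $A_1$ and Y is $B_1$ then $Z = p_1 x + q_1 y + r_1$
IF X is $A_2$ and Y is $B_2$ then $Z = p_2 x + q_2 y + r_2$
Fuzzy inference: for inputs x = 3 and y = 2 the rules fire with strengths $w_1$ and $w_2$ and give
$z_1 = p_1 x + q_1 y + r_1$
$z_2 = p_2 x + q_2 y + r_2$
$z = \frac{w_1 z_1 + w_2 z_2}{w_1 + w_2}$
[Figure: membership functions $A_1$, $A_2$ over X and $B_1$, $B_2$ over Y, evaluated at x = 3 and y = 2.]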

Induction of fuzzy rules
All this may be presented in the form of networks.
Choices/adaptive parameters in fuzzy rules:
The number of rules (nodes).
The number of terms for each attribute.
Position of the membership function (MF).
MF shape for each attribute/term.
Type of rules (conclusions).
Type of inference and composition operators.
Induction algorithms: incremental or refinement.
Type of learning procedure.

Feature space partition
[Figure: two partitions of the feature space, a regular grid and independent functions.]

MFs on a grid
Advantage: simplest approach.
Regular grid: divide each dimension into a fixed number of MFs and assign an average value from all samples that belong to the region.
Irregular grid: find the largest error, divide the grid there in two parts, adding a new MF.
Mixed method: start from a regular grid, adapt parameters later.
Disadvantages: for k dimensions and N MFs in each, $N^k$ areas are created! Poor quality of approximation.
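Both the regular-grid assignment and its main disadvantage can be illustrated with a short sketch. As a simplification, the code treats each grid region as a crisp cell rather than as a product of overlapping MFs; the function names and the sample data are made up for illustration.

```python
from collections import defaultdict


def grid_regions(n_mfs: int, n_dims: int) -> int:
    """Number of regions (candidate rules) on a regular grid: N MFs in each of k dimensions."""
    return n_mfs ** n_dims


def regular_grid_averages(samples, lows, highs, n_mfs: int) -> dict:
    """Assign to every occupied grid cell the average target of the samples inside it.

    samples: iterable of (x_vector, y); lows/highs: per-dimension input ranges.
    Cells are crisp here, a simplification of overlapping MFs.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y in samples:
        cell = tuple(
            min(n_mfs - 1, int((xi - lo) / (hi - lo) * n_mfs))
            for xi, lo, hi in zip(x, lows, highs)
        )
        sums[cell] += y
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


for k in (2, 5, 10):
    print(k, grid_regions(3, k))

data = [((0.1, 0.1), 1.0), ((0.2, 0.3), 3.0), ((0.9, 0.9), 5.0)]
print(regular_grid_averages(data, lows=(0.0, 0.0), highs=(1.0, 1.0), n_mfs=2))
```

Even with only three MFs per dimension, ten input features already produce tens of thousands of regions, most of which will contain no training samples.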

Optimized MF
Advantages: higher accuracy, better approximation, fewer functions, context-dependent MFs.
Optimized MFs may come from:
Neurofuzzy systems, equivalent to RBF networks with Gaussian functions (several proofs). FSM models with triangular or trapezoidal functions. Modified MLP networks with bicentral functions, etc.
Decision trees, fuzzy decision trees.
Fuzzy machine learning inductive systems.
Disadvantages: extraction of rules is hard, optimized MFs are more difficult to create.

Improving sets of rules
How to improve known sets of rules?
Use minimization methods to improve parameters of fuzzy rules: usually non-gradient methods are used, most often genetic algorithms.
Change rules into a neural network, train the network and convert it into rules again.
Use heuristic methods for local adaptation of parameters of individual rules.
Fuzzy logic is good for modeling imprecise knowledge, but...
What do the decision borders of a FIS look like? Is it worthwhile to make the input fuzzy and the output crisp?
Is it the best approximation method?

Fuzzy rules and data uncertainty
Data has been measured with unknown error. Assume a Gaussian distribution: x becomes a fuzzy number with a Gaussian membership function $G_x$.
A set of logical rules R is used for fuzzy input vectors: Monte Carlo simulations for an arbitrary system give $p(C_i|X)$.
Analytical evaluation of $p(C|X)$ is based on the cumulative distribution function of $G_x$; the error function is identical to the logistic function to within 0.02.

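Fuzzification of crisp rules
Rule $R_a(x) = \{x \ge a\}$ is fulfilled by $G_x$ with probability
$p(R_a(G_x) = T) = \int_a^{+\infty} G(y; x, s_x)\,dy = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x - a}{s_x \sqrt{2}}\right)\right] \approx \sigma(\beta(x - a))$,
where $s_x$ is the dispersion of $G_x$ and the slope $\beta$ is set by $s_x$.
The error function is approximated by the logistic function $\sigma$; assuming the error distribution $\sigma(x)(1 - \sigma(x))$, for $s^2 = 1.7$ it approximates a Gaussian to within 3.5%.
Rule $R_{ab}(x) = \{b > x \ge a\}$ is fulfilled by $G_x$ with probability
$p(R_{ab}(G_x) = T) \approx \sigma(\beta(x - a)) - \sigma(\beta(x - b))$.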
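How closely a logistic function follows the Gaussian tail probability can be checked numerically. In this sketch the slope constant 1.7 is an assumption (it is a commonly used value for matching a logistic function to the normal cumulative distribution); the slope used on the original slide was not preserved in the transcript.

```python
import math


def gaussian_rule_probability(x: float, a: float, s: float) -> float:
    """P(y >= a) for y ~ N(x, s^2): probability that the rule {y >= a} holds."""
    return 0.5 * (1.0 + math.erf((x - a) / (s * math.sqrt(2.0))))


def logistic(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def logistic_rule_probability(x: float, a: float, s: float, slope: float = 1.7) -> float:
    """Logistic approximation of the same probability; slope 1.7 is an assumed constant."""
    return logistic(slope * (x - a) / s)


# Largest gap between the two curves on a fine grid of measured values x.
max_gap = max(
    abs(gaussian_rule_probability(i / 100.0, 0.0, 1.0)
        - logistic_rule_probability(i / 100.0, 0.0, 1.0))
    for i in range(-600, 601)
)
print(max_gap)
```

Far below the threshold a the rule is almost never fulfilled, far above it almost always, and at x = a both functions give exactly 0.5.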

Soft trapezoids and NN
Conclusion: fuzzy logic with $\sigma(x) - \sigma(x - b)$ membership functions is equivalent to crisp logic + Gaussian uncertainty.
Gaussian classifiers (RBF) are equivalent to fuzzy systems with Gaussian membership functions.
The difference between two sigmoids makes a soft trapezoidal membership function.
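A soft trapezoid built from two sigmoids takes only a few lines. The interval [0, 10] and the slope value below are illustrative assumptions; a larger slope corresponds to a smaller input uncertainty and gives steeper edges.

```python
import math


def logistic(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def soft_trapezoid(x: float, a: float, b: float, slope: float) -> float:
    """Difference of two sigmoids: close to 1 inside [a, b], close to 0 outside."""
    return logistic(slope * (x - a)) - logistic(slope * (x - b))


a, b, slope = 0.0, 10.0, 1.7  # assumed interval and slope
for x in (-5.0, 0.0, 5.0, 10.0, 15.0):
    print(x, round(soft_trapezoid(x, a, b, slope), 4))
```

At the interval borders the membership is close to 0.5, which matches the reading as a probability: a measurement exactly at the border of a crisp rule fulfills it in about half of the cases.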

Optimization of rules
Fuzzy: large receptive fields, rough estimations.
$G_x$: uncertainty of inputs, small receptive fields.
Minimization of the number of errors is difficult and non-gradient, but now Monte Carlo or analytical $p(C|X;M)$ can be used.
Gradient optimization works for a large number of parameters.
Parameters $s_x$ are known for some features; use them as optimization parameters for the others!
Probabilities replace 0/1 rule outcomes.
Vectors that were not classified by crisp rules now have non-zero probabilities.

Summary
Fuzzy sets/logic is a useful form of knowledge representation, allowing for approximate but natural expression of some types of knowledge.
An alternative way is to include uncertainty of input data while using crisp logic rules.
Adaptation of fuzzy rule parameters leads to neurofuzzy systems; the simplest are the RBF networks and Separable Function Networks (SFN), equivalent to any fuzzy inference systems.
Results may sometimes be better than with other systems since it is easier to include a priori knowledge in fuzzy systems.

Disclaimer
A few slides/figures were taken from various presentations found on the Internet; unfortunately I cannot identify the original authors at the moment, since these slides went through different iterations; one source seems to be J.-S. Roger Jang from NTHU, Taiwan. I have to apologize for that.