Data mining II The fuzzy way Włodzisław Duch Dept. of Informatics, Nicholas Copernicus University, Toruń, Poland ISEP.


2 Data mining II: The fuzzy way
Włodzisław Duch, Dept. of Informatics, Nicholas Copernicus University, Toruń, Poland
http://www.phys.uni.torun.pl/~duch
ISEP Porto, 8-12 July 2002

3 Basic ideas
- Complex problems cannot be analyzed precisely.
- Knowledge of an expert may be approximated using imprecise concepts. Example: if the weather is nice and the place is attractive, then not many participants stay at the school.
Fuzzy logic/systems include:
- Mathematics of fuzzy sets/systems, fuzzy logics.
- Fuzzy knowledge representation for clustering, classification and regression.
- Extraction of fuzzy concepts and rules from data.
- Fuzzy control theory.

4 Types of uncertainty
- Stochastic uncertainty: rolling dice, accident, insurance risk ... (probability theory).
- Measurement uncertainty: about 3 cm; 20 degrees (statistics).
- Information uncertainty: trustworthy client, known constraints (data mining).
- Linguistic uncertainty: small, fast, low price (fuzzy logic).

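5 Crisp sets
Crisp set: $\text{young} = \{ x \in M \mid \text{age}(x) \le 20 \}$
Membership function: $\mu_{\text{young}}(x) = 1$ if $\text{age}(x) \le 20$, and $\mu_{\text{young}}(x) = 0$ if $\text{age}(x) > 20$.
[Figure: step membership function for A = "young" over x [years], vertical axis from 0 to 1.]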

6 Fuzzy sets
$X$: universe (space); $x \in X$.
$A$: linguistic variable, concept, fuzzy set.
$\mu_A$: a membership function (MF), determining the degree to which $x$ belongs to $A$.
- Linguistic variables, concepts: sums of fuzzy sets.
- Logical predicate functions with continuous values.
- Membership value is different from probability: $\mu(\text{bald}) = 0.8$ does not mean bald in 4 out of 5 cases.
- Probabilities are normalized to 1, MFs are not.
- Fuzzy concepts are subjective and context-dependent.

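7 Fuzzy examples
Crisp and fuzzy concept "young men".
"Boiling temperature" has a value around 100 degrees (depending on pressure and chemistry).
[Figure: crisp and fuzzy membership functions for A = "young" over x [years], vertical axis from 0 to 1; the fuzzy plot marks x = 20, x = 23 and $\mu = 0.8$.]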

8 A few definitions
Support of a fuzzy set A: $\text{supp}(A) = \{ x \in X : \mu_A(x) > 0 \}$
Core of a fuzzy set A: $\text{core}(A) = \{ x \in X : \mu_A(x) = 1 \}$
$\alpha$-cut of a fuzzy set A: $A_\alpha = \{ x \in X : \mu_A(x) > \alpha \}$, for example $\alpha = 0.6$
Height: $\max_x \mu_A(x) \le 1$
Normal fuzzy set: $\sup_{x \in X} \mu_A(x) = 1$
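The definitions above can be tried out on a small discrete fuzzy set. The sketch below is only an illustration: the set `young` and its membership degrees are made-up values, not data from the slides, and the α-cut uses the strict inequality given in the definition.

```python
# Hypothetical discrete fuzzy set "young": age in years -> membership degree.
young = {15: 1.0, 20: 1.0, 23: 0.8, 26: 0.6, 30: 0.3, 35: 0.0}

def support(fuzzy_set):
    """Elements with non-zero membership."""
    return {x for x, mu in fuzzy_set.items() if mu > 0}

def core(fuzzy_set):
    """Elements with full membership."""
    return {x for x, mu in fuzzy_set.items() if mu == 1}

def alpha_cut(fuzzy_set, alpha):
    """Elements whose membership strictly exceeds alpha."""
    return {x for x, mu in fuzzy_set.items() if mu > alpha}

def height(fuzzy_set):
    """Largest membership degree in the set."""
    return max(fuzzy_set.values())

def is_normal(fuzzy_set):
    """A fuzzy set is normal when its height equals 1."""
    return height(fuzzy_set) == 1

print(sorted(support(young)), sorted(core(young)), sorted(alpha_cut(young, 0.6)))
```

With the strict inequality, the element with membership exactly 0.6 is not part of the 0.6-cut.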

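9 Definitions illustrated
[Figure: a membership function (MF) over X with vertical axis 0, 0.5, 1; labels mark the core, the crossover points, the support and the $\alpha$-cut at level $\alpha$.]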

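10 Types of MF
Trapezoid: parameters a, b, c, d. [Figure: trapezoidal $\mu(x)$ over x, vertical axis from 0 to 1, breakpoints a, b, c, d.]
Gauss/Bell: N(m, s). [Figure: bell-shaped $\mu(x)$ over x, vertical axis from 0 to 1, with center c marked.]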

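11 MF example
Triangular: parameters a, b, c. [Figure: triangular $\mu(x)$ over x, vertical axis from 0 to 1.]
Singleton: (a, 1) and (b, 0.5). [Figure: singletons at a and b, vertical axis from 0 to 1.]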
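The membership function shapes named on these slides can be written down in a few lines. This is a minimal sketch under stated assumptions: the function names are illustrative, the trapezoid assumes a < b <= c < d, and the Gaussian is scaled to peak at 1 (a common convention for membership functions, not something the slides specify).

```python
import math

def trapezoid(x, a, b, c, d):
    """Trapezoidal MF: 0 outside (a, d), 1 on [b, c], linear on the slopes."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def triangle(x, a, b, c):
    """Triangular MF: a trapezoid whose plateau shrinks to the point b."""
    return trapezoid(x, a, b, b, c)

def gaussian(x, m, s):
    """Bell-shaped MF centered at m with width s, equal to 1 at x = m."""
    return math.exp(-((x - m) ** 2) / (2.0 * s ** 2))

print(trapezoid(15, 10, 20, 30, 40), triangle(25, 10, 20, 30), gaussian(1, 0, 1))
```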

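12 Linguistic variables
W = 20 => Age = young, following the pattern: linguistic variable = linguistic value.
Linguistic variable: temperature. Terms (fuzzy sets): { cold, warm, hot }.
[Figure: membership functions $\mu_{cold}$, $\mu_{warm}$, $\mu_{hot}$ over temperature x [C], vertical axis from 0 to 1, axis ticks at 20 and 40.]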

13 Fuzzy numbers
- MFs are usually convex, with a single maximum.
- MFs for similar numbers overlap.
- Numbers: core = point, $\mu_x(x) = 1$.
- MFs decrease monotonically on both sides of the core.
- Typically: triangular functions (a, b, c) or singletons.

14 Fuzzy rules
Commonsense knowledge may sometimes be captured in a natural way using fuzzy rules.
IF L-variable-1 = term-1 and L-variable-2 = term-2 THEN L-variable-3 = term-3
IF Temperature = hot and air-conditioning price = low THEN cooling = strong
What does it mean for fuzzy rules: IF x is A then y is B?

15 Fuzzy implication
If => means correlation, a T-norm T(A, B) is sufficient.
A => B has many realizations.

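16 Interpretation of implication
If x is A then y is B: correlation or implication.
Implication (A entails B): $A \Rightarrow B \equiv \neg A \lor B$
Correlation: $A \Rightarrow B \equiv A \land B$
[Figure: two plots of the regions covered by A and B in the (x, y) plane, one for each interpretation.]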

17 Types of rules
FMR, Fuzzy Mapping Rules. Functional dependencies, fuzzy graphs, approximation problems.
- Mamdani type: IF $MF_A(x)$ = high THEN $MF_B(y)$ = medium.
- Takagi-Sugeno type: IF $MF_A(x)$ = high THEN $y = f_A(x)$. Linear $f_A(x)$: first-order Sugeno type.
FIR, Fuzzy Implication Rules. Logic of implications between fuzzy facts.
FIS, Fuzzy Inference Systems. Combine fuzzy rules to calculate final decisions.

18 Fuzzy approximation
Fuzzy systems $F: \mathbb{R}^n \to \mathbb{R}^p$ use m rules to map a vector x onto the output F(x), a vector or a scalar.
Singleton model: $R_i$: IF x is $A_i$ THEN y is $b_i$

19 Rule base
IF Temperature = chilly and Heating-price = expensive THEN Heating = no
IF Temperature = freezing and Heating-price = cheap THEN Heating = full
Heating as a function of temperature and price:
Price \ Temperature | freezing | cold   | chilly
cheap               | full     | full   | medium
so-so               | full     | medium | weak
expensive           | medium   | weak   | no

20 1. Fuzzification
Fuzzification: from measured values to MFs. Determine membership degrees for all fuzzy sets (linguistic variables):
- Temperature: T = 15 C gives $\mu_{chilly}(T) = 0.5$
- Heating-price: p = 48 Euro/MBtu gives $\mu_{cheap}(p) = 0.3$
Rule conditions: IF Temperature = chilly and Heating-price = cheap ...
[Figure: membership functions over t and p, vertical axes from 0 to 1, with the degrees 0.5 and 0.3 marked at 15 C and 48 Euro/MBtu.]

21 2. Term composition
Calculate the degree of rule fulfillment for all conditions, combining terms using fuzzy AND, e.g. the MIN operator.
For rule $R_A$: $\mu_A(X) = \mu_{A1}(X_1) \land \mu_{A2}(X_2) \land \dots \land \mu_{AN}(X_N)$
$\mu_{all}(X) = \min\{ \mu_{chilly}(t), \mu_{cheap}(p) \} = \min\{0.5, 0.3\} = 0.3$
[Figure: IF Temperature = chilly ($\mu_{chilly}(T) = 0.5$ at 15 C) and Heat-price = cheap ($\mu_{cheap}(p) = 0.3$ at 48 Euro/MBtu).]
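Fuzzification and term composition for the heating rule can be sketched as follows. The breakpoints of the two membership functions are assumptions chosen only so that they reproduce the degrees quoted on the slides (0.5 at 15 C and 0.3 at 48 Euro/MBtu); the slides do not give the actual shapes.

```python
def triangle(x, a, b, c):
    """Triangular MF with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def ramp_down(x, lo, hi):
    """MF equal to 1 below lo, 0 above hi, linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)

# Step 1, fuzzification: measured values -> membership degrees.
# Breakpoints (10, 20, 30) and (41, 51) are hypothetical.
mu_chilly = triangle(15.0, 10.0, 20.0, 30.0)
mu_cheap = ramp_down(48.0, 41.0, 51.0)

# Step 2, term composition: fuzzy AND realized with the MIN operator.
rule_fulfillment = min(mu_chilly, mu_cheap)
print(mu_chilly, mu_cheap, rule_fulfillment)
```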

22 3. Inference
Calculate the degree of truth of the rule conclusion: use T-norms such as MIN or product to combine the degree of fulfillment of the conditions with the MF of the conclusion.
Inference with MIN: $\mu_{concl}(h) = \min\{ \mu_{cond}, \mu_{full}(h) \}$
Inference with product: $\mu_{concl}(h) = \mu_{cond} \cdot \mu_{full}(h)$
[Figure: for THEN Heating = full with $\mu_{cond} = 0.3$, the MF $\mu_{full}(h)$ over h (vertical axis from 0 to 1) and the resulting $\mu_{concl}(h)$ for both variants.]

23 4. Aggregation
Aggregate the conclusions of all rules using the MAX operator to calculate their sum.
[Figure: the conclusions THEN Heating = full, THEN Heating = medium and THEN Heating = no combined into one function over h, vertical axis from 0 to 1.]
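The inference and aggregation steps can be sketched on a discretized output axis. Everything numeric here is an assumption for illustration: the triangular output sets, the grid, and the firing degrees other than the 0.3 used on the slides.

```python
def triangle(x, a, b, c):
    """Triangular MF with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Hypothetical output sets for the heating level h (in percent).
OUTPUT_SETS = {
    "no": (-50.0, 0.0, 50.0),
    "medium": (0.0, 50.0, 100.0),
    "full": (50.0, 100.0, 150.0),
}
# Degree of fulfillment of each rule's conditions (0.3 as on the slides,
# the other values are made up).
firing = {"no": 0.0, "medium": 0.6, "full": 0.3}
grid = [float(h) for h in range(0, 101, 5)]

def aggregated(h, t_norm):
    """Inference with the given T-norm, then MAX aggregation over all rules."""
    return max(t_norm(firing[name], triangle(h, *abc))
               for name, abc in OUTPUT_SETS.items())

agg_min = [aggregated(h, min) for h in grid]                     # clipped MFs
agg_prod = [aggregated(h, lambda w, mu: w * mu) for h in grid]   # scaled MFs
print(agg_min[15], agg_prod[15])  # values at h = 75
```

MIN clips each conclusion at the rule's firing degree, while the product rescales it, so the two aggregated curves differ on the slopes but agree at the peaks.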

24 5. Defuzzification
Calculate a crisp value/decision using, for example, the "Center of Gravity" (COG) method. For discrete sets use a "center of singletons"; for continuous sets:
$h = \frac{\sum_i \mu_i A_i c_i}{\sum_i \mu_i A_i}$
where $\mu_i$ = degree of membership in i, $A_i$ = area under the MF for the set i, $c_i$ = center of gravity for the set i.
[Figure: $\mu_{concl}(h)$ over h, vertical axis from 0 to 1, with the COG marked at 73.]
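The center-of-gravity formula is a weighted average of the set centers, which a short sketch makes concrete. The three triangular output sets and the membership degrees below are hypothetical; they are not the values behind the figure on the slide.

```python
def triangle_area_and_centroid(a, b, c):
    """Area and center of gravity of a unit-height triangle (a, b, c)."""
    return (c - a) / 2.0, (a + b + c) / 3.0

def center_of_gravity(terms):
    """terms: iterable of (mu_i, A_i, c_i); returns sum(mu*A*c) / sum(mu*A)."""
    terms = list(terms)
    denominator = sum(mu * area for mu, area, _ in terms)
    if denominator == 0:
        raise ValueError("no rule fired, the output is undefined")
    return sum(mu * area * c for mu, area, c in terms) / denominator

# Hypothetical output sets (heating level in percent) and firing degrees.
output_sets = {"weak": (0.0, 25.0, 50.0),
               "medium": (25.0, 50.0, 75.0),
               "full": (50.0, 75.0, 100.0)}
degrees = {"weak": 0.0, "medium": 0.2, "full": 0.7}

terms = [(degrees[name], *triangle_area_and_centroid(*abc))
         for name, abc in output_sets.items()]
h = center_of_gravity(terms)
print(round(h, 2))
```

Because all three sets here have the same area, the result reduces to the average of the centers weighted by the firing degrees.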

25 FIS for heating
Fuzzification: the measured temperature T gives $\mu_{freeze} = 0.7$, $\mu_{cold} = 0.2$, $\mu_{warm} = 0.0$.
Inference with the rule base:
- if temp = freezing then valve = open
- if temp = cold then valve = half open
- if temp = warm then valve = closed
Defuzzification: the output sets full and half, activated at 0.7 and 0.2, are combined into the output v that controls the valve position.

26 Takagi-Sugeno rules
Mamdani rules conclude that:
IF $X_1 = A_1$ and $X_2 = A_2$ ... $X_n = A_n$ THEN $Y = B$
TS rules conclude some functional dependence $f(x_i)$:
IF $X_1 = A_1$ and $X_2 = A_2$ ... $X_n = A_n$ THEN $Y = f(x_1, x_2, \dots, x_n)$
TS rules are usually based on piecewise linear functions (equivalent to linear spline approximation):
IF $X_1 = A_1$ and $X_2 = A_2$ ... $X_n = A_n$ THEN $Y = a_0 + a_1 x_1 + \dots + a_n x_n$

27 Fuzzy system in Matlab
Each row of rulelist holds: first input, second input, output, rule weight, operator (1=AND, 2=OR).
    rulelist = [ 1 1 3 1 1
                 1 2 3 1 1
                 1 3 2 1 1
                 2 1 3 1 1
                 2 2 2 1 1
                 2 3 1 1 1
                 3 1 2 1 1
                 3 2 3 1 1
                 3 3 3 1 1 ];
    fis = addrule(fis, rulelist);
    showrule(fis)
    gensurf(fis);
    surfview(fis);
Rules as displayed:
1. If (temperature is cold) and (oilprice is normal) then (heating is high) (1)
2. If (temperature is cold) and (oilprice is expensive) then (heating is medium) (1)
3. If (temperature is warm) and (oilprice is cheap) then (heating is high) (1)
4. If (temperature is warm) and (oilprice is normal) then (heating is medium) (1)
5. If (temperature is cold) and (oilprice is cheap) then (heating is high) (1)
6. If (temperature is warm) and (oilprice is expensive) then (heating is low) (1)
7. If (temperature is hot) and (oilprice is cheap) then (heating is medium) (1)
8. If (temperature is hot) and (oilprice is normal) then (heating is low) (1)
9. If (temperature is hot) and (oilprice is expensive) then (heating is low) (1)

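28 Fuzzy Inference System (FIS)
IF speed is slow THEN brake = 2
IF speed is medium THEN brake = 4 * speed
IF speed is high THEN brake = 8 * speed
For speed = 2:
R1: $w_1 = 0.3$; $r_1 = 2$
R2: $w_2 = 0.8$; $r_2 = 4 \cdot 2$
R3: $w_3 = 0.1$; $r_3 = 8 \cdot 2$
Brake $= \sum_i w_i r_i / \sum_i w_i = 8.6 / 1.2 \approx 7.17$
[Figure: MF(speed) for the sets slow, medium and high; at speed = 2 the membership degrees are 0.3, 0.8 and 0.1.]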
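The weighted average on this slide is easy to check numerically. The sketch below uses only the weights and rule consequents given on the slide, with the output variable named `brake`; evaluating them directly gives 8.6 / 1.2, about 7.17.

```python
speed = 2.0

# Firing strengths of the three rules at speed = 2, as given on the slide.
weights = [0.3, 0.8, 0.1]

# Rule consequents: a constant and two functions linear in speed.
consequents = [
    lambda s: 2.0,      # IF speed is slow
    lambda s: 4.0 * s,  # IF speed is medium
    lambda s: 8.0 * s,  # IF speed is high
]

rule_outputs = [f(speed) for f in consequents]
brake = sum(w * r for w, r in zip(weights, rule_outputs)) / sum(weights)
print(round(brake, 2))
```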

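29 First-order TS FIS
Rules:
IF X is $A_1$ and Y is $B_1$ THEN $Z = p_1 x + q_1 y + r_1$
IF X is $A_2$ and Y is $B_2$ THEN $Z = p_2 x + q_2 y + r_2$
Fuzzy inference for x = 3, y = 2, with rule firing strengths $w_1$ and $w_2$:
$z_1 = p_1 x + q_1 y + r_1$
$z_2 = p_2 x + q_2 y + r_2$
$z = \frac{w_1 z_1 + w_2 z_2}{w_1 + w_2}$
[Figure: membership functions $A_1$, $B_1$, $A_2$, $B_2$ over X and Y, evaluated at x = 3 and y = 2.]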
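A first-order Takagi-Sugeno system with two rules can be sketched as below. The Gaussian membership functions, their parameters, the coefficients p, q, r and the use of MIN for the fuzzy AND are all assumptions made for the example; the slide fixes only the form of the rules and of the weighted average.

```python
import math

def gaussian(x, m, s):
    """Bell-shaped MF centered at m with width s."""
    return math.exp(-((x - m) ** 2) / (2.0 * s ** 2))

def ts_first_order(x, y, rules):
    """Weighted average of linear rule outputs z_i = p*x + q*y + r."""
    numerator = 0.0
    denominator = 0.0
    for (m_a, s_a), (m_b, s_b), (p, q, r) in rules:
        w = min(gaussian(x, m_a, s_a), gaussian(y, m_b, s_b))  # fuzzy AND
        numerator += w * (p * x + q * y + r)
        denominator += w
    return numerator / denominator

# Each rule: (A_i parameters, B_i parameters, (p_i, q_i, r_i)); all hypothetical.
rules = [
    ((2.0, 1.0), (1.0, 1.0), (1.0, 1.0, 0.0)),   # z1 = x + y
    ((4.0, 1.0), (3.0, 1.0), (2.0, -1.0, 3.0)),  # z2 = 2x - y + 3
]
z = ts_first_order(3.0, 2.0, rules)
print(z)
```

At x = 3, y = 2 both rules fire equally strongly with these parameters, so the output is the plain average of z1 = 5 and z2 = 7.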

30 Induction of fuzzy rules
All this may be presented in the form of networks. Choices/adaptive parameters in fuzzy rules:
- The number of rules (nodes).
- The number of terms for each attribute.
- Position of the membership function (MF).
- MF shape for each attribute/term.
- Type of rules (conclusions).
- Type of inference and composition operators.
- Induction algorithms: incremental or refinement.
- Type of learning procedure.

31 Feature space partition
[Figure: two partitions of the feature space, one using a regular grid and one using independent functions.]

32 MFs on a grid
- Advantage: simplest approach.
- Regular grid: divide each dimension into a fixed number of MFs and assign an average value from all samples that belong to the region.
- Irregular grid: find the largest error, divide the grid there in two parts, adding a new MF.
- Mixed method: start from a regular grid, adapt parameters later.
- Disadvantages: for k dimensions and N MFs in each, $N^k$ areas are created! Poor quality of approximation.
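The growth of a regular grid is easy to see by enumerating the rule antecedents. The sketch below reuses the term names from the heating rule base; the helper `grid_cells` is an illustrative name.

```python
from itertools import product

def grid_cells(n_mfs, k_dims):
    """Number of regions created by N MFs in each of k dimensions."""
    return n_mfs ** k_dims

temperature_terms = ["freezing", "cold", "chilly"]
price_terms = ["cheap", "so-so", "expensive"]

# One rule antecedent per grid cell: every combination of terms.
antecedents = list(product(temperature_terms, price_terms))

print(len(antecedents), grid_cells(3, 2), grid_cells(5, 10))
```

Two inputs with three terms each need 9 rules, while ten inputs with five terms each would already need 9765625.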

33 Optimized MFs
- Advantages: higher accuracy, better approximation, fewer functions, context-dependent MFs.
- Optimized MFs may come from:
  - Neurofuzzy systems: equivalent to RBF networks with Gaussian functions (several proofs). FSM models with triangular or trapezoidal functions. Modified MLP networks with bicentral functions, etc.
  - Decision trees, fuzzy decision trees.
  - Fuzzy machine learning inductive systems.
- Disadvantages: extraction of rules is hard, optimized MFs are more difficult to create.

34 Improving sets of rules
How to improve known sets of rules?
- Use minimization methods to improve parameters of fuzzy rules: usually non-gradient methods are used, most often genetic algorithms.
- Change rules into a neural network, train the network and convert it into rules again.
- Use heuristic methods for local adaptation of parameters of individual rules.
Fuzzy logic is good for modeling imprecise knowledge, but...
- What do the decision borders of a FIS look like? Is it worthwhile to make input fuzzy and output crisp?
- Is it the best approximation method?

35 Fuzzy rules and data uncertainty
Data has been measured with unknown error. Assume a Gaussian distribution: x is a fuzzy number with a Gaussian membership function.
A set of logical rules R is used for fuzzy input vectors: Monte Carlo simulations for an arbitrary system => $p(C_i|X)$
Analytical evaluation of $p(C|X)$ is based on the cumulant (cumulative distribution) function.
The error function is identical to the logistic function to within 0.02.
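The Monte Carlo idea can be sketched for a single crisp rule. This is an illustration under stated assumptions: the rule "x >= a", the measured value, the uncertainty s and the threshold are made-up, and the analytic value uses the standard Gaussian cumulative function rather than any formula recovered from the slide.

```python
import math
import random

def rule_fires(value, a):
    """Crisp rule: the class is assigned when value >= a."""
    return value >= a

def monte_carlo_probability(x, s, a, n_samples=20000, seed=0):
    """Fraction of Gaussian-perturbed copies of x that satisfy the rule."""
    rng = random.Random(seed)
    hits = sum(rule_fires(rng.gauss(x, s), a) for _ in range(n_samples))
    return hits / n_samples

def analytic_probability(x, s, a):
    """P(value >= a) for value ~ N(x, s^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - a) / (s * math.sqrt(2.0))))

# Hypothetical measurement x = 1.0 with uncertainty s = 1.0, threshold a = 0.5.
p_mc = monte_carlo_probability(x=1.0, s=1.0, a=0.5)
p_exact = analytic_probability(x=1.0, s=1.0, a=0.5)
print(p_mc, p_exact)
```

Sampling works for any rule system, while the closed form is available only when the rule conditions are simple intervals.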

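36 Fuzzification of crisp rules
Rule $R_a(x) = \{x \ge a\}$ is fulfilled by $G_x$ with a probability given by the Gaussian cumulative function (an error function).
The error function is approximated by the logistic function; assuming the error distribution $\sigma(x)(1 - \sigma(x))$, for $s^2 = 1.7$ it approximates the Gaussian to within 3.5%.
Rule $R_{ab}(x) = \{b > x \ge a\}$ is fulfilled by $G_x$ with a probability given by the difference of two such functions.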
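How close the logistic function comes to the Gaussian cumulative function can be checked numerically. The slope 1.702 used below is a commonly quoted constant for this approximation and is an assumption here, since the transcript does not preserve the slope used on the slide.

```python
import math

def gaussian_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic(z, beta=1.702):
    """Logistic function with slope beta (1.702 is an assumed constant)."""
    return 1.0 / (1.0 + math.exp(-beta * z))

# Largest absolute difference on a fine grid over [-6, 6].
zs = [i / 1000.0 for i in range(-6000, 6001)]
max_diff = max(abs(gaussian_cdf(z) - logistic(z)) for z in zs)
print(max_diff)
```

The maximum difference stays below the 0.02 bound quoted above, which is why a crisp rule with Gaussian input uncertainty behaves like a rule with a sigmoidal membership function.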

37 Soft trapezoids and NN
Conclusion: fuzzy logic with $\sigma(x) - \sigma(x - b)$ membership functions is equivalent to crisp logic + Gaussian uncertainty.
Gaussian classifiers (RBF) are equivalent to fuzzy systems with Gaussian membership functions.
The difference between two sigmoids makes a soft trapezoidal membership function.
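A soft trapezoid built from two sigmoids can be sketched as follows. The slope parameter `beta` and the left edge `a` are assumptions added for generality; with a = 0 and beta = 1 the function reduces to the form σ(x) − σ(x − b) named above.

```python
import math

def sigmoid(z):
    """Logistic sigmoid (not hardened against very large negative z)."""
    return 1.0 / (1.0 + math.exp(-z))

def soft_trapezoid(x, b, a=0.0, beta=1.0):
    """Difference of two sigmoids: rises around a, falls around b."""
    return sigmoid(beta * (x - a)) - sigmoid(beta * (x - b))

# Hypothetical window [0, 10] with fairly sharp edges.
for x in (-10.0, 0.0, 5.0, 10.0, 20.0):
    print(x, round(soft_trapezoid(x, b=10.0, beta=2.0), 4))
```

The membership is close to 1 inside the window, about 0.5 at each edge and close to 0 far outside; larger `beta` makes the edges sharper, approaching the crisp interval rule.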

38 Optimization of rules
- Fuzzy: large receptive fields, rough estimations.
- $G_x$: uncertainty of inputs, small receptive fields.
- Minimization of the number of errors is difficult and non-gradient, but now Monte Carlo or analytical $p(C|X;M)$ can be used.
- Gradient optimization works for a large number of parameters.
- Parameters $s_x$ are known for some features; use them as optimization parameters for others!
- Probabilities instead of 0/1 rule outcomes.
- Vectors that were not classified by crisp rules now have non-zero probabilities.

39 Summary
- Fuzzy sets/logic is a useful form of knowledge representation, allowing for approximate but natural expression of some types of knowledge.
- An alternative way is to include uncertainty of input data while using crisp logic rules.
- Adaptation of fuzzy rule parameters leads to neurofuzzy systems; the simplest are the RBF networks and Separable Function Networks (SFN), equivalent to any fuzzy inference system.
- Results may sometimes be better than with other systems since it is easier to include a priori knowledge in fuzzy systems.

40 Disclaimer
A few slides/figures were taken from various presentations found on the Internet; unfortunately I cannot identify the original authors at the moment, since these slides went through different iterations; one source seems to be J.-S. Roger Jang from NTHU, Taiwan. I have to apologize for that.

