Outline Soft computing Fuzzy logic and fuzzy inference systems


1 Outline Soft computing Fuzzy logic and fuzzy inference systems
Outline:
Soft computing
Fuzzy logic and fuzzy inference systems
Neural networks
Neuro-fuzzy integration: ANFIS
Derivative-free optimization: genetic algorithms, simulated annealing, random search
Examples and demos
This is the outline of the talk. We'll start from the basics and introduce the concepts of fuzzy sets and membership functions. By using fuzzy sets, we can formulate fuzzy if-then rules, which are commonly used in our daily expressions. We can use a collection of fuzzy rules to describe a system's behavior; this forms the fuzzy inference system, or fuzzy controller if used in control systems. In particular, we can apply neural networks' learning methods in a fuzzy inference system. A fuzzy inference system with learning capability is called ANFIS, which stands for adaptive neuro-fuzzy inference system. ANFIS is already available in the current version of the Fuzzy Logic Toolbox (FLT), but it has certain restrictions; we are going to remove some of them in the next version of the FLT. Most of all, we are going to have an on-line ANFIS block for SIMULINK; this block has on-line learning capability and is ideal for on-line adaptive neuro-fuzzy control applications. We will use this block in our demos: one is inverse learning and the other is feedback linearization.

2 Neuro-Fuzzy and Soft Computing
Diagram: soft computing spans the model space (neural networks, fuzzy inference systems, adaptive networks) and the approach space (derivative-based and derivative-free optimization).

3 Fuzzy Sets Sets with fuzzy boundaries A = Set of tall people
Sets with fuzzy boundaries. A = set of tall people.
Crisp set A: membership is 0 below 170 cm and 1.0 above. Fuzzy set A: the membership function rises gradually with height (.5 at 170 cm, .9 at 180 cm).
A fuzzy set is a set with a fuzzy boundary. Suppose that A is the set of tall people. In a conventional set, or crisp set, an element either belongs or does not belong to the set; there is nothing in between. Therefore, to define a crisp set A, we need to find a threshold, say 170 cm, such that a person taller than this is in the set of tall people. For a fuzzy version of set A, we allow the degree of belonging to vary between 0 and 1. Therefore, for a person 170 cm tall, we can say that he or she is tall to the degree of 0.5; a person around 180 cm (about six feet) is tall to the degree of .9. So everything is a matter of degree in fuzzy sets. If we plot the degree of belonging against height, the curve is called a membership function. Because of its smooth transition, a fuzzy set is a better representation of our mental model of "tall". Moreover, if a fuzzy set has a step-function-like membership function, it reduces to the common crisp set.
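As a quick illustration of the crisp/fuzzy distinction above, here is a minimal Python sketch. The anchor values (degree .5 at 170 cm, .9 at 180 cm) come from the slide; the piecewise-linear shape between them is an assumption for illustration.

```python
import numpy as np

def tall_crisp(height_cm):
    # Crisp set: membership jumps from 0 to 1 at the 170 cm threshold.
    return 1.0 if height_cm >= 170 else 0.0

def tall_fuzzy(height_cm):
    # Fuzzy set: degree of belonging rises gradually with height
    # (hypothetical piecewise-linear MF through the slide's values).
    return float(np.interp(height_cm, [150, 170, 180, 190], [0.0, 0.5, 0.9, 1.0]))

print(tall_crisp(169), tall_crisp(171))   # 0.0 1.0 -- nothing in between
print(tall_fuzzy(170), tall_fuzzy(180))   # 0.5 0.9 -- a matter of degree
```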

4 Membership Functions (MFs)
About MFs: subjective measures, not probability functions. Figure: three MFs for "tall" ("tall" in Taiwan, "tall" in the US, "tall" in the NBA) giving degrees of about .8, .5, and .1 at 180 cm.
Here I would like to emphasize some important properties of membership functions. First of all, a membership function is a subjective measure; my membership function of "tall" is likely to be different from yours. It is also context sensitive. For example, I am about 180 cm tall and considered pretty tall in Taiwan. But in the States I am only considered medium build, so maybe I am tall only to the degree of .5. And if I were an NBA player, I would be considered pretty short; I cannot even do a slam dunk! So as you can see here, we have three different MFs for "tall" in different contexts. Although they are different, they do share some common characteristics: for one thing, they are all monotonically increasing from 0 to 1. Because a membership function represents a subjective measure, it is not a probability function at all.

5 Fuzzy If-Then Rules Mamdani style
Mamdani style: If pressure is high, then volume is small ("high" and "small" are fuzzy sets).
Sugeno style: If speed is medium, then resistance = 5*speed.
By using fuzzy sets, we can formulate fuzzy if-then rules that are commonly used in our daily expressions. Basically, we have two types of fuzzy rules. For Mamdani style, for instance: if pressure is high then volume is small, where "high" and "small" are described by fuzzy sets. For Sugeno style: if the speed of a moving object is medium, then the resistance due to the atmosphere is 5 times the speed. The basic difference between the two types is in their THEN part: a Mamdani-style rule has a fuzzy set there, while a Sugeno-style rule has a linear equation. Mamdani-style fuzzy rules were proposed first in the literature, and they are more appealing to human intuition. Sugeno-style fuzzy rules were proposed later, but they are more suited for mathematical design and analysis. In this talk, we'll concentrate on Sugeno-style fuzzy if-then rules.

6 Fuzzy Inference System (FIS)
If speed is low, then resistance = 2
If speed is medium, then resistance = 4*speed
If speed is high, then resistance = 8*speed
MFs for "low", "medium", and "high" give membership grades of .3, .8, and .1 at speed = 2.
A single fuzzy rule is not very interesting. But if we have a collection of fuzzy rules, we can use them to describe a system's behavior. This leads to a fuzzy inference system. For instance, we can describe the resistance experienced by a moving object by the three rules above. Then, given a crisp speed value, how do we find the resistance value from these three rules? It is quite simple and can be done in three steps. In the first step, we find the membership grades for "low", "medium", and "high". For instance, if speed is 2, the membership grades for "low", "medium", and "high" are .3, .8, and .1, respectively. These numbers represent how well the given input condition "speed = 2" satisfies the IF part of each rule; they are sometimes called the firing strengths of the rules. In the second step, we find the output of each rule, given that speed is 2. In the third step, we apply a weighted-average method to find the overall resistance, where the weighting factors are the firing strengths of the rules. The whole process of deriving the output from a given input condition is called fuzzy reasoning. For a two-input FIS, the process of fuzzy reasoning is better represented by the diagram on the next slide.
Rule 1: w1 = .3, r1 = 2
Rule 2: w2 = .8, r2 = 4*2
Rule 3: w3 = .1, r3 = 8*2
Resistance = Σ(wi*ri) / Σwi = 7.12
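The three-step reasoning just described fits in a few lines. A minimal sketch, using the rounded grades (.3, .8, .1) read off the slide; with these it returns about 7.17, while the slide's 7.12 presumably comes from the exact membership values.

```python
# Step 1: membership grades (firing strengths) at speed = 2, from the slide.
speed = 2
w = [0.3, 0.8, 0.1]                        # grades for 'low', 'medium', 'high'

# Step 2: each rule's output at this speed.
r = [2, 4 * speed, 8 * speed]              # 2, 8, 16

# Step 3: weighted average of the rule outputs, weighted by firing strength.
resistance = sum(wi * ri for wi, ri in zip(w, r)) / sum(w)
print(resistance)                          # about 7.17 with these rounded grades
```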

7 First-Order Sugeno FIS
Rule base:
If X is A1 and Y is B1, then Z = p1*x + q1*y + r1
If X is A2 and Y is B2, then Z = p2*x + q2*y + r2
Fuzzy reasoning (for x = 3, y = 2):
z1 = p1*x + q1*y + r1, z2 = p2*x + q2*y + r2
z = (w1*z1 + w2*z2) / (w1 + w2)
In this talk we are going to use first-order Sugeno fuzzy inference systems exclusively, where the output equation of each rule is a linear equation. For example, suppose we have the two fuzzy rules above. We can express the process of fuzzy reasoning by this diagram. First we find the membership grades of the IF parts of the rules; the heights of the dashed lines represent these values. Since the preconditions in the IF part are connected by AND, we use multiplication to find the firing strength of each rule. For instance, the firing strength w1 for rule 1 is the product of the heights of its two dashed lines, and similarly for w2. Once we have w1 and w2, the overall output is again derived by a weighted average.
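A minimal sketch of this two-input reasoning process. The triangular MFs and the consequent coefficients (p, q, r) below are hypothetical; only the procedure, multiplication for AND followed by a weighted average, is from the slide.

```python
import numpy as np

def tri(v, a, b, c):
    # Triangular MF with feet at a and c and peak at b (hypothetical shapes).
    return float(np.interp(v, [a, b, c], [0.0, 1.0, 0.0]))

x, y = 3.0, 2.0

# Membership grades of the IF parts (the heights of the dashed lines).
muA1, muB1 = tri(x, 0, 2, 6), tri(y, 0, 1, 4)
muA2, muB2 = tri(x, 2, 5, 8), tri(y, 1, 3, 6)

# AND is implemented as multiplication, giving the firing strengths.
w1, w2 = muA1 * muB1, muA2 * muB2

# First-order consequents z = p*x + q*y + r, with hypothetical coefficients.
z1 = 1.0 * x + 2.0 * y + 0.5
z2 = 0.5 * x - 1.0 * y + 2.0

# Overall output: weighted average of the rule outputs.
z = (w1 * z1 + w2 * z2) / (w1 + w2)
print(z)
```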

8 Neural Networks Supervised Learning Unsupervised Learning Others
Supervised learning: multilayer perceptrons, radial basis function networks, modular neural networks, LVQ (learning vector quantization).
Unsupervised learning: competitive learning networks, Kohonen self-organizing networks, ART (adaptive resonance theory).
Others: Hopfield networks.

9 Single-Layer Perceptrons
Network architecture: inputs x1, x2, x3 with weights w1, w2, w3 and bias w0; output y = signum(Σ wi*xi + w0).
Learning rule: Δwi = k*t*xi.
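A minimal sketch of the perceptron and this learning rule, reading k as the learning rate and t as the desired output (+1 or -1), with the usual convention that the update is applied only to misclassified examples; the toy data are hypothetical.

```python
import numpy as np

def signum(v):
    return 1.0 if v >= 0 else -1.0

def train_perceptron(X, t, k=0.1, epochs=20):
    X = np.hstack([np.ones((len(X), 1)), X])   # prepend x0 = 1 for the bias w0
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, ti in zip(X, t):
            if signum(w @ xi) != ti:           # update only on mistakes
                w += k * ti * xi               # Dwi = k * t * xi
    return w

# Hypothetical linearly separable data: two clusters in the plane.
X = np.array([[2.0, 2.0], [1.5, 2.5], [-1.0, -1.5], [-2.0, -0.5]])
t = np.array([1, 1, -1, -1])
w = train_perceptron(X, t)
print([signum(np.hstack([1.0, x]) @ w) for x in X])   # should reproduce t
```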

10 Single-Layer Perceptrons
Example: gender classification.
Network architecture: y = signum(h*w1 + v*w2 + w0), with y = -1 for female and +1 for male.
Training data: pairs of h (hair length) and v (voice frequency), labeled by gender.

11 Multilayer Perceptrons (MLPs)
Network architecture: inputs x1, x2; outputs y1, y2; node activations are the hyperbolic tangent or logistic function.
Learning rules: steepest descent (backprop), the conjugate gradient method, all optimization methods using the first derivative, and derivative-free optimization.

12 Multilayer Perceptrons (MLPs)
Example: the XOR problem.
Training data: the XOR truth table (x1, x2, y) = (0,0,0), (0,1,1), (1,0,1), (1,1,0).
Network architecture: a two-input MLP with one hidden layer and a single output y.
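A minimal sketch of an MLP learning XOR by plain backprop (steepest descent) with logistic activations. The 2-2-1 layer sizes, learning rate, and iteration count are assumptions, not taken from the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR truth table

sig = lambda v: 1.0 / (1.0 + np.exp(-v))
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)

for _ in range(20000):
    h = sig(X @ W1 + b1)                  # hidden layer
    out = sig(h @ W2 + b2)                # output layer
    d_out = (out - y) * out * (1 - out)   # delta at the output
    d_h = (d_out @ W2.T) * h * (1 - h)    # delta backpropagated to hidden layer
    W2 -= 0.5 * h.T @ d_out;  b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h;    b1 -= 0.5 * d_h.sum(axis=0)

print(out.round(2).ravel())               # should approach 0 1 1 0
```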

13 MLP Decision Boundaries
Decision boundaries formed by MLPs for the XOR, intertwined, and general cases:
1-layer: half planes
2-layer: convex regions
3-layer: arbitrary regions

14 Adaptive Networks Architecture: Goal: Basic training method: x z y
Architecture: feedforward networks with different node functions; squares denote nodes with parameters, circles denote nodes without parameters.
Goal: to achieve an I/O mapping specified by training data.
Basic training method: backpropagation, or steepest descent.

15 Derivative-Based Optimization
Based on first derivatives: steepest descent, the conjugate gradient method, the Gauss-Newton method, the Levenberg-Marquardt method, and many others.
Based on second derivatives: Newton's method.

16 Fuzzy Modeling
Given desired I/O pairs (a training data set) of the form (x1, ..., xn; y), construct a FIS to match the I/O pairs. Diagram: inputs x1, ..., xn feed both an unknown target system (output y) and a fuzzy inference system (output y*).
Given an input condition, it is very easy to derive the overall output. However, our task is more than that. What we want to do is to construct an appropriate fuzzy inference system that can match a desired input/output data set of an unknown target system. This is called fuzzy modeling, and it is a branch of nonlinear system identification. In general, fuzzy modeling involves two steps. The first step is structure identification, where we have to find suitable numbers of fuzzy rules and membership functions; in the FLT, this is accomplished by subtractive clustering, proposed by Steve Chiu at the Rockwell Science Center. The second step is parameter identification, where we have to find the optimal parameters for the membership functions as well as for the output linear equations; in the FLT, this is done by ANFIS, which I proposed back in 1993. Since ANFIS uses some neural network training techniques, it is also classified as one of the neuro-fuzzy modeling approaches.
Two steps in fuzzy modeling: structure identification (input selection, MF numbers) and parameter identification (optimal parameters).

17 Neuro-Fuzzy Modeling Basic approach of ANFIS Adaptive networks
Basic approach of ANFIS: generalize neural networks to adaptive networks, then specialize adaptive networks to fuzzy inference systems.
Our approach to using ANFIS as a neuro-fuzzy modeling tool is as follows. First we generalize neural network architectures to obtain adaptive networks, and then we specialize adaptive networks to derive fuzzy inference systems represented by adaptive networks; that is ANFIS. Through these processes of generalization and specialization, the backpropagation techniques used for training neural networks carry over directly, so we can train ANFIS with the same techniques. (In fact, plain backpropagation is too slow, so we use another technique to speed up training in ANFIS.)

18 ANFIS Fuzzy reasoning ANFIS (Adaptive Neuro-Fuzzy Inference System) A1
Fuzzy reasoning (as on the earlier slide):
z1 = p1*x + q1*y + r1, z2 = p2*x + q2*y + r2, z = (w1*z1 + w2*z2) / (w1 + w2)
ANFIS (Adaptive Neuro-Fuzzy Inference System): a layered network with MF nodes (A1, A2, B1, B2), product (Π) nodes computing w1 and w2, consequent nodes computing w1*z1 and w2*z2, sum (Σ) nodes computing Σwi*zi and Σwi, and a division (/) node producing z.
This slide explains the basic architecture of an ANFIS. In the upper part, we have the graphical representation of the process of fuzzy reasoning. Each operation of fuzzy reasoning can be put into a node of an adaptive network. For instance, given two inputs x and y, first we need to find the membership grades; this is represented by the four blocks in the first layer, each of which generates a membership grade. The firing strengths are computed by the nodes in the second layer, and so on.

19 Four-Rule ANFIS Input space partitioning
Input space partitioning: each input is covered by two MFs (A1, A2 for x; B1, B2 for y), so the grid gives four rules.
ANFIS (Adaptive Neuro-Fuzzy Inference System): the same layered network as before, now with four firing strengths w1, ..., w4, consequent terms w1*z1, ..., w4*z4, sum nodes for Σwi*zi and Σwi, and a division node producing z.

20 ANFIS: Parameter ID Hybrid training method x z y S / S nonlinear
The first layers hold the nonlinear parameters (MF parameters); the consequent layer holds the linear parameters (coefficients).
Hybrid training method:
                         forward pass     backward pass
MF param. (nonlinear)    fixed            steepest descent
Coef. param. (linear)    least-squares    fixed
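The forward pass of the table can be sketched directly: with the MF parameters held fixed, the ANFIS output is linear in the consequent coefficients (pi, qi, ri) once the normalized firing strengths are fixed, so a single least-squares solve identifies them. The data and the normalized firing strengths below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50
x, y = rng.uniform(0, 1, N), rng.uniform(0, 1, N)
z = 2 * x - y + 0.5                          # synthetic target outputs

# Hypothetical normalized firing strengths for a two-rule ANFIS.
w1 = x / (x + y + 1e-9)
w2 = 1.0 - w1

# Design matrix whose columns multiply (p1, q1, r1, p2, q2, r2).
A = np.column_stack([w1 * x, w1 * y, w1, w2 * x, w2 * y, w2])
coef, *_ = np.linalg.lstsq(A, z, rcond=None)
print(coef.round(3))                         # one consequent set fitting the data
```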

21 Parameter ID: Gauss-Newton Method
Synonyms: linearization method, extended Kalman filter method.
Concept:
General nonlinear model: y = f(x, q)
Linearization at q = q_now: y = f(x, q_now) + a1*(q1 - q1,now) + a2*(q2 - q2,now) + ...
LSE solution: q_next = q_now + h (AᵀA)⁻¹ Aᵀ B
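A minimal sketch of one Gauss-Newton step for a small nonlinear least-squares fit. The model y = q1*(1 - exp(-q2*x)) is a hypothetical example chosen for illustration, not from the slides.

```python
import numpy as np

def model(q, x):
    return q[0] * (1.0 - np.exp(-q[1] * x))

def jacobian(q, x):
    # Columns a1, a2: partial derivatives of the model w.r.t. q1 and q2,
    # evaluated at the linearization point.
    a1 = 1.0 - np.exp(-q[1] * x)
    a2 = q[0] * x * np.exp(-q[1] * x)
    return np.column_stack([a1, a2])

x = np.linspace(0, 4, 20)
y = 2.0 * (1.0 - np.exp(-1.5 * x))           # synthetic target data

q_now = np.array([1.0, 1.0])                 # current estimate
eta = 1.0                                    # step size h on the slide

A = jacobian(q_now, x)                       # linearization at q_now
B = y - model(q_now, x)                      # residuals
# LSE solution: q_next = q_now + h * (A'A)^-1 * A'B
q_next = q_now + eta * np.linalg.lstsq(A, B, rcond=None)[0]
print(q_next)                                # should move toward (2.0, 1.5)
```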

22 Param. ID: Levenberg-Marquardt
Formula: q_next = q_now + h (AᵀA + λI)⁻¹ Aᵀ B
Effects of λ: small λ gives the Gauss-Newton method; large λ gives steepest descent.
How to update λ: greedy policy, make λ small; cautious policy, make λ large.
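A minimal sketch of the damped step and a simple λ update policy; A and B come from linearizing the model at q_now, as in the Gauss-Newton sketch above. With λ near 0 this reduces to the Gauss-Newton step; with large λ it approaches a short steepest-descent step.

```python
import numpy as np

def lm_step(q_now, A, B, lam, eta=1.0):
    # q_next = q_now + h * (A'A + lambda*I)^-1 * A'B
    n = A.shape[1]
    return q_now + eta * np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ B)

def update_lambda(lam, step_improved_error, factor=10.0):
    # Greedy policy: shrink lambda (toward Gauss-Newton) after a good step.
    # Cautious policy: grow lambda (toward steepest descent) after a bad one.
    return lam / factor if step_improved_error else lam * factor

# Toy example: one damped step on a small linear system.
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
B = np.array([1.0, 2.0, 2.0])
print(lm_step(np.zeros(2), A, B, lam=0.1))
```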

23 Param. ID: Comparisons Steepest descent (SD) Hybrid learning (SD+LSE)
Steepest descent (SD): treats all parameters as nonlinear.
Hybrid learning (SD + LSE): distinguishes between linear and nonlinear parameters.
Gauss-Newton (GN): linearizes and treats all parameters as linear.
Levenberg-Marquardt (LM): switches smoothly between SD and GN.

24 ANFIS: Structure ID Input selection Input space partitioning
Input selection: to select relevant inputs for efficient modeling.
Input space partitioning: grid partitioning, tree partitioning, scatter partitioning, C-means clustering, the mountain method, hyperplane clustering, the CART method.

25 Derivative-Free Optimization
Genetic algorithms (GAs) Simulated annealing (SA) Random search Downhill simplex search Tabu search

26 Genetic Algorithms Motivation Look at what evolution brings us? Vision
Motivation: consider what evolution has brought us (vision, hearing, smelling, taste, touch, learning and reasoning). Can we emulate the evolutionary process with today's fast computers?

27 Genetic Algorithms Terminology: Fitness function Population
Encoding schemes Selection Crossover Mutation Elitism

28 Genetic Algorithms Binary encoding Crossover Mutation
Chromosome: (11, 6, 9) encoded as a bit string of genes.
Crossover: two parent bit strings exchange tails at a crossover point.
Mutation: a randomly chosen bit is flipped.
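A minimal sketch of these operators on a bit-string chromosome; the 4-bit width of each gene when encoding (11, 6, 9) is an assumption.

```python
import random

def encode(values, bits=4):
    # Chromosome: the genes' binary representations concatenated into one list.
    return [int(b) for v in values for b in format(v, f'0{bits}b')]

def crossover(p1, p2):
    # Single-point crossover: the parents exchange tails at a random point.
    point = random.randrange(1, len(p1))
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(chrom, rate=0.05):
    # Each bit flips independently with the given mutation rate.
    return [bit ^ (random.random() < rate) for bit in chrom]

c = encode([11, 6, 9])
print(c)                                  # [1,0,1,1, 0,1,1,0, 1,0,0,1]
child1, child2 = crossover(c, mutate(c, rate=0.5))
print(child1, child2)
```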

29 Genetic Algorithms Flowchart Current generation Next generation
Flowchart: the current generation is transformed into the next generation through elitism, selection, crossover, and mutation.

30 Genetic Algorithms Example: Find the max. of the “peaks” function
z = f(x, y) = 3*(1-x)^2*exp(-x^2 - (y+1)^2) - 10*(x/5 - x^3 - y^5)*exp(-x^2 - y^2) - 1/3*exp(-(x+1)^2 - y^2)
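The peaks function transcribed to Python, with a coarse grid evaluation to locate the maximum the GA is asked to find (about z = 8.1 near (0, 1.6)); the grid range and resolution are arbitrary choices.

```python
import numpy as np

def peaks(x, y):
    return (3 * (1 - x)**2 * np.exp(-x**2 - (y + 1)**2)
            - 10 * (x / 5 - x**3 - y**5) * np.exp(-x**2 - y**2)
            - 1 / 3 * np.exp(-(x + 1)**2 - y**2))

# Coarse grid search for the global maximum over [-3, 3] x [-3, 3].
X, Y = np.meshgrid(np.linspace(-3, 3, 601), np.linspace(-3, 3, 601))
Z = peaks(X, Y)
i = np.unravel_index(Z.argmax(), Z.shape)
print(X[i], Y[i], Z[i])                   # roughly (0, 1.6) and z about 8.1
```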

31 Genetic Algorithms Derivatives of the “peaks” function
dz/dx = -6*(1-x)*exp(-x^2-(y+1)^2) - 6*(1-x)^2*x*exp(-x^2-(y+1)^2) - 10*(1/5-3*x^2)*exp(-x^2-y^2) + 20*(1/5*x-x^3-y^5)*x*exp(-x^2-y^2) - 1/3*(-2*x-2)*exp(-(x+1)^2-y^2)

dz/dy = 3*(1-x)^2*(-2*y-2)*exp(-x^2-(y+1)^2) + 50*y^4*exp(-x^2-y^2) + 20*(1/5*x-x^3-y^5)*y*exp(-x^2-y^2) + 2/3*y*exp(-(x+1)^2-y^2)

d(dz/dx)/dx = 36*x*exp(-x^2-(y+1)^2) - 18*x^2*exp(-x^2-(y+1)^2) - 24*x^3*exp(-x^2-(y+1)^2) + 12*x^4*exp(-x^2-(y+1)^2) + 72*x*exp(-x^2-y^2) - 148*x^3*exp(-x^2-y^2) - 20*y^5*exp(-x^2-y^2) + 40*x^5*exp(-x^2-y^2) + 40*x^2*exp(-x^2-y^2)*y^5 - 2/3*exp(-(x+1)^2-y^2) - 4/3*exp(-(x+1)^2-y^2)*x^2 - 8/3*exp(-(x+1)^2-y^2)*x

d(dz/dy)/dy = -6*(1-x)^2*exp(-x^2-(y+1)^2) + 3*(1-x)^2*(-2*y-2)^2*exp(-x^2-(y+1)^2) + 200*y^3*exp(-x^2-y^2) - 200*y^5*exp(-x^2-y^2) + 20*(1/5*x-x^3-y^5)*exp(-x^2-y^2) - 40*(1/5*x-x^3-y^5)*y^2*exp(-x^2-y^2) + 2/3*exp(-(x+1)^2-y^2) - 4/3*y^2*exp(-(x+1)^2-y^2)

32 Genetic Algorithms GA process: Initial population 5th generation

33 Genetic Algorithms Performance profile

34 Simulated Annealing Analogy

35 Simulated Annealing Terminology:
Objective function E(x): the function to be optimized.
Move set: the set of next points to explore.
Generating function: selects the next point.
Acceptance function h(ΔE, T): determines whether the selected point should be accepted; usually h(ΔE, T) = 1/(1 + exp(ΔE/(cT))).
Annealing (cooling) schedule: the schedule for reducing the temperature T.

36 Simulated Annealing Flowchart Select a new point xnew in the move sets
Flowchart: select a new point x_new from the move set via the generating function; compute the objective function E(x_new); set x to x_new with probability determined by h(ΔE, T); reduce the temperature T; repeat.
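A minimal sketch of this loop, using the acceptance function h(ΔE, T) = 1/(1 + exp(ΔE/(cT))) from the previous slide. The quadratic objective, Gaussian move set, and geometric cooling schedule are assumptions for illustration.

```python
import math, random

def anneal(E, x0, T=10.0, c=1.0, cooling=0.95, steps=2000):
    x, best = x0, x0
    for _ in range(steps):
        x_new = x + random.gauss(0.0, 1.0)        # generating function
        dE = E(x_new) - E(x)
        arg = min(dE / (c * T), 50.0)             # clamp to avoid overflow
        if random.random() < 1.0 / (1.0 + math.exp(arg)):
            x = x_new                             # accept the move
        if E(x) < E(best):
            best = x
        T *= cooling                              # annealing (cooling) schedule
    return best

print(anneal(lambda x: (x - 3.0)**2, x0=-5.0))    # should end near 3.0
```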

37 Simulated Annealing Example: Traveling Salesperson Problem (TSP)
How to traverse n cities, visiting each once and only once, with minimal total distance?

38 Simulated Annealing Move sets for TSP Translation Inversion Switching
Diagram: a 12-city tour modified by each of the three move types: translation, inversion, and switching.
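A minimal sketch of the three move types, each producing a new candidate tour (a permutation of city indices) from the current one.

```python
import random

def inversion(tour):
    # Reverse the order of the cities between two random cut points.
    i, j = sorted(random.sample(range(len(tour)), 2))
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def translation(tour):
    # Remove a random segment and reinsert it at another position.
    i, j = sorted(random.sample(range(len(tour)), 2))
    segment, rest = tour[i:j + 1], tour[:i] + tour[j + 1:]
    k = random.randrange(len(rest) + 1)
    return rest[:k] + segment + rest[k:]

def switching(tour):
    # Swap two randomly chosen cities.
    i, j = random.sample(range(len(tour)), 2)
    tour = list(tour)
    tour[i], tour[j] = tour[j], tour[i]
    return tour

print(inversion(list(range(12))))
```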

39 Simulated Annealing A 100-city TSP using SA Initial random path
Remaining figure panels: the path during the SA process and the final path.

40 Simulated Annealing A 100-city TSP with penalties when crossing the circle. Panels: penalty = 0, penalty = 0.5, penalty = -0.3.

41 Conclusions Contributing factors to successful applications of neuro-fuzzy and soft computing: Sensor technologies Cheap fast microprocessors Modern fast computers

42 References and WWW Resources
"Neuro-Fuzzy and Soft Computing", J.-S. R. Jang, C.-T. Sun, and E. Mizutani, Prentice Hall, 1996.
"Neuro-Fuzzy Modeling and Control", J.-S. R. Jang and C.-T. Sun, Proceedings of the IEEE, March 1995.
"ANFIS: Adaptive-Network-based Fuzzy Inference Systems", J.-S. R. Jang, IEEE Transactions on Systems, Man, and Cybernetics, May 1993.
Internet resources:
This set of slides is available at
WWW resources about neuro-fuzzy and soft computing

