1
Function Approximation
Fariba Sharifian and Somaye Kafi, Spring 2006
2
Contents
- Introduction to Counterpropagation
- Full Counterpropagation: architecture, algorithm, application, example
- Forward-only Counterpropagation
3
Contents: Function Approximation Using Neural Network
- Introduction
- Development of Neural Network Weight Equations
- Algebraic Training Algorithms:
  - Exact Matching of Function Input-Output Data
  - Approximate Matching of Gradient Data in Algebraic Training
  - Approximate Matching of Function Input-Output Data
  - Exact Matching of Function Gradient Data
4
Introduction to Counterpropagation
Counterpropagation networks are multilayer networks that combine input, clustering, and output layers. They can be used to compress data, to approximate functions, or to associate patterns. They approximate their training input-output vector pairs by adaptively constructing a lookup table.
5
Introduction to Counterpropagation (cont.)
Training has two stages: clustering, then output-weight updating. There are two types of counterpropagation: full and forward-only.
6
Full Counterpropagation
Produces an approximation x*:y* based on: an input x vector only, an input y vector only, or an input pair x:y, possibly with some distorted or missing elements in either or both vectors.
7
Full Counterpropagation (cont.)
Phase 1: The units in the cluster layer compete, and only the winning unit is allowed to learn. The learning rule for the weight updates on the winning cluster unit J is
w_iJ(new) = w_iJ(old) + α [x_i - w_iJ(old)]
v_kJ(new) = v_kJ(old) + β [y_k - v_kJ(old)]
8
Full Counterpropagation (cont.)
Phase 2: The weights from the winning cluster unit J to the output units are adjusted so that the vector of activations of the Y output layer, y*, approximates the input vector y, and x* approximates the input vector x. The weight updates for the units in the Y output and X output layers are
u_Jk(new) = u_Jk(old) + a [y_k - u_Jk(old)]
t_Ji(new) = t_Ji(old) + b [x_i - t_Ji(old)]
9
Architecture of Full Counterpropagation
[Figure: full counterpropagation architecture. The X input units X1..Xn and the Y input units Y1..Ym feed the cluster (hidden) layer Z1..Zp through the weights w and v; the cluster layer feeds the X* output units X1*..Xn* through the weights t and the Y* output units Y1*..Ym* through the weights u.]
10
Full Counterpropagation Algorithm
11
Full Counterpropagation Algorithm (phase 1)
Step 1. Initialize weights and the learning rates α and β.
Step 2. While the stopping condition for Phase 1 is false, do Steps 3-8.
Step 3. For each training input pair x:y, do Steps 4-6.
Step 4. Set X input layer activations to vector x; set Y input layer activations to vector y.
Step 5. Find the winning cluster unit; call its index J.
Step 6. Update the weights into unit Z_J:
w_iJ(new) = w_iJ(old) + α [x_i - w_iJ(old)]
v_kJ(new) = v_kJ(old) + β [y_k - v_kJ(old)]
Step 7. Reduce the learning rates α and β.
Step 8. Test the stopping condition for Phase 1 training.
12
Full Counterpropagation algorithm (phase 2)
Step 9. While the stopping condition for Phase 2 is false, do Steps 10-16.
(Note: α and β are small, constant values during Phase 2.)
Step 10. For each training input pair x:y, do Steps 11-14.
Step 11. Set X input layer activations to vector x; set Y input layer activations to vector y.
Step 12. Find the winning cluster unit; call its index J.
Step 13. Update the weights into unit Z_J:
w_iJ(new) = w_iJ(old) + α [x_i - w_iJ(old)]
v_kJ(new) = v_kJ(old) + β [y_k - v_kJ(old)]
13
Full Counterpropagation Algorithm (phase 2)(cont.)
Step 14. Update the weights from unit Z_J to the output layers:
u_Jk(new) = u_Jk(old) + a [y_k - u_Jk(old)]
t_Ji(new) = t_Ji(old) + b [x_i - t_Ji(old)]
Step 15. Reduce the learning rates a and b.
Step 16. Test the stopping condition for Phase 2 training.
14
Which cluster is the winner?
Dot product: find the cluster with the largest net input. Euclidean distance: find the cluster with the smallest squared distance from the input.
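The two training phases above can be coded directly. Below is a minimal NumPy sketch, assuming Euclidean-distance winner selection; the function name, the learning-rate defaults, their decay, and the epoch counts are illustrative choices, not values from the slides.

```python
import numpy as np

def train_full_cpn(X, Y, p_clusters, alpha=0.5, beta=0.5, a=0.1, b=0.1,
                   decay=0.95, phase1_epochs=50, phase2_epochs=50, seed=0):
    """Minimal full counterpropagation trainer (Euclidean winner selection).

    X: (P, n) array of x training vectors, Y: (P, m) array of y training vectors.
    Returns w, v (input -> cluster weights) and t, u (cluster -> output weights).
    """
    rng = np.random.default_rng(seed)
    P, n = X.shape
    m = Y.shape[1]
    # Initialize cluster weights from random training pairs, output weights to zero.
    idx = rng.choice(P, p_clusters, replace=False)
    w = X[idx].copy()              # (p_clusters, n): X inputs -> cluster units
    v = Y[idx].copy()              # (p_clusters, m): Y inputs -> cluster units
    t = np.zeros((p_clusters, n))  # cluster -> X* output weights
    u = np.zeros((p_clusters, m))  # cluster -> Y* output weights

    def winner(x, y):
        # Smallest squared Euclidean distance over both input vectors.
        d = ((w - x) ** 2).sum(axis=1) + ((v - y) ** 2).sum(axis=1)
        return int(np.argmin(d))

    # Phase 1: move only the winning cluster unit toward each training pair.
    for _ in range(phase1_epochs):
        for x, y in zip(X, Y):
            J = winner(x, y)
            w[J] += alpha * (x - w[J])
            v[J] += beta * (y - v[J])
        alpha *= decay
        beta *= decay

    # Phase 2: keep adjusting cluster weights slowly; learn output weights t and u.
    for _ in range(phase2_epochs):
        for x, y in zip(X, Y):
            J = winner(x, y)
            w[J] += alpha * (x - w[J])
            v[J] += beta * (y - v[J])
            u[J] += a * (y - u[J])
            t[J] += b * (x - t[J])
        a *= decay
        b *= decay
    return w, v, t, u
```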
15
Full Counterpropagation Application
The application procedure for full counterpropagation is as follows:
Step 0: initialize weights.
Step 1: for each input pair x:y, do Steps 2-4.
Step 2: set the X input layer activations to vector x; set the Y input layer activations to vector y.
16
Full Counterpropagation Application (cont.)
Step 3: find the cluster unit Z_J that is closest to the input pair.
Step 4: compute the approximations to x and y:
x*_i = t_Ji, y*_k = u_Jk
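A small companion sketch of this application procedure, reusing the weight arrays produced by the training sketch above; it also accepts an x vector alone or a y vector alone, as described earlier for full counterpropagation. The function name is illustrative.

```python
import numpy as np

def recall_full_cpn(w, v, t, u, x=None, y=None):
    """Recall x* and y* from a trained full CPN given x, y, or both (minimal sketch)."""
    d = np.zeros(w.shape[0])
    if x is not None:
        d += ((w - x) ** 2).sum(axis=1)   # distance contribution from the x part
    if y is not None:
        d += ((v - y) ** 2).sum(axis=1)   # distance contribution from the y part
    J = int(np.argmin(d))                 # winning cluster unit
    return t[J], u[J]                     # x* approximation, y* approximation
```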
17
Full counterpropagation example
Function approximation of y = 1/x. After the training phase we have:
Cluster unit   v (y weight)   w (x weight)
z1             9.0            0.11
z2             7.0            0.14
z3             5.0            0.20
z4             3.3            0.30
z5             1.6            0.60
z6             0.6            1.6
z7             0.3            3.3
z8             0.2            5.0
z9             0.14           7.0
z10            0.11           9.0
18
Full counterpropagation example (cont.)
[Figure: the trained network for y = 1/x. The cluster units Z1 through Z10 hold the weight pairs of the table above (e.g. Z1: w = 0.11, v = 9.0; Z10: w = 9.0, v = 0.11) and feed the X* and Y* output units.]
19
Full counterpropagation example (cont.)
To approximate the value of y for x = 0.12: since nothing is known about y, compute the squared distances D_j from x alone:
D1 = (0.12 - 0.11)^2 = 0.0001, D2 = 0.0004, D3 = 0.0064, D4 = 0.032, D5 = 0.23, D6 = 2.2, D7 = 10.1, D8 = 23.8, D9 = 47.3, D10 = 81.
D1 is smallest, so Z1 wins and the approximation is y* ≈ 9.0, the value stored for Z1.
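The worked example can be checked with the recall sketch above, using the weights from the table; the output weights t and u are taken equal to the cluster weights w and v here, since the slides' example reads y* directly from the table values.

```python
import numpy as np

# Cluster weights from the table above (w: x weights, v: y weights).
w = np.array([[0.11], [0.14], [0.20], [0.30], [0.60], [1.6], [3.3], [5.0], [7.0], [9.0]])
v = np.array([[9.0], [7.0], [5.0], [3.3], [1.6], [0.6], [0.3], [0.2], [0.14], [0.11]])
t, u = w.copy(), v.copy()   # output weights assumed equal to the cluster weights

x_star, y_star = recall_full_cpn(w, v, t, u, x=np.array([0.12]))
print(y_star)               # -> [9.0]: Z1 has the smallest D, so y* is read from its u value
```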
20
Forward Only Counterpropagation
Forward-only counterpropagation is a simplified version of full counterpropagation. It is intended to approximate a function y = f(x) that is not necessarily invertible, and it may be used when the mapping from x to y is well defined but the mapping from y to x is not.
21
Forward Only Counterpropagation Architecture
[Figure: forward-only counterpropagation architecture. The input layer X1..Xn feeds the cluster layer Z1..Zp through the weights w; the cluster layer feeds the output layer Y1..Ym through the weights u.]
22
Forward Only Counterpropagation Algorithm
Step 1. Initialize weights, learning rates, etc.
Step 2. While the stopping condition for Phase 1 is false, do Steps 3-8.
Step 3. For each training input x, do Steps 4-6.
Step 4. Set X input layer activations to vector x.
Step 5. Find the winning cluster unit; call its index J.
Step 6. Update the weights into unit Z_J:
w_iJ(new) = w_iJ(old) + α [x_i - w_iJ(old)]
Step 7. Reduce the learning rate α.
Step 8. Test the stopping condition for Phase 1 training.
23
Step 9. While stopping condition for Phase 2 is false, do Step 10-16
(Note: α is a small, constant value during Phase 2.)
Step 10. For each training input pair x:y, do Steps 11-14.
Step 11. Set X input layer activations to vector x; set Y input layer activations to vector y.
Step 12. Find the winning cluster unit; call its index J.
Step 13. Update the weights into unit Z_J (α is small):
w_iJ(new) = w_iJ(old) + α [x_i - w_iJ(old)]
Step 14. Update the weights from unit Z_J to the output units:
u_Jk(new) = u_Jk(old) + a [y_k - u_Jk(old)]
Step 15. Reduce the learning rate a.
Step 16. Test the stopping condition for Phase 2 training.
24
Forward Only Counterpropagation Application
Step 0: initialize weights (from the training in the previous subsection).
Step 1: present input vector x.
Step 2: find the cluster unit J closest to vector x.
Step 3: set the activations of the output units: y_k = u_Jk.
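A minimal sketch of forward-only counterpropagation, covering both training and the application procedure above; the defaults and names are illustrative, not taken from the slides.

```python
import numpy as np

def train_forward_only_cpn(X, Y, p_clusters, alpha=0.5, a=0.1, decay=0.95,
                           phase1_epochs=50, phase2_epochs=50, seed=0):
    """Minimal forward-only counterpropagation trainer (sketch)."""
    rng = np.random.default_rng(seed)
    P = X.shape[0]
    w = X[rng.choice(P, p_clusters, replace=False)].copy()  # input -> cluster weights
    u = np.zeros((p_clusters, Y.shape[1]))                   # cluster -> output weights

    # Phase 1: cluster the x inputs alone.
    for _ in range(phase1_epochs):
        for x in X:
            J = int(np.argmin(((w - x) ** 2).sum(axis=1)))
            w[J] += alpha * (x - w[J])
        alpha *= decay

    # Phase 2: keep the clustering rate small and constant; learn the output weights u.
    for _ in range(phase2_epochs):
        for x, y in zip(X, Y):
            J = int(np.argmin(((w - x) ** 2).sum(axis=1)))
            w[J] += alpha * (x - w[J])
            u[J] += a * (y - u[J])
        a *= decay
    return w, u

def recall_forward_only_cpn(w, u, x):
    """Application: present x, find the closest cluster unit J, output y_k = u_Jk."""
    J = int(np.argmin(((w - x) ** 2).sum(axis=1)))
    return u[J]
```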
25
Forward only counterpropagation example
Function approximation of y = 1/x. After the training phase we have a table of the cluster units z1 through z10 with their learned input weights w (cluster centers for x) and output weights u (the corresponding approximations of y = 1/x).
26
Function Approximation Using Neural Network
- Introduction
- Development of Neural Network Weight Equations
- Algebraic Training Algorithms:
  - Exact Matching of Function Input-Output Data
  - Approximate Matching of Gradient Data in Algebraic Training
  - Approximate Matching of Function Input-Output Data
  - Exact Matching of Function Gradient Data
27
Introduction
Finding an analytical description for a set of data is referred to as the data modeling, or system identification, problem.
28
Standard tools: splines, wavelets, neural networks.
29
Why Use Neural Networks?
Splines and wavelets do not generalize well to higher-dimensional (three or more dimensions) spaces. Neural networks, in contrast, are universal approximators with a parallel architecture that can be trained to map multidimensional nonlinear functions, such that the weight equations can be treated as sets of algebraic systems while maintaining their original functional form.
30
Why Use Neural Networks? (cont.)
Neural networks are central to the solution of differential equations; they provide differentiable, closed-analytic-form solutions, have very good generalization properties, and are widely applicable. Training translates into a set of nonlinear, transcendental weight equations. The cascade structure separates the nonlinearity of the hidden nodes from the linear operations in the input and output layers, such that the weight equations can be treated as sets of algebraic systems while maintaining their original functional form.
31
Function Approximation Using Neural Network
The functions to be approximated are not known analytically; instead, a set of precise input-output samples is available. This set is referred to as the training data, with p the number of training pairs, and the data are assumed to be noise free. The functions are modeled with feedforward neural networks using an algebraic approach. Algebraic training can achieve either exact matching or approximate matching of the data at the training points, with or without derivative (gradient) information.
32
Objective
Obtain exact solutions with sufficient degrees of freedom while retaining good generalization properties, i.e., synthesize a large data set with a parsimonious (economical) network.
33
Input-to-node values: the basis of algebraic training
Algebraic training is based on the key observation that if all the inputs to the sigmoidal functions are known, the weight equations become algebraic and, often, linear. These sigmoid inputs, the input-to-node values, determine the saturation level of each sigmoid at a given data point. The training data consist of the inputs y, the outputs u, and the gradients c.
34
Structure of the weight equations
The structure of the weight equations allows the designer to analyze and train a nonlinear neural network by means of linear algebra, partly by controlling the distribution of the input-to-node values and the saturation level of the active nodes, which determine the network's generalization properties.
35
Function Approximation Using Neural Network
- Introduction
- Development of Neural Network Weight Equations
- Algebraic Training Algorithms:
  - Exact Matching of Function Input-Output Data
  - Approximate Matching of Gradient Data in Algebraic Training
  - Approximate Matching of Function Input-Output Data
  - Exact Matching of Function Gradient Data
36
Development of Neural Network Weight Equations
Objective: approximate a smooth scalar function of q inputs using a feedforward sigmoidal network.
37
Derivative information
Derivative information can improve the network's generalization properties; the partial derivatives of the function with respect to its inputs can be incorporated in the training set.
38
Network Output
The scalar output z is computed as a nonlinear transformation of the input p. Here W is the matrix of input weights, d the input bias vector, v the output weight vector, b the output bias, and σ(·) the sigmoidal activation function. The input-to-node variables n = Wp + d are the weighted sums of the input p with the bias d; z is the weighted sum of the sigmoids of these input-to-node variables plus the output bias b.
39
Scalar Output of the Network
z = v_1 σ(n_1) + v_2 σ(n_2) + ... + v_s σ(n_s) + b, with n_i = w_i1 p_1 + ... + w_iq p_q + d_i for i = 1, ..., s.
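In code the output is a few lines. The sigmoid below, σ(n) = (e^n - 1)/(e^n + 1), is an assumed choice, since the slides do not specify the exact activation; the function names are illustrative.

```python
import numpy as np

def sigmoid(n):
    # Assumed sigmoidal activation; the slides do not give the exact form.
    return (np.exp(n) - 1.0) / (np.exp(n) + 1.0)

def network_output(p, W, d, v, b):
    """Scalar network output z = v^T sigmoid(W p + d) + b (minimal sketch)."""
    n = W @ p + d          # input-to-node variables, one per hidden node
    return v @ sigmoid(n) + b
```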
40
Exact Matching of the Function's Outputs
The output weight equation is u = S v + b, where u is the vector of known training outputs, b is a vector whose elements all equal the scalar output bias, and S is the matrix of sigmoid functions evaluated at the input-to-node values n(i,k), each representing the magnitude of the input-to-node variable of the ith node for the kth training pair. The nonlinearity of the output weight equations arises purely from these sigmoid functions. If the output data u are unknown, (9) is ignored.
41
Gradient Equations
The derivative of the network output with respect to its inputs follows from the chain rule: ∂z/∂p_j = Σ_i v_i σ'(n_i) w_ij, where w_ij is the interconnection weight between the jth input and the ith node.
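A companion sketch of this gradient computation, using the chain rule above and the same assumed sigmoid as in the output sketch.

```python
import numpy as np

def sigmoid_prime(n):
    # Derivative of the assumed sigmoid (e^n - 1)/(e^n + 1): sigma'(n) = (1 - sigma(n)^2) / 2.
    s = (np.exp(n) - 1.0) / (np.exp(n) + 1.0)
    return 0.5 * (1.0 - s ** 2)

def network_gradient(p, W, d, v):
    """Gradient dz/dp of the scalar output with respect to the inputs.

    dz/dp_j = sum_i v_i * sigma'(n_i) * w_ij, with n = W p + d.
    """
    n = W @ p + d
    return W.T @ (v * sigmoid_prime(n))
```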
42
Exact Matching of the Function’s Derivatives
In the gradient weight equations, the products are element-wise vector multiplications, and W(e) represents the first e columns of W, containing the weights associated with inputs p(1) through p(e).
43
Input-to-node Weight Equations
The input-to-node weight equations are obtained by rewriting (12). If the gradient data c are unknown, (15) is ignored; when u and c are known, (9) and (15) are algebraic and linear.
44
Four Algebraic Algorithms
- Exact Matching of Function Input-Output Data
- Approximate Matching of Gradient Data in Algebraic Training
- Approximate Matching of Function Input-Output Data
- Exact Matching of Function Gradient Data
45
Function Approximation Using Neural Network
- Introduction
- Development of Neural Network Weight Equations
- Algebraic Training Algorithms:
  - Exact Matching of Function Input-Output Data
  - Approximate Matching of Gradient Data in Algebraic Training
  - Approximate Matching of Function Input-Output Data
  - Exact Matching of Function Gradient Data
46
A.Exact Matching of Function Input-Output Data
S is a known p-by-s matrix. The strategy for producing a well-conditioned S consists of generating the input weights according to the following rule: each weight is a random number drawn from a normal distribution with zero mean and unit variance, multiplied by a user-defined scalar scaling factor L that can be adjusted to obtain input-to-node values that do not saturate the sigmoids.
47
Input bias
The input bias d is computed to center each sigmoid at one of the training pairs, by requiring n(i,k) = 0 for i = k; equivalently, d is minus the diagonal of the product of W with the matrix of training inputs, where the "diag" operator extracts the diagonal of its argument and reshapes it into a column vector. This spreads the sigmoids across the entire input space.
48
Finally, the linear system in (9) is solved for v by inverting S
output bias b is an extra variable; thus, the vector b can be set equal to zero.
49
If (17) produces an ill-conditioned S, the computation is repeated
(typically, one computation suffices).
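Putting the steps of this algorithm together (random scaled input weights, centering biases, conditioning check, and the linear solve for v), a minimal sketch might look as follows; the sigmoid form, the condition-number threshold, and the retry limit are assumptions, not values from the slides.

```python
import numpy as np

def sigmoid(n):
    # Assumed sigmoid, as in the earlier sketch.
    return (np.exp(n) - 1.0) / (np.exp(n) + 1.0)

def exact_io_algebraic_training(Y, u, L=1.0, seed=0, max_tries=10):
    """Exact matching of p input-output pairs with an s = p node network (sketch).

    Y: (p, q) matrix of training inputs, u: (p,) vector of training outputs.
    Returns W, d, v, b such that the network matches u exactly at the training inputs.
    """
    rng = np.random.default_rng(seed)
    p, q = Y.shape
    s = p                                    # one hidden node per training pair
    for _ in range(max_tries):
        W = L * rng.standard_normal((s, q))  # input weights: scaled N(0, 1) samples
        # Center the ith sigmoid at the ith training input: n(i, i) = 0.
        d = -np.diag(W @ Y.T)
        N = Y @ W.T + d                      # input-to-node values n(i, k), shape (p, s)
        S = sigmoid(N)                       # matrix of sigmoids at the training points
        if np.linalg.cond(S) < 1e8:          # retry if S is ill-conditioned (threshold arbitrary)
            break
    b = 0.0                                  # output bias is an extra variable, set to zero
    v = np.linalg.solve(S, u)                # solve the linear output weight equation u = S v
    return W, d, v, b
```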
50
Exact Input-Output-Based Algebraic Algorithm
Fig. 2-a. Exact input-output-based algebraic algorithm (typically, one computation suffices).
51
Exact Input-Output-Based Algebraic Algorithm with gradient information.
Fig. 2-b. Exact input-output-based algebraic algorithm with added p-steps for incorporating gradient information.
52
Exact matching of input-output and gradient information
The input-output and gradient weight equations can be solved exactly and simultaneously for the neural parameters when the dimension (q - e) equals p, or when the training set has the special form discussed in Section IV-D.
53
Function Approximation Using Neural Network
- Introduction
- Development of Neural Network Weight Equations
- Algebraic Training Algorithms:
  - Exact Matching of Function Input-Output Data
  - Approximate Matching of Gradient Data in Algebraic Training
  - Approximate Matching of Function Input-Output Data
  - Exact Matching of Function Gradient Data
54
B.Approximate Matching of Gradient Data in Algebraic Training
The output weights and the input-to-node values are estimated first; the initial solution uses a randomized W. All parameters are then refined by a p-step node-by-node update algorithm.
55
Approximate Matching of Gradient Data in Algebraic Training (cont.)
The bias d and the output weights can then be computed solely from the estimated input-to-node values and the training data.
56
Approximate Matching of Gradient Data in Algebraic Training (cont.)
In each step, the kth gradient equations are solved for the input weights associated with the ith node; the remaining variables are obtained from the initial estimate of the weights.
57
Approximate Matching of Gradient Data in Algebraic Training (cont.)
At the end of each step, the gradient equations are solved again, and the algorithm terminates when they are satisfied to within a user-specified gradient tolerance, even if k < p. Error enters through v and through the input weights w(l,i) with l = (i+1), ..., p, but this error is adjusted in later steps. The basic idea is that the input weights of the ith node mainly contribute to the kth partial derivatives, because the ith sigmoid is centered at i = k and v can be kept bounded for a well-conditioned S.
58
Function Approximation Using Neural Network
- Introduction
- Development of Neural Network Weight Equations
- Algebraic Training Algorithms:
  - Exact Matching of Function Input-Output Data
  - Approximate Matching of Gradient Data in Algebraic Training
  - Approximate Matching of Function Input-Output Data
  - Exact Matching of Function Gradient Data
59
C.Approximate Matching of Function Input-Output Data
The algebraic approach also can approximate data with a parsimonious network when the number of training pairs p is large. An exact solution with s < p nodes exists only if rank(S|u) = rank(S) = s. In that case the linear system in (9) is not square (S is p-by-s), and the inverse relationship between u and v can be defined using the generalized inverse, or pseudoinverse, matrix: v = S^PI u, where S^PI constitutes the left pseudoinverse and b = 0. If the system is rank consistent, this gives the exact value of v; if it is not consistent, there is no exact solution, and the pseudoinverse gives the estimate that minimizes the mean-square error (MSE) in the estimate of u, which can be used as an approximate solution of the output weight equations.
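In code, the approximate output-weight solution is a least-squares solve; the sketch below uses NumPy's lstsq, which returns the same minimum-MSE estimate as the left pseudoinverse when S has full column rank.

```python
import numpy as np

def approximate_output_weights(S, u):
    """Approximate output weights via least squares (sketch).

    S: (p, s) sigmoid matrix with s < p, u: (p,) vector of training outputs.
    v minimizes || S v - u ||^2, with the output bias b taken as zero.
    """
    v, _, _, _ = np.linalg.lstsq(S, u, rcond=None)
    return v
```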
60
Approximate Matching of Function Input-Output Data (cont)
A superposition technique combines networks that individually map the nonlinear function over portions of its input space. The training set covers the entire input space, which is divided into m subsets.
61
Approximate Matching of Function Input-Output Data (cont)
Fig. 3. Superposition of node neural networks into one s-node network.
62
Approximate Matching of Function Input-Output Data (cont)
The gth neural network approximates its portion of the output vector by a corresponding estimate.
63
Approximate Matching of Function Input-Output Data (cont)
For the full network, the matrix of input-to-node values has element n(i,k) in the ith column and kth row. Its main diagonal terms are the input-to-node value matrices of the m sub-networks; the off-diagonal terms are columnwise linearly dependent on the elements of those diagonal blocks.
64
Approximate Matching of Function Input-Output Data (cont)
For the output weights, S is constructed to be of rank s, and the rank of the augmented matrix (S|u) is s or s + 1, so there is zero or only a small error during the superposition. This error does not increase with m, so several sub-networks can be algebraically superimposed to model one large training set.
65
Approximate Matching of Function Input-Output Data (cont)
The key to developing algebraic training techniques is to construct a matrix S, through the choice of the input-to-node values N, that displays the desired characteristics: S must be of rank s, and s is kept small to produce a parsimonious network.
66
Function Approximation Using Neural Network
- Introduction
- Development of Neural Network Weight Equations
- Algebraic Training Algorithms:
  - Exact Matching of Function Input-Output Data
  - Approximate Matching of Gradient Data in Algebraic Training
  - Approximate Matching of Function Input-Output Data
  - Exact Matching of Function Gradient Data
67
D.Exact Matching of Function Gradient Data
Gradient-based training sets: at every training point k, the gradient is known for e of the neural network inputs, denoted by x; the remaining (q - e) inputs are denoted by a. Input-output information is available as well as the gradients.
68
Exact Matching of Function Gradient Data (cont)
The output weight, gradient weight, and input-to-node weight equations, (34)-(36), can be treated as three linear systems by assuming that all input-to-node values n(i,k) in (36) are known.
69
First linear system, (36): reorganizing all the input-to-node values
When s = p, the input-to-node values form a known ps-dimensional column vector, and (36) can be rewritten as a linear system whose matrix A is a ps-by-(q - e + 1)s matrix computed from all the a-input vectors; a superscript indicates at which training pair each element has been evaluated.
70
Second Linear System(34)
With the input-to-node values known, system (34) becomes linear and can always be solved for v, provided s = p and S is nonsingular; v can then be treated as a constant.
71
Third linear system, (35): (35) also becomes linear
The unknowns consist of the x-input weights, and the gradients in the training set are known. X is a known ep-by-es sparse matrix composed of p block-diagonal sub-matrices. The solution order of the above linear systems is key: the input-to-node values determine the nature of S and X, and poorly chosen values will render their determinants zero.
72
Exact Matching of Function Gradient Data (cont)
The goal of the algorithm is to determine an effective distribution for the elements of the input-to-node values, so that the weight equations can be solved in one step; (36) is solved first. The strategy, which with probability one produces a well-conditioned S, consists of generating the input-to-node values according to the following rule; A and the known gradient terms are determined from the training set based on (39) and (42), choosing p = s.
73
Input-to-Node Values
These values are substituted in (38), which is solved using the left pseudoinverse A^PI; the resulting W^(a) is the best approximation to the solution, since this overdetermined system is not likely to have an exact solution.
74
Input-to-Node Values (cont.)
With this choice, the sigmoids are very nearly centered. It is desirable for one sigmoid to be centered for a given input and, to prevent ill-conditioning of S, for that same sigmoid to be close to saturation for any other known input; this requires a scaling factor based on the absolute value of the largest element in the input data. Considering that the sigmoids come close to being saturated for an input whose absolute value is greater than 5, it is found desirable for the input-to-node values to have a correspondingly large variance.
75
Exact Matching of Function Gradient Data (cont)
76
Example: Neural Network Modeling of the Sine Function
A sigmoidal neural network is trained to approximate the sine function u = sin(y) over the domain 0 ≤ y ≤ π. The training set comprises the gradient and output information shown in Table 1: {y_k, u_k, c_k}, k = 1, 2, 3, with q = e = 1.
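Since the equations and table of this example are not recoverable from the slides, the sketch below only illustrates the flavor of the result using the exact input-output algorithm of Section A (not the gradient-matching construction of this section) on assumed sample points rather than the actual Table 1 data; it reuses sigmoid and exact_io_algebraic_training from the earlier sketch.

```python
import numpy as np

# Hypothetical training points for u = sin(y) on [0, pi]; Table 1 is not reproduced
# in the slides, so these samples are purely illustrative.
y_train = np.linspace(0.0, np.pi, 5).reshape(-1, 1)   # (p, q) with q = 1
u_train = np.sin(y_train).ravel()

W, d, v, b = exact_io_algebraic_training(y_train, u_train, L=2.0)

# The network matches the training outputs exactly; check the error between them.
y_test = np.linspace(0.0, np.pi, 9).reshape(-1, 1)
z_test = sigmoid(y_test @ W.T + d) @ v + b
print(np.max(np.abs(np.sin(y_test).ravel() - z_test)))  # interpolation error off the training points
```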
79
Suppose the input-to-node values are chosen to satisfy the required conditions; it is then shown that the data are matched exactly by a network with two nodes.
82
In this example, the free parameter is chosen to make the above weight equations consistent and to meet the assumptions in (57) and (60)-(61). It can easily be shown that this corresponds to computing the corresponding elements from a single equation.
86
Conclusion: algebraic training vs. optimization-based techniques
Compared with optimization-based techniques, algebraic training offers faster execution speeds, better generalization properties, and reduced computational complexity, and it can be used to find a direct correlation between the number of network nodes needed to model a given data set and the desired accuracy of representation.
87
Function Approximation
Fariba Sharifian and Somaye Kafi, Spring 2006