Published by Diana Boone. Modified over 9 years ago.
1
Intelligent Computing (Computación Inteligente): Least-Squares Methods for System Identification
2
2 Contents
– System Identification: an Introduction
– Least-Squares Estimators
– Statistical Properties of least-squares estimators
– Maximum likelihood (ML) estimator
– Maximum likelihood estimator for linear model
– LSE for Nonlinear Models
– Developing Dynamic models from Data
– Example: Tank level modeling
3
3 System Identification: Introduction
Goal
– Determine a mathematical model for an unknown system (the target system) by observing its input-output data pairs
4
4 System Identification: Introduction
Purposes
– To predict a system’s behavior, as in time series prediction and weather forecasting
– To explain the interactions and relationships between the inputs and outputs of a system
5
5 System Identification: Introduction
Context example
– To design a controller based on a model of the system, as in aircraft or ship control
– To simulate the system under control once the model is known
6
6 Why cover System Identification
– It is a well-established and easy-to-use technique for modeling a real-life system.
– It will be needed for the section on fuzzy-neural networks.
7
7 Spring Example
Experimental data
Experiment   Force (newtons)   Length (inches)
1            1.1               1.5
2            1.9               2.1
3            3.2               2.5
4            4.4               3.3
5            5.9               4.1
6            7.4               4.6
7            9.2               5.0
What will the length be when the force is 5.0 newtons?
8
8 Components of System Identification
There are 2 main steps
– Structure identification
– Parameter identification
9
9 Structure identification
– Apply a priori knowledge about the target system to determine a class of models within which the search for the most suitable model is to be conducted
– This class of models is denoted by a function $y = f(u, \theta)$ where:
  $y$ is the model output
  $u$ is the input vector
  $\theta$ is the parameter vector
10
10 Structure identification
$f(u, \theta)$ depends on
– the problem at hand
– the designer’s experience
– the laws of nature governing the target system
11
11 Parameter identification
– The training data are applied to both the target system and the model.
– The difference between the target system output $y_i$ and the mathematical model output $\hat{y}_i$ is used to update the parameter vector $\theta$.
12
12 Parameter identification
– The structure of the model is known; however, we still need to apply optimization techniques
– The aim is to determine the parameter vector $\theta = \hat{\theta}$ such that the resulting model $\hat{y} = f(u, \hat{\theta})$ describes the system appropriately
13
13 System Identification Process
– The data set composed of $m$ desired input-output pairs $(u_i, y_i)$, $i = 1, \dots, m$, is called the training data
– System identification needs to do both structure and parameter identification repeatedly until a satisfactory model is found
14
14 System Identification: Steps
1. Specify and parameterize a class of mathematical models representing the system to be identified
2. Perform parameter identification to choose the parameters that best fit the training data set
3. Conduct a validation test to see if the identified model responds correctly to an unseen data set
4. Terminate the procedure once the results of the validation test are satisfactory. Otherwise, select another class of models and repeat steps 2 to 4
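The identification loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate model classes are polynomials of increasing order, the data are synthetic, and the names `fit_polynomial`, `rms_error`, `identify` and the tolerance value are hypothetical, not taken from the slides.

```python
import numpy as np

def fit_polynomial(u, y, order):
    """Parameter identification: least-squares fit of a polynomial model."""
    A = np.vander(u, order + 1, increasing=True)  # columns 1, u, u^2, ...
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta

def rms_error(theta, u, y):
    """Root-mean-square error of the fitted model on the data (u, y)."""
    A = np.vander(u, len(theta), increasing=True)
    return float(np.sqrt(np.mean((A @ theta - y) ** 2)))

def identify(u_train, y_train, u_val, y_val, tolerance, max_order=5):
    for order in range(max_order + 1):                   # choose a model class
        theta = fit_polynomial(u_train, y_train, order)  # fit on training data
        err = rms_error(theta, u_val, y_val)             # validate on unseen data
        if err <= tolerance:                             # stop when satisfactory
            return order, theta, err
    raise RuntimeError("no candidate model class passed validation")

# Synthetic target system (an assumption made for this sketch only).
rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, 60)
y = 1.0 + 2.0 * u - 3.0 * u**2 + rng.normal(0.0, 0.05, 60)

order, theta, err = identify(u[:40], y[:40], u[40:], y[40:], tolerance=0.1)
print(order, theta, err)
```

The first 40 samples serve as training data and the remaining 20 as the validation set; the loop stops at the lowest polynomial order whose validation error falls below the chosen tolerance.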
15
15 System Identification Process Structure and parameter identification may need to be done repeatedly
16
16 Least-Squares Estimators
17
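17 Objective of Linear Least Squares fitting
Given a training data set $\{(u_i, y_i),\ i = 1, \dots, m\}$ and the general form function
$y = \theta_1 f_1(u) + \theta_2 f_2(u) + \dots + \theta_n f_n(u)$,
find the parameters $\theta_1, \dots, \theta_n$ such that the model output $\hat{y}_i$ is a good estimate of $y_i$ for every training pair.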
18
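18 The linear model
The linear model: $y = \theta_1 f_1(u) + \theta_2 f_2(u) + \dots + \theta_n f_n(u) = f^T(u)\,\theta$
where:
– $u = (u_1, \dots, u_p)^T$ is the model input vector
– $f_1, \dots, f_n$ are known functions of $u$, collected in $f(u) = (f_1(u), \dots, f_n(u))^T$
– $\theta_1, \dots, \theta_n$ are unknown parameters to be estimated, collected in $\theta = (\theta_1, \dots, \theta_n)^T$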
19
19 Least-Squares Estimators
The task of fitting data using a linear model is referred to as linear regression, where:
– $u = (u_1, \dots, u_p)^T$ is the input vector
– $f_1(u), \dots, f_n(u)$ are the regressors
– $\theta_1, \dots, \theta_n$ form the parameter vector
20
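20 Least-Squares Estimators
We collect the training data set $\{(u_i, y_i),\ i = 1, \dots, m\}$
The system’s equations become:
$f_1(u_i)\,\theta_1 + f_2(u_i)\,\theta_2 + \dots + f_n(u_i)\,\theta_n = y_i, \quad i = 1, \dots, m$
Which is equivalent to: $A\theta = y$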
21
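21 Least-Squares Estimators
Which is equivalent to: $A\theta = y$
– where $A$ is the $m \times n$ matrix with entries $A_{ij} = f_j(u_i)$, $\theta$ is the unknown $n \times 1$ parameter vector, and $y$ is the $m \times 1$ output vector
– If $A$ is square ($m = n$) and nonsingular, the solution is $\theta = A^{-1} y$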
22
22 Least-Squares Estimators
We have
– $m$ outputs, and
– $n$ fitting parameters to find
Or
– $m$ equations, and
– $n$ unknown variables
Usually $m$ is greater than $n$
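As a small illustration of how the overdetermined system is assembled, the sketch below builds the matrix $A$ with entries $A_{ij} = f_j(u_i)$ for a scalar input. The three regressors and the sample points are hypothetical choices made only for this example.

```python
import numpy as np

# Hypothetical regressors: f_1(u) = 1, f_2(u) = u, f_3(u) = sin(u).
regressors = [lambda u: np.ones_like(u), lambda u: u, lambda u: np.sin(u)]

def design_matrix(u, regressors):
    """Return the m-by-n matrix A with A[i, j] = f_j(u_i)."""
    return np.column_stack([f(u) for f in regressors])

u = np.linspace(0.0, 3.0, 8)     # m = 8 scalar input samples
A = design_matrix(u, regressors)

print(A.shape)                   # m rows (equations), n columns (unknowns)
print(np.linalg.matrix_rank(A))  # full column rank means A^T A is nonsingular
```

With more equations than unknowns, the rank check matters more than squareness: the least-squares solution is unique when the columns of $A$ are linearly independent.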
23
23 Least-Squares Estimators
Since the model is just an approximation of the target system and the observed data might be corrupted by noise,
– an exact solution is not always possible!
To account for this, an error vector $e$ is added: $A\theta + e = y$
24
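24 Least-Squares Estimators
Our goal is now to find the estimate $\hat{\theta}$ that reduces the errors between the measured outputs $y$ and the model outputs $\hat{y} = A\hat{\theta}$
The problem: find $\hat{\theta} = \arg\min_{\theta} \|y - A\theta\|^2$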
25
25 Least-Squares Estimators
If $e = y - A\theta$ then: $E(\theta) = e^T e = (y - A\theta)^T (y - A\theta)$
We need to compute: $\min_{\theta} E(\theta)$
26
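26 Least-Squares Estimators
Theorem [least-squares estimator]
The squared error $\|y - A\theta\|^2$ is minimized when $\theta = \hat{\theta}$ satisfies the normal equation
$A^T A\,\hat{\theta} = A^T y$
If $A^T A$ is nonsingular, $\hat{\theta}$ is unique and is given by
$\hat{\theta} = (A^T A)^{-1} A^T y$
$\hat{\theta}$ is called the least-squares estimator (LSE)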
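The theorem can be checked numerically. The sketch below, on synthetic data chosen only for illustration, solves the normal equation directly, compares the result with NumPy's general least-squares solver, and verifies that the residual is orthogonal to the columns of $A$. The function name `lse` is hypothetical.

```python
import numpy as np

def lse(A, y):
    """Least-squares estimator from the normal equation A^T A theta = A^T y."""
    return np.linalg.solve(A.T @ A, A.T @ y)

# Synthetic overdetermined problem: m = 20 equations, n = 3 unknowns.
rng = np.random.default_rng(1)
A = rng.normal(size=(20, 3))
theta_true = np.array([0.5, -1.0, 2.0])
y = A @ theta_true + rng.normal(0.0, 0.1, 20)

theta_hat = lse(A, y)
theta_ref, *_ = np.linalg.lstsq(A, y, rcond=None)  # SVD-based reference solution
residual = y - A @ theta_hat

print(theta_hat)
print(np.allclose(theta_hat, theta_ref))            # both routes agree
print(np.allclose(A.T @ residual, 0.0, atol=1e-9))  # normal equation holds
```

Solving the linear system is usually preferable to forming $(A^T A)^{-1}$ explicitly, and for ill-conditioned $A$ a solver such as `np.linalg.lstsq` tends to be the more robust choice.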
27
27 Spring Example
– Structure identification can be done using domain knowledge.
– The change in length of a spring is proportional to the force applied (Hooke’s law):
$\text{length} = k_0 + k_1 \cdot \text{force}$
28
28 Spring Example
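A minimal sketch of the spring fit, assuming the Hooke's-law structure $\text{length} = k_0 + k_1 \cdot \text{force}$ and the seven experimental data pairs from the spring table. The variable names are illustrative.

```python
import numpy as np

# Experimental data from the spring example.
force = np.array([1.1, 1.9, 3.2, 4.4, 5.9, 7.4, 9.2])   # newtons
length = np.array([1.5, 2.1, 2.5, 3.3, 4.1, 4.6, 5.0])  # inches

# Regressors f_1 = 1 and f_2 = force give the 7-by-2 matrix A.
A = np.column_stack([np.ones_like(force), force])

# Least-squares estimate from the normal equation A^T A k = A^T length.
k0, k1 = np.linalg.solve(A.T @ A, A.T @ length)

predicted = k0 + k1 * 5.0   # model output for a force of 5.0 newtons
print(f"k0 = {k0:.3f}, k1 = {k1:.3f}, length at 5.0 N = {predicted:.2f} in")
```

With these data the fit comes out at roughly $k_0 \approx 1.2$ and $k_1 \approx 0.44$, which gives a predicted length of about 3.4 inches at 5.0 newtons.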
29
29 Statistical Properties of least-squares estimators
30
30 Statistical qualities of LSE
Definition [unbiased estimator]
An estimator $\hat{\theta}$ of the parameter $\theta$ is unbiased if
$E[\hat{\theta}] = \theta$
where $E[\cdot]$ is the statistical expectation
31
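31 Statistical qualities of LSE
Definition [minimal variance]
– An estimator $\hat{\theta}$ is a minimum variance estimator if for any other estimator $\theta^*$:
$\mathrm{Cov}(\hat{\theta}) \le \mathrm{Cov}(\theta^*)$
where $\mathrm{Cov}(\theta)$ is the covariance matrix of the random vector $\theta$, and the inequality means that $\mathrm{Cov}(\theta^*) - \mathrm{Cov}(\hat{\theta})$ is positive semidefinite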
32
32 Statistical qualities of LSE
Theorem [Gauss-Markov]:
– Gauss-Markov conditions: the error vector $e$ is a vector of $m$ uncorrelated random variables, each with zero mean and the same variance $\sigma^2$. This means that:
$E[e] = 0, \quad \mathrm{Cov}(e) = E[e\,e^T] = \sigma^2 I$
33
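33 Statistical qualities of LSE
Theorem [Gauss-Markov]
Under the Gauss-Markov conditions, the LSE is unbiased and has minimum variance among linear unbiased estimators.
Proof (unbiasedness): with $y = A\theta + e$ and $E[e] = 0$,
$E[\hat{\theta}] = E[(A^T A)^{-1} A^T (A\theta + e)] = \theta + (A^T A)^{-1} A^T E[e] = \theta$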
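The statistical properties can be illustrated with a small Monte Carlo experiment. This is a sketch on synthetic data, not a proof: the matrix $A$, the true parameters, the noise level and the number of trials are all assumptions made for the example. It checks that the average of many LSE estimates approaches the true $\theta$, and that their empirical covariance approaches $\sigma^2 (A^T A)^{-1}$, the standard expression for the covariance of the LSE under the Gauss-Markov conditions.

```python
import numpy as np

rng = np.random.default_rng(2)
m, sigma = 30, 0.5
A = np.column_stack([np.ones(m), np.linspace(0.0, 1.0, m)])
theta_true = np.array([1.0, -2.0])

# (A^T A)^{-1} A^T, computed once and reused for every noise realization.
lse_operator = np.linalg.solve(A.T @ A, A.T)

trials = 20000
E = rng.normal(0.0, sigma, size=(trials, m))  # each row is one error vector e
Y = A @ theta_true + E                        # each row is one observed y
estimates = Y @ lse_operator.T                # each row is one LSE estimate

mean_estimate = estimates.mean(axis=0)
empirical_cov = np.cov(estimates, rowvar=False)
theoretical_cov = sigma**2 * np.linalg.inv(A.T @ A)

print(mean_estimate)    # close to theta_true (unbiasedness)
print(empirical_cov)
print(theoretical_cov)  # close to the empirical covariance
```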
34
34 Maximum likelihood (ML) estimator
35
35 Maximum likelihood (ML) estimator
The problem
– Suppose we observe $m$ independent samples $x_1, x_2, \dots, x_m$
– coming from a probability density function with parameters $\theta_1, \dots, \theta_r$
36
36 Maximum likelihood (ML) estimator
The criterion for choosing $\theta$ is:
– Choose the parameters that maximize the probability of the observed data
Which one do you prefer? Why?
37
37 Maximum likelihood (ML) estimator
Likelihood function definition:
– For a sample of $m$ observations $x_1, x_2, \dots, x_m$
– drawn independently from a probability density function $f(x; \theta)$,
– the likelihood function $L$ is defined by
$L(\theta) = \prod_{i=1}^{m} f(x_i; \theta)$
$L$ is the joint probability density of the sample
38
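38 Maximum likelihood (ML) estimator
The ML estimator is defined as the value of $\theta$ which maximizes $L$:
$\hat{\theta} = \arg\max_{\theta} L(\theta)$
or equivalently:
$\hat{\theta} = \arg\max_{\theta} \ln L(\theta)$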
39
39 Maximum likelihood (ML) estimator
Example: ML estimation for normal distribution
– Suppose we have $m$ independent samples $x_1, x_2, \dots, x_m$ coming from a Gaussian distribution with parameters $\mu$ and $\sigma^2$. What are the MLEs of $\mu$ and $\sigma^2$?
40
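40 Maximum likelihood (ML) estimator
Example: ML estimation for normal distribution
– For $m$ observations $x_1, x_2, \dots, x_m$, we have:
$L(\mu, \sigma^2) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)$
$\ln L(\mu, \sigma^2) = -\frac{m}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{m}(x_i - \mu)^2$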
41
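41 Maximum likelihood (ML) estimator
Example: ML estimation for normal distribution
– Setting the derivatives of $\ln L$ with respect to $\mu$ and $\sigma^2$ to zero, for $m$ observations $x_1, x_2, \dots, x_m$ we have:
$\hat{\mu} = \frac{1}{m}\sum_{i=1}^{m} x_i, \qquad \hat{\sigma}^2 = \frac{1}{m}\sum_{i=1}^{m}(x_i - \hat{\mu})^2$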
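The Gaussian example can be sketched with the standard library alone. The sample values below are hypothetical, and the function names are illustrative. The code computes the sample mean and the variance with divisor $m$ (the ML estimates for a Gaussian), then confirms that the log-likelihood is no higher at nearby parameter values.

```python
import math

def gaussian_mle(samples):
    """ML estimates of a Gaussian: sample mean and variance with divisor m."""
    m = len(samples)
    mu_hat = sum(samples) / m
    sigma2_hat = sum((x - mu_hat) ** 2 for x in samples) / m
    return mu_hat, sigma2_hat

def log_likelihood(samples, mu, sigma2):
    """Gaussian log-likelihood ln L(mu, sigma^2) of independent samples."""
    m = len(samples)
    squared = sum((x - mu) ** 2 for x in samples)
    return -0.5 * m * math.log(2.0 * math.pi * sigma2) - squared / (2.0 * sigma2)

samples = [2.0, 4.0, 4.0, 5.0, 5.0, 7.0, 8.0]  # hypothetical observations
mu_hat, sigma2_hat = gaussian_mle(samples)
best = log_likelihood(samples, mu_hat, sigma2_hat)

print(mu_hat, sigma2_hat)
print(best >= log_likelihood(samples, mu_hat + 0.1, sigma2_hat))
print(best >= log_likelihood(samples, mu_hat, sigma2_hat * 1.1))
```

Note that the ML variance estimate divides by $m$, not $m - 1$, so it differs from the usual unbiased sample variance.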
42
42 Maximum likelihood estimator for linear model
43
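43 Maximum likelihood estimator for linear model
– Let a linear model be given as $y_i = f^T(u_i)\,\theta + e_i$, $i = 1, \dots, m$
– Then $e_i = y_i - f^T(u_i)\,\theta$
– Here the errors $e_i$ are independent with PDF $p_e$. The likelihood function is given by
$L(\theta) = \prod_{i=1}^{m} p_e\big(y_i - f^T(u_i)\,\theta\big)$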
44
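44 Maximum likelihood estimator for linear model
– Assume a regression model where the errors are independent and normally distributed with zero mean and variance $\sigma^2$.
– The likelihood function is given by
$L(\theta) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{\big(y_i - f^T(u_i)\,\theta\big)^2}{2\sigma^2}\right)$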
45
45 Maximum likelihood estimator for linear model
The maximum likelihood model
– Any algorithm that maximizes the likelihood $L(\theta)$
– gives the maximum likelihood model with respect to a given family of possible models
46
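46 Maximum likelihood estimator for linear model
– Same as maximizing the log-likelihood, where $\sigma^2$ is the error variance:
$\ln L(\theta) = -\frac{m}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{m}\big(y_i - f^T(u_i)\,\theta\big)^2$
– Same as minimizing the sum of squared errors:
$\sum_{i=1}^{m}\big(y_i - f^T(u_i)\,\theta\big)^2$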
47
47 Connection to Least Squares Conclusion –The least-squares fitting criterion can be understood as emerging from the use of the maximum likelihood principle for estimating a regression model where errors are distributed normally. –The applicability of the least-squares method is, however, not limited to the normality assumption.
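The connection can be checked numerically. The sketch below, on a synthetic linear regression with Gaussian errors (all values are assumptions for the example), evaluates the negative Gaussian log-likelihood at the least-squares estimate and at randomly perturbed parameter vectors. Because the negative log-likelihood is a constant plus the squared error divided by $2\sigma^2$, no perturbed vector should score better than the LSE.

```python
import numpy as np

def neg_log_likelihood(theta, A, y, sigma):
    """Negative Gaussian log-likelihood of the linear model y = A theta + e."""
    r = y - A @ theta
    m = len(y)
    return 0.5 * m * np.log(2.0 * np.pi * sigma**2) + (r @ r) / (2.0 * sigma**2)

rng = np.random.default_rng(3)
sigma = 0.3
A = np.column_stack([np.ones(25), rng.uniform(0.0, 10.0, 25)])
y = A @ np.array([1.0, 0.5]) + rng.normal(0.0, sigma, 25)

theta_lse, *_ = np.linalg.lstsq(A, y, rcond=None)
nll_at_lse = neg_log_likelihood(theta_lse, A, y, sigma)

# Random perturbations around the LSE never reduce the negative log-likelihood.
perturbed = [theta_lse + d for d in rng.normal(0.0, 0.05, size=(100, 2))]
worse_everywhere = all(
    neg_log_likelihood(t, A, y, sigma) >= nll_at_lse for t in perturbed
)
print(theta_lse, worse_everywhere)
```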
48
48 LSE for Nonlinear Models
49
49 LSE for Nonlinear Models
Nonlinear models are divided into 2 families
– Intrinsically linear
– Intrinsically nonlinear
Through appropriate transformations of the input-output variables and fitting parameters, an intrinsically linear model can become a linear model. After this transformation, LSE can be used to estimate the unknown parameters.
50
50 LSE for Nonlinear Models Examples of intrinsically linear systems
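One commonly used intrinsically linear model is the exponential $y = a\,e^{b u}$; taking logarithms gives $\ln y = \ln a + b\,u$, which is linear in the transformed parameters $(\ln a, b)$. This particular model is an assumption chosen for illustration; the examples shown on the original slide are not recoverable here. The sketch uses noise-free synthetic data.

```python
import numpy as np

# Hypothetical true parameters of y = a * exp(b * u).
a_true, b_true = 2.0, -1.3
u = np.linspace(0.0, 2.0, 15)
y = a_true * np.exp(b_true * u)  # noise-free synthetic outputs (all positive)

# Transformed linear model: ln y = ln a + b * u.
A = np.column_stack([np.ones_like(u), u])
theta, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)

a_hat = np.exp(theta[0])  # undo the transformation of the first parameter
b_hat = theta[1]
print(a_hat, b_hat)
```

With noisy data the transformed fit minimizes the squared error in $\ln y$ rather than in $y$, so the result generally differs from a direct nonlinear least-squares fit.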
51
51 Developing Dynamic models from Data
52
52 Dynamical System?
[Block diagram: input $u(t)$ → System → output $y(t)$]
53
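53 The ARX model
In dynamic systems analysis, the independent variable is often time ($k$)
– An ARX model (AutoRegressive with eXogenous input model) is often used:
$y(k) + a_1 y(k-1) + \dots + a_{n_a} y(k-n_a) = b_1 u(k-1) + \dots + b_{n_b} u(k-n_b) + e(k)$
where $u(k)$ is the input, $y(k)$ is the output, and $e(k)$ is a disturbance term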
54
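54 The ARX model
Or equivalently
– writing the ARX coefficients $a_i$, $b_j$ as polynomials in the backward shift operator $q^{-1}$,
$A(q) = 1 + a_1 q^{-1} + \dots + a_{n_a} q^{-n_a}, \qquad B(q) = b_1 q^{-1} + \dots + b_{n_b} q^{-n_b}$
the model becomes $A(q)\,y(k) = B(q)\,u(k) + e(k)$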
55
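55 The ARX model as a linear regressor
The input-output relationship can take the form
$y(k) = \varphi^T(k)\,\theta + e(k)$
– where
$\varphi(k) = [-y(k-1), \dots, -y(k-n_a),\ u(k-1), \dots, u(k-n_b)]^T$ is the regression vector
$\theta = [a_1, \dots, a_{n_a},\ b_1, \dots, b_{n_b}]^T$ is the parameter vector to estimate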
56
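56 Prediction error model estimation
The problem
– Assume input-output data $\{u(k), y(k)\},\ k = 1, \dots, N$
– Build the predictor $\hat{y}(k \mid \theta) = \varphi^T(k)\,\theta$
– such that $\theta$ minimizes the prediction error $\varepsilon(k, \theta) = y(k) - \hat{y}(k \mid \theta)$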
57
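57 Prediction error model estimation
– The model is fitted to the data by minimizing the criterion function
$V_N(\theta) = \frac{1}{N}\sum_{k=1}^{N} \varepsilon^2(k, \theta)$
which gives the least squares criterion
$V_N(\theta) = \frac{1}{N}\sum_{k=1}^{N} \big(y(k) - \varphi^T(k)\,\theta\big)^2$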
58
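58 Prediction error model estimation
Solution
– Normal equation
$\left[\sum_{k=1}^{N} \varphi(k)\,\varphi^T(k)\right]\hat{\theta} = \sum_{k=1}^{N} \varphi(k)\,y(k)$
– Estimates
$\hat{\theta} = \left[\sum_{k=1}^{N} \varphi(k)\,\varphi^T(k)\right]^{-1} \sum_{k=1}^{N} \varphi(k)\,y(k)$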
59
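59 Prediction error model estimation
In matrix form, with $\Phi$ the matrix whose rows are the regression vectors $\varphi^T(k)$ and $Y$ the vector of outputs $y(k)$, the solution is the standard linear least squares formula
$\hat{\theta} = (\Phi^T \Phi)^{-1} \Phi^T Y$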
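A minimal sketch of ARX estimation by linear least squares. It assumes the common sign convention $y(k) + a_1 y(k-1) + \dots = b_1 u(k-1) + \dots + e(k)$, so that each regression vector holds negated past outputs followed by past inputs. The helper `arx_regression`, the first-order system and its coefficients are hypothetical choices for this example.

```python
import numpy as np

def arx_regression(u, y, na, nb):
    """Build Phi (one regression vector per row) and Y for an ARX(na, nb) model."""
    start = max(na, nb)
    rows = []
    for k in range(start, len(y)):
        past_outputs = [-y[k - i] for i in range(1, na + 1)]
        past_inputs = [u[k - j] for j in range(1, nb + 1)]
        rows.append(past_outputs + past_inputs)
    return np.array(rows), np.asarray(y[start:])

# Simulate y(k) + a1*y(k-1) = b1*u(k-1) + e(k) with hypothetical coefficients.
rng = np.random.default_rng(4)
a1, b1 = -0.8, 0.5
N = 500
u = rng.normal(size=N)
y = np.zeros(N)
for k in range(1, N):
    y[k] = -a1 * y[k - 1] + b1 * u[k - 1] + rng.normal(0.0, 0.01)

Phi, Y = arx_regression(u, y, na=1, nb=1)
theta_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ Y)  # estimates of [a1, b1]
print(theta_hat)
```

Because the regression vector contains only measured past inputs and outputs, the ARX model is linear in its parameters even though it describes a dynamic system, which is why the ordinary least-squares formula applies.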
60
60 Example: Tank level modeling
61
61 Example: Tank level modeling
62
62 Example: Tank level modeling
The identification goal
– To explain how the voltage $u(t)$ (the input) affects the water level $h(t)$ (the output) of the tank
[Figure: experimental data]
63
63 Simple ARX modeling A plausible first identification attempt is to try a simple linear regression model –The parameters can easily be estimated using linear least squares, resulting in
64
64 ARX model results –Simulated water level follows the true level but at levels close to zero the linear model produces negative levels.
65
65 Semiphysical modeling
The model equation is based on dynamic conservation of mass
– The accumulation of mass in the tank is equal to the mass flow rate into the tank minus the mass flow rate out.
66
66 Semiphysical modeling While the inflow is roughly proportional to u(t) the outflow can be approximated using Bernoulli’s law –The parameters can easily be estimated using linear least squares, resulting in
67
67 Semiphysical model results
The RMS error of this model is lower than that of the simple ARX model. More importantly, no simulated output is negative, which is consistent with a physically sound model.
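The comparison between the two model structures can be sketched on synthetic data. Everything specific here is an assumption: the tank is simulated from a discretized mass balance with a square-root outflow term (in the spirit of Bernoulli's law), the coefficients and input range are invented, and the regressor sets $[h(t-1), u(t-1)]$ and $[h(t-1), \sqrt{h(t-1)}, u(t-1)]$ are plausible forms rather than the ones on the original slides. The errors reported are one-step-ahead prediction errors, not simulation errors.

```python
import numpy as np

# Synthetic tank: h(t) = h(t-1) - 0.05*sqrt(h(t-1)) + 0.02*u(t-1)  (assumed).
rng = np.random.default_rng(5)
N = 400
u = np.repeat(rng.uniform(1.0, 5.0, N // 20), 20)  # piecewise-constant voltage
h = np.empty(N)
h[0] = 1.0
for t in range(1, N):
    h[t] = max(h[t - 1] - 0.05 * np.sqrt(h[t - 1]) + 0.02 * u[t - 1], 0.0)

def fit_and_rms(A, y):
    """Least-squares fit and RMS one-step prediction error on the same data."""
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta, float(np.sqrt(np.mean((A @ theta - y) ** 2)))

target = h[1:]
arx_regressors = np.column_stack([h[:-1], u[:-1]])
semi_regressors = np.column_stack([h[:-1], np.sqrt(h[:-1]), u[:-1]])

theta_arx, rms_arx = fit_and_rms(arx_regressors, target)
theta_semi, rms_semi = fit_and_rms(semi_regressors, target)
print("ARX:         ", theta_arx, rms_arx)
print("Semiphysical:", theta_semi, rms_semi)
```

Both structures are linear in their parameters, so the same least-squares machinery applies; the semiphysical model simply adds a physically motivated nonlinear regressor.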
68
68 Sources
– Jyh-Shing Roger Jang, Chuen-Tsai Sun and Eiji Mizutani. Slides for Ch. 5 of “Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence”, First Edition, Prentice Hall, 1997.
– Djamel Bouchaffra. Soft Computing. Course materials. Oakland University, Fall 2005.
– Henrik Melgaard. Identification of Physical Models. Ph.D. thesis, Institute of Mathematical Modelling, Technical University of Denmark, 1994.
– Lecture slides, Soft Computing. Teaching materials. Dipartimento di Elettronica e Informazione, Politecnico di Milano, 2004.
– Peter Lindskog. Fuzzy Identification from a Grey Box Modeling Point of View. Department of Electrical Engineering, Linköping University, 1997.
– Jacob Roll. Local and Piecewise Affine Approaches to System Identification. Department of Electrical Engineering, Linköping University, Linköping, Sweden, 2003.