SYSTEMS Identification Ali Karimpour Assistant Professor Ferdowsi University of Mashhad Reference: “System Identification Theory For The User” Lennart.

Presentation transcript:

SYSTEMS Identification Ali Karimpour Assistant Professor Ferdowsi University of Mashhad Reference: “System Identification Theory For The User” Lennart Ljung

Lecture 10 (Ali Karimpour, Dec 2010): Computing the estimate

Topics to be covered include:
- Linear Regression and Least Squares.
- Numerical Solution by Iterative Search Method.
- Computing Gradients.
- Two-Stage and Multistage Method.
- Local Solutions and Initial Values.
- Subspace Methods for Estimating State Space Models.

Introduction

In previous chapters we studied the convergence and the asymptotic distribution of parameter estimators, and three basic parameter estimation methods were considered:
1- The prediction-error approach, in which a certain function $V_N(\theta, Z^N)$ is minimized with respect to $\theta$.
2- The correlation approach, in which a certain equation $f_N(\theta, Z^N) = 0$ is solved for $\theta$.
3- The subspace approach to estimating state-space models.

In this chapter we shall discuss how these problems are best solved numerically.

Linear Regression and Least Squares

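For linear regression we have the predictor $\hat y(t|\theta) = \varphi^T(t)\theta$. The least-squares criterion leads to

$$\hat\theta_N^{LS} = \Big[\frac{1}{N}\sum_{t=1}^{N}\varphi(t)\varphi^T(t)\Big]^{-1}\frac{1}{N}\sum_{t=1}^{N}\varphi(t)y(t).$$

An alternative form is given by the normal equations:

$$R(N)\,\hat\theta_N^{LS} = \frac{1}{N}\sum_{t=1}^{N}\varphi(t)y(t), \qquad R(N) = \frac{1}{N}\sum_{t=1}^{N}\varphi(t)\varphi^T(t).$$

Remember that the basic equation for the IV method is quite analogous, so most of what is said in this section about the LS method also applies to the IV method.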

Normal equations: the coefficient matrix R(N) may be ill-conditioned, especially when its dimension is high. The underlying idea in the methods below is that the matrix R(N) should not be formed; instead a matrix R is constructed with the property $RR^T = R(N)$. This class of methods is commonly known as “square-root algorithms”, but the term “quadratic methods” is more appropriate.

How to derive R?
- Householder transformations
- Gram-Schmidt procedure (Björck)
- Cholesky decomposition
- QR decomposition

Solving for the LS estimates by QR factorization.

The QR-factorization of an n × d matrix A is defined as $A = QR$. Here Q is a unitary n × n matrix ($QQ^T = I$) and R is an upper triangular n × d matrix.

Define $Y^T = [y(1)\ \dots\ y(N)]$ and $\Phi^T = [\varphi(1)\ \dots\ \varphi(N)]$, so that the LS criterion is $V_N(\theta) = |Y - \Phi\theta|^2 = \sum_{t=1}^{N}|y(t) - \varphi^T(t)\theta|^2$. Let Q be a unitary matrix; then the norm is unchanged, $V_N(\theta) = |Q(Y - \Phi\theta)|^2$.

Now introduce the QR-factorization

$$[\Phi\ \ Y] = QR, \qquad R = \begin{bmatrix} R_0 \\ 0 \end{bmatrix}, \qquad R_0 = \begin{bmatrix} R_1 & R_2 \\ 0 & R_3 \end{bmatrix},$$

where $R_1$ is d × d upper triangular, $R_2$ is d × 1 and $R_3$ is a scalar. This means that

$$V_N(\theta) = |Q^T(Y - \Phi\theta)|^2 = |R_2 - R_1\theta|^2 + |R_3|^2,$$

which clearly is minimized for $R_1\hat\theta_N = R_2$, giving $V_N(\hat\theta_N) = |R_3|^2$.

Solving for the LS estimates by QR factorization: advantages.

There are three important advantages with this way of solving for the LS estimate:
1- The condition number of $R_1$ is the square root of that of R(N). Therefore $R_1$ is much better conditioned than R(N).
2- $R_1$ is a triangular matrix, so the equation is easy to solve.
3- If the QR-factorization is performed for a regressor size d*, then the solutions for all models with fewer parameters are easily obtained from $R_0$.

Remark 1: If the solution has been found for a regressor size d*, then the solutions for models with more parameters are easily obtained from the Levinson algorithm.
Remark 2: Note that the big matrix Q never has to be formed. All the information is contained in the “small” matrix $R_0$.
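The QR route can be tried numerically. The following is a minimal NumPy sketch, not the lecture's own code: the polynomial regressors, the noise level and the helper name `ls_via_qr` are assumptions made only for illustration. It factorizes [Φ Y], solves the triangular system for the estimate, and compares the condition numbers of R(N) and of the triangular block.

```python
import numpy as np

def ls_via_qr(Phi, Y):
    """LS estimate from the triangular factor of [Phi Y] (hypothetical helper)."""
    d = Phi.shape[1]
    # mode="r" returns only the triangular factor; Q is never formed.
    R0 = np.linalg.qr(np.column_stack([Phi, Y]), mode="r")
    R1, R2, R3 = R0[:d, :d], R0[:d, d], R0[d, d]
    theta = np.linalg.solve(R1, R2)   # R1 is triangular; back-substitution would suffice
    return theta, R1, R3**2           # R3^2 is the minimal value of |Y - Phi theta|^2

# Assumed test data: deliberately ill-conditioned polynomial regressors.
rng = np.random.default_rng(0)
N = 200
t = np.linspace(0.0, 1.0, N)
Phi = np.column_stack([t**k for k in range(6)])
theta_true = np.arange(1.0, 7.0)
Y = Phi @ theta_true + 1e-3 * rng.standard_normal(N)

theta_qr, R1, loss = ls_via_qr(Phi, Y)
RN = Phi.T @ Phi / N                  # the matrix of the normal equations
cond_RN = np.linalg.cond(RN)
cond_R1 = np.linalg.cond(R1)
print(cond_RN, cond_R1, cond_R1**2)
```

In this sketch the squared condition number of the triangular block should match that of R(N), which is the first advantage listed above.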


Exercise 1: Consider the simple model for the system, and suppose that the values of u and y are given for t = 1 to 11.
1) Derive the estimate from eq. (I) and find the condition number of R(N).
2) Derive the estimate from eq. (II) and find the condition number of $R_1$.

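Initial conditions: “windowed” data.

The regression vector $\varphi(t)$ is built from shifted data:

$$\varphi(t) = \begin{bmatrix} z(t-1) \\ \vdots \\ z(t-n) \end{bmatrix}.$$

Here z(t-1) is an r-dimensional vector. For example, for the ARX model $z(t) = [-y(t)\ \ u(t)]^T$, and for the AR model $z(t) = -y(t)$. R(N) is then an n × n block matrix whose (i, j) block is

$$R_{ij}(N) = \frac{1}{N}\sum_{t=1}^{N} z(t-i)\,z^T(t-j).$$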

Initial conditions: “windowed” data (continued).

If we have knowledge only of z(t) for 1 ≤ t ≤ N, the question arises of how to deal with the unknown initial conditions in R(N):
1- Start the summation at t = n+1 rather than t = 1.
2- Replace the unknown initial conditions by zeros.
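The two ways of handling the unknown initial conditions can be sketched for an AR model, where z(t) = -y(t). This is only an illustration: the helper name `ar_regression_data`, the toy data, and the choice to normalize R by the number of terms actually summed are assumptions, not part of the slides.

```python
import numpy as np

def ar_regression_data(y, n, zero_pad=False):
    """Rows phi(t)^T = [-y(t-1), ..., -y(t-n)] and targets y(t) for an AR(n) model."""
    y = np.asarray(y, dtype=float)
    # Option 2: replace the unknown y(0), y(-1), ... by zeros.
    # Option 1: keep the data as is and start the summation at t = n + 1.
    z = np.concatenate([np.zeros(n), y]) if zero_pad else y
    Phi = np.array([-z[i - n:i][::-1] for i in range(n, len(z))])
    Y = z[n:]
    return Phi, Y

y = np.array([1.0, 2.0, 3.0, 4.0])                       # assumed toy data
Phi_a, Y_a = ar_regression_data(y, n=2)                  # summation starts at t = n + 1
Phi_b, Y_b = ar_regression_data(y, n=2, zero_pad=True)   # zero initial conditions
R_a = Phi_a.T @ Phi_a / len(Y_a)
R_b = Phi_b.T @ Phi_b / len(Y_b)
print(R_a)
print(R_b)
```

The first option discards n equations, while the second keeps all N but lets the assumed zeros bias the first rows; for long records the difference is usually small.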

Numerical Solution by Iterative Search Method

Numerical minimization.

In general, neither can the function $V_N(\theta, Z^N)$ be minimized nor the equation $f_N(\theta, Z^N) = 0$ be solved by analytical methods. Methods for numerical minimization of a function $V(\theta)$ update the minimizing point iteratively by

$$\hat\theta^{(i+1)} = \hat\theta^{(i)} + \alpha f^{(i)},$$

where $f^{(i)}$ is a search direction based on information about $V(\theta)$ and $\alpha$ is a positive constant. Depending on the information used to determine $f^{(i)}$, there are three groups of methods:
1- Methods using function values only.
2- Methods using values of the function as well as of its gradient.
3- Methods using values of the function, of its gradient and of its Hessian.

The three groups correspond to the following algorithms:
1- Function values only: an estimate of the gradient is formed, and then a quasi-Newton algorithm is applied.
2- Function values and gradient: quasi-Newton algorithms. An estimate of the Hessian is formed and then used in place of the true Hessian in the Newton step.
3- Function values, gradient and Hessian: Newton algorithms, with search direction $f^{(i)} = -[V''(\hat\theta^{(i)})]^{-1}V'(\hat\theta^{(i)})$.

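In general, consider the criterion function

$$V_N(\theta, Z^N) = \frac{1}{N}\sum_{t=1}^{N}\ell(\varepsilon(t,\theta), \theta).$$

The gradient is

$$V_N'(\theta, Z^N) = -\frac{1}{N}\sum_{t=1}^{N}\big\{\psi(t,\theta)\,\ell'_\varepsilon(\varepsilon(t,\theta),\theta) - \ell'_\theta(\varepsilon(t,\theta),\theta)\big\}.$$

Here $\psi(t,\theta)$ is the gradient of the predictor, $\psi(t,\theta) = \frac{d}{d\theta}\hat y(t|\theta) = -\frac{d}{d\theta}\varepsilon(t,\theta)$.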

Some explicit search schemes.

Consider the special case

$$V_N(\theta, Z^N) = \frac{1}{N}\sum_{t=1}^{N}\tfrac{1}{2}\varepsilon^2(t,\theta).$$

The gradient is

$$V_N'(\theta, Z^N) = -\frac{1}{N}\sum_{t=1}^{N}\psi(t,\theta)\,\varepsilon(t,\theta).$$

A general family of search routines is given by

$$\hat\theta^{(i+1)} = \hat\theta^{(i)} - \mu_N^{(i)}\,[R_N^{(i)}]^{-1}\,V_N'(\hat\theta^{(i)}, Z^N),$$

where $R_N^{(i)}$ is a matrix that modifies the search direction and $\mu_N^{(i)}$ is the step size.

Let $R_N^{(i)} = I$; then we have

$$\hat\theta^{(i+1)} = \hat\theta^{(i)} - \mu_N^{(i)}\,V_N'(\hat\theta^{(i)}, Z^N).$$

This is the gradient or steepest-descent method. This method is fairly inefficient close to the minimum.

Tangent-line (Newton-Raphson) iteration for solving f(x) = 0.

1- Make an initial guess $x_0$.
2- Draw the tangent line to f at $x_0$. Its equation is $y = f(x_0) + f'(x_0)(x - x_0)$.
3- Let $x_1$ be the x-intercept of the tangent line. This intercept is given by the formula $x_1 = x_0 - f(x_0)/f'(x_0)$.
4- Now repeat with $x_1$ as the initial guess to obtain $x_2$, and so on.

Some difficulties of this iteration:
- Zero derivatives: where $f'(x_i) = 0$ the tangent line has no x-intercept.
- Diverging: the iterates $x_0, x_1, x_2, \dots$ may move away from the solution.
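The tangent-line iteration and its two difficulties can be sketched in a few lines. The function name `tangent_iteration`, the tolerances and the example functions are assumptions chosen for illustration; the cubic below is a standard example where the iterates cycle between 0 and 1 instead of converging.

```python
def tangent_iteration(f, df, x0, tol=1e-12, max_iter=50):
    """Repeat x <- x - f(x)/f'(x) until successive iterates agree to within tol."""
    x = x0
    for _ in range(max_iter):
        slope = df(x)
        if slope == 0.0:
            # Zero derivative: the tangent line is horizontal and has no x-intercept.
            raise ZeroDivisionError("zero derivative at x = %r" % x)
        x_new = x - f(x) / slope
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    # The iterates did not settle down (cycling or diverging).
    raise RuntimeError("no convergence within max_iter iterations")

# Converging case: the positive root of x^2 - 2 from the guess x0 = 1.
root = tangent_iteration(lambda x: x**2 - 2.0, lambda x: 2.0 * x, x0=1.0)
print(root)

# Non-converging case: for x^3 - 2x + 2 and x0 = 0 the iterates alternate 0, 1, 0, 1, ...
try:
    tangent_iteration(lambda x: x**3 - 2.0 * x + 2.0, lambda x: 3.0 * x**2 - 2.0, x0=0.0)
    cycled = False
except RuntimeError:
    cycled = True
```

Applied to f = V', the same update is the Newton step for minimizing V, which is why the choice of initial guess matters in the later slides.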

Gradient or steepest-descent method for finding the minimum of f(x).

Some explicit search schemes: Newton and Gauss-Newton methods.

The gradient or steepest-descent method is fairly inefficient close to the minimum. For the special case $V_N(\theta, Z^N) = \frac{1}{N}\sum_{t=1}^{N}\tfrac{1}{2}\varepsilon^2(t,\theta)$, the gradient and the Hessian of V are

$$V_N'(\theta, Z^N) = -\frac{1}{N}\sum_{t=1}^{N}\psi(t,\theta)\varepsilon(t,\theta), \qquad V_N''(\theta, Z^N) = \frac{1}{N}\sum_{t=1}^{N}\psi(t,\theta)\psi^T(t,\theta) - \frac{1}{N}\sum_{t=1}^{N}\psi'(t,\theta)\varepsilon(t,\theta).$$

Let $R_N^{(i)} = V_N''(\hat\theta^{(i)}, Z^N)$; then we have

$$\hat\theta^{(i+1)} = \hat\theta^{(i)} - \mu_N^{(i)}\,[V_N''(\hat\theta^{(i)}, Z^N)]^{-1}\,V_N'(\hat\theta^{(i)}, Z^N).$$

This is the Newton method. But it is not an easy task to compute the Hessian, because of the term $\psi'(t,\theta)$.

Suppose that there is a value $\theta_0$ such that the prediction errors $\varepsilon(t,\theta_0) = e_0(t)$ are independent. Then the second sum in the Hessian is close to zero near $\theta_0$, so

$$V_N''(\theta, Z^N) \approx \frac{1}{N}\sum_{t=1}^{N}\psi(t,\theta)\psi^T(t,\theta) = H_N(\theta).$$

So choosing $R_N^{(i)} = H_N(\hat\theta^{(i)})$ gives a good estimate of the Hessian in the vicinity of the minimum. This is known as the Gauss-Newton method. In the statistical literature it is called the “method of scoring”. In the control literature the terms “modified Newton-Raphson” and “quasilinearization” have also been used.

Dennis and Schnabel reserve the term “Gauss-Newton” for the step size $\mu_N^{(i)} = 1$, and for an adjusted step size the term “damped Gauss-Newton” has been used.

Regularization.

Even though $R_N$ is assured to be positive semidefinite, it may be singular or close to singular (for example, if the model is over-parameterized or the data are not informative enough). Various ways to overcome this problem exist and are known as “regularization techniques”. Goldfeld, Quandt and Trotter, and Levenberg and Marquardt, suggest regularized choices of $R_N^{(i)}$ that depend on a scalar $\lambda \ge 0$, for example

$$R_N^{(i)}(\lambda) = \frac{1}{N}\sum_{t=1}^{N}\psi(t,\hat\theta^{(i)})\psi^T(t,\hat\theta^{(i)}) + \lambda I.$$

With λ = 0 we have the Gauss-Newton case; increasing λ means that the step size is decreased and the search direction is turned towards the gradient.
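The family of search routines can be compared on a small nonlinear least-squares problem. This is a sketch under stated assumptions: the exponential model, the noise-free data, the step-halving rule and all function names are illustrative choices, and a fixed λ is used rather than an adaptive one.

```python
import numpy as np

def predictor(theta, t):
    a, b = theta
    return a * np.exp(-b * t)            # assumed model: y_hat(t|theta) = a exp(-b t)

def psi(theta, t):
    a, b = theta
    e = np.exp(-b * t)
    return np.column_stack([e, -a * t * e])   # gradient of the predictor, N x d

def criterion(theta, t, y):
    eps = y - predictor(theta, t)
    return 0.5 * np.mean(eps**2)

def search(theta0, t, y, lam=0.0, use_identity=False, n_iter=50):
    """theta <- theta - mu R^{-1} V', with R = I or R = H_N + lam I."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(n_iter):
        eps = y - predictor(theta, t)
        Psi = psi(theta, t)
        grad = -Psi.T @ eps / len(t)
        R = np.eye(2) if use_identity else Psi.T @ Psi / len(t) + lam * np.eye(2)
        step = np.linalg.solve(R, grad)
        mu, V = 1.0, criterion(theta, t, y)
        # Damping: halve the step size until the criterion decreases.
        while mu > 1e-10 and not (criterion(theta - mu * step, t, y) < V):
            mu *= 0.5
        if mu <= 1e-10:
            break
        theta = theta - mu * step
    return theta

t = np.linspace(0.0, 5.0, 50)
y = 2.0 * np.exp(-0.5 * t)               # noise-free data from a = 2, b = 0.5
theta0 = [1.0, 1.0]
theta_gn = search(theta0, t, y)                      # damped Gauss-Newton
theta_lm = search(theta0, t, y, lam=1e-3)            # regularized with fixed lambda
theta_sd = search(theta0, t, y, use_identity=True)   # steepest descent
print(theta_gn, theta_lm, theta_sd)
```

Only the choice of R differs between the three calls, which mirrors how the slides obtain steepest descent, Gauss-Newton and the regularized variant from one update formula.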

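Solving the correlation equation.

Remember that we want to
- minimize $V_N(\theta, Z^N)$ (I), or
- solve the correlation equation $f_N(\theta, Z^N) = 0$ (II).

The Newton method to solve (I) leads to $\hat\theta^{(i)} = \hat\theta^{(i-1)} - \mu\,[V_N''(\hat\theta^{(i-1)}, Z^N)]^{-1}V_N'(\hat\theta^{(i-1)}, Z^N)$. Solving equation (II) is quite analogous to the minimization of (I):

Substitution method to solve (II): $\hat\theta^{(i)} = \hat\theta^{(i-1)} - \mu\,f_N(\hat\theta^{(i-1)}, Z^N)$.

Newton-Raphson method to solve (II): $\hat\theta^{(i)} = \hat\theta^{(i-1)} - \mu\,[f_N'(\hat\theta^{(i-1)}, Z^N)]^{-1}f_N(\hat\theta^{(i-1)}, Z^N)$.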

Computing Gradients

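The amount of work required to compute $\psi(t,\theta)$ is highly dependent on the model structure, and sometimes one may have to resort to numerical differentiation.

Example 10.1. Consider the ARMAX model $A(q)y(t) = B(q)u(t) + C(q)e(t)$. The predictor is

$$C(q)\hat y(t|\theta) = B(q)u(t) + [C(q) - A(q)]y(t).$$

Differentiation with respect to $a_k$ gives

$$C(q)\frac{\partial}{\partial a_k}\hat y(t|\theta) = -y(t-k),$$

and similarly

$$C(q)\frac{\partial}{\partial b_k}\hat y(t|\theta) = u(t-k), \qquad C(q)\frac{\partial}{\partial c_k}\hat y(t|\theta) = y(t-k) - \hat y(t-k|\theta) = \varepsilon(t-k,\theta).$$

Now, with $\varphi(t,\theta) = [-y(t-1)\ \dots\ -y(t-n_a)\ \ u(t-1)\ \dots\ u(t-n_b)\ \ \varepsilon(t-1,\theta)\ \dots\ \varepsilon(t-n_c,\theta)]^T$, these relations can be collected as

$$C(q)\psi(t,\theta) = \varphi(t,\theta),$$

so the gradient is obtained by filtering the regression vector through $1/C(q)$.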
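For a first-order ARMAX model the gradient filter reduces to a one-line recursion. The sketch below assumes orders n_a = n_b = n_c = 1, zero initial conditions, random test data and the hypothetical name `armax_predictor_and_gradient`; it is meant to show the filtering idea, not a general implementation.

```python
import numpy as np

def armax_predictor_and_gradient(theta, y, u):
    """First-order ARMAX: y(t) + a y(t-1) = b u(t-1) + e(t) + c e(t-1).

    Returns the one-step predictor and psi(t) = d y_hat(t|theta) / d theta,
    computed from C(q) psi(t) = phi(t), i.e. psi(t) = phi(t) - c psi(t-1).
    Zero initial conditions are assumed.
    """
    a, b, c = theta
    N = len(y)
    yhat = np.zeros(N)
    eps = np.zeros(N)
    psi = np.zeros((N, 3))
    for t in range(1, N):
        phi = np.array([-y[t - 1], u[t - 1], eps[t - 1]])
        yhat[t] = phi @ np.array([a, b, c])
        eps[t] = y[t] - yhat[t]
        psi[t] = phi - c * psi[t - 1]     # filter phi through 1/C(q)
    return yhat, psi

rng = np.random.default_rng(0)
u = rng.standard_normal(50)               # assumed test signals
y = rng.standard_normal(50)
theta = np.array([-0.7, 1.0, 0.3])
yhat, psi = armax_predictor_and_gradient(theta, y, u)
print(psi[:3])
```

A finite-difference check of the predictor against `psi` is a cheap way to validate such a recursion before using it inside a Gauss-Newton search.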

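SISO black-box model.

The general model structure is

$$A(q)y(t) = \frac{B(q)}{F(q)}u(t) + \frac{C(q)}{D(q)}e(t),$$

and its predictor is

$$\hat y(t|\theta) = \frac{D(q)B(q)}{C(q)F(q)}u(t) + \Big[1 - \frac{D(q)A(q)}{C(q)}\Big]y(t).$$

With $w(t,\theta) = \frac{B(q)}{F(q)}u(t)$ and $v(t,\theta) = A(q)y(t) - w(t,\theta)$, so that $\varepsilon(t,\theta) = \frac{D(q)}{C(q)}v(t,\theta)$, we have

$$\frac{\partial \hat y}{\partial a_k} = -\frac{D(q)}{C(q)}y(t-k), \quad \frac{\partial \hat y}{\partial b_k} = \frac{D(q)}{C(q)F(q)}u(t-k), \quad \frac{\partial \hat y}{\partial c_k} = \frac{1}{C(q)}\varepsilon(t-k,\theta),$$

$$\frac{\partial \hat y}{\partial d_k} = -\frac{1}{C(q)}v(t-k,\theta), \quad \frac{\partial \hat y}{\partial f_k} = -\frac{D(q)}{C(q)F(q)}w(t-k,\theta).$$

As a special case consider the OE model $y(t) = \frac{B(q)}{F(q)}u(t) + e(t)$. Now $\hat y(t|\theta) = w(t,\theta) = \frac{B(q)}{F(q)}u(t)$ and

$$\frac{\partial \hat y}{\partial b_k} = \frac{1}{F(q)}u(t-k), \qquad \frac{\partial \hat y}{\partial f_k} = -\frac{1}{F(q)}w(t-k,\theta).$$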

Two-Stage and Multistage Method

Comparison of the approaches:
- Linear regression and least squares: efficient methods with an analytic solution.
- Numerical solution by iterative search: guaranteed convergence to a local minimum; applicable to general model structures.
- Combined: two or several LS (IV) stages applied to different substructures.

Why are we interested in this topic?
- It helps in understanding the identification literature.
- It is useful for providing initial estimates to use in iterative methods.

Some important two-stage or multistage methods:
1- Bootstrap methods.
2- Bilinear parameterization.
3- Separate least squares.
4- High-order AR(X) models.
5- Separating dynamics and noise models.
6- Determining ARMA models.
7- Subspace methods for estimating state-space models.

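Bootstrap Methods

Consider the correlation formulation

$$\frac{1}{N}\sum_{t=1}^{N}\zeta(t,\theta)\,\varepsilon(t,\theta) = 0, \qquad \varepsilon(t,\theta) = y(t) - \varphi^T(t,\theta)\theta.$$

This formulation contains a number of common situations:
- IV (instrumental variable) methods, with $\zeta(t,\theta)$ the instruments.
- PLR (pseudolinear regression) methods, with $\zeta(t,\theta) = \varphi(t,\theta)$.
- Minimizing the quadratic criterion, with $\zeta(t,\theta) = \psi(t,\theta)$.

With an estimate $\hat\theta^{(i-1)}$ at hand it is natural to determine the next iterate by solving

$$\frac{1}{N}\sum_{t=1}^{N}\zeta(t,\hat\theta^{(i-1)})\,[y(t) - \varphi^T(t,\hat\theta^{(i-1)})\theta] = 0.$$

It is linear in θ, so

$$\hat\theta^{(i)} = \Big[\frac{1}{N}\sum_{t=1}^{N}\zeta(t,\hat\theta^{(i-1)})\varphi^T(t,\hat\theta^{(i-1)})\Big]^{-1}\frac{1}{N}\sum_{t=1}^{N}\zeta(t,\hat\theta^{(i-1)})\,y(t).$$

It is called a bootstrap method since it alternates between computing θ and forming new vectors φ and ζ. It does not necessarily converge to a solution. A convergence analysis is given by Stoica and Söderström (1981b) and Stoica et al. (1985).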

Bilinear Parameterization.

For some models the predictor is bilinear in the parameters. For example, consider the ARARX model

$$A(q)y(t) = B(q)u(t) + \frac{1}{D(q)}e(t).$$

Now the predictor is

$$\hat y(t|\theta) = B(q)D(q)u(t) + [1 - A(q)D(q)]y(t).$$

Let $\rho = [a_1\ \dots\ a_{n_a}\ \ b_1\ \dots\ b_{n_b}]$ and $\eta = [d_1\ \dots\ d_{n_d}]$. Bilinear means that $\hat y(t|\rho,\eta)$ is linear in ρ for fixed η and linear in η for fixed ρ.

With this situation, a natural way of minimizing $V_N(\rho,\eta,Z^N) = \frac{1}{N}\sum_{t=1}^{N}\varepsilon^2(t,\rho,\eta)$ would be to treat it as a sequence of LS problems. Let

$$\hat\rho^{(i)} = \arg\min_\rho V_N(\rho, \hat\eta^{(i-1)}, Z^N), \qquad \hat\eta^{(i)} = \arg\min_\eta V_N(\hat\rho^{(i)}, \eta, Z^N).$$

Exercise 2 (Exercise 10T.3): Show that this minimization scheme is a special case of the general family of search routines.

According to Exercise 10T.3, bilinear parameterization is thus indeed a descent method. It converges to a local minimum.

Separate Least Squares.

A more general situation than the bilinear case is when one set of parameters enters linearly and another set nonlinearly in the predictor:

$$\hat y(t|\theta,\eta) = \theta^T\varphi(t,\eta).$$

The identification criterion then becomes

$$V_N(\theta,\eta,Z^N) = \sum_{t=1}^{N}|y(t) - \theta^T\varphi(t,\eta)|^2 = |Y - \Phi(\eta)\theta|^2.$$

For given η this criterion is an LS criterion and is minimized with respect to θ by

$$\hat\theta(\eta) = [\Phi^T(\eta)\Phi(\eta)]^{-1}\Phi^T(\eta)Y.$$

We can thus insert it into $V_N$ and define the problem in two steps:
1- $\hat\eta = \arg\min_\eta |Y - P(\eta)Y|^2$, where $P(\eta) = \Phi(\eta)[\Phi^T(\eta)\Phi(\eta)]^{-1}\Phi^T(\eta)$.
2- $\hat\theta = \hat\theta(\hat\eta)$.

The method is called separate least squares since the LS part has been separated out, and the problem is reduced to a minimization problem of lower dimension. Separate least squares is known to give numerically well-conditioned calculations, but it does not necessarily give faster convergence than applying a damped Gauss-Newton method to $V_N(\theta,\eta,Z^N)$ without utilizing the particular structure.
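The separation can be illustrated with one nonlinear and two linear parameters. Everything specific here is an assumption for the sketch: the model θ1 exp(-ηt) + θ2, the noise-free data, and the use of a plain grid search over η in place of a proper one-dimensional minimizer.

```python
import numpy as np

def regressors(eta, t):
    # Assumed predictor: y_hat = theta_1 exp(-eta t) + theta_2, linear in theta for fixed eta.
    return np.column_stack([np.exp(-eta * t), np.ones_like(t)])

def projected_criterion(eta, t, Y):
    """|Y - P(eta) Y|^2 together with the inner LS estimate theta_hat(eta)."""
    Phi = regressors(eta, t)
    theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)
    resid = Y - Phi @ theta
    return resid @ resid, theta

t = np.linspace(0.0, 4.0, 40)
Y = 3.0 * np.exp(-1.5 * t) + 0.5          # noise-free data: theta = (3, 0.5), eta = 1.5

# Outer problem of lower dimension: search over eta only.
etas = np.linspace(0.1, 3.0, 30)
values = [projected_criterion(eta, t, Y)[0] for eta in etas]
eta_hat = etas[int(np.argmin(values))]
_, theta_hat = projected_criterion(eta_hat, t, Y)
print(eta_hat, theta_hat)
```

The outer search is one-dimensional here even though the full parameter vector has three entries, which is the dimension reduction the method is named for.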

High-Order AR(X) Models.

Suppose the true system is

$$y(t) = G_0(q)u(t) + H_0(q)e(t).$$

An ARX structure of order M is used:

$$A_M(q)y(t) = B_M(q)u(t) + e(t).$$

Hannan and Kavalieris, and Ljung and Wahlberg, show that as N and M tend to infinity, with M growing sufficiently slowly compared with N,

$$\frac{\hat B_M(q)}{\hat A_M(q)} \to G_0(q), \qquad \frac{1}{\hat A_M(q)} \to H_0(q).$$

So a high-order ARX model is capable of approximating any linear system arbitrarily well. It is of course desirable to reduce this high-order model to more tractable versions.

Separating Dynamics and Noise Models.

The general model structure is

$$y(t) = G(q,\theta)u(t) + H(q,\theta)e(t).$$

Use the IV method to determine the dynamic part $\hat G(q)$ from u to y. We can then determine

$$\hat v(t) = y(t) - \hat G(q)u(t).$$

This noise is then a “measured” signal, so an ARMA model for it can be estimated as a separate step. How?

Determining ARMA Models.

The parameters of an ARMA model can be estimated using PEM. There are two alternatives that avoid the search procedure:
1- Apply a high-order AR model to the signal in the ARMA model to form estimates of the innovations. Then form an ARX model in the signal and the estimated innovations, and estimate D and C with the LS method.
2- Estimate the AR parameters D(q) using the IV method as explained in Problem 7E.1. Then solve the resulting MA model for C(q).

Exercise 3: Exercise 7E.1

Subspace Methods for Estimating State Space Models.

The subspace methods can also be regarded as a two-stage method, being built up from two LS steps.

Local Solutions and Initial Values

Local Minima

The iterative methods typically have the property that, with a suitably chosen step length μ, they will converge to a solution of $V_N'(\theta, Z^N) = 0$, i.e. to a stationary point, and for positive definite R this is a local minimum of $V_N(\theta, Z^N)$. This equation may have several solutions, whereas it is the global minimum that interests us.

Remark 1: To find the global solution, start at different feasible initial values.
Remark 2: Use some preliminary estimation procedure to produce a good initial value.
Remark 3: Local minima do not necessarily create problems in practice, if a model passes the validation tests.
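Remark 1 can be illustrated on a scalar criterion with two local minima. The quartic below, the fixed step size and the names `V`, `dV` and `descend` are assumptions for this sketch; they stand in for an identification criterion and its gradient.

```python
def V(theta):
    return theta**4 - 3.0 * theta**2 + theta    # assumed toy criterion with two local minima

def dV(theta):
    return 4.0 * theta**3 - 6.0 * theta + 1.0

def descend(theta0, mu=0.01, n_iter=5000):
    """Plain gradient descent with a fixed step; it ends in the nearest basin."""
    theta = theta0
    for _ in range(n_iter):
        theta -= mu * dV(theta)
    return theta

starts = [-2.0, 0.0, 2.0]                        # different feasible initial values
solutions = [descend(s) for s in starts]
best = min(solutions, key=V)                     # keep the solution with the lowest criterion
print(solutions, best)
```

The start at 2.0 ends in the shallower minimum near 1.13, while the other two starts reach the deeper one near -1.30, so comparing criterion values across starts is what identifies the better solution.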


Results from SISO Black-box Models

The general model structure is

$$A(q)y(t) = \frac{B(q)}{F(q)}u(t) + \frac{C(q)}{D(q)}e(t).$$

Consider the assumption that the system can be described within the model set: $S \in M$. The results are listed below for the general SISO model set and refer to the limiting criterion $\bar V(\theta) = \bar E\,\tfrac{1}{2}\varepsilon^2(t,\theta)$.

- For ARMA models (B = 0, D = F = 1), all stationary points of $\bar V(\theta)$ are global minima.
- For ARARX models (C = F = 1), there are no false local minima if the signal-to-noise ratio is large enough. Otherwise false local minima do exist.
- If A = 1, there are no false local minima if $n_f = 1$.
- If A = C = D = 1, there are no false local minima if the input is white noise. For other inputs, however, false local minima can exist.
- For ARMAX models (F = D = 1), it is not known whether false local minima exist. For the pseudolinear regression approach it can, however, be shown that …

The practical experience with different model structures is that the global minimum is usually found without too much trouble for ARMAX models. For OE model structures, on the other hand, convergence to false local minima is not uncommon.

Initial parameter values

Due to the possible occurrence of undesired local minima in the criterion function, it is worthwhile to put some effort into producing good initial values. Also, since Newton-type methods have a good local convergence rate, good initial values pay off in fewer iterations.
1- For a physically parameterized model structure: use your physical insight.
2- For a linear black-box model structure:

Initial filter conditions

In some configurations we need the initial values $\varphi(0,\theta)$.
1- Start the summation at t = n+1 rather than t = 1.
2- Take the initial condition into account explicitly.

Subspace Methods for Estimating State Space Models

Let us now consider how to estimate the system matrices A, B, C and D in the state-space model

$$x(t+1) = Ax(t) + Bu(t) + w(t), \qquad y(t) = Cx(t) + Du(t) + v(t).$$

Let the output y(t) be a p-dimensional column vector and the input u(t) an m-dimensional column vector. The order of the system is n. We also assume that this state-space representation is a minimal realization.

We know that many different representations can describe the same system. They are

$$\tilde A = TAT^{-1}, \quad \tilde B = TB, \quad \tilde C = CT^{-1}, \quad \tilde D = D,$$

where T is any invertible matrix. We also have $\tilde x(t) = Tx(t)$.
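That a change of basis leaves the input-output behaviour unchanged can be checked numerically through the impulse-response (Markov) parameters D, CB, CAB, and so on. The random system, the particular matrix T and the helper name `markov_parameters` below are assumptions for this sketch.

```python
import numpy as np

def markov_parameters(A, B, C, D, k):
    """Impulse-response coefficients D, CB, CAB, ..., C A^(k-1) B."""
    return [D] + [C @ np.linalg.matrix_power(A, i) @ B for i in range(k)]

rng = np.random.default_rng(1)
n, m, p = 3, 2, 2
A = 0.5 * rng.standard_normal((n, n))     # assumed example system
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
D = rng.standard_normal((p, m))

# An invertible change of basis (its determinant is 7).
T = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])
Tinv = np.linalg.inv(T)
A_t, B_t, C_t, D_t = T @ A @ Tinv, T @ B, C @ Tinv, D

original = markov_parameters(A, B, C, D, 6)
transformed = markov_parameters(A_t, B_t, C_t, D_t, 6)
same = all(np.allclose(g, h) for g, h in zip(original, transformed))
print(same)
```

This is why the subspace estimates of A, B and C can only be determined up to such a matrix T.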

Subspace procedure. With the state-space model as above, the following steps are discussed:
► Estimating B and D
► Finding A and C from the observability matrix
► Estimating the extended observability matrix
► Finding the states and estimating the noise statistics

► Estimating B and D

For given and fixed $\hat A$ and $\hat C$, the model structure is

$$\hat y(t|B,D) = \hat C(qI - \hat A)^{-1}Bu(t) + Du(t).$$

It is clearly linear in B and D. If the system operates in open loop, we can thus consistently estimate B and D according to Theorem 8.4, even if the noise sequence is non-white.

(Recalled from Lecture 8, Consistency and Identifiability, Ali Karimpour, Nov 2009.)

► Estimating B and D (continued)

Let us write the predictor in the standard linear regression form $\hat y(t|\theta) = \varphi^T(t)\theta$, where θ collects the entries of B and D. Clearly, B and D are then obtained by a simple LS method.

► Estimating $x_0$

If desired, the initial state $x_0 = x(0)$ can also be estimated in an analogous way, since the predictor with initial values taken into account contains an additional term driven by $x_0\delta(t)$ and is linear also in $x_0$. Here $\delta(t)$ is the unit pulse at time 0.

► Finding A and C from the observability matrix

Suppose that a pr × n* matrix G is given that is related to the extended observability matrix

$$O_r = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{r-1} \end{bmatrix}.$$

There are two situations: known system order and unknown system order.

Known system order. Suppose first we know that $G = O_r$, so that n* = n. To find C is then immediate: it is the first block row,

$$\hat C = O_r(1{:}p,\ 1{:}n).$$

Similarly, we can find $\hat A$ from the equation

$$O_r(p+1{:}pr,\ 1{:}n) = O_r(1{:}p(r-1),\ 1{:}n)\,\hat A.$$

Under the observability assumption, $O_{r-1}$ has rank n, so $\hat A$ can be determined uniquely.
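The shift structure can be exercised on a small observable pair. This is a sketch with assumed numbers: the matrices A and C, the horizon r = 4 and the helper names are illustrative only.

```python
import numpy as np

def extended_observability(A, C, r):
    """Stack C, CA, ..., C A^(r-1)."""
    return np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(r)])

def a_c_from_observability(O, p):
    """C is the first block row; A solves O[:-p] A = O[p:] (the shift structure)."""
    C_hat = O[:p, :]
    A_hat, *_ = np.linalg.lstsq(O[:-p, :], O[p:, :], rcond=None)
    return A_hat, C_hat

A = np.array([[0.9, 0.2],
              [0.0, 0.5]])                # assumed example system, p = 1 output
C = np.array([[1.0, 0.0]])
O = extended_observability(A, C, r=4)
A_hat, C_hat = a_c_from_observability(O, p=1)
print(A_hat)
print(C_hat)
```

A least-squares solve is used instead of an explicit inverse so that the same function still applies when the shift equation holds only approximately.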

► Finding A and C from the observability matrix: role of the state-space basis

The extended observability matrix depends on the choice of basis in the state-space representation. For the representation with $\tilde x(t) = Tx(t)$, it is easy to verify that the extended observability matrix would be

$$\tilde O_r = O_r T^{-1}.$$

So multiplying the extended observability matrix from the right by an invertible matrix just changes the basis of the state-space representation.

Unknown system order.

Suppose now that the true order of the system is unknown, and that n*, the number of columns of G, is just an upper bound for the order. Then

$$G = O_r\tilde T$$

for some unknown n × n* matrix $\tilde T$ of rank n, where n is also unknown to us. The rank of G is n. A straightforward way to proceed is to reduce the number of columns of G to n.


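Perform an SVD of G and keep the n nonzero singular values:

$$G = USV^T = U_1S_1V_1^T, \qquad \text{so} \qquad O_r\tilde T = U_1S_1V_1^T.$$

Now multiplying this by $V_1$ from the right gives $O_r\tilde TV_1 = U_1S_1$. Now multiplying this by $S_1^{-1}$ from the right gives $O_r\tilde TV_1S_1^{-1} = U_1$. Or, for some invertible matrix R:

$$\hat O_r = U_1R.$$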

Using a Noisy Estimate of the Extended Observability Matrix

Let us now assume that the given matrix G is a noisy estimate of the true observability matrix:

$$G = O_r\tilde T + E_N,$$

where $E_N$ is small and tends to zero as $N \to \infty$. The rank of $O_r$ is not known, while the noise matrix $E_N$ is likely to be of full rank. It is reasonable to proceed as above and perform an SVD on G:

$$G = USV^T.$$

Due to the noise, S will typically have all singular values non-zero. The first n will be supported by $O_r$, while the remaining ones will stem from $E_N$. If the noise is small, one should expect the latter to be significantly smaller than the former. Therefore determine n as the number of singular values that are significantly larger than 0.

Then use $\hat O_r = U_1R$ to determine $\hat C$ and $\hat A$, as before. However, in the noisy case $\hat O_r$ will not be exactly subject to the shift structure, so this system of equations should be solved in a least-squares sense.
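The noisy case can be simulated directly. In this sketch the example system, the matrix playing the role of the unknown n × n* factor, the noise level 1e-6, the relative threshold 1e-3 and the choice R = I are all assumptions; in practice the threshold for "significantly larger than 0" has to be judged from the singular values themselves.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.9, 0.2],
              [0.0, 0.5]])                # assumed true system, order n = 2, p = 1
C = np.array([[1.0, 0.0]])
p, r = 1, 5
O = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(r)])

T_tilde = np.array([[1.0, 0.0, 1.0, 0.0],
                    [0.0, 1.0, 0.0, 1.0]])  # rank-2 factor, so n* = 4 columns
G = O @ T_tilde + 1e-6 * rng.standard_normal((r, 4))   # noisy estimate

U, s, Vt = np.linalg.svd(G)
n_hat = int(np.sum(s > 1e-3 * s[0]))      # singular values significantly larger than 0
O_hat = U[:, :n_hat]                      # O_hat = U1 R with R = I

C_hat = O_hat[:p, :]
# The shift structure holds only approximately, so solve in the least-squares sense.
A_hat, *_ = np.linalg.lstsq(O_hat[:-p, :], O_hat[p:, :], rcond=None)
poles = np.sort(np.linalg.eigvals(A_hat).real)
print(s, n_hat, poles)
```

The estimated A is expressed in a different basis from the true one, so the comparison is made on its eigenvalues, which do not depend on the basis.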

Using Weighting Matrices in the SVD

For more flexibility we could pre- and post-multiply G as

$$\hat G = W_1GW_2$$

before performing the SVD

$$\hat G = USV^T \approx U_1S_1V_1^T,$$

and then use the equation

$$\hat O_r = W_1^{-1}U_1R$$

to determine $\hat C$ and $\hat A$. Here R is an arbitrary matrix that will determine the coordinate basis for the state representation. The post-multiplication by $W_2$ just corresponds to a change of basis in the state space, and the pre-multiplication by $W_1$ is eliminated.

In the noiseless case, E = 0, these weightings are without consequence. However, when noise is present, they have an important influence on the space spanned by $U_1$, and hence on the quality of the estimates $\hat C$ and $\hat A$.

Remark. Post-multiplying $W_2$ by an orthogonal matrix does not affect the $U_1$ matrix in the decomposition.

Exercise 4: Prove the remark above (10E.10).

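► Estimating the Extended Observability Matrix. Remember the state-space model: $x(t+1) = A x(t) + B u(t) + w(t)$, $y(t) = C x(t) + D u(t) + v(t)$. Now, iterating the state equation $k$ steps: $y(t+k) = C A^k x(t) + C A^{k-1} B u(t) + \dots + C B u(t+k-1) + D u(t+k) + C A^{k-1} w(t) + \dots + C w(t+k-1) + v(t+k)$.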

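► Estimating the Extended Observability Matrix. Now, form the vectors $Y_r(t) = [y(t)^T, y(t+1)^T, \dots, y(t+r-1)^T]^T$ and $U_r(t) = [u(t)^T, u(t+1)^T, \dots, u(t+r-1)^T]^T$, so that $Y_r(t) = O_r x(t) + S_r U_r(t) + V(t)$, where $S_r$ is the lower block-triangular matrix with $D$ on the block diagonal and $C A^{i-j-1} B$ in block position $(i, j)$ for $i > j$.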

► Estimating the Extended Observability Matrix. The $k$th block component of $V(t)$ is $V_k(t) = C A^{k-2} w(t) + C A^{k-3} w(t+1) + \dots + C w(t+k-2) + v(t+k-1)$.

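► Estimating the Extended Observability Matrix. Collecting the columns for $t = 1, \dots, N$ into $\mathbf{Y} = [Y_r(1) \dots Y_r(N)]$, and similarly into $\mathbf{X}$, $\mathbf{U}$ and $\mathbf{V}$, gives $\mathbf{Y} = O_r \mathbf{X} + S_r \mathbf{U} + \mathbf{V}$. To recover $O_r$ from $\mathbf{Y}$, we must eliminate the $\mathbf{U}$ term and make the noise influence disappear asymptotically.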

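► Estimating the Extended Observability Matrix. Removing the U-term. Form the matrix $\Pi_{\mathbf{U}^T}^{\perp} = I - \mathbf{U}^T (\mathbf{U}\mathbf{U}^T)^{-1} \mathbf{U}$, which projects onto the orthogonal complement of the row space of $\mathbf{U}$, so that $\mathbf{U}\,\Pi_{\mathbf{U}^T}^{\perp} = 0$. Multiplying $\mathbf{Y} = O_r \mathbf{X} + S_r \mathbf{U} + \mathbf{V}$ from the right by $\Pi_{\mathbf{U}^T}^{\perp}$ leads to $\mathbf{Y}\,\Pi_{\mathbf{U}^T}^{\perp} = O_r \mathbf{X}\,\Pi_{\mathbf{U}^T}^{\perp} + \mathbf{V}\,\Pi_{\mathbf{U}^T}^{\perp}$. The last term is made up of noise contributions, so the idea is to correlate it away with a suitable matrix.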
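As a small numerical illustration of removing the U-term, the sketch below assumes the usual orthogonal projection $\Pi = I - \mathbf{U}^T(\mathbf{U}\mathbf{U}^T)^{-1}\mathbf{U}$ and a random stand-in for the stacked input matrix. The helper name `proj_perp` is hypothetical.

```python
import numpy as np

def proj_perp(U):
    """Return Pi = I - U^T (U U^T)^{-1} U for a full-row-rank matrix U."""
    N = U.shape[1]
    # solve() avoids forming the explicit inverse of U U^T
    return np.eye(N) - U.T @ np.linalg.solve(U @ U.T, U)

rng = np.random.default_rng(2)
U = rng.standard_normal((3, 50))     # stand-in for the stacked inputs
Pi = proj_perp(U)

print(np.abs(U @ Pi).max())          # numerically zero: the U-term is removed
```

For long data records the $N \times N$ matrix $\Pi$ is usually not formed explicitly; products such as $\mathbf{Y}\Pi$ can be computed as $\mathbf{Y} - \mathbf{Y}\mathbf{U}^T(\mathbf{U}\mathbf{U}^T)^{-1}\mathbf{U}$.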

► Estimating the Extended Observability Matrix. Removing the Noise Term. Since the last term is made up of noise contributions, the idea is to correlate it away with a suitable matrix. Define the matrix $\Phi = [\varphi_s(1) \dots \varphi_s(N)]$. Here $\varphi_s(t)$ acts as an instrument, and we must define it such that $\lim_{N\to\infty} \frac{1}{N} \mathbf{V}\,\Pi_{\mathbf{U}^T}^{\perp} \Phi^T = 0$ and $\lim_{N\to\infty} \frac{1}{N} \mathbf{X}\,\Pi_{\mathbf{U}^T}^{\perp} \Phi^T = \tilde{T}$ has full rank $n$.

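► Estimating the Extended Observability Matrix. With an instrument $\varphi_s(t)$ satisfying these two conditions, $G = \frac{1}{N} \mathbf{Y}\,\Pi_{\mathbf{U}^T}^{\perp} \Phi^T = O_r \frac{1}{N} \mathbf{X}\,\Pi_{\mathbf{U}^T}^{\perp} \Phi^T + \frac{1}{N} \mathbf{V}\,\Pi_{\mathbf{U}^T}^{\perp} \Phi^T = O_r \tilde{T} + E_N$, where $E_N \to 0$ as $N \to \infty$. The matrix $G$ can thus be seen as a noisy estimate of the extended observability matrix. It remains to define $\varphi_s(t)$.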

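Finding Good Instruments. The only remaining question is how to achieve $\lim_{N\to\infty} \frac{1}{N} \mathbf{V}\,\Pi_{\mathbf{U}^T}^{\perp} \Phi^T = 0$ and $\lim_{N\to\infty} \frac{1}{N} \mathbf{X}\,\Pi_{\mathbf{U}^T}^{\perp} \Phi^T = \tilde{T}$ with full rank $n$. Recall the instrumental-variable method, and remember that the law of large numbers states that sample sums converge to their respective expected values, so $\frac{1}{N} \mathbf{V}\,\Pi_{\mathbf{U}^T}^{\perp} \Phi^T = \frac{1}{N}\sum_t V(t)\varphi_s^T(t) - \left[\frac{1}{N}\sum_t V(t) U_r^T(t)\right] \left[\frac{1}{N}\sum_t U_r(t) U_r^T(t)\right]^{-1} \left[\frac{1}{N}\sum_t U_r(t)\varphi_s^T(t)\right] \to E\,V(t)\varphi_s^T(t) - E\,V(t)U_r^T(t)\, R_u^{-1}\, E\,U_r(t)\varphi_s^T(t)$, where $R_u = E\,U_r(t)U_r^T(t)$.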

Finding Good Instruments. Assume the input $u$ is generated in open loop, so that it is independent of the noise $V$: $E\,V(t) U_r^T(t) = 0$. Moreover, since $V(t)$ is made up of white-noise terms from time $t$ and onwards, $E\,V(t)\varphi_s^T(t) = 0$ for any $\varphi_s(t)$ built from data prior to time $t$, for example $\varphi_s(t) = [y(t-1)^T, \dots, y(t-s_1)^T, u(t-1)^T, \dots, u(t-s_2)^T]^T$.
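To make the instrument concrete, the following sketch stacks future outputs and inputs together with an instrument built from past data only, one column per time instant. It assumes $s_1 = s_2 = s$ and zero-based time indices. The function name `build_data_matrices` and the toy data are hypothetical.

```python
import numpy as np

def build_data_matrices(y, u, r, s):
    """Stack Y_r(t), U_r(t) and phi_s(t) as columns for t = s, ..., T - r.

    y has shape (T, p) and u has shape (T, m); s1 = s2 = s is assumed.
    """
    T = y.shape[0]
    times = range(s, T - r + 1)
    # Future outputs and inputs: y(t), ..., y(t+r-1) and u(t), ..., u(t+r-1)
    Y = np.column_stack([y[t:t + r].ravel() for t in times])
    U = np.column_stack([u[t:t + r].ravel() for t in times])
    # Instrument from past data: y(t-1), ..., y(t-s), u(t-1), ..., u(t-s)
    Phi = np.column_stack([
        np.concatenate([y[t - s:t][::-1].ravel(), u[t - s:t][::-1].ravel()])
        for t in times
    ])
    return Y, U, Phi

# Toy data chosen so that the stacking is easy to read off
y = np.arange(10.0).reshape(-1, 1)
u = 100.0 + y
Y, U, Phi = build_data_matrices(y, u, r=3, s=2)
print(Y[:, 0], U[:, 0], Phi[:, 0])
```

Because every entry of a column of `Phi` is older than every entry of the matching column of `Y`, the instrument is uncorrelated with the white-noise terms entering $V(t)$ when the input is generated in open loop.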

Finding Good Instruments. Similarly we have $\lim_{N\to\infty} \frac{1}{N} \mathbf{X}\,\Pi_{\mathbf{U}^T}^{\perp} \Phi^T = E\,x(t)\varphi_s^T(t) - E\,x(t)U_r^T(t)\, R_u^{-1}\, E\,U_r(t)\varphi_s^T(t) = \tilde{T}$. A formal proof that $\tilde{T}$ has full rank is not immediate and will involve properties of the input. See Problem 10G.6 for a suitable input.

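Finding the States and Estimating the Noise Statistics (drawing on part of Chapter 7). Let a system be given by the impulse response representation (I). Let the formal $k$-step-ahead predictors $\hat{y}(t|t-k)$ be defined by deleting from (I) the contributions of the inputs and noise terms later than time $t-k$. Define $\hat{Y}_r(t) = [\hat{y}(t|t-1)^T, \dots, \hat{y}(t+r-1|t-1)^T]^T$ and $\hat{\mathbf{Y}} = [\hat{Y}_r(1) \dots \hat{Y}_r(N)]$. Then the following is true as $N \to \infty$ (see Chapter 4, Appendix A): 1. The system (I) has an $n$th-order minimal state-space description if and only if the rank of $\hat{\mathbf{Y}}$ is equal to $n$ for all $r \ge n$. 2. The state vector of any minimal realization in innovations form can be chosen as linear combinations of the components of $\hat{Y}_r(t)$.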

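Finding the States and Estimating the Noise Statistics. For practical reasons, the predictor is approximated using a finite number of past data: $\hat{y}(t+k-1|t-1) = \alpha_1 y(t-1) + \dots + \alpha_{s_1} y(t-s_1) + \beta_1 u(t-1) + \dots + \beta_{s_2} u(t-s_2)$. This predictor can be determined effectively by least squares from $y(t+k-1) = \theta_k^T \varphi_s(t) + \gamma_k^T U_r(t) + \varepsilon(t+k-1)$ or, dealing with all $r$ predictors simultaneously, from $Y_r(t) = \Theta \varphi_s(t) + \Gamma U_r(t) + E(t)$, with the predictors given by $\hat{Y}_r(t) = \hat{\Theta} \varphi_s(t)$.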

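Finding the States and Estimating the Noise Statistics. By LS we have $[\hat{\Theta}\ \ \hat{\Gamma}] = \mathbf{Y} Z^T (Z Z^T)^{-1}$ with $Z = [\Phi^T\ \ \mathbf{U}^T]^T$. By the matrix inversion lemma, $\hat{\Theta} = \mathbf{Y}\,\Pi_{\mathbf{U}^T}^{\perp} \Phi^T (\Phi\,\Pi_{\mathbf{U}^T}^{\perp} \Phi^T)^{-1}$. Remember $G = \frac{1}{N} \mathbf{Y}\,\Pi_{\mathbf{U}^T}^{\perp} \Phi^T$. So we have $\hat{\mathbf{Y}} = \hat{\Theta}\Phi = G \left(\frac{1}{N} \Phi\,\Pi_{\mathbf{U}^T}^{\perp} \Phi^T\right)^{-1} \Phi$.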
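The step from the joint least-squares solution to the projected form can be verified numerically. The sketch below uses random stand-ins for $\mathbf{Y}$, $\mathbf{U}$ and $\Phi$ (dimensions chosen arbitrarily for illustration) and compares the $\Theta$ block of the joint LS estimate with $\mathbf{Y}\Pi\Phi^T(\Phi\Pi\Phi^T)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 200
Phi = rng.standard_normal((4, N))    # stand-in for the instruments
U = rng.standard_normal((3, N))      # stand-in for the stacked future inputs
Y = rng.standard_normal((3, N))      # stand-in for the stacked future outputs

# Joint least squares on Y = Theta Phi + Gamma U + E
Z = np.vstack([Phi, U])
ThetaGamma = Y @ Z.T @ np.linalg.inv(Z @ Z.T)
Theta_joint = ThetaGamma[:, :Phi.shape[0]]

# Projected form: first remove U from the data, then regress on Phi
Pi = np.eye(N) - U.T @ np.linalg.solve(U @ U.T, U)
Theta_proj = Y @ Pi @ Phi.T @ np.linalg.inv(Phi @ Pi @ Phi.T)

print(np.abs(Theta_joint - Theta_proj).max())
```

The two estimates agree to rounding error, which is what allows the predictor matrix $\hat{\mathbf{Y}} = \hat{\Theta}\Phi$ to be written in terms of $G$.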

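Finding the States and Estimating the Noise Statistics. Remember that the states can be chosen as linear combinations of the predictors. So let $\hat{\mathbf{X}} = L \hat{\mathbf{Y}}$ for a suitable matrix $L$ (with $\hat{O}_r = U_1 R$, the choice $L = R^{-1} U_1^T$ is consistent with $\hat{\mathbf{Y}} \approx \hat{O}_r \hat{\mathbf{X}}$). With the states given, we can estimate the process and measurement noises as $\hat{w}(t) = \hat{x}(t+1) - \hat{A}\hat{x}(t) - \hat{B}u(t)$ and $\hat{v}(t) = y(t) - \hat{C}\hat{x}(t) - \hat{D}u(t)$.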
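Once state estimates are available, the noise estimates are just residuals of the state and output equations, and their sample covariance gives the noise statistics. A minimal sketch, assuming row-wise storage of the signals; the function name `noise_estimates` and the simulated system are hypothetical:

```python
import numpy as np

def noise_estimates(X, u, y, A, B, C, D):
    """Residuals of the state and output equations and their joint covariance.

    X has shape (N+1, n), u has shape (N, m), y has shape (N, p).
    """
    w = X[1:] - X[:-1] @ A.T - u @ B.T     # w(t) = x(t+1) - A x(t) - B u(t)
    v = y - X[:-1] @ C.T - u @ D.T         # v(t) = y(t) - C x(t) - D u(t)
    e = np.hstack([w, v])
    cov = e.T @ e / e.shape[0]             # sample covariance of [w; v]
    return w, v, cov

# Simulated data from an assumed system, with known noise sequences
rng = np.random.default_rng(4)
A = np.array([[0.8, 0.2], [0.0, 0.5]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.3]])
N = 100
u = rng.standard_normal((N, 1))
w_true = 0.1 * rng.standard_normal((N, 2))
v_true = 0.05 * rng.standard_normal((N, 1))
X = np.zeros((N + 1, 2))
y = np.zeros((N, 1))
for t in range(N):
    y[t] = C @ X[t] + D @ u[t] + v_true[t]
    X[t + 1] = A @ X[t] + B @ u[t] + w_true[t]

w_hat, v_hat, cov = noise_estimates(X, u, y, A, B, C, D)
print(cov.shape)
```

In this check the true states and matrices are used, so the residuals reproduce the simulated noise exactly; with estimated quantities they are only approximations.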

Putting It All Together: the family of subspace algorithms. 1. From the input-output data, form $G = \frac{1}{N} \mathbf{Y}\,\Pi_{\mathbf{U}^T}^{\perp} \Phi^T$. Remember: the scalar $r$ is the maximal prediction horizon, and many algorithms use $r = s$. Many algorithms choose $\varphi_s(t)$ to consist of past inputs and outputs with $s_1 = s_2 = s$, so the scalar $s$ is a design variable.

Putting It All Together: the family of subspace algorithms. 2. Select weighting matrices $W_1$ and $W_2$ and perform the SVD $\hat{G} = W_1 G W_2 = U S V^T \approx U_1 S_1 V_1^T$. The choice of the weighting matrices $W_1$ and $W_2$ is perhaps the most important one. Existing algorithms employ the following choices:

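Putting It All Together: the family of subspace algorithms. 3. Select a full-rank matrix $R$ and define the matrix $\hat{O}_r = W_1^{-1} U_1 R$. Solve $\hat{C} = \hat{O}_r(1:p,\, 1:n)$ and $\hat{O}_r(p+1:pr,\, 1:n) = \hat{O}_r(1:p(r-1),\, 1:n)\,\hat{A}$ for $\hat{C}$ and $\hat{A}$, where $p$ is the number of outputs. The latter equation should be solved in a least-squares sense. Typical choices for $R$ are $R = I$, $R = S_1$ or $R = S_1^{1/2}$. 4. Estimate $\hat{B}$, $\hat{D}$ and $\hat{x}_0$ from the linear regression problem: $\min_{B, D, x_0} \frac{1}{N} \sum_{t=1}^{N} \left\| y(t) - \hat{C}(qI - \hat{A})^{-1} B u(t) - D u(t) - \hat{C}(qI - \hat{A})^{-1} x_0 \delta(t) \right\|^2$.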
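Step 4 is a linear regression because, for fixed $\hat{A}$ and $\hat{C}$, the output is linear in $B$, $D$ and $x_0$. One possible way to set it up is sketched below, using the identity $\mathrm{vec}(M B u) = (u^T \otimes M)\,\mathrm{vec}(B)$. The function name `estimate_B_D_x0`, the zero-based time index and the test system are assumptions made for illustration.

```python
import numpy as np

def estimate_B_D_x0(A, C, u, y):
    """Least-squares estimate of B, D and x0 for given A and C.

    u has shape (N, m) and y has shape (N, p).
    Uses x(t) = A^t x0 + F(t) vec(B), with F(t+1) = A F(t) + kron(u(t)^T, I_n).
    """
    N, m = u.shape
    p, n = C.shape
    F = np.zeros((n, n * m))          # sensitivity of x(t) to vec(B)
    At = np.eye(n)                    # A^t
    rows = []
    for t in range(N):
        ut = u[t][None, :]
        # y(t) = C A^t x0 + C F(t) vec(B) + kron(u(t)^T, I_p) vec(D)
        rows.append(np.hstack([C @ At, C @ F, np.kron(ut, np.eye(p))]))
        F = A @ F + np.kron(ut, np.eye(n))
        At = A @ At
    theta, *_ = np.linalg.lstsq(np.vstack(rows), y.ravel(), rcond=None)
    x0 = theta[:n]
    B = theta[n:n + n * m].reshape((n, m), order="F")
    D = theta[n + n * m:].reshape((p, m), order="F")
    return B, D, x0

# Noise-free data from an assumed system, used only as a consistency check
A = np.array([[0.8, 0.2], [0.0, 0.5]])
C = np.array([[1.0, 0.0]])
B_true = np.array([[1.0], [0.5]])
D_true = np.array([[0.3]])
x0_true = np.array([1.0, -1.0])
rng = np.random.default_rng(5)
u = rng.standard_normal((50, 1))
y = np.zeros((50, 1))
x = x0_true.copy()
for t in range(50):
    y[t] = C @ x + D_true @ u[t]
    x = A @ x + B_true @ u[t]

B_hat, D_hat, x0_hat = estimate_B_D_x0(A, C, u, y)
print(B_hat.ravel(), D_hat.ravel(), x0_hat)
```

With noisy data and estimated $\hat{A}$, $\hat{C}$ the same regression applies, but the estimates are then only approximate.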

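Putting It All Together: the family of subspace algorithms. 5. If a noise model is sought, form the state estimates $\hat{\mathbf{X}}$ from the predictors as described under "Finding the States and Estimating the Noise Statistics", and estimate the noise contributions as the residuals $\hat{w}(t)$ and $\hat{v}(t)$ of the state and output equations.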
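Steps 1 to 3 can be strung together in a few lines. The sketch below is one simple member of the family and not any specific published algorithm: it assumes $W_1 = I$, $W_2 = I$, $R = I$, $s_1 = s_2 = s$ and a known order $n$, and it is run on noise-free simulated data. The function name `subspace_AC` and the test system are hypothetical.

```python
import numpy as np

def subspace_AC(y, u, r, s, n):
    """Estimate A and C (up to similarity) from data; y is (T, p), u is (T, m)."""
    T, p = y.shape
    times = range(s, T - r + 1)
    Y = np.column_stack([y[t:t + r].ravel() for t in times])
    U = np.column_stack([u[t:t + r].ravel() for t in times])
    Phi = np.column_stack([
        np.concatenate([y[t - s:t][::-1].ravel(), u[t - s:t][::-1].ravel()])
        for t in times
    ])
    N = Y.shape[1]
    # Step 1: G = (1/N) Y Pi Phi^T, with Y Pi = Y - Y U^T (U U^T)^{-1} U
    Y_perp = Y - (Y @ U.T) @ np.linalg.solve(U @ U.T, U)
    G = Y_perp @ Phi.T / N
    # Step 2: SVD with the assumed weights W1 = I and W2 = I
    U_svd, sv, _ = np.linalg.svd(G, full_matrices=False)
    # Step 3: O_hat = U_1 (R = I); C from the first block row, A from the shift
    O_hat = U_svd[:, :n]
    C_hat = O_hat[:p]
    A_hat, *_ = np.linalg.lstsq(O_hat[:-p], O_hat[p:], rcond=None)
    return A_hat, C_hat, sv

# Noise-free simulation of an assumed second-order system
A = np.array([[0.8, 0.2], [0.0, 0.5]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.3]])
rng = np.random.default_rng(6)
u = rng.standard_normal((300, 1))
y = np.zeros((300, 1))
x = np.zeros(2)
for t in range(300):
    y[t] = C @ x + D @ u[t]
    x = A @ x + B @ u[t]

A_hat, C_hat, sv = subspace_AC(y, u, r=4, s=4, n=2)
print(np.sort(np.linalg.eigvals(A_hat).real), sv)
```

In the noise-free case only the first $n$ singular values are numerically non-zero, which is the situation in which the order can be read off directly from the SVD.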