SYSTEMS Identification

SYSTEMS Identification. Ali Karimpour, Assistant Professor, Ferdowsi University of Mashhad. Reference: "System Identification: Theory for the User," Lennart Ljung.

Lecture 9: Asymptotic Distribution of Parameter Estimators. Topics to be covered include: Central Limit Theorem; The Prediction-Error Approach: Basic Theorem; Expression for the Asymptotic Variance; Frequency-Domain Expressions for the Asymptotic Variance; Distribution of Estimates for the Correlation Approach; Distribution of Estimates for the Instrumental Variable Methods.

Overview. If convergence is guaranteed, then $\hat{\theta}_N \to \theta^*$ with probability 1 as $N \to \infty$. But how fast does the estimate approach the limit, and what is the probability distribution of $\hat{\theta}_N - \theta^*$? The variance analysis of this chapter will reveal that: (a) the estimate converges to $\theta^*$ at a rate proportional to $1/\sqrt{N}$; (b) the distribution of $\sqrt{N}(\hat{\theta}_N - \theta^*)$ converges to a Gaussian distribution $N(0, Q)$; (c) the resulting covariance of $\hat{\theta}_N$, approximately $Q/N$, depends on the number of data points $N$, the parameter sensitivity of the predictor $\psi(t,\theta^*) = \frac{\partial}{\partial\theta}\hat{y}(t|\theta)\big|_{\theta=\theta^*}$, and the noise variance $\lambda_0$.

Central Limit Theorem. Topics to be covered include: The Prediction-Error Approach: Basic Theorem; Expression for the Asymptotic Variance; Frequency-Domain Expressions for the Asymptotic Variance; Distribution of Estimates for the Correlation Approach; Distribution of Estimates for the Instrumental Variable Methods.

Central Limit Theorem. The mathematical tool needed for asymptotic variance analysis is the central limit theorem. Example: Consider two independent random variables, $X$ and $Y$, with the same uniform distribution (shown in the original slide's figure). Define another random variable $Z$ as their sum: $Z = X + Y$. The distribution of $Z$ is the convolution of the two uniform densities, which is triangular.

Central Limit Theorem. Further, consider adding a third independent uniform variable, $W = X + Y + V$. The resultant PDF is even closer to a Gaussian distribution. In general, the PDF of a sum of $N$ independent random variables approaches a Gaussian distribution as $N$ gets larger, regardless of the PDF of the individual terms.
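As a quick numerical illustration of this effect (a sketch, not part of the original slides), the following Python code sums $N$ independent uniform variables, standardizes the sum, and compares a few empirical quantiles with the standard normal ones; the replication count and the values of $N$ are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 100_000  # Monte Carlo replications (arbitrary)

for N in (1, 2, 3, 10):
    # Sum of N i.i.d. U(0,1) variables; a U(0,1) variable has mean 1/2
    # and variance 1/12, so the sum has mean N/2 and variance N/12.
    s = rng.uniform(0.0, 1.0, size=(M, N)).sum(axis=1)
    z = (s - N * 0.5) / np.sqrt(N / 12.0)  # standardized sum
    print(N, np.quantile(z, [0.025, 0.5, 0.975]).round(3))
# As N grows, the printed quantiles approach the N(0,1) values (-1.96, 0, 1.96).
```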

Central Limit Theorem. Let $X_t$ be a $d$-dimensional random variable with mean $E[X_t] = m$ and covariance $\operatorname{Cov}(X_t) = P$. Consider the normalized sum given by: $Y_N = \frac{1}{\sqrt{N}}\sum_{t=1}^{N}(X_t - m)$. Then, as $N$ tends to infinity, the distribution of $Y_N$ converges to the Gaussian distribution with PDF $N(0, P)$.

The Prediction-Error Approach: Basic Theorem. Topics to be covered include: Central Limit Theorem; The Prediction-Error Approach: Basic Theorem; Expression for the Asymptotic Variance; Frequency-Domain Expressions for the Asymptotic Variance; Distribution of Estimates for the Correlation Approach; Distribution of Estimates for the Instrumental Variable Methods.

The Prediction-Error Approach. Applying the central limit theorem, we can obtain the distribution of the estimate as $N$ tends to infinity. Let $\hat{\theta}_N$ be an estimate based on the prediction-error method: $\hat{\theta}_N = \arg\min_\theta V_N(\theta, Z^N)$. Then, with prime denoting differentiation with respect to $\theta$, $V_N'(\hat{\theta}_N, Z^N) = 0$. Expanding $V_N'$ around $\theta^*$ gives: $0 = V_N'(\hat{\theta}_N, Z^N) = V_N'(\theta^*, Z^N) + V_N''(\xi_N, Z^N)(\hat{\theta}_N - \theta^*)$, where $\xi_N$ is a vector "between" $\hat{\theta}_N$ and $\theta^*$.

The Prediction-Error Approach. Assume that $V_N''(\xi_N, Z^N)$ is nonsingular; then: $\hat{\theta}_N - \theta^* = -\left[V_N''(\xi_N, Z^N)\right]^{-1} V_N'(\theta^*, Z^N)$. To obtain the distribution of $\hat{\theta}_N - \theta^*$, both $V_N'(\theta^*, Z^N)$ and $V_N''(\xi_N, Z^N)$ must be analyzed as $N$ tends to infinity, where as usual: $V_N(\theta, Z^N) = \frac{1}{N}\sum_{t=1}^{N} \tfrac{1}{2}\varepsilon^2(t, \theta)$.

The Prediction-Error Approach. For simplicity, we first assume that the predictor is given by a linear regression: $\hat{y}(t|\theta) = \varphi^T(t)\theta$. The actual data are generated by $y(t) = \varphi^T(t)\theta_0 + e(t)$, where $\theta_0$ is the parameter vector of the true system and $e(t)$ is white noise. So: $\varepsilon(t,\theta) = y(t) - \varphi^T(t)\theta$, and at the true parameters $\varepsilon(t,\theta_0) = e(t)$. Therefore: $V_N'(\theta_0, Z^N) = -\frac{1}{N}\sum_{t=1}^{N}\varphi(t)e(t)$.

The Prediction-Error Approach. Let us treat $\sqrt{N}\,V_N'(\theta_0, Z^N) = -\frac{1}{\sqrt{N}}\sum_{t=1}^{N}\varphi(t)e(t)$ as a random variable. Its mean is zero, since $e(t)$ has zero mean and is independent of $\varphi(t)$. The covariance is $Q = \lambda_0\,E\!\left[\varphi(t)\varphi^T(t)\right]$, where $\lambda_0$ is the variance of $e(t)$. Applying the central limit theorem: $\sqrt{N}\,V_N'(\theta_0, Z^N) \to N(0, Q)$ in distribution.

The Prediction-Error Approach. Next, compute $V_N''$: for the linear regression, $V_N''(\theta, Z^N) = \frac{1}{N}\sum_{t=1}^{N}\varphi(t)\varphi^T(t)$, which does not depend on $\theta$. And, as $N$ tends to infinity: $V_N'' \to R = E\!\left[\varphi(t)\varphi^T(t)\right]$ with probability 1.

The Prediction-Error Approach. We obtain: $\sqrt{N}(\hat{\theta}_N - \theta_0) \to N\!\left(0,\, R^{-1} Q R^{-1}\right) = N\!\left(0,\, \lambda_0\left[E\,\varphi(t)\varphi^T(t)\right]^{-1}\right)$. So: $\operatorname{Cov}\hat{\theta}_N \approx \frac{\lambda_0}{N}\left[E\,\varphi(t)\varphi^T(t)\right]^{-1}$.
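As a small sanity check of the $1/\sqrt{N}$ convergence rate (a sketch with arbitrary settings, not from the original slides), the code below fits a one-parameter least-squares regression at two sample sizes; quadrupling the data should roughly halve the standard deviation of the estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 0.25  # noise variance (assumed)

def ls_std(N, runs=2000):
    """Empirical standard deviation of the LS estimate in y = 2*phi + e."""
    ests = np.empty(runs)
    for i in range(runs):
        phi = rng.normal(size=N)
        y = 2.0 * phi + np.sqrt(lam) * rng.normal(size=N)
        ests[i] = (phi @ y) / (phi @ phi)  # scalar least-squares estimate
    return ests.std()

s1, s2 = ls_std(400), ls_std(1600)
print(s1, s2, s1 / s2)  # the ratio should be close to sqrt(1600/400) = 2
```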

The Prediction-Error Approach. The extended result on the distribution of the estimate is summarized in the following theorem (Theorem 9.1 in Ljung's textbook). Theorem 1: Consider the estimate $\hat{\theta}_N$ determined by: $\hat{\theta}_N = \arg\min_\theta V_N(\theta, Z^N)$. Assume that the model structure is linear and uniformly stable and that the data set satisfies the quasi-stationarity requirements. Assume also that $\hat{\theta}_N$ converges with probability 1 to a unique parameter vector $\theta^*$, that $\sqrt{N}\,V_N'(\theta^*, Z^N) \to N(0, Q)$ in distribution, and that $V_N''(\theta, Z^N)$ converges to $\bar{V}''(\theta)$ with probability 1 in a neighborhood of $\theta^*$.

The Prediction-Error Approach. Here $\bar{V}(\theta)$ is the ensemble mean given by: $\bar{V}(\theta) = E\!\left[\tfrac{1}{2}\varepsilon^2(t,\theta)\right]$. Then, the distribution of $\sqrt{N}(\hat{\theta}_N - \theta^*)$ converges to the Gaussian distribution given by $N(0, P_\theta)$, with $P_\theta = \left[\bar{V}''(\theta^*)\right]^{-1} Q \left[\bar{V}''(\theta^*)\right]^{-1}$.

The Prediction-Error Approach. As stated formally in Theorem 1, the distribution of $\sqrt{N}(\hat{\theta}_N - \theta^*)$ converges to a Gaussian distribution for a broad class of system identification problems. This implies that the covariance of $\hat{\theta}_N$ asymptotically converges to: $\operatorname{Cov}\hat{\theta}_N \approx P_\theta/N$. $P_\theta$ is called the asymptotic covariance matrix, and the accuracy of the estimate depends not only on (a) the number of samples $N$, but also on (b) the parameter sensitivity of the predictor, $\psi(t,\theta^*)$, and (c) the noise variance $\lambda_0$.

Expression for the Asymptotic Variance. Topics to be covered include: Central Limit Theorem; The Prediction-Error Approach: Basic Theorem; Expression for the Asymptotic Variance; Frequency-Domain Expressions for the Asymptotic Variance; Distribution of Estimates for the Correlation Approach; Distribution of Estimates for the Instrumental Variable Methods.

Quadratic Criterion. Let us compute the covariance once again for the general case: a predictor $\hat{y}(t|\theta)$ with sensitivity $\psi(t,\theta) = \frac{\partial}{\partial\theta}\hat{y}(t|\theta)$. Unlike the linear regression case, the sensitivity $\psi$ is a function of $\theta$. Assume that $\varepsilon(t,\theta_0) = e(t)$ is white noise with zero mean and variance $\lambda_0$. We have: $Q = \lambda_0\,E\!\left[\psi(t,\theta_0)\psi^T(t,\theta_0)\right]$.

Quadratic Criterion. Similarly, $\bar{V}''(\theta_0) = E\!\left[\psi(t,\theta_0)\psi^T(t,\theta_0)\right]$. Hence: $P_\theta = \lambda_0\left[E\,\psi(t,\theta_0)\psi^T(t,\theta_0)\right]^{-1}$. The asymptotic variance is therefore (a) inversely proportional to the number of samples, (b) proportional to the noise variance, and (c) inversely related to the parameter sensitivity: the more a parameter affects the prediction, the smaller its variance becomes.

Quadratic Criterion. Since $\theta_0$ is not known, the asymptotic variance cannot be determined exactly. A very important and useful aspect of the expression for the asymptotic covariance matrix is that it can be estimated from data. Having $N$ data points and having determined $\hat{\theta}_N$, we may use: $\hat{\lambda}_N = \frac{1}{N}\sum_{t=1}^{N}\varepsilon^2(t,\hat{\theta}_N)$ and $\hat{P}_N = \hat{\lambda}_N\left[\frac{1}{N}\sum_{t=1}^{N}\psi(t,\hat{\theta}_N)\psi^T(t,\hat{\theta}_N)\right]^{-1}$. From this estimate, the number of data samples sufficient for a desired model accuracy can be assessed.
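A minimal numpy sketch of this covariance estimate for the linear-regression case, where $\psi(t,\theta) = \varphi(t)$; the function name and the synthetic data below are placeholders, not the textbook's notation.

```python
import numpy as np

def estimated_covariance(Phi, y, theta_hat):
    """Estimate Cov(theta_hat) ~ P_N / N for a linear regression,
    where the sensitivity psi(t) equals the regressor phi(t)."""
    N = len(y)
    eps = y - Phi @ theta_hat                # prediction errors at theta_hat
    lam_hat = (eps @ eps) / N                # noise-variance estimate
    R_hat = (Phi.T @ Phi) / N                # (1/N) sum of phi*phi^T
    P_hat = lam_hat * np.linalg.inv(R_hat)   # asymptotic covariance matrix
    return P_hat / N                         # approximate covariance of theta_hat

# Usage on synthetic data (all values arbitrary):
rng = np.random.default_rng(1)
N = 2000
u = rng.normal(size=N)
Phi = np.column_stack([u, np.roll(u, 1)])    # two-regressor example
theta_true = np.array([1.0, -0.5])
y = Phi @ theta_true + 0.1 * rng.normal(size=N)
theta_hat = np.linalg.lstsq(Phi, y, rcond=None)[0]
print(estimated_covariance(Phi, y, theta_hat))
```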

Example: Covariance of LS Estimates. Consider the system $y(t) + a_0 y(t-1) = b_0 u(t-1) + e(t)$, where $e(t)$ and $u(t)$ are two independent white noises with variances $\lambda$ and $\mu$, respectively. Suppose that the coefficient $b_0$ for $u(t-1)$ is known and the system is identified in the model structure $y(t) + a\,y(t-1) = b_0 u(t-1) + e(t)$, or: $\hat{y}(t|a) = -a\,y(t-1) + b_0 u(t-1)$. We have: $\varepsilon(t,a) = y(t) + a\,y(t-1) - b_0 u(t-1)$ and $\psi(t) = -y(t-1)$.

Example: Covariance of LS Estimates. Hence $P_a = \lambda\left[E\,y^2(t)\right]^{-1}$. To compute the covariance, square the first equation and take the expectation: $(1 - a_0^2)\,E\,y^2(t) = b_0^2\mu + \lambda$. The cross terms vanish since neither $e(t)$ nor $u(t-1)$ affects $y(t-1)$ (due to the time delay). Hence: $\operatorname{Cov}\hat{a}_N \approx \dfrac{\lambda\,(1 - a_0^2)}{N\left(b_0^2\mu + \lambda\right)}$.

Example: Covariance of LS Estimates. Assume $a_0 = 0.1$. Estimated values of $a$ for 100 independent experiments using the LS estimate are shown in the figure (figure omitted). Cov(a) = 0.0471.

Example: Covariance of LS Estimates. Now, again with $a_0 = 0.1$ but under a different experimental setting (given only in the original slide's figure): Cov(a) = 0.0014.

Example: Covariance of LS Estimates. Now, with $a_0 = 0.1$ and yet another setting (again given only in the original slide's figure): Cov(a) = 0.1204.
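The numerical settings behind these three Cov(a) values appeared only in the original figures and cannot be recovered here. The hedged Monte Carlo sketch below shows how such a table can be produced and compared against the theoretical value $\operatorname{Cov}\hat{a} \approx \lambda(1-a_0^2)/\big(N(b_0^2\mu+\lambda)\big)$; every numerical setting is an assumption.

```python
import numpy as np

def run_experiments(a0=0.1, b0=1.0, lam=1.0, mu=1.0, N=500, runs=100, seed=0):
    """LS estimation of a in y(t) + a*y(t-1) = b0*u(t-1) + e(t), with b0 known."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(runs):
        e = np.sqrt(lam) * rng.normal(size=N)
        u = np.sqrt(mu) * rng.normal(size=N)
        y = np.zeros(N)
        for t in range(1, N):
            y[t] = -a0 * y[t-1] + b0 * u[t-1] + e[t]
        # Known-input part moved to the left: y(t) - b0*u(t-1) = -a*y(t-1) + e(t)
        z = y[1:] - b0 * u[:-1]
        phi = -y[:-1]
        estimates.append((phi @ z) / (phi @ phi))
    empirical = np.var(estimates)
    theory = lam * (1 - a0**2) / (N * (b0**2 * mu + lam))
    return empirical, theory

print(run_experiments())  # empirical vs. theoretical variance of the estimate
```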

Example: Covariance of an MA(1) Parameter. Consider the system $y(t) = e(t) + c_0 e(t-1)$, where $e(t)$ is white noise with variance $\lambda$. The MA(1) model structure is used: $y(t) = e(t) + c\,e(t-1)$. Given the predictor (4.18): $\hat{y}(t|c) = \frac{c\,q^{-1}}{1 + c\,q^{-1}}\,y(t)$, so $\varepsilon(t,c) = \frac{1}{1 + c\,q^{-1}}\,y(t)$. Differentiation w.r.t. $c$ gives: $\psi(t,c) = -\frac{\partial}{\partial c}\varepsilon(t,c) = \frac{q^{-1}}{(1 + c\,q^{-1})^2}\,y(t)$. At $c = c_0$ we have $y(t) = (1 + c_0 q^{-1})e(t)$, so $\psi(t) = \frac{1}{1 + c_0 q^{-1}}\,e(t-1)$ and $E\,\psi^2(t) = \frac{\lambda}{1 - c_0^2}$. If $\hat{c}_N$ is the PEM estimate of $c$: $\sqrt{N}(\hat{c}_N - c_0) \to N(0,\, 1 - c_0^2)$.
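A hedged numerical check of the result $\operatorname{Var}\hat{c}_N \approx (1 - c_0^2)/N$, using a simple grid-search PEM over the prediction-error loss (the true parameter, sample size, and grid resolution are arbitrary choices):

```python
import numpy as np
from scipy.signal import lfilter

def pem_ma1(y, grid=np.linspace(-0.99, 0.99, 199)):
    """Grid-search PEM estimate of c in y(t) = e(t) + c*e(t-1).
    The prediction error is eps = y / (1 + c q^{-1}), computed by lfilter."""
    losses = [np.sum(lfilter([1.0], [1.0, c], y) ** 2) for c in grid]
    return grid[int(np.argmin(losses))]

rng = np.random.default_rng(2)
c0, N, runs = 0.5, 400, 200          # assumed true parameter and sizes
estimates = []
for _ in range(runs):
    e = rng.normal(size=N + 1)       # white noise with variance 1
    y = e[1:] + c0 * e[:-1]          # MA(1) data
    estimates.append(pem_ma1(y))
print(np.var(estimates), (1 - c0**2) / N)  # empirical vs. asymptotic variance
```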

Asymptotic Variance for General Norms. For a general norm $\ell(\varepsilon)$ we have: $\hat{\theta}_N = \arg\min_\theta \frac{1}{N}\sum_{t=1}^{N}\ell(\varepsilon(t,\theta))$. Similarly to the quadratic case, the Taylor-expansion argument gives $\sqrt{N}(\hat{\theta}_N - \theta^*) \to N(0, P_\theta)$ with $P_\theta = \left[\bar{V}''(\theta^*)\right]^{-1} Q \left[\bar{V}''(\theta^*)\right]^{-1}$. We can use the asymptotic normality result in this more general form whenever required; the expression for the asymptotic covariance matrix is, however, rather complicated in general.

Asymptotic Variance for General Norms. Assume that $\ell$ does not depend on $\theta$ and that $\varepsilon(t,\theta_0) = e(t)$ is white noise. Then, under these assumptions, straightforward calculations give: $P_\theta = \kappa(\ell)\left[E\,\psi(t,\theta_0)\psi^T(t,\theta_0)\right]^{-1}$, with $\kappa(\ell) = \dfrac{E\left[\ell'(e)\right]^2}{\left[E\,\ell''(e)\right]^2}$. Clearly $\kappa(\ell) = \lambda_0$ for the quadratic norm $\ell(\varepsilon) = \varepsilon^2/2$. The choice of $\ell$ in the criterion thus only acts as a scaling of the covariance matrix.

Frequency-Domain Expressions for the Asymptotic Variance. Topics to be covered include: Central Limit Theorem; The Prediction-Error Approach: Basic Theorem; Expression for the Asymptotic Variance; Frequency-Domain Expressions for the Asymptotic Variance; Distribution of Estimates for the Correlation Approach; Distribution of Estimates for the Instrumental Variable Methods.

Frequency-Domain Expressions for the Asymptotic Variance. The asymptotic variance has a different expression in the frequency domain, which is useful for variance analysis and experiment design. Let the transfer function and the noise model be consolidated into a matrix: $T(q,\theta) = \begin{bmatrix} G(q,\theta) & H(q,\theta) \end{bmatrix}^T$. The gradient of $T$, that is, the sensitivity of $T$ to $\theta$, is $T'(q,\theta) = \frac{\partial}{\partial\theta}T(q,\theta)$. For the predictor, we have already defined $W(q,\theta)$ and $z(t)$ such that $\hat{y}(t|\theta) = W(q,\theta)z(t)$, with $z(t) = \begin{bmatrix} u(t) & y(t) \end{bmatrix}^T$.

Frequency-Domain Expressions for the Asymptotic Variance. Therefore the predictor sensitivity is given by $\psi(t,\theta) = W'(q,\theta)z(t)$, where $W'(q,\theta) = \frac{\partial}{\partial\theta}W(q,\theta)$. Substituting in the first equation, the sensitivity at the true parameters can be written in terms of $T'$: $\psi(t,\theta_0) = \frac{1}{H(q,\theta_0)}\,T'(q,\theta_0)\,\chi_0(t)$, with $\chi_0(t) = \begin{bmatrix} u(t) & e(t) \end{bmatrix}^T$.

Frequency-Domain Expressions for the Asymptotic Variance. At $\theta = \theta_0$ (the true system), note that $\varepsilon(t,\theta_0) = e(t)$ and $y(t) = G(q,\theta_0)u(t) + H(q,\theta_0)e(t)$. Let $\Phi_\chi(\omega)$ be the spectrum matrix of $\chi_0(t)$: $\Phi_\chi(\omega) = \begin{bmatrix} \Phi_u(\omega) & \Phi_{ue}(\omega) \\ \Phi_{eu}(\omega) & \lambda_0 \end{bmatrix}$. Using the familiar formula for the covariance of filtered stationary signals: $E\,\psi(t,\theta_0)\psi^T(t,\theta_0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{1}{\left|H(e^{i\omega},\theta_0)\right|^2}\,T'(e^{i\omega},\theta_0)\,\Phi_\chi(\omega)\,\left[T'(e^{i\omega},\theta_0)\right]^* d\omega$.

Frequency-Domain Expressions for the Asymptotic Variance. For the noise spectrum, $\Phi_v(\omega) = \lambda_0\left|H(e^{i\omega},\theta_0)\right|^2$. Using this in the equation above together with $P_\theta = \lambda_0\left[E\,\psi\psi^T\right]^{-1}$, we have: $P_\theta = \left[\frac{1}{2\pi}\int_{-\pi}^{\pi} T'(e^{i\omega},\theta_0)\,\Phi_\chi(\omega)\,\left[T'(e^{i\omega},\theta_0)\right]^*\,\Phi_v^{-1}(\omega)\,d\omega\right]^{-1}$: the asymptotic variance in the frequency domain.
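A well-known consequence of this frequency-domain expression (Ljung, Chapter 9) is the black-box variance approximation for the estimated frequency function, valid asymptotically in both the model order $n$ and the number of data $N$:

```latex
\operatorname{Cov} \hat{G}_N(e^{i\omega}) \;\approx\; \frac{n}{N}\,\frac{\Phi_v(\omega)}{\Phi_u(\omega)}
```

The variance of the frequency-function estimate thus grows with the model order and with the noise-to-input spectral ratio at each frequency, and decays as $1/N$.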

Distribution of Estimates for the Correlation Approach. Topics to be covered include: Central Limit Theorem; The Prediction-Error Approach: Basic Theorem; Expression for the Asymptotic Variance; Frequency-Domain Expressions for the Asymptotic Variance; Distribution of Estimates for the Correlation Approach; Distribution of Estimates for the Instrumental Variable Methods.

The Correlation Approach. We shall confine ourselves to the case studied in Theorem 8.6, that is, filtered prediction errors $\varepsilon_F(t,\theta) = L(q)\varepsilon(t,\theta)$ and linearly generated instruments. We thus have: $f_N(\theta, Z^N) = \frac{1}{N}\sum_{t=1}^{N}\zeta(t,\theta)\varepsilon_F(t,\theta)$, with $\hat{\theta}_N$ solving $f_N(\hat{\theta}_N, Z^N) = 0$. By Taylor expansion we have: $0 = f_N(\hat{\theta}_N, Z^N) = f_N(\theta^*, Z^N) + f_N'(\xi_N, Z^N)(\hat{\theta}_N - \theta^*)$. This is entirely analogous with the result obtained for the PE approach, with the difference that $V_N'$ in the PE case is replaced with $f_N$ in the correlation case.

The Correlation Approach. Theorem: Consider $\hat{\theta}_N$ determined by $f_N(\hat{\theta}_N, Z^N) = 0$. Assume that $f_N$ is computed for a linear, uniformly stable model structure, and that $\{\zeta(t,\theta)\}$ is generated by a uniformly stable family of filters. Assume also that $\hat{\theta}_N \to \theta^*$ with probability 1, that $\bar{f}'(\theta^*)$ is nonsingular, and that $\sqrt{N}\,f_N(\theta^*, Z^N) \to N(0, Q)$ in distribution. Then $\sqrt{N}(\hat{\theta}_N - \theta^*) \to N\!\left(0,\, \left[\bar{f}'(\theta^*)\right]^{-1} Q \left[\bar{f}'(\theta^*)\right]^{-T}\right)$.

The Correlation Approach

Example: Covariance of an MA(1) Parameter. Consider the system $y(t) = e(t) + c_0 e(t-1)$, where $e(t)$ is white noise with variance $\lambda$. Using the PLR method, the estimate solves the pseudo-linear regression correlation equations, and at $\theta = c_0$ the asymptotic covariance follows from the correlation-approach theorem above.

Distribution of Estimates for the Instrumental Variable Methods. Topics to be covered include: Central Limit Theorem; The Prediction-Error Approach: Basic Theorem; Expression for the Asymptotic Variance; Frequency-Domain Expressions for the Asymptotic Variance; Distribution of Estimates for the Correlation Approach; Distribution of Estimates for the Instrumental Variable Methods.

Instrumental Variable Methods. We have: $\hat{\theta}_N^{IV} = \operatorname{sol}\left\{\frac{1}{N}\sum_{t=1}^{N}\zeta(t)\left(y(t) - \varphi^T(t)\theta\right) = 0\right\}$. Suppose the true system is given as $y(t) = \varphi^T(t)\theta_0 + v(t)$, where the disturbance $v(t)$ is driven by white noise $e(t)$ with variance $\lambda$, independent of $\{u(t)\}$. Then $v(t)$ is independent of $\{u(t)\}$, and hence of the instruments $\zeta(t)$, if the system operates in open loop. Thus $\theta_0$ is a solution to: $\bar{f}(\theta) = E\left[\zeta(t)\left(y(t) - \varphi^T(t)\theta\right)\right] = 0$.

Instrumental Variable Methods. To get an asymptotic distribution, we shall assume that $\theta_0$ is the only solution to $\bar{f}(\theta) = 0$. Introduce also the monic filter $K_0(q)$ through $v(t) = K_0(q)e(t)$. Inserting these into the equations above yields the asymptotic covariance of the IV estimate.

Example: Covariance of IV Estimates. Consider the system $y(t) + a_0 y(t-1) = b_0 u(t-1) + e(t)$, where $e(t)$ and $u(t)$ are two independent white noises with variances $\lambda$ and $\mu$, respectively. Suppose that the coefficient $b_0$ for $u(t-1)$ is known and the system is identified in the model structure $y(t) + a\,y(t-1) = b_0 u(t-1) + e(t)$. Let $a$ be estimated by the IV method with instruments generated from the input. By comparing the above system with $y(t) = \varphi^T(t)\theta_0 + v(t)$, we have $\varphi(t) = -y(t-1)$, $\theta_0 = a_0$, and $v(t) = e(t)$.
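A minimal numpy sketch of this IV estimate; the instrument choice $\zeta(t) = u(t-2)$ and all numerical settings below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
a0, b0, N = 0.1, 1.0, 5000           # assumed true parameters and sample size
u = rng.normal(size=N)               # input, white with variance 1
e = rng.normal(size=N)               # disturbance, white here for simplicity
y = np.zeros(N)
for t in range(1, N):
    y[t] = -a0 * y[t-1] + b0 * u[t-1] + e[t]

# Model: y(t) - b0*u(t-1) = -a*y(t-1) + v(t). The instrument zeta(t) = u(t-2)
# is correlated with the regressor -y(t-1) but independent of the disturbance.
z = y[2:] - b0 * u[1:-1]             # left-hand side for t = 2..N-1
phi = -y[1:-1]                       # regressor -y(t-1)
zeta = u[:-2]                        # instrument u(t-2)
a_iv = (zeta @ z) / (zeta @ phi)     # IV estimate of a
print(a_iv)
```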

Instrumental Variable Methods