12.215 Modern Navigation, Lecture 13 (Prof. Thomas Herring)

Summary of last class
Basic statistics
–Statistical description and parameters
  Probability distributions
  Descriptions: expectations, variances, moments
  Covariances
  Estimates of statistical parameters
Propagation of variances
–Methods for determining the statistical parameters of quantities derived from other statistical variables

Today's class
Estimation methods
–Restricted to basically linear estimation problems (also non-linear problems that are nearly linear)
–Restricted to parametric, overdetermined estimation
–Concepts in estimation:
  Mathematical models
  Statistical models
  Least squares and maximum likelihood estimation
  Covariance matrix of estimated parameters
  Covariance matrix of post-fit residuals

Concepts in estimation
Given multiple observations of a quantity, or observations related to a set of quantities, how do you obtain a "best" estimate?
What do we mean by "best"?
How do you quantify the quality of observations and the relationship between errors in observations?
The complete answers to these questions are complex.
We will limit our discussion to parametric estimation, mostly for Gaussian-distributed random errors in measurements.
In parametric estimation, a mathematical relationship between the observations and the parameters is used to model the observations (e.g., GPS measures pseudoranges to satellites: these observations can be related to the positions of the ground station and the satellites, plus other parameters that we discuss later).

Basics of parametric estimation
All parametric estimation methods can be broken into a few main steps:
–Observation equations: equations that relate the parameters to be estimated to the observed quantities (observables); the mathematical model. Example: the relationship between pseudorange, receiver position, and satellite position (implicit in the geometric range ρ), clocks, and atmospheric and ionospheric delays (a minimal sketch follows this list).
–Stochastic model: a statistical description of the random fluctuations in the measurements and possibly in the parameters. In some forms the stochastic model is not explicit.
–Inversion: determines the parameter values from the mathematical model, consistent with the statistical model.
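A minimal Python sketch of such an observation equation, with illustrative names (not code from the course) and with the atmospheric and ionospheric delay terms deliberately omitted:

```python
import numpy as np

C = 299792458.0  # speed of light (m/s)

def pseudorange_model(r_rcv, r_sat, dt_rcv, dt_sat):
    """Modeled pseudorange (m): geometric range plus clock offsets.
    Atmospheric and ionospheric delays are omitted in this sketch."""
    rho = np.linalg.norm(np.asarray(r_sat) - np.asarray(r_rcv))  # geometric range
    return rho + C * (dt_rcv - dt_sat)  # receiver and satellite clock terms
```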

Observation model
The observation model consists of equations relating observables to the parameters of the model:
–Observable = function(parameters)
–Observables should not appear on the right-hand side of the equation
–The observed values are the observable plus noise of some stochastic nature
Often the function is non-linear, and the most common method is linearization of the function using a Taylor series expansion.
Sometimes log-linearization is used for f = a·b·c, i.e., products of parameters, since taking logarithms makes the model linear in the log-parameters.

Taylor series expansion
In the most common Taylor series approach, the observation equations are expanded about a priori values x₀ of the parameters:

Δy = y_obs − f(x₀) = (∂f/∂x)|x₀ Δx + v

The estimation is made using the difference between the observations and the expected values based on the a priori values for the parameters.
The estimation returns adjustments Δx to the a priori parameter values.
The observations are y plus noise.
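In practice the partial derivatives are usually derived analytically; as a generic stand-in, here is a sketch that forms the design matrix A numerically (continuing the Python sketch above):

```python
def design_matrix(model, x0, eps=1e-6):
    """Numerical Jacobian A = ∂f/∂x evaluated at the a priori parameters x0."""
    x0 = np.asarray(x0, dtype=float)
    y0 = np.atleast_1d(model(x0))
    A = np.zeros((y0.size, x0.size))
    for j in range(x0.size):
        dx = np.zeros_like(x0)
        dx[j] = eps
        A[:, j] = (np.atleast_1d(model(x0 + dx)) - y0) / eps  # one-sided difference
    return A
```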

Linearization
Since the linearization is only an approximation, the estimation should be iterated until the adjustments to the parameter values are zero.
For GPS estimation the convergence rate is typically 100–1000:1 (i.e., a 1-meter error in the a priori coordinates results in roughly 1–10 mm of non-linearity error).
To assess the level of the non-linear contribution, the Taylor series expansion is compared to the non-linear evaluation. If the differences are similar in size to the noise in the measurements, then a new Taylor series expansion, about the better estimates of the parameters, is needed. (A sketch of this iteration follows.)
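A sketch of the iterated linearized solution, reusing design_matrix from the previous sketch (an unweighted solve is assumed here for simplicity):

```python
def iterate_estimate(model, y_obs, x_apriori, tol=1e-8, max_iter=10):
    """Re-expand about each new estimate until the adjustments are negligible."""
    x = np.asarray(x_apriori, dtype=float).copy()
    for _ in range(max_iter):
        A = design_matrix(model, x)                           # expand about current x
        dy = np.atleast_1d(y_obs) - np.atleast_1d(model(x))   # observed minus computed
        dx = np.linalg.solve(A.T @ A, A.T @ dy)               # least-squares adjustment
        x += dx
        if np.linalg.norm(dx) < tol:                          # adjustments effectively zero
            break
    return x
```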

Estimation
The most common estimation method is "least squares," in which the parameter estimates are the values that minimize the sum of the squares of the differences between the observations and the modeled values based on the parameter estimates.
For linear estimation problems there is a direct matrix formulation for the solution.
For non-linear problems: linearization, or a search technique in which the parameter space is searched for the minimum value.
Take care with search methods that a local minimum is not found (not treated in this course).

Least squares estimation
Originally formulated by Gauss. Basic equations:

Δy = A Δx + v        x̂ = (AᵀA)⁻¹ Aᵀ Δy

Δy is the vector of observations; A is the linear matrix relating parameters to observables; Δx is the vector of parameters; v is the vector of residuals.
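A small worked example of the direct matrix solution (the numbers are made up for illustration):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])                        # offset + rate model at t = 0, 1, 2
dy = np.array([0.1, 1.2, 1.9])                    # observed-minus-computed values
dx_hat, *_ = np.linalg.lstsq(A, dy, rcond=None)   # solves min ||A dx - dy||^2 stably
v = dy - A @ dx_hat                               # post-fit residuals
```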

Weighted least squares
In standard least squares, nothing is assumed about the residuals v except that they have zero expectation.
One often sees weighted least squares, in which a weight matrix W is assigned to the residuals:

x̂ = (AᵀWA)⁻¹ AᵀW Δy

Residuals corresponding to larger elements of W are given more weight.
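A sketch of the weighted solution, reusing A and dy from the example above (with W = I it reduces to standard least squares):

```python
def weighted_least_squares(A, dy, W):
    """Minimize vᵀ W v:  x̂ = (Aᵀ W A)⁻¹ Aᵀ W Δy."""
    N = A.T @ W @ A                         # weighted normal matrix
    return np.linalg.solve(N, A.T @ W @ dy)

W = np.diag([1.0, 1.0, 4.0])                # trust the third observation more
x_hat = weighted_least_squares(A, dy, W)
```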

Statistical approach to least squares
If the weight matrix used in weighted least squares is the inverse of the covariance matrix of the residuals, then weighted least squares is a maximum likelihood estimator for Gaussian-distributed random errors.
This choice maximizes the probability density (called a maximum likelihood estimate, MLE).
This latter form of least squares is the most statistically rigorous version. Sometimes weights are chosen empirically.

Data covariance matrix
If we use the inverse of the covariance matrix of the noise in the data, we obtain an MLE when the data noise has a Gaussian distribution.
How do you obtain the data covariance matrix? A difficult question to answer completely.
For sextant measurements:
–Index error measurements
–Individual observers
Issues to be considered for GPS specifically:
–Thermal noise in the receiver gives one component
–Multipath could be treated as a noise-like quantity
–The signal-to-noise ratio of the measurements allows an estimate of the noise (discussed later in the course)
–An incomplete mathematical model of the observables can sometimes be treated as noise-like
–The gain of the GPS antenna will generate lower SNR at low elevation angles

Data covariance matrix
In practice in GPS (as well as in many other fields), the data covariance matrix is somewhat arbitrarily chosen.
The largest problem is temporal correlations in the measurements. A typical GPS data set for 24 hours of data at 30-second sampling contains 8 × 2880 ≈ 23,000 phase measurements.
Since the inverse of the covariance matrix is required, fully accounting for correlations requires the inverse of a 23,000 × 23,000 matrix; just storing that matrix would require about 4 GB of memory.
Even if the original covariance matrix is banded (i.e., correlations over a time short compared to 24 hours), the inverse of a banded matrix is usually a full matrix (however, remember LU decomposition from the linear algebra lecture).
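The storage figure is easy to check (double-precision entries assumed):

```python
epochs = 24 * 3600 // 30        # 2880 epochs at 30 s sampling
n_obs = 8 * epochs              # ~8 satellites in view -> ~23,000 phase measurements
full_cov_bytes = n_obs**2 * 8   # full double-precision covariance matrix
print(f"{n_obs} observations -> ~{full_cov_bytes / 1e9:.1f} GB")
# 23040 observations -> ~4.2 GB
```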

Covariance matrix of parameter estimates
Propagation of covariance can be applied to the weighted least squares problem:

x̂ = (AᵀV⁻¹A)⁻¹ AᵀV⁻¹ Δy        C_x = (AᵀV⁻¹A)⁻¹

Notice that the covariance matrix of the parameter estimates is a natural output of the estimator if AᵀV⁻¹A is inverted (it does not need to be).
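A sketch combining the MLE solution with its covariance (V is the data covariance matrix; explicit inverses are used for clarity, although solving the normal equations is cheaper):

```python
def wls_with_covariance(A, dy, V):
    """x̂ = (AᵀV⁻¹A)⁻¹ AᵀV⁻¹ Δy and C_x = (AᵀV⁻¹A)⁻¹."""
    Vinv = np.linalg.inv(V)          # expensive if V is large and non-diagonal
    N = A.T @ Vinv @ A               # normal matrix
    C_x = np.linalg.inv(N)           # covariance of the estimated parameters
    x_hat = C_x @ (A.T @ Vinv @ dy)
    return x_hat, C_x
```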

Covariance matrix of estimated parameters
Notice that for the rigorous estimation, the inverse of the data covariance matrix is needed (time-consuming if non-diagonal).
To compute the parameter-estimate covariance, only the covariance matrix of the data is needed (not its inverse).
In some cases, a non-rigorous inversion can be done with, say, a diagonal covariance matrix, while the parameter covariance matrix is rigorously computed using the full covariance matrix. This is a non-MLE, but the covariance matrix of the parameters should be correct (just not the best estimates that can be found).
This technique can be used if storage of the full covariance matrix is possible but inversion of the matrix is not, because it would take too long or the inverse cannot be performed in place.
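A sketch of this non-MLE covariance by propagation of covariance (W is the chosen, e.g. diagonal, weight matrix; V is the full data covariance, which is never inverted):

```python
def non_mle_covariance(A, W, V):
    """Covariance of x̂ = (AᵀWA)⁻¹ AᵀW Δy when W ≠ V⁻¹:
    C_x = G V Gᵀ with G = (AᵀWA)⁻¹ AᵀW (propagation of covariance)."""
    G = np.linalg.solve(A.T @ W @ A, A.T @ W)  # the linear estimator matrix
    return G @ V @ G.T
```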

Covariance matrix of post-fit residuals
Post-fit residuals are the differences between the observations and the values computed from the estimated parameters.
Because some of the noise in the data is absorbed into the parameter estimates, in general the post-fit residuals are not the same as the errors in the data. In some cases they can be considerably smaller.
The covariance matrix of the post-fit residuals can be computed using propagation of covariances.

Covariance matrix of post-fit residuals
This can be computed using propagation of covariances:

v = (I − A(AᵀV⁻¹A)⁻¹AᵀV⁻¹) e = B e        (Eqn 1)
C_v = B V Bᵀ = V − A(AᵀV⁻¹A)⁻¹Aᵀ

e is the vector of true errors, and v is the vector of post-fit residuals.
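A sketch of the closed form above (the MLE case W = V⁻¹ is assumed):

```python
def postfit_residual_covariance(A, V):
    """C_v = V − A (AᵀV⁻¹A)⁻¹ Aᵀ: the post-fit residuals have smaller
    variances than the data errors themselves."""
    Vinv = np.linalg.inv(V)
    N_inv = np.linalg.inv(A.T @ Vinv @ A)   # parameter covariance C_x
    return V - A @ N_inv @ A.T
```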

Post-fit residuals
Notice that we can compute the covariance matrix of the post-fit residuals (in general a large matrix).
Eqn 1 on the previous slide gives an equation of the form v = Be; why can we not compute the actual errors with e = B⁻¹v?
B is a singular matrix, which has no unique inverse (there is in fact one inverse that would generate the true errors).
Note: in this case, singularity does not mean that there is no inverse; it means there are an infinite number of inverses.

Example
Consider the case shown below (the original slide's figure, a straight-line fit with the last data point at a much later time, is not reproduced here): when a rate of change is estimated, the slope estimate will absorb the error in the last data point, particularly as Δt increases. (Try this case yourself; a sketch follows.)
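A quick numerical version of this experiment (synthetic data; the times and noise level are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.array([0.0, 1.0, 2.0, 3.0, 20.0])   # last observation at large Δt
A = np.column_stack([np.ones_like(t), t])  # offset + rate (slope) model
e = rng.normal(0.0, 1.0, t.size)           # "true" errors in the data
x_hat, *_ = np.linalg.lstsq(A, e, rcond=None)
v = e - A @ x_hat                          # post-fit residuals
print(e[-1], v[-1])  # the isolated point's error is largely absorbed by the slope
```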

Summary
Estimation methods
–Restricted to basically linear estimation problems (also non-linear problems that are nearly linear)
–Restricted to parametric, overdetermined estimation
–Concepts in estimation:
  Mathematical models
  Statistical models
  Least squares and maximum likelihood estimation
  Covariance matrix of estimated parameters
  Statistical properties of post-fit residuals