Rudolf Žitný, Ústav procesní a zpracovatelské techniky ČVUT FS 2010 Error analysis Statistics Regression Experimental methods E181101 EXM8.

Presentation transcript:

Rudolf Žitný, Ústav procesní a zpracovatelské techniky ČVUT FS 2010 Error analysis Statistics Regression Experimental methods E181101 EXM8

EVALUATION OF EXPERIMENTAL DATA EXM8. Distribution of errors. It is assumed that the true value of a quantity is distorted by n small effects of the same magnitude (positive or negative). Superposition of these effects results in a random error with a binomial distribution. As the number of effects goes to infinity, this distribution reduces to the normal (Gauss) distribution of errors

$\varphi(\epsilon) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{\epsilon^2}{2\sigma^2}\right),$

where $\sigma$ is the mean quadratic error, called the standard deviation, and $\varphi(\epsilon)\,d\epsilon$ is the probability that an error lies within the interval $(\epsilon, \epsilon+d\epsilon)$. The probability that an error lies somewhere within the range $(-\epsilon, \epsilon)$ is the integral distribution (Gauss integral)

$P(\epsilon) = \int_{-\epsilon}^{\epsilon}\varphi(\epsilon')\,d\epsilon'.$

Example: $P(\sigma) = 0.68$, $P(3\sigma) = 0.997$.
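These two probabilities can be checked numerically; a minimal sketch using SciPy's standard normal distribution (the 0.68 and 0.997 values quoted above are recovered):

from scipy.stats import norm

# probability that an error lies within (-k*sigma, +k*sigma) for the normal distribution of errors
for k in (1.0, 2.0, 3.0):
    p = norm.cdf(k) - norm.cdf(-k)          # integral of the Gauss density over (-k*sigma, k*sigma)
    print(f"P({k:.0f} sigma) = {p:.4f}")    # 0.6827, 0.9545, 0.9973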

EVALUATION OF EXPERIMENTAL DATA EXM8. The arithmetic average of a repeated measurement,

$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,$

is the best estimate of the expected value. The standard deviation of a single measurement can be estimated using this average (the best estimate of the standard deviation):

$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}.$

Note that n-1, and not just n, is used in the denominator. This is because the expected value is not known and is estimated by the arithmetic average, so the number of degrees of freedom is reduced by one (n-1). The set of recorded data $x_1, \dots, x_n$ also makes it possible to evaluate the standard deviation of the calculated arithmetic average (which is obviously smaller than the standard deviation of the measured data):

$s_{\bar{x}} = \frac{s}{\sqrt{n}}.$
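A minimal sketch of these three estimates with NumPy (the sample values are invented):

import numpy as np

x = np.array([20.1, 19.8, 20.3, 20.0, 19.9])   # invented repeated measurements of one quantity

x_bar = x.mean()                    # arithmetic average, best estimate of the expected value
s     = x.std(ddof=1)               # standard deviation of a single measurement (n - 1 in denominator)
s_avg = s / np.sqrt(x.size)         # standard deviation of the arithmetic average
print(x_bar, s, s_avg)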

Measuring chain EXM8. The measured quantity x (e.g. temperature) is usually measured by a chain of different instruments (e.g. a thermocouple and a voltage amplifier) with generally nonlinear characteristics (for a thermocouple, voltage is not an exactly linear function of temperature); each instrument transforms the input signal according to its characteristic, and some random errors are always superposed:

x (actual value) -> thermocouple f(x) -> $f(x)+\epsilon_{fi}$ -> amplifier g -> $y_i = g(f(x)+\epsilon_{fi})+\epsilon_{gi}$

Here $\epsilon_{fi}$ is random noise with normal distribution (standard deviation $\sigma_f$) and zero mean value, and $\epsilon_{gi}$ is random noise with normal distribution ($\sigma_g$) and zero mean value.

Measuring chain EXM8. Expected mean value for n repeated experiments (Taylor expansion of g about f(x), using the fact that the mean value of the noise is zero and that $\sigma_f^2$ is the variance of the thermocouple noise):

$\bar{y} \approx g(f(x)) + \frac{1}{2}\, g''(f(x))\,\sigma_f^2.$

Therefore the mean value (even for a very large number of experiments n) is distorted whenever the function g(x) is nonlinear, and this deviation is proportional to the variance of the errors applied to the instrument f(x) (the thermocouple).

Measuring chain EXM8. Expected variance of y for repeated measurement of the same value x:

$\sigma_y^2 \approx \left[g'(f(x))\right]^2 \sigma_f^2 + \sigma_g^2,$

where $\sigma_f^2$ is the variance of the thermocouple noise and $\sigma_g^2$ is the variance of the amplifier noise.
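Both results can be verified by a quick Monte Carlo experiment; this sketch assumes purely illustrative characteristics f and g and invented noise levels:

import numpy as np

rng = np.random.default_rng(0)

f   = lambda x: 0.04 * x                  # hypothetical thermocouple characteristic (mV per deg C)
g   = lambda u: 25.0 * u + 2.0 * u**2     # hypothetical (nonlinear) amplifier characteristic
dg  = lambda u: 25.0 + 4.0 * u            # g'
d2g = lambda u: 4.0                       # g''

x, sigma_f, sigma_g, n = 100.0, 0.2, 0.1, 1_000_000
u = f(x) + rng.normal(0.0, sigma_f, n)    # thermocouple output corrupted by noise eps_f
y = g(u) + rng.normal(0.0, sigma_g, n)    # amplifier output corrupted by noise eps_g

# bias of the mean value:  mean(y) - g(f(x))  ~  0.5 * g''(f(x)) * sigma_f**2
print(y.mean() - g(f(x)), 0.5 * d2g(f(x)) * sigma_f**2)
# variance of y:           var(y)  ~  g'(f(x))**2 * sigma_f**2 + sigma_g**2
print(y.var(), dg(f(x))**2 * sigma_f**2 + sigma_g**2)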

Taylor expansion EXM8. Taylor expansion of a function of M variables:

$f(x_1+\Delta x_1, \dots, x_M+\Delta x_M) = f(x_1,\dots,x_M) + \sum_{j=1}^{M}\frac{\partial f}{\partial x_j}\,\Delta x_j + \frac{1}{2}\sum_{j=1}^{M}\sum_{k=1}^{M}\frac{\partial^2 f}{\partial x_j\,\partial x_k}\,\Delta x_j \Delta x_k + \dots$

Variance of evaluated property EXM8. Variance of a property $y=f(x_1,\dots,x_M)$ calculated from M measured values (independent variables):

$\sigma_y^2 \approx \sum_{j=1}^{M}\left(\frac{\partial f}{\partial x_j}\right)^2 \sigma_{x_j}^2.$
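A sketch of this formula evaluated numerically (partial derivatives approximated by central differences; the rectangle-area example at the end is only illustrative):

import numpy as np

def propagate_variance(f, x, sigma, h=1e-6):
    """Variance of y = f(x_1 .. x_M) for independent inputs with standard deviations sigma."""
    x = np.asarray(x, dtype=float)
    var = 0.0
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = h * max(1.0, abs(x[j]))
        dfdx = (f(x + dx) - f(x - dx)) / (2.0 * dx[j])   # central-difference partial derivative
        var += (dfdx * sigma[j]) ** 2
    return var

# illustrative use: area of a rectangle a*b with uncertain side lengths
area = lambda v: v[0] * v[1]
print(np.sqrt(propagate_variance(area, [2.0, 3.0], [0.01, 0.02])))   # ~0.05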

Variance of evaluated property EXM8. Proof of the variance of the arithmetic average: with $\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i$ we have $\partial\bar{x}/\partial x_i = 1/n$, so

$\sigma_{\bar{x}}^2 = \sum_{i=1}^{n}\left(\frac{1}{n}\right)^2\sigma^2 = \frac{\sigma^2}{n}, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}.$

Variance of evaluated property EXM8. Example related to the project of a capillary rheometer (syringe): evaluation of viscosity from capillary rheometer data. The Hagen-Poiseuille relation for laminar flow gives

$\mu = \frac{\pi D^4 \Delta p}{128\, L\, \dot{V}},$

where the measured quantities are D (diameter of the needle), L (length of the needle), $\Delta p$ (pressure drop) and $\dot V$ (volumetric flowrate), with variances $\sigma_D^2$, $\sigma_L^2$, $\sigma_{\Delta p}^2$, $\sigma_{\dot V}^2$. The variance of the individual measured parameters can be estimated from repeated measurement, e.g. from repeated measurement of the needle length L; sometimes the variance can be estimated from instrument data sheets.
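Applying the previous propagation formula to $\mu$ gives, in relative form, $(\sigma_\mu/\mu)^2 = 16\,(\sigma_D/D)^2 + (\sigma_{\Delta p}/\Delta p)^2 + (\sigma_L/L)^2 + (\sigma_{\dot V}/\dot V)^2$. A minimal sketch with invented needle and flow data:

import numpy as np

# hypothetical needle / flow data: value and standard deviation for each measured quantity
D,  sD  = 0.6e-3, 0.01e-3      # needle diameter [m]
L,  sL  = 30e-3,  0.5e-3       # needle length [m]
dp, sdp = 20e3,   0.5e3        # pressure drop [Pa]
V,  sV  = 1.0e-8, 0.05e-8      # volumetric flowrate [m3/s]

mu = np.pi * D**4 * dp / (128 * L * V)          # Hagen-Poiseuille viscosity

# relative variance from the propagation formula (note the factor 16 = 4**2 at the diameter)
rel_var = 16*(sD/D)**2 + (sdp/dp)**2 + (sL/L)**2 + (sV/V)**2
print(mu, mu * np.sqrt(rel_var))                # viscosity and its standard deviation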

Data Regression EXM8. [Illustration: Hopper]

Data Regression EXM8. Regression analysis: approximation of the relationship between independent variables x (there can be more than one independent variable) and the dependent variable y. Let us assume that the data are arranged in a matrix of observation points (each row describes one point x, y); for example this is a matrix with two columns and N rows if there is one independent variable x and N pairs x, y. The relationship y(x) is represented by a model

$y = f(x, \vec{p}),$

where $\vec{p}=(p_1,\dots,p_M)$ is the vector of model parameters. Regression analysis looks for the model parameters giving the best approximation of the observation points, i.e. minimising the goal function (chi-square criterion)

$\chi^2 = \sum_{i=1}^{N}\left(\frac{y_i - f(x_i,\vec{p})}{\sigma_i}\right)^2,$

where $\sigma_i$ is the standard deviation of the dependent variable y at the point $x_i$.

Data Regression EXM8. A good model f(x,p) (one that reasonably approximates the unknown relationship y(x)) should give a chi-square value of about N-M (N is the number of points and M is the number of identified parameters p). Another indicator of the quality of the selected regression model is the correlation index r,

$r = \sqrt{1 - \frac{\sum_i \left(y_i - f(x_i,\vec p)\right)^2}{\sum_i \left(y_i - \bar y\right)^2}}.$

The correlation index r = 1 in the case of an absolutely perfect fit (the model reproduces all observation points exactly); the worst case is r = 0, because then the function f would be better approximated by a constant, the mean value of the dependent variable.
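A minimal sketch of both indicators for a straight-line model fitted to invented data (here N = 5, M = 2, so chi-square should come out near 3):

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])            # synthetic observations
sigma = np.full_like(y, 0.2)                       # assumed standard deviation of each y

f = lambda x, p: p[0] + p[1] * x                   # model with M = 2 parameters
p = np.polyfit(x, y, 1)[::-1]                      # least-squares estimate of p (intercept, slope)

chi2 = np.sum(((y - f(x, p)) / sigma) ** 2)        # chi-square criterion, compare with N - M
r = np.sqrt(1 - np.sum((y - f(x, p))**2) / np.sum((y - y.mean())**2))   # correlation index
print(chi2, r)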

Linear regression analysis EXM8. In this case only models f(x,p) which are linear with respect to the model parameters $p_k$ are used:

$f(x,\vec p) = \sum_{m=1}^{M} p_m\, g_m(x),$

where $g_m(x)$ are design functions, which can be selected more or less arbitrarily; they must only be linearly independent. Example: $g_1 = 1$, $g_2 = x$, $g_3 = x^2$, ... For N observation points the design matrix A is defined as $A_{ij} = g_j(x_i)$.

Linear regression analysis EXM8. The parameters p are identified in such a way that the sum of squares

$S = \sum_{i=1}^{N}\left(y_i - \sum_{m=1}^{M} p_m\, g_m(x_i)\right)^2$

is minimised (this corresponds to minimisation of the chi-square criterion for the case that the standard deviation of the error is the same for all data points). The sum of squares can also be expressed in matrix notation as a scalar product of two vectors (residual vectors of differences between the measured values of y and the prediction of the linear model):

$S = (\vec y - \mathbf{A}\vec p)^T(\vec y - \mathbf{A}\vec p),$

where $\mathbf{A}$ is the design matrix (a function of the $x_i$) and $\vec y$ is the vector of data $y_i$.

Linear regression analysis EXM8. Looking for the minimum of the sum of squares (zero gradient at the minimum),

$\frac{\partial S}{\partial p_k} = 0, \quad k=1,\dots,M \qquad\Rightarrow\qquad \mathbf{C}\,\vec p = \vec b, \qquad \mathbf{C} = \mathbf{A}^T\mathbf{A}, \qquad \vec b = \mathbf{A}^T\vec y.$

This is a system of linear algebraic equations for the unknown vector of model parameters p, with the square M x M matrix $\mathbf{C}$ and the right-hand-side vector $\vec b$. This system is called the NORMAL EQUATIONS and the inverted matrix $\mathbf{C}^{-1}$ is called the COVARIANCE MATRIX.

Linear regression analysis EXM8. The covariance matrix $\mathbf{C}^{-1}$ is closely related to the probable uncertainties (standard deviations) of the calculated parameters:

$\sigma_{p_k}^2 = \sigma_y^2\,[\mathbf{C}^{-1}]_{kk},$

where $\sigma_y^2$ is the variance of the measured data and $\sigma_{p_k}^2$ the variance of the k-th calculated parameter. Proof: from the normal equations $\vec p = \mathbf{C}^{-1}\mathbf{A}^T\vec y$, so that $\mathrm{cov}(\vec p) = \mathbf{C}^{-1}\mathbf{A}^T\,\mathrm{cov}(\vec y)\,\mathbf{A}\,\mathbf{C}^{-1} = \sigma_y^2\,\mathbf{C}^{-1}$ when $\mathrm{cov}(\vec y)=\sigma_y^2\mathbf{I}$.
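A minimal numerical sketch of the normal equations and the resulting parameter uncertainties (quadratic design functions and synthetic data; in practice np.linalg.lstsq, which uses an orthogonal decomposition, is numerically safer than forming A^T A explicitly):

import numpy as np

x = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5])
y = np.array([1.0, 1.8, 3.1, 4.9, 7.2, 9.8])             # synthetic observations

A = np.column_stack([np.ones_like(x), x, x**2])          # design matrix, g1 = 1, g2 = x, g3 = x^2

C = A.T @ A                                              # square M x M matrix
b = A.T @ y                                              # right-hand-side vector
p = np.linalg.solve(C, b)                                # NORMAL EQUATIONS -> model parameters

cov = np.linalg.inv(C)                                   # covariance matrix
N, M = A.shape
sigma_y2 = np.sum((y - A @ p) ** 2) / (N - M)            # estimate of the variance of measured data
sigma_p  = np.sqrt(sigma_y2 * np.diag(cov))              # standard deviations of the parameters
print(p, sigma_p)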

NonLinear regression EXM8. In this case the model cannot be decomposed into a linear combination of design functions and has a general form $y=f(x,p_1,\dots,p_M)$; this model can be an algebraic expression, but it can also be, for example, the solution of a differential equation. The parameters p are again calculated from the requirement that the sum of squares of deviations (or the weighted sum of squares) is the least possible. The Marquardt-Levenberg method is based upon linearisation of the optimised model $f(x_i,p_1,\dots,p_M)=f_i$, where $x_i$ are the independent variables of the i-th observation point and $p_1,\dots,p_M$ are the optimised parameters of the model:

$f(x_i, \vec p + \Delta\vec p) \approx f_i + \sum_{k=1}^{M}\frac{\partial f_i}{\partial p_k}\,\Delta p_k,$

where $\Delta p_k$ is the increment of the k-th parameter in the iteration step. The least squares criterion used for optimisation is

$S = \sum_{i=1}^{N} w_i\left(y_i - f_i - \sum_{k=1}^{M}\frac{\partial f_i}{\partial p_k}\,\Delta p_k\right)^2,$

where $w_i$ is the weight of the i-th data point.

NonLinear regression EXM8. Each iteration of the Marquardt-Levenberg method consists of solving linear algebraic equations for the vector of parameter increments,

$\mathbf{C}\,\Delta\vec p = \vec b, \qquad C_{jk}=\sum_{i=1}^{N} w_i \frac{\partial f_i}{\partial p_j}\frac{\partial f_i}{\partial p_k}, \qquad b_k=\sum_{i=1}^{N} w_i\,(y_i-f_i)\,\frac{\partial f_i}{\partial p_k}.$

Convergence of the iterations is improved by an artificial increase of the diagonal of the C matrix, by adding a constant $\lambda$ to $C_{11}, C_{22},\dots,C_{MM}$. For very large $\lambda$ the algorithm reduces to the steepest descent method (gradient method), slow but reliable, while for very small $\lambda$ the iterations approach the Gauss (Gauss-Newton) method, faster but sensitive to the initial estimate of the searched parameters.
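A minimal sketch of this iteration for an illustrative exponential model with synthetic data; halving lambda after a successful step and increasing it tenfold after a failed one is one common strategy, not necessarily the one used in the original slides:

import numpy as np

x = np.linspace(0, 5, 30)
y = 2.0 * np.exp(-0.7 * x) + np.random.default_rng(1).normal(0, 0.02, x.size)  # synthetic data

def f(x, p):                       # nonlinear model y = p0 * exp(-p1 * x)
    return p[0] * np.exp(-p[1] * x)

def jac(x, p):                     # derivatives of f with respect to p0 and p1
    e = np.exp(-p[1] * x)
    return np.column_stack([e, -p[0] * x * e])

p, lam = np.array([1.0, 1.0]), 1e-3
for _ in range(50):
    r = y - f(x, p)                                   # residuals
    J = jac(x, p)
    C = J.T @ J                                       # matrix of the linearised normal equations
    b = J.T @ r
    dp = np.linalg.solve(C + lam * np.eye(p.size), b) # damped increment of the parameters
    if np.sum((y - f(x, p + dp))**2) < np.sum(r**2):
        p, lam = p + dp, lam * 0.5                    # success: accept step, reduce damping
    else:
        lam *= 10.0                                   # failure: increase damping (towards steepest descent)
print(p)                                              # ~ [2.0, 0.7]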

Example Regression EXM8. Regression model

Example Calibration EXM8 (1 of 5). Simultaneous calibration of multiple thermocouples or pressure transducers: a reference temperature $T_r$ and the voltages $U_1, U_2, \dots, U_M$ of the M probes are recorded through an A/D converter. Consider linear characteristics of the individual channels,

$T = k_j u_j + t_j, \qquad j = 1,\dots,M.$

The measured data are represented by a matrix of observation points:

Reference temperature | Voltage 1 | Voltage 2 | ... | Voltage M
$T_{r1}$ | $u_{11}$ | $u_{12}$ | ... | $u_{1M}$
$T_{r2}$ | $u_{21}$ | $u_{22}$ | ... | $u_{2M}$
... | ... | ... | ... | ...
$T_{rN}$ | $u_{N1}$ | $u_{N2}$ | ... | $u_{NM}$

Example Calibration EXM8 (2 of 5). Calibration means identification of the constants $k_j$ and $t_j$ of all transducers. As long as the reference values $T_r$ are accurate (recorded by a standard instrument with better accuracy than the calibrated probes), the problem is quite simple: the parameters $k_j, t_j$ can be identified by linear regression for each probe separately, i.e. by a straight-line fit of the evaluated temperature versus the recorded voltage, $T_{ri} \approx k_j u_{ij} + t_j$, solving the 2 x 2 normal equations $[[C]]\,(k_j, t_j)^T = [B]$.

Example Calibration EXM8 COVARIANCE (3 of 5). The covariance matrix $\mathbf{C}^{-1}$ is the inverted matrix of the normal equations. With the estimated variances of the transducers,

$\sigma_j^2 \approx \frac{1}{N-2}\sum_{i=1}^{N}\left(T_{ri} - k_j u_{ij} - t_j\right)^2,$

the variances of $k_j$ and $t_j$ are

$\sigma_{k_j}^2 = \sigma_j^2\,[\mathbf{C}^{-1}]_{11}, \qquad \sigma_{t_j}^2 = \sigma_j^2\,[\mathbf{C}^{-1}]_{22}.$

Example Calibration EXM8 SIMULTANEOUS (4 of 5). The actual temperature $T_i$ in the i-th measurement is not exactly the recorded reference value $T_{ri}$ (due to inaccuracy of the standard instrument), but $T_i$ is the same for all probes, assuming good mixing of the liquid in the bath (this assumption is fulfilled even better with simultaneous calibration of pressure transducers). The question is how to use this information to improve the accuracy of the identified constants. The best estimate of the actual bath temperature in the i-th measurement is based upon minimisation of the deviation with respect to $T_{ri}$ and of the deviations from the temperatures predicted by the M probes (assuming that their characteristics are known):

$S_i = w\,(T_i - T_{ri})^2 + \sum_{j=1}^{M}\left(T_i - k_j u_{ij} - t_j\right)^2 \rightarrow \min,$

where w is the weight of the standard instrument (select high w if the accuracy of the standard is high). The result is

$T_i = \frac{w\,T_{ri} + \sum_{j=1}^{M}\left(k_j u_{ij} + t_j\right)}{w + M}.$

Example Calibration EXM8 SIMULTANEOUS (5 of 5). The best approximation of the bath temperature $T_i$ can then be used instead of $T_{ri}$, and the whole procedure repeated until convergence is achieved. Data: $w, u_{ij}, T_{ri}$. In each pass: identify $k_j, t_j$ (j = 1,2,...,M) by linear regression, compute the best estimates $T_i$ (i = 1,2,...,N), and test convergence; if not converged, repeat with the updated temperatures, otherwise the result is $k_j, t_j$ (j = 1,2,...,M).
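A sketch of this iterative procedure under the stated linear-characteristic assumption; the voltages, reference temperatures, probe constants and the weight w below are all invented for illustration:

import numpy as np

rng = np.random.default_rng(2)
N, M, w = 40, 3, 4.0                                   # measurements, probes, weight of the standard
T_true = np.linspace(20, 90, N)
k_true, t_true = np.array([25.0, 24.0, 26.0]), np.array([1.0, -0.5, 0.3])
u  = (T_true[:, None] - t_true) / k_true + rng.normal(0, 0.02, (N, M))   # synthetic voltages
Tr = T_true + rng.normal(0, 0.3, N)                    # reference temperatures (imperfect standard)

T = Tr.copy()
for _ in range(50):
    k, t = np.empty(M), np.empty(M)
    for j in range(M):                                 # straight-line fit T ~ k_j * u_j + t_j per probe
        k[j], t[j] = np.polyfit(u[:, j], T, 1)
    T_new = (w * Tr + (k * u + t).sum(axis=1)) / (w + M)   # best estimate of the bath temperature
    if np.max(np.abs(T_new - T)) < 1e-6:               # convergence test
        break
    T = T_new
print(k, t)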

Example Laser scanner (1 of 2) EXM8. How to identify a circle, given a set of points $x_i, y_i$? The circle is described by its centre $(x_0, y_0)$ and radius R, and the least-squares fit minimises the sum of squared deviations of the measured points from the circle, e.g.

$S = \sum_i\left(\sqrt{(x_i-x_0)^2+(y_i-y_0)^2} - R\right)^2.$

But the conditions $\partial S/\partial x_0 = \partial S/\partial y_0 = \partial S/\partial R = 0$ form a system of 3 nonlinear equations.

Example Laser scanner (2 of 2) EXM8. How to identify a circle, given a set of points $x_i, y_i$? Three points define a circle, so you can evaluate triplets of points $(x_1,y_1), (x_2,y_2), (x_3,y_3)$ (for n = 100 points this is $\binom{100}{3} = 161\,700$ radii) and estimate the radius by averaging.
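A sketch of the triplet approach: the circumcircle of three points is obtained from the perpendicular-bisector equations, and the estimates are averaged. The points are synthetic, only a random sample of the triplets is used rather than all combinations, and nearly collinear triplets are skipped because they amplify the measurement noise:

import numpy as np
from itertools import combinations

rng = np.random.default_rng(3)
theta = rng.uniform(0.0, 2.0 * np.pi, 100)
pts = np.column_stack([4.0 + 2.5 * np.cos(theta), -1.0 + 2.5 * np.sin(theta)])
pts += rng.normal(0.0, 0.01, pts.shape)                 # synthetic scanner points on a noisy circle

def circumcircle(p1, p2, p3):
    """Centre and radius of the circle through three points (perpendicular-bisector equations)."""
    A = 2.0 * np.array([p2 - p1, p3 - p1])
    b = np.array([p2 @ p2 - p1 @ p1, p3 @ p3 - p1 @ p1])
    c = np.linalg.solve(A, b)
    return c, np.linalg.norm(c - p1)

centres, radii = [], []
for i, j, k in list(combinations(range(len(pts)), 3))[::500]:   # a subsample of the 161 700 triplets
    p1, p2, p3 = pts[i], pts[j], pts[k]
    d1, d2 = p2 - p1, p3 - p1
    if abs(d1[0] * d2[1] - d1[1] * d2[0]) < 0.5:        # skip nearly collinear triplets
        continue
    c, r = circumcircle(p1, p2, p3)
    centres.append(c)
    radii.append(r)
print(np.mean(centres, axis=0), np.mean(radii))         # estimated centre and radius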