Rudolf Žitný, Ústav procesní a zpracovatelské techniky ČVUT FS 2010 Error analysis Statistics Regression Experimental methods E181101 EXM8.


1 Rudolf Žitný, Ústav procesní a zpracovatelské techniky ČVUT FS 2010 Error analysis Statistics Regression Experimental methods E181101 EXM8

2 Evaluation of Experimental Data EXM8. Distribution of errors. It is assumed that the true value of a quantity is distorted by n small effects of the same magnitude (positive or negative). The superposition of these effects results in a random error with a binomial distribution. As the number of effects goes to infinity, this distribution reduces to the normal (Gauss) distribution of errors $\varphi(\varepsilon) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{\varepsilon^2}{2\sigma^2}\right)$, where $\sigma$ is the mean quadratic error, called the standard deviation. The probability that an error lies within the range $(-\Delta, \Delta)$ is the integral distribution (Gauss integral) $P(\Delta) = \int_{-\Delta}^{\Delta} \varphi(\varepsilon)\,d\varepsilon$. Example: $P(\sigma)=0.68$, $P(3\sigma)=0.997$. $\varphi(\varepsilon)\,d\varepsilon$ is the probability that an error lies within the interval $(\varepsilon, \varepsilon+d\varepsilon)$.
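The quoted probabilities can be checked numerically. The sketch below assumes a zero-mean normal error and uses the identity P(Δ) = erf(Δ/(σ√2)); the helper name `prob_within` is hypothetical and not part of the lecture.

```python
import math

def prob_within(delta, sigma=1.0):
    """Probability that a zero-mean normal error lies in (-delta, delta)."""
    return math.erf(delta / (sigma * math.sqrt(2.0)))

# Probability of an error within 1, 2 and 3 standard deviations
for k in (1, 2, 3):
    print(f"P({k} sigma) = {prob_within(float(k)):.4f}")
```

The values for one and three standard deviations round to the 0.68 and 0.997 quoted on the slide.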

3 Evaluation of Experimental Data EXM8. The arithmetic average of repeated measurements, $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, is the best estimate of the expected value. The standard deviation of a single measurement can be estimated using this average: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2}$ (the best estimate of the standard deviation). Note that n-1, and not n, appears in the denominator. This is because the expected value is not known but only estimated by the arithmetic average, so the number of degrees of freedom is reduced by 1 (to n-1). The set of recorded data $x_1, \dots, x_n$ also makes it possible to evaluate the standard deviation of the calculated arithmetic average, $s_{\bar{x}} = s/\sqrt{n}$ (which is obviously smaller than the standard deviation of the measured data).
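A minimal sketch of these three estimates, using only the Python standard library. The readings are made-up illustrative numbers and `summarize` is a hypothetical helper name.

```python
import math
import statistics

def summarize(samples):
    """Return (mean, std of a single measurement, std of the mean)."""
    n = len(samples)
    mean = statistics.fmean(samples)
    s = statistics.stdev(samples)          # sample std, n-1 in the denominator
    return mean, s, s / math.sqrt(n)

# Hypothetical repeated readings of the same quantity
readings = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2]
mean, s, s_mean = summarize(readings)
print(mean, s, s_mean)
```

Note that `statistics.stdev` already divides by n-1, whereas `statistics.pstdev` divides by n.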

4 Measuring chain EXM8. The measured quantity x (e.g. temperature) is usually measured by a chain of instruments (e.g. a thermocouple and a voltage amplifier) with generally nonlinear characteristics (for a thermocouple, voltage is not an exactly linear function of temperature). Each instrument transforms its input signal according to its characteristic, and some random error is always superposed. Chain: x (actual value) → thermocouple f(x) → $f(x)+\varepsilon_{fi}$ → amplifier g → $y_i = g(f(x)+\varepsilon_{fi}) + \varepsilon_{gi}$. Here $\varepsilon_{fi}$ is random noise with normal distribution (standard deviation $\sigma_f$) and zero mean value, and $\varepsilon_{gi}$ is random noise with normal distribution (standard deviation $\sigma_g$) and zero mean value.

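5 Measuring chain EXM8. Expected mean value for n repeated experiments: $\bar{y} = \frac{1}{n}\sum_{i=1}^{n}\left[g(f(x)+\varepsilon_{fi})+\varepsilon_{gi}\right] \approx g(f(x)) + g'(f(x))\,\frac{1}{n}\sum_i \varepsilon_{fi} + \frac{1}{2}g''(f(x))\,\frac{1}{n}\sum_i \varepsilon_{fi}^2 + \frac{1}{n}\sum_i \varepsilon_{gi}$. The mean value of the noise is zero and $\frac{1}{n}\sum_i \varepsilon_{fi}^2 \to \sigma_f^2$ (the variance of the thermocouple noise), so $\bar{y} \approx g(f(x)) + \frac{1}{2}g''(f(x))\,\sigma_f^2$. Therefore the mean value is distorted, even for a very large number of experiments n, if the function g is nonlinear, and this deviation is proportional to the variance of the errors of the instrument f(x) (thermocouple).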

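6 Measuring chain EXM8. Expected variance of y for repeated measurement of the same value x: $\sigma_y^2 \approx \left[g'(f(x))\right]^2 \sigma_f^2 + \sigma_g^2$, where $\sigma_f^2$ is the variance of the thermocouple noise and $\sigma_g^2$ is the variance of the amplifier noise.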
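The shift of the mean and the variance of the output can be illustrated by a small Monte Carlo simulation. This is only a sketch under assumed characteristics: f(x) = x and g(u) = u² are hypothetical stand-ins for the thermocouple and amplifier, chosen because g'' = 2 makes the second-order prediction for the mean, g(f(x)) + σ_f², easy to check. The first-order prediction for the variance is (g')²σ_f² + σ_g² with g' = 2x.

```python
import random

def simulate_chain(x, sigma_f, sigma_g, n=200_000, seed=0):
    """Simulate y_i = g(f(x) + eps_f) + eps_g; return (mean, variance) of y."""
    rng = random.Random(seed)
    f = lambda t: t          # hypothetical first-stage characteristic
    g = lambda u: u * u      # hypothetical nonlinear second stage
    ys = [g(f(x) + rng.gauss(0.0, sigma_f)) + rng.gauss(0.0, sigma_g)
          for _ in range(n)]
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / (n - 1)
    return mean, var

x, sigma_f, sigma_g = 2.0, 0.1, 0.05
mean_y, var_y = simulate_chain(x, sigma_f, sigma_g)
# Simulated mean vs. prediction g(f(x)) + 0.5 * g'' * sigma_f^2
print(mean_y, x**2 + sigma_f**2)
# Simulated variance vs. prediction (g')^2 * sigma_f^2 + sigma_g^2
print(var_y, (2 * x) ** 2 * sigma_f**2 + sigma_g**2)
```

The simulated mean stays above g(f(x)) = 4 no matter how many samples are drawn, which is the systematic distortion described above.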

7 Taylor expansion EXM8. Taylor expansion of function of M variables
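Written out to second order (the order needed for the mean value of the measuring chain), the expansion about a point $(x_1,\dots,x_M)$ takes the following standard form. The symbols $\Delta x_m$ for the deviations are notation assumed here, since the slide's own formula is not preserved in the transcript.

```latex
f(x_1+\Delta x_1,\dots,x_M+\Delta x_M) \approx f(x_1,\dots,x_M)
  + \sum_{m=1}^{M} \frac{\partial f}{\partial x_m}\,\Delta x_m
  + \frac{1}{2}\sum_{m=1}^{M}\sum_{k=1}^{M}
    \frac{\partial^2 f}{\partial x_m\,\partial x_k}\,\Delta x_m\,\Delta x_k
```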

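8 Variance of evaluated property EXM8. Variance of a property $f(x_1,\dots,x_M)$ calculated from M measured values (independent variables): $\sigma_f^2 = \sum_{m=1}^{M}\left(\frac{\partial f}{\partial x_m}\right)^2 \sigma_{x_m}^2$.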

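9 Variance of evaluated property EXM8. Proof of the variance of the arithmetic average: the average $\bar{x} = \frac{1}{n}\sum_i x_i$ is a function of n independent measurements with derivatives $\partial\bar{x}/\partial x_i = 1/n$, and every $x_i$ has the same variance $\sigma^2$, so $\sigma_{\bar{x}}^2 = \sum_{i=1}^{n} \frac{1}{n^2}\,\sigma^2 = \frac{\sigma^2}{n}$.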

10 Variance of evaluated property EXM8. Example related to the capillary rheometer (syringe) project: evaluation of viscosity from the following capillary rheometer data. Geometry: D is the diameter of the needle, L is the length of the needle, $\Delta p$ is the pressure drop, $\dot{V}$ is the volumetric flow rate. The variance of the individual measured parameters can be estimated from repeated measurement, e.g. from repeated measurement of the needle length L. The variance can sometimes be estimated from instrument data sheets. Hagen-Poiseuille relation for laminar flow: $\mu = \frac{\pi D^4 \Delta p}{128\, L\, \dot{V}}$.
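As an illustration, the sketch below propagates assumed uncertainties of D, L, Δp and the flow rate into the viscosity, using first-order variance propagation for independent variables with numerically evaluated derivatives. All numerical values are invented for the example, and `propagate` is a hypothetical helper, not part of the lecture.

```python
import math

def viscosity(D, L, dp, V):
    """Hagen-Poiseuille: mu = pi * D^4 * dp / (128 * L * V), laminar flow."""
    return math.pi * D**4 * dp / (128.0 * L * V)

def propagate(func, values, sigmas, rel_step=1e-6):
    """First-order std of func(*values), central-difference derivatives."""
    var = 0.0
    for m, (v, s) in enumerate(zip(values, sigmas)):
        h = abs(v) * rel_step
        up = list(values); up[m] = v + h
        dn = list(values); dn[m] = v - h
        dfdx = (func(*up) - func(*dn)) / (2.0 * h)
        var += (dfdx * s) ** 2          # independent variables: variances add
    return math.sqrt(var)

# Hypothetical values in SI units: D [m], L [m], dp [Pa], V [m^3/s]
values = (0.5e-3, 30e-3, 2.0e4, 1.0e-8)
sigmas = (5e-6, 0.1e-3, 200.0, 2e-10)
mu = viscosity(*values)
sigma_mu = propagate(viscosity, values, sigmas)
print(mu, sigma_mu, sigma_mu / mu)
```

Because D enters in the fourth power, its relative error is multiplied by 4, so the needle diameter tends to dominate the uncertainty of the viscosity unless it is measured much more precisely than the other quantities.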

11 Data Regression EXM8. Hopper

12 Data Regression EXM8. Regression analysis: approximation of the relationship between independent variables x (there can be more than one) and a dependent variable y. Assume that the data are arranged in a matrix of observation points (each row describes one point x, y). For example, this is a matrix with two columns and N rows if there is one independent variable x and N pairs x, y. The relationship y(x) is represented by the model $y = f(x, p)$, where $p = (p_1,\dots,p_M)$ is the vector of model parameters. Regression analysis looks for the model parameters giving the best approximation of the observation points, i.e. minimising the goal function (chi-square criterion) $\chi^2 = \sum_{i=1}^{N}\left(\frac{y_i - f(x_i,p)}{\sigma_i}\right)^2$, where $\sigma_i$ is the standard deviation of the dependent variable y at the point $x_i$.

13 Data Regression EXM8. A good model f(x,p) (one that reasonably approximates the unknown relationship y(x)) should give a chi-square value of about N-M (N is the number of points and M is the number of identified parameters p). Another indicator of the quality of the selected regression model is the correlation index $r = \sqrt{1 - \frac{\sum_i (y_i - f(x_i,p))^2}{\sum_i (y_i - \bar{y})^2}}$. The correlation index is r=1 for an absolutely perfect fit (the model reproduces all observation points exactly). The worst case is r=0, because then the model is no better than a constant, the mean value $\bar{y}$ of the dependent variable.
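Both quality indicators are easy to compute once a model and its parameters are given. The sketch below assumes the usual definition of the correlation index, r² = 1 - Σ(y_i - f_i)² / Σ(y_i - ȳ)² (the slide's own formula is not preserved in the transcript). The function names and the straight-line model are hypothetical.

```python
import math

def chi_square(xs, ys, sigmas, model, params):
    """Sum of squared residuals, each scaled by its standard deviation."""
    return sum(((y - model(x, params)) / s) ** 2
               for x, y, s in zip(xs, ys, sigmas))

def correlation_index(xs, ys, model, params):
    """r = sqrt(1 - SS_residual / SS_total), clipped at zero."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - model(x, params)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))

# Hypothetical straight-line model y = p[0] + p[1] * x and made-up data
line = lambda x, p: p[0] + p[1] * x
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
sigmas = [0.2] * len(xs)
print(chi_square(xs, ys, sigmas, line, (1.0, 2.0)))
print(correlation_index(xs, ys, line, (1.0, 2.0)))
```

With N = 4 points and M = 2 parameters, a chi-square value far above 2 would suggest either a poor model or underestimated σ_i.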

14 Linear regression analysis EXM8. In this case only models f(x,p) that are linear with respect to the model parameters $p_m$ are used: $f(x,p) = \sum_{m=1}^{M} p_m\, g_m(x)$. Here $g_m(x)$ are design functions, which can be selected more or less arbitrarily; they only have to be linearly independent. Example: $g_1=1$, $g_2=x$, $g_3=x^2$, … For N observation points the design matrix A is defined as $A_{ij} = g_j(x_i)$.

15 Linear regression analysis EXM8. The parameters p are identified in such a way that the sum of squares $s^2 = \sum_{i=1}^{N}\left(y_i - \sum_{m=1}^{M} p_m\, g_m(x_i)\right)^2$ is minimised (this corresponds to minimisation of the chi-square criterion when the standard deviation of all data points is the same). The sum of squares can also be expressed in matrix notation as a scalar product of two vectors (the residual vector of differences between the measured values of y and the prediction of the linear model): $s^2 = (y - A p)^T (y - A p)$, where A is the design matrix (a function of $x_i$) and y is the vector of data $y_i$.

16 Linear regression analysis EXM8. Looking for the minimum of the sum of squares (zero gradient at the minimum): $\frac{\partial s^2}{\partial p} = -2 A^T (y - A p) = 0$, i.e. $C\,p = b$ with $C = A^T A$ (a square M x M matrix) and $b = A^T y$ (the right-hand side vector). This is a system of linear algebraic equations for the unknown vector of model parameters p. The system is called the NORMAL EQUATIONS and the inverse matrix $C^{-1}$ is called the COVARIANCE MATRIX.
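A minimal pure-Python sketch of this procedure: build the design matrix from a list of design functions, assemble C = AᵀA and b = Aᵀy, and solve C p = b by Gaussian elimination. The function names are hypothetical. For larger or ill-conditioned problems a library least-squares solver would normally be preferred over forming the normal equations explicitly.

```python
def design_matrix(xs, funcs):
    """A[i][j] = g_j(x_i)."""
    return [[g(x) for g in funcs] for x in xs]

def solve(C, b):
    """Solve C p = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(C, b)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    p = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        tail = sum(M[r][c] * p[c] for c in range(r + 1, n))
        p[r] = (M[r][n] - tail) / M[r][r]
    return p

def linear_fit(xs, ys, funcs):
    """Least-squares parameters from the normal equations (A^T A) p = A^T y."""
    A = design_matrix(xs, funcs)
    m = len(funcs)
    C = [[sum(row[j] * row[k] for row in A) for k in range(m)] for j in range(m)]
    b = [sum(row[j] * y for row, y in zip(A, ys)) for j in range(m)]
    return solve(C, b)

# Design functions g1 = 1, g2 = x, g3 = x^2 and noise-free quadratic data
funcs = [lambda x: 1.0, lambda x: x, lambda x: x * x]
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 0.5 * x * x for x in xs]
p = linear_fit(xs, ys, funcs)
print(p)
```

Since the data are generated without noise from a quadratic, the fit recovers the generating coefficients up to rounding error.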

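17 Linear regression analysis EXM8. The covariance matrix $C^{-1}$ is closely related to the probable uncertainties (standard deviations) of the calculated parameters: $\sigma_{p_k}^2 = \sigma_y^2\,(C^{-1})_{kk}$, where $\sigma_y^2$ is the variance of the measured data and $\sigma_{p_k}^2$ is the variance of the calculated parameter $p_k$. Proof: since $p = C^{-1} A^T y$ is a linear function of the data, variance propagation gives $\sigma_{p_k}^2 = \sigma_y^2 \sum_i \left[(C^{-1}A^T)_{ki}\right]^2 = \sigma_y^2\,(C^{-1} A^T A\, C^{-1})_{kk} = \sigma_y^2\,(C^{-1})_{kk}$.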
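For a straight line y = p1 + p2 x the matrix C is 2 x 2 and can be inverted in closed form, which makes the relation between the covariance matrix and the parameter uncertainties easy to see. In this sketch the variance of the data is not taken from a data sheet but estimated from the residuals with N - M degrees of freedom, which is an assumption added here. `fit_line` is a hypothetical name and the data are made up.

```python
import math

def fit_line(xs, ys):
    """Fit y = p1 + p2*x; return (p1, p2) and their standard deviations."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Normal equations: C = [[n, sx], [sx, sxx]], b = [sy, sxy]
    det = n * sxx - sx * sx
    cinv = [[sxx / det, -sx / det],
            [-sx / det, n / det]]                 # covariance matrix C^-1
    p1 = cinv[0][0] * sy + cinv[0][1] * sxy
    p2 = cinv[1][0] * sy + cinv[1][1] * sxy
    # Assumption: data variance estimated from residuals, N - 2 degrees of freedom
    var_y = sum((y - p1 - p2 * x) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    sig1 = math.sqrt(var_y * cinv[0][0])
    sig2 = math.sqrt(var_y * cinv[1][1])
    return (p1, p2), (sig1, sig2)

# Made-up data scattered around a straight line
params, stds = fit_line([0.0, 1.0, 2.0, 3.0, 4.0], [1.1, 2.9, 5.2, 6.8, 9.1])
print(params, stds)
```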

18 Nonlinear regression EXM8. In this case the model cannot be decomposed into a linear combination of design functions and has the general form $y = f(x, p_1,\dots,p_M)$. The model can be an algebraic expression, but it can also be, for example, the solution of a differential equation. The parameters p are again calculated from the requirement that the sum of squares of deviations (or the weighted sum of squares) is the least possible. The Marquardt-Levenberg method is based on linearisation of the optimised model $f(x_i, p_1,\dots,p_M) = f_i$, where $x_i$ are the independent variables of the i-th observation point and $p_1,\dots,p_M$ are the optimised parameters of the model. The least squares criterion used for optimisation is $s^2 = \sum_{i=1}^{N} w_i \left(y_i - f_i - \sum_{k=1}^{M} \frac{\partial f_i}{\partial p_k}\,\Delta p_k\right)^2$, where $\Delta p_k$ is the increment of the k-th parameter in an iteration step and $w_i$ is the weight of the i-th data point.

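19 Nonlinear regression EXM8. Each iteration of the Marquardt-Levenberg method consists of solving a system of linear algebraic equations for the vector of parameter increments, $C\,\Delta p = b$, with $C_{jk} = \sum_i w_i \frac{\partial f_i}{\partial p_j}\frac{\partial f_i}{\partial p_k}$ and $b_j = \sum_i w_i\,(y_i - f_i)\,\frac{\partial f_i}{\partial p_j}$. Convergence of the iterations is improved by an artificial increase of the diagonal of the matrix C, adding a constant $\lambda$ to $C_{11}, C_{22}, \dots, C_{MM}$. For very large $\lambda$ the algorithm reduces to the steepest descent (gradient) method, which is slow but reliable, while for very small $\lambda$ the iterations approach the Gauss method, which is faster but sensitive to the initial estimate of the parameters.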
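The iteration can be sketched for a two-parameter model in a few lines. Everything specific here is an assumption made for illustration: the exponential model, unit weights, the factor 10 used to raise or lower the damping constant, and the stopping thresholds. Steps that reduce the sum of squares are accepted and the damping is lowered (towards the Gauss method); otherwise the damping is raised (towards steepest descent).

```python
import math

def model(x, p):
    """Hypothetical model f(x, p) = p[0] * exp(-p[1] * x)."""
    return p[0] * math.exp(-p[1] * x)

def jacobian(x, p):
    """Partial derivatives of the model with respect to p[0] and p[1]."""
    e = math.exp(-p[1] * x)
    return (e, -p[0] * x * e)

def sum_sq(xs, ys, p):
    return sum((y - model(x, p)) ** 2 for x, y in zip(xs, ys))

def fit_lm(xs, ys, p, lam=1e-3, iters=200):
    """Damped least-squares iterations for two parameters, unit weights."""
    s = sum_sq(xs, ys, p)
    for _ in range(iters):
        c00 = c01 = c11 = b0 = b1 = 0.0
        for x, y in zip(xs, ys):
            j0, j1 = jacobian(x, p)
            r = y - model(x, p)
            c00 += j0 * j0; c01 += j0 * j1; c11 += j1 * j1
            b0 += j0 * r; b1 += j1 * r
        a00, a11 = c00 + lam, c11 + lam      # constant added to the diagonal
        det = a00 * a11 - c01 * c01
        dp = ((a11 * b0 - c01 * b1) / det, (a00 * b1 - c01 * b0) / det)
        trial = (p[0] + dp[0], p[1] + dp[1])
        s_trial = sum_sq(xs, ys, trial)
        if s_trial < s:                      # accept step, reduce damping
            p, s, lam = trial, s_trial, lam / 10.0
            if abs(dp[0]) + abs(dp[1]) < 1e-12:
                break
        else:                                # reject step, increase damping
            lam *= 10.0
            if lam > 1e10:
                break
    return p

# Noise-free synthetic data generated with p = (2.0, 0.7)
xs = [0.5 * i for i in range(11)]
ys = [model(x, (2.0, 0.7)) for x in xs]
p_fit = fit_lm(xs, ys, (1.0, 0.3))
print(p_fit)
```

In practice a tested library routine is usually a safer choice than a hand-written loop. The sketch is only meant to show where the damping constant enters.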

20 Example Regression EXM8. Regression model

21 Example Calibration EXM8. Simultaneous calibration of multiple thermocouples or pressure transducers. [Figure: probes with voltages U_1, U_2, …, U_M connected to an A/D converter; temperature T and reference temperature T_r.] Consider linear characteristics of the individual channels, $T = k_j U_j + t_j$. Measured data are represented by a matrix of observation points:

Reference temperature | Voltage 1 | Voltage 2 | … | Voltage M
T_r1 | u_11 | u_12 | … | u_1M
T_r2 | u_21 | u_22 | … | u_2M
… | … | … | … | …
T_rN | u_N1 | u_N2 | … | u_NM

(1 of 5)

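22 Example Calibration EXM8. Calibration means identification of the constants k_j and t_j of all transducers. If the reference values T_r are accurate (recorded by a standard instrument with better accuracy than that of the calibrated probes), the problem is quite simple: the parameters k_j, t_j can be identified by linear regression for each probe separately, $T_{ri} \approx k_j u_{ij} + t_j$ (with $u_{ij}$ the recorded voltage and the right-hand side the evaluated temperature). The normal equations for probe j are $\begin{pmatrix} \sum_i u_{ij}^2 & \sum_i u_{ij} \\ \sum_i u_{ij} & N \end{pmatrix} \begin{pmatrix} k_j \\ t_j \end{pmatrix} = \begin{pmatrix} \sum_i u_{ij} T_{ri} \\ \sum_i T_{ri} \end{pmatrix}$, where the matrix on the left is [[C]] and the right-hand side vector is [B]. (2 of 5)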

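23 Example Calibration COVARIANCE EXM8. The covariance matrix $C^{-1}$ is the inverse of the matrix C of the normal equations. Estimated variances of the transducers: $\sigma_j^2 = \frac{1}{N-2}\sum_{i=1}^{N} (T_{ri} - k_j u_{ij} - t_j)^2$. Variances of k_j and t_j: $\sigma_{k_j}^2 = \sigma_j^2\,(C^{-1})_{11}$, $\sigma_{t_j}^2 = \sigma_j^2\,(C^{-1})_{22}$. (3 of 5)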

24 Example Calibration SIMULTANEOUS EXM8. The actual temperature T_i in the i-th measurement is not exactly the recorded reference value T_ri (due to the inaccuracy of the standard instrument), but T_i is the same for all probes, assuming good mixing of the liquid in the bath (this assumption is fulfilled even better in simultaneous calibration of pressure transducers). The question is how to use this information to improve the accuracy of the identified constants. The best estimate of the actual bath temperature in the i-th measurement is based on minimising the deviation from T_ri together with the deviations from the temperatures predicted by the M probes (assuming that their characteristics are known): $s_i^2 = w\,(T_i - T_{ri})^2 + \sum_{j=1}^{M} (T_i - k_j u_{ij} - t_j)^2$, where w is the weight of the standard instrument (select a high w if the accuracy of the standard is high). Setting $\partial s_i^2/\partial T_i = 0$ gives the result $T_i = \frac{w\,T_{ri} + \sum_{j=1}^{M} (k_j u_{ij} + t_j)}{w + M}$. (4 of 5)

25 Example Calibration SIMULTANEOUS EXM8. The best approximation of the bath temperature T_i can be used instead of T_ri, and the whole procedure is repeated until convergence is achieved. [Flowchart: data w, u_ij, T_ri → loop over probes j=1,2,…,M → loop over measurements i=1,2,…,N → loop over probes j=1,2,…,M → converged? no: repeat; yes: result k_j, t_j.] (5 of 5)
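The loop can be sketched as follows. The sketch assumes linear characteristics T = k_j u + t_j and takes the bath-temperature update as the minimiser of w (T_i - T_ri)² + Σ_j (T_i - k_j u_ij - t_j)², i.e. T_i = (w T_ri + Σ_j (k_j u_ij + t_j)) / (w + M). The function names, the convergence test on the change of T_i, and all numbers are assumptions made for the example.

```python
def fit_probe(us, Ts):
    """Least-squares fit of T = k*u + t for one probe (normal equations)."""
    n = len(us)
    su = sum(us)
    suu = sum(u * u for u in us)
    sT = sum(Ts)
    suT = sum(u * T for u, T in zip(us, Ts))
    det = n * suu - su * su
    k = (n * suT - su * sT) / det
    t = (suu * sT - su * suT) / det
    return k, t

def calibrate(u, T_ref, w, tol=1e-10, max_iter=1000):
    """u[i][j]: voltage of probe j in measurement i; T_ref[i]: reference value."""
    N, M = len(u), len(u[0])
    T = list(T_ref)                       # start from the reference readings
    for _ in range(max_iter):
        # Step 1: regression for each probe against the current estimates T_i
        coeffs = [fit_probe([u[i][j] for i in range(N)], T) for j in range(M)]
        # Step 2: update T_i as a weighted mean of reference and probe predictions
        T_new = [(w * T_ref[i]
                  + sum(k * u[i][j] + t for j, (k, t) in enumerate(coeffs)))
                 / (w + M) for i in range(N)]
        change = max(abs(a - b) for a, b in zip(T_new, T))
        T = T_new
        if change < tol:
            break
    return coeffs, T

# Made-up example: two probes, reference readings with small errors
true_T = [20.0, 40.0, 60.0, 80.0]
u_data = [[(T - 1.0) / 2.0, (T + 3.0) / 0.5] for T in true_T]
T_ref = [20.3, 39.8, 60.1, 79.9]
coeffs, T_est = calibrate(u_data, T_ref, w=1.0)
print(coeffs, T_est)
```

With a very large w the estimates T_i stay at the reference readings and the procedure reduces to separate regression for each probe.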

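26 Example Laser scanner (1 of 2) EXM8. How to identify a circle, given a set of points x_i, y_i? [Figure: points (x_i, y_i) scattered around a circle with centre (x_0, y_0).] A least squares fit minimises the sum of squared distances of the points from the circle, $s^2 = \sum_i \left(\sqrt{(x_i-x_0)^2 + (y_i-y_0)^2} - R\right)^2$, by setting the derivatives with respect to x_0, y_0 and R to zero. But this is a system of 3 nonlinear equations.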

27 Example Laser scanner (2 of 2) EXM8. How to identify a circle, given a set of points x_i, y_i? [Figure: circle with centre (x_0, y_0) passing through points (x_1, y_1), (x_2, y_2), (x_3, y_3).] Three points define a circle, so you can evaluate all triplets of points (for n=100 this gives 161700 radii) and estimate the radius as their average.
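A short sketch of the triplet approach, using the circumradius formula R = abc / (4 × area) for the circle through three points. The helper names and the synthetic points are hypothetical. Triplets that are exactly collinear would cause a division by zero, and nearly collinear triplets can give very large radii for noisy data, so a real evaluation might filter or down-weight them.

```python
import itertools
import math

def circumradius(p1, p2, p3):
    """Radius of the circle through three points: R = a*b*c / (4 * area)."""
    a = math.dist(p2, p3)
    b = math.dist(p1, p3)
    c = math.dist(p1, p2)
    # Twice the triangle area from the cross product of two edge vectors
    area2 = abs((p2[0] - p1[0]) * (p3[1] - p1[1])
                - (p3[0] - p1[0]) * (p2[1] - p1[1]))
    return a * b * c / (2.0 * area2)

def mean_triplet_radius(points):
    """Average circumradius over all triplets; returns (radius, triplet count)."""
    radii = [circumradius(*t) for t in itertools.combinations(points, 3)]
    return sum(radii) / len(radii), len(radii)

# Synthetic points lying exactly on a circle of radius 5 centred at (1, 2)
pts = [(1.0 + 5.0 * math.cos(2 * math.pi * k / 12),
        2.0 + 5.0 * math.sin(2 * math.pi * k / 12)) for k in range(12)]
radius, count = mean_triplet_radius(pts)
print(radius, count)
print(math.comb(100, 3))   # number of triplets for n = 100 points
```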

