1
Statistical Methods For Engineers
ChE 477 (UO Lab) Brigham Young University
2
Error of Measured Variable
Some definitions:
x̄ = sample mean
s = sample standard deviation
μ = exact mean
σ = exact standard deviation
As the sampling becomes larger: x̄ → μ, s → σ, t chart → z chart
Not valid if bias exists (e.g., calibration is off)
Questions: Several measurements are obtained for a single variable (e.g., T). What is the true value? How confident are you? Is the value different on different days?
3
t-test in Excel
The one-tailed t-test function in Excel is: =T.INV(α,r)
Remember to put in α/2 for two-tailed tests (i.e., 0.025 for a 95% confidence interval).
The two-tailed t-test function in Excel is: =T.INV.2T(α,r)
where α is the probability (i.e., 0.05 for a 95% confidence interval in a two-tailed test) and r is the number of degrees of freedom.
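For readers working outside Excel, the two functions can be mimicked with a short standard-library sketch. The helper names `t_pdf`, `t_cdf`, `t_inv` and `t_inv_2t` are hypothetical, and the numerical integration plus bisection is only one simple way to invert the distribution (a statistics package would normally be used instead). Note that Excel's T.INV is a left-tail inverse, so T.INV(0.025, r) is negative and its magnitude is the two-tailed value.

```python
import math

def t_pdf(x, dof):
    """Probability density of Student's t with `dof` degrees of freedom."""
    log_norm = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
                - 0.5 * math.log(dof * math.pi))
    return math.exp(log_norm - (dof + 1) / 2 * math.log1p(x * x / dof))

def t_cdf(x, dof, steps=1000):
    """P(T <= x), using symmetry about zero and Simpson's rule on [0, |x|]."""
    h = abs(x) / steps
    total = t_pdf(0.0, dof) + t_pdf(abs(x), dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_inv(prob, dof):
    """Left-tail inverse by bisection, analogous to Excel's T.INV(prob, dof)."""
    lo, hi = -200.0, 200.0  # assumed bracket; too narrow for extreme tails at dof = 1
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, dof) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_inv_2t(alpha, dof):
    """Two-tailed inverse, analogous to Excel's T.INV.2T(alpha, dof)."""
    return -t_inv(alpha / 2, dof)

# 95% two-tailed value for 4 degrees of freedom, computed both ways
print(round(t_inv_2t(0.05, 4), 3))
print(round(abs(t_inv(0.025, 4)), 3))
```

Both printed values should agree, which is the point of the α/2 reminder above.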
4
T-test example
μ = exact mean
40.9 is sample mean
40.9 ± …: …% confident μ is somewhere in this range
40.9 ± …: …% confident μ is somewhere in this range
40.9 ± …: …% confident μ is somewhere in this range
Alpha is .05 for first case, … for second, and .005 for third case.
What is α for each case?
5
Histogram Approximates a Probability Density Function (pdf)
6
All Statistical Info is in pdf
Probabilities are determined by integration.
Moments (means, variances, etc.) are obtained by simple means.
Most likely outcomes are determined from values.
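A minimal sketch of the first two points, assuming a normal pdf with made-up parameters (the mean 40.9 is borrowed from the earlier example, the standard deviation 0.5 is purely illustrative). The helper names `normal_pdf` and `integrate` are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal probability density function."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def integrate(func, a, b, steps=10000):
    """Composite trapezoid rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h

mu, sigma = 40.9, 0.5                      # illustrative values, not from the slides
lo, hi = mu - 10 * sigma, mu + 10 * sigma  # wide enough to stand in for (-inf, inf)

# Probability = area under the pdf over the interval of interest
prob_within_1sd = integrate(lambda x: normal_pdf(x, mu, sigma), mu - sigma, mu + sigma)

# Moments = integrals of x (or (x - mean)^2) weighted by the pdf
mean = integrate(lambda x: x * normal_pdf(x, mu, sigma), lo, hi)
variance = integrate(lambda x: (x - mean) ** 2 * normal_pdf(x, mu, sigma), lo, hi)

print(f"P(|x - mu| < sigma) = {prob_within_1sd:.4f}")
print(f"mean = {mean:.3f}, variance = {variance:.3f}")
```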
7
Student’s t Distribution
Used to compute confidence intervals according to: μ = x̄ ± t(α/2, n−1) · s/√n
Assumes mean and variance estimated by sample values
8
Typical Numbers
Two-tailed analysis
Population mean and variance unknown
Estimation of population mean only
Calculated for 95% confidence interval
Based on number of data points, not degrees of freedom
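A table of this kind can be regenerated with a few lines. The sketch below is not the slide's own table: it assumes the standard two-tailed 95% t values from a printed t table, and the name `ci_multiplier` is hypothetical. It lists, for n data points, the t value with n − 1 degrees of freedom and the factor t/√n that converts a sample standard deviation into the half-width of the confidence interval on the mean.

```python
import math

# Standard two-tailed 95% t values, keyed by degrees of freedom (from a t table)
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        9: 2.262, 19: 2.093, 29: 2.045}

def ci_multiplier(n):
    """Factor on the sample std dev giving the 95% CI half-width for n points."""
    return T_95[n - 1] / math.sqrt(n)

print("n points   t(0.025, n-1)   t/sqrt(n)")
for dof in sorted(T_95):
    n = dof + 1
    print(f"{n:8d}   {T_95[dof]:13.3f}   {ci_multiplier(n):9.3f}")
```

With only two or three points the multiplier is very large, which is why small samples give wide intervals even when the scatter looks modest.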
9
Conversion of SD to CI Example
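Five data points with sample mean and standard deviation of 714 and 108, respectively. With n = 5 there are 4 degrees of freedom, so t(0.025, 4) = 2.776. The estimated population mean and 95% confidence interval are: μ = 714 ± 2.776 × 108/√5 ≈ 714 ± 134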
10
General Confidence Interval
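Degrees of freedom generally = n − p, where n is the number of data points and p is the number of parameters.
Confidence interval for a parameter θ given by: θ = θ̂ ± t(α/2, n−p) · s_θ, where θ̂ is the estimate and s_θ is its standard error.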
11
Linear Fit Confidence Interval
For intercept: For slope:
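For a straight-line fit y = b0 + b1·x, the usual textbook expressions use the residual standard deviation s (with n − 2 degrees of freedom) and Sxx = Σ(x − x̄)²: the slope standard error is s/√Sxx and the intercept standard error is s·√(1/n + x̄²/Sxx). The sketch below assumes those standard forms, which may differ in notation from the slide's own formulas. The function name `linear_fit_with_ci` and the data are hypothetical.

```python
import math

def linear_fit_with_ci(x, y, t_crit):
    """Least-squares line with confidence half-widths for intercept and slope.

    t_crit is the two-tailed t value for n - 2 degrees of freedom.
    """
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Sum of the squares of the differences between data and fitted line
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))           # two parameters were estimated
    se_slope = s / math.sqrt(sxx)
    se_intercept = s * math.sqrt(1 / n + x_bar ** 2 / sxx)
    return {
        "intercept": intercept,
        "slope": slope,
        "intercept_half_width": t_crit * se_intercept,
        "slope_half_width": t_crit * se_slope,
    }

# Made-up data, seven points; t(0.025, 5) = 2.571 from a standard t table
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [3.1, 4.9, 7.2, 8.8, 11.1, 13.0, 14.9]
fit = linear_fit_with_ci(x, y, t_crit=2.571)
print(f"intercept = {fit['intercept']:.3f} ± {fit['intercept_half_width']:.3f}")
print(f"slope     = {fit['slope']:.3f} ± {fit['slope_half_width']:.3f}")
```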
12
Sum of the squares of the difference
13
An Example
Assume you collect the seven data points shown at the right, which represent the measured relationship between temperature and a signal (current) from a sensor. You want to know how to determine the temperature from the current.
Current/A Temperature/ºC 2.5 5 7.5 26.621 10 12.5 15
14
First Plot the Data
15
Fit Data and Determine Residuals
16
Determine Model Parameters
Residuals are an easy and accurate means of determining whether a model is appropriate and of estimating the overall variation (standard deviation) of the data. For a least-squares fit that includes an intercept, the average of the residuals should always be zero. These formulas apply only to a linear regression. Similar formulas apply to any polynomial, and approximate formulas apply to any equation.
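The two residual properties mentioned here can be checked directly. This is a small sketch on made-up data (the slide's own data are not used); `fit_line` and `residual_std` are hypothetical helper names, and the n − 2 divisor assumes a two-parameter straight-line model.

```python
import math

def fit_line(x, y):
    """Explicit least-squares straight line; returns (intercept, slope)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def residual_std(residuals, n_params):
    """Standard deviation of the data about the model, n - p degrees of freedom."""
    return math.sqrt(sum(r * r for r in residuals) / (len(residuals) - n_params))

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]       # made-up independent variable
y = [3.1, 4.9, 7.2, 8.8, 11.1, 13.0, 14.9]    # made-up measurements

intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

mean_residual = sum(residuals) / len(residuals)  # zero apart from round-off
s = residual_std(residuals, n_params=2)

print(f"mean residual = {mean_residual:.2e}")
print(f"residual standard deviation = {s:.3f}")
```

A plot of the residuals against x is the usual next step: a visible trend or curvature suggests the model form is not appropriate.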
17
Determine Confidence Interval
18
Two typical datasets
19
Straight-line Regression
Estimate | Std Error | t-Statistic | P-Value
intercept: e e-10
slope: e e e-8

Estimate | Std Error | t-Statistic | P-Value
intercept: e e-14
slope: e e e-10
20
Prediction Bands
95% confidence interval for the correct line
21
Linear vs. Nonlinear Models
Linear and nonlinear refer to the coefficients, not the forms of the independent variable. The derivative of a linear model with respect to a parameter does not depend on any parameters. The derivative of a nonlinear model with respect to a parameter depends on one or more of the parameters.
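The derivative test can be tried numerically. In this sketch a quadratic in x stands in for a model that is linear in its parameters and a·exp(b·x) for one that is not. Both models, the parameter values and the helper `d_dparam` are illustrative choices, not taken from the slides.

```python
import math

def d_dparam(model, params, index, x, h=1e-6):
    """Central-difference derivative of model(x, params) w.r.t. params[index]."""
    up, down = list(params), list(params)
    up[index] += h
    down[index] -= h
    return (model(x, up) - model(x, down)) / (2 * h)

def quadratic(x, p):
    """Linear in the parameters, although nonlinear in x."""
    return p[0] + p[1] * x + p[2] * x * x

def exponential(x, p):
    """Nonlinear in the parameters: p[1] sits inside the exponential."""
    return p[0] * math.exp(p[1] * x)

x = 2.0

# Linear model: d/dp2 equals x**2 whatever the parameter values are
lin_a = d_dparam(quadratic, [1.0, 2.0, 3.0], 2, x)
lin_b = d_dparam(quadratic, [5.0, -1.0, 0.5], 2, x)

# Nonlinear model: d/dp0 equals exp(p1 * x), so it changes with p1
non_a = d_dparam(exponential, [1.0, 0.5], 0, x)
non_b = d_dparam(exponential, [1.0, 1.0], 0, x)

print(f"quadratic:   {lin_a:.4f} vs {lin_b:.4f}")
print(f"exponential: {non_a:.4f} vs {non_b:.4f}")
```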
22
Linear vs. Nonlinear Models
23
Joint Confidence Region
Legend: linearized result; correct (unknown) result; nonlinear result
24
Extension
25
Graphical Summary
The linear and non-linear analyses are compared to the original data both as k vs. T and as ln(k) vs. 1/T. As seen in the upper graph, the linearized analysis fits the low-temperature data well, at the cost of poorer fits of high-temperature results. The non-linear analysis does a more uniform job of distributing the lack of fit. As seen in the lower graph, the linearized analysis evenly distributes errors in log space.
26
Parameter Estimates
Best estimate of parameters for a given set of data.
Linear equations: explicit equations; require no initial guess; depend only on measured values of dependent and independent variables; do not depend on values of any other parameters.
Nonlinear equations: implicit equations; require an initial guess; convergence often difficult; depend on data and on parameters.
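The contrast can be sketched on a simple exponential model y = a·exp(b·x). The linearized fit of ln(y) against x is explicit, while the fit to the raw data iterates from an initial guess. Everything here is illustrative: the data are synthetic with a fixed additive error pattern, the model is a stand-in rather than the rate expression from the slides, and Gauss-Newton is just one common iteration scheme (spreadsheet solvers use others).

```python
import math

def fit_line(x, y):
    """Explicit least-squares straight line; returns (intercept, slope)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def gauss_newton(x, y, a, b, max_iter=50, tol=1e-12):
    """Iteratively minimize sum((y - a*exp(b*x))**2), starting from guess (a, b)."""
    for _ in range(max_iter):
        j11 = j12 = j22 = g1 = g2 = 0.0
        for xi, yi in zip(x, y):
            e = math.exp(b * xi)
            r = yi - a * e                 # residual in the measured variable
            da, db = e, a * xi * e         # model derivatives w.r.t. a and b
            j11 += da * da
            j12 += da * db
            j22 += db * db
            g1 += da * r
            g2 += db * r
        det = j11 * j22 - j12 * j12
        step_a = (g1 * j22 - j12 * g2) / det   # solve the 2x2 normal equations
        step_b = (j11 * g2 - j12 * g1) / det
        a, b = a + step_a, b + step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b

def sse(x, y, a, b):
    """Sum of squared differences between measurements and model."""
    return sum((yi - a * math.exp(b * xi)) ** 2 for xi, yi in zip(x, y))

# Synthetic data: true a = 2.0, b = 0.3, plus a fixed additive error pattern
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
noise = [0.3, -0.2, 0.25, -0.3, 0.2, -0.25, 0.3, -0.2]
y = [2.0 * math.exp(0.3 * xi) + ei for xi, ei in zip(x, noise)]

# Linearized estimate: explicit, no initial guess needed
ln_a, b_lin = fit_line(x, [math.log(yi) for yi in y])
a_lin = math.exp(ln_a)

# Nonlinear estimate: iterative, started from the linearized result
a_nl, b_nl = gauss_newton(x, y, a_lin, b_lin)

print(f"linearized: a = {a_lin:.3f}, b = {b_lin:.4f}, SSE = {sse(x, y, a_lin, b_lin):.4f}")
print(f"nonlinear:  a = {a_nl:.3f}, b = {b_nl:.4f}, SSE = {sse(x, y, a_nl, b_nl):.4f}")
```

Using the linearized result only as the starting guess is a common compromise: it costs nothing, and the iteration then minimizes the error in the quantity that was actually measured.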
27
Parameter Estimates
Nonlinear estimate (blue) is closer to the correct value (black) than the linearized estimate (red). The blue line represents the 95% confidence region for the parameters. For a particular random set of data the linearized analysis could land closer to the correct answer, but this would be fortuitous.
28
For Parameter Estimates
In all cases, linear and nonlinear, fit what you measure, or more specifically the data that have normally distributed errors, rather than some transformation of this. Any nonlinear transformation (something other than adding or multiplying by a constant) changes the error distribution and invalidates much of the statistical theory behind the analysis. Standard packages are widely available for linear equations. Nonlinear analyses should be done on raw data (or data with normally distributed errors) and will require iteration, which Excel and other programs can handle.
29
Recommendations
Minimize the sum of squares of differences between the measurements and the model, written in terms of what you measured.
DO NOT linearize the model, i.e., make it look something like a straight-line model.
Confidence intervals for parameters can be misleading. Joint/simultaneous confidence regions are much more reliable.
Propagation of error formula grossly overestimates error.