Statistical Methods For Engineers

Statistical Methods For Engineers ChE 477 (UO Lab) Larry Baxter & Stan Harding Brigham Young University.
Presentation transcript:

Statistical Methods For Engineers ChE 477 (UO Lab) Brigham Young University

Error of Measured Variable: Several measurements are obtained for a single variable (e.g., T). Questions: What is the true value? How confident are you? Is the value different on different days? Some definitions: x̄ = sample mean, s = sample standard deviation, μ = exact mean, σ = exact standard deviation. As the sample becomes larger, x̄ → μ and s → σ. The t chart and z chart are not valid if bias exists (i.e., calibration is off).

t-test in Excel: The one-tailed t-test function in Excel is =T.INV(α,r). Remember to put in α/2 for two-tailed tests (i.e., 0.025 for a 95% confidence interval); T.INV returns the left-tail value, so the result is negative for small α and its magnitude is the one to use. The two-tailed t-test function in Excel is =T.INV.2T(α,r), where α is the probability (i.e., 0.05 for a 95% confidence interval in a two-tailed test) and r is the number of degrees of freedom.

T-test example: μ = exact mean; 40.9 is the sample mean. 40.9 ± 2.4: 90% confident μ is somewhere in this range. 40.9 ± 3.0: 95% confident μ is somewhere in this range. 40.9 ± 4.6: 99% confident μ is somewhere in this range. What is α for each case? The one-tailed value (α/2) is 0.05 for the first case, 0.025 for the second, and 0.005 for the third.

Histogram Approximates a Probability Density Function (pdf)

All Statistical Info is in pdf: Probabilities are determined by integration. Moments (means, variances, etc.) are obtained by simple means. Most likely outcomes are determined from values.
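
The statements on this slide can be written out for a continuous pdf. The symbol f(x) for the density is an assumption introduced here, not notation taken from the slides; these are the standard definitions.

```latex
% Probability of an outcome between a and b: integrate the pdf
P(a \le x \le b) = \int_a^b f(x)\,dx , \qquad \int_{-\infty}^{\infty} f(x)\,dx = 1

% Moments: mean and variance
\mu = \int_{-\infty}^{\infty} x\,f(x)\,dx , \qquad
\sigma^2 = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx

% Most likely outcome (mode): the x at which the pdf is largest
x_{\text{mode}} = \arg\max_x f(x)
```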

Student’s t Distribution: Used to compute confidence intervals according to μ = x̄ ± t·s/√n. Assumes the mean and variance are estimated by sample values.

Typical Numbers Two-tailed analysis Population mean and variance unknown Estimation of population mean only Calculated for 95% confidence interval Based on number of data points, not degrees of freedom

Conversion of SD to CI Example Five data points with sample mean and standard deviation of 714 and 108, respectively. The estimated population mean and 95% confidence interval is:
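
The conversion in this example can be sketched in a few lines of Python. This is a minimal sketch, assuming the usual two-sided interval x̄ ± t·s/√n with n − 1 degrees of freedom; the function name and the hard-coded t value (2.776, the tabulated two-tailed 95% value for 4 degrees of freedom) are assumptions introduced here, not taken from the slides.

```python
import math

def mean_confidence_interval(mean, std_dev, n, t_value):
    """Return (lower, upper, half_width) of the two-sided CI for the mean."""
    half_width = t_value * std_dev / math.sqrt(n)
    return mean - half_width, mean + half_width, half_width

# Values from the slide: five data points, sample mean 714, sample s = 108.
# Assumption: 2.776 is the tabulated two-tailed 95% t value for
# n - 1 = 4 degrees of freedom (what =T.INV.2T(0.05, 4) gives in Excel).
T_95_DF4 = 2.776

lower, upper, half_width = mean_confidence_interval(714.0, 108.0, 5, T_95_DF4)
print(f"714 +/- {half_width:.0f}  (from {lower:.0f} to {upper:.0f})")
```

With these inputs the half-width comes out at roughly 134, so the interval is about 714 ± 134.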

General Confidence Interval Degrees of freedom generally = n-p, where n is number of data points and p is number of parameters Confidence interval for parameter given by

Linear Fit Confidence Interval For intercept: For slope:

Sum of the squares of the difference
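
The equations for these three slides did not survive in the transcript. The block below gives the standard textbook forms for a straight-line fit y = b₀ + b₁x through n points; the symbols (θ, b₀, b₁, S_xx, SSE, s) are assumptions chosen here and may differ from the slides' notation.

```latex
% General parameter confidence interval, with n - p degrees of freedom
\theta_j = \hat{\theta}_j \pm t_{\alpha/2,\;n-p}\; s_{\hat{\theta}_j}

% Sums of squares for the straight line (p = 2)
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 , \qquad
SSE = \sum_{i=1}^{n} \bigl(y_i - b_0 - b_1 x_i\bigr)^2 , \qquad
s^2 = \frac{SSE}{n-2}

% Slope
b_1 \pm t_{\alpha/2,\;n-2}\; \frac{s}{\sqrt{S_{xx}}}

% Intercept
b_0 \pm t_{\alpha/2,\;n-2}\; s\,\sqrt{\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}}
```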

An Example: Assume you collect the seven data points shown below, which represent the measured relationship between temperature and a signal (current) from a sensor. You want to know how to determine the temperature from the current.
Current/A    Temperature/ºC
(missing)    8.22524
2.5          16.0571
5            21.6508
7.5          26.621
10           27.7787
12.5         38.0298
15           39.9741

First Plot the Data

Fit Data and Determine Residuals

Determine Model Parameters: Residuals are an easy and accurate means of determining whether the model is appropriate and of estimating the overall variation (standard deviation) of the data. The average of the residuals should always be zero. These formulas apply only to a linear regression. Similar formulas apply to any polynomial, and approximate formulas apply to any equation.
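
The fit and residual check described above can be sketched with the example data. This is a minimal sketch using the standard least-squares formulas for a straight line; the function and variable names are introduced here, and the first current value, which is missing from the transcript, is assumed to be 0 A based on the even 2.5 A spacing.

```python
import math

# Data from the example slide.
# ASSUMPTION: the first current value is missing from the transcript and is
# taken as 0 A, inferred from the even 2.5 A spacing of the other six.
current = [0.0, 2.5, 5.0, 7.5, 10.0, 12.5, 15.0]               # A
temperature = [8.22524, 16.0571, 21.6508, 26.621,
               27.7787, 38.0298, 39.9741]                      # deg C

def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

intercept, slope = fit_line(current, temperature)

# Residual = measured temperature minus fitted temperature.
residuals = [t - (intercept + slope * c) for c, t in zip(current, temperature)]
mean_residual = sum(residuals) / len(residuals)

# Standard deviation about the line, with n - 2 degrees of freedom.
std_dev = math.sqrt(sum(r * r for r in residuals) / (len(residuals) - 2))

print(f"T = {intercept:.3f} + {slope:.3f} * I")
print(f"mean residual = {mean_residual:.2e}, s = {std_dev:.3f}")
```

Under that assumption the mean residual is zero to rounding error, as the slide says it should be for a least-squares line with an intercept.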

Determine Confidence Interval
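
A sketch of the confidence-interval step for the same example follows. It assumes the standard straight-line formulas (slope standard error s/√S_xx, intercept standard error s·√(1/n + x̄²/S_xx)), a first current value of 0 A (missing from the transcript), and a hard-coded tabulated t value of 2.571 for a two-tailed 95% interval with 5 degrees of freedom. All names are introduced here for illustration.

```python
import math

# ASSUMPTION: first current value (missing from the transcript) taken as 0 A.
current = [0.0, 2.5, 5.0, 7.5, 10.0, 12.5, 15.0]               # A
temperature = [8.22524, 16.0571, 21.6508, 26.621,
               27.7787, 38.0298, 39.9741]                      # deg C

# ASSUMPTION: tabulated two-tailed 95% t value for n - 2 = 5 degrees of
# freedom (what =T.INV.2T(0.05, 5) gives in Excel).
T_95_DF5 = 2.571

n = len(current)
x_mean = sum(current) / n
y_mean = sum(temperature) / n
sxx = sum((x - x_mean) ** 2 for x in current)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(current, temperature))

slope = sxy / sxx
intercept = y_mean - slope * x_mean

# Standard deviation about the fitted line (n - 2 degrees of freedom).
sse = sum((y - intercept - slope * x) ** 2
          for x, y in zip(current, temperature))
s = math.sqrt(sse / (n - 2))

# Half-widths of the 95% confidence intervals for each parameter.
slope_half_width = T_95_DF5 * s / math.sqrt(sxx)
intercept_half_width = T_95_DF5 * s * math.sqrt(1.0 / n + x_mean ** 2 / sxx)

print(f"slope     = {slope:.3f} +/- {slope_half_width:.3f}")
print(f"intercept = {intercept:.2f} +/- {intercept_half_width:.2f}")
```

These are individual intervals for each parameter; they say nothing about how the slope and intercept estimates vary together.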

Two typical datasets

Straight-line Regression
            Estimate     Std Error    t-Statistic   P-Value
intercept   0.241001     1.733e-3     139.041       3.6504e-10
slope       -3.214e-4    5.525e-6     -58.1739      2.8398e-8
            Estimate     Std Error    t-Statistic   P-Value
intercept   0.239396     3.3021e-3    72.4977       9.13934e-14
slope       -3.264e-4    1.0284e-5    -31.7349      1.50394e-10

Prediction Bands 95% confidence interval for the correct line

Linear vs. Nonlinear Models Linear and nonlinear refer to the coefficients, not the forms of the independent variable. The derivative of a linear model with respect to a parameter does not depend on any parameters. The derivative of a nonlinear model with respect to a parameter depends on one or more of the parameters.

Linear vs. Nonlinear Models
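
The derivative test stated on the previous slide can be illustrated with two small cases. These examples are chosen here for illustration (the slide's own examples did not survive in the transcript); a and b are the parameters and x is the independent variable.

```latex
% Linear in the parameters, even though it is quadratic in x:
y = a + b\,x^2 , \qquad
\frac{\partial y}{\partial a} = 1 , \quad
\frac{\partial y}{\partial b} = x^2
\quad \text{(no parameter appears in either derivative)}

% Nonlinear in the parameters:
y = a\,e^{b x} , \qquad
\frac{\partial y}{\partial a} = e^{b x} , \quad
\frac{\partial y}{\partial b} = a\,x\,e^{b x}
\quad \text{(both derivatives depend on } a \text{ or } b\text{)}
```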

Joint Confidence Region (figure labels: linearized result, correct (unknown) result, nonlinear result)

Extension

Graphical Summary: The linear and nonlinear analyses are compared to the original data both as k vs. T and as ln(k) vs. 1/T. As seen in the upper graph, the linearized analysis fits the low-temperature data well, at the cost of poorer fits of the high-temperature results. The nonlinear analysis does a more uniform job of distributing the lack of fit. As seen in the lower graph, the linearized analysis evenly distributes errors in log space.

Parameter Estimates: Best estimate of parameters for a given set of data. Linear equations: explicit equations; require no initial guess; depend only on measured values of the dependent and independent variables; do not depend on values of any other parameters. Nonlinear equations: implicit equations; require an initial guess; convergence is often difficult; depend on data and on parameters.

Parameter Estimates Nonlinear estimate (blue) is closer to the correct value (black) than the linearized estimate (red). Blue line represents parameter 95% confidence region. It is possible that linear analysis could be closer to correct answer with any random set of data, but this would be fortuitous.

For Parameter Estimates In all cases, linear and nonlinear, fit what you measure, or more specifically the data that have normally distributed errors, rather than some transformation of this. Any nonlinear transformation (something other than adding or multiplying by a constant) changes the error distribution and invalidates much of the statistical theory behind the analysis. Standard packages are widely available for linear equations. Nonlinear analyses should be done on raw data (or data with normally distributed errors) and will require iteration, which Excel and other programs can handle.
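
The difference between fitting the raw data and fitting a transformation can be sketched for a rate constant of the form k = exp(a − b/T), where a = ln A and b = E/R in Arrhenius terms. Everything in this sketch is an assumption made for illustration: the data are synthetic (generated in the code, not taken from the slides), and the damped Gauss-Newton iteration is a stand-in for whatever solver Excel or another package would use.

```python
import math

# SYNTHETIC, illustrative data only (not from the slides): a known curve
# plus fixed additive errors in k, the quantity assumed to be measured.
T = [300.0, 350.0, 400.0, 450.0, 500.0]                        # K
errors = [0.02, -0.03, 0.05, -0.04, 0.03]
k = [1000.0 * math.exp(-3000.0 / t) + e for t, e in zip(T, errors)]

def model(a, b, t):
    return math.exp(a - b / t)

def sse(a, b):
    """Sum of squared differences in k, i.e. in what was measured."""
    return sum((ki - model(a, b, ti)) ** 2 for ki, ti in zip(k, T))

def linearized_fit():
    """Straight-line least squares on ln(k) vs 1/T (explicit, no guess)."""
    x = [1.0 / t for t in T]
    y = [math.log(ki) for ki in k]
    n = len(x)
    xm, ym = sum(x) / n, sum(y) / n
    sxx = sum((xi - xm) ** 2 for xi in x)
    sxy = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ym - slope * xm, -slope          # a = intercept, b = -slope

def nonlinear_fit(a, b, iterations=50):
    """Damped Gauss-Newton on the raw k values, starting from a guess."""
    for _ in range(iterations):
        f = [model(a, b, t) for t in T]
        r = [ki - fi for ki, fi in zip(k, f)]
        ja = f                                   # d(model)/da
        jb = [-fi / t for fi, t in zip(f, T)]    # d(model)/db
        a11 = sum(u * u for u in ja)
        a12 = sum(u * v for u, v in zip(ja, jb))
        a22 = sum(v * v for v in jb)
        g1 = sum(u * ri for u, ri in zip(ja, r))
        g2 = sum(v * ri for v, ri in zip(jb, r))
        det = a11 * a22 - a12 * a12
        da = (g1 * a22 - a12 * g2) / det         # solve the 2x2 normal
        db = (a11 * g2 - a12 * g1) / det         # equations by Cramer's rule
        step, current = 1.0, sse(a, b)
        while step > 1e-6 and sse(a + step * da, b + step * db) > current:
            step /= 2.0                          # halve until SSE stops rising
        if step <= 1e-6:
            break
        a, b = a + step * da, b + step * db
    return a, b

a_lin, b_lin = linearized_fit()
a_nl, b_nl = nonlinear_fit(a_lin, b_lin)         # linearized result as guess
print(f"linearized: a = {a_lin:.3f}, b = {b_lin:.1f}, SSE = {sse(a_lin, b_lin):.5f}")
print(f"nonlinear:  a = {a_nl:.3f}, b = {b_nl:.1f}, SSE = {sse(a_nl, b_nl):.5f}")
```

The linearized fit is used only to supply the initial guess that the nonlinear iteration needs. Because each step is accepted only if it does not increase the sum of squares in k, the nonlinear result is never worse than the linearized one by that measure.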

Recommendations: Minimize the sum of squares of differences between the measurements and the model, written in terms of what you measured. DO NOT linearize the model, i.e., make it look something like a straight-line model. Confidence intervals for parameters can be misleading. Joint/simultaneous confidence regions are much more reliable. The propagation of error formula grossly overestimates error.