Statistics Presentation


Statistics Presentation Ch En 475 Unit Operations Dr. Randy Lewis September 2007

Lesson 1: Statistics of Measured Variables

Quantifying variables (i.e., answering a question with a number)
1. Directly measure the variable, noted as a “measured” variable.
   ex. Temperature measured with a thermocouple
2. Calculate the variable from “measured” or “tabulated” variables, noted as a “calculated” variable.
   ex. Flow rate (Q) = ρ A v (each measured or tabulated)
Each has some error or uncertainty.

A. Error of Measured Variable
Questions: Several measurements are obtained for a single variable (i.e., T).
  What is the true value?
  How confident are you?
  Is the value different on different days?
Some definitions:
  x̄ = sample mean
  s = sample standard deviation
  μ = exact mean
  σ = exact standard deviation
As the sampling becomes larger: x̄ → μ and s → σ, so the t chart approaches the z chart.
Not valid if bias exists (i.e., calibration is off).

How do you determine the error?
Let’s assume a “normal” (Gaussian) distribution.
  Small sampling: s is known. Use t tables for this approach. We’ll pursue this approach.
  Large sampling (n > 30): σ is assumed. Use z tables for this approach. We don’t often have this much data.

Example
  n    Temp
  1    40.1
  2    39.2
  3    43.2
  4    47.2
  5    38.6
  6    40.4
  7    37.7

Standard Deviation Summary
Normal distribution:
  40.9 ± (3.27*1)  1s: 68.3% of data is within this range
  40.9 ± (3.27*2)  2s: 95.4% of data is within this range
  40.9 ± (3.27*3)  3s: 99.7% of data is within this range
If normal distribution is questionable, use Chebyshev's inequality:
  At least 50% of the data are within 1.4 s of the mean.
  At least 75% of the data are within 2 s of the mean.
  At least 89% of the data are within 3 s of the mean.
The above ranges don’t state how accurate the mean is! They only state what % of the data is located within each range.
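The summary above can be checked with a short script. This is a minimal sketch using only the Python standard library and the seven temperature readings from the Example slide; the helper names `normal_range` and `chebyshev_min_fraction` are illustrative, not from the slides.

```python
import math
import statistics

# Temperature readings from the Example slide
temps = [40.1, 39.2, 43.2, 47.2, 38.6, 40.4, 37.7]

mean = statistics.mean(temps)
s = statistics.stdev(temps)  # sample standard deviation (n - 1 in the denominator)

def normal_range(k):
    """Return (low, high) for mean +/- k*s."""
    return (mean - k * s, mean + k * s)

def chebyshev_min_fraction(k):
    """Minimum fraction of data within k standard deviations, for any distribution (k > 1)."""
    return 1.0 - 1.0 / k**2

print(f"mean = {mean:.1f}, s = {s:.2f}")
for k in (1, 2, 3):
    low, high = normal_range(k)
    print(f"{k}s range: {low:.1f} to {high:.1f}")
for k in (math.sqrt(2), 2, 3):
    print(f"Chebyshev, k = {k:.1f}: at least {chebyshev_min_fraction(k):.0%} of the data")
```

The 1.4 s entry in the Chebyshev list corresponds to k = √2, which is where the bound 1 − 1/k² equals 50%.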

Student t-test (gives confidence where μ (not data) is located)
  (figure: t distribution around the measured mean with 5% in each tail; the true mean lies within ± t)
  μ = x̄ ± t s/√n
  2-tail: 2α = 1 − probability
  r = n − 1 = 6 (degrees of freedom)
  Prob.   α      t       ± (t s/√n)
  90%     .05    1.943   2.40
  95%     .025   2.447   3.02
  99%     .005   3.707   4.58
See http://www.stat.tamu.edu/stat30x/zttables.php for expanded table

T-test Summary
μ = exact mean; 40.9 is the sample mean
  40.9 ± 2.40  90% confident μ is somewhere in this range
  40.9 ± 3.02  95% confident μ is somewhere in this range
  40.9 ± 4.58  99% confident μ is somewhere in this range
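The 90% and 95% ranges in the summary can be reproduced as follows. This is a sketch under stated assumptions: the critical t values are hardcoded from a standard two-tailed t table for r = 6 rather than computed, and the function name `t_half_width` is illustrative.

```python
import math
import statistics

# Temperature readings from the Example slide
temps = [40.1, 39.2, 43.2, 47.2, 38.6, 40.4, 37.7]

# Two-tailed critical t values for r = n - 1 = 6, copied from a t table (assumed inputs)
T_CRIT_DF6 = {0.90: 1.943, 0.95: 2.447}

def t_half_width(data, t_crit):
    """Half-width of the confidence interval for the mean: t * s / sqrt(n)."""
    n = len(data)
    return t_crit * statistics.stdev(data) / math.sqrt(n)

mean = statistics.mean(temps)
for prob, t_crit in T_CRIT_DF6.items():
    hw = t_half_width(temps, t_crit)
    print(f"{prob:.0%} confident the true mean is within {mean:.1f} ± {hw:.2f}")
```

Small differences in the last digit relative to the slide can come from rounding s to 3.27 before multiplying.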

Comparing averages of measured variables
Day 1 (x) vs. Day 2 (y): What is your confidence that μx ≠ μy (i.e., they are different)?
  Calculate t with r = nx + ny − 2 degrees of freedom.
  Larger t: more likely different.
  Use the 1-tail table: 1 − α confident they are different, α confident they are the same.
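The formula for t did not survive in this transcript. The sketch below assumes the common pooled (equal-variance) two-sample form, which is the one that goes with nx + ny − 2 degrees of freedom; the function name `pooled_t` and the readings are hypothetical and only illustrate the calculation.

```python
import math
import statistics

def pooled_t(x, y):
    """Pooled two-sample t statistic and its degrees of freedom (assumes equal variances)."""
    nx, ny = len(x), len(y)
    dof = nx + ny - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)) / dof
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(sp2 * (1 / nx + 1 / ny))
    return t, dof

# Hypothetical readings for two days (not from the slides)
day1 = [10.1, 10.4, 9.8, 10.2]
day2 = [10.9, 11.2, 10.7, 11.0]

t, dof = pooled_t(day1, day2)
print(f"t = {t:.2f} with {dof} degrees of freedom")
```

The magnitude of t is then looked up in a 1-tail t table at the given degrees of freedom to read off α.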

Review
We covered 3 topics. Review questions:
  Explain what the following refer to (and their usefulness):
    a) 40.9 ± 3.27 (1s)
    b) 40.9 ± 2.40 (90% confidence)
  Explain another use of the t statistic.

Class Example
  Data point   Pressure, Day 1   Pressure, Day 2
  1            750               730
  2            760
  3            752               762
  4            747               749
  5            754               737
1. Calculate the average and s for both sets of data.
2. Find the range in which 95.4% of the data falls (for each set).
3. Determine the range for μ for each set at 95% probability.
4. At what confidence are the pressures different each day?

Lesson 2: Statistics of Calculated Variables: Confidence Interval and Propagation of Max Error

Uncertainty of Calculated Variable
Calculate the variable from multiple input (measured, tabulated, …) variables (i.e., m = ρAv).
What is the uncertainty of your “calculated” value?
  Example: You take measurements of ρ, A, v to determine m = ρAv. What is the range of m and its associated uncertainty?
  The value is used by managers to make decisions, so the manager needs to know its uncertainty.
  Ethics and societal impact are important.
How do you determine the uncertainty of the value?
  Details are provided in Applied Engineering Statistics, Chapters 8 and 14 (R.M. Bethea and R.R. Rhinehart, 1991).

Example: Uncertainty of Calculated Variable
  n       m (g/s)   ρ (g/cm3)   A (cm2)   v (cm/s)
  1       66.55     1.02        20.2      3.23
  2       63.97     1.02        20.1      3.12
  3       71.91     1.02        20.2      3.49
  4       64.75     1.02        19.9      3.19
  5       68.95     1.02        20.3      3.33
  6       68.06     1.02        19.8      3.37
  Avg     67.36     1.02        20.08     3.29
  Stdev   2.92                  0.19      0.13

Step 1: Student t-test
  true mean = measured mean ± t × (std error)
  Standard error (s/√n): stdev of the means if the previous experiment was repeated numerous times. As n → ∞, x̄ → μ (the real mean).
  2α = 1 − probability
  r = n − 1 = 5
  Prob.   α      t       ± (t × std error)   t∞
  90%     .05    2.015   2.40                1.645
  95%     .025   2.571   3.06                1.96
  99%     .005   4.032   4.81                2.576
  Note: If n is large (>10), t ≈ 2 for 95%. Thus, the confidence interval is approx. twice the std. error.

Step 2: Propagation of Maximum Error
Plan: obtain the max error (δ) for each input variable, then obtain the max error of the calculated variable.
  Method 1: Propagation of max error- brute force
  Method 2: Propagation of max error- analytical
Sources of error:
  Estimation- we guess!
  Discrimination- device accuracy (single data point)
  Calibration- may not be exact (error of curve fit)
  Technique- i.e., measure ID rather than OD
  Constants and data- not always exact!
  Noise- which reading do we take?
  Model and equations- i.e., ideal gas law vs. real gas
  Humans- transposing, …

Estimates of Error (δ) for input variables (δ’s are propagated to find uncertainty)
Measured: measure multiple times; obtain s; δ ≈ 2.5s
  Reason: 99% of data is within ± 2.5s
  Example: s = 2.3 ºC for thermocouple, δ = 5.8 ºC
Tabulated: δ ≈ 2.5 times the place value of the last reported significant digit (i.e., a 1 in that position)
  Reason: Assumes last digit is ± 2.5 (± 0 assumes perfect, ± 5 assumes next left digit is fuzzy)
  Example: ρ = 1.3 g/ml at 0 ºC, δ = 0.25 g/ml
  Example: People = 127,000, δ = 2500 people

Estimates of Error (δ) for input variables
Manufacturer spec or calibration accuracy: use the given spec or accuracy data
  Example: Pump spec is ± 1 ml/min, δ = 1 ml/min
Variable from regression (i.e., calibration curve): δ ≈ 2.5*standard error (std error is the stdev of the residual)
  Example: Velocity is a slope with std error = 2 m/s, δ = 5 m/s
Judgment for a variable: use judgment for δ
  Example: Read pressure to ± 1 psi, δ = 1 psi
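The 2.5× rules of thumb from these two slides can be written as small helpers. This is only a sketch; the function names are hypothetical, and `last_digit_place` is an assumed way of passing the place value of the last reported significant digit (0.1 for 1.3, 1000 for 127,000).

```python
def delta_measured(s):
    """Max error for a variable measured multiple times: about 2.5 standard deviations."""
    return 2.5 * s

def delta_tabulated(last_digit_place):
    """Max error for a tabulated value: 2.5 times the place value of its last significant digit."""
    return 2.5 * last_digit_place

def delta_regression(std_error):
    """Max error for a regressed variable: about 2.5 standard errors."""
    return 2.5 * std_error

print(delta_measured(2.3))     # thermocouple example, s = 2.3 ºC (slide rounds to 5.8)
print(delta_tabulated(0.1))    # density reported as 1.3 g/ml
print(delta_tabulated(1000))   # population reported as 127,000
print(delta_regression(2))     # slope with std error = 2 m/s
```

Manufacturer specs and judgment-based values are used directly as δ, so they need no helper.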

Method 1: Propagation of max error- brute force
m = ρ A v
  ρ = 1.02 g/cm3 (table)
  A = 20.08 cm2 (avg)
  v = 3.29 cm/s (avg)
Additional information: sA = 0.19 cm2, sv = 0.13 cm/s
What is δ for each input variable?
Brute force method: evaluate m at the max and min of ρ, A, and v (all combinations).
  mmin < m < mmax (low 60.0, high 80.4)
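A brute-force pass over all max/min combinations might look like the sketch below. The δ values are assumptions built from the slides' own rules (2.5 × last digit for the tabulated density, 2.5 s for A and v). It also assumes A = 20.8 cm2, the value written on the Method 2 slide, because that value reproduces the quoted low and high; the data table average of 20.08 cm2 would give a slightly different range. The function names are illustrative.

```python
from itertools import product

def mass_flow(rho, area, velocity):
    """m = rho * A * v, in g/s for g/cm3, cm2, cm/s."""
    return rho * area * velocity

def brute_force_range(func, nominal, delta):
    """Evaluate func at every combination of (x - d, x + d) and return (min, max)."""
    corners = product(*[(x - d, x + d) for x, d in zip(nominal, delta)])
    values = [func(*corner) for corner in corners]
    return min(values), max(values)

nominal = (1.02, 20.8, 3.29)             # rho, A, v (A as used on the Method 2 slide)
delta = (0.025, 2.5 * 0.19, 2.5 * 0.13)  # assumed max errors from the 2.5x rules

low, high = brute_force_range(mass_flow, nominal, delta)
print(f"low {low:.1f}, high {high:.1f}")
```

Checking only the corner combinations finds the true extremes when the function is monotonic in each input over the error range, which holds for a simple product of positive quantities.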

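Method 2: Propagation of max error- analytical
For y = f(x1, x2, x3):
  \[ \delta_y = \sum_i \left| \frac{\partial y}{\partial x_i} \right| \delta_{x_i} \]
  * Remember to take the absolute value!!
Here m = ρ A v, with y = m, x1 = ρ, x2 = A, x3 = v:
  ρ = 1.02 g/cm3 (table)
  A = 20.8 cm2 (avg)
  v = 3.29 cm/s (avg)
  Additional information: sA = 0.19 cm2, sv = 0.13 cm/s
  \[ \delta_m = Av\,\delta_\rho + \rho v\,\delta_A + \rho A\,\delta_v \]
  m = mavg ± δm = ρAv ± δm = 67.36 ± 10.2 g/s
  Fraction of error from ρ: ferror,ρ = (20.8)(3.29)(0.025)/(10.2) = 0.17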
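The analytical result can be reproduced term by term. This sketch uses the inputs as written on this slide (including A = 20.8 cm2) and assumes δρ = 0.025, δA = 2.5 sA, and δv = 2.5 sv, following the slides' 2.5× rules; the variable names are illustrative.

```python
# Inputs as written on the Method 2 slide
rho, area, velocity = 1.02, 20.8, 3.29

# Assumed max errors: 2.5 x last digit for rho, 2.5 x stdev for A and v
d_rho, d_area, d_vel = 0.025, 2.5 * 0.19, 2.5 * 0.13

# Each term is |partial derivative of m = rho*A*v| times the input's max error
terms = {
    "rho": abs(area * velocity) * d_rho,
    "A": abs(rho * velocity) * d_area,
    "v": abs(rho * area) * d_vel,
}

delta_m = sum(terms.values())
fractions = {name: term / delta_m for name, term in terms.items()}

print(f"delta_m = {delta_m:.1f} g/s")
for name, frac in fractions.items():
    print(f"fraction of max error from {name}: {frac:.2f}")
```

The per-input fractions show which measurement dominates the max error; here the velocity term is by far the largest contributor.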

Propagation of max error
  If linear equation, symmetric errors, and input errors are independent → brute force and analytical are the same.
  If non-linear equation, symmetric errors, and input errors are independent → brute force and analytical are close if errors are small (<10%). If errors are large, use brute force.
  Must use brute force if errors are dependent on each other and/or asymmetric (i.e., Q = 10 ml/min, “+” 2 ml/min and “-” 0.5 ml/min).
  Analytical method is easier to assess if there are lots of inputs. It also gives info on the % contribution from each error.
  All propagation methods assume there is no bias on the inputs (i.e., calibration is not off, etc.- see next example if calibration is off).
  We will assume all inputs are not biased (ex. computer noise >> computer bias) in lab unless you have evidence to the contrary.

Step 3: Propagation of variance- analytical
Maximum error can be calculated from the max errors of the input variables as shown previously:
  Brute force
  Analytical (if assumptions are valid)
Probable error is more realistic:
  Errors are independent (some may be “+” and some “-”). Not all will be in the same direction and at their largest value.
  Thus, propagate variance rather than max error to obtain a better estimate of error (probable error rather than max error).
  You need the variance (s²) of each input to propagate variance. If unknown, estimate s² = (δ/2.5)².
  The same assumptions apply as with propagation of max error: symmetric error, linear assumption (error <10%), errors are independent, and no bias.
Use propagation of max error if not much data; use propagation of variance if lots of data.

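Propagation of variance- analytical
For y = f(x1, x2, x3):
  \[ s_y^2 = \sum_i \left( \frac{\partial y}{\partial x_i} \right)^2 s_{x_i}^2 \]
  y = yavg ± 1.96 SQRT(s²y)  95%
  y = yavg ± 2.57 SQRT(s²y)  99%
Here m = ρ A v, with y = m, x1 = ρ, x2 = A, x3 = v:
  ρ = 1.02 g/cm3 (table)
  A = 20.8 cm2 (avg)
  v = 3.29 cm/s (avg)
  Additional information: sA = 0.19 cm2, sv = 0.13 cm/s
  m = 67.36 ± 5.7 g/s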
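The ± 5.7 g/s figure can be reproduced with the sketch below. It uses the inputs as written on this slide (including A = 20.8 cm2) and assumes the density standard deviation is estimated as δ/2.5 = 0.025/2.5, per the rule given for inputs with unknown variance; the variable names are illustrative.

```python
import math

# Inputs as written on this slide
rho, area, velocity = 1.02, 20.8, 3.29

# Standard deviations: rho estimated as delta / 2.5 (assumption), A and v from the data
s_rho, s_area, s_vel = 0.025 / 2.5, 0.19, 0.13

# Variance of m = rho*A*v: sum of (partial derivative)^2 * (input variance)
var_m = (
    (area * velocity) ** 2 * s_rho**2
    + (rho * velocity) ** 2 * s_area**2
    + (rho * area) ** 2 * s_vel**2
)
s_m = math.sqrt(var_m)

half_95 = 1.96 * s_m  # 95% interval half-width
half_99 = 2.57 * s_m  # 99% interval half-width
print(f"s_m = {s_m:.2f} g/s, 95%: ± {half_95:.1f} g/s, 99%: ± {half_99:.1f} g/s")
```

Because the squared terms add, the probable error is smaller than the max error from Method 2, where the absolute terms add directly.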

Summary
Step 1: Student t-test
  helps provide confidence level, but not what contributes to error
  improves with more experiments (std error)
  includes interactions among variables
  limitation: can’t be less than the accuracy of the table, device, calibration, etc. (doesn’t account for that error)- if time, include bias for a better estimate
  m = 67.36 ± 3.06 g/s (95% confidence)
Step 2: Propagation of maximum error- brute force
  gives maximum error
  easy to use if it is difficult to take partials of the equation
  equation linearity is not required
  symmetry of errors is not required
  m = 67.36 (high 80.4, low 60.0)

Summary
Step 2: Propagation of maximum error- analytical
  helps determine what contributes to error
  never improves with more experiments
  equation given does not include interactions among variables
  assumes linear equation (not always true)
  m = 67.36 ± 10.2 g/s (17% error from density)
Step 3: Propagation of variance- analytical
  same assumptions as maximum error- analytical
  accounts for errors not always being at their maximum value
  s = δ/2.5, accounts for error of tables, calibration, etc.
  m = 67.36 ± 5.7 g/s (95% confidence)
t-test < variance < max error (likely between t-test and variance)
The basis for both “analytical” methods is that the error on a measurement must be much less (an order of magnitude or more) than the measurement itself.

Data and Statistical Expectations
  Summary of raw data (table format)
  If measured variable: average and standard deviation
  Table showing estimated errors of input variables
  If calculated variable: Student t-test to give confidence level (recognizing limitations) and propagation of maximum error (analytical) to show how much error is associated with the input variables- see instructor for what to propagate
  Sample calculations, including one example of ALL statistical calculations
  Summary of all calculations- table format is helpful