Statistical Methods For UO Lab — Part 1 Calvin H. Bartholomew Chemical Engineering Brigham Young University.

Background
Statistics is the science of problem-solving in the presence of variability (Mason 2003). Statistics enables us to:
- Assess the variability of measurements
- Avoid bias from unconsidered causes of variation
- Determine the probability of factors and risks
- Build good models
- Obtain best estimates of model parameters
- Improve chances of making correct decisions
- Make the most efficient and effective use of resources

Some U.S. Cultural Statistics
- 58.4% of us have called in sick to work when we weren't.
- 3 out of 4 of us store our dollar bills in rigid order, with singles leading up to higher denominations.
- 50% admit they regularly sneak food into movie theaters to avoid the high prices of snacks.
- 39% of us peek in our host's bathroom cabinet; 17% have been caught by the host.
- 81.3% would tell an acquaintance to zip his pants.
- 29% of us ignore RSVPs.
- 35% give to charity at least once a month.
- 71.6% of us eavesdrop.

Population vs. Sample Statistics
Population statistics
- Characterize the entire population, which is generally the unknown information we seek
- Mean generally designated μ
- Variance and standard deviation generally designated σ² and σ, respectively
Sample statistics
- Characterize a random, hopefully representative, sample – typically the data from which we infer population statistics
- Mean generally designated x̄
- Variance and standard deviation generally designated s² and s, respectively
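
A quick sketch of these sample statistics in Python using the standard library; the seven temperature readings are made-up illustration values, not data from the lecture:

```python
import statistics

# Hypothetical sample: seven repeated temperature readings (deg C)
temps = [20.1, 21.3, 19.8, 20.6, 21.0, 20.2, 20.9]

x_bar = statistics.mean(temps)   # sample mean, estimates population mean mu
s = statistics.stdev(temps)      # sample std dev (n - 1 denominator), estimates sigma
s2 = statistics.variance(temps)  # sample variance s^2

print(x_bar, s, s2)
```

Note that `statistics.stdev` uses the n − 1 (sample) denominator, matching the s defined on the slide; `statistics.pstdev` would give the population form.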

Point vs. Model Estimation
Point estimation
- Characterizes a single, usually global, measurement
- Generally simple mathematical and statistical analysis
- Procedures are unambiguous
Model development
- Characterizes a function of dependent variables
- Complexity of parameter estimation and statistical analysis depends on model complexity
- Parameter estimation, and especially the statistics, are somewhat ambiguous

Overall Approach
- Use sample statistics to estimate population statistics
- Use statistical theory to indicate the accuracy with which the population statistics have been estimated
- Use linear or nonlinear regression methods/statistics to fit data to a model and to determine goodness of fit
- Use trends indicated by theory to optimize experimental design
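
As a small illustration of the regression step, here is a least-squares fit to a linear model y = a + b·x with R² as the goodness-of-fit measure; the data values are invented for the example:

```python
# Made-up data, roughly linear
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 4.2, 5.9, 8.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Closed-form least-squares slope and intercept for y = a + b*x
sxx = sum((x - x_bar) ** 2 for x in xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b = sxy / sxx
a = y_bar - b * x_bar

# Coefficient of determination R^2 (goodness of fit)
ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - y_bar) ** 2 for y in ys)
r2 = 1.0 - ss_res / ss_tot
```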

Sample Statistics
- Estimate properties of the probability distribution function (PDF), i.e., mean and standard deviation, using Gaussian statistics
- Use the Student t-test to determine variance and confidence intervals
- Estimate random errors in the measurement of data
- For variables that are geometric functions of several basic variables, use the propagation-of-errors approach to estimate (a) probable error (PE) and (b) maximum possible error (MPE)
- PE and MPE can be estimated by the differential method; MPE can also be estimated by the brute-force method
- Determine systematic errors (bias)
- Compare estimated errors from measurements with calculated errors from statistics; this reveals whether the measurement method or the quantity of data is limiting

Random Error: Single Variable (e.g., T)
Several measurements are obtained for a single variable (e.g., T). Questions: What is the true value? How confident are you? Is the value different on different days?
Some definitions:
x̄ = sample mean
s = sample standard deviation
μ = exact (population) mean
σ = exact (population) standard deviation
As the sample becomes larger: x̄ → μ, s → σ, and the t chart → z chart. Not valid if bias exists (i.e., the calibration is off).

How do you determine the bounds of μ? Let's assume a "normal" Gaussian distribution.
- Small sample: s is known from the data; use the Student t tables (we'll pursue this approach)
- Large sample (n > 30): σ is assumed known; use the z tables

Example 1
Table of repeated temperature measurements (columns n and Temp); the t-test example later uses n = 7 with s = 3.27.

Properties of a Normal PDF
- About 68.27%, 95.45%, and 99.73% of the data lie within 1, 2, and 3 standard deviations of the mean, respectively.
- When the mean is 0 and the standard deviation is 1, it is referred to as the standard normal distribution.
- Plays a fundamental role in statistical analysis because of the Central Limit Theorem.
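
The 68/95/99.7 rule can be checked with the standard normal CDF via the error function in Python's math module, since P(|Z| ≤ k) = erf(k/√2):

```python
import math

# Fraction of a standard normal population within k standard deviations
def within_k_sigma(k):
    return math.erf(k / math.sqrt(2.0))

print([round(100 * within_k_sigma(k), 2) for k in (1, 2, 3)])
```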

Central Limit Theorem
- The distribution of means calculated from a large data set is approximately normal
- Becomes more accurate with a larger number of samples
- The sample mean approaches the true mean as n → ∞
- Assumes the distributions are not peaked close to a boundary and the variances are finite
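
A minimal simulation of the theorem, with invented parameters: means of many samples drawn from a non-normal (uniform) distribution cluster around the true mean 0.5, and the spread of those means shrinks as the per-sample size n grows.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

def sample_means(n, trials=2000):
    # Mean of n uniform(0, 1) draws, repeated `trials` times
    return [statistics.mean(random.random() for _ in range(n)) for _ in range(trials)]

means_small = sample_means(5)
means_large = sample_means(50)
```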

Student t-Distribution
- Widely used in hypothesis testing and in determining confidence intervals
- Equivalent to the normal distribution for large sample sizes
- "Student" is a pseudonym, not an adjective; the actual name was W. S. Gosset, who published in the early 1900s.

Student t-Distribution (cont.)
- Used to compute confidence intervals according to μ = x̄ ± t·s/√n
- Assumes the mean and variance are estimated by the sample values
- The value of t decreases with DOF (i.e., with the number of data points n) and increases with increasing % confidence

Student t-test (determine error from s)
α = 1 − probability; degrees of freedom ν = n − 1
error = t·s/√n
(Table of probability, α/2, t, and t·s/√n values not preserved in the transcript.)
e.g., from Example 1: n = 7, s = 3.27
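
Applying error = t·s/√n to the Example 1 numbers; the two-tailed t value for 6 degrees of freedom at 95% confidence is taken from a standard t table:

```python
import math

n = 7         # number of measurements (Example 1)
s = 3.27      # sample standard deviation (Example 1)
t_95 = 2.447  # two-tailed Student t for nu = n - 1 = 6 DOF, 95% confidence

error = t_95 * s / math.sqrt(n)  # half-width of the confidence interval on the mean
print(error)
```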

Values of the Student t Distribution
- Depend on both the confidence level desired and the amount of data.
- Degrees of freedom = n − 1, where n = number of data points (assumes the mean and variance are estimated from the data).
- This table assumes a two-tailed distribution of area.

Example 2
Five data points with sample mean x̄ and standard deviation s = 107.8. The estimated population mean with its 95% confidence interval (from the previous table, t = 2.776 for ν = 4) is x̄ ± t·s/√n = x̄ ± (2.776)(107.8)/√5 ≈ x̄ ± 133.8.

Example 3: Comparing Averages
Day 1 gives x̄; Day 2 gives ȳ (data values not preserved in the transcript). What is your confidence that μx ≠ μy? Use the t-test with ν = nx + ny − 2 degrees of freedom. Being 99% confident the means are different is the same as being 1% confident they are the same.

Error Propagation: Multiple Variables
A value is obtained (e.g., from a model) using multiple input variables, each with its own error or uncertainty. What is the uncertainty of the resulting value?
Example: How much ice cream do you buy for the AIChE event? Ice cream = f(time of day, tests, …)
Example: You take measurements of ρ, A, and v to determine m = ρAv. What is the range of m and its associated uncertainty?

Value and Uncertainty
Values are used by managers to make decisions, so the uncertainty of a value must be specified. Ethics and the societal impact of values are important. How do you determine the uncertainty of a value? Sources of uncertainty:
1. Estimation: we guess!
2. Discrimination: device accuracy (single data point)
3. Calibration: may not be exact (error of the curve fit)
4. Technique: e.g., measuring ID rather than OD
5. Constants and data: not always exact!
6. Noise: which reading do we take?
7. Model and equations: e.g., ideal gas law vs. real gas
8. Humans: transposing, …

Estimates of Error (δ) for an Input Variable (Methods or Rules)
1. Measured variable (as we just did): measure multiple times and obtain s; δ ≈ 2.57·s (the t chart gives t ≥ 2.57 for 99% confidence). E.g., s = 2.3 ºC for a thermocouple gives δ ≈ 5.9 ºC.
2. Tabulated variable: δ ≈ 2.57 times the last reported significant digit. E.g., ρ = 1.0 g/ml at 0 ºC gives δ = 0.257 g/ml.

Estimates of Error (δ) for an Input Variable (cont.)
3. Manufacturer specs: use the given accuracy data (e.g., a pump rated ±1 ml/min gives δ = 1 ml/min).
4. Variable from regression (e.g., a calibration curve): δ ≈ the standard error (e.g., a velocity from an equation with standard error 2 m/s gives δ = 2 m/s).
5. Judgment for a variable: use judgment for δ (e.g., a graph gives pressure to ±1 psi, so δ = 1 psi).

Calculating Maximum or Probable Error
1. Maximum error can be calculated as shown previously: (a) brute-force method; (b) differential method.
2. Probable error is more realistic, since positive and negative errors can partially cancel. You need standard deviations (σ or s) to calculate the probable error (PE) (see the previous example):
PE = δ = 2.57·σy (99%)
Ψ = y ± 1.96·√(σy²) (95% confidence)
Ψ = y ± 2.57·√(σy²) (99% confidence)
where σy² = Σ (∂y/∂xi)²·σi².
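
The difference between the two estimates can be seen numerically. Using the three sensitivity terms |∂y/∂xi|·δi from Example 4, and treating the δ's as the per-variable error estimates purely for illustration:

```python
import math

# Sensitivity terms |dy/dxi| * delta_i from Example 4
terms = [6.8 * 0.257, 4.0 * 0.2, 6.8 * 0.1]

max_error = sum(terms)                           # maximum error: all errors add
probable = math.sqrt(sum(t * t for t in terms))  # probable error: quadrature sum

print(max_error, probable)
```

The quadrature (probable) estimate is always smaller than the straight sum, reflecting the partial cancellation of independent errors.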

Calculating Maximum (Worst-Case) Error
y = f(a, b, c, …, x1, x2, x3, …), where a, b, c, … are exact constants and x1, x2, x3, … are independent variables.
Range of y: Ψ = y ± Δy
1. Brute-force method: substitute the upper and lower limits of all the x's into the function to get the maximum and minimum values of y. The range of y (Ψ) is between ymin and ymax.
2. Differential method: from a given model, Δy = Σ |∂y/∂xi|·δi
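
A sketch of the brute-force method for m = ρAv, using the Example 4 values and error estimates: evaluate the model at every combination of the upper and lower limits of the inputs and take the extremes.

```python
from itertools import product

# Example 4 inputs and their error estimates
rho, A, v = 2.0, 3.4, 2.0          # g/cm^3, cm^2, cm/s
d_rho, d_A, d_v = 0.257, 0.2, 0.1  # delta for each input

# Evaluate m = rho * A * v at all 2^3 corner combinations of the limits
values = [
    (rho + s1 * d_rho) * (A + s2 * d_A) * (v + s3 * d_v)
    for s1, s2, s3 in product((-1, 1), repeat=3)
]
y_min, y_max = min(values), max(values)
print(y_min, y_max)
```

For a simple product like this the extremes occur at the corners, so checking the 2ⁿ limit combinations is sufficient; for non-monotonic models the interior would also need to be searched.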

Example 4: Differential Method
m = ρAv, i.e., y = x1·x2·x3
x1 = ρ = 2.0 g/cm³ (table); δ1 = 0.257 g/cm³ (Rule 2)
x2 = A = 3.4 cm² (measured average); δ2 = 0.2 cm² (Rule 1)
x3 = v = 2 cm/s (calibration); δ3 = 0.1 cm/s (Rule 4)
y = (2.0)(3.4)(2) = 13.6 g/s
Δy = |Av|·δ1 + |ρv|·δ2 + |ρA|·δ3 = (6.8)(0.257) + (4.0)(0.2) + (6.8)(0.1) = 3.2 g/s
Ψ = 13.6 ± 3.2 g/s
Which product term contributes the most to the uncertainty? This method works only if the errors are symmetrical.
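
The Example 4 arithmetic for the differential method can be checked directly:

```python
# Example 4: m = rho * A * v with the stated values and deltas
rho, A, v = 2.0, 3.4, 2.0
d1, d2, d3 = 0.257, 0.2, 0.1

m = rho * A * v  # 13.6 g/s
# Linearized maximum error: sum of |partial derivative| * delta for each input
dm = (A * v) * d1 + (rho * v) * d2 + (rho * A) * d3

print(m, dm)
```

Comparing the three terms shows the density delta (6.8 × 0.257 ≈ 1.75 g/s) dominates the uncertainty.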