Data Analysis Examples Anthony E. Butterfield CH EN 4903-1.



#1: The Normal PDF Your coworker tells you that the outlet temperature of a certain coal gasifier averages 1304 K and stays within 12 K of that mean for 95% of her measurements, over months of operation. If we assume the temperature measurements are normally distributed, what is the standard deviation, and what is the probability that a temperature measurement would be above 1310 K? T = 1304 ± 12 K (95% confidence level)

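Normal Distribution Probability density function (PDF): $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$, where $\mu$ is the mean and $\sigma$ is the standard deviation.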

#1: The Normal PDF
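
A worked sketch of Problem #1, assuming the ±12 K band is symmetric about the mean, so that it spans ±1.96 standard deviations of a normal distribution. The symbol $\Phi$ (the standard normal CDF) and the table value $\Phi(0.98) \approx 0.836$ are notation and data introduced here, not taken from the slides:

```latex
\begin{align*}
\sigma &= \frac{12\ \text{K}}{1.96} \approx 6.12\ \text{K} \\
z &= \frac{1310 - 1304}{\sigma} = \frac{6}{6.12} \approx 0.98 \\
P(T > 1310\ \text{K}) &= 1 - \Phi(0.98) \approx 1 - 0.836 = 0.164
\end{align*}
```

Under these assumptions, roughly one measurement in six would be expected to exceed 1310 K.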

#2: Error Propagation In a falling bead viscometer, the viscosity may be found from the following equation (Stokes' law for a falling sphere): $\mu = \frac{2 r^2 g (\rho_B - \rho_F)}{9V}$, where r is the bead radius, g is gravitational acceleration, V is the terminal velocity, ρ_B is the bead density, and ρ_F is the fluid density. Suppose we find, at a 95% confidence level, that the bead density is 2 ± 0.1 g/cm^3, the radius is 3 ± 0.1 mm, the fluid density is 1.1 ± 0.2 g/cm^3, and, after terminal velocity is achieved, the bead falls 10 ± 0.2 cm in 12 ± 0.5 seconds. What is the calculated viscosity and the uncertainty in its value? Which measurement is the greatest source of error?

#2: Error Propagation A couple options:

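#2: Error Propagation
Variable   Value   CI     Units     Value   CI     Units (CGS)
g          [value missing]  m/s^2   [value missing]  cm/s^2
ρ_B        2       0.1    g/cm^3    2       0.1    g/cm^3
ρ_F        1.1     0.2    g/cm^3    1.1     0.2    g/cm^3
r          3       0.1    mm        0.3     0.01   cm
d          10      0.2    cm        10      0.2    cm
t          12      0.5    s         12      0.5    s
Each row also lists f_i in g/cm/s, the viscosity recomputed with that variable shifted by its CI; f_0 is the unshifted viscosity. The uncertainty is (sum over i of (f_0 - f_i)^2)^0.5, reported as Viscosity ± uncertainty in g/cm/s. The computed f values are [values missing].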
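
A worked sketch of Problem #2 using first-order (derivative-based) propagation rather than the finite shifts tabulated on the slide. It rests on several assumptions that are not confirmed by the transcript: the Stokes form $\mu = 2 r^2 g (\rho_B - \rho_F)/(9V)$ with $V = d/t$, $g = 981\ \text{cm/s}^2$ with negligible uncertainty, and independent errors. The $\delta$ symbols denote the 95% half-widths given in the problem.

```latex
\begin{align*}
\mu &= \frac{2 r^2 g (\rho_B - \rho_F)\, t}{9 d}
     = \frac{2 (0.3)^2 (981)(2 - 1.1)(12)}{9 (10)} \approx 21.2\ \text{g/(cm s)} \\
\left(\frac{\delta\mu}{\mu}\right)^2 &\approx
   \left(\frac{2\,\delta r}{r}\right)^2
 + \left(\frac{\delta\rho_B}{\rho_B - \rho_F}\right)^2
 + \left(\frac{\delta\rho_F}{\rho_B - \rho_F}\right)^2
 + \left(\frac{\delta d}{d}\right)^2
 + \left(\frac{\delta t}{t}\right)^2 \\
 &= 0.00444 + 0.01235 + 0.04938 + 0.00040 + 0.00174 = 0.06831 \\
\delta\mu &\approx 21.2 \sqrt{0.06831} \approx 5.5\ \text{g/(cm s)}
\end{align*}
```

Under these assumptions the fluid density term contributes about 72% of the total (0.04938 of 0.06831), which would make ρ_F the greatest source of error. The slide's own result was lost, so this need not match the value it reported.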

#3: Log Normal You find the following particle size distribution from a spray dryer experiment: [table of data] If we assume this distribution of particle sizes is log-normal, what would be the mean and standard deviation of the log-normal PDF? This is a nonlinear fitting problem, like #6.

#3: Log Normal Table columns: Range Max (µm), Count, Percentage
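
One way to set up the fit for binned data, as a sketch. The notation is introduced here and is not from the slides: $\mu_L$ and $\sigma_L$ are the mean and standard deviation of $\ln x$, $x_{k-1}$ and $x_k$ are the edges of bin $k$, $p_k$ is the measured fraction in that bin, and $\Phi$ is the standard normal CDF.

```latex
\begin{align*}
f(x) &= \frac{1}{x\,\sigma_L\sqrt{2\pi}}
        \exp\!\left(-\frac{(\ln x - \mu_L)^2}{2\sigma_L^2}\right), \quad x > 0 \\
F(x) &= \Phi\!\left(\frac{\ln x - \mu_L}{\sigma_L}\right) \\
\min_{\mu_L,\,\sigma_L}\ & \sum_k \left[\, p_k - \bigl(F(x_k) - F(x_{k-1})\bigr) \right]^2 \\
\text{mean of } x &= \exp\!\left(\mu_L + \tfrac{1}{2}\sigma_L^2\right), \qquad
\text{variance of } x = \left(e^{\sigma_L^2} - 1\right) e^{2\mu_L + \sigma_L^2}
\end{align*}
```

The last line converts the fitted log-space parameters into the mean and variance of the particle size itself, which is worth stating explicitly because "mean and standard deviation" of a log-normal can refer to either.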

#4: Hypothesis Testing On a certain stage of a distillation column, theory predicts the ethanol concentration should be 27%. You take the following measurements over several runs: [table column: Percent Ethanol] What is the likelihood that your measurements match theory?

#4: Hypothesis Testing Student's t-test. Mean = [value missing], StDev = [value missing], Degrees of freedom v = n_a – 1 = 10 – 1 = 9

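#4: Hypothesis Testing T-Statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n_a}}$, where $\bar{x}$ is the sample mean, $s$ is the sample standard deviation, $n_a$ is the number of measurements, and $\mu_0 = 27\%$ is the value predicted by theory.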

#4: Hypothesis Testing Use the t-statistic in the t-distribution CDF to find the probability. Answer = 9.6%
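
The probability step can be written as follows, as a sketch. $F_v$ denotes the CDF of Student's t-distribution with v degrees of freedom (notation introduced here), and treating the quoted 9.6% as a two-tailed value is an assumption, since the slide does not say which tail convention it used:

```latex
\[
P = 2\left[\,1 - F_{v}\bigl(|t|\bigr)\right], \qquad v = 9
\]
```

In MATLAB's Statistics Toolbox this corresponds to 2*(1 - tcdf(abs(t), 9)).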

#5: Hypothesis Testing 2 You are measuring the effectiveness of a new catalyst on a reaction with a great deal of normally distributed variability. You measure the time to 99% conversion of your reactants with both your new and old catalysts over several experimental runs and find the following data: [table columns: Old (min), New (min)] Given these data, what is the probability that the new catalyst is more effective than the old? What is the probability that they are equally effective?

#5: Hypothesis Testing 2 Mean A = 10.25, Mean B = 9.50; StDev A = 1.071, StDev B = [value missing]; Number A = 22, Number B = 20; Degrees of freedom v = n_a + n_b – 2 = 22 + 20 – 2 = 40

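#5: Hypothesis Testing 2 T-Statistic: $t = \frac{\bar{x}_A - \bar{x}_B}{s_p\sqrt{1/n_a + 1/n_b}}$, with pooled standard deviation $s_p = \sqrt{\frac{(n_a - 1)s_A^2 + (n_b - 1)s_B^2}{n_a + n_b - 2}}$.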

#5: Hypothesis Testing 2 Simple rule:
– Greater-than or less-than tests use one tail (two unequal areas); you can tell which percentage you want by looking at the means.
– An equality test uses two equal tails.
For the t-CDF with v = 40 and a t-statistic of [value missing], P = 2.7%. The probability that the new catalyst is more effective is a one-tail test.
More effective (one tail) = 100% – 2.7% = 97%
Equal (two tail) = 2 × 2.7% = 5%

#6: Non-Linear Fit The rate of population growth in a bacterial culture is found to be: [table columns: Time (hr), Rate (SRU)] It is thought that these data could be fit to the equation Rate = b1*sin(b2*t), where b1 and b2 are constants to be determined and t is time. Determine the least squares estimates of b1 and b2 and give an appropriate confidence interval for a confidence level of 90%. Also, what would you anticipate the rate to be at 24 hr? What would the confidence interval at 24 hr be for a 95% confidence level?

#6: Non-Linear Fit

%Anthony Butterfield 2009
%Example of nonlinear fit with CIs
clear
close all
b(1)=1/3;
b(2)=1;
re=0.1; %random noise strength
x=linspace(0,6,20)'; %x data for fitting
x2=linspace(0,6,100)'; %x data for plotting
n=length(x);
y=b(1)*sin(b(2)*x)+re*randn(n,1); %y data for fitting, note the random error added in to make it realistic
yt=b(1)*sin(b(2)*x2); %theoretical y data for plotting
model=@(b,x) b(1)*sin(b(2)*x); %model to fit (assumed form; the original call was garbled in extraction)
[beta,r,J]=nlinfit(x,y,model,[1 1]); %numerically performs a nonlinear fit (initial guess [1 1] is assumed)
bci=nlparci(beta,r,'Jacobian',J); %returns the c.i. for the parameters, beta (95% by default)
[ypred,delta]=nlpredci(model,x2,beta,r,'Jacobian',J); %returns a predicted y and the c.i. for each y
disp('Fit to equation: y = b1 sin(b2 * x)')
disp(' x data y data')
for i=1:n
    txt=sprintf(' %5.3f %5.3f',x(i),y(i));
    disp(txt)
end
%note: a literal percent sign inside sprintf must be written as %%
txt=sprintf('b1 was %3.1f, and is estimated to be: %f ± %f (95%% CL)',b(1),beta(1),abs(beta(1)-bci(1,1)));
disp(txt)
txt=sprintf('b2 was %3.1f, and is estimated to be: %f ± %f (95%% CL)',b(2),beta(2),abs(beta(2)-bci(2,1)));
disp(txt)
figure(1)
hold on
grid on
scatter(x,y,10,'r')
plot(x2,yt,'Color',[0 0.5 0]) %just wanted to give you an example of how to change the line color to something not preset (RGB values assumed; the originals were lost)
plot(x2,ypred,'b',x2,ypred+delta,'b:',x2,ypred-delta,'b:')
hold off

#6: Non-Linear Fit
nlparci
In "theory" b1 = 0.3; estimated b1 = 0.35 ± 0.05 (90% CL)
In "theory" b2 = 1.0; estimated b2 = 1.04 ± 0.04 (90% CL)
nlpredci
At 24 hr "theory" predicts: Rate = [value missing]
Fit predicts: Rate = [value missing] ± [value missing] (95% CL)
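
Why the interval at 24 hr can be wide, sketched with the first-order (delta method) approximation. $C$ denotes the estimated covariance matrix of the fitted parameters and $\hat{y}$ the predicted rate; both are notation introduced here, and the internal computation in nlpredci may differ in detail:

```latex
\begin{align*}
\hat{y}(t) &= b_1 \sin(b_2 t) \\
\mathbf{g}(t) &= \begin{bmatrix} \partial \hat{y}/\partial b_1 \\ \partial \hat{y}/\partial b_2 \end{bmatrix}
             = \begin{bmatrix} \sin(b_2 t) \\ b_1\, t \cos(b_2 t) \end{bmatrix} \\
\operatorname{Var}\bigl[\hat{y}(t)\bigr] &\approx \mathbf{g}(t)^{\mathsf T}\, C\, \mathbf{g}(t)
\end{align*}
```

The sensitivity to b2 grows in proportion to t, so if 24 hr lies well beyond the measured times, even a small uncertainty in b2 is amplified in the predicted rate.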