Data Analysis Examples
Anthony E. Butterfield
CH EN 4903-1


#1: The Normal PDF Your coworker tells you that the outlet temperature of a certain coal gasifier averages 1304 K and stays within 12 K of that mean for 95% of her measurements, over months of operation. If we assume the temperature measurements are normally distributed, what is the standard deviation, and what is the probability that a temperature measurement would be above 1310 K? T = 1304 ± 12 K (95% confidence level)

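Normal Distribution Probability density function (PDF):

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

where \mu is the mean and \sigma is the standard deviation.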

#1: The Normal PDF
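A minimal sketch of one way to work problem #1, assuming Python and its standard-library `statistics.NormalDist` (neither appears in the original slides). It treats ± 12 K as the two-sided 95% interval, so 12 K corresponds to about 1.96 standard deviations, and then evaluates the upper tail above 1310 K.

```python
from statistics import NormalDist

mean_K = 1304.0
half_width_K = 12.0  # 95% of measurements fall within this distance of the mean

# A two-sided 95% interval ends at the 97.5th percentile of the standard normal
z95 = NormalDist().inv_cdf(0.975)
sigma_K = half_width_K / z95

# Probability that a single measurement exceeds 1310 K (upper tail)
p_above = 1.0 - NormalDist(mean_K, sigma_K).cdf(1310.0)

print(f"z for 95% two-sided interval = {z95:.3f}")
print(f"sigma = {sigma_K:.2f} K")
print(f"P(T > 1310 K) = {p_above:.3f}")
```

Under these assumptions the standard deviation comes out near 6.1 K and the probability of exceeding 1310 K near 16%.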

#2: Error Propagation In a falling bead viscometer, the viscosity may be found by the following equation:

$$\mu = \frac{2 r^2 g (\rho_B - \rho_F)}{9V}$$

where r is the bead radius, g is gravitational acceleration, V is the terminal velocity, ρ_B is the bead density, and ρ_F is the fluid density. We find, within a 95% confidence level, that the bead density is 2 ± 0.1 g/cm^3, the radius is 3 ± 0.1 mm, the fluid density is 1.1 ± 0.2 g/cm^3, and, after terminal velocity is achieved, the bead falls 10 ± 0.2 cm in 12 ± 0.5 seconds. What is the calculated viscosity and the uncertainty in its value? Which measurement is the greatest source of error?

#2: Error Propagation A couple options:

#2: Error Propagation

Variable  Value    CI    Units    Value    CI    Units    f    Value     Units
g         9.80665  0     m/s^2    980.665  0     cm/s^2   f1   21.18236  g/cm/s
ρ_B       2        0.1   g/cm^3   2        0.1   g/cm^3   f2   23.53596  g/cm/s
ρ_F       1.1      0.2   g/cm^3   1.1      0.2   g/cm^3   f3   16.47517  g/cm/s
r         3        0.1   mm       0.3      0.01  cm       f4   22.61806  g/cm/s
d         10       0.2   cm       10       0.2   cm       f5   20.76702  g/cm/s
t         12       0.5   s        12       0.5   s        f6   22.06496  g/cm/s
                                                          f0   21.18236  g/cm/s

i     (f0-fi)^2
1     0
2     5.539414
3     22.15766
4     2.061216
5     0.172508
6     0.77898
sum      30.70977
sum^.5   5.54164

Viscosity = 21.18236 ± 5.54164 g/cm/s
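The tabulated perturbation approach can be sketched in a few lines of Python (an assumption; the slides do not say which tool was used). The sketch assumes the Stokes falling-bead form, viscosity = 2 r^2 g (ρ_B - ρ_F) / (9 V) with V = d / t, which reproduces the tabulated f0. Each input is raised by its confidence interval in turn, and the squared changes in the result are summed. The function and variable names are illustrative.

```python
import math

def viscosity(g, rho_b, rho_f, r, d, t):
    """Falling-bead viscosity in g/(cm s); all inputs in cgs units."""
    velocity = d / t  # terminal velocity, cm/s
    return 2.0 * r**2 * g * (rho_b - rho_f) / (9.0 * velocity)

# name: (value, 95% CI half-width), all in cgs units
inputs = {
    "g": (980.665, 0.0),
    "rho_b": (2.0, 0.1),
    "rho_f": (1.1, 0.2),
    "r": (0.3, 0.01),
    "d": (10.0, 0.2),
    "t": (12.0, 0.5),
}

nominal = {name: value for name, (value, _) in inputs.items()}
f0 = viscosity(**nominal)

# Perturb one input at a time by its CI and record the squared change in viscosity
contributions = {}
for name, (value, ci) in inputs.items():
    perturbed = dict(nominal)
    perturbed[name] = value + ci
    contributions[name] = (f0 - viscosity(**perturbed)) ** 2

uncertainty = math.sqrt(sum(contributions.values()))
largest = max(contributions, key=contributions.get)

print(f"viscosity = {f0:.5f} +/- {uncertainty:.5f} g/cm/s")
print(f"largest contribution: {largest}")
```

With these inputs the fluid density dominates the uncertainty, since its squared contribution is several times larger than any other term.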

#3: Log Normal You find the following particle size distribution from a spray dryer experiment (table of data). If we assume this distribution of particle sizes is log-normal, what would be the mean and standard deviation of the log-normal PDF? This is a nonlinear fitting problem, like #6.

#3: Log Normal
Range Max (um): 0, 0.5000, 1.0000, 1.5000, 2.0000, 2.5000, 3.0000, 3.5000, 4.0000, 4.5000, 5.0000, 5.5
Count: 0, 300, 426, 352, 257, 182, 129, 92, 66, 48, 36
Percentage (cumulative): 0, 15.8898, 38.4534, 57.0975, 70.7097, 80.3496, 87.1822, 92.0551, 95.5508, 98.0932, 100.0000

#4: Hypothesis Testing On a certain stage of a distillation column, theory predicts the ethanol concentration should be 27%. You take the following measurements over several runs:

Percent Ethanol: 24.6, 27.6, 21.7, 24.1, 22.6, 24.5, 33.2, 21.7, 17.7, 27.5

What is the likelihood that your measurements match theory?

#4: Hypothesis Testing Student's t-test. Mean = 24.52, StDev = 4.2163, degrees of freedom ν = n - 1 = 10 - 1 = 9

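#4: Hypothesis Testing T-Statistic:

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{24.52 - 27}{4.2163/\sqrt{10}} \approx -1.860$$

where \bar{x} is the sample mean, s is the sample standard deviation, n is the number of measurements, and \mu_0 = 27 is the value predicted by theory.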

#4: Hypothesis Testing Use the t-statistic in the Student's t CDF with ν = 9 to find the probability. Answer = 9.6% (two-tailed)
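A hedged sketch of the one-sample t-test for problem #4, assuming Python with only the standard library. Because the standard library has no Student's t CDF, the helper `t_cdf` below is an illustrative stand-in that integrates the t PDF numerically with Simpson's rule. It is not a library function.

```python
import math
from statistics import mean, stdev

def t_cdf(t, nu, steps=2000):
    """Student's t CDF via Simpson's-rule integration of the PDF from 0 to |t|."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))

    def pdf(x):
        return coef * (1 + x * x / nu) ** (-(nu + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

ethanol = [24.6, 27.6, 21.7, 24.1, 22.6, 24.5, 33.2, 21.7, 17.7, 27.5]
mu0 = 27.0  # concentration predicted by theory, percent

n = len(ethanol)
nu = n - 1  # degrees of freedom
t_stat = (mean(ethanol) - mu0) / (stdev(ethanol) / math.sqrt(n))

# Two-tailed probability: area in both tails beyond |t|
p_two_tail = 2 * t_cdf(-abs(t_stat), nu)

print(f"t = {t_stat:.3f}, nu = {nu}, two-tailed p = {p_two_tail:.3f}")
```

The two-tailed form is used here because the question asks whether the measurements match theory, not whether they fall on one particular side of it.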

#5: Hypothesis Testing 2 You are measuring the effectiveness of a new catalyst on a reaction with a great deal of normally distributed variability. You measure the time to 99% conversion of your reactants with both your new and old catalyst for several experimental runs and find the following data:

Old (min): 9.67, 11.51, 11.43, 9.76, 10.41, 10.82, 10.05, 10.27, 10.52, 8.66, 10.13, 12.78, 9.54, 9.92, 8.00, 10.63, 8.35, 10.81, 11.18, 10.85, 10.38, 9.90
New (min): 8.90, 9.94, 10.52, 8.70, 9.52, 10.80, 9.42, 10.23, 10.61, 9.33, 9.40, 9.14, 8.55, 11.14, 9.41, 10.35, 9.86, 9.57, 8.05, 6.47

Given this data, what is the probability that the new catalyst is more effective than the old? What is the probability that they are equally effective?

#5: Hypothesis Testing 2 With A = old catalyst and B = new catalyst: Mean A = 10.25, Mean B = 9.50; StDev A = 1.071, StDev B = 1.066; Number A = 22, Number B = 20; degrees of freedom ν = n_A + n_B - 2 = 40

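#5: Hypothesis Testing 2 T-Statistic (pooled two-sample form, consistent with ν = n_A + n_B - 2):

$$t = \frac{\bar{x}_B - \bar{x}_A}{s_p\sqrt{\frac{1}{n_A} + \frac{1}{n_B}}}, \qquad s_p^2 = \frac{(n_A - 1)s_A^2 + (n_B - 1)s_B^2}{n_A + n_B - 2}$$

With the unrounded means this gives t ≈ -2.295.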

#5: Hypothesis Testing 2 Simple rule:
– Greater-than or less-than tests use one tail (two unequal areas), and you can tell which percentage you want by looking at the means.
– Equality tests use two equal tails.
For the t CDF with ν = 40 at a t-statistic of -2.295, P ≈ 1.35%. Whether the new catalyst is more effective is a one-tail test.
More effective (one tail) = 100% - 1.35% ≈ 98.65%
Equal (two tail) = 2 × 1.35% ≈ 2.7%
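A hedged sketch of the two-sample comparison for problem #5, assuming Python with only the standard library and the pooled-variance t-statistic implied by ν = n_A + n_B - 2. It also assumes the first 22 tabulated values are the old catalyst and the last 20 are the new one, which matches the stated means. The helper `t_cdf` is an illustrative numerical integration, not a library function.

```python
import math
from statistics import mean, stdev

def t_cdf(t, nu, steps=2000):
    """Student's t CDF via Simpson's-rule integration of the PDF from 0 to |t|."""
    coef = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))

    def pdf(x):
        return coef * (1 + x * x / nu) ** (-(nu + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

# Time to 99% conversion, minutes
old = [9.67, 11.51, 11.43, 9.76, 10.41, 10.82, 10.05, 10.27, 10.52, 8.66, 10.13,
       12.78, 9.54, 9.92, 8.00, 10.63, 8.35, 10.81, 11.18, 10.85, 10.38, 9.90]
new = [8.90, 9.94, 10.52, 8.70, 9.52, 10.80, 9.42, 10.23, 10.61, 9.33,
       9.40, 9.14, 8.55, 11.14, 9.41, 10.35, 9.86, 9.57, 8.05, 6.47]

n_a, n_b = len(old), len(new)
nu = n_a + n_b - 2

# Pooled variance weights each sample variance by its degrees of freedom
pooled_var = ((n_a - 1) * stdev(old) ** 2 + (n_b - 1) * stdev(new) ** 2) / nu
t_stat = (mean(new) - mean(old)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

p_lower_tail = t_cdf(t_stat, nu)   # area below the (negative) t-statistic
p_new_faster = 1 - p_lower_tail    # one-tail: new catalyst reaches conversion sooner
p_two_tail = 2 * min(p_lower_tail, 1 - p_lower_tail)

print(f"t = {t_stat:.3f}, nu = {nu}")
print(f"one-tail P(new more effective) = {p_new_faster:.4f}")
print(f"two-tail P = {p_two_tail:.4f}")
```

The sign convention (new minus old) is chosen so that a shorter conversion time for the new catalyst gives a negative t-statistic.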

#6: Non-Linear Fit The rate of population growth in a bacteria culture is found to be:

Time (hr)   Rate (SRU)
0           0.0078
0.3158      0.2993
0.6316      0.1895
0.9474      0.3645
1.2632      0.3097
1.5789      0.2532
1.8947      0.3469
2.2105      0.3726
2.5263      0.0260
2.8421      -0.0107
3.1579      -0.0246
3.4737      -0.0623
3.7895      -0.2936
4.1053      -0.3387
4.4211      -0.2570
4.7368      -0.4667
5.0526      -0.2095
5.3684      -0.1778
5.6842      -0.2522
6.0000      -0.0271

It is thought that this data could be fit to the equation: Rate = b1*sin(b2*t), where b1 and b2 are constants to be determined and t is time. Determine the least squares estimated values for b1 and b2 and give an appropriate confidence interval for a confidence level of 90%. Also, what would you anticipate the rate to be at 24 hr? What would the confidence interval for a 95% confidence level be at 24 hr?

#6: Non-Linear Fit

%Anthony Butterfield 2009
%Example of nonlinear fit with CIs
clear
close all
b(1)=1/3;
b(2)=1;
re=0.1; %random noise strength
x=linspace(0,6,20)'; %x data for fitting
x2=linspace(0,6,100)'; %x data for plotting
n=length(x);
%model function; assumed to match the stated fit equation, since the original nlinfitsin file is not shown
nlinfitsin=@(p,x) p(1)*sin(p(2)*x);
y=b(1)*sin(b(2)*x)+re*randn(n,1); %y data for fitting, note the random error added in to make it realistic
yt=b(1)*sin(b(2)*x2); %theoretical y data for plotting
[beta,r,J]=nlinfit(x,y,nlinfitsin,[1 1]); %numerically performs a nonlinear fit
bci = nlparci(beta,r,J); %returns the c.i. for the parameters, beta
[ypred,delta] = nlpredci(nlinfitsin,x2,beta,r,J); %returns a predicted y and the c.i. for each y
disp('Fit to equation: y = b1 sin(b2 * x)')
disp(' x data y data')
for i=1:n
    txt=sprintf(' %5.3f %5.3f',x(i),y(i));
    disp(txt)
end
%note the doubled %% so that sprintf prints a literal percent sign
txt=sprintf('b1 was %3.1f, and is estimated to be: %f ± %f (95%% CL)',b(1),beta(1),abs(beta(1)-bci(1,1)));
disp(txt)
txt=sprintf('b2 was %3.1f, and is estimated to be: %f ± %f (95%% CL)',b(2),beta(2),abs(beta(2)-bci(2,1)));
disp(txt)
figure(1)
hold on
grid on
scatter(x,y,10,'r')
plot(x2,yt,'Color',[1 0.5 0]) %just wanted to give you an example of how to change the line color to something not preset
plot(x2,ypred,'b',x2,ypred+delta,'b:',x2,ypred-delta,'b:')
hold off

#6: Non-Linear Fit
nlparci:
In “theory” b1 = 0.3; estimated b1 = 0.35 ± 0.05 (90% CL)
In “theory” b2 = 1.0; estimated b2 = 1.04 ± 0.04 (90% CL)
nlpredci:
At 24 hr “theory” predicts: Rate = -0.3019
Fit predicts: Rate = -0.1090 ± 0.3839 (95% CL)
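For readers without MATLAB, the least-squares step can be sketched in plain Python. This is an illustrative Gauss-Newton iteration with step halving applied to the tabulated problem data, not the algorithm inside `nlinfit`. It produces point estimates only, with no confidence intervals. The starting guess [1, 1] mirrors the MATLAB call, and all names are assumptions.

```python
import math

# Time (hr) and rate (SRU) from the problem statement
t = [0, 0.3158, 0.6316, 0.9474, 1.2632, 1.5789, 1.8947, 2.2105, 2.5263, 2.8421,
     3.1579, 3.4737, 3.7895, 4.1053, 4.4211, 4.7368, 5.0526, 5.3684, 5.6842, 6.0]
rate = [0.0078, 0.2993, 0.1895, 0.3645, 0.3097, 0.2532, 0.3469, 0.3726, 0.0260,
        -0.0107, -0.0246, -0.0623, -0.2936, -0.3387, -0.2570, -0.4667, -0.2095,
        -0.1778, -0.2522, -0.0271]

def sse(b1, b2):
    """Sum of squared residuals for the model rate = b1*sin(b2*t)."""
    return sum((y - b1 * math.sin(b2 * x)) ** 2 for x, y in zip(t, rate))

b1, b2 = 1.0, 1.0  # starting guess
for _ in range(100):
    # Build the 2x2 normal equations (J^T J) delta = J^T r
    a11 = a12 = a22 = g1 = g2 = 0.0
    for x, y in zip(t, rate):
        s, c = math.sin(b2 * x), math.cos(b2 * x)
        resid = y - b1 * s
        j1, j2 = s, b1 * x * c  # d(model)/d(b1), d(model)/d(b2)
        a11 += j1 * j1
        a12 += j1 * j2
        a22 += j2 * j2
        g1 += j1 * resid
        g2 += j2 * resid
    det = a11 * a22 - a12 * a12
    d1 = (a22 * g1 - a12 * g2) / det
    d2 = (a11 * g2 - a12 * g1) / det

    # Halve the step until the fit does not get worse
    step, current = 1.0, sse(b1, b2)
    while sse(b1 + step * d1, b2 + step * d2) > current and step > 1e-8:
        step *= 0.5
    b1 += step * d1
    b2 += step * d2
    if abs(step * d1) + abs(step * d2) < 1e-10:
        break

pred_24 = b1 * math.sin(b2 * 24.0)  # extrapolated rate at 24 hr
print(f"b1 = {b1:.4f}, b2 = {b2:.4f}, SSE = {sse(b1, b2):.4f}")
print(f"predicted rate at 24 hr = {pred_24:.4f}")
```

The prediction at 24 hr is very sensitive to b2, because a small change in frequency shifts the phase substantially after several periods. This is consistent with the wide interval reported for the 24 hr prediction.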
