Selected Statistics Examples from Lectures
Matlab: Histograms
>> y = [ ]';
>> mean(y)
ans =
>> var(y)
ans =
>> std(y)
ans =
>> hist(y,9)          % histogram plot with 9 bins
>> n = hist(y,9)      % store result in vector n
>> x = [ ]'
>> n = hist(y,x)      % create histogram with bin centers specified by vector x
Probability Examples
• Probability that at least one coin turns up heads when five coins are tossed
  » Number of outcomes: $2^5 = 32$
  » Probability of each outcome: 1/32
  » Probability of no heads: $P(A^C) = 1/32$, where $A^C$ is the complement set of $A$
  » Probability of at least one head: $P(A) = 1 - P(A^C) = 31/32$
• Probability of getting an odd number or a number less than 4 from a single die toss
  » Probability of an odd number: $P(A) = 3/6$
  » Probability of a number less than 4: $P(B) = 3/6$
  » Probability of both: $P(A \cap B) = 2/6$
  » Probability of either: $P(A \cup B) = P(A) + P(B) - P(A \cap B) = 4/6 = 2/3$
Permutation Examples
• Process for manufacturing polymer thin films
  » Compute the probability that the first 6 films will be too thin and the next 4 films will be too thick if the thickness is a random variable with mean equal to the desired thickness
  » $n_1 = 6$, $n_2 = 4$
  » Probability:
• Encryption cipher
  » Letters arranged in five-letter words: $n = 26$, $k = 5$
  » Total number of different words: $n^k = 26^5 = 11{,}881{,}376$
  » Total number of different words containing each letter no more than once: $n!/(n-k)! = 26!/21! = 7{,}893{,}600$
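The probability for the thin-film example did not survive in the text above. A sketch of one way to obtain it, assuming the slide intends the standard count of permutations with two classes of alike items and that every ordering is equally likely (both are assumptions, not confirmed by the source):

$$
N = \frac{(n_1 + n_2)!}{n_1!\, n_2!} = \frac{10!}{6!\, 4!} = 210,
\qquad
P = \frac{1}{N} = \frac{6!\, 4!}{10!} = \frac{1}{210} \approx 0.48\%
$$

Only one of the 210 distinguishable orderings has all six too-thin films first, which is where the factor $1/N$ comes from.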
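Combination Examples
• Effect of repetitions
  » Three letters a, b, c taken two at a time ($n = 3$, $k = 2$)
  » Combinations without repetition: $\binom{n}{k} = \binom{3}{2} = 3$ (ab, ac, bc)
  » Combinations with repetition: $\binom{n+k-1}{k} = \binom{4}{2} = 6$ (ab, ac, bc, aa, bb, cc)
• 500 thin films taken 5 at a time
  » Combinations without repetition: $\binom{500}{5} = 255{,}244{,}687{,}600$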
Matlab: Permutations and Combinations
>> perms([2 4 6])          % all possible permutations of 2, 4, 6
>> randperm(6)             % returns one random permutation of the integers 1 to 6
>> nchoosek(5,4)           % number of combinations of 5 things taken 4 at a time without repetition
ans = 5
>> nchoosek(2:2:10,4)      % all possible combinations of 2, 4, 6, 8, 10 taken 4 at a time without repetition
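The counts in the cipher and thin-film examples can be cross-checked with a few lines of base Matlab. This is a minimal sketch; the variable names are illustrative and not from the lectures.

```matlab
% Cross-check of the permutation and combination counts (base Matlab only).
n = 26; k = 5;

words_with_repeats = n^k;                   % n^k five-letter words
words_no_repeats   = prod(n-k+1:n);         % n!/(n-k)! written as 22*23*24*25*26
pairs_no_repeats   = nchoosek(3, 2);        % a, b, c taken 2 at a time
pairs_with_repeats = nchoosek(3 + 2 - 1, 2);% combinations with repetition: C(n+k-1, k)
film_samples       = nchoosek(500, 5);      % 500 films taken 5 at a time

fprintf('Words with repeats:    %.0f\n', words_with_repeats);
fprintf('Words without repeats: %.0f\n', words_no_repeats);
fprintf('Pairs (no rep / rep):  %d / %d\n', pairs_no_repeats, pairs_with_repeats);
fprintf('Film samples:          %.0f\n', film_samples);
```

Writing $26!/21!$ as `prod(22:26)` keeps the arithmetic exact; `factorial(26)` is too large to be represented exactly in double precision.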
Continuous Distribution Example
• Probability density function
• Cumulative distribution function
• Probability of events
Matlab: Normal Distribution
• Normal distribution: normpdf(x,mu,sigma)
  >> normpdf(8,10,2)
  ans =
  >> normpdf(9,10,2)
  ans =
  >> normpdf(8,10,4)
  ans =
• Normal cumulative distribution: normcdf(x,mu,sigma)
  >> normcdf(8,10,2)
  ans =
  >> normcdf(12,10,2)
  ans =
• Inverse normal cumulative distribution: norminv(p,mu,sigma)
  >> norminv([ ],10,2)
  ans =
• Random number from normal distribution: normrnd(mu,sigma,v)
  >> normrnd(10,2,[1 5])
  ans =
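For reference, these functions evaluate the following quantities. The definitions are the standard ones for the normal distribution, stated here as a reminder rather than copied from the slide:

$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
\qquad
F(x) = \int_{-\infty}^{x} f(v)\, dv,
\qquad
x_p = F^{-1}(p)
$$

Here `normpdf` returns $f(x)$, `normcdf` returns $F(x)$, and `norminv` returns $x_p$. As a worked instance, `normpdf(8,10,2)` corresponds to

$$
f(8) = \frac{1}{2\sqrt{2\pi}} \exp\!\left(-\frac{(8-10)^2}{2 \cdot 2^2}\right) = \frac{e^{-1/2}}{2\sqrt{2\pi}} \approx 0.1210
$$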
Matlab: Normal Distribution Example
• The temperature of a bioreactor follows a normal distribution with an average temperature of 30 °C and a standard deviation of 1 °C. What percentage of the reactor operating time will the temperature be within ±0.5 °C of the average?
• Calculate the cumulative probability at 29.5 °C and 30.5 °C, then take the difference:
  >> p = normcdf([ ],30,1)
  p = [ ]
  >> p(2) - p(1)
• The reactor temperature will be within ±0.5 °C of the average about 38% of the operating time
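The 38% figure can be reproduced without the statistics functions by writing the normal cumulative distribution in terms of `erfc`, which is part of base Matlab. This is a sketch of an independent check, not the method used in the lectures; the handle name `Phi` and the variable `p_within` are illustrative.

```matlab
% Independent check of the bioreactor example using only base Matlab.
mu    = 30;   % average temperature, deg C
sigma = 1;    % standard deviation, deg C

% Normal CDF expressed through the complementary error function.
Phi = @(x) 0.5 * erfc(-(x - mu) ./ (sigma * sqrt(2)));

p        = Phi([29.5 30.5]);   % cumulative probabilities at both limits
p_within = p(2) - p(1);        % probability of being within +/- 0.5 deg C

fprintf('P(29.5 <= T <= 30.5) = %.4f\n', p_within);   % about 0.3829
```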
Discrete Distribution Example
• Matlab function: unidcdf(x,n)
>> x = (0:6);
>> y = unidcdf(x,6);
>> stairs(x,y)
Poisson Distribution Example
• Probability of a defective thin polymer film: $p = 0.01$
• What is the probability of more than 2 defects in a lot of 100 samples?
• Binomial distribution: mean $\mu = np = (100)(0.01) = 1$
• Since $p \ll 1$, the Poisson distribution can be used to approximate the solution
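The numerical answer is not given above. A short worked version, using the Poisson probability function with mean $np = 1$ (the symbol $\mu$ for the mean is an assumption about the lecture's notation):

$$
P(X > 2) = 1 - \sum_{x=0}^{2} \frac{\mu^x e^{-\mu}}{x!}
= 1 - e^{-1}\left(1 + 1 + \tfrac{1}{2}\right) \approx 0.0803
$$

For comparison, the exact binomial result is

$$
P(X > 2) = 1 - \sum_{x=0}^{2} \binom{100}{x} (0.01)^x (0.99)^{100-x} \approx 0.0794,
$$

so the Poisson approximation is off by about 0.001 in this case.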
Matlab: Maximum Likelihood
• In a chemical vapor deposition process to make a solar cell, there is an 87% probability that a sufficient amount of silicon will be deposited in each cell. Find the maximum likelihood estimate of the probability of success from a sample of 5000 cells.
• s = binornd(n,p): randomly generates the number of successes in n trials, each with probability p of success
  >> s = binornd(5000,0.87)
  s = 4338
• phat = binofit(s,n): returns the maximum likelihood estimate of p given s successes in n trials
  >> phat = binofit(s,5000)
  phat =
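The estimate that `binofit` returns can be derived by hand. A brief sketch of the standard derivation for the binomial likelihood, with $s$ successes in $n$ trials:

$$
L(p) = \binom{n}{s} p^{s} (1-p)^{n-s},
\qquad
\frac{d \ln L}{dp} = \frac{s}{p} - \frac{n-s}{1-p} = 0
\;\;\Rightarrow\;\;
\hat{p} = \frac{s}{n}
$$

For the sample shown, $\hat{p} = 4338/5000 = 0.8676$. Because `binornd` draws a random number, a different run gives a different $s$ and therefore a slightly different $\hat{p}$.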
Mean Confidence Interval Example
• Measurements of polymer molecular weight (scaled by )
• Confidence interval
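The interval formula itself is missing above. Assuming the example uses the usual interval for the mean of a normal population with unknown variance (an assumption; the notation below is also not taken from the slide), it has the form

$$
\bar{x} - t_{1-\alpha/2,\, n-1} \frac{s}{\sqrt{n}}
\;\le\; \mu \;\le\;
\bar{x} + t_{1-\alpha/2,\, n-1} \frac{s}{\sqrt{n}}
$$

where $\bar{x}$ is the sample mean, $s$ the sample standard deviation, $n$ the number of measurements, $1-\alpha$ the confidence level, and $t_{q,\,\nu}$ the $q$-quantile of the t-distribution with $\nu$ degrees of freedom.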
Variance Confidence Interval Example
• Measurements of polymer molecular weight (scaled by )
• Confidence interval
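The interval formula is missing here as well. Assuming the usual chi-square interval for the variance of a normal population (an assumption, with notation that is not taken from the slide):

$$
\frac{(n-1)\, s^2}{\chi^2_{1-\alpha/2,\, n-1}}
\;\le\; \sigma^2 \;\le\;
\frac{(n-1)\, s^2}{\chi^2_{\alpha/2,\, n-1}}
$$

where $s^2$ is the sample variance, $n$ the number of measurements, $1-\alpha$ the confidence level, and $\chi^2_{q,\,\nu}$ the $q$-quantile of the chi-square distribution with $\nu$ degrees of freedom. The interval is not symmetric about $s^2$ because the chi-square distribution is skewed.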
Matlab: Confidence Intervals
>> [muhat,sigmahat,muci,sigmaci] = normfit(data,alpha)
• data: vector or matrix of data
• alpha: significance level; the confidence level is 1-alpha
• muhat: estimated mean
• sigmahat: estimated standard deviation
• muci: confidence interval on the mean
• sigmaci: confidence interval on the standard deviation
>> [muhat,sigmahat,muci,sigmaci] = normfit([ ],0.05)
muhat =
sigmahat =
muci =
sigmaci =
Mean Hypothesis Test Example
• Measurements of polymer molecular weight
  » Hypothesis: $\mu_0 = 1.3$ instead of the alternative $\mu_1 < \mu_0$
  » Significance level: $\alpha = 0.10$
• Degrees of freedom: $m = 9$
• Critical value
• Sample t
• Reject hypothesis
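The critical value and sample statistic are not preserved above. A sketch of the quantities involved, assuming the standard left-sided t-test for a mean with unknown variance (the notation is an assumption):

$$
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}},
\qquad
n = m + 1 = 10,
\qquad
c = -t_{1-\alpha,\, m} = -t_{0.90,\, 9} \approx -1.38
$$

The hypothesis $\mu_0 = 1.3$ is rejected in favor of $\mu_1 < \mu_0$ when $t \le c$. The value of $t$ depends on the measurements, which are not reproduced here.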
Polymerization Reactor Control
• Lab measurements
  » Viscosity and monomer content of polymer every four hours
  » Three monomer content measurements: 23%, 18%, 22%
• Mean content control
  » Expected mean and known variance: $\mu_0 = 20\%$, $\sigma^2 = 1$
  » Control limits:
  » Sample mean: $\bar{x} = (23 + 18 + 22)/3 = 21\%$, so the process is in control
[Figure: polymerization reactor fed with monomers, catalysts, solvent, and hydrogen; on-line measurements on the reactor; lab measurements of polymer monomer content and viscosity]
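The control limits are missing above. A sketch of how they could be computed for a sample mean with known variance, assuming a 1% significance level (the level actually used in the lecture is not recoverable from the text):

$$
\mathrm{LCL},\ \mathrm{UCL} = \mu_0 \mp z_{1-\alpha/2} \frac{\sigma}{\sqrt{n}}
= 20 \mp 2.58 \cdot \frac{1}{\sqrt{3}}
\approx 20 \mp 1.49
$$

This gives limits of about 18.51% and 21.49%, and the sample mean of 21% lies between them. Using three-sigma limits instead ($20 \mp 3/\sqrt{3} \approx 20 \mp 1.73$) leads to the same conclusion.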
Goodness of Fit Example
• Maximum likelihood estimates
Goodness of Fit Example cont.
Linear Regression Example
• Reaction rate data
  [Table: Experiment | Reactant Concentration | Rate]
• Sample variances and covariance
• Linear regression
• Confidence interval for a confidence level of 0.95 and m = 6 degrees of freedom
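The regression formulas are not preserved above. Assuming the example fits a straight line by least squares, with concentration $c$ as the independent variable and rate $r$ as the dependent variable (the symbols $k_0$ and $k_1$ are illustrative), the sample variance and covariance enter as follows:

$$
s_c^2 = \frac{1}{n-1} \sum_{j=1}^{n} (c_j - \bar{c})^2,
\qquad
s_{cr} = \frac{1}{n-1} \sum_{j=1}^{n} (c_j - \bar{c})(r_j - \bar{r})
$$

$$
r = k_0 + k_1 c,
\qquad
k_1 = \frac{s_{cr}}{s_c^2},
\qquad
k_0 = \bar{r} - k_1 \bar{c}
$$

With two fitted parameters the degrees of freedom are $m = n - 2$, so $m = 6$ would correspond to $n = 8$ experiments.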
Matlab: Linear Regression Example
>> c = [ ];
>> r = [ ];
>> [k, kint] = regress(r', [ones(length(c), 1), c'], 0.05)
k =
kint =
Correlation Analysis Example
• Polymerization rate data
  [Table: Experiment | Hydrogen Concentration | Polymerization Rate]
• Correlation coefficient
• Significance level $\alpha = 5\%$, $m = 6$, critical value $c = 1.94$
• Compute t-statistic
• Reject hypothesis that the hydrogen concentration and the polymerization rate are uncorrelated
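The t-statistic formula is missing above. Assuming the standard test of zero correlation for normally distributed data, with sample correlation coefficient $r$ and $m = n - 2$ degrees of freedom:

$$
t = r \sqrt{\frac{m}{1 - r^2}},
\qquad
c = t_{1-\alpha,\, m} = t_{0.95,\, 6} \approx 1.94
$$

The hypothesis of no correlation is rejected when $t > c$. The value $m = 6$ implies $n = 8$ experiments.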
Matlab: Correlation Analysis Example
>> h = [ ];
>> p = [ ];
>> R = corrcoef(h,p)
R =
>> t = ttest(h,p)
t = 1 (reject hypothesis)
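One caveat on the last command: `ttest(h,p)` performs a paired-sample t-test on the differences `h - p`, so it tests whether the two means are equal rather than whether the variables are uncorrelated. A sketch of how the correlation test could be carried out instead, using the second output of `corrcoef` and the t-statistic computed by hand. The data below are hypothetical placeholders, not the lecture's measurements.

```matlab
% Hypothetical data for illustration only (8 experiments, so m = 6).
h = [1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0];   % hydrogen concentration (made up)
p = [2.1 2.9 4.2 4.8 6.1 6.9 8.2 8.8];   % polymerization rate (made up)

% corrcoef returns the correlation matrix R and a matrix P of p-values
% for the hypothesis of no correlation (two-sided).
[R, P] = corrcoef(h, p);
r = R(1, 2);

% t-statistic for the correlation coefficient with m = n - 2 degrees of freedom.
m = numel(h) - 2;
t = r * sqrt(m / (1 - r^2));

c = 1.94;                 % one-sided critical value for alpha = 5%, m = 6
reject = t > c;           % true means the no-correlation hypothesis is rejected

fprintf('r = %.4f, t = %.2f, p-value = %.2g, reject = %d\n', r, t, P(1, 2), reject);
```

With strongly correlated data such as these placeholders, both the hand-computed statistic and the p-value from `corrcoef` lead to rejection.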