
1 Statistics Overview ©2010 Dr. B. C. Paul

2 Why Are Statistics Important to Engineers? Engineers build models (often mathematical models) of systems and things that we cannot afford to get wrong and learn about the hard way.

3 Modeling We build a mathematical model of the situation and then do the math to see if it is going to work for us in the real world. We may not think of it, but most of our engineering design equations are mathematical models that were fit to actual data long ago – Newtonian physics (we call them laws now) – Darcy’s law and the Bernoulli equation

4 How Do You Decide if a Mathematical Model Fits What You See? Because you usually can’t measure with 100% accuracy, and you don’t think of or can’t consider every minor effect – Real results tend to be distributed around what our mathematical models predict. Statistical models consider a distribution of answers around an underlying trend – You can know the shape and spread of the variation without knowing the cause

5 Example If I have a random number generator that produces numbers between 1 and 100, what value is most likely? If I take 25 of those random numbers what will the average value most likely be close to?

6 What Did You Assume to Get Those Answers? You assumed how those values were distributed – You considered what was called a uniform distribution (all numbers are equally likely to come up) – Statistics begins with a series of standard mathematical distributions We try to pick one that most nearly matches our reality
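
The random number question can be tried out directly. The sketch below is only an illustration, not part of the original slides: it assumes whole numbers from 1 to 100, and the seed, the number of repeats, and the function name are arbitrary choices made here.

```python
import random
from statistics import mean

def average_of_draws(rng, draws=25, low=1, high=100):
    """Average of `draws` integers picked uniformly at random from low..high."""
    return mean(rng.randint(low, high) for _ in range(draws))

rng = random.Random(2010)  # fixed seed (arbitrary) so the run is repeatable

# Repeat the "take 25 numbers and average them" experiment many times.
sample_averages = [average_of_draws(rng) for _ in range(2000)]

expected = (1 + 100) / 2                 # center of the range
typical_average = mean(sample_averages)  # what the averages cluster around

print("center of the range:", expected)
print("typical average of 25 draws:", round(typical_average, 2))
```

Under a uniform distribution no single value is more likely than any other, but the averages of 25 draws cluster near the center of the range.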

7 Getting Your Answers You also assumed that the numbers were taken from that distribution at random – i.e., no one is cherry-picking some values in preference to others – This is one of the reasons statisticians get so crazy if they think someone is cherry-picking the sample. The root of all statistics is that you assume reality follows a standard mathematical distribution and that the part we see was picked at random from that distribution

8 How Do We Come Up With What Distribution Closely Resembles Our Reality? Process starts with figuring out which of our standard model distributions it is. Three levels of effort. Level 1 – Say “I Believe” and assume one – Most commonly done with the “Normal Distribution” (the “Bell Curve”) – Many things tend to be normally distributed – Strength of past experience becomes the rationale – Also have people who do it without having any idea what they have done – Standard statistics is built around the normal distribution

9 Levels of Effort Level 2 – Study the distribution to see if we are doing something terrible – Common approach is called a “Histogram”: a bar graph showing how often our data fall in each range of values, so we can look at the shape – Also have things like probability paper, where you plot your data and see if you get a straight line

10 Effort Level 3 Use statistical techniques to test whether our sample data is like a set that could reasonably be pulled from some standard distribution – These are often called goodness-of-fit tests. All three levels of effort are customary in some fields of practice

11 Measuring Properties of Distributions Put sample data into a standard equation that generates a number – Often actually call that number a statistic – Measures some property of the distribution that the data was taken from Some statistics have obvious tangible meaning – Example - Mean - mathematical average value of the sample or population

12 Calculating a Mean (or simple average) Add up all the numbers and then divide by however many numbers you added. Example – Numbers 5, 10, 15, 20, 25 – What is the Mean? Calculate – (5 + 10 + 15 + 20 + 25)/5 – Numerator totals to 75 – Denominator is the number of values I put in – Divide the total by the number of values put in – Answer is 15 (the Mean or Average Value)

13 Statisticians Need Confusing Ways to Write Equations $X_i$ means a sample value – The $i$ subscript tells you whether it was the first, second, third, etc. sample. From the example on the last slide we know $X_2$ was the second number we looked at, which was 10. $\Sigma$ means the sum of a series of values. $n$ means the number of samples considered. Thus we write the formula for the mean as – $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ – We of course also have a special symbol for a mean – $\bar{X}$ for a sample mean ($\mu$ for the mean of the whole population)
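
The notation maps one-to-one onto a few lines of code. This is a minimal sketch using the five numbers from the mean example; the variable names are chosen here for illustration.

```python
samples = [5, 10, 15, 20, 25]   # X_1 ... X_5 from the example

n = len(samples)                # n: number of samples considered
total = sum(samples)            # capital sigma: the sum of the values
x_bar = total / n               # the mean

# Python counts positions from 0, so X_2 (the second sample) is index 1.
x_2 = samples[1]

print(total, n, x_bar, x_2)
```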

14 More Measurements Mode – The value that has the greatest chance of coming up Example – If I have 10 people who are 5’10” – 2 people who are 4’3” – 2 people who are 6’10” – If you pick a person at random from my group what height will person most likely be?

15 More Measures Median – Half of the values are higher, half are lower. Mean, Median, and Mode all seem to have somewhat obvious physical meanings. Other statistics are less obvious – Variance – A number that comes out of a formula that tells you how spread out the distribution is – Square root of variance is Standard Deviation – Roughly the typical difference between a sample and the mean value

16 The Standard Deviation Standard Deviation is roughly the typical difference between individual samples and the mean (strictly, a root-mean-square difference). What does it mean? Take each sample number, subtract the average sample value from it, and square the result. Do this for every number and add up the results, then divide the total by one less than the number of samples you took, and then take the square root of that value: $s = \sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}}$

17 As a Practical Matter That’s a Pain I have to compute the average before I can do the math for standard deviation. An alternative formula tells you to keep track of two numbers: 1 - Take each number, square it, and then add the squares up ($\sum X_i^2$) 2 - Take each number, add them up, and then square the total ($(\sum X_i)^2$). Then $s = \sqrt{\frac{\sum X_i^2 - (\sum X_i)^2/n}{n-1}}$

18 Getting Standard Deviation Statistical calculators have multiple memories – They add up numbers in one memory – They square and add up numbers in another – They count the entries in another – They then apply the standard deviation formula. Of course you can also use SPSS
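
Mean, median, and mode can be compared on the height example. This sketch assumes the heights are converted to inches (5’10” = 70, 4’3” = 51, 6’10” = 82); the list name is made up for illustration.

```python
from statistics import mean, median, mode

# 10 people at 70 in, 2 people at 51 in, 2 people at 82 in
heights_in = [70] * 10 + [51] * 2 + [82] * 2

most_likely_height = mode(heights_in)    # value that comes up most often
middle_height = median(heights_in)       # half above, half below
average_height = mean(heights_in)        # add them up, divide by the count

print(most_likely_height, middle_height, average_height)
```

The three measures need not agree: here the mode and median are both 70 inches, while the two short people pull the mean slightly below that.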
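
Both routes to the standard deviation can be checked against each other. The sketch below reuses the numbers 5, 10, 15, 20, 25 from the mean example; the running-totals loop is meant to mimic what a calculator's memories do, and the variable names are illustrative.

```python
from math import sqrt, isclose
from statistics import stdev

samples = [5, 10, 15, 20, 25]
n = len(samples)

# Route 1: the definition, which needs the mean before anything else.
x_bar = sum(samples) / n
s_definition = sqrt(sum((x - x_bar) ** 2 for x in samples) / (n - 1))

# Route 2: the shortcut, one pass keeping two running totals.
sum_x = 0.0
sum_x_squared = 0.0
for x in samples:
    sum_x += x              # total of the numbers
    sum_x_squared += x * x  # total of the squared numbers
s_shortcut = sqrt((sum_x_squared - sum_x ** 2 / n) / (n - 1))

print(round(s_definition, 4), round(s_shortcut, 4), round(stdev(samples), 4))
```

One caution about the shortcut: when the values are very large compared with their spread, subtracting two nearly equal totals can lose precision in floating point, so the definition form is the safer one in software.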

19 Types of Distributions Idea is that we try to approximate reality with a mathematically defined distribution – Then we can use mathematical operations to predict our answers. Distributions that often fit reality – Normal Distribution (developed in 1733), the Bell Curve – Uniform Distribution – Binomial Distribution – T Distribution – Chi-Square Distribution – Lognormal Distribution

20 Derived Distributions T, Chi-Square, and Lognormal Distributions are all derived from the Normal Distribution for specific types of situations

21 Normal Distribution Shaped like [figure: bell curve]. Formula: $y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

22 Symmetric Distributions with a Central Tendency Normal Distribution is the classic example – Most of the chances are right near the center of the distribution – Frequency drops off to the sides – Mode is at the center of the distribution – Distribution is a mirror image about its center, which allows us to compute just one side – The median, the mean, and the mode are all the same value. A lot of reality has central tendency with relatively symmetric sides – T distribution is like that too; its sides slope off a little differently

23 Why the Normal Distribution One of the first mathematically defined distributions that was a really good fit – People developed other formulas and distributions from calculations done on the normal distribution – T distribution and Chi-Square distribution both result from performing mathematical operations on samples from a normal distribution – Normal distribution was first to press with a distribution that was heavy at the center and symmetric

24 Reality 101 for Statistical Distributions There is probably no such thing as a real normal distribution in life. Even if there were, we almost never count each and every member of the population, so you’d never know if it was. Statistical distributions let us take limited data – see what it approximately is – Then use the defined mathematical model to suddenly know everything about it

25 Back to Why the Normal Distribution A big part of the real world has central tendency and symmetry. Calculations done with a normal distribution were found to be robust – Minor lack of fit in real-world data doesn’t change the answers much – Thus it works on almost anything with central tendency and near symmetry

26 Most Common Lack of Fit Not symmetric [figure: skewed distribution] – Robustness covers a little skewness – This type of shape can be fit with a distribution adapted from the normal, called lognormal – If you take averages of about 25 samples from this, the averages will be approximately normal (averaging normalizes) – Taking logarithms of the data will make the transformed distribution normal – Taking the square root will normalize a few others

27 Multi-Modal Distributions [figure: distribution with three peaks] These types of distributions are often several different normally distributed families (three here) overlying each other. Finding what is causing the separate families often helps us to better understand our world

28 Uniform Distribution All values within some range (which may or may not be plus or minus infinity) are equally likely Distribution has no central tendency Tends to be associated with truly random events (or at least events where the underlying cause is eluding our mathematical modeling)

29 Characteristics of Uniform Distribution Because all values are equally likely it has no mode. Mean is at the center of the range. Uniform is still symmetric about the mean, so the median and mean are the same. Standard deviation is the range divided by √12, or about 0.29 of the range (if the range is infinite obviously that’s not defined). Variance is standard deviation squared

30 Binomial Distribution Outcomes that are either off or on – Clearly describes computers and digital data Many things either work or they don’t – Mining dealing with whether our trucks are in working order – Water treatment plant – water purification train is working or not working – Coin tosses are heads or tails

31 New Problem Can’t talk about means, modes, and medians because the outcome has no continuous distribution. Want to know what fraction of the outcomes are “yes” – P = 0.85 means 85% of members of the binomial population are positive. Usually interested in what the chances are that we can take 5 members out of the population and have them all positive – Example: if I have 5 mining trucks, how much of the time will all 5 be running?

32 The Ordinate Problem How continuously distributed are our outcomes? – Our number line is continuous, so at first glance we almost assumed everything was continuous. When are they not, and what if they are not? This usually doesn’t take a very smart statistician to figure out. Some things are yes or no distributed – Use the binomial distribution model. Duh!
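
The truck question can be worked with the binomial formula. This sketch assumes, beyond what the slide states, that each truck runs or fails independently of the others with the same P = 0.85; the function and variable names are invented for illustration.

```python
from math import comb

def prob_exactly(k, n, p):
    """Binomial chance of exactly k 'yes' outcomes out of n independent tries."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

p_running = 0.85   # chance any one truck is in working order
trucks = 5

p_all_running = prob_exactly(5, trucks, p_running)
p_at_least_four = p_all_running + prob_exactly(4, trucks, p_running)

print(round(p_all_running, 4), round(p_at_least_four, 4))
```

With these assumptions all five trucks are running a little under half the time, even though each one is available 85% of the time.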

33 Some Things are Integer Distributed Continuity really is a function of observational scale – According to quantum physics everything is made of integer numbers of discrete quanta – At our observation scale the little integer jumps are perhaps so small we cannot even measure them – Many times integer continuity is negligible

34 What If Integer Continuity is Not Negligible? Happens when we have small numbers or integer distributed data – How does one deal with teacher rankings in classes of 5 students? Our scale of observation is integer, and our sample size is small enough that we can’t mask it. If it was a class of 500 students we could probably model outcomes rather well as if continuous. For these cases there are Non-Parametric Statistical Models

35 Using Statistics Confidence Intervals and Hypothesis Tests What would you say if we did a coin toss and I came up heads and won? What if I did it to you 50 times in a row? If something differs too much from the expected value you question the things you assumed – Null hypothesis is that nothing is going on – Rejecting the null hypothesis means you question the fundamental assumptions.

36 Statistical Tests What is a confidence interval? [figure: normal curve with a sample value far out in the tail] If I take a sample, where is it most likely to come from? Suppose I pull a sample and its value is from way out here. What do I know? That was pretty unlikely to happen. In fact, at some point I’m going to wonder whether I really got it from that population. Confidence interval problems all have the flavor of deciding how far out in the tails, how rare, the sample is or would be if you could get it.

37 Too Many Normal Distributions Normal distribution is defined by its mean and standard deviation – There are endless possibilities. We start by standardizing our results to a standard normal distribution with a mean of 0 and a stdev of 1 – Has the form $Z = \frac{X - \mu}{\sigma}$

38 Just Any Normal Distribution [figure: our value X on an arbitrary normal distribution, mapped onto the standard normal distribution with mean 0 and stdev 1] Our formula converts that point to an equivalent point on the standard normal distribution.

39 Once We Are On A Standard Normal Distribution We look at how extreme a value we have. What % of the values are more extreme than this?

40 Preparing for Rainfall Wendy Wetone has just designed a storm sewer system for a new housing project – Culverts and intakes will handle a 2.5 inch rainfall in 24 hours – The average big rainfall event in the area is only 1.25 inches. Wendy is OK, right? If the roads and homes in an area are going to wash out, maybe being ready for an average rain isn’t good enough

41 Reality for Major Rainfall Events Average is 1.25 inches, but suppose there is a 1 inch standard deviation [figure: normal curve with μ = 1.25, σ = 1, and the upper tail marked]. How would we know something like this? We built a model from weather records. Is there enough of a chance up here that I should be getting heartburn over this design?

42 We Know How to Solve This One Normal distribution is fully defined by a formula. We only need to know the average (in this case 1.25) and the variance (standard deviation squared – easy when standard deviation is one)

43 What That Formula Does Y is a probability density (relative chance of occurrence). X in this case is a rainfall event. Rather obviously we are interested in rainfall events greater than 2.5 inches – Guess that means x is 2.5. Problem – Formula gives a value for only a single point, i.e., it only tells us about a rain event of exactly 2.5 inches – We are in fact worried about any event that exceeds our design capacity

44 That’s not a Problem for Us Smart Engineers Just integrate the function from 2.5 inches on up – In fact most statistical modeling is done on cumulative probability distributions (i.e., integrated areas under the probability density function). Just one little problem – Normal probability density function is one of those beasts that the math teachers don’t like to talk about – can’t get a closed-form analytical solution for the integral

45 That’s Only a Problem for Mathematicians We have numeric integration. OK, maybe that is a problem if we have to integrate that thing – Remember – desktop computers are of recent vintage. Do you have a numeric integration package on your computer even now? Normal distribution dates from 1733, so you know someone created tables of the numeric integration
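
For readers who do want to see the numeric integration, here is a minimal sketch using the trapezoid rule on the rainfall numbers (mean 1.25, standard deviation 1, design capacity 2.5). The step count and the choice to stop integrating 10 standard deviations above the mean are assumptions made here, not anything from the slides.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x."""
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def upper_tail(x_start, mu, sigma, steps=10_000, cutoff_sigmas=10):
    """Trapezoid-rule area under the normal curve from x_start upward.

    Integration stops cutoff_sigmas above the mean, where the curve is
    so close to zero that the remaining area is negligible.
    """
    x_end = mu + cutoff_sigmas * sigma
    width = (x_end - x_start) / steps
    area = 0.5 * (normal_pdf(x_start, mu, sigma) + normal_pdf(x_end, mu, sigma))
    for i in range(1, steps):
        area += normal_pdf(x_start + i * width, mu, sigma)
    return area * width

chance_over_design = upper_tail(2.5, mu=1.25, sigma=1.0)
print(round(chance_over_design, 4))
```

The printed tables do the same job: someone ran this kind of integration once on the standard normal curve and published the areas.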

46 Normal Distribution Table

47 Converting to a Value on Standard Normal Distribution What we want to know is the chance of a rainfall event exceeding our drainage system design – i.e., what percentage of big rainstorms will exceed 2.5 inches (on a distribution with mean of 1.25 and standard deviation of 1). Convert 2.5 inches to an equivalent value on the standard normal distribution – The area under the curve above that value will be the same as on our actual distribution.

48 Magic Conversion Formula $Z = \frac{X - \mu}{\sigma} = \frac{2.5 - 1.25}{1} = 1.25$

49 Now It’s Look-Up Time For Z = 1.25 the table gives Prob = 0.8944

50 Results Table shows that from minus infinity to 1.25 there is 0.8944 – i.e., 0.1056 is above 1.25. English translation – There is a 10.56% chance that a large rainfall event will exceed the design capacity of our drainage system – Sounds like Wendy might be doing some design work over

51 Basis for Rainfall Events 10% chance called a 10 year storm (distribution of each year’s largest storm) – 5% chance called a 20 year storm – 1% chance called a 100 year storm. When we say it is designed for a 100 year flood it doesn’t mean it only happens every 100 years – It means a 1% chance in any given year – Problem with the other thinking is that if you had a big flood 5 years ago, that must mean there is no chance it will ever happen again in your lifetime (Wrong!)

52 Ore Grade Control Orville Orman is planning a truck fleet to haul his copper ore out of his mine – Some rock will have so little copper in it that it would cost more to process than it’s worth – This stuff is going to get put aside – Other pay rock will be carried to the processing plant. Commonly have ore and waste truck fleets, but need to know how much of each type of rock you will have in order to design them.
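
The table look-up can be reproduced with Python's standard library, which stands in for the printed table here. This is a sketch of Wendy's calculation; the variable names are illustrative.

```python
from statistics import NormalDist

mu = 1.25              # average big rainfall event, inches
sigma = 1.0            # standard deviation, inches
design_capacity = 2.5  # what the culverts and intakes can handle, inches

# Convert to the standard normal distribution.
z = (design_capacity - mu) / sigma

# Area from minus infinity up to z (what the table reports).
area_below = NormalDist().cdf(z)

# Everything above z exceeds the design.
chance_exceeded = 1 - area_below

print(z, round(area_below, 4), round(chance_exceeded, 4))
```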

53 Orville’s Ore Orville knows average grade is 0.95% Cu. Standard deviation is, say, 0.5% Cu. Cut-off grade (point at which ore costs more to process than the Cu will sell for) is 0.25%. What percentage of Orville’s ore is below cut-off grade?

54 The Situation [figure: normal curve with μ = 0.95, σ = 0.5, and the cut-off marked at 0.25] How much of my rock is down here, below 0.25?

55 Oh We Are Hot Our critical x value is 0.25 We will convert this to a “Z score” from the standard normal distribution We will then look up in the table how much of our distribution is from minus infinity to our Z We will then tell our truck planners how much rock to prepare for

56 Crunch Away $Z = \frac{0.25 - 0.95}{0.5} = -1.4$ Go to the table. Table says 0.0808! About 8.1% of our rock is below cut-off

57 Previous Examples Called One-Tailed Tests – Our civil engineers were concerned about events larger than some amount: an upper-tail test – Our mining engineers were concerned about tonnage below cut-off: a lower-tail test. What if we are interested in either too much or too little? – Typical of a machine tolerance problem

58 Tolerance Benjamin Bidwell would like to bid on a DOD order for machined shafts – The spec says 1 inch +/- 0.005 inches – Benjamin knows his men and equipment can put any chosen part size within a standard deviation of 0.0025 inches – He figures he can put in a winning bid provided no more than 3% of the pieces he makes have to be rejected. Can Benjamin put in a winning bid on this order?

59 The Situation [figure: normal curve with μ = 1, σ = 0.0025, and limits marked at 0.995 and 1.005] How many products are out here in the tails?

60 We Know What to Do Convert those limits to Z scores. Start with the top limit: $Z = \frac{1.005 - 1}{0.0025} = 2.0$. Table look-up says 0.9773, so 2.27% will be too large. Now we use our knowledge – this distribution and tolerance are symmetric – i.e., 2.27% on the bottom end too. That sucks – about 4.54% of products will be out of spec, which is more than the 3% Benjamin can live with
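
Orville's lower-tail look-up follows the same pattern as the rainfall problem, except that the area of interest is below the critical value. A minimal sketch, with the standard library standing in for the table and names chosen for illustration:

```python
from statistics import NormalDist

mean_grade = 0.95      # % Cu
stdev_grade = 0.5      # % Cu
cutoff_grade = 0.25    # % Cu

# Z score of the cut-off on the standard normal distribution.
z = (cutoff_grade - mean_grade) / stdev_grade

# Area from minus infinity to z is the fraction of rock below cut-off.
fraction_waste = NormalDist().cdf(z)

print(round(z, 2), round(fraction_waste, 4))
```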
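
The two-tailed tolerance check can be sketched the same way. Here the normal distribution is set up directly in inches rather than converted to Z first, which gives the same areas; the 3% limit is Benjamin's figure from the tolerance slide, and the variable names are illustrative.

```python
from statistics import NormalDist

shaft = NormalDist(mu=1.0, sigma=0.0025)   # part size in inches
lower_limit = 0.995
upper_limit = 1.005
max_reject_fraction = 0.03                 # Benjamin's break-even reject rate

too_small = shaft.cdf(lower_limit)         # area in the lower tail
too_large = 1 - shaft.cdf(upper_limit)     # area in the upper tail
reject_fraction = too_small + too_large

can_bid = reject_fraction <= max_reject_fraction

print(round(too_large, 4), round(reject_fraction, 4), can_bid)
```

The last digit can differ slightly from the hand calculation because printed tables round the area to four places.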

61 The Hypothesis Test Hubbert’s Hammers makes clobber balls for use in a doll recycling plant. Hardness is important in determining the longevity of the hammer balls. Herby has been getting some customer complaints about his balls not holding up and pulls a few off the assembly line for testing. He gets values of 3, 3.6, 4.2, 4.1, 2.7, 4.7, and 4.3. The balls are supposed to have a hardness of 4.5. Does Herby have a problem?

62 Herby Runs to SPSS, enters his sample data He gets an average of 3.8 and a Stdev of 0.73.

63 Interpreting the Data Everyone knows not every ball will be 4.5 hardness, but on average they need to be. Herby knows that if he ran to his assembly line and grabbed another 7 balls at random he would get a different number.

64 Herby’s World [figure: normal curve marked with μ and σ] Herby knows that 95% of the time the average of a 7-ball grab sample will be within 1.96 standard deviation units (standard deviations of the mean) of the true mean. (He’s spent too much time looking at normal distribution tables)

65 Herby Formulates a “Hypothesis Test” Herby thinks the endurance of his balls has gone down. The “null hypothesis” is that this one sad-looking sample is not enough to conclude the mean ball hardness on the assembly line has changed – If the sample falls within 1.96 standard deviation units of the target mean of 4.5, Herby has no grounds at the 95% level to say his assembly line is out of tolerance – If not, Herby will reject the “null hypothesis” and conclude that his assembly line is screwed. Oh gosh – get out the crosses and garlic – we’re starting to sound like statisticians.

66 The “Alpha Level” In reality Herby could grab 7 balls on a perfectly normal assembly line and get any value – Yet Herby is going to declare a disaster if he does not come out within 1.96 standard deviation units of his target value Because in the real world a sample could come from anywhere, one of the decisions we have to make is how willing are we to be wrong. – This is called setting our Alpha Level – How great is the chance that we will reject the null hypothesis when we shouldn’t have

67 OK – Let’s Get on With Herby’s Test Plug into the equation $Z = \frac{\bar{X} - \mu}{\sigma}$. Holy Marshmallows! What do we use for standard deviation? – Our standard deviation was the standard deviation for individual samples – not averages

68 What’s the Big Deal About Individual Samples and Averages? In a large general ed class, what kind of range do you get on people’s test scores? Ever noticed that a certain professor’s average test scores tend to come out about the same value year after year? Point – In a random sample of more than one value, the standard deviation of an average will always be less than the standard deviation of individual values.

69 OK – I Believe – Now Get Me the Doggone Standard Deviation For a random sample the standard deviation of the mean is $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$, where n = number of samples used in the mean. If you think I’m going to try showing you the proof you’re out of your mind.

70 OK – Let’s Roll Our standard deviation of the mean is $\frac{0.73}{\sqrt{7}} \approx 0.276$. Plug into the magic equation: $Z = \frac{3.8 - 4.5}{0.276} \approx -2.54$, which is outside ±1.96. Oh Crud – The Assembly Line is Turning Out Weak Balls!

71 What if We Had Set a Lower Alpha Level? Plug and chug for a 1% alpha level (critical value ±2.575): a Z of about -2.54 falls just inside, so now we look OK. Note from the formula for the standard deviation of the mean that larger samples shrink it – If there really is a problem with Herby’s balls – how big a sample will it take to see the problem?

72 Figuring Out a Required Sample Size Herby’s assembly line is supposed to turn out balls of 4.5 hardness – How far out of spec can Herby tolerate things? Suppose Herby decides he needs his estimates to be good to within 0.5 hardness units. Next Herby has to decide how much of a chance he is willing to take that he will shut down the line and issue recalls when nothing is really wrong at all. – Suppose Herby wants 99% confidence (i.e., alpha level is 1%). 99% of a normal distribution is within +/- 2.575 standard deviation units of the true mean
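
Herby's test at both alpha levels can be run end to end from his seven readings. This is a sketch that follows the slides in using the normal (Z) distribution; with only 7 samples and an estimated standard deviation, a t table would usually be preferred, and that choice is left aside here. The function and variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

hardness = [3, 3.6, 4.2, 4.1, 2.7, 4.7, 4.3]
target = 4.5

x_bar = mean(hardness)                    # sample average
s = stdev(hardness)                       # stdev of individual balls
std_of_mean = s / sqrt(len(hardness))     # stdev of an average of 7
z = (x_bar - target) / std_of_mean

def critical_z(alpha):
    """Two-tailed cut-off on the standard normal for a given alpha level."""
    return NormalDist().inv_cdf(1 - alpha / 2)

reject_at_5_percent = abs(z) > critical_z(0.05)   # cut-off about 1.96
reject_at_1_percent = abs(z) > critical_z(0.01)   # cut-off about 2.576

print(round(x_bar, 2), round(s, 2), round(z, 2))
print(reject_at_5_percent, reject_at_1_percent)
```

The same sample rejects the null hypothesis at the 5% alpha level but not at the 1% level, which is why the alpha level has to be chosen before looking at the answer.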

73 Herby’s Task Herby needs to detect a 0.5 hardness unit departure from the 4.5 target hardness but still have less than a 1% chance of shutting the line down by mistake. Formula is $L = Z\frac{\sigma}{\sqrt{n}}$. Note that this is just the plus or minus part of our confidence interval formula, where L is the minimum error that must be detected and Z is the Z value for our alpha level

74 Doing the Math First solve for the sample size needed: $n = \left(\frac{Z\sigma}{L}\right)^2$. Then plug into the equation and solve: $n = \left(\frac{2.575 \times 0.73}{0.5}\right)^2 = 14.13$. As a practical matter this means we need a sample of 15 to actually achieve the desired accuracy with an acceptable risk. Note – this also implies that higher confidence requires more money spent on sampling and testing.

75 Herby’s Assembly Line Analysis to Date Herby has grabbed a sample of 7 balls off the assembly line. With this sample Herby is 95% sure he has a problem with the hardness of the balls being produced. When Herby allowed only a 1% chance that he was going to shut the line down for no reason at all, his sample could not furnish him enough certainty. To detect a 0.5 unit departure from the target hardness of 4.5 with no more than a 1% chance of stopping the line for a quirk of sampling, Herby must take a grab of 15 balls off the assembly line

76 Comparing Two Samples Red Rooster Carburetor company would like to claim that their carburetors improve fuel economy by 20% when their replacement carburetors are used. Red Rooster assembles teams of drivers to drive two sets of cars – one that has been retrofit with Red Rooster carburetors and one that uses the manufacturer’s original carburetors
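
The sample size arithmetic is short enough to check directly. A minimal sketch with Herby's numbers; the variable names are illustrative.

```python
from math import ceil

z_99 = 2.575      # Z value for a 1% alpha level (two-tailed)
sigma = 0.73      # stdev of individual balls, from the sample of 7
L = 0.5           # smallest departure from target that must be detected

# Solve L = Z * sigma / sqrt(n) for n.
n_exact = (z_99 * sigma / L) ** 2

# You can't test a fraction of a ball, so round up.
n_needed = ceil(n_exact)

print(round(n_exact, 2), n_needed)
```

Because n grows with the square of Z and shrinks with the square of L, halving the error to be detected roughly quadruples the number of balls that have to be tested.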

77 Data Begins Coming In The standard vehicles came in with an average of 21.4 mpg and stdev of 6.1 from 60 car and driver combinations The Rooster Carburetor Vehicles came in with 29.5 mpg and stdev of 6.2 from 41 car and driver combinations

78 Setting Up A Test If the average gas mileage for the no-Rooster set is improved 20%, its adjusted mean is 25.68. The null hypothesis is that the mean gas mileage of the two sets of cars is the same (after the 20% adjustment) – Set the test up to reject and conclude the Rooster carburetor set is more than 20% better if the test statistic is extreme enough

79 The Test Statistic We will let Y1 be our Rooster carburetor vehicles. We will let Y2 be our standard vehicles with the 20% improvement added. If Y1 is bigger than Y2 it will cause Z to become increasingly large. If Z is so far out in the upper tail that there is little chance it could be a random event, we will reject the null hypothesis and conclude that the Red Rooster carburetors do improve fuel economy by more than 20%

80 A Note on Our Test Statistic The denominator is what we call a pooled estimate of variance. Strictly speaking the test is assuming that the two populations have the same variance. If the variances are close it is accepted practice to let the lie pass as close enough. How much different can the variances be and still be about the same? Actually a bit of a judgment call, but I’m not worried about 6.1 and 6.2

81 Plug and Chug Z = 3.06. Go to the table to look up how much of the normal distribution is beyond 3.06 standard deviation units

82 Do A Table Look Up Area under the curve is 0.99889, so 0.00111 (i.e., 0.111% of the distribution) is further out. There is about a 1/10th of 1% chance that the observed result is a fluke. Action – Reject the null hypothesis and conclude that the Red Rooster carburetor does improve fuel economy by more than 20%
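
The two-sample comparison can be sketched as below. The slide's formula did not survive in this transcript, so the denominator used here, the square root of s1²/n1 + s2²/n2, is an assumption; it is chosen because it reproduces the Z of 3.06 quoted on the slides (a pooled-variance form gives roughly 3.07 with these numbers). Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Summary data from the slides
rooster_mean, rooster_sd, rooster_n = 29.5, 6.2, 41
standard_mean, standard_sd, standard_n = 21.4, 6.1, 60

# Give the standard vehicles the claimed 20% improvement.
adjusted_standard_mean = 1.20 * standard_mean

# ASSUMED form of the denominator: stdev of the difference of two averages.
std_of_difference = sqrt(rooster_sd ** 2 / rooster_n
                         + standard_sd ** 2 / standard_n)

z = (rooster_mean - adjusted_standard_mean) / std_of_difference

# Upper-tail area: chance of a Z this large if the null hypothesis holds.
chance_of_fluke = 1 - NormalDist().cdf(z)

print(round(adjusted_standard_mean, 2), round(z, 2), round(chance_of_fluke, 5))
```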

83 Paired Experiments What if Red Rooster Carburetors is a group of students who designed their carburetor in the machine shop at school – The idea that they can go out and build 41 carburetors and send 101 cars and drivers out to burn up a bunch of gas is kind of “iffy” One Way to Get Sample Size Down is to get rid of some of that random variance – What if we used the same car and driver with and without the Red Rooster Carburetor? We just took out two sources of scatter in the data – This is called a Paired Experiment

84 Paired Experiments Needs to be a solid basis for pairing – Can make the numbers crunch pairing up anything Experiment – I want to show that students from Illinois are smarter than students from Missouri. I give a test to 40 SIU seniors that are Illinois residents. I then give the same test to 40 Kindergarteners from Missouri. I match the students up in the order in which tests were turned in and do my test. – If my test statistic shows that my Illinois students scored higher are you willing to believe that Illinois students are smarter than Missouri students?

85 OK, that last one raises some concerns about the intelligence of whoever designed that experiment. The basis for pairing should be that we are pairing like items to eliminate variation from whatever we are trying to “write out” of the experiment by pairing. Suppose we make one Red Rooster carburetor to go on a Dodge Neon and I have 10 students drive the vehicle over the same road course before adding the carburetor. I then add the carburetor and have the same 10 students drive the same car over the same course. I will then pair the results before and after adding the carburetor

86 Looking at My Results (mpg)
Driver             Standard Dodge Neon   Neon with RR Carb
Don Dork           26.5                  32.1
Kurt Kurtosis      25.7                  30.1
Angela Airhead     25.2                  31.8
Mark Maniac        23.9                  29.8
Katty Careful      28.1                  34.2
Jim Junkyard       26.2                  30.6
Steve Stickshift   25.9                  31.2
Burt Bunion        27.1                  33.2
Saedy Sadist       26.7                  32.8
Melvin Mizer       28.2                  34.5

87 The test requires us to get the differences within our pairing Don Dork Result – 32.1- 26.5 = 5.6 Kurt Kurtosis - 30.1 – 25.7 = 4.4 And so on through the pairing.

88 Tuning in a Little More Red Rooster actually wants to claim a 20% increase in gas mileage so we may be able to normalize out some more variance by directly measuring % improvement. – Results 21.13%, 17.12%, 26.19%, 24.68%, 21.71%, 16.79%, 20.48%, 22.51%, 22.85%, 22.34% We also are interested in how much these values differ from 20% improvement so we can subtract 20% from each value – 1.13%, -2.88%, 6.19%, 4.68%, 1.71%, -3.21%, 0.48%, 2.51%, 2.85%, 2.34% Plug the Data into SPSS to get Mean and Standard Deviation – Could also use Excel and function =average(data range) and =stdev(data range) for standard deviation

89 The Hypothesis $H_0$ = there is no difference between our set of numbers and 0 – Specifically, this means we cannot be sure we have over 20% improvement. Rejecting the null hypothesis means we are confident we have over 20% improvement

90 The Test Statistic for a Paired Experiment $t = \frac{\bar{D}}{S_d/\sqrt{n}}$ $\bar{D}$ is the average difference (in this case 1.58%). $S_d$ is the standard deviation of the individual differences as calculated (in this case 2.95%). $n$ is of course the number of samples (in this case 10). Crunching the numbers we get $t = 1.69$

91 Looking Up Our Result We have n-1 degrees of freedom (in this case 9). In the t table, 1.69 is between the 90% and 95% significance values. We cannot reject the null hypothesis at the 95% level, but we can at about 93% confidence.

92 Limitations of Our Results 93% confidence that we have over 20% improvement may fall short of the proof some people would demand – One way to strengthen the conclusion is more samples (the standard deviation of the mean shrinks with more samples, and since it is in the denominator that makes t bigger). We may also be concerned that all our tests were on a Dodge Neon, which furnishes no data on whether the improvement would hold on other cars as well
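
The whole paired experiment can be recomputed from the raw mileage figures. This sketch works out each driver's percent improvement, subtracts the claimed 20%, and forms the paired t statistic. The two critical values are the standard one-tailed t table entries for 9 degrees of freedom; variable names are illustrative, and recomputing from the raw data can differ from the slide values in the last digit.

```python
from math import sqrt
from statistics import mean, stdev

# Same ten drivers, same car, same course, before and after (mpg)
standard_mpg = [26.5, 25.7, 25.2, 23.9, 28.1, 26.2, 25.9, 27.1, 26.7, 28.2]
rooster_mpg = [32.1, 30.1, 31.8, 29.8, 34.2, 30.6, 31.2, 33.2, 32.8, 34.5]

# Percent improvement for each driver, minus the claimed 20%
excess_pct = [100 * (after - before) / before - 20
              for before, after in zip(standard_mpg, rooster_mpg)]

n = len(excess_pct)
d_bar = mean(excess_pct)          # average difference
s_d = stdev(excess_pct)           # stdev of the individual differences
t = d_bar / (s_d / sqrt(n))       # paired t statistic, n - 1 = 9 d.o.f.

T_CRIT_90 = 1.383   # one-tailed, 9 degrees of freedom, from a t table
T_CRIT_95 = 1.833

print(round(d_bar, 2), round(s_d, 2), round(t, 2))
print("reject at 90%:", t > T_CRIT_90, " reject at 95%:", t > T_CRIT_95)
```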

