Welcome to Week 08 College Statistics http://media.dcnews.ro/image/201109/w670/statistics.jpg
http://www.andreabalt.com/wp-content/uploads/2014/02/leonardo-da-vinci-06.jpg http://www.howtodrawjourney.com/images/da-vinci-cats.jpg
Now, for something even more profound…
TRUTH
Truth Question on a true/false test: US presidents are named “Barack” T ___ F ___
Truth Question on a true/false test: US presidents are named “George” T ___ F ___
Truth Question on a true/false test: US presidents are male T ___ F ___
Truth Question on a true/false test: US presidents are at least 35 years old T ___ F ___
Truth So, it is easier to be “false” than to be “true”. To be “true”, a statement must be true in all cases. If not, it is “false”.
The Bad News… We live in a world where the “truth” is not always known http://www.testically.org/wp-content/uploads/2010/11/hmm.jpg
In a court of law, you never REALLY know if someone is guilty or not www.torontoinjurylawyerblog.com
Guilt Even with an eyewitness, the witness could be: mistaken, or lying
Guilt There is always a level of UNCERTAINTY
Two standards of proof of guilt in US courts of law: Criminal cases – “beyond a reasonable doubt” Civil cases – “a preponderance of the evidence”
Guilt In probability terms, legal authorities estimate: “beyond a reasonable doubt” is 98-99% likelihood of guilt based on the evidence (Thanks to Ronald B. Standler)
Guilt In probability terms, legal authorities estimate: “a preponderance of the evidence” is just a hair over 50% likelihood of guilt (Thanks to Ronald B. Standler)
Guilt There is always the possibility of being WRONG http://www.gettyimages.com/detail/photo/businessman-crying-close-up-high-res-stock-photography/AB34183
Guilt In the US, a suspect is considered “innocent until proven guilty in a court of law” www.coloradospringsdivorceattorneyblog.com
Guilt BUT… if a suspect is not found guilty in court, are they called: - innocent ? - not guilty ? www.legaljuice.com
Guilt The suspect is called “not guilty” rather than “innocent” because the defense hasn’t proved their innocence… it is just that the prosecution was unable to prove their guilt!
Questions? http://i.imgur.com/aliTlT3.jpg
Hypothesis Tests In science, an “educated guess” is called a: hypothesis
Hypothesis Tests In science, using experimental evidence to see if it supports a hypothesis is called a: hypothesis test
Hypothesis Tests Vikings in Newfoundland? http://i.cbc.ca/1.3517691.1459555583!/fileImage/httpImage/image.JPG_gen/derivatives/original_620/digging-at-point-rosee.JPG
Hypothesis Tests Our hypothesis was that we wouldn’t find anything…
Hypothesis Tests We rejected that hypothesis! http://www.cbc.ca/news/canada/newfoundland-labrador/vikings-newfoundland-1.3515747
Hypothesis Tests In practice, we often do hypothesis tests “undercover” as CONFIDENCE INTERVALS
Hypothesis Tests Suppose we had a 95% confidence interval: 5 ≤ µ ≤ 10 Suppose our hypothesis was that µ = 7 Is 7 a likely value for µ given our confidence interval?
Hypothesis Tests Because µ = 7 is in our confidence interval: 5 ≤ µ ≤ 10 It is a possible value given our data… but so are µ = 6, µ = 8, µ = 9.3, µ = 5.1, µ = 6.79431…
Hypothesis Tests What if the hypothesized value for µ was 11? 5 ≤ µ ≤ 10 We are 95% confident that µ cannot be 11 given our evidence
Hypothesis Tests We reject the hypothesis that µ = 11 if 5 ≤ µ ≤ 10 with 95% confidence. We will be wrong to do this 5% of the time (100% - 95%). The amount of time we are willing to be wrong is called our “α-level”
Hypothesis Tests The confidence interval can be used to test hypothesized values of µ using the mean, standard deviation and sample size of our sample data
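Here is a minimal sketch of that idea in Python. It assumes a normal critical value, and the sample numbers (mean 7.5, s = 6.378, n = 25) are made up, picked only so the interval comes out close to the 5 ≤ µ ≤ 10 used on the earlier slides. The function names are hypothetical, not part of the course materials.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(mean, s, n, confidence=0.95):
    """Return (low, high) for the population mean, using a normal critical value."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95% confidence
    margin = z * s / sqrt(n)                 # critical value times standard error
    return mean - margin, mean + margin


def reject(hypothesized_mu, interval):
    """Reject the hypothesized value only if it falls outside the interval."""
    low, high = interval
    return not (low <= hypothesized_mu <= high)


# Assumed sample values, chosen so the interval is roughly 5 <= mu <= 10
ci = confidence_interval(mean=7.5, s=6.378, n=25)
print(ci)
print(reject(7, ci))   # False: 7 is inside the interval, so we fail to reject
print(reject(11, ci))  # True: 11 is outside the interval, so we reject
```

A more variable sample (larger s) or a smaller n widens the interval, which makes it harder to reject any hypothesized value.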
Hypothesis Tests Whether we can reject a hypothesis or not depends on how variable our data are! Not too different… Very different!
Hypothesis Tests (see why variability is important?)
Questions? http://i.imgur.com/aliTlT3.jpg
Hypothesis Tests Rejecting a hypothesis is a strong statement We have evidence to show µ ≠ 11
Hypothesis Tests If the value is included in the confidence interval, you cannot make a strong statement We haven’t proved µ = 7 (because it could be a wide range of numbers within the interval)
Hypothesis Tests So, we merely “fail to reject” the hypothesis
Hypothesis Tests Our exercise on human temperature last week was a test of the hypothesis that normal human temperature is 98.6°
Hypothesis Tests 98.6°
Hypothesis Tests PROJECT QUESTION You have a hypothesis that normal human body temperature is 98.6° You have experimentally found that measured using an IR thermometer, the inside mouth temperature is between 89.9° and 93.6° with 95% confidence
Hypothesis Tests PROJECT QUESTION 89.9° < temp < 93.6° What do you decide about your hypothesis that human body temperature is 98.6°?
Hypothesis Tests PROJECT QUESTION 89.9° < temp < 93.6° Reject your hypothesis that human body temperature is 98.6° What is the probability that you are wrong to reject this hypothesis? 5%
Questions? http://i.imgur.com/aliTlT3.jpg
Hypothesis Tests You need to answer an "Is there a difference" question Is there any difference between these two populations? Does some new process improve results?
Hypothesis Tests There is a TRUE (population) answer to your question
Hypothesis Tests You will NEVER find the true answer to most questions because of variability: in your measurements, in the data itself, in the measuring tool, in the samples you get from your population
Hypothesis Tests Are the statistics demons mad at you today?
Hypothesis Tests Reality of life: things aren't clear, certain and constant They are fuzzy, uncertain and variable
Hypothesis Tests This is the basis of statistics - getting a measurement of the fuzziness - "variability"
Hypothesis Tests A hypothesis is a statement about the properties of the population
Hypothesis Tests It may be obtained from theory, hearsay, historical studies, etc.
Hypothesis Tests A null hypothesis states "there is no difference between populations" or "a process has no effect"
Hypothesis Tests It is symbolized: H0
Hypothesis Tests Because it is easier to prove something false than to prove it true… H0 is the hypothesis we want to reject
Hypothesis Tests We want to show the populations are different or the process has an effect - called the alternate hypothesis or Ha
Hypothesis Tests Usually we set Ha before H0, since it is the one we are interested in
Hypothesis Tests Null hypotheses about population means are typically like: μ = some value
Hypothesis Tests Alternative hypotheses about means can be: μ ≠ some value (called a two-tailed test), or μ < some value or μ > some value (called one-tailed tests)
Hypothesis Tests A two-tailed test will reject H0 either if the experimental values we get are too high or too low
Hypothesis Tests α is split between the upper and lower tails
Hypothesis Tests A one-tailed test will reject H0 only on the side we think is likely to be true
Hypothesis Tests You will be able to reject H0 more often for a one-tailed test – if you pick the right tail!
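A short sketch of why the one-tailed test rejects more often. It assumes the sampling distribution is standard normal (a z-test), which these slides do not state explicitly. Splitting α between two tails pushes the cutoff farther out than putting all of α in one tail.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # mean 0, standard deviation 1

# Two-tailed: alpha is split, so each tail gets alpha / 2
two_tailed = std_normal.inv_cdf(1 - alpha / 2)

# One-tailed: all of alpha sits in a single tail
one_tailed = std_normal.inv_cdf(1 - alpha)

print(round(two_tailed, 2), round(one_tailed, 2))  # 1.96 1.64
```

The smaller one-tailed critical value means a result does not have to be as extreme to reject H0, provided it lands in the tail you picked.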
Hypothesis Tests PROJECT QUESTION Your owner's manual says you should be getting 30 mpg highway After owning the car for six months, you are only getting 27 mpg highway
Hypothesis Tests PROJECT QUESTION Is that different enough to reject the company's claim? What is your α-level? 5% or 0.05 What is H0? μ = 30 mpg What is Ha? μ < 30 mpg
Hypothesis Tests PROJECT QUESTION We could also write it as: H0: μ ≥ 30 mpg Ha: μ < 30 mpg
Hypothesis Tests PROJECT QUESTION Is this a one-tailed or a two-tailed test? one-tailed Is it right-tailed or left-tailed? left-tailed
Questions? http://i.imgur.com/aliTlT3.jpg
Hypothesis Tests The experiment is designed to gather valid information to test the likelihood of that null hypothesis being true
Hypothesis Tests So, since we want to show the null hypothesis is NOT true, we want to show that getting the results we got (if the null hypothesis IS true) is very unlikely
Hypothesis Tests If we get those “unlikely” data
Hypothesis Tests Then we reject the null hypothesis and have statistically proved our alternative hypothesis and
Hypothesis Tests CELEBRATE!
Hypothesis Tests But any experiment runs the risk of weird results The objective of hypothesis testing is to estimate the likelihood of weird results
Hypothesis Tests One type of error consists of rejecting a true hypothesis We call this a “Type 1 error”
Hypothesis Tests If this happens, people will accuse us of rigging our data to prove Ha So, we want this to happen very rarely
Hypothesis Tests The probability of a Type 1 error is called the α-level
Hypothesis Tests Typically we use α = 0.05 (5%) or 0.01 (1%)
Hypothesis Tests It is crucial to set your α-level before you do the experiment or gather any data. Otherwise people will accuse you of setting the level to ensure rejecting H0
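To see what α means in practice, here is a small simulation sketch. Everything in it is an assumption for illustration: a normal population whose true mean really is 30 (so H0 is true), a known standard deviation of 4, samples of 25, and a two-tailed test with a critical value of 1.96. It counts how often we reject H0 anyway.

```python
import random
from math import sqrt

random.seed(8)  # fixed seed so the run is repeatable

MU0, SIGMA, N = 30.0, 4.0, 25  # assumed population and sample size
Z_CRIT = 1.96                  # two-tailed critical value for alpha = 0.05
TRIALS = 20_000
SE = SIGMA / sqrt(N)           # standard error of the sample mean


def rejects_true_h0():
    """Draw one sample from a population where H0 is true; report if we reject."""
    sample = [random.gauss(MU0, SIGMA) for _ in range(N)]
    x_bar = sum(sample) / N
    return abs(x_bar - MU0) > Z_CRIT * SE


type1_rate = sum(rejects_true_h0() for _ in range(TRIALS)) / TRIALS
print(type1_rate)  # should land near 0.05
```

Every one of those rejections is a Type 1 error, since H0 was true in every trial.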
Hypothesis Tests You can make the opposite mistake: fail to reject H0 when it is false Called a Type 2 error The probability of this kind of error is denoted by β (beta)
Hypothesis Tests We HATE Type 2 errors because they mean we FAILED to prove what we wanted to prove! (Remember, we want to reject H0)
Hypothesis Tests Usually β is computed after the experiment (not determined in advance by the experimenter)
Hypothesis Tests Generally, the larger α value that you permit, the smaller β value you will end up with Conversely, if you demand a smaller α, you will usually get a larger β
Hypothesis Tests Other factors affecting β: sample size, and the size of the true difference (it’s harder to detect a difference if it’s really really tiny)
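A rough sketch of how β moves with α, sample size and the size of the true difference. It assumes a left-tailed z-test and invented numbers (H0 says 30, s = 4); the helper name `beta_left_tailed` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def beta_left_tailed(mu0, true_mu, s, n, alpha):
    """Chance of failing to reject H0 when the true mean is really true_mu."""
    se = s / sqrt(n)
    cutoff = mu0 - NormalDist().inv_cdf(1 - alpha) * se  # reject H0 below this
    return 1 - NormalDist(true_mu, se).cdf(cutoff)


# Larger alpha gives smaller beta (true mean assumed to be 29, n = 25)
for alpha in (0.01, 0.05, 0.10):
    print("alpha", alpha, "beta", round(beta_left_tailed(30, 29, 4, 25, alpha), 3))

# Larger samples give smaller beta
for n in (25, 100, 400):
    print("n", n, "beta", round(beta_left_tailed(30, 29, 4, n, 0.05), 3))

# Tiny true differences are hard to detect: beta stays large
for true_mu in (29.9, 29.0, 27.0):
    print("true mean", true_mu, "beta", round(beta_left_tailed(30, true_mu, 4, 25, 0.05), 3))
```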
Hypothesis Tests Likelihood of making the right decision and rejecting the (false) null hypothesis is: 1 - β called the “power of the test”
Hypothesis Tests For a given α value, we would like the test to be as "powerful" as possible, giving us the best chance of rejecting a false null hypothesis
Hypothesis Tests PROJECT QUESTION Which is more powerful, a one-tailed or a two-tailed test? one-tailed (if you guess the right side)
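The answer can be checked with a quick power calculation. This is only a sketch under assumptions: a z-test, H0 mean of 30, se of 4, and a true mean of 27 (an invented value for the truth). The `power` function is a hypothetical helper.

```python
from statistics import NormalDist


def power(mu0, true_mu, se, alpha, tails):
    """Probability of rejecting H0 when the true mean is true_mu (power = 1 - beta)."""
    sampling = NormalDist(true_mu, se)  # where sample means really come from
    std = NormalDist()
    if tails == "left":
        return sampling.cdf(mu0 - std.inv_cdf(1 - alpha) * se)
    if tails == "right":
        return 1 - sampling.cdf(mu0 + std.inv_cdf(1 - alpha) * se)
    # Two-tailed: alpha is split, and we can reject in either tail
    z = std.inv_cdf(1 - alpha / 2)
    return sampling.cdf(mu0 - z * se) + (1 - sampling.cdf(mu0 + z * se))


left = power(30, 27, 4, 0.05, "left")    # the correct guess: truth is below 30
two = power(30, 27, 4, 0.05, "two")
right = power(30, 27, 4, 0.05, "right")  # the wrong guess
print(round(left, 3), round(two, 3), round(right, 3))
```

With the truth below 30, the left-tailed test has the most power, the two-tailed test is in the middle, and the right-tailed test (the wrong guess) has almost none.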
Hypothesis Tests This setup allows us only to disprove a null hypothesis, never prove it
Hypothesis Tests We either disprove it, or we fail to disprove it
Hypothesis Tests We NEVER accept the null hypothesis
Hypothesis Tests "Fail to reject" the null hypothesis is the default decision
Hypothesis Tests This results not from evidence in favor of the null hypothesis but from the absence of evidence against it
Hypothesis Tests Rejecting the null hypothesis is a strong conclusion, stating that (with no more than an α chance of error) the null hypothesis is wrong
Hypothesis Tests The confidence interval for the hypothesis test will be kinda the opposite of what we did before Now we will create a confidence interval for 𝒙 based on our hypothesized value for μ and see if our 𝒙 falls in it
Hypothesis Tests How to do it!
1. Set your α-level (how often you are willing to be wrong)
2. Define your Ha and H0
3. Get your data (for a confidence interval, you need the hypothesized μ, s and n (or se))
4. Find your critical value (for two-sided α=5% it is ≈2)
5. Calculate the confidence interval for 𝒙, using μ rather than 𝒙
6. The test will be: Is 𝒙 in it?
Hypothesis Tests PROJECT QUESTION Back to our mpg! H0: μ ≥ 30 mpg Ha: μ < 30 mpg x = 27 And suppose we know that: se = 4 mpg
Hypothesis Tests PROJECT QUESTION H0: μ ≥ 30 mpg Ha: μ < 30 mpg x = 27 se = 4 mpg Are we going to reject H0 for values of x greater than 30 or less than 30?
Hypothesis Tests PROJECT QUESTION H0: μ ≥ 30 mpg Ha: μ < 30 mpg x = 27 se = 4 mpg If the critical value for a one-sided confidence interval test at the 5% level is 1.64, create a test of our hypothesis
Hypothesis Tests PROJECT QUESTION H0: μ ≥ 30 mpg Ha: μ < 30 mpg x = 27 se = 4 mpg Reject H0 if x < 30 - (1.64)(4) = 23.44 What is our conclusion? Since 27 is not less than 23.44: fail to reject H0
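The mpg test can be written out in a few lines of Python. The numbers are the ones from the slides (hypothesized mean 30, x = 27, se = 4, one-sided critical value 1.64); the variable names are just for illustration.

```python
H0_MU = 30.0     # mileage claimed in the owner's manual
X_BAR = 27.0     # mileage we actually observed
SE = 4.0         # standard error given on the slide
CRITICAL = 1.64  # one-sided critical value at the 5% level

# Left-tailed test: reject H0 only if the observed mean falls below the cutoff
cutoff = H0_MU - CRITICAL * SE
reject_h0 = X_BAR < cutoff

print(round(cutoff, 2))  # 23.44
print(reject_h0)         # False: we fail to reject H0
```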
Questions? http://i.imgur.com/aliTlT3.jpg
Hypothesis Tests If you reject H0 with an α-level of 0.05, we also say our x value is “significant at the .05 level” or we say we found a “significant difference”
Hypothesis Tests We can make our x more likely to be significant by (as usual): TAKING A LARGER SAMPLE SIZE
Hypothesis Tests Because we can “cheat the system” by taking a huge sample size that will find any teeny, tiny difference to be significant, we have a backup plan
Hypothesis Tests We also set levels of “practical significance” - what numerical difference would convincingly show a significant difference
Hypothesis Tests These levels of practical significance come from our knowledge of the variables we are measuring
Hypothesis Tests If we had taken a sample of 10,000,000 to calculate our mpg average and se, we could easily have had an se of 0.1 mpg, and then a difference of only a fraction of 1 mpg would test as significant. Probably we wouldn’t really think that was an important difference in mileage
Hypothesis Tests A practically significant difference would be the amount in mpg that you would think is different enough from 30 mpg to be important
Hypothesis Tests We set a level of practical significance at the same time we set the α-level
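One way to sketch the two checks side by side. The se of 0.1 mpg comes from the slide above; the observed mean of 29.8 mpg and the practical-significance level of 1 mpg are assumptions chosen only for illustration, and `decide` is a hypothetical helper.

```python
def decide(x_bar, mu0, se, critical=1.64, practical_diff=1.0):
    """Return (statistically significant?, practically significant?) for a left-tailed test."""
    statistically = x_bar < mu0 - critical * se      # the usual hypothesis test
    practically = (mu0 - x_bar) >= practical_diff    # is the gap big enough to matter?
    return statistically, practically


# Huge sample: tiny se, so even a 0.2 mpg shortfall is statistically significant
print(decide(x_bar=29.8, mu0=30, se=0.1))  # (True, False)

# The original example: a 3 mpg shortfall matters, but se = 4 is too noisy to reject H0
print(decide(x_bar=27, mu0=30, se=4))      # (False, True)
```

Setting `practical_diff` in advance, along with α, keeps a huge sample from turning a trivial difference into a headline result.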
Hypothesis Tests PROJECT QUESTION What would be your level of practically significant difference for mpg?
Questions? http://i.imgur.com/aliTlT3.jpg