1
Introduction to Hypothesis Testing Raymond J. Carroll Department of Statistics Faculty of Nutrition Texas A&M University http://stat.tamu.edu/~carroll
2
Outline Series of Examples Data Collection for Examples
3
Example #1 My Hypothesis: Texas A&M students simply guess when they are asked whether they are drinking diet Pepsi or diet Coke. The Experiment: Blind taste test. You are asked which of the cups you drink contains diet Coke. Our Goal: Test this hypothesis, using statistical principles and probability statements.
4
A Warning Yes or No: No statistician will ever answer a question “yes” or “no”. Probabilities: We always say things like “if your hypothesis were correct, the chance of seeing data like these would be less than 5%”.
5
Example #1 Data Model: The data model is –Normal? –Gamma? –Binomial? –Poisson?
6
Example #1 Data Model: The data model is –Normal? –Gamma? –Binomial? Because each outcome is yes or no, success or failure –Poisson?
7
Example #1 My Hypothesis in terms of population parameters: I have claimed that you can do no better than guess. Each of you is a Binomial(1, p). When I say you are guessing, what am I saying about the population?
8
Example #1 My Hypothesis in terms of population parameters: I have claimed that you can do no better than guess. Each of you is a Binomial(1, p). When I say you are guessing, what am I saying about the population? That the proportion of successes is p = ½
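The hypothesis p = ½ can be checked against taste-test results with an exact binomial calculation. The sketch below is one possible way to do it, not necessarily the method used in the lecture; the function name and the class counts (30 tasters, 21 correct) are made up for illustration.

```python
from math import comb

def binomial_two_sided_p(successes: int, n: int, p0: float = 0.5) -> float:
    """Exact two-sided p-value under Binomial(n, p0): the total probability
    of every outcome that is no more likely than the observed count."""
    pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = pmf[successes]
    # The small tolerance keeps ties from being lost to floating-point rounding.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical class result: 30 tasters, 21 correctly identify the diet Coke.
p_value = binomial_two_sided_p(21, 30)
print(f"p-value under pure guessing: {p_value:.4f}")
```

A small p-value says that a result this lopsided would be unusual if everyone were guessing; it does not by itself say how well the class can actually tell the drinks apart.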
9
Example #2 My Hypotheses: Keebler used to advertise 17 chocolate chips per cookie. More chocolate chips than another brand. The Experiment: Get a cookie of each type, count the number of chips, criticize the experiment. Our Goal: Test these hypotheses, using statistical principles and probability statements.
10
Example #2 Data Model: The data model is –Normal? –Gamma? –Binomial? –Poisson?
11
Example #2 Data Model: The data model is –Normal? –Gamma? –Binomial? –Poisson? It could be Poisson or normal. Poisson is the better choice, because it is a count. We’ll use the central limit theorem to make inferences.
12
Example #2 My Hypothesis in terms of population parameters: Keebler has claimed that it gives you 17 chips per cookie, on average. Each cookie's chip count is Poisson with an unknown population mean. When I say Keebler is correct, what am I saying about the population?
13
Example #2 My Hypothesis in terms of population parameters: Keebler has claimed that it gives you 17 chips per cookie, on average. Each cookie's chip count is Poisson with an unknown population mean. When I say Keebler is correct, what am I saying about the population? That the population mean number of chips is 17
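A rough sketch of how the claim of 17 chips could be examined: build a central-limit-theorem interval for the population mean and see whether 17 falls inside it. The chip counts and the helper name `mean_confidence_interval` are invented for illustration, and with only a dozen cookies the normal approximation is crude.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(data, confidence=0.95):
    """Approximate (CLT-based) confidence interval for a population mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    half_width = z * stdev(data) / sqrt(len(data))
    centre = mean(data)
    return centre - half_width, centre + half_width

# Made-up chip counts from 12 cookies (not real data).
chip_counts = [14, 19, 16, 12, 18, 15, 13, 17, 16, 11, 15, 14]

low, high = mean_confidence_interval(chip_counts)
claim_is_plausible = low <= 17 <= high
print(f"95% interval for the mean: ({low:.2f}, {high:.2f})")
print("17 chips per cookie is plausible:", claim_is_plausible)
```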
14
Example #3 My Hypotheses: The percentage of regular M&M’s that are green is the same as the percentage of peanut M&M’s that are green The Experiment: Compute the percentage of green M&M’s in each bag Our Goal: Test these hypotheses, using statistical principles and probability statements
15
Example #3 Data Model: The data model is –Normal? –Gamma? –Binomial? –Poisson?
16
Example #3 Data Model: The data model is –Normal? –Gamma? –Binomial? –Poisson? Roughly normal, since each data point is a percentage. We’ll use the central limit theorem to make inferences.
17
Example #3 My Hypothesis in terms of population parameters: The %-green M&M’s does not depend on the type of M&M’s What am I saying about the two populations?
18
Example #3 My Hypothesis in terms of population parameters: The %-green M&M’s does not depend on the type of M&M’s What am I saying about the two populations? That they have the same population mean.
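One way to examine "same population mean" is an approximate interval for the difference between the two sample means: if 0 lies inside, equal means remain plausible. This is only a sketch; the per-bag percentages and the helper name `difference_interval` are made up, and the normal approximation is rough for so few bags.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def difference_interval(sample_a, sample_b, confidence=0.95):
    """Approximate (CLT-based) interval for mean(A) - mean(B),
    treating the two samples as independent."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = mean(sample_a) - mean(sample_b)
    std_error = sqrt(variance(sample_a) / len(sample_a)
                     + variance(sample_b) / len(sample_b))
    return diff - z * std_error, diff + z * std_error

# Made-up percentages of green M&M's per bag (not real data).
regular_green = [16.0, 18.5, 14.2, 20.1, 17.3, 15.8, 19.0, 16.6]
peanut_green = [15.1, 17.9, 16.4, 18.8, 14.7, 17.2, 16.0, 18.3]

low, high = difference_interval(regular_green, peanut_green)
same_mean_is_plausible = low <= 0 <= high
print(f"95% interval for the difference: ({low:.2f}, {high:.2f})")
print("Equal population means are plausible:", same_mean_is_plausible)
```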
19
Example #4 My Hypotheses: Women who keep track of their diet by diaries or PDA do not lower their caloric intake in a 6-day period The Experiment: The WISH Study at the National Cancer Institute, with 400 women The data appear to contradict my hypothesis
20
Typical (Median) Values of Reported Caloric Intake Over 6 Diary Days: WISH Study A major point of STAT211 is to prepare you to answer the question as to whether these data, which look convincing, really are convincing in terms of probability statements.
21
Example #4 Data Model: The data model is –Normal? –Gamma? –Binomial? –Poisson?
22
Example #4 Data Model: The data model is –Normal? –Gamma? –Binomial? –Poisson? Lognormal, so most people take logarithms of caloric intake and analyze them as normal
23
Example #4 Data Model: The data that we use is the difference between Day 1 and Day 6, i.e., Day 1 – Day 6
24
Example #4 My Hypothesis in terms of population parameters: What am I saying about the population, when I claim that writing down diets will not lead to a change in reported caloric intake?
25
Example #4 My Hypothesis in terms of population parameters: What am I saying about the population, when I claim that writing down diets will not lead to a change in reported caloric intake? That the population mean difference between Day 1 and Day 6 = 0
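Because each woman contributes both a Day 1 and a Day 6 value, the analysis works on the paired differences. The sketch below computes a 99% interval for the mean difference from invented numbers; these are not the WISH data, the differences are taken on the raw calorie scale rather than on logarithms, and ten subjects is far too few for the normal approximation to be trusted.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Invented reported caloric intakes for 10 women (not the WISH data).
day1 = [2150, 1980, 2400, 1760, 2050, 2230, 1890, 2310, 2120, 1940]
day6 = [1900, 1850, 2100, 1700, 1980, 1950, 1820, 2000, 2010, 1800]

# Paired design: one difference (Day 1 - Day 6) per woman.
differences = [first - last for first, last in zip(day1, day6)]

z = NormalDist().inv_cdf(0.995)  # two-sided 99% confidence
half_width = z * stdev(differences) / sqrt(len(differences))
low = mean(differences) - half_width
high = mean(differences) + half_width

print(f"99% interval for the mean difference: ({low:.1f}, {high:.1f})")
print("A mean difference of 0 is plausible:", low <= 0 <= high)
```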
26
Some Final Comments Formulating statistical hypothesis testing is really intuitive Don’t let the formulae obscure the fact that all we are doing is –Asking questions about population parameters –Constructing confidence intervals for population parameters –Using these confidence intervals to answer the question
27
The WISH Data I computed a 99% confidence interval for the population mean change in the WISH data. This interval was entirely above 0, and ranged roughly from 75 to 375. In other words, with 99% confidence, the population mean reported intake on Day 1 was between 75 and 375 calories higher than on Day 6. Is the hypothesis true?
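The decision rule implied here is simply a check of whether the hypothesized value lies inside the interval. A minimal sketch, using the approximate endpoints quoted on the slide; the function name `interval_contains` is made up for illustration.

```python
def interval_contains(value: float, low: float, high: float) -> bool:
    """True if the hypothesized value lies inside the confidence interval."""
    return low <= value <= high

# Endpoints quoted on the slide ("roughly from 75 to 375").
wish_low, wish_high = 75.0, 375.0

# The hypothesis says the population mean difference (Day 1 - Day 6) is 0.
no_change_is_plausible = interval_contains(0.0, wish_low, wish_high)
print("Mean change of 0 lies inside the 99% interval:", no_change_is_plausible)
```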