Published by Maurice Smith. Modified over 9 years ago.
1
Announcements
On Thursday, I’ll provide information about the project.
- Due on Friday after last class.
- Proposal will be due two weeks from today (April 15th).
- You’re encouraged (but not required) to work in groups of three people.
Homework:
- Due next Tuesday
- On web tonight
2
Hypothesis Testing: 20,000 Foot View
1. Set up the hypothesis to test and collect data.
   Hypothesis to test: H0
3
Hypothesis Testing: 20,000 Foot View
1. Set up the hypothesis to test and collect data.
   Hypothesis to test: H0
2. Assuming that the hypothesis is true, are the observed data likely?
   Data are deemed “unlikely” if the test statistic is in the extreme of its distribution when H0 is true.
4
Hypothesis Testing: 20,000 Foot View
1. Set up the hypothesis to test and collect data.
   Hypothesis to test: H0
2. Assuming that the hypothesis is true, are the observed data likely?
   Data are deemed “unlikely” if the test statistic is in the extreme of its distribution when H0 is true.
3. If not, then the alternative to the hypothesis must be true.
   Alternative to H0 is HA
5
Hypothesis Testing: 20,000 Foot View
1. Set up the hypothesis to test and collect data.
   Hypothesis to test: H0
2. Assuming that the hypothesis is true, are the observed data likely?
   Data are deemed “unlikely” if the test statistic is in the extreme of its distribution when H0 is true.
3. If not, then the alternative to the hypothesis must be true.
   Alternative to H0 is HA
4. P-value describes how likely the observed data are assuming H0 is true (i.e. the answer to question 2 above).
   “Unlikely” if p-value < α
6
Large Sample Test for a Proportion: Taste Test Data
33 people drink two unlabeled cups of cola (1 is Coke and 1 is Pepsi).
p̂ = proportion who correctly identify drink = 20/33 = 61%
Question: is this statistically significantly different from 50% (random guessing) at α = 10%?
7
Large Sample Test for a Proportion: Taste Test Data
H0: p = 0.5
HA: p does not equal 0.5
Test statistic: z = |(p̂ - 0.5)/sqrt(p̂(1 - p̂)/n)| = |(0.606 - 0.5)/sqrt(0.606*0.394/33)| = 1.25 (using the unrounded p̂ = 20/33)
Reject if z > z_{0.10/2} = 1.645
It’s not, so there’s not enough evidence to reject H0.
8
Large Sample Test for a Proportion: Taste Test Data
P-value
Pr( |(P - p0)/sqrt(PQ/n)| > |(p̂ - p0)/sqrt(p̂q̂/n)| when H0 is true ), where P is the sample proportion in a repeated experiment, Q = 1 - P, q̂ = 1 - p̂ and p0 = 0.5
= Pr( |(P - 0.5)/sqrt(PQ/n)| > |1.25| when H0 is true )
= 2*Pr( Z > 1.25 ) where Z~N(0,1)
= 21%
i.e. “How likely is a test statistic at least as extreme as 1.25 when true p = 50%?”
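The taste test arithmetic can be checked with a few lines of Python. This is only a sketch using the standard library; the function name `one_proportion_z_test` is made up for illustration, and it uses the unrounded proportion 20/33 rather than 0.61.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided large-sample z test for one proportion.

    Uses the sample-based standard error sqrt(p_hat * (1 - p_hat) / n),
    as on the taste test slides. Returns (|z|, p_value).
    """
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    z = abs(p_hat - p0) / se
    # Two-sided p-value: 2 * Pr(Z > |z|) for Z ~ N(0, 1)
    p_value = 2 * (1 - NormalDist().cdf(z))
    return z, p_value


z, p_value = one_proportion_z_test(20, 33, 0.5)
critical = NormalDist().inv_cdf(1 - 0.10 / 2)  # z_{alpha/2} for alpha = 0.10

print(f"z = {z:.2f}, critical value = {critical:.3f}, p-value = {p_value:.2f}")
print("reject H0" if z > critical else "do not reject H0")
```

The test statistic falls below the critical value and the p-value is above 0.10, so both routes lead to the same decision.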
9
Minitab
Minitab computes the test statistic as:
z = |(p̂ - 0.5)/sqrt(0.5(1 - 0.5)/n)| = |(0.606 - 0.5)/sqrt(0.25/33)| = 1.22
Since 0.25 >= p(1 - p) for any p, this is more conservative (larger denominator = smaller test statistic). Either way is fine.
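To see the effect of the two denominators side by side, here is a small sketch in plain Python. It does not call Minitab; the variable names are illustrative only.

```python
from math import sqrt

p_hat, p0, n = 20 / 33, 0.5, 33

# Standard error estimated from the sample proportion
se_sample = sqrt(p_hat * (1 - p_hat) / n)
# Standard error computed under H0: p = 0.5 (the version the slide attributes to Minitab)
se_null = sqrt(p0 * (1 - p0) / n)

z_sample_se = abs(p_hat - p0) / se_sample
z_null_se = abs(p_hat - p0) / se_null

print(f"sample-based SE: z = {z_sample_se:.2f}")
print(f"null-based SE:   z = {z_null_se:.2f}")
```

Because p(1 - p) is largest at p = 0.5, the null-based standard error is never smaller than the sample-based one when H0 is p = 0.5, so its test statistic is never larger.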
10
Difference between two means
PCB Data
- Sample 1: Treatment = PCB 156
- Sample 2: Treatment = PCB 156 + estradiol
Response = estrogen produced by cells
Question: Can we conclude that the average estrogen produced in sample 1 is different from the average in sample 2 (at α = 0.05)?
11
H0: μ1 - μ2 = 0
HA: μ1 - μ2 does not = 0
Test statistic: |(Estimate - value under H0)/Std Dev(Estimate)|
z = (x̄1 - x̄2)/sqrt(s1^2/n1 + s2^2/n2)
Reject if |z| > z_{α/2}
P-value = 2*Pr[ Z > |x̄1 - x̄2|/sqrt(s1^2/n1 + s2^2/n2) ] where Z~N(0,1).
12
            n     x̄      s
PCB156      96    1.93   1.00
PCB156+E    64    2.16   1.01
|z| = |-0.23/sqrt(1.00^2/96 + 1.01^2/64)| = |-1.42| = 1.42
z_{α/2} = z_{0.05/2} = z_{0.025} = 1.96
So don’t reject.
P-value = 2*Pr( Z > 1.42 ) = 16%, i.e. Pr( |Test statistic| > 1.42 when H0 is true )
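The PCB calculation can be reproduced from the summary statistics alone. The sketch below uses only the Python standard library; `two_sample_z_test` and its parameter names are illustrative, not part of any package.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """Large-sample two-sided z test for a difference between two means.

    Returns (z, p_value), with z = (estimate - null value) / standard error.
    """
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = (mean1 - mean2 - null_diff) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Summary statistics from the PCB slide
z, p_value = two_sample_z_test(1.93, 1.00, 96, 2.16, 1.01, 64)
critical = NormalDist().inv_cdf(1 - 0.05 / 2)  # about 1.96

print(f"|z| = {abs(z):.2f}, p-value = {p_value:.2f}")
print("reject H0" if abs(z) > critical else "do not reject H0")
```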
13
In General, Large Sample 2 sided Tests:
Test statistic: |z| = |(Estimate - value under H0)/Std Dev(Estimate)|
Reject if |z| > z_{α/2}
P-value = 2*Pr( Z > |z| ) where Z~N(0,1).
14
Large Sample Hypothesis Tests: summary for means

Single mean
Hypotheses: H0: μ = k; HA: μ does not = k
Test (level 0.05): Reject H0 if |(x̄ - k)/(s/sqrt(n))| > 1.96
p-value: 2*Pr( Z > |(x̄ - k)/(s/sqrt(n))| ) where Z~N(0,1)

Difference between two means
Hypotheses: H0: μ1 - μ2 = D; HA: μ1 - μ2 does not = D
Test (level 0.05):
Let d = x̄1 - x̄2
Let SE = sqrt(s1^2/n1 + s2^2/n2)
Reject H0 if |(d - D)/SE| > 1.96
p-value: 2*Pr( Z > |(d - D)/SE| ) where Z~N(0,1)
15
Large Sample Hypothesis Tests: summary for proportions

Single proportion
Hypotheses: H0: true p = k; HA: p does not = k
Test (level 0.05): Reject H0 if |(p̂ - k)/sqrt(p̂(1 - p̂)/n)| > 1.96
p-value: 2*Pr( Z > |(p̂ - k)/sqrt(p̂(1 - p̂)/n)| ) where Z~N(0,1)

Difference between two proportions
Hypotheses: H0: p1 - p2 = 0; HA: p1 - p2 does not = 0
Test (level 0.05):
Let d = p̂1 - p̂2
Let p̂ = total “success”/(n1 + n2)
Let SE = sqrt(p̂(1 - p̂)/n1 + p̂(1 - p̂)/n2)
Reject H0 if |d/SE| > 1.96
p-value: 2*Pr( Z > |d/SE| ) where Z~N(0,1)
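As a sketch of the two-proportion recipe, the code below pools the successes to get the standard error and tests whether the two proportions are equal. The slides give no data for this case, so the counts (45 of 100 versus 30 of 100) are made up purely for illustration, as is the function name.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes1, n1, successes2, n2):
    """Large-sample two-sided z test of H0: p1 - p2 = 0 with a pooled SE.

    Returns (z, p_value).
    """
    d = successes1 / n1 - successes2 / n2
    pooled = (successes1 + successes2) / (n1 + n2)  # total successes / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) / n1 + pooled * (1 - pooled) / n2)
    z = d / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts, not taken from the slides
z, p_value = two_proportion_z_test(45, 100, 30, 100)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
print("reject H0 at level 0.05" if abs(z) > 1.96 else "do not reject H0 at level 0.05")
```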
16
Hypothesis tests versus confidence intervals
The following is discussed in the context of tests / CI’s for a single mean, but it’s true for all the confidence intervals / tests we have done.
A two sided level α hypothesis test of H0: μ = k vs HA: μ does not equal k rejects H0 if and only if k is not in a 1-α confidence interval for the mean.
A one sided level α hypothesis test of H0: μ ≥ k rejects H0 if and only if a level 1-2α confidence interval is completely to the left of k.
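The two-sided equivalence is easy to check numerically. The sketch below uses invented summary statistics (x̄ = 52, s = 10, n = 100) and illustrative function names, and compares the test decision with the confidence interval for a range of null values k.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_test_rejects(xbar, s, n, k, alpha=0.05):
    """True if the level-alpha two-sided z test rejects H0: mu = k."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs((xbar - k) / (s / sqrt(n))) > z_crit


def confidence_interval(xbar, s, n, alpha=0.05):
    """Large-sample (1 - alpha) confidence interval for the mean."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


# Hypothetical summary statistics, not from the slides
xbar, s, n = 52.0, 10.0, 100
low, high = confidence_interval(xbar, s, n)
print(f"95% CI: ({low:.2f}, {high:.2f})")

for k in (49.0, 50.0, 51.0, 53.0, 54.0, 55.0):
    rejected = two_sided_test_rejects(xbar, s, n, k)
    outside = not (low <= k <= high)
    print(f"k = {k}: test rejects = {rejected}, k outside CI = {outside}")
```

For every k the two columns agree, which is the statement on the slide for the two-sided case.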
17
Hypothesis tests versus confidence intervals
The previous slide said that confidence intervals can be used to do hypothesis tests. CI’s are “better” since they contain more information.
Fact: Hypothesis tests and p-values are very commonly used by scientists who use statistics.
Advice:
1. Use confidence intervals to do hypothesis testing.
2. Know how to compute and interpret p-values.
18
Type 1 and Type 2 Errors
                       Truth
Action                 H0 True         HA True
Fail to Reject H0      correct         Type 2 error
Reject H0              Type 1 error    correct
Significance level = α = Pr(Making type 1 error)
Power = 1 - Pr(Making type 2 error)
19
In terms of our folate example, suppose we repeated the experiment and sampled 333 new people.
Pr( Type 1 error ) = Pr( reject H0 when mean is 300 )
= Pr( |Z| > z_{0.025} )
= Pr( Z > 1.96 ) + Pr( Z < -1.96 )
= 0.05 = α
When mean is 300, then Z, the test statistic, has a standard normal distribution.
Note that the test is designed to have type 1 error = α.
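One way to make “repeating the experiment” concrete is a small simulation. The sketch below assumes the sample mean is exactly normal with mean 300 and standard error 193.4/sqrt(333) (the values used on the power slide), draws many such sample means under H0, and counts how often the test rejects. The function name, seed and number of repetitions are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist

NULL_MEAN = 300.0
SE = 193.4 / sqrt(333)  # standard error of the sample mean, about 10.6
Z_CRIT = 1.96

# Exact type 1 error of the test: Pr(|Z| > 1.96) for Z ~ N(0, 1)
exact_alpha = 2 * (1 - NormalDist().cdf(Z_CRIT))


def simulated_type1_rate(reps=20000, seed=0):
    """Fraction of simulated experiments that reject H0 when H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        xbar = rng.gauss(NULL_MEAN, SE)  # sample mean when the true mean is 300
        z = (xbar - NULL_MEAN) / SE
        if abs(z) > Z_CRIT:
            rejections += 1
    return rejections / reps


rate = simulated_type1_rate()
print(f"exact: {exact_alpha:.3f}, simulated: {rate:.3f}")
```

The simulated rejection rate should land close to 0.05, up to simulation noise.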
20
Power = Pr( reject H0 when mean is not 300 )
Here, taking the true mean to be 310:
Pr( reject H0 when mean is 310 )
= Pr( |(X̄ - 300)/(193.4/sqrt(333))| > 1.96 )
= Pr( (X̄ - 300)/10.6 > 1.96 ) + Pr( (X̄ - 300)/10.6 < -1.96 )
= Pr( X̄ > 320.8 ) + Pr( X̄ < 279.2 )
= Pr( (X̄ - 310)/10.6 > (320.8 - 310)/10.6 ) + Pr( (X̄ - 310)/10.6 < (279.2 - 310)/10.6 )
= Pr( Z > 1.02 ) + Pr( Z < -2.90 ) where Z~N(0,1)
= 0.15 + 0.00 = 0.15
In other words, if the true mean is 310, there’s an 85% chance that we will not detect it. If 310 is scientifically significantly different from 300, then this means that our experiment was wasted in some sense.
As n increases, power goes up.
As the standard deviation of x decreases, power goes up.
As α increases, power goes up.
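The power calculation above can be written as a short function. This is a sketch under the same large-sample normal approximation; `power_two_sided` is an illustrative name. Because it carries full precision instead of rounding to 10.6, 320.8 and 1.02, its answer for a true mean of 310 comes out slightly above the slide’s 0.15.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided(true_mean, null_mean, sd, n, alpha=0.05):
    """Pr(reject H0: mu = null_mean) for a two-sided z test when the
    true mean is true_mean, using the normal approximation for X-bar."""
    se = sd / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    upper = null_mean + z_crit * se  # reject if X-bar is above this
    lower = null_mean - z_crit * se  # or below this
    xbar_dist = NormalDist(true_mean, se)  # distribution of X-bar under the true mean
    return (1 - xbar_dist.cdf(upper)) + xbar_dist.cdf(lower)


power = power_two_sided(310, 300, 193.4, 333)
print(f"power at true mean 310: {power:.3f}")

# The three monotonicity statements from the slide
print(power_two_sided(310, 300, 193.4, 1000) > power)             # larger n
print(power_two_sided(310, 300, 100.0, 333) > power)              # smaller std dev
print(power_two_sided(310, 300, 193.4, 333, alpha=0.10) > power)  # larger alpha
```

Setting the true mean equal to the null mean (300) returns the significance level, which ties power back to the type 1 error calculation.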
21
Picture for Power
[Figure: power, “Pr(Reject H0 when it’s false)”, plotted against the true mean (260 to 340) for n = 333 and α = 0.05.]
As n increases and/or α increases and/or std dev decreases, these curves become steeper.
22
Power calculations are a very important part of planning any experiment:
Given:
- a certain level of α
- preliminary estimate of std dev (of x’s that go into x̄)
- difference that is of interest
Compute required n in order for power to be at least 85% (or some other percentage...)
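A brute-force version of this sample size calculation is sketched below: increase n until the two-sided power reaches the target. It assumes the large-sample normal approximation, and the function names are illustrative. The inputs in the example (difference of 10, std dev 193.4, α = 0.05, 85% power) are borrowed from the folate power slide.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided(diff, sd, n, alpha=0.05):
    """Power of a two-sided z test when the true mean is diff away from H0."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    shift = diff / (sd / sqrt(n))  # mean of the test statistic under the alternative
    std_normal = NormalDist()
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


def required_n(diff, sd, target_power=0.85, alpha=0.05, max_n=1_000_000):
    """Smallest n whose power reaches target_power (simple linear search)."""
    for n in range(2, max_n + 1):
        if power_two_sided(diff, sd, n, alpha) >= target_power:
            return n
    raise ValueError("target power not reached within max_n")


n_needed = required_n(diff=10, sd=193.4)
print(f"required n: {n_needed}")
print(f"power at that n: {power_two_sided(10, 193.4, n_needed):.3f}")
```

The linear search is slow for very large n but keeps the logic transparent. It also shows that detecting a difference of 10 with 85% power needs roughly ten times the 333 people in the example.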
23
Power calculations are an integral part of planning any experiment:
Bad News: Algebraically messy (but you should know how to do them)
Good News: Minitab can be used to do them: Stat: Power and Sample Size…
- Inputs:
  1. required power
  2. difference of interest
- Output: Result = required sample size
- Options: Change α, one sided versus 2 sided tests