Review Ordering company jackets, different men’s and women’s styles, but HR only has database of employee heights. How to divide people so only 5% of.


1 Review
Ordering company jackets, different men’s and women’s styles, but HR only has database of employee heights. How to divide people so only 5% of women get men’s jackets?
Heights: 5’2” 5’3” 5’4” 5’5” 5’6” 5’7” 5’8” 5’9” 5’10” 5’11” 6’0” 6’1”
Women: 5% 10% 12% 21% 27% 15% 3% 1% ~0%
Men: 4% 8% 26% 14% 6% 2%

2 Hypothesis Testing 10/11

3 Where Am I?
Wake up after a rough night in unfamiliar surroundings
Still in Boulder?
  Expected if in Boulder (large likelihood)
  Couldn’t happen IF in Boulder (likelihood near zero) → Can’t be in Boulder
  Surprising but not impossible (moderate likelihood)

4 Steps of Hypothesis Testing
State clearly the two hypotheses
Determine which is the null hypothesis (H0) and which is the alternative hypothesis (H1)
Compute a relevant test statistic from the sample
Find the likelihood function of the test statistic according to the null hypothesis
Choose alpha level (α): how willing you are to abandon null (usually .05)
Find the critical value: cutoff with probability α of being exceeded under H0
Compare the actual result to the critical value
  Less than critical value → retain null hypothesis
  Greater than critical value → reject null hypothesis; accept alternative hypothesis
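The steps above can be sketched with a binomial test. This is only an illustration under assumed numbers: 20 trials, H0 of a 50% chance of a correct outcome, H1 of a greater than 50% chance, and alpha of .05 are hypothetical, as are the function names. The critical count is defined here as the smallest number of correct outcomes whose upper-tail probability under H0 is at most alpha, so the null is rejected when the observed count reaches it.

```python
from math import comb

def upper_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def critical_count(n, p0, alpha):
    """Smallest count k with P(X >= k | H0) <= alpha."""
    for k in range(n + 1):
        if upper_tail(n, p0, k) <= alpha:
            return k
    return n + 1  # no possible count is extreme enough to reject H0

def binomial_test(n_correct, n, p0=0.5, alpha=0.05):
    """Return the critical count and the decision for an observed count."""
    k_crit = critical_count(n, p0, alpha)
    decision = "reject H0" if n_correct >= k_crit else "retain H0"
    return k_crit, decision

# Hypothetical data: 16 correct out of 20 trials.
print(binomial_test(16, 20))
```

Because the binomial distribution is discrete, the actual probability of reaching the critical count under H0 is usually somewhat below alpha rather than exactly equal to it.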

5 Specifying Hypotheses
Both hypotheses are statements about population parameters
Null Hypothesis (H0)
  Always more specific, e.g. 50% chance, mean of 100
  Usually the less interesting, "default" explanation
Alternative Hypothesis (H1)
  More interesting: researcher’s goal is usually to support the alternative hypothesis
  Less precise, e.g. > 50% chance, μ > 100

6 Test Statistic
Statistic computed from sample to decide between hypotheses
Relevant to hypotheses being tested
  Based on mean if hypotheses are about means
  Based on number correct (frequency) if hypotheses are about probability correct
Sampling distribution according to null hypothesis must be fully determined
  Can only depend on data and on values assumed by H0
Often a complex formula with little intuitive meaning
  Inferential statistic: Only used in testing reliability
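For hypotheses about a mean, one common test statistic is the one-sample t statistic, the difference between the sample mean M and the null value divided by the estimated standard error of the mean. The sketch below assumes that statistic; the scores and the null value of 100 are hypothetical. Note that it depends only on the data and on the value assumed by H0.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(sample, mu0):
    """One-sample t statistic: (M - mu0) / (s / sqrt(n))."""
    n = len(sample)
    m = mean(sample)                        # sample mean M
    standard_error = stdev(sample) / sqrt(n)  # estimated standard error of M
    return (m - mu0) / standard_error

# Hypothetical scores, tested against a null mean of 100.
scores = [104, 98, 110, 107, 95, 112, 101, 109]
print(round(t_statistic(scores, 100), 3))
```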

7 Likelihood Function
Probability distribution of a statistic according to a hypothesis
  Gives probability of obtaining any possible result
Usually interested in distribution of test statistic according to null hypothesis
Same as sampling distribution, assuming the population is accurately described by the hypothesis
Test statistic chosen because we know its likelihood function
  Binomial test: Binomial distribution
  t-test: t distribution
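As a small illustration of a likelihood function, the sketch below tabulates the binomial distribution of the number of correct outcomes under a null hypothesis. The choice of 10 trials and a 50% chance per trial is hypothetical, as is the function name.

```python
from math import comb

def binomial_likelihood(n, p):
    """Probability of each possible count 0..n under Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# Hypothetical null hypothesis: 10 trials, 50% chance of a correct outcome.
dist = binomial_likelihood(10, 0.5)
for k, prob in enumerate(dist):
    print(f"{k:2d} correct: {prob:.4f}")
```

The probabilities sum to one, and results far from 5 correct are the ones the null hypothesis treats as unlikely.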

8 Critical Value
Cutoff for test statistic between retaining and rejecting null hypothesis
  If test statistic is beyond critical value, null will be rejected
  Otherwise, null will be retained
Before collecting data: What strength of evidence will you require to reject null?
  How many correct outcomes?
  How big a difference between M and μ0, relative to σM?
Critical region
  Range of values that will lead to rejecting null hypothesis
  All values beyond critical value
[Figures: probability plotted against frequency; probability plotted against t]

9 Types of Errors
Goal: Reject null hypothesis when it’s false; retain it when it’s true
Two ways to be wrong
  Type I Error: Null is correct but you reject it
  Type II Error: Null is false but you retain it
Type I Error rate
  IF H0 is true, probability of mistakenly rejecting H0
  Proportion of false theories we conclude are true
  E.g., proportion of useless treatments that are deemed effective
Logic of hypothesis testing is founded on controlling Type I Error rate
  Set critical value to give desired Type I Error rate
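One way to see what controlling the Type I Error rate means is to simulate many experiments in which the null hypothesis is true and count how often it gets rejected. The sketch below is an assumption-laden illustration, not part of the slides: it uses a one-tailed z-test with a known population standard deviation, and the population values (mean 100, SD 15), sample size, and number of simulated experiments are all hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_one_error_rate(n_experiments=20000, n=25, mu0=100.0, sigma=15.0,
                        alpha=0.05, seed=0):
    """Fraction of simulated experiments that reject H0 although H0 is true."""
    rng = random.Random(seed)                 # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha)  # one-tailed critical value
    standard_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(n_experiments):
        # Draw a sample from the population described by H0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / standard_error
        if z > z_crit:
            rejections += 1
    return rejections / n_experiments

rate = type_one_error_rate()
print(f"Simulated Type I Error rate: {rate:.3f}")
```

The simulated rate should land close to the chosen alpha, which is the sense in which the critical value is set to give the desired Type I Error rate.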

10 Alpha Level
Choice of acceptable Type I Error rate
  Usually .05 in psychology
  Higher → more willing to abandon null hypothesis
  Lower → require stronger evidence before abandoning null hypothesis
Determines critical value
  Under the sampling distribution of the test statistic according to the null hypothesis, the probability of a result beyond the critical value is α
[Figure: sampling distribution of the test statistic from H0, with area α beyond the critical value]
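The link between alpha and the critical value can be sketched numerically. As an assumption for illustration, the test statistic here is taken to follow a standard normal distribution under H0 and the test is one-tailed; the function name is hypothetical.

```python
from statistics import NormalDist

def critical_value(alpha):
    """Cutoff with probability alpha of being exceeded under a standard normal H0."""
    return NormalDist().inv_cdf(1 - alpha)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f} -> critical value = {critical_value(alpha):.3f}")
```

A lower alpha pushes the critical value further into the tail, which is what requiring stronger evidence looks like in practice.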

11 Doping Analogy
Measure athletes' blood for signs of doping
  Cheaters have high RBCs, but even honest people vary
What rule to use?
  Must set some cutoff, and punish anyone above it
  Will inevitably punish some innocent people
H0 likelihood function is like distribution of innocent athletes’ RBCs
Cutoff determines fraction of innocent people that get unfairly punished
  This fraction is alpha
[Figure: distribution of innocent athletes’ RBC, split by the cutoff into "Don’t Punish" and "Punish" regions]

12 Power
Type II Error rate
  IF H0 is false, probability of failing to reject it
  E.g., fraction of cheaters that don’t get caught
Power
  IF H0 is false, probability of correctly rejecting it
  Equal to one minus Type II Error rate
  E.g., fraction of cheaters that get caught
Power depends on sample size
  Choose sample size to give adequate power
  Researchers must make a guess at effect size to compute power
[Figure: H0 and H1 distributions, with regions marked Type I error rate (α), Type II error rate, and Power]
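How power grows with sample size can be sketched for a simple case. The example assumes a one-tailed z-test with known population standard deviation, and it expresses the guessed effect size in standard deviation units; the effect size of 0.5 and the sample sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power(effect_size, n, alpha=0.05):
    """Power of a one-tailed z-test when the true mean lies effect_size SDs above the null mean."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)
    # Under H1 the z statistic is normal with mean effect_size * sqrt(n) and SD 1,
    # so power is the probability that it lands beyond the critical value.
    return 1 - z.cdf(z_crit - effect_size * sqrt(n))

for n in (10, 25, 50):
    p = power(0.5, n)
    print(f"n = {n:2d}: power = {p:.2f}, Type II error rate = {1 - p:.2f}")
```

When the effect size is zero the function returns alpha, since rejecting a true null is exactly the Type I Error rate.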

13 Two-Tailed Tests
Sometimes want to detect effects in either direction
  Drugs that help or drugs that hurt
Formalized in alternative hypothesis
  μ < μ0 or μ > μ0
Two critical values, one in each tail
Type I error rate is sum from both critical regions
  Need to divide errors between both tails
  Each gets α/2 (2.5%)
[Figure: t distribution with critical values -tcrit and tcrit; H0 is rejected in either tail, each with area α/2; M and μ0 marked]

14 One-Tailed vs. Two-Tailed Tests
[Figure: one-tailed test with area α beyond tcrit; two-tailed test with area α/2 beyond each of -tcrit and tcrit]
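The difference between the two kinds of test shows up in the critical values. The slides use the t distribution, which the Python standard library does not provide, so this sketch substitutes the standard normal distribution as a stand-in (an assumption that is reasonable only for large samples). The function name is hypothetical.

```python
from statistics import NormalDist

def critical_values(alpha, two_tailed):
    """Critical value(s) under a standard normal H0."""
    z = NormalDist()
    if two_tailed:
        # Split alpha evenly: alpha / 2 in each tail.
        upper = z.inv_cdf(1 - alpha / 2)
        return (-upper, upper)
    # One-tailed: all of alpha in the upper tail.
    return (z.inv_cdf(1 - alpha),)

one = critical_values(0.05, two_tailed=False)
two = critical_values(0.05, two_tailed=True)
print("one-tailed:", [round(v, 3) for v in one])
print("two-tailed:", [round(v, 3) for v in two])
```

At the same alpha, the two-tailed cutoffs sit further from the center than the one-tailed cutoff, because each tail is only allowed half of the error rate.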

15 An Alternative View: p-values
Reversed approach to hypothesis testing
  After you collect sample and compute test statistic
  How big must α be to reject H0
p-value
  Measure of how consistent data are with H0
  Probability of a value equal to or more extreme than what you actually got
  Large p-value → H0 is a good explanation of the data
  Small p-value → H0 is a poor explanation of the data
p > α: Retain null hypothesis
p < α: Reject null hypothesis; accept alternative hypothesis
Researchers generally report p-values, because then reader can choose own alpha level
  E.g. “p = .03”
  If willing to allow 5% error rate, then accept result as reliable
  If more stringent, say 1% (α = .01), then remain skeptical
[Figure: t distribution marking tcrit for α = .05, tcrit for α = .03, and tcrit for α = .01; t = 2.15 → p = .03]
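The p-value logic can be sketched as follows. The slide's example is a t statistic whose degrees of freedom are not given, so this illustration substitutes a standard normal distribution and a two-tailed test. Both are assumptions, and the resulting p-value will only roughly match the slide's figure.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Probability of a result at least as extreme as z, in either direction, under a standard normal H0."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def decide(p, alpha):
    """Compare a p-value with the reader's chosen alpha level."""
    return "reject H0" if p < alpha else "retain H0"

p = two_tailed_p(2.15)  # statistic value taken from the slide, treated here as z
print(f"p = {p:.3f}")
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {decide(p, alpha)}")
```

The same p-value leads to rejection at a 5% error rate but not at 1%, which is why reporting p lets each reader apply their own alpha level.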

