Hypothesis Testing
Research hypotheses are formulated in terms of the outcome that the experimenter wants, and an alternative outcome that he doesn't want. E.g., if we're comparing scores on an exam between two groups, one with test anxiety and one without, our hypotheses are:
(1) The group with test anxiety will score lower (expected outcome)
(2) The two groups will score the same (unexpected outcome)
Hypothesis Testing
The hypothesis that outlines the outcome we're expecting/hoping for is the Research Hypothesis (H₁). The hypothesis that runs counter to our expectations is the Null Hypothesis (H₀).
Hypothesis Testing
We can use the sampling distribution of the mean to determine the probability that we would obtain our sample's mean by chance, the same way we could convert a score to a z-score and determine the probability of obtaining values higher or lower than it.
Hypothesis Testing
If the probability is low (i.e. only a 5% chance or less), we can assume that chance sampling error did not produce our results, and our IV did. E.g., in our comparison of people with test anxiety, our test-anxious group may also be quite dumb, resulting in their poor test scores. However, if their scores are extreme enough (low), we can discount even that possibility.
Hypothesis Testing
Why bother with H₀ at all? Technically, we can never prove a particular hypothesis to be true. You cannot prove the statement "All ducks are black", because you would need observations on all ducks that were, are, and ever will be (i.e. on all ducks). You can, however, disprove a hypothesis: "All ducks are black" is easily proven false by seeing one white (non-black) duck. This is why, technically, we are supposed to talk about "rejecting H₀" or "failing to reject H₀", never "accepting H₁" or "proving H₀".
Hypothesis Testing
Beginning with the assumption that H₀ is true, and trying to disprove it, also maintains the scientific spirit of objectivity and skepticism. Objectivity: it shows that we value the results of the data more than the hypothesis that, if supported, would make us happiest (H₁). Skepticism: it shows that we are not convinced of even our own hypothesis until it is confirmed by the data.
Hypothesis Testing
In our example of people with (x₁) and without test anxiety (x₂), where our hypothesis is that people with anxiety will have lower IQ scores:
H₀: x₁ ≥ x₂
H₁: x₁ < x₂
Hypothesis Testing
If, instead, we were testing whether the group with anxiety was different from the average student population (Hint: look at the italics), how would we phrase H₀ and H₁? What if we were testing whether or not the two groups (x₁ & x₂) were equal?
Hypothesis Testing
How do we know when our sample is rare enough to reject H₀? Statistical convention says that when the probability of obtaining a mean as extreme as the one you've obtained is only 5% or less, we can say this is not due to chance. This probability of rejecting H₀ when it is true (i.e. screwing up) goes by several names: the significance level, rejection level, alpha, or critical value. HOWEVER, THIS DOES NOT MEAN THAT 5.1% IS MEANINGLESS!
p < .05
Hypothesis Testing
For our group with test anxiety, if their mean score on an IQ test was 70, we first convert this into a z-score (μ = 100, σ = 15):
z = (70 − 100)/15 = −2
Since our H₁ is that the group with anxiety will score lower than those without, we look at the percent in the "Lesser Portion".
Hypothesis Testing
Looking at Table E.10, the probability of obtaining a score at or below z = −2 is .0228, or 2.3%. Since this is below the 5% convention, we would reject H₀ (or "accept" H₁).
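The table lookup above can be reproduced with Python's standard library. This is a sketch using `statistics.NormalDist` in place of Table E.10; the values μ = 100 and σ = 15 come from the slides:

```python
from statistics import NormalDist

mu, sigma = 100, 15   # IQ: population mean and standard deviation (from the slides)
sample_mean = 70      # mean IQ of the test-anxious group

z = (sample_mean - mu) / sigma   # (70 - 100) / 15 = -2.0
p_lower = NormalDist().cdf(z)    # "lesser portion" of the normal curve

print(z)                  # -2.0
print(round(p_lower, 4))  # 0.0228, matching the table
print(p_lower < 0.05)     # True -> reject H0
```

The same `cdf` call replaces any z-table lookup: pass it a z-score and it returns the proportion of the distribution at or below that score.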
Hypothesis Testing
α is the p("accepting" H₁ when it is false / rejecting H₀ when it is true), i.e. of making a mistake called a Type I Error. Note that p("accepting" H₁ when it is false) ≠ p("accepting" H₁): the former refers to a type of error, the latter simply to an outcome. What about the p("accepting" H₀ when it is false / rejecting H₁ when it is true)? This is called a Type II Error, or β (Beta).
Hypothesis Testing
Why not make α as small as possible? Because as α [p(Type I Error)] decreases, β [p(Type II Error)] increases. (In the figure: red = α, blue = β.)
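The trade-off can be demonstrated numerically. This sketch assumes a one-tailed (lower) z test with H₀: μ = 100, σ = 15, and a hypothetical true mean of 90 under H₁; the value 90 is invented for illustration, not taken from the slides:

```python
from statistics import NormalDist

def beta_for_alpha(alpha, mu0=100, mu1=90, sd=15):
    """Type II error rate (beta) for a one-tailed (lower) z test.

    mu0 is the mean under H0; mu1 is a hypothetical true mean under H1.
    """
    nd = NormalDist()
    cutoff = mu0 + nd.inv_cdf(alpha) * sd   # reject H0 when the mean falls below this
    return 1 - nd.cdf((cutoff - mu1) / sd)  # chance the mean lands above it anyway

# Shrinking alpha pushes the cutoff further out, so more true effects are missed:
print(beta_for_alpha(0.05) < beta_for_alpha(0.01))  # True
```

Lowering α from .05 to .01 moves the rejection cutoff away from H₀'s mean, which is exactly the red region shrinking while the blue region grows in the figure.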
Hypothesis Testing
It seems like we care more about Type I Error than Type II Error. Why? Scientists are more likely to commit a Type I Error because they are more motivated to prove their own hypothesis (H₁). In law, establishing motive is important to proving guilt; without a motive, there's little reason to expect that a crime will occur, let alone to stringently protect against it.
Hypothesis Testing
So long as we're only willing to take a 5% chance of incorrectly rejecting H₀, it doesn't matter how we distribute this 5%, as long as it doesn't exceed 5% in total. We can place all 5% in one "tail" of the distribution if we only expect a difference in means in one direction: a One-Tailed/Directional Test. We can place half of the 5% (2.5%) in each "tail" if we have no a priori (beforehand) hypothesis about where our mean difference will fall: a Two-Tailed/Non-Directional Test. The decision of which type of test to use should be made a priori, based on theory, not driven by the data.
Hypothesis Testing
[Figure: One-Tailed Test vs. Two-Tailed Test]
Hypothesis Testing
H₀ and H₁ with One- and Two-Tailed Tests:
For One-Tailed Tests, if our hypothesis is that group x is lower than group y:
H₀: x ≥ y
H₁: x < y
For Two-Tailed Tests, if our hypothesis is that group x is either greater than or less than group y:
H₀: x = y
H₁: x ≠ y
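The cutoff z-values implied by the two layouts can be checked with the standard library; a quick sketch at α = .05:

```python
from statistics import NormalDist

nd = NormalDist()
alpha = 0.05

one_tailed = nd.inv_cdf(1 - alpha)      # all 5% in one tail
two_tailed = nd.inv_cdf(1 - alpha / 2)  # 2.5% in each tail

print(round(one_tailed, 3))  # 1.645
print(round(two_tailed, 2))  # 1.96
```

Because the one-tailed cutoff (1.645) sits closer to the center than the two-tailed cutoff (1.96), a directional test rejects H₀ more easily, but only for differences in the predicted direction.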
Hypothesis Testing
Psychologists can be sneaky bastards and covertly increase α by testing one hypothesis many times, by:
Evaluating one hypothesis with many different statistical tests
Using more than one measure to operationalize one DV
E.g., measuring depression with both the Beck Depression Inventory-II (BDI-II) and the Minnesota Multiphasic Personality Inventory-II (MMPI-II) = testing depression twice = doubling your α
Hypothesis Testing
What should you do to prevent this from happening? If you're testing one hypothesis many different ways or with many measures, adjust α accordingly with the Bonferroni Correction. (Note: NOT the same as the Beeferoni™ Correction, which prevents incorrect preparation of Chef Boyardee™ products.)
Testing with 2 tests: use α = .05/2 = .025
Testing with 3 measures of one construct: use α = .05/3 = .0167
Testing with 2 tests and 3 measures: use α = .05/6 = .008
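The correction itself is just division. A minimal sketch of the three cases above (the helper name `bonferroni` is ours, not a library function):

```python
def bonferroni(alpha, n_tests, n_measures=1):
    """Per-comparison alpha when one hypothesis is tested several ways."""
    return alpha / (n_tests * n_measures)

print(bonferroni(0.05, 2))               # 0.025  (two tests)
print(round(bonferroni(0.05, 3), 4))     # 0.0167 (three measures)
print(round(bonferroni(0.05, 2, 3), 3))  # 0.008  (two tests x three measures)
```

Each individual comparison is then judged against the smaller per-comparison α, so the family-wide chance of a Type I Error stays at (roughly) the original .05.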
Hypothesis Testing
Example: Your hypothesis is that males and females will differ in degree of instrumental aggression (IA = aggression designed to obtain an end). IA is measured with the Instrumental Aggression Scale (IAS) and the Positive and Negative Affect Scale (PANAS), and the groups are evaluated with both ANOVA and SEM. What is your corrected α-level?
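One reading of the example, following the deck's own "tests × measures" counting rule from the previous slide (how to count the comparisons is our assumption): two measures crossed with two analyses gives four comparisons.

```python
measures = 2  # IAS and PANAS
analyses = 2  # ANOVA and SEM

corrected_alpha = 0.05 / (measures * analyses)
print(round(corrected_alpha, 4))  # 0.0125
```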
Hypothesis Testing
Three of the Ten Commandments of Statistics:
1. P-values indicate the probability that your findings occurred by chance, NOT the strength of the relationship between an IV and DV.
E.g., NEVER SAY: "In my experiment evaluating the influence of coffee (the IV) on people's activity levels (the DV), I found highly significant results at p = .000001, indicating that coffee produces a lot of activity in people."
CORRECT: "The likelihood that the effect, that coffee boosted activity levels, was due to sampling error (i.e. chance) was only .000001."
Hypothesis Testing
Three of the Ten Commandments of Statistics (continued):
2. p = .052, .055, etc. is not "insignificant", and does not mean that a relationship between your IV and DV does not exist, just that it did not meet "conventional" levels of significance.
3. When testing a hypothesis multiple ways, always use some corrected level of α (e.g. the Bonferroni Correction).