
1 Smoking Data The investigation was based on examining the effectiveness of smoking cessation programs among heavy smokers who are also recovering alcoholics.

2 Smoking Data ALA – American Lung Association

3 Smoking Data Note that gender has been coded

4 Two-sample Correlated t-test Do rates of smoking decrease from pre-intervention to post-intervention?

5 Two-sample Correlated t-test Analyze > Compare Means > Paired-Samples t Test

6 Two-sample Correlated t-test Highlight pre, move across. You will see that pre now appears as Variable 1 in the Paired Variables box.

7 Two-sample Correlated t-test Highlight post, move across. You will see that post now appears as Variable 2 in the Paired Variables box.

8 Two-sample Correlated t-test Click on OK to run the analysis or Paste to preserve the syntax.

9 Two-sample Correlated t-test Syntax:
GET FILE='\\Client\f$\spss\1\1.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.
T-TEST PAIRS = pre WITH post (PAIRED)
/CRITERIA = CI(.95)
/MISSING = ANALYSIS.
Note that the syntax can even include a command to load the data file.

10 Two-sample Correlated t-test The first table of the printout contains descriptive statistics, while the second table contains inferential statistics. Studying the printout, you can identify n, Mean, and Std for each group.

11 Two-sample Correlated t-test The second table contains the inferential statistics. You can identify t, df, and p (p is in the column labelled "Sig. (2-tailed)"). In this case, the mean number of cigarettes smoked prior to the intervention programs was significantly higher than the number smoked after the intervention programs, t(29) = 18.57, p < 0.001.
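The paired t statistic SPSS reports can be cross-checked by hand: it is the mean of the difference scores divided by their standard error. A minimal pure-Python sketch, using made-up cigarette counts for five smokers (not the ALA data):

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """Paired-samples t statistic: t = mean(d) / (sd(d) / sqrt(n)),
    where d are the pre - post difference scores."""
    d = [a - b for a, b in zip(pre, post)]
    n = len(d)
    return mean(d) / (stdev(d) / sqrt(n)), n - 1  # (t, df)

# Hypothetical cigarettes-per-day counts (not the ALA data set).
pre  = [30, 28, 35, 32, 30]
post = [5, 6, 4, 8, 7]
t, df = paired_t(pre, post)
```

With real data the result should match the t and df in the SPSS Paired Samples Test table.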

12 Caution In Psychology there is great reliance on the "p" value. Over the past twenty years serious flaws have been pointed out with this reliance; in general, reporting a confidence interval is recommended. R. Hubbard and R.M. Lindsay, 2008, "Why p Values Are Not a Useful Measure of Evidence in Statistical Significance Testing", Theory and Psychology, 18, 69-88. J.L. Moran et al., 2004, "A farewell to p-values", Critical Care and Resuscitation, 6, 130-137.

13 Caution As pointed out by Johansson:
1. p is uniformly distributed under the null hypothesis and can therefore never indicate evidence for the null.
2. p is conditioned solely on the null hypothesis and is therefore unsuited to quantify evidence, because evidence is always relative in the sense of being evidence for or against a hypothesis relative to another hypothesis.
3. p designates the probability of obtaining evidence (given the null), rather than the strength of evidence.
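Johansson's first point can be illustrated by simulation: when the null hypothesis really is true, p-values are uniform on (0, 1), so about 5% fall below 0.05 and their average sits near 0.5. A hypothetical two-sided z-test (known sigma, pure Python):

```python
import random
from statistics import NormalDist, mean

# Under H0 (true mean = 0, sigma = 1 known), p is uniform on (0, 1).
random.seed(1)
n, reps = 25, 2000
z_cdf = NormalDist().cdf

p_values = []
for _ in range(reps):
    xbar = mean(random.gauss(0, 1) for _ in range(n))
    z = xbar * n ** 0.5               # z = xbar / (sigma / sqrt(n)), sigma = 1
    p_values.append(2 * (1 - z_cdf(abs(z))))

frac_sig = sum(p < 0.05 for p in p_values) / reps  # close to 0.05
avg_p = mean(p_values)                              # close to 0.5
```

A small p therefore says nothing *for* the null: even a uniform scatter of p-values is exactly what a true null produces.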

14 Caution 4. p depends on unobserved data and subjective intentions and therefore implies, given the evidential interpretation, that the evidential strength of observed data depends on things that did not happen and on subjective intentions. T. Johansson, 2011, "Hail the impossible: p-values, evidence, and likelihood", Scandinavian Journal of Psychology, 52, 113-125.

15 Caution For an alternative but equally jaundiced view, see Valen E. Johnson, "Revised standards for statistical evidence", Proceedings of the National Academy of Sciences of the United States of America, 2013, 110(48), 19313-19317, which is nicely summarised in Erika Check Hayden, "Weak statistical standards implicated in scientific irreproducibility", Nature, 11 November 2013, and Geoff Cumming, "The problem with p values: how significant are they, really?"

16 Caution Regina Nuzzo, "Scientific method: Statistical errors", Nature, 506, 150-152, 13 February 2014. P values, the 'gold standard' of statistical validity, are not as reliable as many scientists assume. However, it seems to get the explanation of hypothesis testing wrong!

17 Caution "The misuse of asterisks in hypothesis testing", Dieter Rasch, Klaus D. Kubinger, Jörg Schmidtke and Joachim Häusler, Psychology Science, 46(2), 2004, 227-242. This paper serves to demonstrate that the practice of using one, two, or three asterisks (according to a type-I-risk α of either 0.05, 0.01, or 0.001) in significance testing, particularly in empirical research in psychology, is in no way in accordance with the Neyman-Pearson theory of statistical hypothesis testing. Claiming a posteriori that even a low type-I-risk α leads to significance merely discloses a researcher's self-deception. Furthermore, it is emphasised that by using sequential sampling procedures instead of fixed sample sizes the "practice of asterisks" would not arise.

18 Repeated Measures Analysis of Variance (ANOVA) Do rates of smoking decrease across the four data collection periods? That is, does smoking not only decrease from pre-intervention to post-intervention, but does the rate also continue to decrease during a 6-month and 12-month follow-up?

19 Repeated Measures Analysis of Variance (ANOVA) Analyze > General Linear Model > Repeated Measures

20 Repeated Measures Analysis of Variance (ANOVA) In the Within Subject Factor Name box designate a name for the repeated measures factor; let's call it rate.

21 Repeated Measures Analysis of Variance (ANOVA) In the Number of Levels window type in the number of time periods measured. In this case it is 4.

22 Repeated Measures Analysis of Variance (ANOVA) Click on Add to generate rate(4), then click on Define.

23 Repeated Measures Analysis of Variance (ANOVA) Click on Define. A Repeated Measures box will appear.

24 Repeated Measures Analysis of Variance (ANOVA) Highlight your first time variable, pre, from the list of variables on the left, and click on the upper arrow button to move it into the Within Subject Variables window.

25 Repeated Measures Analysis of Variance (ANOVA) Add the remaining three time variables, post, follow6, and follow12, in the same fashion. Finally, click on the Options button.

26 Repeated Measures Analysis of Variance (ANOVA) Click on the square next to the word Descriptive. Highlight rate in the Factor(s) and Factor Interaction box. Click on the arrow button and click on the square next to the words Compare main effects. Finally, click on Continue.

27 Repeated Measures Analysis of Variance (ANOVA) Finally, click on OK to run the analysis or Paste to preserve the syntax.

28 Repeated Measures Analysis of Variance (ANOVA) Syntax:
GLM pre post follow6 follow12
/WSFACTOR = rate 4 Polynomial
/METHOD = SSTYPE(3)
/EMMEANS = TABLES(rate) COMPARE ADJ(LSD)
/PRINT = DESCRIPTIVE
/CRITERIA = ALPHA(.05)
/WSDESIGN = rate.
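The within-subjects F that this GLM command produces can also be computed by hand, which makes the sums-of-squares partition concrete. A minimal pure-Python sketch of a one-way repeated measures ANOVA (made-up scores for three subjects over three periods, not the ALA data; sphericity assumed):

```python
from statistics import mean

def rm_anova_f(scores):
    """One-way repeated measures ANOVA F (sphericity assumed).
    scores[i][j] = score of subject i in condition j."""
    n, k = len(scores), len(scores[0])
    gm = mean(x for row in scores for x in row)
    cond_means = [mean(row[j] for row in scores) for j in range(k)]
    subj_means = [mean(row) for row in scores]

    ss_total = sum((x - gm) ** 2 for row in scores for x in row)
    ss_cond = n * sum((m - gm) ** 2 for m in cond_means)
    ss_subj = k * sum((m - gm) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj   # condition-by-subject residual

    df_cond, df_error = k - 1, (n - 1) * (k - 1)
    return (ss_cond / df_cond) / (ss_error / df_error), df_cond, df_error

# Made-up data: 3 subjects measured at 3 time periods.
scores = [[1, 2, 4], [2, 3, 4], [3, 4, 5]]
f, df1, df2 = rm_anova_f(scores)
```

Because subject variability is removed into its own sum of squares, the error term is only the condition-by-subject residual, which is why repeated measures designs are more powerful than between-subjects ones.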

29 Repeated Measures Analysis of Variance (ANOVA) You can identify n, Mean, and Std for each of the four time periods in the descriptive statistics output.

30 Repeated Measures Analysis of Variance (ANOVA) - Sphericity ANOVAs with repeated measures (within-subjects factors) are particularly susceptible to the violation of the assumption of sphericity. Sphericity is the condition where the variances of the differences between all combinations of related groups (levels) are equal. Sphericity is violated when the variances of the differences between all combinations of related groups are not equal. Sphericity can be likened to homogeneity of variances in a between-subjects ANOVA.

31 Repeated Measures Analysis of Variance (ANOVA) - Sphericity When the probability of Mauchly's test statistic is greater than or equal to .05 (i.e., p ≥ .05), we fail to reject the null hypothesis that the variances are equal, and we can conclude that the assumption has not been violated. However, when the probability of Mauchly's test statistic is less than .05 (i.e., p < .05), sphericity cannot be assumed, and we would therefore conclude that there are significant differences between the variances. It should be noted that sphericity is always met for two levels of a repeated measures factor and it is, therefore, unnecessary to evaluate.

32 Repeated Measures Analysis of Variance (ANOVA) Examine Mauchly's test of Sphericity to determine if the homogeneity of variance assumption is met.

33 Repeated Measures Analysis of Variance (ANOVA) For Mauchly's test, if the p-value is significant (look under Sig.) then the assumption has been violated. This determines which values you interpret in the ANOVA table (Tests of Within-Subjects Effects). That is the case here, so use the values associated with the Huynh-Feldt correction.

34 Repeated Measures Analysis of Variance (ANOVA) Examine the Tests of Within-Subjects Effects table (ANOVA table) to determine the significance of your omnibus test.

35 Repeated Measures Analysis of Variance (ANOVA) When the sphericity assumption is not violated, you can interpret the top set of values (i.e., Sum of Squares, df, Mean Square, F, and p (Sig.)).

36 Repeated Measures Analysis of Variance (ANOVA) When the sphericity assumption is violated, you can interpret the values associated with the Huynh-Feldt correction. In this case, there is a significant difference in smoking rates across the time periods, F(1.31, 38.09) = 256.85, p < 0.001 (Huynh-Feldt). Since the results of the repeated measures ANOVA are significant, you will want to examine the post-hoc tests, using the Pairwise Comparisons table, to determine between which time periods the significant differences occur.

37 Repeated Measures Analysis of Variance (ANOVA) The Pairwise Comparisons table provides detailed information concerning the post-hoc results.

38 Repeated Measures Analysis of Variance (ANOVA) The table shows all possible comparisons between the four time periods. In the first row, the pre-intervention smoking rate is compared to the post-intervention smoking rate. The mean difference for this comparison is 25.1000 (i.e., the average post-intervention smoking rate, 5.4333, is subtracted from the average pre-intervention smoking rate, 30.5333). To determine whether this mean difference is statistically significant, examine the "Sig." column, which represents the p-value. The p-value is (p <) 0.001, suggesting that the groups are significantly different from one another.
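Each row of the Pairwise Comparisons table is just one mean subtracted from another, enumerated over every pair of time periods. A small sketch of that enumeration (the pre and post means are from the output above; the two follow-up means are made up for illustration):

```python
from itertools import combinations

# Time-period means: pre and post are from the printout; the follow-up
# values are hypothetical placeholders, not the ALA data.
means = {"pre": 30.5333, "post": 5.4333, "follow6": 4.2, "follow12": 3.5}

# Every row of the Pairwise Comparisons table is mean(i) - mean(j),
# over all unordered pairs of time periods.
rows = {(a, b): round(means[a] - means[b], 4)
        for a, b in combinations(means, 2)}
```

For four time periods there are six such comparisons, matching the six starred rows SPSS prints (each pair also appears once with the sign reversed).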

39 Repeated Measures Analysis of Variance (ANOVA) This is also supported by the 95% confidence interval, which indicates that zero is outside the bounds. Following this comparison, a comparison is made between the pre-intervention smoking rates and the smoking rates at the six-month follow-up, which shows a significant difference between the two groups, p < 0.001. You will notice that SPSS places a star next to mean difference scores that differ significantly. The remaining rows provide the results for the other comparisons.

40 Mixed Factorial ANOVA Do the smoking rates differ across the three types of smoking cessation program over time? That is, does one program lead to greater reductions in smoking rates among smokers?

41 Mixed Factorial ANOVA Analyze > General Linear Model > Repeated Measures

42 Mixed Factorial ANOVA Note the Reset button, used to remove redundant factors from the old analysis. In the Within Subject Factor Name box designate a name for the repeated measures factor. In this case, let's call it time.

43 Mixed Factorial ANOVA In the Number of Levels window type in the number of time periods measured. In this case it is 2.

44 Mixed Factorial ANOVA Click on Add. Click on Define.

45 Mixed Factorial ANOVA Having clicked on Define, the Repeated Measures box will appear.

46 Mixed Factorial ANOVA Highlight your first time variable, pre, from the list of variables on the left, and click on the upper arrow button to move it into the Within Subject Variables window.

47 Mixed Factorial ANOVA Add the remaining time variable, post, in the same fashion.

48 Mixed Factorial ANOVA Highlight program and click on the arrow button in front of the Between-Subjects Factor(s) box. Click on the Options button.

49 Mixed Factorial ANOVA Click on the square next to the word Descriptive. Highlight time in the Factor(s) and Factor Interaction box. Click on the arrow button and click on the square next to the words Compare main effects. Click on Continue to return.

50 Mixed Factorial ANOVA Click on the Post Hoc... button.

51 Mixed Factorial ANOVA Highlight program in the Factor(s) box. Click on the arrow button to move program to the "Post Hoc tests for:" box. Select a Tukey test by clicking on the box.

52 Mixed Factorial ANOVA Click on Continue to return to the Repeated Measures window. Click on OK to run the analysis. Syntax:
GLM pre post BY program
/WSFACTOR = time 2 Polynomial
/METHOD = SSTYPE(3)
/POSTHOC = program ( TUKEY )
/EMMEANS = TABLES(time) COMPARE ADJ(LSD)
/PRINT = DESCRIPTIVE
/CRITERIA = ALPHA(.05)
/WSDESIGN = time
/DESIGN = program.

53 Mixed Factorial ANOVA You can identify n, Mean, and Std for each of the two time periods across the three interventions using the descriptive statistics output.

54 Mixed Factorial ANOVA Examine the Tests of Within-Subjects Effects table (ANOVA table) to determine the significance of your omnibus test.

55 Mixed Factorial ANOVA The interaction effect should be examined first to determine if it is significant. In this case, the interaction effect is significant, F(1, 27) = 3.397, p = 0.05. (It should be noted that sphericity is always met for two levels of a repeated measures factor and it is, therefore, unnecessary to evaluate.) This suggests that there is a significant difference among the intervention programs across time. Since the interaction is significant, the main effect for time should not be interpreted. That completes the relevant ANOVA analysis.
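With only two levels of the within-subjects factor, the time × program interaction is equivalent to a one-way between-subjects ANOVA on the difference scores (post − pre): the interaction asks whether the amount of change differs between programs. A pure-Python sketch using made-up gain scores for three hypothetical programs (not the ALA data):

```python
from statistics import mean

def oneway_f(groups):
    """One-way between-subjects ANOVA F for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    gm = mean(all_scores)
    ss_between = sum(len(g) * (mean(g) - gm) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_b = len(groups) - 1
    df_w = len(all_scores) - len(groups)
    return (ss_between / df_b) / (ss_within / df_w), df_b, df_w

# Made-up post - pre difference scores, one list per program.
diffs = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
f, df_b, df_w = oneway_f(diffs)
```

A significant F here would mean the programs produced different amounts of change, which is exactly what the interaction term in the mixed ANOVA tests.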


57 Caution "Like elaborately plumed birds…we preen and strut and display our t-values." That was Edward Leamer's uncharitable description of his profession in 1983. Mr. Leamer, an economist at the University of California, Los Angeles, was frustrated by empirical economists' emphasis on measures of correlation over underlying questions of cause and effect, such as whether people who spend more years in school go on to earn more in later life. "Cause and defect", The Economist, 13 August 2009, p. 68: "Instrumental variables help to isolate causal relationships. But they can be taken too far."

58 Caution Hardly anyone, he wrote gloomily, "takes anyone else's data analyses seriously". To make his point, Mr. Leamer showed how different (but apparently reasonable) choices about which variables to include in an analysis of the effect of capital punishment on murder rates could lead to the conclusion that the death penalty led to more murders, fewer murders, or had no effect at all. "Let's take the con out of econometrics", Edward Leamer, American Economic Review, 73(1), March 1983.

59 Caution Confidence intervals have frequently been proposed as a more useful alternative to null hypothesis significance testing, and their use is strongly encouraged in the APA Manual (American Psychological Association, 2009, Publication Manual of the American Psychological Association (6th ed.), Washington, DC). The misunderstandings surrounding p-values and confidence intervals are particularly unfortunate because they constitute the main tools by which psychologists draw conclusions from data.

60 Caution Hoekstra, R., Morey, R., Rouder, J., & Wagenmakers, E. (2014). Robust misinterpretation of confidence intervals. Psychonomic Bulletin & Review, 21(5), 1157-1164. DOI: 10.3758/s13423-013-0572-3. See also "Reformers say psychologists should change how they report their results, but does anyone understand the alternative?", BPS Research Digest.

61 Caution Please mark each of the statements below "true" or "false". "False" means that the statement does not follow logically from the statement above. Also note that all, several, or none of the statements may be correct. (Hoekstra, R., Morey, R., Rouder, J., & Wagenmakers, E. (2014). Robust misinterpretation of confidence intervals. Psychonomic Bulletin & Review, 21(5), 1157-1164.)

62 Caution The 95% confidence interval for the mean ranges from 0.1 to 0.4!
1. The probability that the true mean is greater than 0 is at least 95%. [true/false]
2. The probability that the true mean equals 0 is smaller than 5%. [true/false]
3. The "null hypothesis" that the true mean equals 0 is likely to be incorrect. [true/false]
4. There is a 95% probability that the true mean lies between 0.1 and 0.4. [true/false]
5. We can be 95% confident that the true mean lies between 0.1 and 0.4. [true/false]
6. If we were to repeat the experiment over and over, then 95% of the time the true mean falls between 0.1 and 0.4. [true/false]

63-66 Caution Statements 1-4 assign probabilities to parameters or hypotheses, something that is not allowed within the frequentist framework.

67-68 Caution Statements 5 and 6 mention the boundaries of the confidence interval (i.e., 0.1 and 0.4), whereas a confidence interval can be used to evaluate only the procedure and not a specific interval.

69 Caution To sum up, all six statements are incorrect. Note that all six err in the same direction of wishful thinking.

70 Caution The 95% confidence interval for the mean ranges from 0.1 to 0.4! The correct statement, which was absent from the list, is the following:
7. If we were to repeat the experiment over and over, then 95% of the time the confidence intervals contain the true mean. [true/false]
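Statement 7 is a claim about the procedure, and it can be checked by simulation: repeat the experiment many times and count how often the computed interval contains the true mean. A hypothetical pure-Python setup (n draws from a normal with known sigma, so the z-based interval applies):

```python
import random
from statistics import NormalDist, mean

# Across repeated experiments, ~95% of the intervals contain the true mean.
random.seed(2)
true_mean, n, reps = 0.25, 25, 2000
half_width = NormalDist().inv_cdf(0.975) / n ** 0.5   # 1.96 * sigma / sqrt(n), sigma = 1

hits = 0
for _ in range(reps):
    xbar = mean(random.gauss(true_mean, 1) for _ in range(n))
    if xbar - half_width <= true_mean <= xbar + half_width:
        hits += 1

coverage = hits / reps   # close to 0.95
```

The 95% attaches to the long-run behaviour of the interval-building procedure, not to any single realised interval such as (0.1, 0.4).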

71 Caution Suppose you have a treatment that you suspect may alter performance on a certain task. You compare the means of your control and experimental groups (say, 20 subjects in each sample). Further, suppose you use a simple independent means t-test and your result is significant (t = 2.7, d.f. = 18, p = 0.01). Please mark each of the statements below as "true" or "false." "False" means that the statement does not follow logically from the above premises. Also note that several or none of the statements may be correct (Gigerenzer, 2004). Gigerenzer, G. (2004). Mindless statistics. The Journal of Socio-Economics, 33, 587-606.

72 Caution
1. You have absolutely disproved the null hypothesis (that is, there is no difference between the population means). [true/false]
2. You have found the probability of the null hypothesis being true. [true/false]
3. You have absolutely proved your experimental hypothesis (that there is a difference between the population means). [true/false]
4. You can deduce the probability of the experimental hypothesis being true. [true/false]
5. You know, if you decide to reject the null hypothesis, the probability that you are making the wrong decision. [true/false]
6. You have a reliable experimental finding in the sense that if, hypothetically, the experiment were repeated a great number of times, you would obtain a significant result on 99% of occasions. [true/false]

73 Caution Which statements are in fact true? Recall that a p-value is the probability of the observed data (or of more extreme data points), given that the null hypothesis H0 is true, defined in symbols as p(D|H0). This definition can be rephrased in a more technical form by introducing the statistical model underlying the analysis (Gigerenzer et al., 1989, chapter 3). Gigerenzer, G., Swijtink, Z., Porter, T., Daston, L., Beatty, J., Krüger, L., 1989. The Empire of Chance: How Probability Changed Science and Everyday Life. Cambridge University Press, Cambridge, UK.

74-75 Caution Statements 1 and 3 are easily detected as being false, because a significance test can never disprove the null hypothesis or prove the (undefined) experimental hypothesis. They are instances of the illusion of certainty (Gigerenzer, 2002). Gigerenzer, G., 2002. Calculated Risks: How to Know When Numbers Deceive You. Simon & Schuster, New York (UK edition: Reckoning with Risk: Learning to Live with Uncertainty. Penguin, London).

76-77 Caution Statements 2 and 4 are also false. The probability p(D|H0) is not the same as p(H0|D), and more generally, a significance test does not provide a probability for a hypothesis. The statistical toolbox, of course, contains tools that would allow estimating probabilities of hypotheses, such as Bayesian statistics.
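The gap between p(D|H0) and p(H0|D) can be made concrete with a toy Bayesian calculation. Everything here is hypothetical: a z-test with known sigma = 1, n = 20, an observed mean of 0.5, and two simple hypotheses, H0: mean = 0 and H1: mean = 0.5, given equal prior probability:

```python
from statistics import NormalDist

# Hypothetical z-test: n = 20 draws, sigma = 1 known, observed mean 0.5.
n, sigma, xbar = 20, 1.0, 0.5
se = sigma / n ** 0.5
z = xbar / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))     # p(D or more extreme | H0)

# Two simple hypotheses with equal priors: H0: mu = 0, H1: mu = 0.5.
like_h0 = NormalDist(0.0, se).pdf(xbar)
like_h1 = NormalDist(0.5, se).pdf(xbar)
post_h0 = like_h0 / (like_h0 + like_h1)          # p(H0 | D) under these priors
```

Here the p-value is about 0.025 while p(H0|D) is about 0.076: the null is rejected at the 5% level, yet its posterior probability is roughly three times the p-value. The two quantities answer different questions.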

78 78 Caution Statement 5 also refers to a probability of a hypothesis. This is because if one rejects the null hypothesis, the only possibility of making a wrong decision is if the null hypothesis is true. Thus, it makes essentially the same claim as Statement 2 does, and both are incorrect.

79 79 Caution Statement 6 amounts to the replication fallacy (Gigerenzer, 1993, 2000). Here, p = 1% is taken to imply that such significant data would reappear in 99% of the repetitions. Statement 6 could be made only if one knew that the null hypothesis was true. In formal terms, p(D|H0) is confused with 1 − p(D). Gigerenzer, G., 1993. The superego, the ego, and the id in statistical reasoning. In: Keren, G., Lewis, C. (Eds.), A Handbook for Data Analysis in the Behavioral Sciences: Methodological Issues. Erlbaum, Hillsdale, NJ, pp. 311–339. Gigerenzer, G., 2000. Adaptive Thinking: Rationality in the Real World. Oxford University Press, New York.
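The replication fallacy can also be demonstrated by simulation. The sketch below is my illustration, not part of the slides: it assumes an arbitrary true effect (0.6 standard deviations, 20 subjects per group) and, for simplicity, uses a two-group z-test with known unit variance rather than the t-test from the exercise. Even with a real effect present, the experiment comes out significant in far fewer than 99% of repetitions:

```python
# Illustrative sketch (assumed effect size and sample size): repeatedly
# run an experiment with a genuine effect and count how often a
# two-group z-test (sigma known = 1, for simplicity) is significant.
import math
import random

def z_test_significant(n, effect, alpha=0.05, rng=random):
    """One simulated experiment: two groups of size n, sigma = 1 each."""
    g1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    g2 = [rng.gauss(effect, 1.0) for _ in range(n)]
    diff = sum(g2) / n - sum(g1) / n
    se = math.sqrt(2.0 / n)  # standard error of the difference in means
    z = diff / se
    # Two-sided p-value from the standard normal CDF (via math.erf).
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return p < alpha

random.seed(1)
reps = 2000
hits = sum(z_test_significant(n=20, effect=0.6) for _ in range(reps))
print(f"significant in {100 * hits / reps:.0f}% of replications")
```

The proportion of significant replications is the test's power, which depends on the true effect size and sample size; a single small p-value tells you neither, so it cannot guarantee replication.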


81 81 SPSS Tips Now you should go and try it for yourself. Each week our cluster (5.05) is booked for two hours after this session, so you can come and go as you please. Obviously, other timetabled sessions for this module take precedence.


Download ppt "1 Smoking Data The investigation was based on examining the effectiveness of smoking cessation programs among heavy smokers who are also recovering alcoholics."

Similar presentations


Ads by Google