
1 Psych 5500/6500 The t Test for a Single Group Mean (Part 2): p Values One-Tail Tests Assumptions Fall, 2008

2 Reporting Results When a t test analysis is reported in the literature the t critical values and a curve with the rejection regions are rarely given. Usually there is not even an explicit statement about whether or not H0 was rejected. Instead, the first example from the previous lecture would commonly be reported like this: t(5) = 3.82, p=.012. The general format is: t(d.f.) = t obtained, p=...

3 p values Let’s take a closer look at what is reported: t(5) = 3.82, p=.012, or in generic terms, t(d.f.) = t obtained, p=... While the d.f. and the value of t obtained are of interest, the most important piece of information is the value of ‘p’, which tells you whether or not H0 was rejected.

4 1) Pragmatical Understanding of ‘p values’ The single most important thing to understand about a ‘p value’ is this: if p is less than or equal to your significance level then you reject H0; if it is greater than your significance level then you do not reject H0. t(5) = 3.82, p=.012 If our significance level was the usual .05, then we know that H0 was rejected. Even if you are unfamiliar with the statistical procedure, e.g. χ²(5) = 6.43, p=.06, you can still easily tell whether or not H0 was rejected simply by looking at the p value (here H0 is not rejected).

5 Examples In the following analyses H0 was rejected (assuming the significance level was set at .05): p=0.04, p<.05, p<.001, p=.05. In the following H0 was not rejected: p=0.07, p>.05, p=0.051. Giving a range for the p value (e.g. ‘p<.05’) used to be quite common because computing the exact value of p was difficult without a computer; now an exact value of p is usually provided.
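The decision rule behind these examples can be sketched in a couple of lines of Python (the function name `reject_h0` is mine, just for illustration):

```python
def reject_h0(p, alpha=0.05):
    """Reject H0 when p is less than or equal to the significance level."""
    return p <= alpha

# The examples above, with the significance level at .05:
print([reject_h0(p) for p in (0.04, 0.001, 0.05)])  # [True, True, True]
print([reject_h0(p) for p in (0.07, 0.051)])        # [False, False]
```

Note that p = .05 exactly still leads to rejection: the rule is "less than or equal to," not "less than."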

6 2) Conceptual Understanding of ‘p values’ The ‘p value’ is a probability; in this case it is a conditional probability: p(getting a result that far or further from what H0 predicted | H0 is true). Let’s go back and take a look at the t test examples from the previous lecture.

7 First Example (Sample Mean = 106) Note that the obtained value of t falls within the ‘reject H0’ region.

8 H0 stated that the mean of the population was 100, the sample mean we obtained was 106. The ‘p value’ in this case would be the probability of obtaining a sample mean that is 6 or more away from 100 (in either direction).

9 The p value is the probability of obtaining a sample mean of 106 (t=3.82) or greater, or a sample mean of 94 (t=-3.82) or less, if this curve (based upon H0) is correct. In this case p = .012 (.006 on each tail). How to compute that is covered next; for now note that as the reject H0 regions add up to .05, the p value of our result is obviously less than .05.

10 p=0.012 (.006 on each tail). The t table is not very useful for computing the p value here as the table covers only a small number of possible p values (you can’t look up t=3.82 and see what value of p goes with that). We have two tools we can use: 1) in the t tool I have written you can input the value of t and the df and the tool will give you the value of p; and 2) if you have SPSS analyze the data using the ‘One Sample t Test’, it will give you both the value of t and the value of p.
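If neither the t tool nor SPSS is at hand, the exact p can also be approximated by numerically integrating the t density. This is a minimal stdlib-only Python sketch (the function names are mine), not one of the two tools mentioned above:

```python
import math

def t_pdf(t, df):
    """Student's t probability density with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.gamma(df / 2) * math.sqrt(df * math.pi))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(t_obt, df, upper=60.0, steps=100_000):
    """Tail area beyond |t_obt| (trapezoid rule), doubled for a two-tail test."""
    a = abs(t_obt)
    h = (upper - a) / steps
    area = 0.5 * (t_pdf(a, df) + t_pdf(upper, df))
    for i in range(1, steps):
        area += t_pdf(a + i * h, df)
    return 2 * area * h

print(round(two_tailed_p(3.82, 5), 3))  # ≈ 0.012, matching the first example
```

Running the same function on the second example, two_tailed_p(1.27, 5), gives roughly .25, again matching the slides.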

11 Second Example (Sample Mean = 102) Note the obtained value of t falls within the ‘do not reject H0’ region.

12 H0 stated that the mean of the population was 100, the sample mean we obtained was 102. The ‘p value’ in this case would be the probability of obtaining a sample mean that is 2 or more away from 100 (in either direction).

13 The p value is the probability of obtaining a sample mean of 102 (t=1.27) or greater, or a sample mean of 98 (t=-1.27) or less, if this curve (based upon H0) is correct. Note that p must be greater than .05 here; in this case p=.25 (.125 on each tail).

14 Important! Note again that the ‘p value’ is the probability of the obtained sample mean given that H0 is true; it is not the probability that H0 is true given the sample mean we obtained. This is a common mistake.

15 ‘A’ = getting a sample mean that far from what H0 predicted. ‘B’ = H0 being true. p value = p(A|B). It would be nice if it were p(B|A), or in other words if it were the probability that H0 is true given our sample mean, because that is what we really want to know. To find the p(B|A), however, we would have to turn to Bayes Theorem (we will take a look at that again later in the semester).

16 3) Mechanical Understanding of ‘p values’ Looking back at the graphs with the rejection regions and the p values, we can see that another way to understand a p value is that it is what our significance level would have to be in order to reject H0. Thus p=.04 means that we could have set the significance level at .04 (rather than .05) and still have rejected H0, while p=.06 would mean we would have had to set our significance level at .06 (which is not allowed) in order to reject H0.

17 So What is Our Significance Level Really? We have to decide upon a significance level before we run an analysis (we have to set up our decision-making criteria before we look at the results), and so we set our significance level to .05. Say we then analyze the data and find that p=.03 (this is often reported as ‘being significant at the .03 level’). Is our significance level .05 or .03? Authors often write as if it were .03.

18 My Answer The significance level is .05 and α=.05. We would have rejected H0 if p=.05 or p=.049; that is our criterion. After the fact we see that we could have set the significance level at .03 and still have rejected H0, but our decision-making criterion was .05, and thus that was the actual probability of making a Type I error if H0 were true.

19 On ‘Significance’ Dictionary definition of significance: ‘Full of meaning, important, momentous’. Statistical significance: ‘We were able to reject H0 (i.e. the results were unlikely if H0 were true)’. When p is less than or equal to .05 we say the results were ‘statistically significant’; this does not necessarily mean the results are very meaningful or important or momentous. It does mean that we can conclude that H0 is probably false, which may or may not be important.

20 Example Let’s go back to: H 0 : μ Elbonia = 100 (same as USA) H A : μ Elbonia ≠ 100 (different from the USA) Let’s say that with a sample of N=10,000 we obtain a sample mean of 101. As we will see in the lecture on power, with such a large N even such a small difference is likely to lead to rejection of H0.

21 The results are ‘statistically significant’ because we rejected H0. Whether the results are ‘theoretically significant’ would depend upon whether rejecting H0 sheds important light on the theory that predicted Elbonians would have an IQ other than 100. Whether the results are ‘socially significant’ would depend upon whether it significantly adds to our world to know that Elbonians are just a little tiny bit smarter on average than people in the USA.

22 Bottom Line Statistical significance is important because it is a prerequisite to making something out of the analysis. Whether that something is important or trivial is beyond the scope of statistics. People sometimes lose track of that, and begin to believe that getting statistically significant results is an end in itself.

23 Final Point about Significance (Really...I Promise) One view on statistical significance is that results are either statistically significant or they are not. To say something ‘neared significance’ or was ‘almost significant’ is meaningless (like saying someone is almost pregnant) at best and violates the logic of null hypothesis testing at worst. Others argue that .05 is arbitrary, and so a result of .06 is just about as good and probably reflects that the null hypothesis is wrong. More on this later in the semester when we tackle the controversy surrounding null hypothesis testing.

24 However, it seems like both sides agree that if the results are statistically significant (i.e. p ≤ .05) then it makes sense to say that a p=.001 is more convincing than a p=.049 (following along with the pregnancy metaphor: if you are pregnant you can either be at the beginning of the pregnancy or well advanced).

25 Segue Recall that when we run an experiment we are testing a theory that makes some prediction, and that prediction becomes our alternative hypothesis (HA). So far our examples have involved examining a theory which predicts a difference (i.e. μ Elbonia ≠ 100) but does not predict in which direction that difference will go (whether μ Elbonia will be less than or greater than 100). Such ‘nondirectional’ hypotheses are examined by putting a reject H0 region on both tails of the sampling distribution, and are thus called ‘two-tailed tests’.

26 One-tailed Tests When the theory we are testing specifically predicts in which direction the difference should go (i.e. we are testing a ‘directional’ hypothesis), then we perform a ‘one-tailed test’. First we will look at how to perform the t test when the theory predicts that Elbonians should have higher IQ’s than people in the USA.

27 Writing the Hypotheses If the theory predicts that the mean score of Elbonians is greater than 100: H 0 : μ Elbonia ≤ 100 H A : μ Elbonia > 100 Notes: 1. H A always states what the theory predicts. 2. Conceptually H 0 is still the hypothesis of ‘no difference’, but it needs to be written the way it is to ensure that the two hypotheses are mutually exclusive and exhaustive.

28 Setting up the Rejection Region Note that the full .05 is in one tail now, making it easier to reject H0 if the results go in that direction and impossible to reject H0 if the results go in the other direction. The new tc value is 2.015; for the two-tail test it was 2.571.

29 Determining Which Tail Gets the Rejection Region H 0 : μ Elbonia ≤ 100 H A : μ Elbonia > 100 1. The conceptual approach: If H A is correct we will want to ‘reject H0’, so put the reject H0 region where H A predicts the results will fall (in this case above 100). 2. The idiot-proof approach: pretend that the symbol in H A is an arrow; it points to the tail with the rejection region.

30 Decision

31 p Value With a one-tail test the p value is the area of the curve from the t obt value to the end of the tail that has the rejection region.

32 p Value The p value is the probability of getting a result that far from what H0 predicted if H0 is true. We can see that the p value must be less than .05; actually it is p=.006.

33 Another Example Now let’s see how to set things up if we are testing a theory which specifically predicts that the mean IQ of Elbonians is less than 100.

34 Writing Hypotheses If the theory predicts that the mean score of Elbonians is less than 100: H 0 : μ Elbonia ≥ 100 H A : μ Elbonia < 100 Again, H A expresses the prediction made by the theory, while H 0 is every other possibility.

35 Setting Up the Rejection Region

36 Which Tail? H 0 : μ Elbonia ≥ 100 H A : μ Elbonia < 100 1. The conceptual approach: If H A is correct we will want to ‘reject H0’, so put the reject H0 region where H A predicts the results will fall (in this case below 100). 2. The idiot-proof approach: pretend that the symbol in H A is an arrow; it points to the tail with the rejection region.

37 Decision

38 p Value With a one-tail test the p value is the area of the curve from the t obt value to the end of the tail that has the rejection region.

39 p Value The p value is the probability of getting a result that far from what H0 predicted if H0 is true. We can see that the p value must be greater than .05; actually it is p=.994.

40 The Trade-Off Doing a one-tail test is a bit of a gamble; if the theory is correct in its prediction then it is easier to reject H0 in favor of HA (i.e. the theory) because all .05 is put in the tail the theory predicts. If the results go in the opposite direction from that predicted by the theory, however, then it is impossible to reject H0 no matter how large the difference is.

41 Justifications for Making a Test One-Tailed Bottom line: you have to decide that the test is one-tailed before you obtain your data. 1. Refer to results from prior, similar experiments. 2. You are testing a theory which specifically predicts which way the results should go. 3. Only one tail is of interest.

42 SPSS Analysis The t test for a single group mean can be found in the SPSS: ‘Analyze>>Compare Means>>One-Sample T Test...’ menu. For the value of the ‘Test Value’ input the population mean according to H0.

43 SPSS Output

44 SPSS Analysis (cont.) When SPSS does a ‘One Sample T Test’ analysis it will give a value of ‘t’ and a value of ‘p’. The value of t is for t obtained. The value of p is for a two-tailed test. If you want the p value for a one-tailed test do the following: First, determine whether the difference between the sample mean and the population mean proposed by H0 was in the direction predicted by HA. Second, if the direction was predicted by HA then the actual p value = (SPSS p value)/2; if the direction was the opposite of that predicted by HA then the actual p value = 1 – (SPSS p value)/2. In the former case (HA made the correct prediction) the p value goes down by half; in the latter case (HA made the wrong prediction) the p value is greater than .50.
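That two-step conversion can be sketched as a tiny helper (the function and parameter names are mine, not SPSS’s):

```python
def one_tailed_p(spss_two_tailed_p, direction_predicted):
    """Convert SPSS's two-tailed p into a one-tailed p.

    direction_predicted: True if the sample mean fell on the side of the
    H0 value that HA predicted, False if it fell on the opposite side.
    """
    if direction_predicted:
        return spss_two_tailed_p / 2       # HA predicted correctly: halve p
    return 1 - spss_two_tailed_p / 2       # wrong direction: p is above .50

print(one_tailed_p(0.012, True))   # 0.006
print(one_tailed_p(0.012, False))  # ≈ 0.994
```

These two outputs match the one-tailed p values from the Elbonia examples above (.006 when HA correctly predicted the direction, .994 when it did not).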

45 SPSS Analysis (cont.) The confidence interval given by SPSS in this analysis is the interval that has the stated probability (95% unless you indicate otherwise) of containing the true value of the difference between the mean of the population from which the sample was drawn and the mean of the population as stated by H0, i.e. the true value of μ − μH0.

46 SPSS Analysis (cont.) For the example where the sample mean was 106, the SPSS output states that the 95% confidence interval of the difference between the mean IQ of Elbonians and the value proposed by H0 is: 1.96 ≤ Difference ≤ 10.04. As that interval does not contain ‘0’ we reject H0.
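That interval can be reproduced by hand from the example’s numbers. This sketch assumes the first example’s values: sample mean 106, H0 mean 100, t obtained 3.82, and the two-tailed critical t of 2.571 for df = 5:

```python
mean_diff = 106 - 100           # sample mean minus the mean proposed by H0
se = mean_diff / 3.82           # standard error recovered from t obtained
t_crit = 2.571                  # two-tailed critical t, df = 5, alpha = .05
lower = mean_diff - t_crit * se
upper = mean_diff + t_crit * se
print(round(lower, 2), round(upper, 2))  # 1.96 10.04
print(lower <= 0 <= upper)               # False, so H0 is rejected
```

Because t obtained exceeded the critical value, the margin of error (t_crit × se) is smaller than the mean difference, which is exactly why the interval excludes 0 and why the CI decision always agrees with the two-tailed t test decision.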

47 Assumptions In all of the statistical tests we look at we will be assuming that we have a valid measurement procedure and that there is no systematic bias in our samples.

48 Assumptions Underlying This t Test The t test for a single group mean has these additional assumptions: 1.That the scores in the population are normally distributed. 2.That the scores are independent.

49 Assumption of Normality The t test is said to be ‘robust’ in terms of this assumption, which means that the population can be fairly non-normal without it having much of an effect on the validity of the analysis, particularly when the N of our sample is large (which will influence the shape of the SDM towards normality thanks to the Central Limit Theorem). But some deviations from normality can create problems. We will take a closer look at this assumption in an upcoming lecture.

50 Assumption of Independence The assumption of independence of scores means that any one person’s score is unaffected by (can’t be predicted by) anyone else’s score. This would be violated if we measured the same person twice, or if we let two Elbonians work together on the IQ test.

51 Independence of Errors The ‘independence of the scores’ is sometimes referred to as the ‘independence of errors’. Calling it the ‘independence of errors’ will make more sense within the context of the types of analyses we will learn next semester, and we will take a look at what it means at that time. For now, just think of independence in terms of any one score not influencing (i.e. not being able to predict) any other score in the sample.