Chapter 11: The t Test for Two Related Samples

Repeated-Measures Designs
The related-samples hypothesis test allows researchers to evaluate the mean difference between two treatment conditions using the data from a single sample. In a repeated-measures design, a single group of individuals is obtained and each individual is measured in both of the treatment conditions being compared. Thus, the data consist of two scores for each individual.

Repeated-Measures Designs: Matched-Subjects Design
The related-samples t test can also be used for a similar design, called a matched-subjects design, in which each individual in one treatment is matched one-to-one with a corresponding individual in the second treatment. The matching is accomplished by selecting pairs of subjects so that the two subjects in each pair have identical (or nearly identical) scores on the variable that is being used for matching.

Matched-Subjects Design (cont’d.)
Thus, the data consist of pairs of scores, with each pair corresponding to a matched set of two "identical" subjects. For a matched-subjects design, a difference score is computed for each matched pair of individuals.
Matched-subjects design: 2 different samples → find the “matched” subject in each sample → form the “matched pair”.

Matched-Subjects Design (cont’d.)
However, because the matching process can never be perfect, matched-subjects designs are relatively rare. As a result, repeated-measures designs (using the same individuals in both treatments) make up the vast majority of related-samples studies.
Repeated-measures designs:
e.g. same individual → 2 treatments → 2 results (scores, samples)
e.g. scores from 2 different judges
e.g. before vs. after

The t Statistic for a Repeated-Measures Research Design
The repeated-measures t statistic allows researchers to test a hypothesis about the population mean difference between two treatment conditions using sample data from a repeated-measures research study. In this situation it is possible to compute a difference score for each individual: difference score = $D = X_2 - X_1$, where $X_1$ is the person’s score in the first treatment and $X_2$ is the score in the second treatment.

The t Statistic for a Repeated-Measures Research Design (cont’d.)
The sample of difference scores is used to test hypotheses about the population of difference scores. The null hypothesis states that the population of difference scores has a mean of zero: $H_0: \mu_D = 0$

The t Statistic for a Repeated-Measures Research Design (cont’d.)
In words, the null hypothesis ($H_0$) says that there is no consistent or systematic difference between the two treatment conditions. Note that the null hypothesis does not say that each individual will have a difference score equal to zero. Some individuals will show a positive change from one treatment to the other, and some will show a negative change.

Hypothesis Tests for the Repeated-Measures Design
On average, the entire population will show a mean difference of zero. Thus, according to the null hypothesis, the sample mean difference should be close to zero. Remember, the concept of sampling error states that samples are not perfect, so we should always expect small differences between a sample mean and the population mean.

Hypothesis Tests for the Repeated-Measures Design (cont’d.)
The alternative hypothesis states that there is a systematic difference between treatments that causes the difference scores to be consistently positive (or negative) and produces a non-zero mean difference between the treatments: $H_1: \mu_D \neq 0$
According to the alternative hypothesis, the sample mean difference obtained in the research study is a reflection of the true mean difference that exists in the population.

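Comparing Population Means: Hypothesis Testing with Dependent Samples
Use the following test when the samples are dependent:
$t = \dfrac{\bar{d}}{s_d / \sqrt{n}}$
where $\bar{d}$ is the mean of the differences, $s_d$ is the standard deviation of the differences, and $n$ is the number of pairs (differences).
In this chapter's notation the numerator is $M_D - \mu_D$ (with $\mu_D = 0$ under $H_0$) and the denominator is $s_{M_D}$.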

1. Repeated-measures vs. independent-measures: same vs. different individuals (in a repeated-measures design the same individuals are tested twice).
2. $M_D$ and $s_{M_D}$ (remember $n_1 = n_2 = n$): $D = X_2 - X_1$, $M_D = \Sigma D / n$, $s^2 = SS/(n-1)$, $s_{M_D} = s/\sqrt{n}$
3. The null hypothesis in words and in symbols: no systematic difference, or average difference = 0.

Ex 11.1 (p. 359): photo with white vs. red background
$n_1 = n_2 = n = 9$ males → $df = n - 1 = 8$
$H_1: \mu_D \neq 0$, $\alpha = 0.01$
Table 11.3: $M_D = \Sigma D / n$ = ?, $s^2 = SS/(n-1)$ = ?, $s_{M_D} = s/\sqrt{n}$ = ?, $t = (M_D - 0)/s_{M_D}$ = ?
$t^*$(0.01, df = 8) = ±3.355
Conclusion: ?

Hypothesis Tests for the Repeated-Measures Design (cont’d.)
The repeated-measures t statistic forms a ratio with exactly the same structure as the single-sample t statistic presented in Chapter 9. The numerator of the t statistic measures the difference between the sample mean and the hypothesized population mean: numerator = $M_D - \mu_D$ (e.g. p. 358).

Hypothesis Tests for the Repeated-Measures Design (cont’d.)
The bottom of the ratio is the standard error, $s_{M_D}$, which measures how much difference is reasonable to expect between a sample mean and the population mean if there is no treatment effect; that is, how much difference is expected simply by sampling error.
$t = \dfrac{\text{obtained difference}}{\text{standard error}} = \dfrac{M_D - \mu_D}{s_{M_D}}$, with $df = n - 1$

Hypothesis Tests for the Repeated-Measures Design (cont’d.)
For the repeated-measures t statistic, all calculations are done with the sample of difference scores. The mean for the sample appears in the numerator of the t statistic and the variance of the difference scores is used to compute the standard error in the denominator.

Hypothesis Tests for the Repeated-Measures Design (cont’d.)
As usual, the standard error is computed by:
$s_{M_D} = \sqrt{\dfrac{s^2}{n}}$ or $s_{M_D} = \dfrac{s}{\sqrt{n}}$
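The calculation described on the preceding slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `repeated_measures_t` and the four pairs of scores are hypothetical and do not come from the textbook, and the sketch assumes the difference scores are not all identical (otherwise the standard error is zero and t is undefined).

```python
import math

def repeated_measures_t(x1, x2, mu_d=0.0):
    """Return (M_D, s^2, s_MD, t, df) for paired scores x1 (treatment 1) and x2 (treatment 2)."""
    if len(x1) != len(x2) or len(x1) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    d = [b - a for a, b in zip(x1, x2)]        # D = X2 - X1 for each individual
    n = len(d)
    m_d = sum(d) / n                           # M_D = sum(D) / n
    ss = sum((di - m_d) ** 2 for di in d)      # SS of the difference scores
    s2 = ss / (n - 1)                          # sample variance of D
    s_md = math.sqrt(s2 / n)                   # standard error of M_D
    t = (m_d - mu_d) / s_md                    # repeated-measures t statistic
    return m_d, s2, s_md, t, n - 1

# Hypothetical scores for 4 individuals (illustration only)
before = [10, 12, 9, 14]
after = [13, 14, 13, 16]
m_d, s2, s_md, t, df = repeated_measures_t(before, after)
print(f"M_D = {m_d:.2f}, s_MD = {s_md:.3f}, t({df}) = {t:.3f}")
```

The resulting t is then compared with the critical value from a t table for df = n - 1.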

Measuring Effect Size for the Repeated-Measures t
Effect size for the repeated-measures t is measured in the same way that we measured effect size for the single-sample t and the independent-measures t. Specifically, you can compute an estimate of Cohen’s d to obtain a standardized measure of the mean difference, or you can compute $r^2$ to obtain a measure of the percentage of variance accounted for by the treatment effect.

Cohen’s d, $r^2$, and CI (p. 361)
estimated $d = M_D / s$
$r^2 = t^2 / (t^2 + df)$
confidence interval: $\mu_D = M_D \pm t \, s_{M_D}$

Ex (p. 362)
Ex 11.1 (cont.): $M_D = 3$, $s_{M_D} = 0.5$; find the 95% CI.
First, find the 95% critical t value: $t = \pm 2.306$ (df = 8).
CI: $M_D \pm t \, s_{M_D} = 3 \pm 2.306 \times 0.5 = 3 \pm 1.153 = (1.847, 4.153)$. The whole interval lies above 0 → meaning?
n ↑ → $s_{M_D}$ ↓ → CI width ↓
confidence level (%) ↑ → CI width ↑
∴ the CI is not a pure measure of effect size (∵ it changes with n and with the confidence level).
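As a quick check of the interval above, here is a small Python sketch. The helper name `paired_ci` is hypothetical, and the critical values 2.306 (95%) and 3.355 (99%) are taken from a standard t table for df = 8 rather than computed, so they are assumptions of the sketch.

```python
def paired_ci(m_d, s_md, t_crit):
    """Confidence interval for mu_D: M_D +/- t * s_MD."""
    margin = t_crit * s_md
    return m_d - margin, m_d + margin

# Example 11.1 (cont.): M_D = 3, s_MD = 0.5, df = 8
low95, high95 = paired_ci(3.0, 0.5, 2.306)   # 95% critical t from a t table
low99, high99 = paired_ci(3.0, 0.5, 3.355)   # 99% critical t from a t table

print(f"95% CI: {low95:.3f} to {high95:.3f}")
print(f"99% CI: {low99:.3f} to {high99:.3f}")
```

Raising the confidence level widens the interval even though the data are unchanged, which is why the interval width is not a pure measure of effect size.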

One-tailed test (p. 364)
Example 11.3 (from Example 11.1): $H_0: \mu_D \le 0$, $H_1: \mu_D > 0$, $\alpha = 0.01$
n = 9 → df = 8 → critical $t^* = 2.896$; reject $H_0$ if the estimated t > 2.896
SS = 18, $s^2 = SS/df = 18/8 = 2.25$, $s_{M_D} = \sqrt{s^2/n} = 0.5$
$t = (3 - 0)/0.5 = 6 > 2.896$ → reject $H_0$ → significant, i.e. p < 0.01

1. n = 4, acupuncture treatment to reduce back pain, $M_D = 4.5$, SS = 27, $\alpha = 0.05$
df = 3, $s^2 = 27/3 = 9$, s = 3, $s_{M_D} = 3/2 = 1.5$, $t = (4.5 - 0)/1.5 = 3$
a. 2-tailed test: $t^* = \pm 3.182$ → fail to reject $H_0$
b. 1-tailed test: $t^* = 2.353$ → reject $H_0$
2. Acupuncture case: Cohen’s d and $r^2$ = ?
$d = M_D / s = 4.5/3 = 1.5$
$r^2 = t^2/(t^2 + df) = 9/(9 + 3) = 0.75$
3. p = 0.021 for a repeated-measures t test:
a. $\alpha = 0.01$ → fail to reject → not significant
b. $\alpha = 0.05$ → reject → significant
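The arithmetic in the two worked examples (Example 11.3 and the acupuncture case) can be reproduced from summary statistics alone. The following Python sketch is illustrative; the function name `summary_stats` and the dictionary keys are hypothetical.

```python
import math

def summary_stats(m_d, ss, n, mu_d=0.0):
    """Compute t, Cohen's d, and r^2 from M_D, SS of the difference scores, and n."""
    df = n - 1
    s2 = ss / df                      # sample variance of D
    s = math.sqrt(s2)                 # sample standard deviation of D
    s_md = s / math.sqrt(n)           # standard error of M_D
    t = (m_d - mu_d) / s_md
    d = m_d / s                       # estimated Cohen's d
    r2 = t ** 2 / (t ** 2 + df)       # proportion of variance accounted for
    return {"df": df, "s2": s2, "s_md": s_md, "t": t, "d": d, "r2": r2}

example_11_3 = summary_stats(m_d=3.0, ss=18.0, n=9)   # M_D = 3, SS = 18, n = 9
acupuncture = summary_stats(m_d=4.5, ss=27.0, n=4)    # M_D = 4.5, SS = 27, n = 4

for name, res in [("Example 11.3", example_11_3), ("Acupuncture", acupuncture)]:
    print(f"{name}: t({res['df']}) = {res['t']:.2f}, d = {res['d']:.2f}, r^2 = {res['r2']:.3f}")
```

For the acupuncture data this gives t = 3, d = 1.5, and r² = 0.75, matching the hand calculation.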

11.4 Uses and Assumptions (p. 366)
Repeated-measures or independent-measures: which design? Advantages and disadvantages concern:
1. the number of subjects
2. studying changes over time
3. individual differences
Assumptions (p. 369):
1. The observations within each treatment are independent.
2. The population distribution of difference scores (D) is normal.

Repeated-Measures Versus Independent-Measures Designs
Because a repeated-measures design uses the same individuals in both treatment conditions, this type of design usually requires fewer participants than would be needed for an independent-measures design. In addition, the repeated-measures design is particularly well suited for examining changes that occur over time, such as learning or development.

Repeated-Measures Versus Independent-Measures Designs (cont’d.)
The primary advantage of a repeated-measures design, however, is that it reduces variance and error by removing individual differences. The first step in the calculation of the repeated-measures t statistic is to find the difference score for each subject.

Repeated-Measures Versus Independent-Measures Designs (cont’d.)
This simple process has two very important consequences:
–First, the D score for each subject provides an indication of how much difference there is between the two treatments. If all of the subjects show roughly the same D scores, then there appears to be a consistent, systematic difference between the two treatments. Also, note that when all the D scores are similar, the variance of the D scores will be small, which means that the standard error will be small and the t statistic is more likely to be significant.

Repeated-Measures Versus Independent-Measures Designs (cont’d.)
–Second, note that the process of subtracting to obtain the D scores removes the individual differences from the data. That is, the initial differences in performance from one subject to another are eliminated. Removing individual differences also tends to reduce the variance, which creates a smaller standard error and increases the likelihood of a significant t statistic. ($D_i$, where i indexes the individual)

Repeated-Measures Versus Independent-Measures Designs (cont’d.)
The following data demonstrate these points:

Subject   X1   X2   D
A          9   16   7
B         25   28   3
C         31   36   5
D         58   61   3
E         72   79   7

Repeated-Measures Versus Independent-Measures Designs (cont’d.)
First, notice that all of the subjects show an increase of roughly 5 points when they move from treatment 1 to treatment 2. Because the treatment difference is very consistent, the D scores are all clustered close together, which produces a very small value for $s^2$. This means that the standard error in the bottom of the t statistic will be very small.

Repeated-Measures Versus Independent-Measures Designs (cont’d.)
Second, notice that the original data show big differences from one subject to another. For example, subject B has scores in the 20s and subject E has scores in the 70s.
–These big individual differences are eliminated when the difference scores are calculated.
–Because the individual differences are removed, the D scores are usually much less variable than the original scores.
–Again, a smaller variance will produce a smaller standard error, which will increase the likelihood of a significant t statistic.
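To make the point concrete, the sketch below analyzes the five subjects from the table twice: once as a repeated-measures design (using the D scores) and once as if the two columns came from independent groups. The independent-measures calculation uses the pooled-variance formula from the previous chapter, which is an assumption of this sketch since this section does not restate it; the function names are hypothetical.

```python
import math

x1 = [9, 25, 31, 58, 72]    # treatment 1 scores for subjects A-E
x2 = [16, 28, 36, 61, 79]   # treatment 2 scores for subjects A-E

def mean(xs):
    return sum(xs) / len(xs)

def ss(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

def paired_t(a, b):
    d = [y - x for x, y in zip(a, b)]             # difference scores
    n = len(d)
    s_md = math.sqrt(ss(d) / (n - 1) / n)         # standard error from D scores only
    return mean(d) / s_md

def independent_t(a, b):
    # Pooled-variance t (assumed formula from the independent-measures chapter)
    pooled = (ss(a) + ss(b)) / (len(a) + len(b) - 2)
    se = math.sqrt(pooled / len(a) + pooled / len(b))
    return (mean(b) - mean(a)) / se

t_paired = paired_t(x1, x2)
t_indep = independent_t(x1, x2)
print(f"repeated-measures t(4) = {t_paired:.3f}")
print(f"independent-measures t(8) = {t_indep:.3f}")
```

The mean difference is 5 points in both analyses; only the standard error changes, because the repeated-measures version removes the large subject-to-subject differences.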

Repeated-Measures Versus Independent-Measures Designs (cont’d.)
Finally, you should realize that there are potential disadvantages to using a repeated-measures design instead of independent-measures. Because the repeated-measures design requires that each individual participate in more than one treatment, there is always the risk that exposure to the first treatment will cause a change in the participants that influences their scores in the second treatment (→ error).

Repeated-Measures Versus Independent-Measures Designs (cont’d.)
For example, practice in the first treatment may cause improved performance in the second treatment. Thus, the scores in the second treatment may show a difference, but the difference is not caused by the second treatment. When participation in one treatment influences the scores in another treatment, the results may be distorted by order effects; this can be a serious problem in repeated-measures designs.

Counterbalancing
One way to deal with time-related factors and order effects is to counterbalance the order of presentation of treatments: randomly divide the subjects into 2 groups, one receiving treatment 1 → treatment 2, the other receiving treatment 2 → treatment 1 (so prior experience helps the 2 treatments equally).
Another way to deal with this problem is to use an independent-measures or a matched-subjects design (each individual receives only one treatment and is measured only one time).
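The random split described above might look like the following Python sketch. The function name `counterbalance`, the group labels, and the subject identifiers are all hypothetical; the only idea taken from the slide is that subjects are randomly divided into two groups that receive the treatments in opposite orders.

```python
import random

def counterbalance(subjects, seed=None):
    """Randomly split subjects into two order groups of (nearly) equal size."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "treatment 1 then 2": shuffled[:half],
        "treatment 2 then 1": shuffled[half:],
    }

subject_ids = [f"S{i}" for i in range(1, 11)]   # 10 hypothetical subjects
groups = counterbalance(subject_ids, seed=1)
for order, members in groups.items():
    print(order, members)
```

Each subject still contributes one score per treatment, so the difference scores are computed exactly as before; only the order of testing varies between the groups.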

1. What are the assumptions for the repeated-measures t test? Independent observations; normally distributed difference scores.
2. In which situations should a repeated-measures design be used? When few subjects are available, when studying changes over time (before/after, learning/development), and when there is large variation between subjects (individuals).
3. Matched-subjects vs. repeated-measures? Similarity: individual differences are eliminated. Difference: 2 groups of individuals vs. 1 group of individuals.

Two different treatments, 10 scores for each treatment: how many subjects are needed?
a. Independent-measures design? 20
b. Repeated-measures design? 10
c. Matched-subjects design? 20

Repeated-Measures Versus Independent-Measures Designs
Examples from another textbook, with $H_0: \mu_1 = \mu_2$ (i.e. $\mu_D = 0$):
1. Treat the example as a case of 2 dependent samples.
2. Treat the example as a case of 2 independent samples.

Comparing Population Means: Hypothesis Testing with Dependent Samples – Example
Nickel Savings and Loan wishes to compare the two companies, Schadek and Bowyer, that it uses to appraise the value of residential homes. Nickel Savings selected a sample of 10 residential properties and scheduled both firms for an appraisal. The results, reported in $000, are shown in the table (right). At the .05 significance level, can we conclude there is a difference in the mean appraised values of the homes?

Comparing Population Means: Hypothesis Testing with Dependent Samples – Example
Step 1: State the null and alternate hypotheses. $H_0: \mu_d = 0$, $H_1: \mu_d \neq 0$
Step 2: State the level of significance. The .05 significance level is stated in the problem.
Step 3: Select the appropriate test statistic. To test the difference between two population means with dependent samples, we use the t statistic.

Comparing Population Means: Hypothesis Testing with Dependent Samples – Example
Step 4: State the decision rule.
Reject $H_0$ if $t > t_{\alpha/2,\, n-1}$ or $t < -t_{\alpha/2,\, n-1}$
$t > t_{.025,\, 9}$ or $t < -t_{.025,\, 9}$
$t > 2.262$ or $t < -2.262$

Comparing Population Means: Hypothesis Testing with Dependent Samples – Example
Step 5: Take a sample and make a decision. The computed value of t, 3.305, is greater than the higher critical value, 2.262, so our decision is to reject the null hypothesis.
Step 6: Interpret the result. The data indicate that there is a statistically significant difference in the property appraisals from the two firms. We would hope that appraisals of a property would be similar.
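The decision rule in Steps 4 and 5 can be written as a small Python helper. This is a sketch only: the function name `decide` and its `tail` argument are hypothetical, and the critical value is passed in from a t table rather than computed.

```python
def decide(t, t_crit, tail="two"):
    """Apply a t-test decision rule; t_crit is the positive table value."""
    if tail == "two":
        reject = abs(t) > t_crit          # reject in either tail
    elif tail == "upper":
        reject = t > t_crit               # H1: mu_d > 0
    elif tail == "lower":
        reject = t < -t_crit              # H1: mu_d < 0
    else:
        raise ValueError("tail must be 'two', 'upper', or 'lower'")
    return "reject H0" if reject else "do not reject H0"

# Appraisal example: computed t = 3.305, critical value 2.262 (df = 9, alpha = .05, two-tailed)
appraisal_decision = decide(3.305, 2.262)
print(appraisal_decision)
```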

Comparing Population Means: Hypothesis Testing with Dependent Samples – Excel Example: paired (repeated-measures) test

Dependent versus Independent Samples
How do we differentiate between dependent and independent samples?
–Dependent samples are characterized by a measurement followed by an intervention of some kind and then another measurement. This could be called a “before” and “after” study.
–Dependent samples are characterized by matching or pairing observations.
Why do we prefer dependent samples to independent samples?
–By using dependent samples, we are able to reduce the variation in the sampling distribution.

Comparing Population Means: Hypothesis Testing with Independent Samples – Example
Test $H_0: \mu_1 = \mu_2$, assuming $\sigma_1 = \sigma_2$. $\alpha$ = 5%, 2-tailed test, $df = n_1 + n_2 - 2 = 18$.
Critical value of the t test: ±2.101 → fail to reject $H_0$. This differs from the “dependent-sample test”. Why?
Independent-sample case: $s_{M_D}$ = [missing]
Dependent-sample case: $s_{M_D}$ = 1.392

Comparing Population Means: Hypothesis Testing with Independent Samples – Example (explained)
When a paired sample is treated as independent samples, the variance includes 2 different parts:
1. the variation between the two different companies → our target for comparison
2. the variation between different houses → not the target for comparison (or test)
→ the variance is inflated, or increased out of proportion

Comparing Population Means: Hypothesis Testing with Independent Samples – Excel Example

Another example
The federal government recently granted funds for a special program designed to reduce crime in high-crime areas. A study of the results of the program in eight high-crime areas of Miami, Florida, yielded the following results. Has there been a decrease in the number of crimes since the inauguration of the program? Use the .01 significance level. Estimate the p-value.

Another example (cont.)
Step 1: $H_0: \mu_d \le 0$, $H_1: \mu_d > 0$
Step 2: The 0.01 significance level was chosen.
Step 3: Use a t statistic with the standard deviation unknown for a paired sample.
Step 4: Reject $H_0$ if t > 2.998 (df = 7).
Step 5: $\bar{d}$ = [missing], $s_d$ = [missing], t = [missing]. Do not reject $H_0$.
Step 6: There has not been a decrease in the number of crimes. From the t table we estimate that the p-value is less than 0.05 but more than 0.025; using software we find the p-value is about [missing].

Independent vs. dependent samples

            independent (if n1 = n2 = n)   dependent (n pairs)
s_MD        [missing]                      [missing]
df          2n – 2                         n – 1