T-TESTS AND ANALYSIS OF VARIANCE, Jennifer Kensler


1 T-TESTS AND ANALYSIS OF VARIANCE, Jennifer Kensler

2 Laboratory for Interdisciplinary Statistical Analysis (LISA)
LISA helps VT researchers benefit from the use of Statistics: Experimental Design, Data Analysis, Interpreting Results, Grant Proposals, Software (R, SAS, JMP, SPSS...).
Collaboration: From our website request a meeting for personalized statistical advice. Great advice right now: Meet with LISA before collecting your data.
Short Courses: Designed to help graduate students apply statistics in their research.
Walk-In Consulting: Monday to Friday* 12-2PM for questions requiring <30 mins (*Monday to Thursday during the summer).
All services are FREE for VT researchers. We assist with research, not class projects or homework.
www.lisa.stat.vt.edu

3 T-TESTS AND ANALYSIS OF VARIANCE

4 ONE SAMPLE T-TEST

5 The one sample t-test is used to test whether the population mean is different from a specified value. Example: Is the mean height of 12-year-old girls greater than 60 inches?

6 STEP 1: FORMULATE THE HYPOTHESES
The population mean is not equal to a specified value: null hypothesis $H_0: \mu = \mu_0$, alternative hypothesis $H_a: \mu \neq \mu_0$.
The population mean is greater than a specified value: $H_0: \mu = \mu_0$, $H_a: \mu > \mu_0$.
The population mean is less than a specified value: $H_0: \mu = \mu_0$, $H_a: \mu < \mu_0$.

7 STEP 2: CHECK THE ASSUMPTIONS
The sample is random.
Either the population from which the sample is drawn is normal or the sample size is large.

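8 STEPS 3-5
Step 3: Calculate the test statistic: $t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$, where $\bar{x}$ is the sample mean, $s$ is the sample standard deviation, and $n$ is the sample size.
Step 4: Calculate the p-value based on the appropriate alternative hypothesis.
Step 5: Write a conclusion.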
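
As a rough illustration of Step 3, the sketch below computes the one sample t statistic and its degrees of freedom using only Python's standard library. The function name `one_sample_t` and the sample values are hypothetical (they are not the iris measurements used in the talk). The p-value in Step 4 would still have to come from a t distribution with n - 1 degrees of freedom, for example from JMP.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for a one sample t-test of H0: mu = mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)   # sample mean
    s = statistics.stdev(sample)      # sample standard deviation (n - 1 denominator)
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical sepal widths in cm, made up for illustration only.
widths = [3.0, 3.4, 3.1, 3.6, 2.9, 3.2, 3.3, 3.5]
t_stat, df = one_sample_t(widths, mu0=3.5)
print(f"t = {t_stat:.3f}, df = {df}")
```

A large absolute value of t relative to the t distribution with `df` degrees of freedom corresponds to a small two-sided p-value.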

9 IRIS EXAMPLE
A researcher would like to know whether the mean sepal width of a variety of irises is different from 3.5 cm. The researcher randomly selects 50 irises and measures the sepal width.
Step 1: Hypotheses: $H_0: \mu = 3.5$ cm, $H_a: \mu \neq 3.5$ cm
Data: http://en.wikipedia.org/wiki/Iris_flower_data_set

10 JMP
Steps 2-4: JMP Demonstration: Analyze → Distribution; Y, Columns: Sepal Width; Normal Quantile Plot; Test Mean; Specify Hypothesized Mean: 3.5

11 JMP OUTPUT
Step 5 Conclusion: The mean sepal width is not significantly different from 3.5 cm.

12 TWO SAMPLE T-TEST

13 TWO SAMPLE T-TEST
Two sample t-tests are used to determine whether the population mean of one group is equal to, larger than, or smaller than the population mean of another group. Example: Is the mean cholesterol of people taking drug A lower than the mean cholesterol of people taking drug B?

14 STEP 1: FORMULATE THE HYPOTHESES
The population means of the two groups are not equal: $H_0: \mu_1 = \mu_2$, $H_a: \mu_1 \neq \mu_2$.
The population mean of group 1 is greater than the population mean of group 2: $H_0: \mu_1 = \mu_2$, $H_a: \mu_1 > \mu_2$.
The population mean of group 1 is less than the population mean of group 2: $H_0: \mu_1 = \mu_2$, $H_a: \mu_1 < \mu_2$.

15 STEP 2: CHECK THE ASSUMPTIONS
The two samples are random and independent.
Either the populations from which the samples are drawn are normal or the sample sizes are large.
The populations have the same standard deviation.

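16 STEPS 3-5
Step 3: Calculate the test statistic $t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, where $s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$ is the pooled standard deviation, and $\bar{x}_i$, $s_i$, and $n_i$ are the sample mean, sample standard deviation, and sample size of group $i$.
Step 4: Calculate the appropriate p-value.
Step 5: Write a conclusion.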
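
The pooled two sample t statistic can be sketched in a few lines of standard-library Python. This assumes the equal standard deviation condition from Step 2. The function name `pooled_two_sample_t` and the data are hypothetical and serve only to show the arithmetic.

```python
import math
import statistics


def pooled_two_sample_t(sample1, sample2):
    """Return (t, df) for the equal-variance (pooled) two sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)   # s1 squared
    var2 = statistics.variance(sample2)   # s2 squared
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / std_error
    return t, n1 + n2 - 2


# Hypothetical measurements for two groups, made up for illustration only.
group1 = [3.4, 3.6, 3.2, 3.8, 3.5]
group2 = [2.8, 3.0, 2.6, 2.9, 2.7]
t_stat, df = pooled_two_sample_t(group1, group2)
print(f"t = {t_stat:.3f}, df = {df}")
```

If the equal standard deviation assumption is doubtful, an unpooled (Welch) version of the test is the usual alternative; it is not shown here.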

17 TWO SAMPLE EXAMPLE
A researcher would like to know whether the mean sepal width of setosa irises is different from the mean sepal width of versicolor irises. The researcher randomly selects 50 setosa irises and 50 versicolor irises and measures their sepal widths.
Step 1: Hypotheses: $H_0: \mu_{setosa} = \mu_{versicolor}$, $H_a: \mu_{setosa} \neq \mu_{versicolor}$
http://en.wikipedia.org/wiki/Iris_flower_data_set
http://en.wikipedia.org/wiki/Iris_versicolor

18 JMP
Steps 2-4: JMP Demonstration: Analyze → Fit Y By X; Y, Response: Sepal Width; X, Factor: Species; Means/ANOVA/Pooled t; Normal Quantile Plot → Plot Actual by Quantile

19 JMP OUTPUT
Step 5 Conclusion: There is strong evidence (p-value < 0.0001) that the mean sepal widths for the two varieties are different.

20 PAIRED T-TEST

21 PAIRED T-TEST
The paired t-test is used to compare the population means of two groups when the samples are dependent. Example: A researcher would like to determine if background noise causes people to take longer to complete math problems. The researcher gives 20 subjects two math tests, one with complete silence and one with background noise, and records the time each subject takes to complete each test.

22 STEP 1: FORMULATE THE HYPOTHESES
The population mean difference is not equal to zero: $H_0: \mu_{difference} = 0$, $H_a: \mu_{difference} \neq 0$.
The population mean difference is greater than zero: $H_0: \mu_{difference} = 0$, $H_a: \mu_{difference} > 0$.
The population mean difference is less than zero: $H_0: \mu_{difference} = 0$, $H_a: \mu_{difference} < 0$.

23 STEP 2: CHECK THE ASSUMPTIONS
The sample is random.
The data are matched pairs.
The differences have a normal distribution or the sample size is large.

24 STEPS 3-5
Step 3: Calculate the test statistic: $t = \frac{\bar{d}}{s_d / \sqrt{n}}$, where $\bar{d}$ is the mean of the differences, $s_d$ is the standard deviation of the differences, and $n$ is the number of pairs.
Step 4: Calculate the p-value.
Step 5: Write a conclusion.
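
A paired t-test is a one sample t-test on the within-pair differences, which the following standard-library sketch makes explicit. The function name `paired_t` and the before/after values are hypothetical; they are not the flexibility data from the example.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t, df) for a paired t-test of H0: mean(after - before) = 0."""
    if len(before) != len(after):
        raise ValueError("before and after must contain the same number of values")
    diffs = [a - b for a, b in zip(after, before)]   # one difference per pair
    n = len(diffs)
    d_bar = statistics.mean(diffs)    # mean of the differences
    s_d = statistics.stdev(diffs)     # standard deviation of the differences
    return d_bar / (s_d / math.sqrt(n)), n - 1


# Hypothetical measurements on six subjects, made up for illustration only.
before = [18.5, 21.5, 16.5, 21.0, 20.0, 15.0]
after = [19.0, 23.0, 17.5, 21.0, 22.0, 16.0]
t_stat, df = paired_t(before, after)
print(f"t = {t_stat:.3f}, df = {df}")
```

Taking the differences first is what removes subject-to-subject variation from the comparison.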

25 PAIRED T-TEST EXAMPLE
A researcher would like to determine whether a fitness program increases flexibility. The researcher measures the flexibility (in inches) of 12 randomly selected participants before and after the fitness program.
Step 1: Formulate the Hypotheses: $H_0: \mu_{After - Before} = 0$, $H_a: \mu_{After - Before} > 0$

26 PAIRED T-TEST EXAMPLE
Steps 2-4: JMP Analysis: Create a new column of After - Before; Analyze → Distribution; Y, Columns: After - Before; Normal Quantile Plot; Test Mean; Specify Hypothesized Mean: 0

27 JMP OUTPUT
Step 5 Conclusion: There is no evidence that the fitness program increases flexibility.

28 ONE-WAY ANALYSIS OF VARIANCE

29 ONE-WAY ANOVA
ANOVA is used to determine whether three or more populations have different means.
[Figure: responses for medical treatments A, B, and C]

30 ANOVA STRATEGY
The first step is to use the ANOVA F test to determine if there are any significant differences among the population means. If the ANOVA F test shows that the population means are not all the same, then follow-up tests can be performed to see which pairs of population means differ.

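31 ONE-WAY ANOVA MODEL
$Y_{ij} = \mu_i + \varepsilon_{ij}$, where $Y_{ij}$ is the $j$th observation in group $i$, $\mu_i$ is the mean of group $i$, and $\varepsilon_{ij}$ is random variation.
In other words, for each group the observed value is the group mean plus some random variation.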

32 ONE-WAY ANOVA HYPOTHESIS
Step 1: We test whether there is a difference in the population means: $H_0: \mu_1 = \mu_2 = \dots = \mu_k$ versus $H_a$: the $\mu_i$ are not all equal.

33 STEP 2: CHECK ANOVA ASSUMPTIONS
The samples are random and independent of each other.
The populations are normally distributed.
The populations all have the same standard deviation.
The ANOVA F test is robust to moderate departures from the assumptions of normality and equal standard deviations.

34 STEP 3: ANOVA F TEST
Compare the variation within the samples to the variation between the samples.
[Figure: two plots of responses for medical treatments A, B, and C]

35 ANOVA TEST STATISTIC
$F = \frac{\mathrm{MSG}}{\mathrm{MSE}}$
Variation within groups small compared with variation between groups → large F.
Variation within groups large compared with variation between groups → small F.

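36 MSG
The mean square for groups, MSG, measures the variability of the sample averages: $\mathrm{MSG} = \frac{\mathrm{SSG}}{k - 1}$, where $k$ is the number of groups. SSG stands for the sum of squares for groups, $\mathrm{SSG} = \sum_{i=1}^{k} n_i (\bar{y}_i - \bar{y})^2$, with $n_i$ and $\bar{y}_i$ the size and mean of group $i$ and $\bar{y}$ the overall mean.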

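37 MSE
Mean square error, MSE, measures the variability within the groups: $\mathrm{MSE} = \frac{\mathrm{SSE}}{N - k}$, where $N$ is the total number of observations and $k$ is the number of groups. SSE stands for the sum of squares for error, $\mathrm{SSE} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2$, with $\bar{y}_i$ the mean of group $i$.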
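
To make the roles of MSG and MSE concrete, here is a minimal standard-library sketch that computes both mean squares and the F statistic for a list of groups. The function name `one_way_anova_f` and the three small groups are hypothetical; the numbers are chosen only so that the arithmetic is easy to follow by hand.

```python
import statistics


def one_way_anova_f(groups):
    """Return (MSG, MSE, F) for a one-way ANOVA on a list of groups."""
    k = len(groups)                              # number of groups
    n_total = sum(len(g) for g in groups)        # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: spread of the group means around the grand mean.
    ssg = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean.
    sse = sum((y - statistics.mean(g)) ** 2 for g in groups for y in g)
    msg = ssg / (k - 1)
    mse = sse / (n_total - k)
    return msg, mse, msg / mse


# Hypothetical responses for three treatments, made up for illustration only.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
msg, mse, f_stat = one_way_anova_f(groups)
print(f"MSG = {msg}, MSE = {mse}, F = {f_stat}")
```

Here the group means (5, 7, 9) are far apart relative to the spread inside each group, so F is large. The p-value would come from an F distribution with k - 1 and N - k degrees of freedom.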

38 STEPS 4-5
Step 4: Calculate the p-value.
Step 5: Write a conclusion.

39 ANOVA EXAMPLE
A researcher would like to determine if three drugs provide the same relief from pain. Sixty patients are randomly assigned to a treatment (20 people in each treatment).
Step 1: Formulate the Hypotheses: $H_0: \mu_{Drug A} = \mu_{Drug B} = \mu_{Drug C}$, $H_a$: the $\mu_i$ are not all equal.

40 STEPS 2-4
JMP demonstration: Analyze → Fit Y By X; Y, Response: Pain; X, Factor: Drug; Normal Quantile Plot → Plot Actual by Quantile; Means/ANOVA

41 JMP OUTPUT AND CONCLUSION
Step 5 Conclusion: There is strong evidence that the mean pain level is not the same for all three drugs.

42 FOLLOW-UP TEST
The p-value of the overall F test indicates that the mean level of pain is not the same for patients taking drugs A, B and C. We would like to know which pairs of treatments are different. One method is to use Tukey's HSD (honestly significant difference).

43 TUKEY TESTS
Tukey's test simultaneously tests $H_0: \mu_i = \mu_j$ versus $H_a: \mu_i \neq \mu_j$ for all pairs of factor levels. Tukey's HSD controls the overall type I error rate.
JMP demonstration: Oneway Analysis of Pain By Drug → Compare Means → All Pairs, Tukey HSD
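
The sketch below is not Tukey's procedure itself. It only illustrates why a method that controls the overall type I error is needed: it lists every pair of drugs and shows how the chance of at least one false positive grows if each pair is tested separately at level 0.05. The calculation assumes the pairwise tests are independent, which is a simplification, and the 0.05 level is an assumed value.

```python
from itertools import combinations

drugs = ["A", "B", "C"]
pairs = list(combinations(drugs, 2))   # every pair of factor levels

alpha = 0.05   # assumed per-comparison significance level
# Probability of at least one false positive across all comparisons,
# under the simplifying assumption that the tests are independent.
familywise_error = 1 - (1 - alpha) ** len(pairs)

print(pairs)
print(f"{len(pairs)} comparisons, familywise error about {familywise_error:.3f}")
```

With k groups there are k(k - 1)/2 pairs, so the inflation gets worse quickly as the number of groups increases.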

44 JMP OUTPUT
The JMP output shows that drugs A and C are significantly different.

45 TWO-WAY ANALYSIS OF VARIANCE

46 TWO-WAY ANOVA
We are interested in the effect of two categorical factors on the response. We are interested in whether either of the two factors has an effect on the response and whether there is an interaction effect. An interaction effect means that the effect on the response of one factor depends on the level of the other factor.

47 INTERACTION
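
One informal way to see an interaction is to compare the effect of one factor at each level of the other. The sketch below does this with a table of cell means for a two-level factor A and a three-level factor B. All names and numbers are hypothetical, and comparing cell means this way is only descriptive; the formal test is the interaction F test in the two-way ANOVA.

```python
# Hypothetical cell means for a 2 x 3 design (made-up numbers).
# Keys are (level of factor A, level of factor B).
cell_means = {
    ("a1", "b1"): 10.0, ("a1", "b2"): 12.0, ("a1", "b3"): 14.0,
    ("a2", "b1"): 11.0, ("a2", "b2"): 15.0, ("a2", "b3"): 20.0,
}


def effect_of_a_by_b(means, b_levels):
    """Difference (a2 - a1) in mean response at each level of factor B."""
    return {b: means[("a2", b)] - means[("a1", b)] for b in b_levels}


effects = effect_of_a_by_b(cell_means, ["b1", "b2", "b3"])
# With no interaction, these differences would be the same at every level of B.
has_interaction_pattern = len(set(effects.values())) > 1
print(effects, has_interaction_pattern)
```

In an interaction plot this pattern shows up as non-parallel lines.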

48 TWO-WAY ANOVA MODEL
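
A common textbook way to write a two-way ANOVA model with interaction is given below. The notation is an assumption and may differ from the presenter's: $\alpha_i$ and $\beta_j$ denote the effects of the two factors, $(\alpha\beta)_{ij}$ their interaction, and $\varepsilon_{ijk}$ the random variation for the $k$th observation in cell $(i, j)$.

```latex
Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}
```

If every $(\alpha\beta)_{ij}$ is zero, the effect of one factor is the same at each level of the other, which is the no-interaction case.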

49 TWO-WAY ANOVA EXAMPLE
We would like to determine the effect of two alloys (low, high) and three cooling temperatures (low, medium, high) on the strength of a wire.
JMP demonstration: Analyze → Fit Model; Y: Strength; highlight Alloy and Temp and click Macros → Factorial to Degree; Run Model

50 JMP OUTPUT
Conclusion: There is strong evidence of an interaction between alloy and temperature.

51 ANALYSIS OF COVARIANCE

52 ANALYSIS OF COVARIANCE (ANCOVA)
Covariates are variables that may affect the response but cannot be controlled. Covariates are not of primary interest to the researcher. We will look at an example with two covariates.
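
One common way to write an ANCOVA model with a single treatment factor and two covariates is sketched below. The notation is an assumption, not necessarily the presenter's: $\tau_i$ is the effect of treatment $i$, $x_{1ij}$ and $x_{2ij}$ are the covariate values for the $j$th subject in treatment $i$, and $\beta_1$ and $\beta_2$ are their slopes.

```latex
Y_{ij} = \mu + \tau_i + \beta_1 x_{1ij} + \beta_2 x_{2ij} + \varepsilon_{ij}
```

The treatment comparison is then made after adjusting the response for the covariates.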

53 ANCOVA EXAMPLE
A school district would like to determine if a new reading program improves student reading. Approximately half the students are assigned to the treatment group (new reading program) and half to the control group (traditional method). The students are tested at the beginning and end of the school year and the change in their score is recorded. The administrators decide to use days absent and gender as covariates.

54 JMP INSTRUCTIONS
JMP demonstration: Analyze → Fit Model; Y: Score Change; Add: Treatment, Days Absent, Gender; Run Model; Response Score Change → Estimates → Show Prediction Expression

55 JMP OUTPUT
Treatment and days absent had significant effects on improvement, but gender did not.

56 CONCLUSION
The one sample t-test allows us to test whether the population mean of a group is equal to a specified value.
The two sample t-test and paired t-test allow us to determine if the population means of two groups are different.
ANOVA and ANCOVA methods allow us to determine whether the population means of several groups are different.

57 SAS, SPSS AND R
For information about using SAS, SPSS and R to do ANOVA:
http://www.ats.ucla.edu/stat/sas/topics/anova.htm
http://www.ats.ucla.edu/stat/spss/topics/anova.htm
http://www.ats.ucla.edu/stat/r/sk/books_pra.htm

58 REFERENCES
Fisher's Irises Data (used in the one sample and two sample t-test examples).
Flexibility data (paired t-test example): Michael Sullivan III. Statistics: Informed Decisions Using Data. Upper Saddle River, New Jersey: Pearson Education, 2004: 602.

