Class 22: Tuesday, Nov. 30th Today: One-way analysis of variance I will e-mail you tonight or tomorrow morning with comments on your project. Schedule:


1 Class 22: Tuesday, Nov. 30th Today: One-way analysis of variance. I will e-mail you tonight or tomorrow morning with comments on your project. Schedule:
– Thurs., Dec. 2nd – Homework 8 due
– Thurs., Dec. 9th – Final class
– Mon., Dec. 13th (5 pm) – Preliminary results from final project due
– Tues., Dec. 14th (5 pm) – Homework 9 due
– Tues., Dec. 21st (Noon) – Final project due

2 Analysis of Variance The goal of analysis of variance is to compare the means of several (many) groups. Analysis of variance is regression with only categorical explanatory variables. One-way analysis of variance: Groups are defined by one categorical variable.

3 Milgram’s Obedience Experiments Subjects were recruited to take part in an experiment on “memory and learning.” The subject is the teacher. The subject conducts a paired-associate learning task with the student. The subject is instructed by the experimenter to administer a shock to the student each time the student gives a wrong response. Moreover, the subject is instructed to “move one level higher on the shock generator each time the learner gives a wrong answer.” The subject is also instructed to announce the voltage level before administering a shock.

4 Four Experimental Conditions
1. Remote-Feedback condition: Student is placed in a room where he cannot be seen by the subject nor can his voice be heard; his answers flash silently on a signal box. However, at 300 volts the laboratory walls resound as he pounds in protest. After 315 volts, no further answers appear, and the pounding ceases.
2. Voice-Feedback condition: Same as remote-feedback condition except that vocal protests were introduced that could be heard clearly through the walls of the laboratory.

5 Four Experimental Conditions (continued)
3. Proximity: Same as the voice-feedback condition except that the student was placed in the same room as the subject, a few feet from the subject. Thus, he was visible as well as audible.
4. Touch-Proximity: Same as proximity condition except that the student received a shock only when his hand rested on a shock plate. At the 150-volt level, the student demanded to be let free and refused to place his hand on the shock plate. The experimenter ordered the subject to force the victim’s hand onto the plate.

6 Two Key Questions
1. Is there any difference among the mean voltage levels of the four conditions?
2. If there are differences, what conditions specifically are different?

7 Multiple Regression Model for Analysis of Variance To answer these questions, we can fit a multiple regression model with voltage level as the response and one categorical explanatory variable (condition). We obtain a sample from each level of the categorical variable (group) and are interested in estimating the population means of the groups based on these samples. Assumptions of multiple regression model for one-way analysis of variance:
– Linearity: automatically satisfied.
– Constant variance: Spread within each group is the same.
– Normality: Observations within each group are normally distributed.
– Independence: Sample consists of independent observations.

8 Comparing the Groups The coefficient on Condition[Proximity]=-26.25 means that proximity is estimated to have a mean that is 26.25 less than the mean of the means of all the conditions. Sample mean of proximity group.
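The interpretation of the coefficients can be checked with a small sketch. The voltages below are made up for illustration (they are not Milgram's data), and the helper name `effect_coefficients` is hypothetical. Under the sum-to-zero coding described on this slide, the intercept is the mean of the group means and each coefficient is that group's sample mean minus the intercept.

```python
from statistics import mean

# Hypothetical voltage readings for illustration only (not Milgram's data).
voltages = {
    "Remote": [450, 420, 390],
    "Voice": [400, 360, 380],
    "Proximity": [300, 330, 270],
    "Touch-Proximity": [250, 280, 220],
}

def effect_coefficients(groups):
    """Return (intercept, {group: coefficient}) under sum-to-zero coding.

    intercept   = mean of the group sample means
    coefficient = group sample mean - intercept
    """
    group_means = {name: mean(values) for name, values in groups.items()}
    intercept = mean(list(group_means.values()))
    coefs = {name: m - intercept for name, m in group_means.items()}
    return intercept, coefs

intercept, coefs = effect_coefficients(voltages)
for name, coef in coefs.items():
    # intercept + coefficient recovers the sample mean of the group
    print(name, coef, intercept + coef)
```

A negative coefficient, as for Proximity on this slide, means the group's sample mean is below the mean of the group means; the coefficients always sum to zero.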

9 Effect Test tests null hypothesis that the mean in all four conditions is the same versus alternative hypothesis that at least two of the conditions have different means. p-value of Effect Test < 0.0001. Strong evidence that population means are not the same for all four conditions.
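For a single categorical variable, the Effect Test is the one-way ANOVA F test. The following is a minimal sketch of how the F statistic is formed from between-group and within-group variation. The function name `one_way_f_statistic` and the data are hypothetical, and the sketch stops at the statistic and its degrees of freedom rather than computing the p-value.

```python
from statistics import mean

def one_way_f_statistic(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k

# Hypothetical data for illustration only (not Milgram's data).
samples = [[450, 420, 390], [400, 360, 380], [300, 330, 270], [250, 280, 220]]
f_stat, df_between, df_within = one_way_f_statistic(samples)
print(f_stat, df_between, df_within)
```

The p-value is the upper tail probability of the F distribution with these degrees of freedom; if SciPy is available, `scipy.stats.f.sf(f_stat, df_between, df_within)` gives it, and `scipy.stats.f_oneway` runs the whole test.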

10 JMP for One-way ANOVA One-way ANOVA can be carried out in JMP either using Fit Model with a categorical explanatory variable or Fit Y by X with the categorical variable as the X (explanatory) variable and the measurement as the Y (response) variable. After using the Fit Y by X command, click the red triangle next to Oneway Analysis and then Display Options, Boxplots to see side-by-side boxplots, and click Mean/ANOVA to see the means of the different groups and the test of whether all groups have the same mean. The p-value of this test is Prob>F in the ANOVA table.

11 Prob>F = p-value for test that all groups have same mean. Same as p-value for Effect test in Fit Model Output.

12 Two Key Questions
1. Is there any difference among the mean voltage levels of the four conditions? Yes, there is strong evidence of a difference. p-value of Effect Test < 0.0001.
2. If there are differences, what conditions specifically are different?

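13 Remote group is estimated to have a mean voltage level that is 66.75 higher than the mean voltage level of all four groups. The t-test is a test of $H_0: \mu_{\text{Remote}} = \bar{\mu}$ vs. $H_a: \mu_{\text{Remote}} \neq \bar{\mu}$, where $\bar{\mu}$ is the average of the four group means. p-value = 0.0001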

14 Testing whether each of the groups is different Naïve approach to deciding which groups have a mean that is different from the average of the means of all groups: Do a t-test for each group and look for groups that have p-value <0.05. Problem: Multiple comparisons.

15 Errors in Hypothesis Testing
                              State of World
Decision Based on Data        Null Hypothesis True    Alternative Hypothesis True
Accept Null Hypothesis        Correct Decision        Type II error
Reject Null Hypothesis        Type I error            Correct Decision
When we do one hypothesis test and reject the null hypothesis if p-value <0.05, then the probability of making a Type I error when the null hypothesis is true is 0.05. We protect against falsely rejecting a null hypothesis by making the probability of a Type I error small.

16 Multiple Comparisons Problem Compound uncertainty: When doing more than one test, there is an increased chance of making a mistake. If we do multiple hypothesis tests and use the rule of rejecting the null hypothesis in each test if the p-value is <0.05, then the probability of making at least one Type I error can be well above 0.05.

17 Multiple Comparison Simulation In multiplecomp.JMP, 20 groups are compared with sample sizes of ten for each group. The observations for each group are simulated from a standard normal distribution. Thus, in fact, all 20 groups have the same population mean. Number of groups found to have means different than average using t-test and rejecting if p-value <0.05:
Iteration      1   2   3   4   5
# of Groups
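A rough Python analogue of this simulation, as a sketch only. It makes a simplifying assumption that differs from the JMP file: each group is tested with a two-sided z-test against the true mean of zero with known variance one, instead of a t-test against the average of the group means. The function name `count_false_rejections` and the seed are arbitrary choices.

```python
import random
from statistics import NormalDist, mean

def count_false_rejections(rng, n_groups=20, n_per_group=10, alpha=0.05):
    """Count groups declared 'different' although every true mean is zero.

    Simplification (assumption): z-test against the known mean 0 and
    known variance 1, not the t-test against the average of group means.
    """
    std_normal = NormalDist()
    rejections = 0
    for _ in range(n_groups):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        z = mean(sample) * n_per_group ** 0.5      # standardized sample mean
        p_value = 2 * (1 - std_normal.cdf(abs(z)))  # two-sided p-value
        if p_value < alpha:
            rejections += 1
    return rejections

rng = random.Random(22)  # arbitrary seed so the run is repeatable
counts = [count_false_rejections(rng) for _ in range(5)]
print(counts)  # one count per iteration
```

Every rejection here is a Type I error. With 20 tests at the 0.05 level, about one false rejection per iteration is expected on average.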

18 Individual vs. Familywise Error Rate When several tests are considered simultaneously, they constitute a family of tests. Individual Type I error rate: Probability for a single test that the null hypothesis will be rejected assuming that the null hypothesis is true. Familywise Type I error rate: Probability for a family of tests that at least one null hypothesis will be rejected assuming that all of the null hypotheses are true. When we consider a family of tests, we want to make the familywise error rate small, say 0.05, to protect against falsely rejecting a null hypothesis.
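To see how quickly the familywise rate grows, here is a small sketch under the assumption that the k tests are independent (tests that compare each group with the overall average are not exactly independent, so this is only a guide). The function name `familywise_error_rate` is hypothetical.

```python
def familywise_error_rate(k, alpha=0.05):
    """P(at least one false rejection) for k independent tests at level alpha,
    when all k null hypotheses are true."""
    return 1 - (1 - alpha) ** k

for k in (1, 4, 20):
    print(k, round(familywise_error_rate(k), 4))
```

Under this assumption, four tests at the 0.05 level have a familywise rate of roughly 0.19, and twenty tests roughly 0.64.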

19 Bonferroni Method General method for doing multiple comparisons for any family of k tests. Denote the familywise Type I error rate we want by p*, say p*=0.05. Compute p-values for each individual test. Reject the null hypothesis for the ith test if $p_i < p^*/k$, where $p_i$ is the p-value of the ith test. This guarantees that the familywise Type I error rate is at most p*. Why Bonferroni works: If we do k tests and all null hypotheses are true, then using Bonferroni with p*=0.05, we have probability 0.05/k to make a Type I error for each test and expect to make k*(0.05/k)=0.05 errors in total. The probability of making at least one error cannot exceed this expected number, so the familywise error rate is at most 0.05.
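The Bonferroni rule is short enough to sketch directly. The function name `bonferroni_reject` and the p-values are hypothetical and chosen only to show the threshold at work.

```python
def bonferroni_reject(p_values, familywise_alpha=0.05):
    """Return one True/False decision per test: reject when p < alpha / k."""
    k = len(p_values)
    threshold = familywise_alpha / k
    return [p < threshold for p in p_values]

# Hypothetical p-values for a family of four tests (threshold 0.05/4 = 0.0125).
example_p_values = [0.0001, 0.30, 0.02, 0.004]
decisions = bonferroni_reject(example_p_values)
print(decisions)
```

Note the third test: a p-value of 0.02 would be rejected by the naive 0.05 rule but not after the Bonferroni adjustment.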

20 Bonferroni method on Milgram’s data If we want to test whether each of the four groups has a mean different from the mean of all four groups, we have four tests. Bonferroni method: Check whether p-value of each test is <0.05/4=0.0125. There is strong evidence that the remote group has a mean higher than the mean of the four groups and the touch-proximity group has a mean lower than the mean of the four groups.

21 Multiple Comparisons and Jung’s test of the Zodiac
What’s your sign?
– Who should you date?
– Look for successful relationships
– Let’s do a simulation to see what difficulties lie here:
  add rows (say 150)
  in the first column (label it "his sign"): formula --> random --> uniform (uniform*12), then numeric --> ceiling
  Repeat for "her sign"
  Repeat for "relationship success"
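The same exercise can be sketched in Python. This is an assumption-laden analogue of the JMP steps: "relationship success" is treated as a random score from 1 to 12, generated the same way as the signs, and the function names `simulate_couples` and `best_pairing` are hypothetical. Since every column is pure noise, any sign pairing that looks successful is a multiple-comparisons artifact.

```python
import random
from collections import defaultdict
from statistics import mean

def simulate_couples(rng, n_rows=150):
    """Each row: (his sign, her sign, relationship success), all uniform on 1..12.

    Equivalent in distribution to ceiling(uniform * 12) in the JMP formula.
    """
    return [
        (rng.randint(1, 12), rng.randint(1, 12), rng.randint(1, 12))
        for _ in range(n_rows)
    ]

def best_pairing(rows):
    """Return ((his sign, her sign), average success) for the top-scoring pair."""
    scores = defaultdict(list)
    for his, her, success in rows:
        scores[(his, her)].append(success)
    averages = ((pair, mean(values)) for pair, values in scores.items())
    return max(averages, key=lambda item: item[1])

rng = random.Random(2004)  # arbitrary seed so the run is repeatable
rows = simulate_couples(rng)
pair, avg_success = best_pairing(rows)
overall = mean(success for _, _, success in rows)
print(pair, avg_success, overall)
```

With up to 144 sign pairings and only 150 rows, the best-looking pairing will usually score far above the overall average even though the signs carry no information.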

