Introduction to SAS Essentials Mastering SAS for Data Analytics


1 Introduction to SAS Essentials Mastering SAS for Data Analytics
Alan Elliott and Wayne Woodward SAS ESSENTIALS -- Elliott & Woodward

2 Chapter 13: ANALYSIS OF VARIANCE

3 LEARNING OBJECTIVES
• To be able to compare three or more means using one-way ANOVA with multiple comparisons
• To be able to perform a repeated measures (dependent samples) analysis of variance with multiple comparisons
• To be able to graph mean comparisons

4 PROC ANOVA and PROC GLM
This chapter illustrates how to perform an analysis of variance (ANOVA) for several common designs. The book covers three SAS procedures: PROC ANOVA, PROC GLM, and PROC MIXED. (PROC MIXED is covered in Chapter 14.) In this chapter, we describe:
• PROC ANOVA: a basic procedure useful for one-way ANOVA or for multiway factorial designs with fixed factors and an equal number of observations per cell.
• PROC GLM: used for one-way repeated measures analysis and for techniques not supported by PROC ANOVA.

5 13.1 COMPARING THREE OR MORE MEANS USING ONE-WAY ANALYSIS OF VARIANCE
A one-way ANOVA is an extension of the independent group t-test to more than two groups. Assumptions for this test are similar to those for the t-test: data within groups are normally distributed with equal variances across groups, and groups are from independent samples. The hypotheses for the comparison of independent groups are as follows (k is the number of groups):
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$: Means of all the groups are equal.
$H_a: \mu_i \neq \mu_j$ for some $i \neq j$: At least two means are not equal.
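As a reminder of what the test computes, the F statistic reported by the procedures in this chapter compares the variation between group means with the variation within groups. The notation below ($n_i$ for the size of group $i$, $\bar{y}_i$ for its mean, $\bar{y}$ for the overall mean, and $N$ for the total sample size) is introduced here for illustration and does not appear on the slides:

```latex
F = \frac{MS_{\text{between}}}{MS_{\text{within}}}
  = \frac{\sum_{i=1}^{k} n_i \,(\bar{y}_i - \bar{y})^2 \,/\, (k-1)}
         {\sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 \,/\, (N-k)}
```

Under $H_0$ and the assumptions listed above, this statistic follows an F distribution with $k-1$ and $N-k$ degrees of freedom, which is where the reported p-value comes from.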

6 Simplified Syntax for PROC ANOVA
The syntax for the procedure is as follows:
    PROC ANOVA <Options>;
      CLASS variable;
      MODEL dependentvar = independentvars;
      MEANS independentvars / typecomparison <meansoptions>;
The CLASS statement defines the grouping variable. The MODEL statement defines the model tested. The MEANS statement defines post hoc multiple comparisons.

7 Table 13.1 Common Options for PROC ANOVA and PROC GLM for Performing a One-Way ANOVA or Simple Repeated Measures
Option              Explanation
DATA=dataname       Specifies which data set to use.
NOPRINT             Suppresses output. This is used when you want to extract information from ANOVA results but don't want SAS to produce output in the Results Viewer.
OUTSTAT=dataname    Names an output data set that saves a number of the results from the ANOVA calculation.
PLOTS=options       Specify PLOTS=NONE to suppress plots that are generated by default.
ORDER=option        Specifies the order in which to display the CLASS variable (similar to what was covered in Chapter 10: Analyzing Counts and Tables). Options are DATA, FORMATTED, FREQ, or INTERNAL.
ALPHA=p             Specifies the alpha level for a confidence interval (GLM only).

8 Common Statements for PROC ANOVA and PROC GLM (for one-way analyses) (Table 13.1 Continued)
Statement                  Explanation
CLASS variable list;       This statement is required and specifies the grouping variable(s) for the analysis.
MODEL specification        Specifies the dependent and independent variables for the analysis. More specifically, it takes the form MODEL dependentvariable=independentvariable(s);
FREQ var                   Specifies that a variable represents the count of values for an observation. Similar to the WEIGHT statement for PROC FREQ.
MEANS vars                 Calculates means for dependent variables and may include comparisons.
LSMEANS vars               Calculates least squares means for a dependent variable and can request comparisons. (GLM only)
REPEATED vars              Used to specify repeated measures variables.
TEST specification         Used to specify a hypothesis test value.
CONTRAST specification     Allows you to create customized post hoc comparisons. (GLM only)
BY, FORMAT, LABEL, WHERE   These statements are common to most procedures, and may be used here.
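To make a few of the options in Table 13.1 concrete, here is a minimal sketch that saves the ANOVA statistics to a data set and suppresses the default plots. It assumes a data set ACHE with variables BRAND and RELIEF already exists (the names used in this chapter's one-way example); ANOVA_STATS is a hypothetical output data set name chosen for illustration.

```sas
/* Sketch only: assumes data set ACHE (variables BRAND, RELIEF) exists. */
/* ANOVA_STATS is a hypothetical name for the OUTSTAT= data set.        */
PROC ANOVA DATA=ACHE OUTSTAT=ANOVA_STATS PLOTS=NONE;
  CLASS BRAND;
  MODEL RELIEF=BRAND;
RUN;
QUIT;

/* Inspect the saved statistics (sums of squares, F value, p-value). */
PROC PRINT DATA=ANOVA_STATS;
RUN;
```

Adding NOPRINT to the PROC ANOVA statement would keep the saved data set while suppressing the displayed output.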

9 Using the MEANS or LSMEANS Statement
When you perform a one-way ANOVA, there is typically a two-step procedure: (1) test the null hypothesis to determine whether any significant differences exist, and (2) if H0 is rejected, run subsequent multiple comparison tests to determine which means are significantly different. Pairwise comparisons of means can be performed using one of several multiple comparison tests specified in the MEANS statement, which has the following format (where independentvar is a CLASS variable):
    MEANS independentvar / typecomparison <meansoptions>;
For PROC GLM, use the LSMEANS statement:
    LSMEANS independentvar / typecomparison <meansoptions>;

10 Table 13.2 Common typecomparison Options for the PROC ANOVA or GLM MEANS Statement (options following the slash /)
Option          Explanation
BON             Bonferroni t-tests of difference
DUNCAN          Duncan's multiple range test
SCHEFFE         Scheffe multiple comparison
SNK             Student Newman Keuls multiple range test
LSD             Fisher's Least Significant Difference
TUKEY           Tukey's studentized range test
DUNNETT ('x')   Dunnett's test: compare to a single control, where 'x' is the category value of the control group
ALPHA=pvalue    Specifies the significance level for comparisons (default: 0.05)
CLDIFF          Requests that confidence limits be included in the output.

11 Common typecomparison Options for the PROC GLM LSMEANS Statement (Table 13.2 Continued)
Options following the slash /:
Option          Explanation
ADJUST=option   Specifies the type of multiple comparison adjustment. Examples are BON, DUNNETT, SCHEFFE, SIDAK, and TUKEY.
PDIFF           Requests p-values for differences between the least squares means. By default these are unadjusted t-tests (ADJUST=T); combine PDIFF with ADJUST=TUKEY or ADJUST=DUNNETT for adjusted p-values.
Do Hands on Example p 315 (AANOVA1.SAS)

12 SAS Code for a One-Way ANOVA (From AANOVA1.SAS)
    PROC ANOVA DATA=ACHE;
      CLASS BRAND;
      MODEL RELIEF=BRAND;
      MEANS BRAND/TUKEY;
      TITLE 'ANOVA EXAMPLE';
    RUN;
    QUIT;
CLASS defines the grouping variable, BRAND. The MODEL statement indicates that you want to test whether BRAND can predict mean RELIEF. The MEANS statement is used for a post hoc test (if H0 is rejected) to determine which means are different.
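For readers who do not have AANOVA1.SAS at hand, the following self-contained sketch shows how the ACHE data set might be created before running the procedure. The RELIEF values here are invented purely for illustration and are not the book's data, so the output will not match the numbers quoted on the following slides.

```sas
/* Hypothetical data: the RELIEF values below are invented for          */
/* illustration and are NOT the data from AANOVA1.SAS.                  */
DATA ACHE;
  INPUT BRAND RELIEF @@;   /* @@ reads several observations per line */
  DATALINES;
1 24.5 1 23.5 1 26.4 1 27.1
2 28.4 2 34.2 2 29.5 2 32.2
3 26.1 3 28.3 3 24.3 3 26.2
;
RUN;

/* One-way ANOVA with Tukey post hoc comparisons, as on the slide. */
PROC ANOVA DATA=ACHE;
  CLASS BRAND;
  MODEL RELIEF=BRAND;
  MEANS BRAND/TUKEY;
  TITLE 'ANOVA EXAMPLE';
RUN;
QUIT;
```

The QUIT statement matters because PROC ANOVA is an interactive procedure that otherwise stays active after RUN.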

13 Results of a One-Way ANOVA
The primary results for a one-way ANOVA test are in the following table: The p-value is used to decide whether or not to reject the null hypothesis. Typically, if p<0.05, you reject H0. Rejecting H0 indicates that some group means are different, so you proceed to look at the post hoc results.

14 Post Hoc Multiple Comparisons test – Tukey Test
This test summarizes which means are found to be different at the alpha=0.05 significance level. In this table, means that are considered NOT DIFFERENT (at alpha=0.05) are grouped (see the Tukey Grouping column). Thus, BRANDs 3 and 1 are both placed in group B, and their means (26.54 and 26.28) are considered NOT DIFFERENT. BRAND 2 is grouped alone (group A), so the mean for BRAND 2 (30.880) is considered LARGER than the mean for either BRAND 1 or BRAND 3 (at the alpha=0.05 level).

15 Graphical Comparison of Groups
This graph reinforces the statistical results: groups 1 and 3 are very similar, but the mean for group 2 is larger than for either group 1 or group 3.

16 Multiple Comparison Test Using Confidence Limits
Using this code for the comparison test:
    MEANS BRAND/TUKEY CLDIFF;
results in a table in which mean differences are compared. For example, the first line tests the difference between the means for groups 2 and 3 (2 minus 3) and reports 95% confidence limits whose upper limit is 7.989. Since this interval does not include 0.0, the difference is considered statistically significant at the 0.05 significance level. The *** indicates a significant difference (at the 0.05 level) for that comparison.

17 Multiple Comparisons using p-values
Using PROC GLM instead of PROC ANOVA, and using this code for the comparison test:
    LSMEANS BRAND/ PDIFF;
results in a table that reports the p-values for the mean comparisons. For example, the p-value for the comparison of means 1 vs 3 indicates that the difference in means is NOT statistically significant, whereas the comparison of means 2 vs 3 is statistically significant.
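Putting the LSMEANS statement in context, a full PROC GLM step for the same comparison might look like the sketch below. It assumes the ACHE data set (variables BRAND and RELIEF) already exists; the second LSMEANS statement and the title text are additions for illustration, not part of the slide.

```sas
/* Sketch only: assumes data set ACHE (variables BRAND, RELIEF) exists. */
PROC GLM DATA=ACHE;
  CLASS BRAND;
  MODEL RELIEF=BRAND;
  LSMEANS BRAND / PDIFF;               /* unadjusted pairwise p-values */
  LSMEANS BRAND / PDIFF ADJUST=TUKEY;  /* Tukey-adjusted p-values      */
  TITLE 'ANOVA EXAMPLE USING PROC GLM';
RUN;
QUIT;
```

Requesting both versions side by side makes it easy to see how much the multiple comparison adjustment changes each p-value.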

18 13.2 COMPARING THREE OR MORE REPEATED MEASURES
Repeated measures are observations taken from the same or related subjects over time or in differing circumstances. When there are three or more repeated measures, the corresponding analysis is a repeated measures ANOVA. The hypotheses being tested with repeated measures ANOVA are as follows:
H0: There is no difference among the group means (repeated measures).
Ha: There is a difference among the group means.

19 Example Syntax for a Repeated Measures ANOVA
    PROC GLM DATA=STUDY;
      CLASS SUBJ DRUG;
      MODEL RESULT = SUBJ DRUG;
      MEANS DRUG/DUNCAN;
      TITLE 'Repeated Measures ANOVA';
    RUN;
    QUIT;
The CLASS statement indicates the grouping variables. In repeated measures, a subject variable is included. The MODEL statement indicates that you want to predict RESULT from type of DRUG. Subject is included to account for subject differences.

20 Example Repeated Measures Data
Each subject received each of the 4 drugs (in random order, with a washout period between administrations). The first row of the data table is:
Subj  Drug1  Drug2  Drug3  Drug4
1     31     29     17     35
The rows for subjects 2 through 5 survive only in part: 2 15 11 23 3 25 21 19 4 45 5 27

21 Repeated Measures Data in SAS
The SAS data set for the repeated measures analysis is not laid out like the table. Each line represents an observation, and each subject has 4 lines representing the 4 drugs.
    DATA STUDY;
    INPUT SUBJ DRUG RESULT;
    DATALINES;
    1 1 31
    1 2 29
    1 3 17
    1 4 35
    Etc…
Do the Hands on Example p 320 (AGLM1.SAS). Notice how the data are set up for repeated measures: each subject has 4 records, one for each drug observation.
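If your data start out in the one-row-per-subject layout of the table, a DATA step can restructure them into the one-row-per-observation layout that PROC GLM needs. The sketch below is one way to do this under stated assumptions: STUDY_WIDE and the variable names DRUG1-DRUG4 are hypothetical, and only subject 1's values (the row shown in full above) are included.

```sas
/* Hypothetical wide data set: one row per subject. Only subject 1's */
/* values, taken from the example above, are entered here.           */
DATA STUDY_WIDE;
  INPUT SUBJ DRUG1 DRUG2 DRUG3 DRUG4;
  DATALINES;
1 31 29 17 35
;
RUN;

/* Restructure to long form: one row per subject-drug combination. */
DATA STUDY;
  SET STUDY_WIDE;
  ARRAY D{4} DRUG1-DRUG4;     /* the four wide columns            */
  DO DRUG = 1 TO 4;
    RESULT = D{DRUG};         /* value for this subject and drug  */
    OUTPUT;                   /* write one long-format record     */
  END;
  KEEP SUBJ DRUG RESULT;
RUN;
```

The resulting STUDY data set has the same three variables (SUBJ, DRUG, RESULT) as the DATA step shown above.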

22 Results from Repeated Measures ANOVA
The results of interest are in the Type III table: Typically, you are not interested in the SUBJ line in this table (or its p-value). The line of interest is the DRUG line, which tests the hypothesis of interest. In this case p<0.0001, which indicates a significant difference in means for the 4 drugs. Do a post hoc test to determine which drugs are different.

23 Multiple Comparisons for Repeated Measures ANOVA
This statement provides a multiple comparison test, which is appropriate if the main hypothesis test is significant:
    MEANS DRUG/DUNCAN;
Results indicate that there is NO DIFFERENCE between DRUGS 1 and 2 (means of 26.6 vs 25.8). However, DRUG 4 has the largest (statistically significant) mean at 33.0 and DRUG 3 has the smallest mean.

24 Graphical Results of a Repeated Measures ANOVA
This is visual confirmation of the multiple comparisons: the line for DRUG 4 is consistently higher than all the others. DRUGS 1 and 2 are too close to call different, and DRUG 3 has the smallest means.

25 Using LSMEANS for Comparisons (Tukey)
Using this code:
    LSMEANS DRUG/PDIFF ADJUST=TUKEY;
you get the following results: Other common ADJUST= options are BON, DUNNETT, and SCHEFFE. Results indicate that there is NO DIFFERENCE between DRUGS 1 and 2 (p=.97). However, the mean for DRUG 4 is different from that for DRUG 1 (p=0.0147), and so on.

26 13.3 GOING DEEPER: CONTRASTS
At times when you are comparing means across groups in a one-way ANOVA, you may be interested in specific post hoc comparisons. For example, suppose you have a data set consisting of four groups. For some hypothesized reason, you wonder if the average of means 1 and 2 is different from mean 4. Using a CONTRAST statement, you can specify such a comparison. A CONTRAST statement uses the following syntax:
    CONTRAST 'label' indvar effectvalues;

27 Setting Up a Contrast Statement
For example, to compare GROUP 1 versus the combined mean of GROUPS 3 and 4, use the following CONTRAST statement:
    CONTRAST '1 vs 3+4' GROUP -1 0 0.5 0.5;
Here '1 vs 3+4' is a label of your choosing, and the effectvalues (-1, 0, 0.5, 0.5) are the definition of the contrast. Note that the effectvalues sum to zero. The signs indicate the comparison. The -1 represents Group 1, the 0 Group 2, etc. GROUPS 3 and 4 both have coefficients of 0.5, which indicates that their means are combined equally, each contributing a half (0.5) to the value.
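Written out as a formula, the contrast above tests whether a particular linear combination of the four group means is zero. The symbols $L$ and $\mu_1, \dots, \mu_4$ (the population means of GROUPS 1 through 4) are notation introduced here for illustration:

```latex
L = (-1)\,\mu_1 + (0)\,\mu_2 + (0.5)\,\mu_3 + (0.5)\,\mu_4
  = \frac{\mu_3 + \mu_4}{2} - \mu_1,
\qquad H_0 : L = 0
```

The coefficients sum to zero, $-1 + 0 + 0.5 + 0.5 = 0$, which is what makes this a comparison between means rather than a statement about their overall level.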

28 Do Hands On Example p 324 (AGLM CONTRAST.SAS)
    PROC GLM DATA=CONTRAST;
      CLASS GROUP;
      MODEL OBSERVATION=GROUP;
      CONTRAST 'Groups 1 vs 3&4' GROUP -1 0 0.5 0.5;
    RUN;
    QUIT;
Apart from the CONTRAST statement, this is standard ANOVA code. Add one or more CONTRAST statements within PROC GLM.

29 Contrast Statement Results
The output shows the standard ANOVA results followed by the CONTRAST statement results.

30 Continue CONTRAST Example
Add these statements:
    CONTRAST 'Drugs 1 vs 3&4 Again' GROUP effectvalues;
    CONTRAST 'Drugs 1&2 vs 3&4' GROUP effectvalues;
This adds two more CONTRAST comparisons to the output.
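The coefficients for these two added statements are not shown in this transcript, so the sketch below uses values that are assumptions chosen to match the labels: (-2, 0, 1, 1) is a rescaled version of the Group 1 versus Groups 3 and 4 comparison, and (-1, -1, 1, 1) compares Groups 1 and 2 with Groups 3 and 4. The book's own example may use different values.

```sas
/* Sketch only. The coefficients on the last two CONTRAST statements */
/* are ASSUMED to match their labels; they are not from the slides.  */
PROC GLM DATA=CONTRAST;
  CLASS GROUP;
  MODEL OBSERVATION=GROUP;
  CONTRAST 'Groups 1 vs 3&4'      GROUP -1  0 0.5 0.5;
  CONTRAST 'Drugs 1 vs 3&4 Again' GROUP -2  0 1   1;
  CONTRAST 'Drugs 1&2 vs 3&4'     GROUP -1 -1 1   1;
RUN;
QUIT;
```

Multiplying every coefficient of a contrast by the same nonzero constant does not change its F test, so under these assumed values the first two statements would report the same F value and p-value.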

31 13.4 SUMMARY
This chapter illustrates SAS procedures for comparing three or more means in both an independent group setting and for repeated measures. In both cases, the chapter includes examples illustrating how to perform post hoc multiple comparisons analysis. Continue to Chapter 14: ANALYSIS OF VARIANCE, PART II

32 These slides are based on the book:
Introduction to SAS Essentials Mastering SAS for Data Analytics, 2nd Edition
By Alan C. Elliott and Wayne A. Woodward
Paperback: 512 pages
Publisher: Wiley; 2 edition (August 3, 2015)
Language: English
These slides are provided for you to use to teach SAS using this book. Feel free to modify them for your own needs. Please send comments about errors in the slides (or suggestions for improvements) to the authors. Thanks.

