Introduction to SAS Essentials Mastering SAS for Data Analytics


Introduction to SAS Essentials Mastering SAS for Data Analytics Alan Elliott and Wayne Woodward SAS ESSENTIALS -- Elliott & Woodward

Chapter 13: ANALYSIS OF VARIANCE

LEARNING OBJECTIVES
• To be able to compare three or more means using one-way ANOVA with multiple comparisons
• To be able to perform a repeated measures (dependent samples) analysis of variance with multiple comparisons
• To be able to graph mean comparisons

PROC ANOVA and PROC GLM
This chapter illustrates how to perform an analysis of variance (ANOVA) for several common designs. The book covers three SAS procedures: PROC ANOVA, PROC GLM, and PROC MIXED. (PROC MIXED is covered in Chapter 14.) In this chapter, we describe:
• PROC ANOVA: a basic procedure useful for one-way ANOVA or for multiway factorial designs with fixed factors and an equal number of observations per cell.
• PROC GLM: for one-way repeated measures analysis and for techniques not supported by PROC ANOVA.

13.1 COMPARING THREE OR MORE MEANS USING ONE-WAY ANALYSIS OF VARIANCE
A one-way ANOVA is an extension of the independent group t-test to more than two groups. Assumptions for this test are similar to those for the t-test:
• Data within groups are normally distributed with equal variances across groups.
• Groups are from independent samples.
The hypotheses for the comparison of independent groups are as follows (k is the number of groups):
H0: μ1 = μ2 = … = μk: Means of all the groups are equal.
Ha: μi ≠ μj for some i ≠ j: At least two means are not equal.
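The equal-variance assumption can be checked before relying on the ANOVA results. The following is a minimal sketch, not taken from the slides: it borrows the ACHE data set (variables BRAND and RELIEF) from the chapter's one-way example and assumes the HOVTEST=LEVENE option of the MEANS statement, which the chapter itself does not cover.

```sas
/* Sketch: Levene's test for homogeneity of variance across groups. */
/* ACHE, BRAND, and RELIEF come from the chapter's one-way example; */
/* HOVTEST= is an addition not discussed on the slides.             */
PROC GLM DATA=ACHE;
  CLASS BRAND;                    /* grouping variable */
  MODEL RELIEF = BRAND;           /* one-way model     */
  MEANS BRAND / HOVTEST=LEVENE;   /* equal-variance check for a one-way model */
RUN;
QUIT;
```

A small p-value for Levene's test would suggest that the group variances differ, in which case the standard one-way ANOVA results should be read with caution.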

Simplified Syntax for PROC ANOVA
The syntax for the procedure is as follows:
    PROC ANOVA <Options>;
    CLASS variable;
    MODEL dependentvar = independentvars;
    MEANS independentvars / typecomparison <meansoptions>;
• The CLASS statement defines the grouping variable.
• The MODEL statement defines the model tested.
• The MEANS statement defines post hoc multiple comparisons.

Table 13.1 Common Options for PROC ANOVA and PROC GLM for Performing a One-Way ANOVA or Simple Repeated Measures
Option              Explanation
DATA=dataname       Specifies which data set to use.
NOPRINT             Suppresses output. This is used when you want to extract information from ANOVA results but don’t want SAS to produce output in the Results Viewer.
OUTSTAT=dataname    Names an output data set that saves a number of the results from the ANOVA calculation.
PLOTS=options       Specify PLOTS=NONE to suppress plots that are generated by default.
ORDER=option        Specifies the order in which to display the CLASS variable (similar to what was covered in Chapter 10: Analyzing Counts and Tables). Options are DATA, FORMATTED, FREQ, or INTERNAL.
ALPHA=p             Specifies the alpha level for a confidence interval (GLM only).
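As an illustration of how several of these options combine on the PROC statement, here is a hedged sketch. It reuses the ACHE data set from the chapter's one-way example; the output data set name ANOVA_STATS is hypothetical.

```sas
/* Sketch: procedure-level options from Table 13.1.           */
/* ANOVA_STATS is a hypothetical name for the OUTSTAT= data.  */
PROC ANOVA DATA=ACHE ORDER=FREQ PLOTS=NONE OUTSTAT=ANOVA_STATS;
  CLASS BRAND;
  MODEL RELIEF = BRAND;
RUN;
QUIT;

/* Inspect the saved sums of squares, F value, and p-value. */
PROC PRINT DATA=ANOVA_STATS;
RUN;
```

Adding NOPRINT to the PROC ANOVA statement would suppress the displayed results while still writing the OUTSTAT= data set.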

Common Statements for PROC ANOVA and PROC GLM (for one-way analyses) (Table 13.1 continued)
Statement                     Explanation
CLASS variable list;          This statement is required and specifies the grouping variable(s) for the analysis.
MODEL specification           Specifies the dependent and independent variables for the analysis. More specifically, it takes the form MODEL dependentvariable=independentvariable(s);
FREQ var                      Specifies that a variable represents the count of values for an observation. Similar to the WEIGHT statement for PROC FREQ.
MEANS vars                    Calculates means for dependent variables and may include comparisons.
LSMEANS vars                  Calculates least squares means for a dependent variable and can request comparisons. (GLM only)
REPEATED vars                 Used to specify repeated measure variables.
TEST specification            Used to specify a hypothesis test value.
CONTRAST specification        Allows you to create customized post hoc comparisons. (GLM only)
BY, FORMAT, LABEL, WHERE      These statements are common to most procedures and may be used here.

Using the MEANS or LSMEANS Statement
When you perform a one-way ANOVA, there is typically a two-step procedure: (1) test the null hypothesis to determine whether any significant differences exist, and (2) if H0 is rejected, run subsequent multiple comparison tests to determine which means are significantly different. Pairwise comparison of means can be performed using one of several multiple comparison tests specified using the MEANS statement, which has the following format (where independentvar is a CLASS variable):
    MEANS independentvar / typecomparison <meansoptions>;
For PROC GLM, use the LSMEANS statement:
    LSMEANS independentvar / typecomparison <meansoptions>;

Table 13.2 Common typecomparison Options for the PROC ANOVA or GLM MEANS Statement (options following the slash /)
Option            Explanation
BON               Bonferroni t-tests of difference
DUNCAN            Duncan’s multiple range test
SCHEFFE           Scheffe multiple comparison
SNK               Student Newman Keuls multiple range test
LSD               Fisher’s Least Significant Difference
TUKEY             Tukey’s studentized range test
DUNNETT ('x')     Dunnett’s test: compare to a single control, where 'x' is the category value of the control group
ALPHA=pvalue      Specifies the significance level for comparisons (default: 0.05)
CLDIFF            Requests that confidence limits be included in the output.
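Several of these options can appear together after the slash. The sketch below reuses the ACHE example and treats BRAND 1 as the control group; that choice of control is an assumption made only for illustration.

```sas
/* Sketch: Dunnett comparisons against a control, with 99% limits. */
/* Treating BRAND '1' as the control is a hypothetical choice.     */
PROC ANOVA DATA=ACHE;
  CLASS BRAND;
  MODEL RELIEF = BRAND;
  MEANS BRAND / DUNNETT('1') ALPHA=0.01 CLDIFF;
RUN;
QUIT;
```

Here ALPHA=0.01 changes the significance level for the comparisons from the default of 0.05.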

Common typecomparison Options for the PROC GLM LSMEANS Statement (options following the slash /) (Table 13.2 continued)
Option            Explanation
ADJUST=option     Specifies the type of multiple comparison adjustment. Examples are BON, DUNNETT, SCHEFFE, SIDAK, and TUKEY.
PDIFF             Calculates p-values for differences between least squares means. The default adjustment is T (unadjusted t-tests); you can also specify the TUKEY or DUNNETT adjustments.
Do Hands on Example p 315 (AANOVA1.SAS)

SAS Code for a One-Way ANOVA (from AANOVA1.SAS)
    PROC ANOVA DATA=ACHE;
    CLASS BRAND;
    MODEL RELIEF=BRAND;
    MEANS BRAND/TUKEY;
    TITLE 'ANOVA EXAMPLE';
    RUN;
    QUIT;
• CLASS defines the grouping variable, BRAND.
• The MODEL statement indicates that you want to test whether BRAND can predict mean RELIEF.
• The MEANS statement is used for a post hoc test (if H0 is rejected) to determine which means are different.

Results of a One-Way ANOVA
The primary results for a one-way ANOVA test appear in the analysis of variance table in the output. The p-value is used to decide whether or not to reject the null hypothesis. Typically, if p<0.05, you reject H0. Rejecting H0 indicates that some group means are different, so you proceed to look at the post hoc results.

Post Hoc Multiple Comparisons Test: Tukey Test
This test summarizes which means are found different at the alpha=0.05 significance level. In the output table, means that are considered NOT DIFFERENT (at alpha=0.05) are grouped (see the Tukey Grouping column). Thus, BRANDs 3 and 1 are placed in group B, and their means (26.54 and 26.28) are considered NOT DIFFERENT. BRAND 2 is grouped alone (group A), so the mean for BRAND 2 (30.880) is considered LARGER than either 26.54 or 26.28 (at the alpha=0.05 level).

Graphical Comparison of Groups
The graph reinforces the statistical results: groups 1 and 3 are very similar, but the mean for group 2 is larger than for either group 1 or group 3.

Multiple Comparison Test Using Confidence Limits
Using this code for the comparison test:
    MEANS BRAND/TUKEY CLDIFF;
produces a table in which mean differences are compared. For example, the first line tests the difference between the means for groups 2 and 3 (2 minus 3 = 4.340) and reports 95% confidence limits of 0.691 to 7.989. Since this range does not include 0.0, the difference is considered statistically significant at the 0.05 significance level. The *** indicates a significant difference at the 0.05 level for that comparison.

Multiple Comparisons Using p-values
Using PROC GLM instead of PROC ANOVA, and using this code for the comparison test:
    LSMEANS BRAND/ PDIFF;
produces a table that reports the results of the mean comparisons. For example, the comparison of means 1 vs 3 reports a p-value of 0.8524, indicating that the means are NOT significantly different. The comparison of means 2 vs 3 is statistically significant at p=0.0080.
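For context, the LSMEANS statement sits inside a complete PROC GLM step. The sketch below assumes the same ACHE data set, CLASS, and MODEL statements as the PROC ANOVA example; the second LSMEANS statement, with ADJUST=TUKEY, is an illustrative addition rather than part of the slide.

```sas
/* Sketch: pairwise p-values from PROC GLM for the ACHE example. */
PROC GLM DATA=ACHE;
  CLASS BRAND;
  MODEL RELIEF = BRAND;
  LSMEANS BRAND / PDIFF;                /* unadjusted pairwise p-values        */
  LSMEANS BRAND / PDIFF ADJUST=TUKEY;   /* same comparisons, Tukey adjustment  */
RUN;
QUIT;
```

Adjusted p-values are never smaller than the unadjusted ones, so the two tables can lead to different conclusions for borderline comparisons.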

13.2 COMPARING THREE OR MORE REPEATED MEASURES
Repeated measures are observations taken from the same or related subjects over time or in differing circumstances. When there are three or more repeated measures, the corresponding analysis is a repeated measures ANOVA. The hypotheses being tested with repeated measures ANOVA are as follows:
H0: There is no difference among the group means (repeated measures).
Ha: There is a difference among the group means.

Example Syntax for a Repeated Measures ANOVA
    PROC GLM DATA=STUDY;
    CLASS SUBJ DRUG;
    MODEL RESULT = SUBJ DRUG;
    MEANS DRUG/DUNCAN;
    TITLE 'Repeated Measures ANOVA';
    RUN;
    QUIT;
• The CLASS statement indicates grouping variables. In repeated measures, a subject variable is included.
• The MODEL statement indicates that you want to predict RESULT from type of DRUG. SUBJ is included to account for subject differences.

Example Repeated Measures Data
Each subject received each of the 4 drugs (in random order, with a washout period between administrations).
Subj Drug1 Drug2 Drug3 Drug4 1 31 29 17 35 2 15 11 23 3 25 21 19 4 45 5 27

Repeated Measures Data in SAS
The data for the repeated measures analysis are not laid out as in the table. Each line represents one observation, so each subject has 4 records, one for each drug.
    DATA STUDY;
    INPUT SUBJ DRUG RESULT;
    DATALINES;
    1 1 31
    1 2 29
    1 3 17
    1 4 35
    2 1 15
    Etc…
Do the Hands on Example p 320 (AGLM1.SAS).
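If the data arrive in the wide, one-row-per-subject layout, a DATA step can restructure them into the one-record-per-observation layout. The following is a sketch under stated assumptions: the data set names WIDE and STUDY_LONG are hypothetical, and only subject 1 is entered because that is the only complete row shown.

```sas
/* Sketch: reshape wide repeated measures into long format.  */
/* WIDE and STUDY_LONG are hypothetical data set names.      */
DATA WIDE;
  INPUT SUBJ DRUG1-DRUG4;
  DATALINES;
1 31 29 17 35
;
RUN;

DATA STUDY_LONG;
  SET WIDE;
  ARRAY D{4} DRUG1-DRUG4;     /* the four drug columns           */
  DO DRUG = 1 TO 4;
    RESULT = D{DRUG};         /* value for this subject and drug */
    OUTPUT;                   /* write one record per drug       */
  END;
  KEEP SUBJ DRUG RESULT;
RUN;
```

Each input row produces four output records with the variables SUBJ, DRUG, and RESULT, which is the layout the MODEL RESULT = SUBJ DRUG statement expects.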

Results from Repeated Measures ANOVA
The results of interest are in the Type III table. Typically, you are not interested in the SUBJ line in this table (or its p-value). The line of interest is the DRUG line, which tests the hypothesis of interest. In this case p<0.0001, which indicates a significant difference in means for the 4 drugs. Do a post hoc test to determine which drugs are different.

Multiple Comparisons for Repeated Measures ANOVA
This statement provides a multiple comparison test, which is appropriate if the main hypothesis test is significant:
    MEANS DRUG/DUNCAN;
Results indicate that there is NO DIFFERENCE between DRUGS 1 and 2 (means of 26.6 vs 25.8). However, DRUG 4 has the largest (statistically significant) mean at 33.0 and DRUG 3 has the smallest at 16.60.

Graphical Results of a Repeated Measures ANOVA
The graph gives visual confirmation of the multiple comparisons: the line for DRUG 4 is consistently higher than all the others, DRUGS 1 and 2 are too close to call different, and DRUG 3 has the smallest means.

Using LSMEANS for Comparisons (Tukey)
Using this code:
    LSMEANS DRUG/PDIFF ADJUST=TUKEY;
you get a table of adjusted p-values for each pair of drugs. Other common ADJUST= options are BON, DUNNETT, and SCHEFFE. Results indicate that there is NO DIFFERENCE between DRUGS 1 and 2 (p=.97). However, the mean for DRUG 4 is different from that for DRUG 1 (p=0.0147), and so on.
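Placed in context, the statement belongs in the same PROC GLM step used for the repeated measures model. A minimal sketch, assuming the STUDY data set with SUBJ, DRUG, and RESULT as defined earlier in the chapter:

```sas
/* Sketch: Tukey-adjusted pairwise comparisons of DRUG, */
/* with SUBJ in the model to account for subjects.      */
PROC GLM DATA=STUDY;
  CLASS SUBJ DRUG;
  MODEL RESULT = SUBJ DRUG;
  LSMEANS DRUG / PDIFF ADJUST=TUKEY;
RUN;
QUIT;
```

Replacing ADJUST=TUKEY with ADJUST=BON would request Bonferroni-adjusted p-values for the same comparisons.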

13.3 GOING DEEPER: CONTRASTS
At times when you are comparing means across groups in a one-way ANOVA, you may be interested in specific post hoc comparisons. For example, suppose you have a data set consisting of four groups. For some hypothesized reason, you wonder if the average of means 1 and 2 is different from mean 4. Using a CONTRAST statement, you can specify such a comparison. A CONTRAST statement uses the following syntax:
    CONTRAST 'label' indvar effectvalues;

Setting Up a Contrast Statement
For example, to compare GROUP 1 with the combined mean of GROUPS 3 and 4, use:
    CONTRAST '1 vs 3+4' GROUP -1 0 .5 .5;
• '1 vs 3+4' is a label of your choosing.
• GROUP -1 0 .5 .5 is the definition of the contrast.
• The effectvalues (-1, 0, 0.5, 0.5) sum to zero.
• The signs indicate the comparison. The -1 represents Group 1, the 0 Group 2, etc.
• GROUPS 3 and 4 both have coefficients of 0.5, which indicates that their means are combined equally, each contributing a half (0.5) to the value.
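The coefficients can be read as weights on the group means. As a sketch in notation that does not appear on the slide (μ1 through μ4 for the four group means and L for the contrast value), the statement tests:

```latex
L = (-1)\,\mu_1 + 0\,\mu_2 + 0.5\,\mu_3 + 0.5\,\mu_4,
\qquad
H_0:\ L = 0 \iff \mu_1 = \frac{\mu_3 + \mu_4}{2}
```

Group 2 has a coefficient of 0, so it plays no part in this comparison.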

Do Hands On Example p 324 (AGLM CONTRAST.SAS)
    PROC GLM DATA=CONTRAST;
    CLASS GROUP;
    MODEL OBSERVATION=GROUP;
    CONTRAST 'Groups 1 vs 3&4' GROUP -1 0 .5 .5;
    RUN;
    QUIT;
The PROC GLM, CLASS, and MODEL statements are standard ANOVA code. Add one or more CONTRAST statements within PROC GLM.

Contrast Statement Results
The output shows the standard ANOVA results followed by the CONTRAST statement results.

Continue CONTRAST Example
Add these statements:
    CONTRAST 'Drugs 1 vs 3&4 Again' GROUP -2 0 1 1;
    CONTRAST 'Drugs 1&2 vs 3&4' GROUP -.5 -.5 .5 .5;
This adds two more CONTRAST comparisons to the output.
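A plausible reading of the "Again" label is that the coefficients (-2, 0, 1, 1) are exactly twice (-1, 0, 0.5, 0.5), so the two statements test the same hypothesis. Using the usual textbook formula for a contrast sum of squares (the notation is an assumption: c_i are the coefficients, ȳ_i the group sample means, n_i the group sizes, and a any nonzero constant), the scale factor cancels:

```latex
SS_L = \frac{\left(\sum_i c_i \bar{y}_i\right)^2}{\sum_i c_i^2 / n_i},
\qquad
\frac{\left(\sum_i a\,c_i \bar{y}_i\right)^2}{\sum_i (a\,c_i)^2 / n_i}
= \frac{a^2 \left(\sum_i c_i \bar{y}_i\right)^2}{a^2 \sum_i c_i^2 / n_i}
= SS_L
```

Because the F statistic for a contrast is SS_L divided by the mean square error, rescaling the coefficients leaves the F value and p-value unchanged.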

13.4 SUMMARY
This chapter illustrates SAS procedures for comparing three or more means, both in an independent group setting and for repeated measures. In both cases, the chapter includes examples illustrating how to perform post hoc multiple comparisons analysis.
Continue to Chapter 14: ANALYSIS OF VARIANCE, PART II