Introduction to SAS Essentials Mastering SAS for Data Analytics


1 Introduction to SAS Essentials Mastering SAS for Data Analytics
Alan Elliott and Wayne Woodward SAS ESSENTIALS -- Elliott & Woodward

2 Chapter 15: NONPARAMETRIC ANALYSIS

3 LEARNING OBJECTIVES
• To be able to use SAS® procedures to compare two independent samples (Wilcoxon-Mann-Whitney)
• To be able to use SAS procedures to compare k independent samples (Kruskal-Wallis)
• To be able to use SAS procedures to compare two dependent (paired) samples
• To be able to use SAS procedures to compare k dependent samples (Friedman's test)
• To be able to use SAS procedures to perform multiple comparisons following a significant Kruskal-Wallis test (macro)

4 Nonparametric Tests
An assumption for many statistical tests, such as the t-tests and ANOVA, is that the data are normally distributed. When this normality assumption is suspect or cannot be met, there are alternative techniques for analyzing the data. Statistical techniques based on an assumption that data are distributed according to some parameterized distribution (such as the normal distribution) are referred to as parametric analyses. Statistical techniques that do not rely on this assumption are called nonparametric procedures. This chapter illustrates several nonparametric techniques.

5 15.1 COMPARING TWO INDEPENDENT SAMPLES USING NPAR1WAY
A nonparametric test can be used to compare two independent groups when you cannot make the assumptions associated with the t-test. The primary nonparametric test for comparing two groups discussed in this chapter is the Wilcoxon (sometimes called the Wilcoxon-Mann-Whitney) test. The hypotheses tested are as follows:
H0: The two groups have the same distribution (they come from the same population).
Ha: The two groups do not have the same distribution (they come from different populations).

6 PROC NPAR1WAY
The SAS procedure for testing these hypotheses is PROC NPAR1WAY. Simplified syntax for NPAR1WAY is as follows:
PROC NPAR1WAY <Options>;

7 Table 15.1 Common Options for PROC NPAR1WAY
Option | Explanation
DATA = dataname | Specifies which data set to use.
WILCOXON | Limits output to Wilcoxon-type tests.
MEDIAN | Requests the median test.
VW | Requests the Van der Waerden test.
NOPRINT | Suppresses output.
Common Statements for PROC NPAR1WAY
CLASS vars; | Specifies grouping variable(s).
VAR vars; | Specifies dependent variable(s).
EXACT | Requests exact tests.
BY, FORMAT, LABEL, WHERE | These statements are common to most procedures, and may be used here.
Do Hands On Example p 348 (ANPAR1.SAS)

8 Code for NPAR1WAY (Two sample test)
PROC NPAR1WAY WILCOXON;
   CLASS BRAND;
   VAR HEIGHT;
   EXACT;
   TITLE 'Compare two groups';
RUN;
The WILCOXON option limits the output to what we want. The procedure performs a Wilcoxon-Mann-Whitney test, and the EXACT statement includes exact test results in the output.
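The slide's code relies on whichever data set was created most recently. A self-contained sketch is shown below. It assumes a hypothetical data set named FERT whose BRAND and HEIGHT values are invented purely for illustration (the Hands On Example uses its own data in ANPAR1.SAS).

```sas
/* Hypothetical data: two brands, seven invented heights each */
DATA FERT;
   INPUT BRAND $ HEIGHT @@;
   DATALINES;
A 20.0 A 23.6 A 21.4 A 19.8 A 22.1 A 20.7 A 21.9
B 25.1 B 24.3 B 26.0 B 22.8 B 25.7 B 24.9 B 26.4
;
RUN;

PROC NPAR1WAY DATA=FERT WILCOXON;
   CLASS BRAND;        /* grouping variable with two levels */
   VAR HEIGHT;         /* response variable that is ranked  */
   EXACT WILCOXON;     /* request exact Wilcoxon p-values   */
   TITLE 'Compare two groups';
RUN;
```

Naming the data set with DATA= is optional, but it keeps the analysis from silently picking up a different data set created later in the same session.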

9 Results of NPAR1WAY for two sample comparison
These statistics are based on the rank sum scores for each group rather than on means. The output reports the Z two-sided p-value (p=0.0054), the t two-sided p-value (p=0.0147), and the exact two-sided p-value (p=0.0025). These are slightly different ways of testing the hypothesis. In this case they all lead to the same conclusion (if the criterion for rejection is p<0.05).

10 Graphical Results
The graph provides visual evidence that the groups are different and that the distribution of SCORES in group B tends to be higher than in group A.

11 15.2 COMPARING k INDEPENDENT SAMPLES (KRUSKAL-WALLIS)
If more than two independent groups are being compared using nonparametric methods, SAS uses the Kruskal-Wallis (KW) test to compare the groups. The hypotheses being tested are as follows:
H0: There is no difference among the distributions of the groups.
Ha: There are differences among the distributions of the groups.
Do the Hands On Example p 351 (ANPAR2.SAS)

12 Code for a Kruskal-Wallis Test
PROC NPAR1WAY WILCOXON;
   CLASS GROUP;
   VAR WEIGHT;
   TITLE 'Four group analysis';
RUN;
There is not much difference from the two-sample code. Because the GROUP variable has four groups in this case, the analysis becomes a Kruskal-Wallis test. Note that there is no EXACT statement: if you include it, the calculation takes a long time.
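If an exact-style p-value is still wanted for the multi-group case, one option is a Monte Carlo estimate rather than the full exact computation. The following is only a sketch: it assumes the data are in a data set named NPAR with variables GROUP and WEIGHT, and the sample count and seed are arbitrary choices.

```sas
/* Sketch only: assumes a data set NPAR with variables GROUP and WEIGHT */
PROC NPAR1WAY DATA=NPAR WILCOXON;
   CLASS GROUP;
   VAR WEIGHT;
   /* MC asks for a Monte Carlo estimate of the exact p-value. */
   /* N= sets the number of samples, SEED= makes it repeatable. */
   EXACT WILCOXON / MC N=10000 SEED=12345;
   TITLE 'Four group analysis with Monte Carlo exact test';
RUN;
```

The estimate comes with a confidence interval for the p-value, so a larger N= narrows that interval at the cost of run time.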

13 Results for Kruskal-Wallis Test
These results show the rank sums table on which the test statistic is based. The test statistic, which is a chi-square statistic, appears in the Kruskal-Wallis table. In this case, there is evidence that the groups are different (p<.0001).

14 15.4 COMPARING k DEPENDENT SAMPLES (FRIEDMAN'S TEST)
When your data include more than two repeated measures and the normality assumption is questioned, you can perform a Friedman procedure to test the following hypotheses:
H0: The distributions are the same across the repeated measures.
Ha: There are some differences in distributions across the repeated measures.
Although no procedure in SAS performs a Friedman's test directly, you can calculate the statistic needed to perform the test using PROC FREQ.

15 Friedman’s Analysis Example
Recall the repeated measures example in Chapter 13, in which four drugs were given to five patients in random order. The data are as follows:
Subj Drug1 Drug2 Drug3 Drug4 1 31 29 17 35 2 15 11 23 3 25 21 19 4 45 5 27
Do Hands On Example p 355 (AFRIEDMAN.SAS)

16 Code for Friedman’s Analysis
DATA TIME;
   INPUT SUBJ DRUG OBS;
   DATALINES;
Etc…
TITLE "Friedman Analysis";
PROC FREQ;
   TABLES SUBJ*DRUG*OBS / CMH2 SCORES=RANK NOPRINT;
RUN;
This analysis looks like a three-way contingency table, where subject (SUBJ) is the first factor listed. The options for this analysis provide the results for a Friedman’s test.
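Because the slide elides the data lines, a complete run can be sketched with stand-in values. In the example below the observations are invented for illustration and are not the book's data. Each record holds one subject, one drug, and one reading.

```sas
/* Hypothetical long-format data: SUBJ, DRUG, OBS (invented values) */
DATA TIME;
   INPUT SUBJ DRUG OBS @@;
   DATALINES;
1 1 12  1 2 15  1 3  9  1 4 20
2 1 14  2 2 13  2 3 10  2 4 22
3 1 11  3 2 16  3 3  8  3 4 19
4 1 13  4 2 17  4 3 12  4 4 24
5 1 10  5 2 14  5 3  7  5 4 18
;
RUN;

TITLE "Friedman Analysis";
PROC FREQ DATA=TIME;
   /* SUBJ is the stratification factor, so ranks are taken within subject */
   TABLES SUBJ*DRUG*OBS / CMH2 SCORES=RANK NOPRINT;
RUN;
```

In the Cochran-Mantel-Haenszel table, the "Row Mean Scores Differ" line is the one that corresponds to Friedman's chi-square.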

17 Results for Friedman’s Test
The output reports the Friedman chi-square with three degrees of freedom and its p-value. Because the p-value is <0.05, you would conclude that there is a difference in the distributions across DRUGs.
NOTES: The TABLES statement in PROC FREQ requests a three-way table. Its options include CMH2, which requests Cochran-Mantel-Haenszel statistics, and SCORES=RANK, which indicates that the analysis is to be performed on ranks. The NOPRINT option suppresses the output of the actual summary table because it is quite large and provides nothing needed for this analysis.

18 Multiple Comparisons?
As with the KW test, SAS provides no built-in follow-up multiple comparison test for this procedure. If you reject the null hypothesis, you can perform comparisons for each drug pair using the Wilcoxon test on the differences as discussed in the earlier section on paired comparisons. As in the KW multiple comparisons discussed earlier, you should adjust your significance level using the Bonferroni technique. For example, in this case with four groups (measured in repeated readings) you would perform six pairwise comparisons, and for an overall 0.05 level test you would use the 0.05/6 = 0.0083 level of significance for each individual pairwise comparison.
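The pairwise follow-up can be sketched as below, assuming the readings are arranged with one row per subject. The data set name DRUGWIDE, the difference variables D12 through D34, and all data values are hypothetical. Eight invented subjects are used so that the signed rank test is able to reach the adjusted level.

```sas
/* Hypothetical wide-format data: one row per subject (invented values) */
DATA DRUGWIDE;
   INPUT SUBJ DRUG1 DRUG2 DRUG3 DRUG4;
   /* One difference variable for each of the six drug pairs */
   D12 = DRUG1 - DRUG2;
   D13 = DRUG1 - DRUG3;
   D14 = DRUG1 - DRUG4;
   D23 = DRUG2 - DRUG3;
   D24 = DRUG2 - DRUG4;
   D34 = DRUG3 - DRUG4;
   DATALINES;
1 12 15  9 20
2 14 13 10 22
3 11 16  8 19
4 13 17 12 24
5 10 14  7 18
6 15 18 11 23
7 12 13  9 21
8 16 19 10 25
;
RUN;

/* Tests for Location reports the signed rank p-value for each difference */
PROC UNIVARIATE DATA=DRUGWIDE;
   VAR D12 D13 D14 D23 D24 D34;
RUN;

/* Bonferroni-adjusted level for six comparisons, written to the log */
DATA _NULL_;
   ALPHA_ADJ = 0.05 / 6;
   PUT 'Compare each signed rank p-value with ' ALPHA_ADJ 6.4;
RUN;
```

A pair is declared different only when its signed rank p-value falls below the adjusted level rather than below 0.05.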

19 15.5 GOING DEEPER: NONPARAMETRIC MULTIPLE COMPARISONS
Better Multiple Comparisons: The following Hands-on Example illustrates how a procedure written in SAS code (called a macro) can be used to perform an analysis not included in the SAS program. This example utilizes a SAS macro implementation of a multiple comparison test (Nemenyi [Tukey-type] or Dunn's test) to be used along with SAS NPAR1WAY (see Elliott and Hynan, 2007).
Do Hands On Example p 357 (ANPAR3.SAS)

20 Using a Macro to do Multiple Comparisons
%INCLUDE 'C:\SASDATA\KW_MC.SAS';
DATA NPAR;
   INPUT GROUP WEIGHT;
   DATALINES;
Etc…
;
RUN;
The SAS macro used in this analysis is in a file named KW_MC.SAS. The %INCLUDE statement includes the code from this macro in the current program when it is run. It is important that the path (C:\SASDATA) for this macro is correct; otherwise the macro will not work. The DATA step holds the data for the example, a Kruskal-Wallis analysis.
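Since a wrong path is the most likely reason for the macro to fail, one way to make the location easy to change is to keep it in a macro variable. This is only a sketch: the variable name MACROPATH is an assumption introduced here and is not part of KW_MC.SAS.

```sas
/* Assumed folder: change it to wherever KW_MC.SAS is stored */
%LET MACROPATH = C:\SASDATA;

/* Double quotes are needed so the macro variable is resolved */
%INCLUDE "&MACROPATH\KW_MC.SAS";
```

If the file cannot be found, SAS writes an error to the log at the %INCLUDE statement, which is the first place to look before the macro call itself.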

21 A Sneak Peek at the Macro
The macro is too long and complicated to explain here; see the section on macros in Chapter 7. Here is a snippet of the macro code from KW_MC.SAS. You can see that it begins by performing the Kruskal-Wallis test. Open the KW_MC.SAS file to see more of the code.
%macro KW_MC(source=, groups=, obsname=, gpname=, sig=);
* PERFORM THE STANDARD KRUSKAL WALLIS TEST;
PROC NPAR1WAY data=&DATANAME wilcoxon;
   output out=KW_MC_TMP5;
   CLASS &gpname;
   VAR &OBSNAME;
RUN;
* Rank the input data from the source file;
proc sort data=&source; by &gpname; run;
Etc…
You don’t need to understand the inner workings of the macro to use it.

22 Calling the KW_MC Macro
The macro requires information that you define in several %LET statements. They tell the macro about your data set and the analysis you want to perform.
%LET NUMGROUPS=4;
%LET DATANAME=NPAR;
%LET OBSVAR=WEIGHT;
%LET GROUP=GROUP;
%LET ALPHA=0.05;
TITLE 'Kruskal-Wallis Multiple Comparisons';
***************************************************
*invoke the KW_MC macro                           *
***************************************************;
%KW_MC(source=&DATANAME, groups=&NUMGROUPS, obsname=&OBSVAR, gpname=&GROUP, sig=&alpha);
The last statement calls the macro and sends it the information needed to perform the analysis.

23 Results from KW_MC.SAS Macro
Results include the standard output for an NPAR1WAY Kruskal-Wallis analysis, plus a multiple comparison test for the 4 groups produced by the macro. In the “Conclude” column, a “Reject” indicates that the comparison is significant at the 0.05 significance level. For example, groups 3 and 1 are found to be significantly different.

24 15.6 SUMMARY
This chapter introduces several nonparametric alternatives to standard parametric analyses. These procedures are useful when the normality assumption is questionable and particularly important when sample sizes are small.
Continue to Chapter 16: Logistic Regression

26 These slides are based on the book:
Introduction to SAS Essentials Mastering SAS for Data Analytics, 2nd Edition
By Alan C. Elliott and Wayne A. Woodward
Paperback: 512 pages
Publisher: Wiley; 2nd edition (August 3, 2015)
Language: English
These slides are provided for you to use to teach SAS using this book. Feel free to modify them for your own needs. Please send comments about errors in the slides (or suggestions for improvements) to the authors. Thanks.

