Non-parametric Analysis of the Variance in SAS® By: Hend Aljobaily
Hend Aljobaily is a doctoral student in the Applied Statistics and Research Methods Department at the University of Northern Colorado. Hend Aljobaily is interested in the area of Multivariate Non-parametric Analysis.
NON-PARAMETRIC APPROACH
- Distribution-free tests
- Violation of assumptions, especially the assumption of normality
- Ordinal outcomes (ranks)
- Outliers
NON-PARAMETRIC ONE-WAY ANALYSIS OF THE VARIANCE
Parametric tests:
- ANOVA
- MANOVA
Non-parametric tests:
- Kruskal-Wallis (KW)
- Multivariate Kruskal-Wallis (MKW)
NON-PARAMETRIC ONE-WAY ANOVA – (Kruskal-Wallis)
- Rank all sampled observations across all groups from 1 to $N$.
- Assign the average rank to any tied values.
Test statistic:
When ties are present, the formula is given by:
$$T = (N-1)\,\frac{\sum_{i=1}^{k} n_i \left(\bar{r}_{i\cdot} - \bar{r}\right)^2}{\sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(r_{ij} - \bar{r}\right)^2}$$
When no ties are present, the formula is given by:
$$T = \frac{12}{N(N+1)} \sum_{i=1}^{k} n_i\, \bar{r}_{i\cdot}^{\,2} - 3(N+1)$$
where $N$ represents the total number of observations, $k$ represents the number of groups, $n_i$ represents the number of observations in group $i$, $\bar{r}_{i\cdot}$ represents the average rank of all observations in group $i$, $r_{ij}$ represents the rank of observation $j$ in group $i$, and $\bar{r}$ represents the average of all $r_{ij}$.
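As a small worked check of the two formulas, consider a hypothetical sample (invented for illustration) with $k = 3$ groups of $n_i = 3$ observations and no ties, whose ranks happen to fall as $\{1,2,3\}$, $\{4,5,6\}$, and $\{7,8,9\}$. Then $N = 9$, the group mean ranks are $2$, $5$, and $8$, and $\bar{r} = (N+1)/2 = 5$. The no-ties formula gives
$$T = \frac{12}{9 \cdot 10}\,\bigl(3 \cdot 2^2 + 3 \cdot 5^2 + 3 \cdot 8^2\bigr) - 3 \cdot 10 = \frac{12}{90} \cdot 279 - 30 = 7.2,$$
and the general formula agrees:
$$T = (9-1)\,\frac{3(2-5)^2 + 3(5-5)^2 + 3(8-5)^2}{\sum_{r=1}^{9}(r-5)^2} = 8 \cdot \frac{54}{60} = 7.2.$$
Under the usual large-sample chi-square approximation with $k - 1 = 2$ degrees of freedom, this corresponds to a p-value of $e^{-7.2/2} \approx 0.027$.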
SAS® PROCEDURE
    PROC NPAR1WAY DATA=Data;
       CLASS IV1;
       VAR DV1;
    RUN;
The CLASS statement defines the grouping variable. The VAR statement defines the variable of interest.
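The call above can be tried on a self-contained data set. The following is a minimal sketch, not taken from the original slides: the data set name Example and its nine values are hypothetical, and the WILCOXON option is added as an assumption about what is wanted, so that the output is limited to the Wilcoxon-score analysis that contains the Kruskal-Wallis test.

```sas
/* Hypothetical data: 3 groups (IV1) with 3 observations each */
data Example;
   input IV1 DV1;
   datalines;
1 2.1
1 3.4
1 1.9
2 4.5
2 5.1
2 3.9
3 6.2
3 5.8
3 7.0
;
run;

/* WILCOXON restricts the output to Wilcoxon scores,
   which includes the Kruskal-Wallis test for 3+ groups */
proc npar1way data=Example wilcoxon;
   class IV1;   /* grouping variable */
   var DV1;     /* variable of interest */
run;
```

The Kruskal-Wallis Test table in the output reports the chi-square statistic, its degrees of freedom (k − 1 = 2 for these three groups), and the asymptotic p-value.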
SAS® OUTPUT
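NON-PARAMETRIC ONE-WAY MANOVA – (Multivariate Kruskal-Wallis)
- Rank all sampled observations on each variable separately, across all groups, from 1 to $n$.
- Assign the average rank to any tied values.
Test statistic:
$$W^2 = \sum_{i=1}^{k} n_i\, U_i' V^{-1} U_i$$
where
$$U_i = \left(\bar{R}_{i\cdot 1} - m, \ldots, \bar{R}_{i\cdot p} - m\right)'$$
measures the distance between the mean vector of ranks for the $i$th group and the overall mean rank $m$,
$$\bar{R}_{i\cdot l} = \frac{1}{n_i}\sum_{j=1}^{n_i} R_{ijl}, \qquad m = \frac{n+1}{2},$$
and
$$V = \frac{1}{n-1} \sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(R_{ij} - m\,1_p\right)\left(R_{ij} - m\,1_p\right)'$$
estimates the covariance matrix of the ranks, pooled across all groups. Here $n$ is the total number of observations, $p$ is the number of dependent variables, $R_{ijl}$ is the rank of observation $j$ in group $i$ on variable $l$, $R_{ij}$ is the $p$-vector of those ranks, and $1_p$ is a $p$-vector of ones.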
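To make the definition of $W^2$ concrete, the steps can be sketched directly with PROC RANK and PROC IML. This is an illustration under stated assumptions, not the KWMULT macro's own code: it assumes SAS/IML is licensed, the data set Example and its values are hypothetical, and the p-value uses the large-sample chi-square approximation with p(k − 1) degrees of freedom, which the slides do not state.

```sas
/* Hypothetical data: 3 groups (IV1), 2 dependent variables */
data Example;
   input IV1 DV1 DV2;
   datalines;
1 2.1 10.5
1 3.4 12.0
1 1.9 11.2
2 4.5 13.1
2 5.1 12.7
2 3.9 14.0
3 6.2 15.3
3 5.8 13.8
3 7.0 16.1
;
run;

/* Rank each variable separately over the combined sample;
   TIES=MEAN assigns the average rank to tied values */
proc rank data=Example out=Ranked ties=mean;
   var DV1 DV2;
   ranks R1 R2;
run;

proc iml;
   use Ranked;
   read all var {R1 R2} into R;   /* n x p matrix of ranks */
   read all var {IV1} into g;     /* n x 1 group labels */
   close Ranked;

   n = nrow(R);
   p = ncol(R);
   m = (n + 1) / 2;               /* overall mean rank */
   C = R - m;                     /* ranks centered at m */
   V = (t(C) * C) / (n - 1);      /* covariance matrix of the ranks */
   Vinv = inv(V);

   groups = unique(g);            /* row vector of distinct group labels */
   k = ncol(groups);
   W2 = 0;
   do i = 1 to k;
      idx = loc(g = groups[i]);   /* rows belonging to group i */
      Ci = C[idx, ];
      Ui = Ci[:, ];               /* 1 x p vector of centered mean ranks */
      W2 = W2 + ncol(idx) * (Ui * Vinv * t(Ui));
   end;

   df = p * (k - 1);
   pvalue = 1 - probchi(W2, df);  /* asymptotic approximation (assumed) */
   print W2 df pvalue;
quit;
```

With samples this small the chi-square approximation is rough, which is one reason a permutation test such as the one offered by the macro can be preferable.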
SAS® PROCEDURE
    %include 'macro file path\KWMULT.sas';
    %KWMULT (
       DATA = Data,
       GROUP = IV1,
       VARIATE = DV1 DV2,
       PRNTVEC = 0,
       RUNPERM = 1);
- The GROUP parameter defines the grouping variable.
- The VARIATE parameter defines the variables of interest (the dependent variables).
- The PRNTVEC parameter controls printing of the distribution of the statistic: 1 displays the printout and 0 suppresses it.
- The RUNPERM parameter controls the permutation test: 1 runs and displays the permutation test and 0 suppresses it.
SAS® OUTPUT
LIMITATIONS
- The non-parametric analysis of the variance presented here can only be applied to one-factor models
- No built-in procedure in SAS® for non-parametric multivariate data
Contact Information
Name: Hend Aljobaily
Company: University of Northern Colorado
City/State: Greeley, Colorado
Phone: 970-301-9234
Email: hend.aljobaily@unco.edu, hend.aljobaily@hotmail.com