Chapter 10 Nonparametric Techniques
Chapter Outline
– Chi square ($\chi^2$): testing the observed versus the expected
– Procedures for rank-order data
– Correlation
– Differences among groups
Analyzing Data Appropriately
Behavioral scientists believe most data are normally distributed: God loves the normal curve! Is that true? Micceri (1989) said no for large data sets in psychology. Do we as scientists look carefully at the distribution of our data? God may not love the normal curve!
Parametric Statistical Procedures
Are parametric statistical procedures sensitive to nonnormality? Substantial evidence exists that they are not as robust to violations of the normality assumption as once thought.
Chi Square: Testing the Observed Versus the Expected
Formula for chi square:
$\chi^2 = \sum \frac{(O - E)^2}{E}$
where O = observed frequency and E = expected frequency
Coach Rabbitfoot and His Tennis Courts
Court number | 1 | 2 | 3 | 4 | Total
Observed losses (O) | | | | |
Expected losses (E) | | | | |
(O – E) | –6 | +4 | –8 | +10 |
(O – E)^2 | | | | |
(O – E)^2/E | | | | |
$\chi^2 = \sum (O - E)^2 / E = 7.19$
df = number of cells – 1 = number of courts – 1 = 4 – 1 = 3
$\chi^2(3) = 7.19$, p > .05, not significant
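As a rough check on the arithmetic, the Python sketch below recomputes the statistic from the (O – E) deviations shown for the four courts. The expected frequency of 30 losses per court is an assumption, since the value does not survive in the table; it is the equal-frequency value that reproduces the reported 7.19 within rounding. The critical value 7.815 is the standard tabled chi-square value for df = 3 at alpha = .05.

```python
# Deviations (O - E) for courts 1-4, taken from the slide.
deviations = [-6, 4, -8, 10]

# ASSUMPTION: equal expected losses on every court; 30 is not stated in the text.
expected_per_court = 30


def chi_square(devs, expected):
    """Return the sum of (O - E)^2 / E over all cells."""
    return sum(d ** 2 / e for d, e in zip(devs, expected))


chi2 = chi_square(deviations, [expected_per_court] * len(deviations))
df = len(deviations) - 1        # number of courts - 1
CRITICAL_05_DF3 = 7.815         # tabled chi-square critical value, df = 3, alpha = .05

print(f"chi2({df}) = {chi2:.2f}, significant at .05: {chi2 > CRITICAL_05_DF3}")
```

Under that assumption the sum is about 7.20; rounding each term to two decimals before adding gives the 7.19 reported above. Either way the value falls short of the critical value, which agrees with the "not significant" conclusion.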
Contingency Tables
Chi square with two or more categories and two or more groups
Athletes and nonathletes respond to an ethical statement about whether a player who traps a fly ball in baseball should tell the umpire. Both groups respond on a Likert-type scale: 3 = Agree, 2 = No opinion, 1 = Disagree
Working Out the Answer
Observed responses | Agree | No opinion | Disagree | Total
Athletes | | | |
Nonathletes | | | |
Total | | | |
Expected responses | Agree | No opinion | Disagree
Athletes | | |
Nonathletes | | |
$\chi^2 = \frac{(-34)^2}{64} + \frac{(34)^2}{80} + \frac{(-10)^2}{56} + \frac{(10)^2}{70} + \frac{(44)^2}{80} + \frac{(-44)^2}{100} = 79.29$
df = (r – 1)(c – 1) = (2 – 1)(3 – 1) = 2, p < .01
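The worked formula can be checked with a short Python sketch. It starts from the expected frequencies and (O – E) deviations that appear in the formula terms, rebuilds the observed counts they imply, and recomputes the expected values from the margins. The assignment of each term to a cell (athletes listed first within each response category) is an assumption, as are the signs, which are chosen so that the deviations in every row and column sum to zero.

```python
# Expected frequencies and deviations (O - E) read from the terms of the formula.
# ASSUMPTION: terms are ordered athletes, nonathletes within Agree, No opinion, Disagree.
expected = [[64, 56, 80],      # athletes
            [80, 70, 100]]     # nonathletes
deviations = [[-34, -10, 44],
              [34, 10, -44]]

# Observed counts implied by O = E + (O - E).
observed = [[e + d for e, d in zip(e_row, d_row)]
            for e_row, d_row in zip(expected, deviations)]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency for each cell: row total * column total / N.
expected_check = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected_check)
           for o, e in zip(o_row, e_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)

print("observed:", observed)
print("row totals:", row_totals, "column totals:", col_totals)
print(f"chi2({df}) = {chi2:.2f}")
```

The margins reproduce the expected values used in the formula, which suggests the cell assignment is at least internally consistent.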
Puri and Sen Rank-Order General Linear Method
– This method maintains good power.
– This method protects against Type I error.
– Change data to ranks.
– Use any of the standard parametric procedures for ranked scores using SPSS or SAS.
General Linear Model (GLM)
Basis for procedures of
– regression: r, R, $R_c$
– differences: t, ANOVA, MANOVA
$Y = BX + E$
– Y = vector of scores on p DVs
– X = vector of scores on q IVs
– B = p × q matrix of regression coefficients
– E = vector of errors
Calculating the Test Statistic for Ranked Data
Instead of the parametric test statistic (t or F), calculate L.
$L = (N - 1)r^2$
df = pq
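For the simplest case (one predictor, one outcome, so pq = 1), the rank-then-compute sequence might be sketched in Python as follows. The data values and the function names `average_ranks`, `r_squared`, and `l_statistic` are hypothetical and chosen only for illustration; giving tied scores the mean of their ranks is a common convention and is assumed here.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def l_statistic(n, r2):
    """L = (N - 1) * r^2, computed on ranked data."""
    return (n - 1) * r2


# HYPOTHETICAL data, for illustration only (not from the cited studies).
skinfold = [12.0, 15.5, 9.0, 22.0, 18.5, 15.5]
percent_fat = [21.0, 24.0, 18.0, 30.0, 29.0, 26.0]

rank_x = average_ranks(skinfold)
rank_y = average_ranks(percent_fat)
r2 = r_squared(rank_x, rank_y)
L = l_statistic(len(rank_x), r2)
df = 1 * 1  # p dependent variables times q independent variables

print(f"r^2 on ranks = {r2:.3f}, L({df}) = {L:.2f}")
```

In practice the ranking and the r² would come from a standard package, leaving only the final multiplication by N – 1 to do separately.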
Example From Regression
Can skinfold measurements be used to predict percentage fat (determined by underwater weighing) in women grouped by ethnicity?
Data from K.T. Thomas et al., 1997.
First Example From Regression (Distribution)
Thigh
[Stem-and-leaf plot; stem width: 10.0, each leaf: 1 case]
N = 79, M (mm) = [missing], SD = 8.80, skewness = 0.67, kurtosis = 0.28
Second Example From Regression (Distribution)
Percent fat from hydrostatic weighing
[Stem-and-leaf plot; stem width: 10.0, each leaf: 1 case]
N = 79, M = [missing], SD = 7.48, skewness = 0.67, kurtosis = 0.09
Multiple Regression on Original Data
Step | Variable | R | R2 | R2 | b | df | F-to-enter
1 | Subscap SF | | | | | , | *
2 | Calf SF | | | | | , | *
3 | Abdom SF | | | | | ,75 | 8.86*
4 | Thigh SF | .80 | .64 | | –.289 | 4,74 | 6.26*
*p < .05
F(4, 74) = 32.87, p < .001, for linear composite of predictors
Multiple Regression Using Ranked Data
Example Using Factorial ANOVA
Do boys and girls differ in push-up scores in grades 4, 5, and 6?
Data from J.K. Nelson et al., 1991.
Stem-and-Leaf, Mean, Standard Deviation, Skewness, and Kurtosis for Push-Up Scores for Boys and Girls in Grades 4, 5, and 6
[Stem-and-leaf plot; stem width: 10, each leaf: 1]
N = 180, M = [missing], SD = [missing], skewness = 0.41, kurtosis = 0.70
3 × 2 ANOVA Results for Original Data
Original data
– Grade: F(2, 174) = 7.30, p < .001
– Sex: F(1, 174) = 17.48, p < .001
– Interaction: not significant
3 × 2 ANOVA Results for Ranked Data
Ranked data
– Grade: L(2) = 11.67, p < .005
– Sex: L(1) = 13.21, p < .001
– Interaction: not significant
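The reported probabilities can be checked with a short Python sketch, on the assumption (not stated on the slide) that L is referred to a chi-square distribution with the df shown in parentheses. The helper `chi_square_sf` is a hypothetical name; it uses the exact closed forms that exist for df = 1 and df = 2, so no statistics library is needed.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail probability of a chi-square variate, for df = 1 or 2 only."""
    if df == 1:
        # P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        # Chi-square with 2 df is exponential with mean 2.
        return math.exp(-x / 2)
    raise ValueError("closed form implemented only for df = 1 or 2")


# L values from the ranked-data ANOVA above.
# ASSUMPTION: L is evaluated against chi-square with the df in parentheses.
p_grade = chi_square_sf(11.67, 2)
p_sex = chi_square_sf(13.21, 1)

print(f"Grade: L(2) = 11.67, p = {p_grade:.4f}")
print(f"Sex:   L(1) = 13.21, p = {p_sex:.5f}")
```

Both probabilities fall below the thresholds reported for the ranked data (p < .005 for grade, p < .001 for sex), so the chi-square reading is at least consistent with the slide.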
Example Using Repeated-Measures ANOVA
Does VO₂ differ by walking speeds in older and younger participants?
Data from P.E. Martin, D.E. Rothstein, & D.D. Larish, 1992, “Effects of age and physical activity status on the speed-aerobic demand relationship of walking,” Journal of Applied Physiology, 73:
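Characteristics of VO₂ at Five Walking Speeds
[Table: miles per hour, M, SD, median, skewness, and kurtosis of VO₂ at each of the five walking speeds; the only surviving entry is a kurtosis of –0.13.]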
Summary Tables of Repeated-Measures ANOVAs for Original and Ranked Data
Original data
Source | Pillai’s trace | df | F | Significance
Age | .14 (r² = SS_Bet/SS_Tot) | 1, | |
Speed | .98 | 4, | |
Age × Speed | .22 | 4, | |
Huynh-Feldt epsilon = .65
Ranked data
Source | Pillai’s trace | df | L | Significance
Age | .14 (r² = SS_Bet/SS_Tot) | 1 | 8.12 | <.01
Speed | | | | <.001
Age × Speed | | | | <.05
Huynh-Feldt epsilon = .77
Example Using Factorial MANOVA
Do four ethnic groups at two age levels differ on two skinfold measurements and hip-to-waist ratio?
Data from K.T. Thomas et al., 1997.
Using MANOVA on Original and Ranked Data
Data are for four ethnic groups (African American, European American, Mexican American, and Native American) at two age levels (20–30 and 40–50 years). They include the previously reported data on abdomen and calf skinfolds and add a third dependent variable, hip-to-waist ratio.
4 (Ethnic Group) × 2 (Age Level) MANOVA on Three Dependent Variables
Original data
– Ethnic group: F(3, 152) = 5.64, p < .0001
– Age level: F(3, 152) = 7.86, p < .0001
– Interaction: not significant
Ranked data (Pillai’s trace = R²)
– Ethnic group: L(3) = 22.54, p < .0001
– Age level: L(9) = 41.86, p < .0001
– Interaction: not significant
Applications to GLM
These procedures are appropriate for all GLM models.
– Regression: Pearson r, multiple R, canonical R ($R_c$)
– ANOVA: t, simple and factorial ANOVA (including repeated measures), ANCOVA
– Multivariate techniques: discriminant analysis, MANOVA (including repeated measures), MANCOVA
Summary
Are data from physical activity normally distributed? If not, the researcher can change the data to ranks and run the analysis in a standard statistical package; the only additional calculation is the L statistic.