Review I volunteer in my son’s 2nd grade class on library day. Each kid gets to check out one book. Here are the types of books they picked this week:

1 Review
I volunteer in my son’s 2nd grade class on library day. Each kid gets to check out one book. Here are the types of books they picked this week:
[Table: counts of book type (Astronauts, Ninjas, Ponies, Birds) by gender (Boys, Girls), with row and column totals]
Suppose we want to know whether gender and book type are independent. Which of the following is NOT a correct statement of the null hypothesis?
- The distribution of book preferences is the same for boys and girls.
- Boys like all book types equally, and so do girls.
- Knowing whether a kid is male or female gives no information about his or her likely book preference.
- Knowing a kid’s book preference gives no information about the kid’s gender.

2 Review
I volunteer in my son’s 2nd grade class on library day. Each kid gets to check out one book. Here are the types of books they picked this week:
[Table: counts of book type (Astronauts, Ninjas, Ponies, Birds) by gender (Boys, Girls), with row and column totals]
If book type is independent of gender, how many boys would be expected to pick pony books?
- 2.25
- 2.63
- 3.00
- 3.50

3 Review
Here are the expected frequencies, in red:
[Table: counts of book type (Astronauts, Ninjas, Ponies, Birds) by gender (Boys, Girls), with row and column totals and expected frequencies]
Calculate the χ² statistic for testing independence.
- 2.32
- 5.89
- 9.04
- 9.69
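The two review calculations above (an expected frequency under independence, then the χ² statistic) can be sketched as follows. The counts from the slide's table are not available here, so the 2 × 4 table in the code is hypothetical and will not reproduce any of the answer choices; the function names are illustrative as well.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]


def chi_square(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# HYPOTHETICAL counts (not the slide's data).
# Columns: Astronauts, Ninjas, Ponies, Birds. Rows: Boys, Girls.
observed = [
    [5, 7, 2, 2],
    [3, 1, 6, 4],
]

expected = expected_counts(observed)
chi2 = chi_square(observed)
print("Expected boys choosing pony books:", round(expected[0][2], 2))
print("Chi-square statistic:", round(chi2, 2))
```

For a table with r rows and c columns, the statistic is compared with a χ² distribution on (r - 1)(c - 1) degrees of freedom, which is 3 for a 2 × 4 table.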

4 Non-Parametric Tests 12/4

5 Parametric vs. Non-parametric Statistics
Parametric statistics
- Most common type of inferential statistics: r, t, F
- Make strong assumptions about the population
  - Mathematically fully described, except for a few unknown parameters
- Powerful, but limited to situations consistent with assumptions
When parametric statistics fail
- Assumptions not met
- Ordinal data: assumptions not meaningful
Non-parametric statistics
- Alternatives to parametric statistics
- "Naive" approach: far fewer assumptions about data
- Work in wider variety of situations
- Not as powerful as parametric statistics (when applicable)

6 Assumption Violations
Parametric statistics work only if data obey certain properties
- Normality
  - Shape of population distribution
  - Determines shape of sampling distributions
  - Tells how likely extreme results should be; critical for correct p-values
  - More important with small sample sizes (Central Limit Theorem)
- Homogeneity of variance
  - Variance of groups is equal (t-test or ANOVA)
  - Variance from regression line does not depend on values of predictors
- Linear relationships
  - Pearson correlation cannot recognize nonlinear relationships

7 Assumption Violations
Parametric statistics only work if data obey certain properties
- If these assumptions are true:
  - Population is almost fully described in advance
  - Goal is simply to estimate a few unknown parameters
- If assumptions violated:
  - Parametric statistics will not give correct answer
  - Need more conservative and flexible approach
[Figure: two populations, Normal(μ1, σ²) and Normal(μ2, σ²)]

8 Ordinal Data
- Some variables have ordered values but are not as well-defined as interval/ratio variables
  - Preferences
  - Rankings
  - Nonlinear measures, e.g. money as indicator of value
- Can't do statistics based on differences of scores
  - Mean, variance, r, t, F
- More structure than nominal data
  - Scores are ordered
  - Chi-square goodness of fit ignores this structure
- Want to answer same types of questions as with interval data, but without parametric statistics
  - Are variables correlated?
  - Do central tendencies differ?

9 Non-parametric Tests
- Can use without parametric assumptions and with ordinal data
- Basic idea
  - Convert raw scores to ranks
  - Do statistics on the ranks
- Answer similar questions as parametric tests
- Your job: Understand what each is used for and in what situations

Parametric Test                      Non-parametric Test
Pearson correlation                  Spearman correlation
Independent-samples t-test           Mann-Whitney
Single- or paired-samples t-test     Wilcoxon
Simple ANOVA                         Kruskal-Wallis
Repeated-measures ANOVA              Friedman
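The "convert raw scores to ranks" step shared by all of these tests can be sketched as below. The slides do not say how ties are handled; giving tied scores the average of the positions they occupy is a common convention and is an assumption here, as is the function name.

```python
def average_ranks(scores):
    """Return 1-based ranks (1 = smallest score).

    Tied scores share the mean of the positions they occupy
    (an assumed convention; the slides do not discuss ties).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of scores equal to the one at `pos`.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


print(average_ranks([65, 72, 69, 75, 66, 62, 68]))
print(average_ranks([10, 20, 20, 30]))
```

Once scores are replaced by ranks, the actual distances between scores no longer matter, which is why the same machinery applies to ordinal data.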

10 Spearman Correlation
- Alternative to Pearson correlation
  - Produces correlation between -1 and 1
- Convert data on each variable to ranks
  - For each subject, find rank on X and rank on Y within sample
- Compute Pearson correlation from ranks
- Works for
  - Ordinal data
  - Monotonic nonlinear relationships (consistently increasing or decreasing)

X:  65 72 69 75 66 62 68
Y:  157 194 185 148 163 127 173
RX: 2 6 5 7 3 1 4
RY: 3 7 6 2 4 1 5
[Figures: scatterplots of Y vs. X and of RY vs. RX]
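As a rough check of the procedure on this slide, the sketch below ranks the X and Y values given above and then computes an ordinary Pearson correlation on the ranks. The helper names are illustrative, and the ranking helper assumes no tied scores, which holds for this data.

```python
from math import sqrt


def ranks(scores):
    """1-based ranks, 1 = smallest. Assumes no tied scores."""
    ordered = sorted(scores)
    return [ordered.index(s) + 1 for s in scores]


def pearson(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


x = [65, 72, 69, 75, 66, 62, 68]
y = [157, 194, 185, 148, 163, 127, 173]

rx, ry = ranks(x), ranks(y)
rho = pearson(rx, ry)  # Spearman correlation = Pearson correlation of the ranks

print("RX:", rx)
print("RY:", ry)
print("Spearman correlation:", round(rho, 3))  # 13/28, about 0.464
```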

11 Mann-Whitney Test
- Alternative to independent-samples t-test
  - Do two groups differ?
- Combine groups and rank-order all scores
  - If groups differ, high ranks should be mostly in one group and low ranks in the other
- Test statistic (U) measures how well the groups' ranks are separated
- Compare U to its sampling distribution
  - Is it smaller than expected by chance?
  - Compute p-value in usual way
- Works for
  - Ordinal data
  - Non-normal populations and small sample sizes

Raw scores (A B): 17 11 23 22 9 15 18 16 20 21 14 13 19
Ranks (A B):      7 2 13 12 1 5 8 6 10 11 4 3 9
[Figure: scale for U running from perfect separation to no separation]
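A minimal sketch of the U computation follows. The transcript does not preserve which scores belong to group A and which to group B; the split below assumes the flattened table reads row by row (A, B, A, B, ...), so treat the resulting value as illustrative only. The function name is also an assumption, and ties are not handled.

```python
def mann_whitney_u(group_a, group_b):
    """Return U, the smaller of the two groups' U values. Assumes no tied scores."""
    combined = sorted(group_a + group_b)
    rank_of = {score: i + 1 for i, score in enumerate(combined)}
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(rank_of[s] for s in group_a)
    # U for A counts the (a, b) pairs in which the A score is higher.
    u_a = rank_sum_a - n_a * (n_a + 1) // 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# ASSUMED group membership: scores taken alternately from the flattened table.
group_a = [17, 23, 9, 18, 20, 14, 19]
group_b = [11, 22, 15, 16, 21, 13]

print("U =", mann_whitney_u(group_a, group_b))        # 18 under this assumed split
print("U =", mann_whitney_u([1, 2, 3], [4, 5, 6]))    # 0: perfect separation
```

U ranges from 0 (perfect separation) up to half the number of cross-group pairs (no separation), so small values are evidence against the null hypothesis.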

12 Wilcoxon Test
- Alternative to single- or paired-samples t-test
  - Does median differ from μ0?
  - Does median difference score differ from 0?
- Subtract μ0 from all scores
  - Can skip this step for paired samples or if μ0 = 0
- Rank-order the absolute values
- Sum the ranks separately for positive and negative difference scores
  - If μ > μ0, positive scores should be larger
  - If μ < μ0, negative scores should be larger
  - If sums of ranks are more different than likely by chance, reject H0
- Works for
  - Ordinal data
  - Non-normal populations and small sample sizes

X     X - μ0   Rank
107      7      4
92      -8      5
115     15      9
104      4      3
87     -13      8
102      2      1
97      -3      2
112     12      7
90     -10      6
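The rank sums for this slide's example can be sketched as below. The value μ0 = 100 is not stated on the slide; it is inferred from the table (107 - 7 = 100). Dropping zero differences is a common convention and an assumption here, the function name is illustrative, and tied absolute values are not handled.

```python
def wilcoxon_rank_sums(scores, mu0):
    """Return (sum of ranks for positive differences, sum for negative differences).

    Differences are ranked by absolute value, 1 = smallest.
    Assumes no tied absolute values; zero differences are dropped (assumed convention).
    """
    diffs = [x - mu0 for x in scores if x != mu0]
    by_abs = sorted(diffs, key=abs)
    w_plus = sum(rank for rank, d in enumerate(by_abs, start=1) if d > 0)
    w_minus = sum(rank for rank, d in enumerate(by_abs, start=1) if d < 0)
    return w_plus, w_minus


scores = [107, 92, 115, 104, 87, 102, 97, 112, 90]
mu0 = 100  # inferred from the slide's table, not stated explicitly

w_plus, w_minus = wilcoxon_rank_sums(scores, mu0)
print("Sum of ranks, positive differences:", w_plus)   # 24
print("Sum of ranks, negative differences:", w_minus)  # 21
```

Here the two sums are close (24 vs. 21 out of a total of 45), which is what one would expect if the median is near μ0.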

13 Kruskal-Wallis Test
- Alternative to simple ANOVA
  - Do groups differ?
- Extends Mann-Whitney Test
- Combine groups and rank-order all scores
- Sum ranks in each group
  - If groups differ, then their sums of ranks should differ
- Test statistic (H) essentially measures variance of sums of ranks
- If H is larger than likely by chance, reject null hypothesis that populations are equal
- Works for
  - Ordinal data
  - Non-normal populations and small sample sizes
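The slide gives no data or formula for H, so the sketch below uses the standard textbook form, H = 12 / (N(N + 1)) · Σ(R_j² / n_j) - 3(N + 1), where R_j is the sum of ranks in group j, n_j its size, and N the total number of scores. The three groups are hypothetical, the function name is illustrative, and ties are not handled.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H from raw scores. Assumes no tied scores."""
    combined = sorted(score for group in groups for score in group)
    rank_of = {score: i + 1 for i, score in enumerate(combined)}
    n_total = len(combined)
    # Sum over groups of (rank sum squared / group size).
    weighted = sum(
        sum(rank_of[s] for s in group) ** 2 / len(group) for group in groups
    )
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# HYPOTHETICAL scores for three groups (not from the slides).
separated = [[12, 15, 11], [18, 21, 17], [25, 22, 27]]
interleaved = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

print("H, well-separated groups:", round(kruskal_wallis_h(separated), 2))   # 7.2
print("H, interleaved groups:", round(kruskal_wallis_h(interleaved), 2))    # 0.8
```

With reasonably large groups, H is usually compared with a χ² distribution on (number of groups - 1) degrees of freedom.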

14 Friedman Test
- Alternative to repeated-measures ANOVA
  - Do measurements differ?
- Look at each subject separately and rank-order his/her scores
  - Best to worst for that subject, or favorite to least favorite
- For each measurement, sum ranks from all subjects
  - If measurements differ, then their sums of ranks should differ
- Test statistic (χ²r) essentially measures variance of sums of ranks
  - If larger than likely by chance, reject H0
- Works for
  - Ordinal data
  - Non-normal populations and small sample sizes

Raw scores:
Subject  Morning  Noon  Night
1        27       31    35
2        54       41    48
3        62       59    61
4        37 33 (one value missing)
5        25       21    23
6        56       50    47

Rank totals: Morning 14, Noon 10, Night 12
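Because one raw score for subject 4 is missing from the transcript, the sketch below starts from the rank totals on the slide (14, 10, 12 across 6 subjects) rather than from the raw scores. The formula is the standard textbook form, χ²r = 12 / (n k (k + 1)) · ΣR_j² - 3 n (k + 1), which the slide itself does not state; the function name is illustrative.

```python
def friedman_from_rank_totals(rank_totals, n_subjects):
    """Friedman statistic from per-measurement rank totals (no ties within subjects)."""
    k = len(rank_totals)  # number of measurements per subject
    sum_of_squares = sum(r * r for r in rank_totals)
    return 12 / (n_subjects * k * (k + 1)) * sum_of_squares - 3 * n_subjects * (k + 1)


rank_totals = [14, 10, 12]  # Morning, Noon, Night totals from the slide
n_subjects = 6

chi2_r = friedman_from_rank_totals(rank_totals, n_subjects)
print("Friedman statistic:", round(chi2_r, 2))  # about 1.33
```

If every measurement had the same rank total (12 each here), the statistic would be exactly 0, so larger values reflect more spread among the totals.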

15 Summary

Test                  Replaces                          When Useful                                Approach
Spearman correlation  Pearson correlation               Ordinal data; nonlinear relationships      Rank each variable
Mann-Whitney          Independent-samples t-test        Ordinal data; non-normal + small samples   Rank all scores
Wilcoxon              Single- or paired-samples t-test  Ordinal data; non-normal + small samples   Rank absolute values
Kruskal-Wallis        Simple ANOVA                      Ordinal data; non-normal + small samples   Rank all scores
Friedman              Repeated-measures ANOVA           Ordinal data; non-normal + small samples   Rank all scores for each subject

16 Review
Comparing average score between two groups, on an ordinal-scale variable. What test should you use?
- Friedman
- Mann-Whitney
- Spearman
- Kruskal-Wallis

17 Review
Five subjects each measured in three conditions, on an interval-scale variable with a non-normal distribution. What test should you use?
- Repeated-measures ANOVA
- Kruskal-Wallis
- Friedman
- Spearman

18 Review
Comparing average scores among five groups, on an interval-scale variable with a non-normal distribution subjects per group. What test should you use?
- Kruskal-Wallis
- Friedman
- One-way ANOVA
- Mann-Whitney

