Chapter 13: Assumptions Underlying Parametric Statistical Techniques

1 Assumptions Underlying Parametric Statistical Techniques

2 Parametric Statistics
- We have been studying parametric statistics.
- They include estimations of mu and sigma, correlation, t tests and F tests.

3 Five Assumptions
To validly use parametric statistics, we make
- two research assumptions;
- two assumptions about the type of the distributions in the samples;
- and one assumption about the kind of numbering system that we are using.

4 Research Assumptions
- Subjects have to be randomly selected from the population.
- Experimental error is randomly distributed across samples in the design.
(We will not discuss these any further.)

5 Distribution Assumptions
- The distribution of sample means fits a normal curve.
- There is homogeneity of variance (checked using F MAX).

6 Assumptions about Numbering Schemes
- The measures we take are on an interval scale.
(Other numbering scales, such as ordinal and nominal, call for non-parametric statistics.)

7 Violating the Assumptions
If any of these assumptions is violated, we cannot use parametric statistics. We must use less powerful, non-parametric statistics.

9 Sample Means
- An assumption we need to make is that the sample means are normally distributed.
- This is not as extreme an assumption as it might seem.
- We will follow an example from Chapter 4 to demonstrate.

10 Example: Start with a tiny population, N = 5
- The scores in this population form a perfectly rectangular distribution.
- Mu = 5.00
- Sigma = 2.83
- We are going to list all the possible samples of size 2 (n = 2).
- First see the population, then the list of samples.

13 Table 4.10: List of all 25 possible samples (n = 2) of scores from the tiny population of 5 scores shown in Table 4.9 and Figure 4.5
Each entry gives the sample, its two scores, and the sample mean.

AA 1,1 1.00    BA 3,1 2.00    CA 5,1 3.00    DA 7,1 4.00    EA 9,1 5.00
AB 1,3 2.00    BB 3,3 3.00    CB 5,3 4.00    DB 7,3 5.00    EB 9,3 6.00
AC 1,5 3.00    BC 3,5 4.00    CC 5,5 5.00    DC 7,5 6.00    EC 9,5 7.00
AD 1,7 4.00    BD 3,7 5.00    CD 5,7 6.00    DD 7,7 7.00    ED 9,7 8.00
AE 1,9 5.00    BE 3,9 6.00    CE 5,9 7.00    DE 7,9 8.00    EE 9,9 9.00

Summary statistics (all samples, n = 2): sum of sample means = 125.00; N = 25; mu = 5.00; SS = 100.00

14 Normal Curve for Sample Means: Conclusion
Even if we have a small population (5), with a rectangular distribution, and a small sample size (2), which yields a small number of possible samples (5^2 = 25), the sample means tend to fall in a normal distribution.
This assumption is seldom violated. This assumption is robust.
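
The tiny-population example can be reproduced in a few lines. The sketch below assumes the five scores are 1, 3, 5, 7 and 9 (as the sample listing implies) and that the 25 samples are ordered pairs drawn with replacement. It recomputes the summary statistics and prints a rough frequency plot of the sample means.

```python
from collections import Counter
from itertools import product
from math import sqrt

population = [1, 3, 5, 7, 9]  # scores A-E of the tiny population

# Population parameters: divide by N because this is the whole population.
mu = sum(population) / len(population)
sigma = sqrt(sum((x - mu) ** 2 for x in population) / len(population))

# All 25 ordered samples of size 2, drawn with replacement (AA, AB, ..., EE).
sample_means = [(a + b) / 2 for a, b in product(population, repeat=2)]

sum_of_means = sum(sample_means)
mean_of_means = sum_of_means / len(sample_means)
ss_of_means = sum((m - mean_of_means) ** 2 for m in sample_means)

print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
print(f"sum of means = {sum_of_means:.2f}, N = {len(sample_means)}")
print(f"mean of means = {mean_of_means:.2f}, SS = {ss_of_means:.2f}")

# Frequency of each sample mean, drawn as a sideways bar chart.
frequencies = Counter(sample_means)
for value in sorted(frequencies):
    print(f"{value:4.1f} {'*' * frequencies[value]}")
```

The counts rise to a single peak at 5.0 and fall off symmetrically on both sides, even though the population itself is flat.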

16 Violating the Normal Curve Assumption
Distributions can vary from normal in many ways.
Normal curves
- are symmetric
- are bell-shaped
- have a single peak
Non-normal curves
- have skew
- have abnormal kurtosis (platykurtic or leptokurtic)
- are polymodal
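
These departures can be put into numbers. The slides give no formulas, so the sketch below assumes the usual moment-based definitions (skewness = m3 / m2^1.5, excess kurtosis = m4 / m2^2 - 3). The three score sets are hypothetical, invented only to illustrate each case.

```python
from collections import Counter

def central_moment(data, order):
    """Average of (x - mean) ** order over the data."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** order for x in data) / len(data)

def skewness(data):
    """Zero for a symmetric set; positive = skewed right, negative = skewed left."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """About zero for normal scores; positive = leptokurtic, negative = platykurtic."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

def modes(data):
    """Every score that ties for the highest frequency."""
    counts = Counter(data)
    top = max(counts.values())
    return sorted(score for score, count in counts.items() if count == top)

# Hypothetical scores (not from the slides).
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
skewed_right = [1, 1, 1, 2, 2, 3, 4, 9]
bimodal = [1, 2, 2, 2, 3, 4, 5, 5, 5, 6]

for name, data in [("symmetric", symmetric),
                   ("skewed right", skewed_right),
                   ("bimodal", bimodal)]:
    print(f"{name:13s} skew = {skewness(data):+.2f}  "
          f"excess kurtosis = {excess_kurtosis(data):+.2f}  modes = {modes(data)}")
```

With samples this small the coefficients are only rough indicators; they describe the shape of the scores and are not a significance test.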

17 Symmetry
[Figure: frequency plotted against score]
The left side is the same shape as the right side.

18 Skewed
[Figure panels: Normal, Skewed Right, Skewed Left]

19 Bell-shaped
[Figure: frequency plotted against score]
Area under the curve occurs in a prescribed manner, as listed in the Z table: about 34% lies between the mean and 1 sigma, and about 48% lies between the mean and 2 sigma, on each side of the mean.
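
The two percentages can be checked without a Z table. This is a minimal sketch using the error function from Python's standard library; the helper name `area_mean_to_z` is made up for the illustration.

```python
from math import erf, sqrt

def area_mean_to_z(z):
    """Proportion of a normal curve lying between the mean and z standard deviations."""
    return 0.5 * erf(z / sqrt(2))

for z in (1, 2):
    print(f"mean to {z} sigma: {area_mean_to_z(z):.4f}")
```

The printed proportions correspond to the roughly 34% and 48% quoted on the slide, and doubling them gives the area within 1 or 2 sigma on both sides of the mean.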

20 Kurtosis
[Figure panels: Normal, Leptokurtic, Platykurtic]

21 One mode
[Figure: frequency plotted against score]
There is only one mode and it equals the median and the mean.

22 Polymodality
[Figure panels: Normal, Bimodal, Trimodal]

23 Violation of normally distributed sample means
If the distribution of sample means
- is skewed,
- or has abnormal kurtosis,
- or has more than one mode,
then we cannot use parametric statistics.

25 For F Ratios and t Tests
- We assume that the distribution of scores around each sample mean is similar.
- The distributions within each group all estimate the same thing, that is, sigma^2.
- The mean squares within each group should be the same in each group.
- For F ratios and t tests, this is called homogeneity of variance.

26 For Correlation
- For correlation, the scores must vary roughly the same amount along the entire length of the regression line.
- This is called homoscedasticity.

27 Homoscedasticity
[Figure: scatterplot with both axes running from -3 to 3]

28 Non-Homoscedasticity
[Figure: scatterplot with both axes running from -3 to 3]

29 Homogeneity of Variance
In mathematical terms, homogeneity of variance means that the mean squares for each group are about the same.
MS_W is a consistent estimate of sigma^2. The more degrees of freedom for MS_W, the closer it tends to come to sigma^2.

30 We assume the mean square is your best estimate of sigma^2
Since MS_W has more df than MS_1 or MS_2 or MS_k, it should be a better estimate of sigma^2.
But that only works when the mean squares in all the groups are fairly good estimates of sigma^2.
We use the F MAX test to check whether the group with the smallest mean square is "too different" from the group with the largest mean square for the combined mean square (MS_W) to be a good estimate of sigma^2.

31 F MAX
If F MAX is significant, then the mean squares differ too much from each other to combine into a single estimate.
- Usually it means that the variance in one of the groups has virtually disappeared because of a floor or ceiling effect. When that happens, adding that group's sum of squares and df into the mix produces an underestimate of sigma^2.
- When that happens, it becomes too easy to make a Type 1 error.
- We say that "The assumption of homogeneity of variance is violated."
- And we cannot use parametric statistics!
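
The underestimate can be seen numerically. The sketch below uses the two groups of scores from the book example on slide 32 and assumes the usual pooled definition, MS_W = (sum of the group SS) / (sum of the group df). That definition matches the slide's description of adding each group's sum of squares and df into the mix, but the slides do not write it out.

```python
def sum_of_squares(scores):
    """Sum of squared deviations of the scores from their own mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

group_1 = [1, 2, 3, 5, 5, 6, 6, 4]  # subjects 1.1-1.8
group_2 = [9, 8, 9, 9, 9, 9, 8, 9]  # subjects 2.1-2.8; almost no variation

ss_1, ss_2 = sum_of_squares(group_1), sum_of_squares(group_2)
df_1, df_2 = len(group_1) - 1, len(group_2) - 1

ms_1 = ss_1 / df_1                          # estimate of sigma^2 from group 1 alone
ms_2 = ss_2 / df_2                          # estimate of sigma^2 from group 2 alone
ms_within = (ss_1 + ss_2) / (df_1 + df_2)   # pooled estimate

print(f"MS_1 = {ms_1:.2f}, MS_2 = {ms_2:.2f}, MS_W = {ms_within:.2f}")
```

Pooling pulls MS_W well below the mean square of the group that still shows normal variation, which is the underestimate the slide warns about.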

32 Book Example - no homogeneity
Within each group: calculate the means, calculate the deviations, square the deviations, sum the squared deviations, and divide by df (n_G - 1) to get MS for each group.

Group 1 (mean = 4.00)
Subject  Score  Deviation  Squared deviation
1.1      1      -3.00      9.00
1.2      2      -2.00      4.00
1.3      3      -1.00      1.00
1.4      5       1.00      1.00
1.5      5       1.00      1.00
1.6      6       2.00      4.00
1.7      6       2.00      4.00
1.8      4       0.00      0.00

Group 2 (mean = 8.75)
Subject  Score  Deviation  Squared deviation
2.1      9       .25       .06
2.2      8      -.75       .56
2.3      9       .25       .06
2.4      9       .25       .06
2.5      9       .25       .06
2.6      9       .25       .06
2.7      8      -.75       .56
2.8      9       .25       .06

33 F MAX
In F MAX, the "MAX" part refers to the largest ratio that can be obtained by comparing the estimated variances from 2 experimental groups.
The significance of F MAX is checked in an F MAX table.
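
As a sketch of that definition, the code below computes F MAX for the two groups of the book example on slide 32. The function names are made up for the illustration. The critical value 8.89 is the F MAX table entry for k = 2 variances and df = 7 at alpha = .01.

```python
def mean_square(scores):
    """Estimated variance of one group: sum of squared deviations / (n - 1)."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores) / (len(scores) - 1)

def f_max(groups):
    """Largest group mean square divided by the smallest."""
    mean_squares = [mean_square(g) for g in groups]
    return max(mean_squares) / min(mean_squares)

groups = [
    [1, 2, 3, 5, 5, 6, 6, 4],  # group 1
    [9, 8, 9, 9, 9, 9, 8, 9],  # group 2
]

exact = f_max(groups)
# Rounding each mean square to two decimals first, as a hand calculation would.
rounded = round(mean_square(groups[0]), 2) / round(mean_square(groups[1]), 2)

CRITICAL_VALUE = 8.89  # k = 2, df = 7, alpha = .01
print(f"F MAX = {exact:.2f} (unrounded), {rounded:.2f} (rounded mean squares)")
print("significant" if exact > CRITICAL_VALUE else "not significant")
```

The unrounded ratio is about 16.0. Rounding the mean squares to 3.43 and 0.21 before dividing gives 16.33, which appears to be how the value reported for the book example was obtained. Either way F MAX exceeds the critical value.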

34 F MAX table, alpha = .01
Columns: k = number of variances (the number of groups in the experiment).
Rows: df for F MAX = n_G(larger) - 1 (default = larger df).

df \ k      2      3      4      5      6      7      8      9     10
  4      23.2     37     49     59     69     79     89     97    106
  5      14.9     22     28     33     38     42     46     50     54
  6      11.1   15.5   19.1     22     25     27     30     32     34
  7      8.89   12.1   14.5   16.5   18.4     20     22     23     24
  8      7.50    9.9   11.7   13.2   14.5   15.8   16.9   17.9   18.9
  9      6.54    8.5    9.9   11.1   12.1   13.1   13.9   14.7   15.3
 10      5.85    7.4    8.6    9.6   10.4   11.1   11.8   12.4   12.9
 12      4.91    6.1    6.9    7.6    8.2    8.7    9.1    9.5    9.9
 15      4.07    4.9    5.5    6.0    6.4    6.7    7.1    7.3    7.5
 20      3.32    3.8    4.3    4.6    4.9    5.1    5.3    5.5    5.6
 30      2.63    3.0    3.3    3.4    3.6    3.7    3.8    3.9    4.0
 60      1.96    2.2    2.3    2.4    2.4    2.5    2.5    2.6    2.6

35 F MAX table (repeated from slide 34)
The entries in the body of the table are the critical values.

36 Book Example

37 F MAX table (repeated from slide 34)
F MAX = 16.33 > 8.89 (the critical value for k = 2 variances and df = 7)
F MAX exceeds the critical value. We cannot use parametric statistics.

38 Examples
Design  Number of Means  Subjects in larger n_G  Critical value of F MAX
2X4     8                21                      5.3
2X2     ?                16                      ?
3X3     ?                11                      ?
2X3     ?                9                       ?
Answers for number of means: 2X2 = 4, 3X3 = 9, 2X3 = 6.

40 CPE 14.2.1
Design  Number of Means  Subjects in larger n_G  Critical value of F MAX
2X4     8                21                      5.3
2X2     4                16                      5.5
3X3     9                11                      ?
2X3     6                9                       ?

42 CPE 14.2.1
Design  Number of Means  Subjects in larger n_G  Critical value of F MAX
2X4     8                21                      5.3
2X2     4                16                      5.5
3X3     9                11                      12.4
2X3     6                9                       ?

44 CPE 14.2.1
Design  Number of Means  Subjects in larger n_G  Critical value of F MAX
2X4     8                21                      5.3
2X2     4                16                      5.5
3X3     9                11                      12.4
2X3     6                9                       14.5
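
The lookups in this exercise can be sketched in code. The table values are copied from the F MAX table (alpha = .01) on slide 34. The helper names are hypothetical. The sketch assumes two-factor designs written like "2X4" and handles only df values that appear as rows in the table.

```python
# Rows are df (n_G of the larger group - 1); columns are k = 2 .. 10 variances.
F_MAX_TABLE_01 = {
    4:  [23.2, 37, 49, 59, 69, 79, 89, 97, 106],
    5:  [14.9, 22, 28, 33, 38, 42, 46, 50, 54],
    6:  [11.1, 15.5, 19.1, 22, 25, 27, 30, 32, 34],
    7:  [8.89, 12.1, 14.5, 16.5, 18.4, 20, 22, 23, 24],
    8:  [7.50, 9.9, 11.7, 13.2, 14.5, 15.8, 16.9, 17.9, 18.9],
    9:  [6.54, 8.5, 9.9, 11.1, 12.1, 13.1, 13.9, 14.7, 15.3],
    10: [5.85, 7.4, 8.6, 9.6, 10.4, 11.1, 11.8, 12.4, 12.9],
    12: [4.91, 6.1, 6.9, 7.6, 8.2, 8.7, 9.1, 9.5, 9.9],
    15: [4.07, 4.9, 5.5, 6.0, 6.4, 6.7, 7.1, 7.3, 7.5],
    20: [3.32, 3.8, 4.3, 4.6, 4.9, 5.1, 5.3, 5.5, 5.6],
    30: [2.63, 3.0, 3.3, 3.4, 3.6, 3.7, 3.8, 3.9, 4.0],
    60: [1.96, 2.2, 2.3, 2.4, 2.4, 2.5, 2.5, 2.6, 2.6],
}

def number_of_means(design):
    """A design such as '2X4' has 2 * 4 = 8 group means."""
    levels_a, levels_b = design.split("X")
    return int(levels_a) * int(levels_b)

def critical_value(k, df):
    """Table entry for k variances and df degrees of freedom (tabled df only)."""
    return F_MAX_TABLE_01[df][k - 2]

exercise = [("2X4", 21), ("2X2", 16), ("3X3", 11), ("2X3", 9)]
answers = {}
for design, n_larger in exercise:
    k = number_of_means(design)
    df = n_larger - 1
    answers[design] = critical_value(k, df)
    print(f"{design}: k = {k}, df = {df}, critical F MAX = {answers[design]}")
```

For the 3X3 design, k = 9 and df = 11 - 1 = 10, so the lookup reads the df = 10 row under k = 9.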

45 Example – other way
Design                   2X4     2X3     2X2     3X3
Number of Means          8       ?       ?       ?
MS_G max                 18.2    26.3    34.2    18.0
MS_G min                 1.1     2.0     4.6     0.5
F MAX                    16.5    ?       ?       ?
Subjects in larger n_G   10      12      21      7
df FMAX                  9       ?       ?       ?
p                        < .01   < .01   ?       ?
Answers: number of means 6, 4, 9; df FMAX 11, 20, 6; F MAX 13.2, 7.4, 36.0.

46 F MAX table (repeated from slide 34)
F MAX (6,11) = 13.2, p < .01

47 Example – other way (continued)
Design                   2X4     2X3     2X2     3X3
Number of Means          8       6       4       9
MS_G max                 18.2    26.3    34.2    18.0
MS_G min                 1.1     2.0     4.6     0.5
F MAX                    16.5    13.2    7.4     36.0
Subjects in larger n_G   10      12      21      7
df FMAX                  9       11      20      6
p                        < .01   < .01   ?       ?

48 F MAX table (repeated from slide 34)
F MAX (4,20) = 7.4, p < .01

50 F MAX table (repeated from slide 34)
F MAX (9,6) = 36.0, p < .01

51 Answers to examples
Design                   2X4     2X3     2X2     3X3
Number of Means          8       6       4       9
MS_G max                 18.2    26.3    34.2    18.0
MS_G min                 1.1     2.0     4.6     0.5
F MAX                    16.5    13.2    7.4     36.0
Subjects in larger n_G   10      12      21      7
df FMAX                  9       11      20      6
p                        < .01   < .01   < .01   < .01
You cannot use the F test for any of these experiments!
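
The four answers can be checked with a short script. The critical values below are read by hand from the F MAX table (alpha = .01) on slide 34. One of them rests on an assumption: df = 11 is not a row in the table, so the sketch uses the df = 10 row, which gives the larger and therefore more cautious critical value.

```python
# (design, largest MS, smallest MS, subjects in larger group, critical value)
examples = [
    ("2X4", 18.2, 1.1, 10, 13.9),  # k = 8, df = 9
    ("2X3", 26.3, 2.0, 12, 10.4),  # k = 6, df = 11 not tabled; df = 10 row (assumption)
    ("2X2", 34.2, 4.6, 21, 4.3),   # k = 4, df = 20
    ("3X3", 18.0, 0.5, 7, 32.0),   # k = 9, df = 6
]

results = []
for design, ms_max, ms_min, n_larger, critical in examples:
    f_max = ms_max / ms_min      # largest mean square over the smallest
    df = n_larger - 1            # df for F MAX
    significant = f_max > critical
    results.append((design, df, f_max, significant))
    verdict = "significant" if significant else "not significant"
    print(f"{design}: F MAX = {f_max:.1f}, df = {df}, critical = {critical} ({verdict})")
```

Every ratio exceeds its critical value, which agrees with the slide's conclusion that the F test cannot be used for any of these experiments.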

52 Homogeneity of Variance: Conclusions
If F MAX is significant, then the assumption of homogeneity of variance has been violated.
If the assumption of homogeneity of variance is violated, then we cannot use parametric statistics.

54 Assumption
- The last assumption that we must meet to use parametric statistics is that the measures in our experiment use an interval scale.
- An interval scale is a set of numbers whose differences are equal at all points along the scale.

55 Examples of Interval Scales
- Integers - 1, 2, 3, 4, …
- Real numbers - 1.0, 1.1, 1.2, 1.3, …
- Time - 1 minute, 2 minutes, 3 minutes, …
- Distance - 1 foot, 2 feet, 3 feet, 4 feet, …

56 Examples of Non-Interval Scales
- Ordinal - ranks, such as first, second, third; high, medium, low; etc.
  - The difference in time between first and second can be very different from the time between second and third.
  - The median is the best measure of central tendency for ordinal data.

57 Examples of Non-Interval Scales
- Nominal - categories, such as male, female; pass, fail.
  - There is not even an order for nominal data.
  - Categories should be mutually exclusive and exhaustive.
  - The best measure of central tendency is the mode.

58 Comparing Scales
- Interval scales have more information than ordinal scales, which in turn have more information than nominal scales.
- The more information that is available, the more sensitive a given statistical test can be.

59 Book Example - test grades
Interval Scale   SCORES      98  84  77  76  75  62  61  60
Ordinal Scale    RANKS        1   2   3   4   5   6   7   8
Nominal Scale    Pass/Fail    P   P   P   P   P   F   F   F

60 Book Example - test grades (table repeated from slide 59)
Ordinal scales show the relative order of individual measures. However, there is no information about how far apart individuals are.

61 Book Example - test grades (table repeated from slide 59)
Categories are mutually exclusive; you either pass or fail.
Categories are exhaustive; you can only pass or fail.
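
The loss of information from one scale to the next can be made concrete with the test grades from the book example. This is only a sketch: the pass mark of 70 is an assumption, since the slides show who passed but not the cutoff (any value above 62 and up to 75 reproduces the same split).

```python
from statistics import mean, median, mode

scores = [98, 84, 77, 76, 75, 62, 61, 60]  # interval scale

# Ordinal scale: rank 1 goes to the highest score (there are no ties here).
ordered = sorted(scores, reverse=True)
ranks = [ordered.index(s) + 1 for s in scores]

# Nominal scale: pass or fail. PASS_MARK is a hypothetical cutoff.
PASS_MARK = 70
pass_fail = ["P" if s >= PASS_MARK else "F" for s in scores]

# Gaps between neighbouring scores are visible only on the interval scale.
score_gaps = [a - b for a, b in zip(scores, scores[1:])]
rank_gaps = [b - a for a, b in zip(ranks, ranks[1:])]

print("scores:", scores, "mean =", mean(scores))
print("ranks: ", ranks, "median =", median(ranks))
print("P/F:   ", pass_fail, "mode =", mode(pass_fail))
print("score gaps:", score_gaps)
print("rank gaps: ", rank_gaps)
```

The score gaps range from 1 to 14 points, while every rank gap is 1. Pass/fail keeps only which side of the cutoff each student fell on.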

62 Interval Scale Conclusion
- Parametric tests can only be performed on interval data.
- Non-parametric tests must be used on ordinal and nominal data.
- Researchers prefer parametric tests because more information is available, which makes it easier to find:
  - significant differences between experimental group means, or
  - significant correlations between two variables.
- If any assumptions are violated, it is common practice to convert from the interval scale to another scale. Then you can use the weaker, non-parametric statistics.
- There are non-parametric statistics that correspond to all of the parametric statistics that we have studied.

63 Summary - Assumptions
To use parametric statistics, it must be true that:
- Subjects are randomly selected from the population.
- Experimental error is randomly distributed across samples in the design.
- The distribution of sample means fits a normal curve.
- There is homogeneity of variance, demonstrated by using F MAX.
- The measures we take are on an interval scale.

