Chapter 13: Assumptions Underlying Parametric Statistical Techniques

Presentation transcript:

Chapter 13, Slide 1: Assumptions Underlying Parametric Statistical Techniques

Chapter 13, Slide 2: Parametric Statistics
- We have been studying parametric statistics.
- They include estimates of mu and sigma, correlation, t tests, and F tests.

Chapter 13, Slide 3: Five Assumptions
To validly use parametric statistics, we make:
- two research assumptions;
- two assumptions about the type of distributions in the samples;
- and one assumption about the kind of numbering system that we are using.

Chapter 13, Slide 4: Research Assumptions
- Subjects have to be randomly selected from the population.
- Experimental error is randomly distributed across samples in the design.
(We will not discuss these any further.)

Chapter 13, Slide 5: Distribution Assumptions
- The distribution of sample means fits a normal curve.
- There is homogeneity of variance (checked using F_MAX).

Chapter 13, Slide 6: Assumptions about Numbering Schemes
- The measures we take are on an interval scale. (Data on other numbering scales, such as ordinal and nominal, call for non-parametric statistics.)

Chapter 13, Slide 7: Violating the Assumptions
If any of these assumptions is violated, we cannot use parametric statistics. We must use less powerful, non-parametric statistics.


Chapter 13, Slide 9: Sample Means
- An assumption we need to make is that the sample means are normally distributed.
- This is not as extreme an assumption as it might seem.
- We will follow an example from Chapter 4 to demonstrate.

Chapter 13, Slide 10: Example: Start with a tiny population (N = 5)
- The scores in this population form a perfectly rectangular distribution.
- Mu = 5.00
- Sigma = 2.83
- We are going to list all the possible samples of size 2 (n = 2).
- First see the population, then the list of samples.


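Chapter 13, Slide 13: Table 4.10: List of all 25 possible samples (n=2) of scores from the tiny population of 5 scores shown in Table 4.9 and Figure 4.5

Sample  Scores  Mean     Sample  Scores  Mean
AA      1,1     1.00     DA      7,1     4.00
AB      1,3     2.00     DB      7,3     5.00
AC      1,5     3.00     DC      7,5     6.00
AD      1,7     4.00     DD      7,7     7.00
AE      1,9     5.00     DE      7,9     8.00
BA      3,1     2.00     EA      9,1     5.00
BB      3,3     3.00     EB      9,3     6.00
BC      3,5     4.00     EC      9,5     7.00
BD      3,7     5.00     ED      9,7     8.00
BE      3,9     6.00     EE      9,9     9.00
CA      5,1     3.00
CB      5,3     4.00
CC      5,5     5.00
CD      5,7     6.00
CE      5,9     7.00

Summary statistics (all samples, n=2): sum of sample means = 125.00, N = 25, mu = 5.00, SS = 100.00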

Chapter 13, Slide 14: Normal Curve for Sample Means
Conclusion: Even if we have a small population (5), with a rectangular distribution, and a small sample size (2), which yields a small number of possible samples (5^2 = 25), the sample means tend to fall in a normal distribution.
This assumption is seldom violated. This assumption is robust.
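The tiny-population example can be checked by brute force. The sketch below assumes the five scores are 1, 3, 5, 7, and 9 (labelled A to E, as the sample labels in Table 4.10 suggest) and that samples are drawn with replacement; the variable names are illustrative only.

```python
from collections import Counter
from itertools import product
from statistics import mean, pstdev

# Assumed population: labels A-E with scores 1, 3, 5, 7, 9
population = {"A": 1, "B": 3, "C": 5, "D": 7, "E": 9}
scores = list(population.values())

mu = mean(scores)        # population mean
sigma = pstdev(scores)   # population standard deviation

# All 5**2 = 25 ordered samples of size 2, drawn with replacement
samples = list(product(population, repeat=2))
sample_means = [(population[a] + population[b]) / 2 for a, b in samples]

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)   # spread of the 25 sample means

# Text histogram: frequencies peak at mu and thin out toward both tails
frequencies = Counter(sample_means)
for value in sorted(frequencies):
    print(f"{value:4.1f} {'*' * frequencies[value]}")
print(f"mu={mu:.2f} sigma={sigma:.2f} "
      f"mean of means={mean_of_means:.2f} sd of means={sd_of_means:.2f}")
```

With only 25 samples the histogram is a triangle rather than a smooth bell, but it is already single-peaked and symmetric around mu, which is the point the slide makes about a flat population.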


Chapter 13, Slide 16: Violating the Normal Curve Assumption
Distributions can vary from normal in many ways.
Normal curves
  - are symmetric
  - are bell-shaped
  - have a single peak
Non-normal curves
  - have skew
  - have kurtosis (platykurtic or leptokurtic)
  - are polymodal
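Skew and kurtosis can be put into numbers. The sketch below uses the usual moment-based definitions (third and fourth standardized moments), which are an assumption here since the slides do not give formulas; the data sets are made up for illustration.

```python
from statistics import mean, pstdev

def skewness(scores):
    """Third standardized moment: 0 if symmetric, > 0 if skewed right."""
    m, s = mean(scores), pstdev(scores)
    return sum(((x - m) / s) ** 3 for x in scores) / len(scores)

def excess_kurtosis(scores):
    """Fourth standardized moment minus 3: < 0 platykurtic, > 0 leptokurtic."""
    m, s = mean(scores), pstdev(scores)
    return sum(((x - m) / s) ** 4 for x in scores) / len(scores) - 3

symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]      # hypothetical, mirror-image sides
skewed_right = [1, 1, 1, 2, 2, 3, 4, 6, 10]  # hypothetical, long right tail
rectangular = [1, 3, 5, 7, 9]                # flat, so platykurtic

print(skewness(symmetric), skewness(skewed_right))
print(excess_kurtosis(rectangular))
```

A normal curve has skewness 0 and excess kurtosis 0, so values far from zero on either measure signal a departure from normality.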

Chapter 13, Slide 17: Symmetry
[Figure: frequency plotted against score.]
The left side is the same shape as the right side.

Chapter 13, Slide 18: Skewed
[Figure panels: Normal, Skewed Right, Skewed Left.]

Chapter 13, Slide 19: Bell-shaped
[Figure: frequency plotted against score.]
Area under the curve occurs in a prescribed manner, as listed in the Z table: about 34% lies between the mean and 1 sigma, and about 48% lies between the mean and 2 sigma, on each side of the mean.
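The Z-table areas quoted above can be reproduced from the normal curve itself. This is a minimal sketch using the error function from Python's standard library; the function name `area_mean_to_z` is made up for illustration.

```python
from math import erf, sqrt

def area_mean_to_z(z):
    """Proportion of a normal curve lying between the mean and z sigma."""
    return 0.5 * erf(z / sqrt(2))

for z in (1, 2):
    print(f"mean to {z} sigma: {area_mean_to_z(z):.4f}")
# mean to 1 sigma: 0.3413
# mean to 2 sigma: 0.4772
```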

Chapter 13, Slide 20: Kurtosis
[Figure panels: Normal, Leptokurtic, Platykurtic.]

Chapter 13, Slide 21: One mode
[Figure: frequency plotted against score.]
There is only one mode, and it equals the median and the mean.

Chapter 13, Slide 22: Polymodality
[Figure panels: Normal, Bimodal, Trimodal.]

Chapter 13, Slide 23: Violation of normally distributed sample means
If the distribution of sample means is
- skewed,
- or has kurtosis,
- or has more than one mode,
then we cannot use parametric statistics.


Chapter 13, Slide 25: For F Ratios and t Tests
- We assume that the distribution of scores around each sample mean is similar.
- The distributions within each group all estimate the same thing, that is, sigma^2.
- The mean squares within each group should be the same in each group.
- For F ratios and t tests, this is called homogeneity of variance.

Chapter 13, Slide 26: For Correlation
- For correlation, the scores must vary roughly the same amount around the entire length of the regression line.
- This is called homoscedasticity.
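One informal way to look at homoscedasticity is to compare how far the scores scatter around the regression line at low X versus high X. The sketch below is not a formal test and is not taken from the slides: the split-in-half comparison, the function names, and both data sets are assumptions made for illustration.

```python
from statistics import mean

def residuals(xs, ys):
    """Residuals from the least-squares regression line of y on x."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def mean_squared(values):
    return sum(v * v for v in values) / len(values)

def spread_ratio(xs, ys):
    """Residual spread in the upper half of X divided by the lower half."""
    by_x = [r for _, r in sorted(zip(xs, residuals(xs, ys)))]
    half = len(by_x) // 2
    return mean_squared(by_x[half:]) / mean_squared(by_x[:half])

xs = [1, 2, 3, 4, 5, 6, 7, 8]
# Hypothetical data: the same scatter all along the line
even_ys = [2 * x + (1 if x % 2 else -1) for x in xs]
# Hypothetical data: scatter that fans out as x grows
fan_ys = [2 * x + (0.5 * x if x % 2 else -0.5 * x) for x in xs]

print(spread_ratio(xs, even_ys))  # close to 1: roughly homoscedastic
print(spread_ratio(xs, fan_ys))   # well above 1: not homoscedastic
```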

Chapter 13, Slide 27: Homoscedasticity

Chapter 13, Slide 28: Non-Homoscedasticity (heteroscedasticity)

Chapter 13, Slide 29: Homogeneity of Variance
In mathematical terms, homogeneity of variance means that the mean squares for each group are about the same. MS_W is a consistent estimate of sigma^2. The more degrees of freedom for MS_W, the closer it tends to come to sigma^2.

Chapter 13, Slide 30: We assume the mean square is your best estimate of sigma^2
Since MS_W has more df than MS_1 or MS_2 or MS_K, it should be a better estimate of sigma^2. But that only works when the mean squares in all the groups are fairly good estimates of sigma^2. We use the F_MAX test to check whether the group with the smallest mean square is "too different" from the group with the largest mean square for the combined mean square (MS_W) to be a good estimate of sigma^2.
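The pooling idea can be sketched as a df-weighted average of the group mean squares. The groups below are hypothetical, and `mean_square_within` is an illustrative name rather than anything from the book; the second call shows what happens when one group's variance has nearly vanished.

```python
from statistics import variance

def mean_square_within(groups):
    """Pool the group mean squares, weighting each by its df (n - 1)."""
    dfs = [len(g) - 1 for g in groups]
    mean_squares = [variance(g) for g in groups]  # SS / (n - 1) per group
    return sum(df * ms for df, ms in zip(dfs, mean_squares)) / sum(dfs)

# Hypothetical groups with equal spread: each has MS = 20 / 3
similar = [[2, 4, 6, 8], [3, 5, 7, 9], [1, 3, 5, 7]]
# Same design, but the third group's scores are bunched together
one_flat = [[2, 4, 6, 8], [3, 5, 7, 9], [1, 1, 1, 2]]

print(mean_square_within(similar))   # equals each group's MS
print(mean_square_within(one_flat))  # pulled down by the bunched group
```

When the groups agree, pooling simply gives the shared value with more df behind it. When one group is bunched, the pooled value falls below what the other groups estimate.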

Chapter 13, Slide 31: F_MAX
- If F_MAX is significant, then the mean squares differ too much from each other to combine into a single estimate.
- Usually it means that the variance in one of the groups has virtually disappeared because of a floor or ceiling effect. When that happens, adding that group's sum of squares and df into the mix produces an underestimate of sigma^2.
- When that happens, it becomes too easy to make a Type I error.
- We say that "The assumption of homogeneity of variance is violated."
- And we cannot use parametric statistics!

Chapter 13, Slide 32: Book Example - no homogeneity
[Table of scores by group, annotated with the steps below.]
Within each group:
- Calculate the means.
- Calculate the deviations.
- Square the deviations.
- Sum the squared deviations.
- Divide by df (n_G - 1) to get MS for each group.
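The steps for one group's mean square translate directly into code. This is a minimal sketch with made-up scores, since the book's data did not survive in the transcript; `group_mean_square` is an illustrative name.

```python
def group_mean_square(scores):
    """Mean square for one group, following the slide's steps in order."""
    n = len(scores)
    group_mean = sum(scores) / n                   # calculate the mean
    deviations = [x - group_mean for x in scores]  # calculate the deviations
    squared = [d ** 2 for d in deviations]         # square the deviations
    sum_of_squares = sum(squared)                  # sum the squared deviations
    return sum_of_squares / (n - 1)                # divide by df = n - 1

# Hypothetical groups: one with ordinary spread, one with bunched scores
print(group_mean_square([2, 4, 6, 8]))
print(group_mean_square([1, 1, 1, 2]))
```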

Chapter 13, Slide 33: F_MAX
In F_MAX, the "MAX" part refers to the largest ratio that can be obtained by comparing the estimated variances from 2 experimental groups, that is, the largest group mean square divided by the smallest. The significance of F_MAX is checked in an F_MAX table.
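Computing F_MAX and the two values needed to enter the table can be sketched as follows. The groups are hypothetical, the function names are illustrative, and the critical value is deliberately not hard-coded: it has to be read from an F_MAX table using k (the number of groups) and df (the larger group size minus 1).

```python
from statistics import variance

def f_max(groups):
    """Largest group mean square divided by the smallest."""
    mean_squares = [variance(g) for g in groups]
    smallest = min(mean_squares)
    if smallest == 0:
        return float("inf")  # a group with no variance at all
    return max(mean_squares) / smallest

def table_lookup_keys(groups):
    """Return (k, df) for entering the F_MAX table."""
    k = len(groups)                     # number of variances compared
    df = max(len(g) for g in groups) - 1  # larger group size minus 1
    return k, df

# Hypothetical data: the third group shows a floor effect
groups = [[2, 4, 6, 8], [3, 5, 7, 9], [1, 1, 1, 2]]
print(f_max(groups), table_lookup_keys(groups))
```

If the computed ratio exceeds the tabled critical value for that k and df, the assumption of homogeneity of variance is treated as violated.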

Chapter 13, Slide 34: [F_MAX table, alpha = .01]
- Columns: k = number of variances (the number of groups in the experiment).
- Rows: df_FMAX = n_G(larger) - 1. Default = larger df.

Chapter 13, Slide 35: [F_MAX table: k = number of variances by df_FMAX]
The entries in the body of the table are the critical values.

Chapter 13, Slide 36: Book Example

Chapter 13, Slide 37: [F_MAX table: k = number of variances by df_FMAX]
The obtained F_MAX is greater than 8.89, so F_MAX exceeds the critical value. We cannot use parametric statistics.

Chapter 13, Slide 38: Examples

Design   Number of Means   Subjects in larger n_G   Critical value of F_MAX
2X2      4                 16                       ?
3X3      9                 11                       ?
2X3      6                 9                        ?

Chapter 13, Slides 39-44: CPE
[F_MAX table (k = number of variances by df_FMAX), shown between steps of a practice table with columns Design, Number of Means, Subjects in larger n_G, and Critical value of F_MAX. One row is legible: Design 2X3, 6 means, 9 subjects in the larger group, critical value "?".]

Chapter 13, Slides 45-51: Example - other way
[Table columns: Design, Number of Means, MS_G max, MS_G min, F_MAX, Subjects in larger n_G, df_FMAX, p. Each obtained F_MAX is looked up in the F_MAX table by k = number of variances and df_FMAX.]

Design   Number of Means (k)   df_FMAX   F_MAX
2X4      8                     9         16.5
2X3      6                     11        13.2
2X2      4                     20        7.4
3X3      9                     6         36.0

Answers to examples: each F_MAX is significant at the .01 level. You cannot use the F test for any of these experiments!

Chapter 13, Slide 52: Homogeneity of Variance Conclusions
If F_MAX is significant, then the assumption of homogeneity of variance has been violated. If the assumption of homogeneity of variance is violated, then we cannot use parametric statistics.


Chapter 13, Slide 54: Assumption
- The last assumption that we must meet to use parametric statistics is that the measures in our experiment use an interval scale.
- An interval scale is a set of numbers whose differences are equal at all points along the scale.

Chapter 13, Slide 55: Examples of Interval Scales
- Integers: 1, 2, 3, 4, …
- Real numbers: 1.0, 1.1, 1.2, 1.3, …
- Time: 1 minute, 2 minutes, 3 minutes, …
- Distance: 1 foot, 2 feet, 3 feet, 4 feet, …

Chapter 13, Slide 56: Examples of Non-Interval Scales
- Ordinal: ranks, such as first, second, third; high, medium, low; etc.
  - In a race, the difference in time between first and second can be very different from the difference between second and third.
  - The median is the best measure of central tendency for ordinal data.

Chapter 13, Slide 57: Examples of Non-Interval Scales
- Nominal: categories, such as male, female; pass, fail.
  - There is not even an order for nominal data.
  - Categories should be mutually exclusive and exhaustive.
  - The best measure of central tendency is the mode.
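The pairing of scale and measure of central tendency (mean for interval, median for ordinal, mode for nominal) can be illustrated with Python's standard `statistics` module. All three data sets below are made up, and the low/medium/high ordering is an assumption of the example.

```python
from statistics import mean, median_low, mode

# Interval data (hypothetical test scores): the mean is meaningful
test_scores = [55, 62, 70, 71, 78, 84, 90, 98]
print(mean(test_scores))

# Ordinal data (hypothetical ratings): only the order is meaningful,
# so take the middle rating once the ratings are put in order
order = ["low", "medium", "high"]
ratings = ["low", "medium", "medium", "high", "low", "medium", "high"]
positions = [order.index(r) for r in ratings]
typical_rating = order[median_low(positions)]
print(typical_rating)

# Nominal data (hypothetical outcomes): no order, so report the mode
outcomes = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail"]
print(mode(outcomes))
```

`median_low` is used for the ordinal case so that the result is always one of the actual ratings rather than an average of two positions.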

Chapter 13, Slide 58: Comparing Scales
- Interval scales have more information than ordinal scales, which in turn have more information than nominal scales.
- The more information that is available, the more sensitive a given statistical test can be.

Chapter 13, Slides 59-61: Book Example - test grades
[Table of the same eight test grades on three scales. Columns: Interval Scale (scores), Ordinal Scale (ranks), Nominal Scale (pass/fail: P P P P P F F F).]
- Ordinal scales show the relative order of individual measures. However, there is no information about how far apart individuals are.
- Categories are mutually exclusive; you either pass or fail. Categories are exhaustive; you can only pass or fail.
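Converting interval scores down to ranks and then to pass/fail shows how information is lost at each step. The eight scores and the pass mark below are invented for illustration (the book's scores are not in the transcript); they are chosen only so that five students pass and three fail, as on the slide.

```python
# Hypothetical test scores and pass mark (not the book's data)
scores = [98, 90, 84, 78, 71, 62, 55, 40]
PASS_MARK = 65

def to_ranks(values):
    """Rank 1 = highest score; tied scores share the best available rank."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

ranks = to_ranks(scores)                                        # ordinal scale
pass_fail = ["P" if s >= PASS_MARK else "F" for s in scores]    # nominal scale

# Gaps between neighbouring scores vary; gaps between ranks are always 1
score_gaps = [a - b for a, b in zip(scores, scores[1:])]

print(ranks)
print(pass_fail)
print(score_gaps)
```

The ranks keep the order but hide that the last student is much farther behind than the others, and the pass/fail labels keep only which side of the cut-off each student fell on.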

Chapter 13, Slide 62: Interval Scale Conclusion
- Parametric tests can only be performed on interval data.
- Non-parametric tests must be used on ordinal and nominal data.
- Researchers prefer parametric tests because more information is available, which makes it easier to find:
  - significant differences between experimental group means, or
  - significant correlations between two variables.
- If any assumptions are violated, it is common practice to convert from the interval scale to another scale. Then you can use the weaker, non-parametric statistics.
- There are non-parametric statistics that correspond to all of the parametric statistics that we have studied.

Chapter 13, Slide 63: Summary - Assumptions
To use parametric statistics, it must be true that:
- Subjects are randomly selected from the population.
- Experimental error is randomly distributed across samples in the design.
- The distribution of sample means fits a normal curve.
- There is homogeneity of variance, demonstrated by using F_MAX.
- The measures we take are on an interval scale.