Chapter 13: Assumptions Underlying Parametric Statistical Techniques

Slide 1: Assumptions Underlying Parametric Statistical Techniques

Slide 2: Parametric Statistics
- We have been studying parametric statistics.
- They include estimates of mu and sigma, correlation, t tests, and F tests.

Slide 3: Five Assumptions
To validly use parametric statistics, we make:
  - two research assumptions;
  - two assumptions about the type of distributions in the samples;
  - and one assumption about the kind of numbering system that we are using.

Slide 4: Research Assumptions
- Subjects have to be randomly selected from the population.
- Experimental error is randomly distributed across samples in the design.
(We will not discuss these any further.)

Slide 5: Distribution Assumptions
- The distribution of sample means fits a normal curve.
- The groups show homogeneity of variance (checked using F_MAX).

Slide 6: Assumptions about Numbering Schemes
- The measures we take are on an interval scale.
(Other numbering scales, such as ordinal and nominal, do not allow the estimation of population parameters such as mu and sigma. The tests used to analyze such data are therefore called "nonparametric".)

Slide 7: Violating the Assumptions
If any of these assumptions is violated, we cannot use parametric statistics. We must use less powerful, non-parametric statistics.


Slide 9: Sample Means
- An assumption we need to make is that the distribution of sample means fits a normal curve.
- This is not as extreme an assumption as it might seem.
- We will follow the example in the book to demonstrate (only smaller).

Slide 10: An Artificial Population
- Seven subjects.
- Each subject has a different score.
- We sample five subjects.

Subject  A  B  C  D  E  F  G
Score    1  2  3  4  5  6  7

Slide 11: The Distribution is Rectangular
[Figure: frequency plotted against score.]

Slide 12: All Possible Samples: 7 Scores, 5 at a Time
Sampling without replacement, just to illustrate the concept, and disregarding order.
Table columns: Sample, Scores, Mean.
Samples: ABCDE, ABCDF, ABCDG, ABCEF, ABCEG, ABCFG, ABDEF, ABDEG, ABDFG, ABEFG, ACDEF, ACDEG, ACDFG, ACEFG, ADEFG, BCDEF, BCDEG, BCDFG, BCEFG, BDEFG, CDEFG
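The scores and means for each sample follow directly from the population on the earlier slide (A = 1 through G = 7). As a minimal sketch, not the book's own worked table, the enumeration can be reproduced in Python; the variable names are illustrative.

```python
from itertools import combinations
from collections import Counter
from fractions import Fraction

# Population from the slides: seven subjects, A through G, scoring 1 through 7.
population = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7}

# Every sample of five subjects, drawn without replacement, order ignored.
samples = list(combinations(sorted(population), 5))

# Exact mean of each sample (Fraction avoids floating-point rounding).
sample_means = {
    "".join(s): Fraction(sum(population[x] for x in s), 5) for s in samples
}

# Frequency distribution of the sample means.
distribution = Counter(sample_means.values())

# Crude text histogram: one asterisk per sample with that mean.
for m in sorted(distribution):
    print(f"{float(m):.1f}  {'*' * distribution[m]}")
```

The histogram is symmetric about 4.0, the population mean, with the extreme means (3.0 and 5.0) each produced by only one sample.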

Slide 13: Sample Distribution

Slide 14: Normal Curve for Sample Means
Conclusion: Even if we have
  - a small population (7),
  - with a rectangular distribution,
  - and a small sample size (5),
  - which, without replacement, yields a small number of possible samples (21),
the sample means tend to fall in an (approximately) normal distribution.
This assumption that the distribution of sample means will basically fit a normal curve is seldom violated. This assumption is robust.

Slide 15: But It Can Happen: Violating the Normal Curve Assumption
Distributions of sample means can vary from normal in several ways.
Normal curves
  - are symmetric
  - are bell-shaped
  - have a single peak
Non-normal curves
  - have skew
  - have non-normal kurtosis (platykurtic or leptokurtic)
  - are polymodal

Slide 16: Symmetry
[Figure: frequency plotted against score.]
The left side is the same shape as the right side.

Slide 17: Skewed
[Figure panels: Normal, Skewed Right, Skewed Left.]

Slide 18: Bell-shaped
[Figure: frequency plotted against score.]
Area under the curve occurs in a prescribed manner, as listed in the Z table: about 34% of the area lies between the mean and 1 SD, about 48% between the mean and 2 SD, etc.
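The Z-table areas quoted on this slide can be checked numerically. The sketch below uses the standard relationship between the normal curve and the error function; the function name is illustrative.

```python
import math

def area_mean_to_z(z: float) -> float:
    """Proportion of a normal curve lying between the mean and z SDs above it."""
    return 0.5 * math.erf(z / math.sqrt(2))

# Areas for 1, 2, and 3 standard deviations from the mean.
for z in (1, 2, 3):
    print(f"mean to {z} SD: {area_mean_to_z(z):.4f}")
```

This prints roughly 0.3413 and 0.4772 for 1 and 2 SD, which round to the 34% and 48% given on the slide.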

Slide 19: Kurtosis
[Figure panels: Normal, Leptokurtic, Platykurtic.]

Slide 20: One Mode
[Figure: frequency plotted against score.]
There is only one mode, and it equals the median and the mean.

Slide 21: Polymodality
[Figure panels: Normal, Bimodal, Trimodal.]

Slide 22: Violation of Normally Distributed Sample Means
If the distribution of sample means is
- skewed,
- or has non-normal kurtosis,
- or has more than one mode,
- then we cannot use parametric statistics.
- BUT THIS IS RARE.


Slide 24: For Correlation
- For correlation, the scores must vary roughly the same amount around the entire length of the regression line.
- This is called homoscedasticity.
- Without homoscedasticity, you cannot validly use Pearson's r to test for a significant correlation. Instead, you must use a nonparametric test, such as Spearman's rho.
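To make the alternative concrete, here is a minimal sketch of Spearman's rho, computed as Pearson's r applied to the ranks of the scores. The data and function names are hypothetical, and tied scores are given the average of the ranks they occupy, which is a common convention rather than something the slides specify.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def average_ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical paired scores, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2, 3, 5, 4, 9, 20]
print(round(pearson_r(x, y), 3), round(spearman_rho(x, y), 3))
```

Because rho depends only on rank order, the large final y score has no more influence than any other top-ranked score would.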

Slide 25: Homoscedasticity

Slide 26: Non-Homoscedasticity

Slide 27: For F Ratios and t Tests
- We assume that the distribution of scores around each sample mean is similar.
- The distributions within each group all estimate the same thing, that is, sigma^2.
- The mean squares within each group should be approximately the same, differing only because of random sampling fluctuation.
- For F ratios and t tests, this is called homogeneity of variance.

Slide 28: Homogeneity of Variance
In mathematical terms, homogeneity of variance means that the mean squares for each group are about the same. We use the F_MAX test to check if the group with the smallest mean square is "too different" from the group with the largest mean square.

Slide 29: F_MAX
- If F_MAX is significant, then the mean squares deviate from each other too much.
- The assumption of homogeneity of variance is violated.
- We cannot use parametric statistics!

Slide 30: Why???
- Because all parametric statistical procedures rely on our ability to estimate sigma^2 with MS_W.
- If the mean squares differ among groups so much that F_MAX is significant, the odds are that someone (most likely the senior experimenter) messed up and created a measure with too small a range of scores.

Slide 31
When that happens, all the scores pile up at one end of the scale.
- When everyone scores at the top or bottom of a scale, individual differences and measurement problems seem to disappear.
- We call this a ceiling effect if the scores are all at the top of the scale, and a floor effect if the scores are all at the bottom.

Slide 32
- Because individual differences (ID) and measurement problems (MP) in one or more groups have been pushed up against the top or bottom of the scale, there is practically no within-group variation.
- So, while adding df, the group contributes little or nothing to the sum of squares within groups (SS_W).
- So, when you include one or more groups with practically no within-group variation in your total sum of squares and mean square, you wind up with an underestimate of sigma^2.

Slide 33
- This makes it possible to get significant results not because you have pushed the means apart with an IV, but because MS_W is an underestimate of sigma^2, and therefore the denominator of the F or t test will be too small.
- So you can get significant results more often than you should when the null is true.
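A small worked example may make the underestimate concrete. The sketch below pools SS_W and df across groups to get MS_W, first for two ordinary groups and then with a third group squeezed against the top of a hypothetical 10-point scale. All scores and function names are invented for illustration.

```python
def sum_of_squares(scores):
    """Sum of squared deviations of the scores from their own mean."""
    m = sum(scores) / len(scores)
    return sum((s - m) ** 2 for s in scores)

def ms_within(groups):
    """Pooled mean square within groups: SS_W divided by df_W."""
    ss_w = sum(sum_of_squares(g) for g in groups)
    df_w = sum(len(g) - 1 for g in groups)
    return ss_w / df_w

# Hypothetical scores on a 10-point scale.
group_1 = [3, 5, 7, 4, 6]        # mean square 2.5
group_2 = [4, 6, 8, 5, 7]        # mean square 2.5
ceiling = [9, 9, 9, 9, 10]       # ceiling effect: mean square 0.2

print(ms_within([group_1, group_2]))            # 2.5
print(ms_within([group_1, group_2, ceiling]))   # about 1.73
```

The ceiling group adds 4 df but only 0.8 to SS_W, so MS_W drops from 2.5 to about 1.73 and any F or t ratio built on it grows accordingly.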

Slide 34: Uncrowded vs Crowded Groups: How Crowded Do You Feel?
- Calculate the means.
- Calculate the deviations.
- Square the deviations.
- Sum the squared deviations.
- Divide by df (n_G - 1).
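The calculation on this slide, taken in its logical order (mean, deviations, squares, sum, divide by df), can be sketched for a single group as follows. The ratings are hypothetical and are not the values from the slide's table.

```python
# Hypothetical "How crowded do you feel?" ratings for one group.
ratings = [2, 4, 6, 8]

group_mean = sum(ratings) / len(ratings)         # step 1: the group mean
deviations = [r - group_mean for r in ratings]   # step 2: deviations from the mean
squared = [d ** 2 for d in deviations]           # step 3: squared deviations
sum_of_squares = sum(squared)                    # step 4: sum of squared deviations
df = len(ratings) - 1                            # degrees of freedom, n_G - 1
mean_square = sum_of_squares / df                # step 5: the group's mean square

print(group_mean, sum_of_squares, df, mean_square)
```

Repeating this for every group gives the set of mean squares that F_MAX compares.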

Slide 35: F_MAX
In F_MAX, the "MAX" part refers to the largest ratio that can be obtained by comparing the estimated variances from 2 experimental groups. The significance of F_MAX is checked in an F_MAX table.
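As a minimal sketch of the definition above, F_MAX is the largest group variance divided by the smallest. The groups and function names below are hypothetical. The critical value still has to be read from an F_MAX table, so the second helper only returns the two numbers used to enter that table, following the table annotations in this deck.

```python
from statistics import variance

def f_max(groups):
    """Largest group variance divided by the smallest group variance."""
    variances = [variance(g) for g in groups]  # sample variance, df = n - 1
    # Note: a group with zero variance would make this division fail.
    return max(variances) / min(variances)

def f_max_table_keys(groups):
    """(k, df) for the table: number of variances, and larger group size minus 1."""
    return len(groups), max(len(g) for g in groups) - 1

# Hypothetical ratings for two groups.
uncrowded = [2, 4, 6, 8]   # variance 20/3
crowded = [7, 8, 9, 8]     # variance 2/3

print(f_max([uncrowded, crowded]))             # about 10
print(f_max_table_keys([uncrowded, crowded]))  # (2, 3)
```

The observed ratio is then compared with the tabled critical value for that k and df.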

Slide 36
[Table: critical values of F_MAX at alpha = .01. Columns: k = number of variances (the number of groups in the experiment). Rows: df for F_MAX = n_G(larger) - 1. Interpolate to the larger df.]

Slide 37
[Table: F_MAX table indexed by k = number of variances and df for F_MAX. The entries in the body of the table are the critical values.]

Slide 38: Book Example

Slide 39
[Table: F_MAX table indexed by k = number of variances and df for F_MAX.]
The computed F_MAX exceeds the critical value. We cannot use parametric statistics.

Slide 40: You solve these

Design   Number of means   Subjects in larger n_G   Critical value of F_MAX
2X2      ?                 16                       ?
3X3      ?                 11                       ?
2X3      ?                 9                        ?

Number of means, in row order: 4, 9, 6.

Slide 41
[Table: critical values of F_MAX, indexed by k = number of variances and df for F_MAX.]

Slide 42: Answers to examples
Table columns: Design; Number of means; Subjects in larger n_G; Critical value of F_MAX.

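Slide 43: You solve these
Designs: 2X4, 2X3, 2X2, 3X3
Table columns: Number of means; MS_G max; MS_G min; F_MAX; Subjects in larger n_G; df for F_MAX; p
Entries given for the first design (2X4): number of means = 8, F_MAX = 16.5, df for F_MAX = 9, p ≤ ?
The remaining entries are marked "?".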

Slide 44
[Table: F_MAX table indexed by k = number of variances and df for F_MAX.]
F_MAX(6, 11) = 13.2, p ≤ .01

Slide 45: Answers to examples
Designs: 2X4, 2X3, 2X2, 3X3
Table columns: Number of means; MS_G max; MS_G min; F_MAX; Subjects in larger n_G; df for F_MAX; p
You cannot use the F test for any of these experiments!

Slide 46: Homogeneity of Variance Conclusions
If F_MAX is significant, then the assumption of homogeneity of variance has been violated. If the assumption of homogeneity of variance is violated, we will tend to underestimate sigma^2 and therefore cannot compute the t or F test without committing an unacceptable number of Type I errors.


Slide 48: Assumption
- The last assumption that we must meet to use parametric statistics is that the measures in our experiment use an interval scale.
- An interval scale is a set of numbers whose differences are equal at all points along the scale.

Slide 49: Examples of Interval Scales
- Integers: 1, 2, 3, 4, …
- Real numbers: 1.0, 1.1, 1.2, 1.3, …
- Time: 1 minute, 2 minutes, 3 minutes, …
- Distance: 1 foot, 2 feet, 3 feet, 4 feet, …

Slide 50: Examples of Non-Interval Scales
- Ordinal: ranks, such as first, second, third; high, medium, low; etc.
  - The difference in time between first and second can be very different from the time between second and third.
  - The median is the best measure of central tendency for ordinal data.

Slide 51: Examples of Non-Interval Scales
- Nominal: categories, such as male, female; pass, fail.
  - There is not even an order for nominal data.
  - Categories should be mutually exclusive and exhaustive.
  - The best measure of central tendency is the mode.

Slide 52: Comparing Scales
- Interval scales have more information than ordinal scales, which in turn have more information than nominal scales.
- The more information that is available, the more sensitive a given statistical test can be.

Slide 53: Example - Test Grades
Table columns: Interval Scale (SCORES); Ordinal Scale (RANKS); Nominal Scale (Pass/Fail).
Pass/Fail column: P P P P P F F F

Slide 54: Book Example - Test Grades
Table columns: Interval Scale (SCORES); Ordinal Scale (RANKS); Nominal Scale (Pass/Fail).
Pass/Fail column: P P P P P F F F
Ordinal scales show the relative order of individual measures. However, there is no information about how far apart individuals are.

Slide 55: Book Example - Test Grades
Table columns: Interval Scale (SCORES); Ordinal Scale (RANKS); Nominal Scale (Pass/Fail).
Pass/Fail column: P P P P P F F F
Categories are mutually exclusive; you either pass or fail. Categories are exhaustive; you can only pass or fail.
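The loss of information from interval to ordinal to nominal can be illustrated with a short sketch. The scores and the pass mark below are hypothetical and are not the values from the book example. They were chosen only so that five students pass and three fail, as in the Pass/Fail column above.

```python
from statistics import median, mode

# Hypothetical test scores on an interval scale.
scores = [95, 88, 82, 75, 71, 64, 58, 40]
PASS_MARK = 70  # hypothetical cut-off

# Ordinal scale: rank 1 is the highest score; distances between scores are lost.
ranked = sorted(scores, reverse=True)
ranks = [ranked.index(s) + 1 for s in scores]

# Nominal scale: only the category survives.
pass_fail = ["P" if s >= PASS_MARK else "F" for s in scores]

print(ranks)               # [1, 2, 3, 4, 5, 6, 7, 8]
print(pass_fail)           # ['P', 'P', 'P', 'P', 'P', 'F', 'F', 'F']
print(median(ranks))       # 4.5, the central tendency suited to ordinal data
print(mode(pass_fail))     # 'P', the central tendency suited to nominal data
```

The gap of 18 points between the last two scores is invisible in the ranks (7 and 8) and in the categories (both F).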

Slide 56: Interval Scale Conclusion
- Parametric tests can only be performed on interval data.
- Non-parametric tests must be used on ordinal and nominal data.
- Researchers prefer parametric tests because more information is available, which makes it easier to find:
  - significant differences between experimental group means, or
  - significant correlations between two variables.
- If any assumptions are violated, it is common practice to convert from the interval scale to another scale. Then you can use the weaker, non-parametric statistics.
- There are non-parametric statistics that correspond to all of the parametric statistics that we have studied.

Slide 57: Summary - Assumptions
To use parametric statistics, it must be true that:
- Subjects are randomly selected from the population.
- Experimental error is randomly distributed across samples in the design.
- The distribution of sample means fits a normal curve.
- There is homogeneity of variance, demonstrated by using F_MAX.
- The measures we take are on an interval scale.