Visual EDF Software to Check the Normality Assumption Dr. Maria E. Calzada Dr. Stephen M. Scariano Loyola University New Orleans

Problem 7.11 of Moore’s The Basic Practice of Statistics reads (paraphrased): Our subjects are 11 people diagnosed as being dependent on caffeine. Each subject was barred from coffee, colas, and other substances containing caffeine. Instead, they took capsules containing their normal caffeine intake. During a different time period, they took placebo capsules… Table 7.3 contains data on two of several tests given to the subjects. “Depression” is the score on the Beck Depression Inventory. Higher scores show more symptoms of depression. “Beats” is the beats per minute the subject achieved when asked to press a button 200 times as quickly as possible. We are interested in whether being deprived of caffeine affects these outcomes.

Does this matched pairs study give evidence that being deprived of caffeine raises depression scores? Check that the differences are not strikingly nonnormal. Now check the differences in beats per minute with and without caffeine. You should hesitate to use the t procedures on these data. Why?

Histogram of the difference in depression scores, as given by SPSS. Are these data normal?

Histogram of the difference in Beats score. Are these data normal?

Normal probability plot for difference in depression data

Normal probability plot for difference in beats data
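The coordinates behind a normal probability plot can be sketched as follows. This is a minimal illustration, not the SPSS procedure used for the slides: the plotting positions (i - 0.5)/n are one common convention (packages differ), the function name is hypothetical, and the example values are made-up placeholders, not the Table 7.3 differences.

```python
from statistics import NormalDist


def normal_plot_points(data):
    """Return (theoretical normal quantile, observed value) pairs.

    Points that fall close to a straight line suggest approximate normality.
    """
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting position: (i - 0.5) / n for the i-th smallest value.
    return [(std_normal.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(xs, start=1)]


# Hypothetical differences for illustration only (NOT the textbook data).
example_differences = [-1, 0, 1, 2, 2, 3, 4, 5, 6, 8, 12]
for quantile, value in normal_plot_points(example_differences):
    print(f"{quantile:7.3f}  {value}")
```

Plotting the second column against the first gives the usual picture; curvature or isolated points far from the line are the departures the slides ask students to judge.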

While slight departures from normality are usually inconsequential, substantive departures from normality can seriously impair the validity of statistical procedures framed under the “Normality Assumption.” The problem is: How can we help our students distinguish between “slight departures” and “substantive departures” from normality?

EDF TESTS FOR NORMALITY We have written a Visual Basic program that implements two tests for normality: the Lilliefors test and the Anderson-Darling test. We like the visual aspects of the Lilliefors test. We like the power of the Anderson-Darling test.

Lilliefors Test We compute the distances between the empirical distribution function (EDF) of the standardized data and the standard normal cumulative distribution function. If the largest of these distances is “not too large,” we do not have evidence to reject the “Normality Assumption.” Upper-tail percentiles for the largest distance have been tabulated using Monte Carlo simulations.
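The largest distance described above can be sketched in a few lines. This is an illustration in Python, not the authors' Visual Basic program; it assumes the data are standardized with the sample mean and the sample standard deviation (n - 1 divisor), and the function name is hypothetical. Critical values are not included and would have to come from Lilliefors' tables.

```python
from statistics import NormalDist, mean, stdev


def lilliefors_statistic(data):
    """Largest vertical distance between the EDF of the standardized
    data and the standard normal CDF."""
    xs = sorted(data)
    n = len(xs)
    m, s = mean(xs), stdev(xs)  # estimated from the sample itself
    phi = NormalDist().cdf
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = phi((x - m) / s)
        # The EDF jumps from (i - 1)/n to i/n at x, so check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Hypothetical values for illustration only (NOT the textbook data).
print(lilliefors_statistic([-1, 0, 1, 2, 2, 3, 4, 5, 6, 8, 12]))
```

Because the data are standardized first, the statistic does not change when the observations are shifted or rescaled, which is why a single table of percentiles per sample size suffices.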

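Anderson-Darling Test Based on the accumulated squared distance between the empirical distribution function and the normal cumulative distribution function, usually weighted to emphasize the tails. Our program implements the discrete Anderson-Darling test statistic developed by Stephens (1974) to approximate this accumulated distance.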
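A sketch of the discrete statistic, assuming the standard computing formula A² = -n - (1/n) Σ (2i - 1)[ln z(i) + ln(1 - z(n+1-i))], where z(i) is the standard normal CDF evaluated at the i-th smallest standardized observation. This is an illustration in Python, not the authors' Visual Basic code; the function name is hypothetical, and any small-sample adjustment and critical values should be taken from Stephens' tables rather than from this example.

```python
from math import log
from statistics import NormalDist, mean, stdev


def anderson_darling_statistic(data):
    """Unadjusted A^2 for normality with mean and variance estimated."""
    xs = sorted(data)
    n = len(xs)
    m, s = mean(xs), stdev(xs)  # sample mean and sample sd (n - 1 divisor)
    phi = NormalDist().cdf
    z = [phi((x - m) / s) for x in xs]  # z[0] <= ... <= z[n - 1]
    # Term i pairs the i-th smallest z with the i-th largest z.
    total = sum((2 * i - 1) * (log(z[i - 1]) + log(1.0 - z[n - i]))
                for i in range(1, n + 1))
    return -n - total / n


# Hypothetical values for illustration only (NOT the textbook data).
print(anderson_darling_statistic([-1, 0, 1, 2, 2, 3, 4, 5, 6, 8, 12]))
```

The logarithms grow quickly as z approaches 0 or 1, so discrepancies in the tails contribute heavily, which is consistent with the power advantage the slides attribute to this test.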

Conclusions The Anderson-Darling and Lilliefors tests are important tools that should be routinely used alongside probability plots to check the normality assumption. The stand-alone desktop program enhances the craft of exploratory data analysis. Program is available for downloading at

Some References
David S. Moore. The Basic Practice of Statistics, Second Edition. Freeman.
H. Lilliefors (1967). “On the Kolmogorov-Smirnov Test for Normality with Mean and Variance Unknown.” Journal of the American Statistical Association 62.
M. A. Stephens (1974). “EDF Statistics for Goodness of Fit and Some Comparisons.” Journal of the American Statistical Association 69.