Goodness of Fit χ²-Test


Goodness of Fit χ²-Test (İST 252, EMRE KAÇMAZ, B4 / 14.00-17.00)

Goodness of Fit χ²-Test
To test for goodness of fit means that we wish to test the hypothesis that a certain function F(x) is the distribution function of the distribution from which we have a sample x₁, …, xₙ. We then test whether the sample distribution function F̃(x) fits F(x) 'sufficiently well.'

Goodness of Fit χ²-Test
If this is so, we shall accept the hypothesis that F(x) is the distribution function of the population; if not, we shall reject the hypothesis. This test is of considerable practical importance, and it differs in character from the tests for parameters (μ, σ², etc.) considered so far. To test in this fashion, we have to know how much F̃(x) can differ from F(x) if the hypothesis is true.

Goodness of Fit χ²-Test
Hence we must first introduce a quantity that measures the deviation of F̃(x) from F(x), and we must know the probability distribution of this quantity under the assumption that the hypothesis is true. Then we proceed as follows. We determine a number c such that, if the hypothesis is true, a deviation greater than c has a small preassigned probability.

Goodness of Fit χ²-Test
If, nevertheless, a deviation greater than c occurs, we have reason to doubt that the hypothesis is true, and we reject it. On the other hand, if the deviation does not exceed c, so that F̃(x) approximates F(x) sufficiently well, we accept the hypothesis. Of course, if we accept the hypothesis, this means only that we have insufficient evidence to reject it, and it does not exclude the possibility that there are other functions that would not be rejected in the test. In this respect the situation is quite similar to that in Sec. 25.4.

Goodness of Fit χ²-Test
Table 25.7 shows a test of that type, which was introduced by R. A. Fisher. This test is justified by the fact that if the hypothesis is true, then χ₀² is an observed value of a random variable whose distribution approaches the chi-square distribution with K − 1 degrees of freedom (or K − r − 1 degrees of freedom, if r parameters are estimated) as n approaches infinity.

Goodness of Fit χ²-Test
The requirement in Table 25.7 that at least five sample values lie in each interval results from the fact that for finite n that random variable has only approximately a chi-square distribution. If the sample is so small that the requirement cannot be satisfied, one may continue with the test, but should then use the result with caution.

Table 25.7 Chi-Square Test for the Hypothesis That F(x) Is the Distribution Function of a Population from Which a Sample x₁, …, xₙ Is Taken
Step 1: Subdivide the x-axis into K intervals I₁, I₂, …, I_K such that each interval contains at least 5 values of the given sample x₁, …, xₙ. Determine the number bⱼ of sample values in the interval Iⱼ, where j = 1, …, K. If a sample value lies at a common boundary point of two intervals, add 0.5 to each of the two corresponding bⱼ.

Table 25.7
Step 2: Using F(x), compute the probability pⱼ that the random variable X under consideration assumes any value in the interval Iⱼ, where j = 1, …, K. Compute eⱼ = n·pⱼ. (This is the number of sample values theoretically expected in Iⱼ if the hypothesis is true.)
Step 3: Compute the deviation

χ₀² = Σ (bⱼ − eⱼ)² / eⱼ,  summed over j = 1, …, K.
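Steps 2 and 3 can be sketched numerically. In the sketch below the hypothesized normal parameters, interval boundaries, and observed counts are made up for illustration; they are not taken from the textbook's tables:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical H0: X ~ N(mu = 50, sigma = 10); sample of size n = 100
mu, sigma, n = 50.0, 10.0, 100

# Interval boundaries for I_1, ..., I_K (outer intervals open-ended)
edges = np.array([-np.inf, 35.0, 45.0, 55.0, 65.0, np.inf])
b = np.array([7, 25, 38, 22, 8])   # observed counts b_j (each >= 5, sum = n)

# Step 2: p_j = P(X in I_j) under F, and expected counts e_j = n * p_j
p = np.diff(norm.cdf(edges, loc=mu, scale=sigma))
e = n * p

# Step 3: the deviation chi0^2 = sum_j (b_j - e_j)^2 / e_j
chi0_sq = np.sum((b - e) ** 2 / e)
print(chi0_sq)
```

Because the outer intervals run to ±infinity, the pⱼ sum to 1 and the eⱼ sum to n, as the procedure requires.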

Table 25.7
Step 4: Choose a significance level α (5%, 1%, or the like).
Step 5: Determine the solution c of the equation P(χ² ≤ c) = 1 − α from the table of the chi-square distribution with K − 1 degrees of freedom. If r parameters of F(x) are unknown and their maximum likelihood estimates are used, then use K − r − 1 degrees of freedom (instead of K − 1). If χ₀² ≤ c, accept the hypothesis. If χ₀² > c, reject the hypothesis.
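Steps 4 and 5 amount to one quantile lookup. A minimal sketch using SciPy's chi-square quantile function, with hypothetical values for K, r, and the deviation from Step 3:

```python
from scipy.stats import chi2

alpha = 0.05        # significance level chosen in Step 4
K, r = 5, 0         # K intervals, r estimated parameters (hypothetical values)
chi0_sq = 0.50      # deviation from Step 3 (hypothetical value)

# Step 5: c solves P(chi^2 <= c) = 1 - alpha, with K - r - 1 degrees of freedom
c = chi2.ppf(1 - alpha, df=K - r - 1)
decision = "accept" if chi0_sq <= c else "reject"
print(c, decision)
```

With K − r − 1 = 4 degrees of freedom and α = 5%, c is about 9.49, so a deviation of 0.50 leads to acceptance.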

Table 25.8

Example 1: Test of Normality
Test whether the population from which the sample in Table 25.8 was taken is normal.

Solution
Table 25.8 shows the values (column by column) in the order obtained in the experiment. Table 25.9 gives the frequency distribution, and Fig. 542 the histogram. It is hard to guess the outcome of the test: does the histogram resemble a normal density curve sufficiently well or not?

Solution
The maximum likelihood estimates for μ and σ² are μ̂ = x̄ = 364.7 and σ̂² = 712.9. The computation in Table 25.10 yields χ₀² = 2.688. It is very interesting that the interval 375…385 contributes over 50% of χ₀². From the histogram we see that the corresponding frequency looks much too small. The second largest contribution comes from 395…405, and the histogram shows that the frequency seems somewhat too large, which is perhaps not obvious from inspection.
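For a normal population, the maximum likelihood estimates used here are the sample mean and the biased (divide-by-n) sample variance. A small sketch with made-up data, since Table 25.8 is not reproduced in this transcript:

```python
import numpy as np

# Hypothetical sample values (not the textbook's Table 25.8 data)
x = np.array([368.0, 352.0, 371.0, 360.0, 365.0, 359.0])
n = len(x)

mu_hat = x.mean()                            # MLE of mu: the sample mean
sigma2_hat = ((x - mu_hat) ** 2).sum() / n   # MLE of sigma^2: divides by n, not n - 1
# equivalently: np.var(x, ddof=0)
print(mu_hat, sigma2_hat)
```

Note that the MLE of σ² divides by n; the unbiased estimator (dividing by n − 1) would give a slightly larger value.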

Table 25.9

Figure 542
We choose α = 5%. Since K = 10 and we estimated r = 2 parameters, we have to use the chi-square distribution with K − r − 1 = 7 degrees of freedom. We find c = 14.07 as the solution of P(χ² ≤ c) = 95%. Since χ₀² < c, we accept the hypothesis that the population is normal.

Figure 542

Table 25.10