How Should We Assess the Fit of Rasch-Type Models? Approximating the Power of Goodness-of-fit Statistics in Categorical Data Analysis Alberto Maydeu-Olivares.

Presentation transcript:

How Should We Assess the Fit of Rasch-Type Models? Approximating the Power of Goodness-of-fit Statistics in Categorical Data Analysis
Alberto Maydeu-Olivares, Rosa Montano

Outline
Introduction
Rasch-Type Models for Binary Data
Rationale of Goodness-of-Fit Statistics
  ◦ Full Picture
  ◦ M2, R1 and R2
Estimating the Power
Empirical Comparison of R1, R2 and M2
Numerical Examples
Discussion and Conclusion

Introduction
Two properties of Rasch-Type models
  ◦ Sufficient statistics
  ◦ Specific objectivity
Estimation methods
  ◦ Specific to Rasch-Type models (CML)
  ◦ General procedures (MML via EM)
Goodness-of-fit testing procedures
  ◦ Specific to Rasch-Type models
  ◦ General to IRT or multivariate discrete data models

Introduction
Compare the performance of certain goodness-of-fit statistics for testing Rasch-Type models estimated by MML via EM
  ◦ Binary data
  ◦ 1PL (random effects)
R1 and R2 for 1PL
M2 for multivariate discrete data

Rasch model and 1PL
Fixed effects
  ◦ The distribution of ability is not specified
Random effects
  ◦ A standard normal distribution is specified for ability
  ◦ The less restrictive definition of specific objectivity still holds

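Rationale
1. High-dimensional contingency table
Response patterns for three items: (000) (100) (010) (001) (110) (101) (011) (111), plus a marginal total.
Rule of thumb: the frequency in each cell should be > 5.
There are C = 2^n cells, where n is the number of items. For example, a 20-item test has C = 2^20 = 1,048,576 cells. To satisfy the rule of thumb, a sample size of at least 1,048,576 × 5 = 5,242,880 is needed.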
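The arithmetic on this slide can be sketched in a few lines. This is only an illustration: the function name is hypothetical, and the bound assumes respondents are spread perfectly evenly over the cells, so a real test would need far more.

```python
def cells_and_min_sample(n_items: int, min_per_cell: int = 5) -> tuple[int, int]:
    """Return (number of cells, crude lower bound on the sample size).

    Assumption: the bound is simply cells * min_per_cell, i.e. the best
    case in which every response pattern is equally likely.
    """
    cells = 2 ** n_items  # each binary item doubles the number of patterns
    return cells, cells * min_per_cell


for n in (3, 10, 20):
    cells, n_min = cells_and_min_sample(n)
    print(f"{n:2d} items: {cells:>9,d} cells, sample size >= {n_min:,d}")
```

The exponential growth in the number of cells is what motivates pooling cells into lower-order margins.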

[Table: observed proportions and probabilities under the model for the response patterns (000) through (111), with marginal totals; one observed proportion shown is 0.07.]

3. Limited information approach (M2)
Pooling cells of the contingency table
When the order r = 2, Mr becomes M2
M2 uses only the univariate and bivariate information
The degrees of freedom are n(n+1)/2 − q, where q is the number of estimated parameters
It is the statistic of choice for testing IRT models
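A minimal sketch of what "univariate and bivariate information" means for binary data: the proportion endorsing each item and the proportion endorsing each pair of items. The function name and the toy data are hypothetical, and this only computes the observed margins, not the M2 statistic itself.

```python
from itertools import combinations


def low_order_margins(data):
    """Observed univariate and bivariate proportions of 0/1 responses.

    data: list of equal-length tuples of 0/1 values, one per respondent.
    """
    n_resp = len(data)
    n_items = len(data[0])
    # Univariate: proportion of respondents answering item j with 1.
    univariate = [sum(row[j] for row in data) / n_resp for j in range(n_items)]
    # Bivariate: proportion answering both item j and item k with 1.
    bivariate = {
        (j, k): sum(row[j] * row[k] for row in data) / n_resp
        for j, k in combinations(range(n_items), 2)
    }
    return univariate, bivariate


# Toy data: 8 respondents, 3 items (illustrative only).
responses = [
    (0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1),
    (0, 1, 0), (1, 0, 1), (1, 1, 1), (0, 0, 0),
]
uni, bi = low_order_margins(responses)
print(uni)
print(bi)
print(len(uni) + len(bi))  # n + n(n-1)/2 = n(n+1)/2 pooled quantities
```

For n items this pools the 2^n cells into n(n+1)/2 quantities, which is why the approach remains usable when the full table is sparse.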

3. Limited information approach (R1 and R2)
R1
  ◦ Degrees of freedom: n(n-2)
  ◦ Specific to the assumption of monotone increasing and parallel item response functions
R2
  ◦ Degrees of freedom: (n(n-2)+2)/2
  ◦ Specific to the unidimensionality assumption

Estimating the Asymptotic Power Rate
Under a sequence of local alternatives
  ◦ The noncentrality parameter of the chi-square distribution can be calculated, given the df, for M2, R1 and R2
The Kullback-Leibler discrepancy function can be used
  ◦ The minimizer of DKL between a “true” model and a null model is the same as the maximizer of the likelihood function
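Once a noncentrality parameter is available, the asymptotic power is the probability that a noncentral chi-square variable exceeds the critical value of the central chi-square with the same df. The sketch below uses only the standard library and is an assumption-laden illustration, not the authors' code: all function names are hypothetical, the noncentrality parameter `ncp` is taken as given, and the CDFs are computed with a plain power series that is adequate for moderate df and `ncp` only.

```python
import math


def reg_lower_gamma(a, x, terms=500):
    """Regularized lower incomplete gamma P(a, x) via its power series."""
    if x <= 0:
        return 0.0
    term = 1.0 / a
    total = term
    for k in range(1, terms):
        term *= x / (a + k)
        total += term
        if term < 1e-15 * total:
            break
    return min(1.0, total * math.exp(a * math.log(x) - x - math.lgamma(a)))


def chi2_cdf(x, df):
    """CDF of the central chi-square distribution."""
    return reg_lower_gamma(df / 2.0, x / 2.0)


def chi2_critical(df, alpha=0.05):
    """Upper-alpha critical value of the central chi-square, by bisection."""
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < 1.0 - alpha:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def noncentral_chi2_cdf(x, df, ncp, max_terms=1000):
    """Noncentral chi-square CDF as a Poisson mixture of central CDFs."""
    if ncp == 0:
        return chi2_cdf(x, df)
    half = ncp / 2.0
    total = 0.0
    for j in range(max_terms):
        weight = math.exp(-half + j * math.log(half) - math.lgamma(j + 1))
        total += weight * chi2_cdf(x, df + 2 * j)
        if j > half and weight < 1e-16:
            break
    return total


def asymptotic_power(df, ncp, alpha=0.05):
    """P(noncentral chi-square(df, ncp) > central critical value)."""
    return 1.0 - noncentral_chi2_cdf(chi2_critical(df, alpha), df, ncp)


for ncp in (0.0, 5.0, 10.0, 20.0):
    print(f"df = 9, ncp = {ncp:4.1f}: power = {asymptotic_power(9, ncp):.3f}")
```

With `ncp = 0` the function returns the nominal rate alpha, and for a fixed noncentrality a statistic with fewer degrees of freedom has higher power, which is one reason statistics with different df can differ in power against the same alternative.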

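Study 1: Accuracy of p-values under the correct model
If a statistic follows its reference chi-square distribution, its mean equals df and its variance equals 2 df, so both the mean and ½ the variance of the simulated statistics should be close to df.
In another study by Montano (2009), M2 performed better than R1, and the discrepancies between the empirical and asymptotic rejection rates were not large.
Group the sum scores ->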

The degrees of freedom are also adjusted.
This is an iterative procedure.
When appropriate score ranges are used, the empirical rejection rates of R1 should closely match the theoretical rejection rates.
The same should be done for R2.
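The moment check used in Study 1 (df = mean, df = ½ variance) can be illustrated as follows. This is a sketch under a stated assumption: draws from a central chi-square distribution stand in for the statistic values that would be computed on replicated data sets, and the function name, seed, and df are chosen purely for illustration.

```python
import random
import statistics


def moment_check(values):
    """Return (mean, half the variance) of a list of statistic values.

    For a chi-square distributed statistic both should be close to df.
    """
    mean = statistics.fmean(values)
    half_var = statistics.variance(values) / 2.0
    return mean, half_var


rng = random.Random(2024)  # fixed seed so the run is reproducible
df = 9                     # illustrative value
# A chi-square(df) variable is a gamma variable with shape df/2 and scale 2.
simulated = [rng.gammavariate(df / 2.0, 2.0) for _ in range(20000)]

mean, half_var = moment_check(simulated)
print(f"df = {df}, mean = {mean:.2f}, half variance = {half_var:.2f}")
```

If either quantity were far from df for a real statistic, its asymptotic p-values would be suspect for that sample size and model.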

Study 2: Asymptotic Power to reject a 2PL

Study 3: Empirical Power to reject a 2PL

Study 4: Asymptotic Power to reject a 3PL

Study 5: Asymptotic Power to reject a multidimensional model

Empirical Example 1: LSAT 7 Data
The ordering of the statistics by their value/df ratio agrees with their ordering by power.

Empirical Example 2: Chilean Mathematical Proficiency Data

Discussion and Conclusions
Generally, M2 is more powerful than R1 and R2. That is, R1 and R2, which were developed specifically for Rasch-type models, are not superior to the general M2.