The Impact of Missing Data on the Detection of Nonuniform Differential Item Functioning W. Holmes Finch.

Outline
Introduction
DIF detection
Missing data
– Types
– Methods for dealing with missing data
   Listwise deletion
   Omitted as incorrect
   Multiple imputation
   Stochastic regression imputation
Objective of the present study
Method
Results
Discussion

Introduction
Prior research has focused on the impact of missing data on uniform DIF analyses.
Results showed that type I error rates were inflated, so that items were mistakenly identified as displaying DIF, and that power for DIF detection in the presence of missing data was low.
This paper examines the impact of missing data on the detection of nonuniform DIF.

DIF Detection
Uniform DIF:
– The reference group has a consistent advantage over the focal group in the likelihood of responding correctly to an item at all levels of the trait being measured.
Nonuniform DIF:
– The reference group has an advantage in responding correctly to an item at some trait levels, whereas at other levels the focal group has the advantage.
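
The distinction can be illustrated with item characteristic curves. The sketch below is not taken from the slides: the parameter values are hypothetical, and it assumes uniform DIF arises from a difference in item difficulty (b) and nonuniform DIF from a difference in item discrimination (a). It prints the reference-minus-focal gap in the probability of a correct response at several trait levels.

```python
import math

def icc(theta, a, b, c=0.0):
    """3PL item characteristic curve; with c = 0 it reduces to the 2PL."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

thetas = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Uniform DIF (hypothetical values): equal discrimination, but the item is
# harder (larger b) for the focal group, so the reference group leads everywhere.
uniform_gap = [icc(t, a=1.0, b=0.0) - icc(t, a=1.0, b=0.5) for t in thetas]

# Nonuniform DIF (hypothetical values): equal difficulty, different
# discrimination, so the two curves cross at theta = b and the gap changes sign.
nonuniform_gap = [icc(t, a=1.5, b=0.0) - icc(t, a=0.7, b=0.0) for t in thetas]

for t, u, n in zip(thetas, uniform_gap, nonuniform_gap):
    print(f"theta={t:+.1f}  uniform gap={u:+.3f}  nonuniform gap={n:+.3f}")
```

In the uniform case the gap keeps the same sign at every trait level; in the nonuniform case it is negative below the crossing point and positive above it.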

Methods of Nonuniform DIF Detection
IRT likelihood ratio test (IRTLR)
Logistic regression (LR)
Crossing SIBTEST (CSIB)
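
For orientation, the logistic regression approach is commonly written as the model below. The slides do not give this equation; the symbols are assumptions introduced here, with u the scored item response, x a matching variable such as the total test score, and g a group indicator (0 = reference, 1 = focal).

```latex
\operatorname{logit} P(u = 1 \mid x, g) = \beta_0 + \beta_1 x + \beta_2 g + \beta_3 (x \cdot g)
```

Under this formulation, a nonzero \beta_2 with \beta_3 = 0 corresponds to uniform DIF, while a nonzero interaction term \beta_3 corresponds to nonuniform DIF.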

Types of Missing Data
Missing completely at random (MCAR)
– Some respondents leave an item unanswered in a completely random fashion, with no systematic mechanism associated with the missingness.
Missing at random (MAR)
– The probability of an observation containing missing data is associated directly with a measurable variable.
Missing not at random (MNAR)
– The likelihood of being missing is associated with the value of the variable itself.

Listwise Deletion (LD)
If an individual fails to respond to any item on the instrument, his or her data are excluded from the DIF analyses.
It is easy to employ and is the default in many statistical software packages.
It reduces the effective sample size, which can in turn lead to a notable reduction in statistical power for hypothesis testing of DIF.
It has been associated with biased estimates in some situations, except when the data are MCAR.
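
A minimal sketch of listwise deletion on a toy response matrix follows. The data and the function name `listwise_delete` are hypothetical, and `None` is assumed to mark a missing response.

```python
# Rows are examinees, columns are items; None marks a missing response
# (hypothetical toy data, not from the study).
responses = [
    [1, 0, 1, 1],
    [1, None, 0, 1],
    [0, 0, 1, None],
    [1, 1, 1, 0],
]

def listwise_delete(rows):
    """Keep only examinees who answered every item."""
    return [row for row in rows if all(x is not None for x in row)]

complete = listwise_delete(responses)
print(len(responses), "examinees before,", len(complete), "after")
```

Half of the toy sample is discarded even though only two responses are missing, which is the loss of effective sample size described above.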

Omitted as incorrect: zero imputation (ZI)
Missing responses are assigned an incorrect value, that is, a zero in the case of dichotomously scored items.
This approach can lead to biased parameter estimates and hypothesis test results.
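
The following sketch shows zero imputation on hypothetical toy data (the function names and the use of `None` for a missing response are assumptions). It compares the proportion correct on one item before and after imputation to show how treating omissions as incorrect can shift item statistics.

```python
# Rows are examinees, columns are items; None marks a missing response
# (hypothetical toy data, not from the study).
responses = [
    [1, 1, 1, 1],
    [1, None, 0, 1],
    [0, None, 1, 1],
    [1, 1, 1, 0],
]

def zero_impute(rows):
    """Score every missing response as incorrect (0)."""
    return [[0 if x is None else x for x in row] for row in rows]

def proportion_correct(rows, item):
    """Proportion correct on one item, ignoring missing responses."""
    observed = [row[item] for row in rows if row[item] is not None]
    return sum(observed) / len(observed)

imputed = zero_impute(responses)
p_observed = proportion_correct(responses, 1)
p_zero_imputed = proportion_correct(imputed, 1)
print(p_observed, p_zero_imputed)
```

On this toy item the proportion correct drops from 1.0 among observed responses to 0.5 after zero imputation.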

Multiple imputation (MI)
MI can incorporate information from all variables in a data set to derive imputed values for those that are missing.
The MI algorithm assumes a multivariate normal distribution among the variables and that the data are MAR or MCAR.
MI has been associated with accurate parameter estimation and with statistical power rates comparable with those obtained with complete data.

Stochastic regression imputation (SRI)
SRI involves a two-step process.
In the first step, the distribution of relative frequencies of each response category is obtained from the observed data for each member of the sample. Missing values for that member are then replaced by random draws from the multinomial distribution with parameters equal to those relative frequencies.
In the second step, logistic regression (LR) is conducted for the target variable in each of the M completed data sets, with the other variables in the data set serving as the independent variables.
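
The first step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the toy data, the seed, and the choice to leave a row untouched when it has no observed responses are all hypothetical, and the regression step is not shown.

```python
import random
from collections import Counter

def impute_from_person_frequencies(row, rng):
    """Replace each missing value (None) in one examinee's responses with a
    draw from that examinee's observed distribution of response categories."""
    observed = [x for x in row if x is not None]
    if not observed:
        return list(row)  # assumption: nothing to base a draw on, leave as is
    counts = Counter(observed)
    categories = sorted(counts)
    weights = [counts[c] for c in categories]
    return [
        rng.choices(categories, weights=weights, k=1)[0] if x is None else x
        for x in row
    ]

rng = random.Random(1)  # fixed seed so the sketch is repeatable
responses = [
    [1, 1, None, 1],      # all observed responses are 1, so the draw must be 1
    [0, None, 1, None],
    [1, 0, 0, 1],
]

M = 5  # number of imputed data sets (hypothetical choice)
imputed_sets = [
    [impute_from_person_frequencies(row, rng) for row in responses]
    for _ in range(M)
]
print(imputed_sets[0])
```

Each of the M completed data sets would then be passed on to the second step.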

Prior research
Sedivy et al. (2006)
– Graded response model (GRM)
– LR and Poly-SIBTEST (uniform DIF)
– Lowest value imputation
– Type I error rates were rarely inflated, and power was diminished at higher levels of missing data.
Banks and Walker (2006)
– 3PL dichotomous model
– LD and ZI
– Type I error rates were inflated for ZI but not for LD, and power for detecting DIF was higher for ZI than for LD.

Prior research
Robitzsch and Rupp (2009)
– Mantel-Haenszel (MH) and LR
– LD, ZI, MI, and two-way imputation
– ZI resulted in inflated type I error rates.
– DIF method, sample size, and number of items had relatively little impact on the type I error and power rates.
Finch (2011)
– MI, LD, and ZI
– ZI was associated with type I error inflation and, in some cases, low power.
– The DIF detection methods used (SIBTEST, MH, or LR) were not affected differentially by the presence of missing data.

Method
3PL model
20 and 40 items
1 DIF item
Sample size: 250/250, 500/500, 1000/1000
Impact: (0,0), (0,-0.5), (0,0.5)
Percentage of missing data: 0%, 10%, 20%, 30%
Magnitude of DIF: 0, 0.4, 0.8, and 1
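
A rough sketch of how data for one cell of such a design could be generated is given below. The slides do not describe the generating code, so everything here is an assumption: the item parameter ranges, the seed, the reading of impact (0,-0.5) as reference and focal trait means, and the choice to induce nonuniform DIF by shifting the discrimination parameter of the first item.

```python
import math
import random

def p_correct(theta, a, b, c):
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def simulate_group(n, mean, items, rng):
    """Draw n examinees with theta ~ N(mean, 1) and score each item 0/1."""
    data = []
    for _ in range(n):
        theta = rng.gauss(mean, 1.0)
        data.append([1 if rng.random() < p_correct(theta, a, b, c) else 0
                     for (a, b, c) in items])
    return data

rng = random.Random(2024)  # hypothetical seed
n_items = 20

# Hypothetical item parameters (a, b, c) for the reference group.
ref_items = [(1.0, rng.uniform(-1.5, 1.5), 0.2) for _ in range(n_items)]

# Assumption: the single DIF item is item 0, and the DIF magnitude (0.8 here)
# is applied to its discrimination parameter for the focal group.
foc_items = list(ref_items)
a0, b0, c0 = ref_items[0]
foc_items[0] = (a0 + 0.8, b0, c0)

reference = simulate_group(250, 0.0, ref_items, rng)   # assumed mean 0
focal = simulate_group(250, -0.5, foc_items, rng)      # assumed mean -0.5
print(len(reference), len(focal), len(reference[0]))
```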

Type of missing data
MCAR: responses from across both groups on the target item were randomly selected to be missing.
MAR1: only members of the focal group were randomly selected to have missing data on the target item (the missing data mechanism was associated with group membership).
MAR2: examinees with total scores at or below the 30th percentile were selected to have missing data (individuals with relatively lower trait levels tend to leave the target item blank).
MNAR: missing data were taken only from those who had an incorrect response to the target item (examinees who did not know the correct answer to an item left it blank).
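
The four mechanisms can be sketched as a single function that blanks out the target item for a chosen set of examinees. This is an illustration, not the study's code: the function name, the toy data, the percentile computation, and the rule that the number of deleted responses is the missing rate times the total sample (capped by the size of the eligible pool) are assumptions.

```python
import math
import random

def apply_missing(responses, groups, target, rate, mechanism, rng):
    """Return a copy of responses with the target item set to None for
    examinees selected under the given missing data mechanism."""
    n = len(responses)
    if mechanism == "MCAR":
        eligible = list(range(n))
    elif mechanism == "MAR1":
        eligible = [i for i in range(n) if groups[i] == "focal"]
    elif mechanism == "MAR2":
        totals = [sum(row) for row in responses]
        cutoff = sorted(totals)[max(0, math.ceil(0.30 * n) - 1)]  # 30th percentile
        eligible = [i for i in range(n) if totals[i] <= cutoff]
    elif mechanism == "MNAR":
        eligible = [i for i in range(n) if responses[i][target] == 0]
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    k = min(round(rate * n), len(eligible))
    chosen = set(rng.sample(eligible, k))
    return [[None if (i in chosen and j == target) else x
             for j, x in enumerate(row)]
            for i, row in enumerate(responses)]

rng = random.Random(7)  # hypothetical seed and toy data
responses = [[rng.randint(0, 1) for _ in range(10)] for _ in range(200)]
groups = ["reference"] * 100 + ["focal"] * 100

result = {m: apply_missing(responses, groups, 0, 0.20, m, rng)
          for m in ("MCAR", "MAR1", "MAR2", "MNAR")}
for m, data in result.items():
    print(m, sum(row[0] is None for row in data))
```

Only the pool of eligible examinees differs across mechanisms; the deletion step itself is the same.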

Results

LR
Power was higher for greater levels of DIF.
Impact = 0/0
– Power for the LD method was slightly lower than in the complete data condition, except when the data were MNAR.
– For ZI, power rates were relatively low in the MAR1 and MCAR conditions.
Impact = 0/-0.5
– Power for all conditions was somewhat lower than for the other two impact conditions.
– Power for LD was slightly lower than for the complete data, except under MAR2.
– The higher power for SRI might have resulted from inflated type I error.
Impact = 0/+0.5
– Power under most of the simulated conditions was higher than when impact = 0/-0.5.
– Power for MI was typically comparable with or higher than that for LD, with the exception of MAR1 data and the lowest DIF condition.

Results

Discussion
Prior research on uniform DIF and missing data
– No single approach could be identified as optimal for all conditions.
– ZI can always be viewed as the least optimal missing data approach for uniform DIF detection.
The current study on nonuniform DIF and missing data
– ZI did not always result in type I error inflation for nonuniform DIF detection when the data were MCAR or MNAR.
– LD produced results very similar to those obtained with the complete data.
– Overall, MI appears to be much preferable to SRI: type I error inflation was much more severe for SRI than for MI.