His Name Shall Be Revered …


IN THE NAME OF GOD

Differential Item Functioning (DIF)

Latent Construct
Unlike directly observable measures such as height or weight, researchers may not be able to measure variables such as depression directly. Such unobserved variables are called latent constructs. To measure a latent construct, we can collect indicators from a multiple-item scale that represent the underlying construct.
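To make the idea of indicators driven by an unobserved construct concrete, the following is a minimal simulation sketch. It assumes a Rasch-type model, in which the probability of endorsing an item depends only on the difference between a person's latent trait and the item's difficulty. The model choice, the function names (`simulate_responses`, `pearson`), the item difficulties and the sample size are illustrative assumptions and do not come from the slides.

```python
import math
import random


def simulate_responses(n_persons=500, difficulties=(-1.5, -0.5, 0.0, 0.5, 1.5), seed=0):
    """Simulate 0/1 item responses driven by one latent trait (hypothetical setup)."""
    rng = random.Random(seed)
    thetas, responses = [], []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)  # latent trait; unobserved in real data
        row = []
        for b in difficulties:
            # Rasch-type probability of endorsing an item with difficulty b
            p = 1.0 / (1.0 + math.exp(-(theta - b)))
            row.append(1 if rng.random() < p else 0)
        thetas.append(theta)
        responses.append(row)
    return thetas, responses


def pearson(x, y):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


thetas, responses = simulate_responses()
sum_scores = [sum(row) for row in responses]  # observed indicator of the latent trait
r = pearson(thetas, sum_scores)
print(f"correlation between latent trait and sum score: {r:.2f}")
```

In real data only `responses` would be available; the sum score (or a model-based estimate) stands in for the latent trait, which is why the quality of the individual items matters.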

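Comparisons among individuals or groups
Each individual may have a different view of how to fill out a test or questionnaire, and symptoms or test results may be common to two or more co-occurring conditions. Differential item functioning (DIF) analysis checks whether respondents from different groups who have the same level of the underlying construct respond differently to an item.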

DIF Types
Uniform DIF and non-uniform DIF.
When symptoms overlap across co-occurring conditions, diagnoses, treatment decisions and inferences about the effectiveness of treatments for these conditions can be biased.
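One common way to formalize the two DIF types is through an ordinal logistic regression model for a single item. The notation below is an assumption introduced for illustration and is not taken from the slides: $Y$ is the item response, $k$ a response category, $\theta$ the matching variable (for example the total score or an IRT trait estimate), and $G$ a group indicator.

$$
\operatorname{logit} P(Y \ge k \mid \theta, G) = \alpha_k + \beta_1 \theta + \beta_2 G + \beta_3 (\theta \times G)
$$

Under this formulation, uniform DIF corresponds to $\beta_2 \ne 0$ with $\beta_3 = 0$: the groups differ by a constant shift at every level of $\theta$. Non-uniform DIF corresponds to $\beta_3 \ne 0$: the size, and possibly the direction, of the group difference changes with $\theta$. Each type is usually tested by comparing nested models with and without the relevant term.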

DIF Detection Methods
Non-parametric methods and parametric methods. Approaches include exploratory factor analysis (EFA), confirmatory factor analysis (CFA), the multiple indicator multiple cause (MIMIC) model, item response theory (IRT), ordinal logistic regression (OLR) and structural equation modeling (SEM).
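The slides do not name a specific non-parametric method. As an illustration, the sketch below uses the Mantel-Haenszel procedure, a widely used non-parametric DIF statistic for binary items. Respondents are stratified by a matching score, a 2x2 table (group by correct/incorrect) is built in each stratum, and a common odds ratio is pooled across strata. The function name `mantel_haenszel_dif`, the helper `expand`, the group labels and the toy counts are hypothetical. The factor -2.35 is the conventional rescaling to the ETS delta metric.

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(item, total, group):
    """Return (alpha_MH, delta_MH) for one binary item.

    item  : 0/1 responses to the studied item
    total : matching score per respondent (defines the strata)
    group : "ref" or "focal" per respondent
    """
    # Per stratum: [A, B, C, D] = ref correct, ref incorrect,
    #                             focal correct, focal incorrect
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for x, s, g in zip(item, total, group):
        idx = (0 if g == "ref" else 2) + (0 if x == 1 else 1)
        tables[s][idx] += 1

    num = den = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    if num == 0 or den == 0:
        raise ValueError("odds ratio undefined: too few responses in the strata")

    alpha = num / den                # common odds ratio; 1.0 means no DIF
    delta = -2.35 * math.log(alpha)  # ETS delta scale; 0.0 means no DIF
    return alpha, delta


def expand(score, grp, n_correct, n_incorrect):
    """Build (response, score, group) rows from cell counts (toy-data helper)."""
    return [(1, score, grp)] * n_correct + [(0, score, grp)] * n_incorrect


# Toy data: two score strata in which the item favours the reference group.
rows = (expand(1, "ref", 6, 4) + expand(1, "focal", 3, 7)
        + expand(2, "ref", 8, 2) + expand(2, "focal", 6, 4))
item, total, group = zip(*rows)
alpha, delta = mantel_haenszel_dif(item, total, group)
print(f"alpha_MH = {alpha:.2f}, delta_MH = {delta:.2f}")
```

An odds ratio above 1 (negative delta) indicates that, at the same matching score, the reference group is more likely to answer the item correctly. This statistic targets uniform DIF; detecting non-uniform DIF generally requires a model with a group-by-trait interaction, such as logistic regression or IRT.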

Item-level informant discrepancies across obese–overweight children and their parents on the PedsQL™ 4.0 instrument: an iterative hybrid ordinal logistic regression
Abstract
Purpose: Child obesity has become a major health concern worldwide. In order to provide successful intervention strategies, it is necessary to understand how obese–overweight children and their parents perceive obesity and its consequences on child’s health-related quality of life (HRQoL). This study aimed to assess measurement equivalence of the PedsQL™ 4.0 across obese–overweight children and their parents.
Methods: The items in the PedsQL™ 4.0 were analysed for differential item functioning (DIF) across obese–overweight children and their parents using an iterative hybrid ordinal logistic regression/item response theory approach. The sample included 647 overweight–obese children and their parents, who completed child and parent reports of the PedsQL™ 4.0, respectively.
Results: Overall, 17 out of 23 (74 %) items were flagged with DIF across two groups: eight items exhibited uniform DIF and nine items non-uniform DIF. In addition, parents of obese children rated the child’s HRQoL significantly lower than their children in all domains of the PedsQL™ 4.0, and this finding did not change whether or not items with uniform DIF were included.
Conclusions: Although obese–overweight children and their parents interpret items of the PedsQL™ 4.0 in a conceptually different manner, removing or retaining DIF items in the subscales had no significant effects on group differences. Accordingly, it appears that observed differences in HRQoL scores across child and parent reports are a true difference and not a reflection of measurement artefact.
Keywords: Obese, Children, Parents, Quality of life, Differential item functioning

Differential Item Functioning in a Computerized Adaptive Test of Functional Status for People with Shoulder Impairments is Negligible across Pain Intensity, Gender, and Age Groups
Abstract
People with shoulder impairments (N = 3,767) reported upper extremity function using a 37-item shoulder-specific computerized adaptive test (shoulder CAT). The authors determined whether items of the shoulder CAT have differential item functioning (DIF) by pain intensity (low and high), gender (men and women), and age groups (young-adult, middle-aged and old-adult). They assessed whether items have uniform and/or non-uniform DIF using ordinal logistic regression and item response theory approaches and applied large and small DIF criteria to assess the magnitude of DIF. The analyses revealed that uniform DIF was absent in all 37 items. Only six items exhibited non-uniform DIF using the large DIF criterion. Adjusting the person-ability measures for DIF had minimal practical impact on the overall measure of shoulder function estimated using the shoulder CAT. The shoulder CAT provided a precise measurement of function without discriminating for pain intensity, gender, and age among patients referred to rehabilitation with shoulder impairment.
Keywords: Item bias, shoulder CAT, orthopedic disorders

Application of item response theory to achieve cross-cultural comparability of occupational stress measurement.
Abstract
Our objective was to examine cross-cultural comparability of standard scales of the Effort-Reward Imbalance occupational stress scales by item response theory (IRT) analyses. Data were from 20,256 Japanese employees, 1464 Dutch nurses and nurses' aides, 2128 representative employees from post-communist countries, 963 Swedish representative employees, 421 Chinese female employees, 10,175 employees of the French national gas and electric company and 734 Spanish railroad employees, sanitary personnel and telephone operators. The IRT likelihood ratio model was used for differential item functioning (DIF) and differential test functioning (DTF) analyses. Despite the existence of DIF, most comparisons did not show discernible differences in the relations between Effort-Reward total score and level of the underlying trait across cultural groups. In the case that DTF was suspected, excluding an item with significant DIF improved the comparability. The full cross-cultural comparability of Effort-Reward Imbalance scores can be achieved with the help of IRT analysis.

Item bias in indices measuring psychosocial work environment and health.
Abstract
OBJECTIVES: The main purpose of this study was to demonstrate the relevance of testing indices concerning the psychosocial work environment by item bias or differential item functioning (DIF) analysis. Especially when the work environment for different groups is compared, this kind of construct validation is important. Gender, age, and occupational group were selected as exogenous variables.
METHODS: Data were taken from a cross-sectional study of Danish employees aged 19-59 years (N=5940). The study was carried out in 1990 and followed up in 1995.
RESULTS: Item bias was demonstrated in all indices when analyzed in relation to gender, age, and occupational groups of the total population. Item bias was much weaker or disappeared as the population was divided into main occupational groups and analyzed in relation to the same exogenous variables.
CONCLUSIONS: For a heterogeneous group of employees, gender, age, and occupational status are significant determinants of the response pattern in relation to indices of the psychosocial work environment. It was concluded that, if the psychosocial work environment for different groups is to be compared, indices should always be tested for item bias in relation to the exogenous variables included in the final analyses. Indices should only be used if there is no item bias. If such indices cannot be constructed, it is suggested that researchers either concentrate on constructing indices that are valid in subgroups or report results based on single-item analyses.

Thank You