A MULTIDIMENSIONAL APPROACH TO THE IDENTIFICATION OF TEST FAIRNESS: EXPLORATION OF THREE MULTIPLE-CHOICE SSC PAPERS IN PAKISTAN. Syed Muhammad Fahad Latifi.


A MULTIDIMENSIONAL APPROACH TO THE IDENTIFICATION OF TEST FAIRNESS: EXPLORATION OF THREE MULTIPLE-CHOICE SSC PAPERS IN PAKISTAN
Syed Muhammad Fahad Latifi, Dr. Thomas Christie

- Perspectives on test fairness
- Substantive / judgmental analysis
- Statistical analysis
- Dimensionality of content:
  - Primary dimension
  - Secondary dimension (impact or bias)
- A test item that measures a dimension other than the primary dimension produces differential item functioning (DIF).
- A bundle of test items that measures a dimension other than the primary dimension produces differential bundle functioning (DBF).

- Multidimensionality of a test item / bundle of items.
- DIF/DBF can be uniform or non-uniform.

- DIF/DBF can have interesting explanations; e.g., it may be due to item format characteristics, subject-matter-related factors, or the cognitive skills measured by the test.
- Males are treated as the reference group (majority group).
- Females are treated as the focal group (minority group).
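The slides do not state which statistical procedure was used to flag DIF. As an illustration only, the sketch below assumes the Mantel-Haenszel procedure with the ETS A/B/C labels, which matches the "Level-C" wording used later in the slides. The function names and the sample counts are hypothetical, and the classification uses effect-size magnitude only (the significance tests that normally accompany the ETS rules are omitted).

```python
import math

def mantel_haenszel_alpha(strata):
    """Common odds ratio across matched score levels.

    strata: list of (ref_correct, ref_wrong, focal_correct, focal_wrong),
    one tuple per total-score level used to match examinees.
    Reference group = males, focal group = females, as in the slides.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty score levels
        num += a * d / n
        den += b * c / n
    return num / den

def mh_d_dif(alpha):
    """ETS delta-scale effect size; negative values favor the reference group."""
    return -2.35 * math.log(alpha)

def ets_level(d_dif):
    """Magnitude-only version of the ETS labels (assumption: no significance test)."""
    size = abs(d_dif)
    if size < 1.0:
        return "A"  # negligible
    if size < 1.5:
        return "B"  # moderate
    return "C"      # large ("severe")

# Hypothetical counts for one item at two matched score levels.
example_strata = [(30, 20, 20, 30), (40, 10, 30, 20)]
alpha = mantel_haenszel_alpha(example_strata)
print(ets_level(mh_d_dif(alpha)))
```

Matching on total score before comparing groups is what separates DIF from a plain difference in group means (impact): examinees are compared only with others at the same ability level.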

RESULTS

Phase One-DIF
One item in each subject was found with severe DIF, i.e., only 3.7% of the total item pool.

Phase One-DIF (cont.): Three Severe DIF Items

Phase One-DIF (cont.):

Three Severe DIF Items: actual item text

Phase Two-DBF
- Analogous to DIF, DBF is conceptualized as several DIF items acting in concert to produce an item bundle that favors matched examinees from one group over another, as judged by the bundle score.
- The term "bundle" indicates a set of items grouped together because they share a common content dimension, a cognitive similarity, or a common item structure.
- Four organizing principles are suggested in the literature: 1) Test Specification, 2) Content Analysis, 3) Psychological Analysis, and 4) Empirical Analysis.
- Test Specification is the organizing principle used in this study.
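The bundle-score comparison described above can be sketched as follows. This is a simplified, uncorrected illustration and not the procedure used in the study, which the slides do not name: examinees are matched on their score on the items outside the bundle, and the reference-minus-focal difference in mean bundle score is averaged over matching levels, weighted by focal-group counts. The function name and the response data are hypothetical.

```python
from collections import defaultdict

def bundle_difference(examinees, bundle):
    """Weighted mean difference in bundle score (reference minus focal).

    examinees: list of (group, responses), group is "ref" or "focal",
               responses is a list of 0/1 item scores.
    bundle:    set of item indices that form the bundle.
    """
    cells = defaultdict(lambda: {"ref": [], "focal": []})
    for group, responses in examinees:
        bundle_score = sum(responses[i] for i in bundle)
        # Matching score: number correct on the items outside the bundle.
        matching_score = sum(r for i, r in enumerate(responses) if i not in bundle)
        cells[matching_score][group].append(bundle_score)

    weighted = 0.0
    total_focal = 0
    for scores in cells.values():
        ref, focal = scores["ref"], scores["focal"]
        if not ref or not focal:
            continue  # a level needs both groups to be comparable
        diff = sum(ref) / len(ref) - sum(focal) / len(focal)
        weighted += len(focal) * diff
        total_focal += len(focal)
    return weighted / total_focal if total_focal else 0.0

# Hypothetical four-item test; items 2 and 3 form the bundle.
sample = [
    ("ref", [1, 1, 1, 1]),
    ("ref", [1, 1, 1, 0]),
    ("focal", [1, 1, 1, 0]),
    ("focal", [1, 1, 0, 0]),
    ("ref", [1, 0, 1, 0]),
    ("focal", [1, 0, 1, 0]),
]
print(bundle_difference(sample, {2, 3}))
```

A positive value means matched reference-group examinees score higher on the bundle; a value near zero means the bundle behaves similarly for both groups.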

Phase Two-DBF (cont.) :

Phase Two-DBF (cont.):
However, DBF is controversial because of amplification and cancellation effects. Small item-level differences, which may go unnoticed individually, can be magnified when the same differences are evaluated across a bundle; this is called DIF amplification. DIF cancellation occurs when one bundle of items exhibits DIF against one group while another bundle exhibits DIF against the other group, so that the two effects cancel out.
The authors of the 1999 Standards for Educational and Psychological Testing state: “Although DIF procedures may hold some promise for improving test quality, there has been little progress in identifying the causes or substantive themes that characterize items exhibiting DIF. That is, once items on a test have been statistically identified as functioning differently from one examinee group to another, it has been difficult to specify the reasons for the differential performance…” (p. 78).
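The amplification and cancellation effects can be illustrated with a toy calculation. It assumes that signed item-level effects simply add up within a bundle, which is a simplification, and the effect sizes are invented for illustration rather than taken from the study.

```python
def bundle_effect(item_effects):
    """Sum of signed item-level DIF effects (positive favors the reference group)."""
    return sum(item_effects)

# Amplification: five items, each with an effect too small to flag alone.
small_same_direction = [0.03, 0.03, 0.03, 0.03, 0.03]

# Cancellation: effects of equal size in opposite directions.
opposing = [0.08, -0.08, 0.05, -0.05]

print(round(bundle_effect(small_same_direction), 2))
print(round(bundle_effect(opposing), 2))
```

In the first case the bundle-level effect is five times any single item's effect; in the second, items that individually show DIF combine to a bundle-level effect of zero.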

CONCLUSION
- The results of this study indicate that there were only three items with Level-C DIF in the AKU-EB's SSC May 2011 English, Mathematics and Physics examinations.
- DBF is controversial and has limited significance from a practitioner's perspective. Further, to date, no guidelines exist for interpreting the effect size measure for DBF; research is therefore needed to identify and evaluate effect size guidelines for interpreting differential bundle functioning.
- Taken together, the results suggest that the small amount of DIF found does not undermine the validity of the interpretation of examinees' test scores on the SSC examination, and that test development practices are fair for both males and females.