Published by Iris McDaniel. Modified over 9 years ago.
1
Using statistics in small-scale language education research
Jean Turner
© Taylor & Francis 2014
2
Reliability is the extent to which a test, assessment, or data collection instrument or procedure measures consistently.
3
There are five different types of test/data collection tool reliability:
◦ test-retest reliability
◦ equivalent form reliability*
◦ intrarater reliability
◦ interrater reliability
◦ internal consistency
* a.k.a. parallel form reliability or alternate form reliability
4
The instruments and procedures used to collect data must measure consistently; that is, they should be reliable.
When data collection tools aren’t reliable:
◦ the systematicity that characterizes research is damaged
◦ the internal validity (the soundness) of the study is threatened
5
Test-retest reliability
Definition: the extent to which a data collection tool measures consistently across different data collection events
Contributing factors:
◦ (1) Clear instructions for administrators, research participants, and raters
◦ (2) Tasks/questions in participants’ first language, or in the target language at an appropriate level of difficulty
◦ (3) Unambiguously phrased tasks/questions
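Consistency across two data collection events is often summarized with a correlation between the two sets of scores. The slides do not name a statistic, so the Pearson correlation used here is an assumption, and the function name `pearson_r` and the scores are made up for illustration. A minimal sketch:

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for the same five participants on two occasions
first_administration = [12, 15, 9, 20, 17]
second_administration = [13, 14, 10, 19, 18]
print(round(pearson_r(first_administration, second_administration), 3))
```

A value near 1 suggests participants were ranked similarly on both occasions; what counts as acceptable depends on the purpose of the tool.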
6
Equivalent form reliability
Definition: the extent to which different forms of the same tool measure in a similar way (the extent to which the forms are interchangeable)
Contributing factors:
◦ (1) The development of equivalent forms from specifications that describe tool content
◦ (2) Trial of tools before data collection to ensure equivalence
7
Intrarater reliability
Definition: the extent to which an individual scorer is consistent in how the criteria are applied
Contributing factors:
◦ (1) Unambiguous criteria for scoring
◦ (2) Rater’s thorough training (and practice!) in applying the criteria
8
Interrater reliability
Definition: the extent to which multiple scorers are consistent in how the criteria are applied
Contributing factors:
◦ (1) Unambiguous criteria for scoring
◦ (2) Raters’ thorough training (and practice!) in applying the criteria
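When two raters assign categorical ratings (for example, band levels), agreement beyond chance is commonly summarized with Cohen's kappa. The slides do not prescribe this statistic; it is offered here as one assumed option, and `cohens_kappa` and the ratings are hypothetical. A minimal sketch for exactly two raters:

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical ratings of the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters agree exactly
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use one category")
    return (observed - expected) / (1 - expected)


# Hypothetical band ratings given by two raters to the same six essays
rater_1 = ["B1", "B2", "B1", "A2", "B2", "B1"]
rater_2 = ["B1", "B2", "B2", "A2", "B2", "B1"]
print(round(cohens_kappa(rater_1, rater_2), 3))
```

The same function could be applied to one rater's scores on two occasions as a rough check of intrarater consistency.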
9
Internal consistency
Definition: the extent to which individual test items are congruent with other items on the data collection tool
High internal consistency is a necessity for norm-referenced tools.
Contributing factors:
◦ (1) Careful item writing, guided by item specifications
◦ (2) Field test and item analysis
◦ (3) Construction of tests with reference to item performance
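Internal consistency is frequently estimated with Cronbach's alpha, which compares the sum of the item variances with the variance of the total scores. The slides do not specify a formula, so treating alpha as the estimate is an assumption; `cronbach_alpha` and the response data are hypothetical. A minimal sketch:

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha; `scores` has one row of item scores per participant."""
    n_items = len(scores[0])
    if len(scores) < 2 or n_items < 2:
        raise ValueError("need at least 2 participants and 2 items")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every participant needs a score on every item")
    # Sample variance of each item across participants
    item_variances = [
        variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Sample variance of participants' total scores
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 participants x 4 items, each scored 0-3
responses = [
    [3, 2, 3, 3],
    [1, 1, 2, 1],
    [2, 2, 2, 3],
    [0, 1, 1, 0],
    [3, 3, 2, 3],
]
print(round(cronbach_alpha(responses), 3))
```

Item analysis after a field test could reuse the same function with one item removed at a time to see how each item affects the estimate.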