Lies, Damned Lies & Statistical Analysis for Language Testing


1 Lies, Damned Lies & Statistical Analysis for Language Testing
Stephen Walker UECA Assessment Symposium, Saturday, 14 July 2018

2 Hands up if you know what these mean?
- Dichotomous vs polytomous items
- P values
- Point-biserial correlations
- CTT vs IRT

3 Presentation Aims
1. Why do you need to do statistical analysis?
2. How do you actually do it?
3. What information do you get?
4. How do you use the results?

4 Why do we need to do statistical analysis?

5 An Art and a Science “…good test developers and creative item writers are probably born rather than trained.” Charles Alderson

6 Statistical Analysis is…
an absolutely essential, but often the most misunderstood step in developing a defensible test…

7 Numbers …
- reveal how well items & tests work, or don’t work, and lead to an understanding of why
- provide feedback to test designers & item writers; as teachers we know the value of feedback to learning
- are to applied statistics what language is to applied linguistics
- help to make the results of tests meaningful and useful to test users

8 How do you actually do it?

9 Prepare the Data

10

11 Get yourself a Matrix Not that kind of Matrix

12 This kind of Matrix!
Columns: Student ID, Item 1, Item 2, Item 3, Item 4, Item 5
C D A B E F G

13 A Control File
- Contains the answers
- Tells the software what to do
- Looks something like this
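The matrix of raw responses and the answer key can be combined into scored data before any statistics are run. The following is a minimal Python sketch, not the format of any particular analysis package: the function name `score_responses`, the student IDs, the key, and the responses are all hypothetical, and a real control file will look different depending on the software used.

```python
def score_responses(responses, key):
    """Turn raw option choices into dichotomous scores (1 = correct, 0 = incorrect)."""
    scored = {}
    for student_id, answers in responses.items():
        if len(answers) != len(key):
            raise ValueError(
                f"{student_id}: expected {len(key)} answers, got {len(answers)}"
            )
        # Compare each chosen option with the keyed answer for that item.
        scored[student_id] = [1 if a == k else 0 for a, k in zip(answers, key)]
    return scored


# Hypothetical answer key and responses, for illustration only.
key = ["C", "D", "A", "B", "A"]
responses = {
    "S001": ["C", "D", "A", "B", "A"],
    "S002": ["C", "A", "A", "B", "D"],
    "S003": ["B", "D", "C", "B", "A"],
}

scored = score_responses(responses, key)
for student_id, row in scored.items():
    print(student_id, row, "total =", sum(row))
```

Each row of the result is one test taker and each column is one item, which is the shape the item statistics need.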

14 Get some software

15 What information can we get from different analyses?

16 P value
P value = item difficulty = item facility = item easiness
- the probability that examinees will get an item correct
- to calculate the P value, count the number of test takers who got the item right and divide by the total number of test takers
- the result is a proportion, like a percentage but on a 0-1 scale rather than 0-100
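The calculation described above can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name `p_values` and the scored matrix are hypothetical, with rows as test takers, columns as items, and 1 meaning a correct answer.

```python
def p_values(scored):
    """Proportion of test takers who answered each item correctly (0-1 scale)."""
    if not scored:
        raise ValueError("scored matrix is empty")
    n_takers = len(scored)
    n_items = len(scored[0])
    # For each item: number correct divided by number of test takers.
    return [sum(row[i] for row in scored) / n_takers for i in range(n_items)]


# Hypothetical scored matrix: 5 test takers (rows) x 3 items (columns).
scored = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 1],
    [0, 0, 1],
]

print(p_values(scored))  # [0.6, 0.2, 1.0]
```

Three of five test takers got the first item right (3 ÷ 5 = 0.6), one got the second right (1 ÷ 5 = 0.2), and everyone got the third right (5 ÷ 5 = 1.0).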

17 P value
Test takers: Ann, Tony, Jim, Ruth, Hong
Item | Item 1 | Item 2 | Item 3 | Item 4 | Item 5 | Item 6
P value | 0.0 | 0.2 | 0.4 | 0.6 | 0.8 | 1.0
- Everyone got Item 6 right. It’s very easy for these test takers. Its P value is 1.0 (5 ÷ 5 = 1).
- Only 1 person got Item 2 right. It’s difficult for these test takers. Its P value is 0.2 (1 ÷ 5 = 0.2).
- This approach to calculating difficulty is sample-dependent. If we had a different sample of people, the statistics could be quite different.

18 P value interpretation
Range | Possible Interpretation | Notes
… | Too difficult | Your item might be mis-keyed or have other issues, so it needs to be checked
… | Difficult to moderately difficult | Test takers are finding items in this range challenging
… | Moderately easy | Most test takers are getting these items correct
… | Too easy | These items are too easy to provide much information on examinees, and can be detrimental to reliability

19 Rpbis - point-biserial correlation
- Measures how well items differentiate between high- and low-ability test takers
- Ranges from -1.0 to 1.0
- Items which discriminate well have higher Rpbis values, but rarely above 0.5
- A negative Rpbis means high-ability test takers answer incorrectly while those of low ability answer correctly. This usually indicates that the specified answer is actually wrong!
- An Rpbis near 0 means no to little discrimination (noise)
- Rpbis and P value are considered together
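The point-biserial correlation is the ordinary Pearson correlation between a 0/1 item score and the total test score. The Python sketch below shows one way to compute it by hand. The function name `point_biserial` and the data are hypothetical. It also assumes the total score includes the item itself, which inflates the value on very short tests; some software correlates against the total without the item instead.

```python
from math import sqrt


def point_biserial(item_scores, total_scores):
    """Pearson correlation between a 0/1 item score and the total test score.

    Returns None when the correlation is undefined, e.g. when every test
    taker got the item right (no variance in the item scores).
    """
    n = len(item_scores)
    if n == 0 or n != len(total_scores):
        raise ValueError("inputs must be non-empty and equal in length")
    mean_x = sum(item_scores) / n
    mean_y = sum(total_scores) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(item_scores, total_scores))
    var_x = sum((x - mean_x) ** 2 for x in item_scores)
    var_y = sum((y - mean_y) ** 2 for y in total_scores)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# Hypothetical scored matrix: rows are test takers, columns are items.
scored = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 0, 1],
    [0, 0, 1],
]
totals = [sum(row) for row in scored]
rpbis = [point_biserial([row[i] for row in scored], totals) for i in range(3)]
for i, r in enumerate(rpbis, start=1):
    print(f"Item {i}: Rpbis = {r:.2f}")
```

In this made-up data the third item is answered correctly mostly by the lower-scoring test takers, so its Rpbis comes out negative, the pattern that usually prompts a check of the answer key.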

20 Rpbis value interpretation
- 0.20+ = Good items - higher-ability test takers tend to get these items correct
- … = Maybe OK item - review it
- … = Problems suggested - revise or replace
- <0.0 = Problematic items - replace
NB: if the correct answer has a negative Rpbis and a distractor has a positive Rpbis, the distractor is probably the correct answer
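The NB above can be checked mechanically by computing an Rpbis for every answer option, not just the keyed one. The Python sketch below is an assumption-laden illustration: the helper names `pearson`, `option_rpbis` and `possible_miskey`, the option choices, and the total scores are all hypothetical. It assumes the total scores come from the rest of the test, so they are not affected by the suspect key.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns None when either variable has no variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def option_rpbis(choices, total_scores):
    """Rpbis per option: correlate 'chose this option' (1/0) with total score."""
    return {
        opt: pearson([1 if c == opt else 0 for c in choices], total_scores)
        for opt in sorted(set(choices))
    }


def possible_miskey(choices, total_scores, keyed):
    """List distractors with positive Rpbis when the keyed option is negative."""
    stats = option_rpbis(choices, total_scores)
    key_r = stats.get(keyed)
    if key_r is None or key_r >= 0:
        return []
    return [
        opt for opt, r in stats.items()
        if opt != keyed and r is not None and r > 0
    ]


# Hypothetical item: the key says "A", but stronger test takers chose "B".
choices = ["B", "B", "B", "A", "A", "C"]
totals = [9, 8, 7, 3, 4, 2]

print(option_rpbis(choices, totals))
print(possible_miskey(choices, totals, keyed="A"))  # ['B']
```

A flagged option is only a prompt for review. The item writers still need to look at the item and decide whether the key is wrong or the item is ambiguous.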

21 Using the results within the test development cycle?

22 UQ-ICTE Reading & Listening Test Development Cycle

23

24 Pre-test Review Meeting
- The item writer team should be involved
- Bring the common wrong answers, the item analysis results, the pilot test, the script (for listening tests), and the answer key, and meet somewhere to discuss them

25 Don’t forget to show your examples here Stephen!

26 Decisions made in Pre-test review
- Which items should be cut because they are too easy or too hard for these learners?
- Which items should be re-written?
- Which distractors are not tempting, or too tempting because they are actually correct (double keys)?
- Are test takers lost?

27 I hope this presentation encourages you to:
- use statistics as a tool to help you understand your own tests
- produce better tests with evidence to support any claims made
- explain to others why piloting & statistical analysis are an essential part of reliable test development
- do the analysis yourself along with those involved in the test development cycle

28 Thank you Stephen Walker, Academic Manager: Assessment
E: T: (07)

