+ Old Reliable Testing accurately for thousands of years
+ Reliability Validity def.? Validity is how closely a test measures what it says it will measure. Reliability def.? Reliability is the consistency of your measurement, or the degree to which an instrument measures the same way each time it is used under the same conditions with the same subjects.
+ Why aren’t you more reliable? Imagine your students are taking a long multiple-choice test. What are some common threats to reliability? Guessing; ambiguous questions; physical/emotional state of test takers; environmental distractions; mechanical problems, e.g. computer glitches or not filling in bubbles properly.
+ Why aren’t you more reliable? Imagine your students are taking an essay test. What are some common threats to reliability? Assessor or inter-rater reliability; object/person-related reliability; instrument-related reliability.
+ Do we measure reliability??? No, we “estimate” it. How can we estimate reliability? Test-retest; split-half. What are the pros and cons of each?
+ Measuring reliability??? What do you get after you perform these tests? A reliability coefficient, like .91 or .78. What would be the range of reasonable reliability coefficients for a grammar test or vocabulary test? .9 to .99. How about for a speaking test? .7 to .79.
+ Investigate Go to the class wiki located at Then click “Materials to be used in class” and follow the instructions under reliability of SAT.
+ Standard Error and True Score How high can you jump off of both feet with no steps? Average of several attempts = closer to “True Score”. Can we really know what someone’s “True Score” truly is? Impossible, but with lots of attempts, you get close.
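+ One common way to put numbers on this idea, assuming the classical test theory model (observed score = true score + random error), is the standard error of measurement: SEM = SD × sqrt(1 − reliability). The sketch below also simulates jump attempts to show the average drifting toward the true score. The function names, the 50 cm "true" jump, and the 5 cm error spread are made-up values for illustration only.

```python
import random
from math import sqrt
from statistics import mean


def standard_error_of_measurement(sd, reliability):
    """SEM under classical test theory: SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)


def simulate_attempts(true_score, error_sd, n, seed=0):
    """Simulate n observed scores as true score plus random error."""
    rng = random.Random(seed)
    return [true_score + rng.gauss(0, error_sd) for _ in range(n)]


# Hypothetical jumper: true ability 50 cm, attempts vary by about 5 cm.
for n in (1, 5, 50, 500):
    attempts = simulate_attempts(true_score=50.0, error_sd=5.0, n=n)
    print(n, "attempts, average:", round(mean(attempts), 1))

# A test with score SD of 10 points and reliability .91 (illustrative numbers).
print(round(standard_error_of_measurement(10, 0.91), 2))
```

The more attempts that go into the average, the smaller the leftover error, but it never reaches exactly zero.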
+ Top Several Ways to Enhance Reliability Take enough samples; exclude items that don’t distinguish between students; don’t give too much freedom; write unambiguous items/questions; provide clear instructions; ensure legibility, clarity of design, white space, etc.; train scorers; provide a detailed scoring key/rubric/instructions; identify students/participants by number, not name; use multiple raters for subjective items. Which of these work for teachers in a college or public school setting?
+ Testing Tasks You are giving the following writing assignment to your students in Spanish: “Write a paper in Spanish giving your thoughts on a current event.” With a partner, rewrite the instructions to improve reliability.
+ Testing Tasks Discuss with a partner… (1) You are testing students’ oral proficiency as a final exam in your Spanish 3 high school class. What can you do to improve the reliability of scoring? (2) You want to know which multiple-choice questions on your test distinguish between stronger and weaker students. How might you analyze test results to determine this? Think of the main steps and then explain to a partner.
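+ For the multiple-choice question, one common approach (not necessarily the one intended for this task) is an upper/lower group discrimination index: rank students by total score, take the top and bottom groups, and for each item subtract the lower group's proportion correct from the upper group's. The function name `item_discrimination`, the one-third group size, and the six-student data set are assumptions made up for this sketch.

```python
def item_discrimination(responses, fraction=1 / 3):
    """Upper/lower group discrimination index for each item.

    responses holds one list of 0/1 item scores per student.
    Returns one value per item, from -1.0 to 1.0; values near 0 mean
    the item does not separate stronger from weaker students.
    """
    # Step 1: rank students from highest to lowest total score.
    ranked = sorted(responses, key=sum, reverse=True)
    # Step 2: pick equally sized upper and lower groups.
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    # Step 3: compare how many in each group answered each item correctly.
    n_items = len(responses[0])
    return [
        (sum(s[i] for s in upper) - sum(s[i] for s in lower)) / k
        for i in range(n_items)
    ]


# Made-up results: six students, four items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]
print(item_discrimination(responses))  # [0.5, 1.0, 1.0, 0.0]
```

In this invented data set the fourth item scores 0.0, so it would be a candidate for revision or removal.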