What Do Teachers Need to Know About Assessment? Professor Norland ED342
Assessment versus Test? Formal or Informal?
When most people hear the word TEST, they automatically think of traditional paper-and-pencil tests. The term ASSESSMENT has increasingly been used by educators as a broader descriptor for the many kinds of educational measuring that teachers do.
Two different types of assessment (see the last paragraph on page 10):
•FORMATIVE ASSESSMENT: evidence of what needs to improve during the PROCESS of learning; helps FORM or modify instruction.
•SUMMATIVE ASSESSMENT: purpose is to SUMMARIZE results of success or failure; produces a FINAL GRADE that is not modifiable.
•FORMATIVE ASSESSMENT, FORMAL: determines a student's status regarding knowledge, skills, or attitudes. Graded. Examples:
 •Paper-and-pencil tests
 •Unit / chapter tests
 •Written essays, papers
 •Quizzes, post-tests
•FORMATIVE ASSESSMENT, INFORMAL: conclusions, observations, or judgments teachers make about students' understanding and progress, used to guide decision-making. Not always "graded," but "noted." Examples:
 •Student journals
 •Visual observations, teacher notes on students
 •Checklists
 •Pre-tests; surveys
 •Oral questions; discussions
•FORMATIVE ASSESSMENT, FORMAL AND INFORMAL:
 Formal (graded): paper-and-pencil tests; unit / chapter tests; written essays, papers; quizzes, post-tests.
 Informal (not always "graded"): student journals; observations, notes; checklists, worksheets; pre-tests, surveys; oral questions, discussion.
Two different types of assessment (see the last paragraph on page 10):
•FORMATIVE ASSESSMENT: helps "FORM" instruction; NOT always graded (informal); helps teachers make better decisions about instruction and make adjustments BEFORE and DURING instruction.
•SUMMATIVE ASSESSMENT: a "SUMMARY" of accomplishments; graded to show level of achievement at the END of instruction.
CRITERION-REFERENCED TESTS
•Measure the mastery of specific objectives.
•Test scores are compared to a standard of performance, not to other students.
•Tell the teacher how well students are performing on, or mastering, specific goals.
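To make the idea concrete, here is a minimal sketch of a criterion-referenced interpretation, where each score is judged against a fixed standard rather than against classmates. The student names and the 80-point mastery cut score are hypothetical, not from the text:

```python
# Criterion-referenced interpretation: compare each score to a fixed
# standard of performance, not to other students' scores.
CUT_SCORE = 80  # hypothetical mastery threshold for the objective

scores = {"Ana": 85, "Ben": 72, "Cho": 91}  # hypothetical students

for name, score in scores.items():
    status = "mastered" if score >= CUT_SCORE else "not yet mastered"
    print(f"{name}: {score} -> {status}")
```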
NORM-REFERENCED TESTS
•Cover a wide range of general objectives.
•Measure the overall achievement of students.
•Usually standardized achievement tests that rank students and compare them to other groups.
NORM GROUPS: the comparison group may be
•a class or school,
•a school district, or
•a national sample.
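By contrast, a norm-referenced interpretation ranks a student against the comparison (norm) group. A minimal sketch with made-up scores, using one common convention for percentile rank (the percent of the norm group scoring at or below the student; exact definitions vary):

```python
def percentile_rank(score, norm_group):
    """Percent of the norm group scoring at or below the given score
    (one common convention; definitions vary)."""
    at_or_below = sum(1 for s in norm_group if s <= score)
    return 100 * at_or_below / len(norm_group)

# Hypothetical norm group of ten scores (a class, district, or national sample)
norms = [55, 60, 64, 70, 73, 77, 80, 82, 88, 95]
print(percentile_rank(82, norms))  # 80.0 -- the student tied or outscored 8 of 10
```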
RELIABILITY
•Consistency of scores over time: scores are fairly consistent when a student takes the same test on two different occasions. If you take the same test a week later and your score is about the same, it's a reliable test.
•If the two test scores are very different, it is reasonable to conclude that the difference is due to test error and that the scores do not really reflect what the test taker knows or can do.
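One common way to quantify this consistency is to correlate the two sets of scores; a correlation near 1.0 indicates a reliable test. A minimal sketch with made-up scores (statistics.correlation requires Python 3.10+):

```python
from statistics import correlation  # Pearson r; Python 3.10+

# Hypothetical scores for five students who took the same test twice, a week apart
first_attempt = [85, 72, 90, 65, 78]
second_attempt = [83, 75, 88, 67, 80]

# Test-retest reliability is often estimated as the correlation between
# the two administrations; a value near 1.0 means the scores are consistent.
r = correlation(first_attempt, second_attempt)
print(f"test-retest reliability estimate: r = {r:.2f}")
```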
RELIABILITY
•Scores are consistent across all parts of a test. For example, if half of the test is multiple-choice questions and the other half is short responses or essays, students should do about as well on both halves if the test is reliable. Likewise, scores should remain fairly consistent between the odd- and even-numbered test items.
•Essay-type questions, however, require human judgment and are therefore more difficult to score: if two people read the same essay, each will likely give it a slightly different score. A reliable test should therefore have very clear directions and criteria for essay responses, for example, "give 3 examples and definitions of each."
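Here is a sketch of the odd/even (split-half) check described above, with made-up half-test totals. The Spearman-Brown correction, a standard psychometric formula, adjusts the half-test correlation upward because each half is only half as long as the full test:

```python
from statistics import correlation  # Pearson r; Python 3.10+

# Hypothetical per-student totals on the odd-numbered vs. even-numbered items
odd_half = [12, 9, 15, 7, 11]
even_half = [11, 10, 14, 8, 12]

r_half = correlation(odd_half, even_half)  # consistency of the two halves
r_full = (2 * r_half) / (1 + r_half)       # Spearman-Brown correction
print(f"split-half r = {r_half:.2f}, full-test estimate = {r_full:.2f}")
```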
RELIABILITY
•The more items on an educational assessment, the more reliable it will tend to be.
•(Example: a 100-item mathematics test will give you a more reliable fix on a student's ability in math than a 20-item test.)
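The Spearman-Brown prophecy formula, a standard psychometric result, predicts this effect of test length on reliability. A sketch using the 20-item versus 100-item example above, with an assumed starting reliability of 0.60 (that figure is illustrative, not from the text):

```python
def spearman_brown(r, k):
    """Predicted reliability when a test is lengthened by a factor of k,
    assuming the added items are comparable to the originals."""
    return (k * r) / (1 + (k - 1) * r)

# A 100-item test is 5 times the length of a 20-item test; if the short
# test's reliability is 0.60, the long version is predicted to reach ~0.88.
print(f"{spearman_brown(0.60, 5):.2f}")  # 0.88
```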
VALIDITY
•The test measures knowledge of the content it was designed to measure.
•(Example) A math achievement test would lack content validity if good scores depended more on the ability to read higher-level text than on knowledge of solving and understanding math problems...
•...or if there were many questions on one type of math computation and only one question on algebra, rather than an even spread across all types.
VALIDITY
•Judgments and decisions made in grading/scoring test items can be supported by evidence.
•Validity centers on the accuracy of the inferences teachers make about their students from evidence gathered formally or informally.
ABSENCE OF BIAS
•An unbiased assessment contains no items that offend or unfairly penalize a group of students because of the students' gender, ethnicity, socioeconomic status, religion, or other group characteristics.
EXAMPLES OF 'OFFENSIVE CONTENT' IN TEST ITEMS:
•Only males are shown in high-paying, prestigious positions (attorneys, doctors), while women are portrayed in low-paying, unimpressive positions (housewives, clerks).
•Word problems are based on competitive sports and use mostly boys' names, suggesting that girls are less skilled.
EXAMPLES OF 'OFFENSIVE CONTENT' IN TEST ITEMS:
•Problem-solving items (such as those dealing with attending operas and symphonies) may advantage children from upper-class families over lower-socioeconomic students who may lack these experiences.
EXAMPLE OF 'GENDER BIAS' (content geared mostly toward females):
In The Sisterhood of the Traveling Pants, the story revolves around:
•Tibby
•Carmen
•Lena
•Bridget
•All of the above
EXAMPLES OF 'GENDER BIAS' (content geared mostly toward males):
•Test items that refer to "he" or use male names in word problems about doctors, lawyers, politicians, scientists, etc., instead of ever using female names in these roles.
•Test items should include equal numbers of references to males and females in varying roles.
EXAMPLE OF 'REGIONAL BIAS' (relies on specific cultural expressions):
Where are you most likely to find Sundrop?
•In Wisconsin
•In Florida
•In a movie theater
•None of the above
EXAMPLES OF 'LANGUAGE BIAS'
Language bias in tests occurs:
•when second-language learners (ELL, ESL) are penalized because of their limited knowledge of the English language.
•For example, they may be unable to read the questions accurately or understand English vocabulary or expressions, which prevents them from demonstrating their skill in the content being tested.