Summative Assessment: Rubrics and Tests
Effective Teaching and Learning
English Study Program, FKIP – UNSRI
April
Outcomes
Apply a systematic process for creating a test blueprint
Identify attributes of effective test questions
Explain the advantages and disadvantages of different types of test questions
Assess the quality of tests and test items
Create samples of effective questions
What type of assessment?
Procedural knowledge
Declarative knowledge
Test Writing Process
Planning the test
Writing test items
Selecting test items
Formatting the test
Assessing the test
Revising the test
Using the test
After the test
Planning the Test
Content blueprint
–Learning outcomes
–Weight
Length
–Types of items
–Number of items
Test Blueprint
Learning outcomes | Weight
Apply a systematic process for creating a test blueprint |
Identify attributes of effective test questions |
Explain the advantages and disadvantages of different types of test questions |
Assess the quality of tests and test items |
Create samples of effective questions |
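For illustration only (the deck leaves the weight column blank), a completed blueprint might distribute the weights like this:
Apply a systematic process for creating a test blueprint | 20%
Identify attributes of effective test questions | 20%
Explain the advantages and disadvantages of different types of test questions | 20%
Assess the quality of tests and test items | 25%
Create samples of effective questions | 15%
Total | 100%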
Types of Items
Recognition
–True-false
–Multiple-choice
–Multiple-answer
–Matching
–Ordering
Recall
–Short answer
–Completion
–Essay
Average Response Time
Item type | Average response time
True-false | 30 seconds
Multiple-choice and multiple-answer | 60–90 seconds
Matching and ordering | 30 seconds per response
Short answer | 120 seconds
Completion | 60 seconds
Essay | 10–30 minutes
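These averages can be used to estimate total testing time. A quick worked example, assuming a hypothetical item mix (the counts are illustrative, not from the deck):
–20 true-false items × 30 seconds = 10 minutes
–20 multiple-choice items × 75 seconds (midpoint of 60–90) = 25 minutes
–2 short-answer items × 120 seconds = 4 minutes
–1 essay × 20 minutes = 20 minutes
Estimated total ≈ 59 minutes, roughly one class period.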
Writing Test Items
Simple and direct wording
Avoid jargon
Avoid trivia items
Match items to learning outcomes
Each item has an agreed-upon correct answer
Write more questions than you will need
Multiple-choice Items
Stem
–Direct question
–Incomplete statement
Responses
–One correct answer
–Multiple distracters
Stem
Clearly worded
One idea
Avoid the use of negatives
Enough information to answer the question
Direct questions preferred
Blanks at the end of the stem
Move words repeated in every response into the stem
Responses
3–5 per item
Avoid “all of the above” and “none of the above”
Grammatically consistent with the stem
Similar length and structure
Avoid absolute words
Listed in a logical order
Mutually exclusive and not overlapping
Distracters
Plausible
Common misconceptions
Logical misinterpretations
Clichés
Partial answers
Technical terms or jargon
Example
What is the minimum number of responses for a multiple-choice item?
A) 2
B) 3
C) 4
D) 5
Application Example
What problem exists in the following multiple-choice stem?
________ is the most common type of test item.
A) Absolute words should be avoided in the stem.
B) The stem contains more than one idea or concept.
C) Not enough information is presented to answer the question.
D) The fill-in-the-blank should come at the end.
Analysis and Evaluation Example: Stem
An instructor was asked the following question: “Briefly list and explain how you develop a test.” As an answer, the instructor wrote the following:
I begin by going through the chapter and writing questions based on the material in the text and my lectures. Then I decide how many questions I want and select the best questions from the list that I have developed. I format the test and add instructions. After a few days, I review the questions and make any revisions that need to be done and remove any jargon or wording lifted directly from the text. Then I use the test in class. Based on how the class does, I may make changes for the next time I teach the class.
Analysis and Evaluation Example: Responses
Based on the process described in “Effective Classroom Tests,” how would you judge this answer?
A) EXCELLENT (all steps in the right order, with correct, clear, and complete descriptions)
B) GOOD (all stages correct and in the right order, but the descriptions are not as complete as they should be)
C) MEDIOCRE (one or two stages are missing, OR the stages are in the wrong order, OR the explanations are not complete, OR the explanations are irrelevant)
D) UNACCEPTABLE (one or more stages are missing AND the explanations are not complete AND/OR are irrelevant)
Poor Question 1
Good multiple choice items:
A) are easy to write
B) can only test memorized content
C) are better than essay items
D) there is no such thing
E) can test a wide range of content
Poor Question 2
Which of the following characteristics is not true of completion test items but is an important distinguishing attribute of matching tests, multiple-choice questions, and true-false items?
A) They are objective test items.
B) They require knowledge recognition but not production.
C) Much more difficult to construct.
Poor Question 3
Which of the following statements is FALSE?
A) Misfeasance is the improperly doing of an illegal act.
B) Nonfeasance is improperly doing a legal act.
C) Nonfeasance is the failure to do an act that one must do legally.
D) Misfeasance is the failure to PROPERLY do an act that one has a duty to perform.
E) None of the above.
Poor Question 4
__________ is/are the best method to determine if students have learned something.
A) Comprehensive Exam
B) Homework Assignments
C) Pop Quizzes
D) Research Paper
Selecting Test Items
Outcome weight × number of questions of each type = number of questions of that type for the outcome
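A brief worked example, using hypothetical numbers (not taken from the deck): if an outcome is weighted at 20% and the test will contain 40 multiple-choice items and 5 short-answer items, that outcome should receive 0.20 × 40 = 8 multiple-choice items and 0.20 × 5 = 1 short-answer item.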
Formatting the Test
Group items by type
Sort items by increasing difficulty
Add instructions
Review layout and pagination
Write answer key
Assessing the Test
Self
–2–3 days after writing the test
–Clarity
–Clues in items to other items
Non-expert
–Clarity
–Contextual clues
Peer
–Content
–Weighting to outcomes
–Answer key
Students
–Clarity
–Content
Test-Taking Procedures
Use of notes or other materials
Time limits
After the Test
Item analysis
Areas for review
Test revisions
Activity
Write two test questions on any topic
One question should be an example of a good test item
One question should be an example of a poorly written test item
Share
Share your two questions with a partner
Can they determine which is good and which is bad?
Can they explain what makes one poorly written?
As a team, how can you fix the poorly written questions?
Intermission
Outcomes
Determine what characteristics are important in evaluating student work
Evaluate rubrics, analytic scales, and other evaluation methods
Describe the contents of a good rubric
Identify rubrics already in use at Baker College
Begin work on a rubric for a class
What is a Rubric?
A rubric is a scoring tool or guide that lists the specific criteria and the ranges for multiple levels of achievement for a piece of work or performance.
A rubric consists of a set of well-defined factors and criteria describing the dimensions of an assignment to be assessed or evaluated.
Parts of a Rubric
Scale (columns)
Dimensions (rows)
Criteria descriptions (cells)
Benefits of Rubrics
Communicates the instructor’s expectations
Streamlines the process for feedback to the student
Facilitates equitable grading
Standardizes assessment across different instructors
Uses for Rubrics
Papers
Presentations
Projects
Essays
Homework
Case studies
Participation/class discussion
Portfolios
Types of Rubrics
Analytic
–Page 11
Holistic
–Pages 12 and 13
Checklist
–Page 14
Scoring guide
–Page 15
Creating a Rubric
Identify components/outcomes of the assignment
Determine a scale
Add criteria
Assign points
Set component weights (optional)
Assess the rubric
Test and revise
Activity
Split into groups of 3–4
Determine team roles
Select an assignment that needs a rubric
–Can be a specific assignment, such as a research paper for ENG 102
–Can be of a more general nature, such as a class presentation
Step 1: Identify Components
List 5 major objectives/outcomes of the assignment
Write these items as the row headers of the sheet provided
Step 2: Determine a Scale
Aim for 3–5 levels
Can use an odd or even number of levels
Use the headings on the next slide for ideas
Write these as column headings on the sheet provided
Potential Column Headings
Outstanding | Accomplished | Proficient | Developing | Beginning
Accomplished | Average | Developing | Beginning
Excellent | Good | Needs Improvement | Unsatisfactory
Exceptional | Acceptable | Marginal | Unacceptable
Expert | Practitioner | Apprentice | Novice
Professional | Adequate | Needs Work | You’re Fired
Exceeds Expectation | On Target | Beginning
Exemplary | Competent | Developing
High | Medium | Low
Outstanding | Proficient | Shows Potential
Step 3: Add Criteria
Create descriptions for each level of performance for each dimension of the rubric
–Bullet points
–Paragraphs
Write these criteria in the cells of the sheet provided
Step 4: Assign Points
Assign points for each level of performance
Can use either of the following:
–Discrete values (5, 4, 3, 2, 1)
–Ranges (10–9) for each level
Indicate the point values on the sheet provided
–Normally placed with the scale
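For illustration only (these values are hypothetical, not from the handout): with a four-level scale, ranges might be Exemplary 10–9, Competent 8–7, Developing 6–4, and Unacceptable 3–0; a paper judged Competent on a dimension would then earn 7 or 8 points for that dimension.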
Step 5: Set Component Weights
Allows for different levels of importance
–Spelling/grammar: more or less important than content?
Determine if weights are necessary for your rubric
Assign weights accordingly
–See the example on page of the handout
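A hypothetical weighting example (the values are illustrative): if content is weighted 3× and spelling/grammar 1×, a paper scoring 4 of 5 on content and 3 of 5 on mechanics earns (4 × 3) + (3 × 1) = 15 out of a possible (5 × 3) + (5 × 1) = 20 points, or 75%.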
Step 6: Assess the Rubric
Assess your rubric using a metarubric
–See the examples on page of your handout
Conduct a peer review
–Ask one or two other instructors to review your rubric
Provide time for student review
–Allow students to ask questions and make comments
Group Project
Trade rubrics with another group
Assess the rubric using a metarubric from page
Discussion
What metarubric(s) did you use? Why?
What did you see on the other team’s rubric that you liked?
Could you understand the assignment easily by reviewing the rubric?
Step 7: Implement and Refine
Refine your rubric based on feedback from other instructors and students
Make notes each time you use the rubric for continuous improvement purposes
Share with others
Rubric Reliability & Validity
Reliability
–“the likelihood that a given measurement procedure will yield the same description of a given phenomena if the measurement is repeated” (Babbie, 1986)
Validity
–“the extent to which a specific measurement provides data that relate to commonly accepted meanings of a particular concept” (Babbie, 1986)
Reliability Requires
The same instructor should reach the same conclusion each time
Different instructors should reach similar conclusions (interrater reliability)
Interrater Reliability
Independently score a set of student samples
Review responses for consistent and inconsistent scoring
Discuss and reconcile inconsistencies
Repeat with a second group of samples
(Maki, 2004, p. 127)
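As a simple illustration of the consistency check (percent agreement is an assumption here, not a term used in the slides): if two instructors independently score 10 samples and assign the same rubric level to 8 of them, agreement is 8/10 = 80%, and the 2 discrepant samples are the ones to discuss and reconcile before scoring the next set.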
Validity Requires
Reliability
Comprehensiveness
–Cover all outcomes
Economy
–Space is usually limited, so be selective about what goes into the rubric
Balanced scoring and weighting
End of Slides
Any questions?