1
Assessment Instruments and Rubrics Workshop Series
Part 3: Test Your Rubric and Introduction to Data Reporting
March 30, 2016
Dr. Summer DeProw
2
Workshop Agenda
- Follow-up from Part 2's workshop: how did it go?
- Test your rubric
  - Best, formal practices
  - Modifications
  - Inter-rater reliability
- Intro to data reporting (if time allows)
3
Follow-up
- How have you progressed?
- Did you ask a colleague to review your rubric?
- Have you used it to score student work?
- Would you like to test it with your workshop colleagues now?
4
Best Practices for Implementing a New Rubric
First, let's set the context:
- This rubric is for program-level student-learning assessment.
- The rubric is used to report student learning on subjectively scored work such as an essay, paper, oral presentation, capstone project, clinical-type examination, or original research study.
- The summarized results will be used to determine the level of student learning for one or more program-level outcomes.
- The summarized results will eventually be reported to one or more accreditors.
5
Best Practices for Implementing a New Rubric
- All student work should be scored by at least two professors.
- Scoring should be blind (student identity hidden from the raters).
- Professors should be trained to use the rubric consistently.
- Inter-rater reliability (IRR) statistics should be calculated; IRR is particularly important when a rubric is first deployed.
- Low IRR statistics should guide future modifications of the rubric and training sessions for professors.
- IRR statistics should be reported alongside the summarized student-learning data.
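As a rough illustration of blind, double scoring, the sketch below anonymizes each artifact and assigns it to two raters. It is not part of the workshop materials; the file names and professor names are hypothetical placeholders.

```python
import random

# Hypothetical artifacts and raters; in practice these would come from the course or program.
artifacts = ["essay_001.pdf", "essay_002.pdf", "essay_003.pdf"]
raters = ["Prof A", "Prof B", "Prof C", "Prof D"]

assignments = {}
for i, artifact in enumerate(artifacts, start=1):
    anon_id = f"ARTIFACT-{i:03d}"        # raters see only this ID, never the student's name
    pair = random.sample(raters, k=2)    # two different professors score every artifact
    assignments[anon_id] = {"file": artifact, "raters": pair}

for anon_id, info in assignments.items():
    print(anon_id, info["raters"])
```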
6
Acceptable Modifications to the Best Practices
Scenario 1: You have hundreds, perhaps thousands, of student artifacts for at least two professors to score.
- Take a random sample.
- Use many professors so each scores an acceptable number without extreme fatigue.
Scenario 2: You have professors who say, "I scored this work as part of my class, why do I have to score it again?"
- Take the scores from the class.
- Ask a second professor or group of professors to provide the second reads.
Scenario 3: It would be a serious burden to blind score the students' work.
- Do the best you can and report that student identity was known during scoring.
Scenario 4: You have no idea what IRR is or how to calculate it.
- Call the Assessment Office.
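For Scenario 1, a minimal sketch of drawing a random sample from a large pool and spreading the scoring load across several professors; the pool size, sample size, and rater count are assumptions for illustration, and each sampled artifact would still be routed to two raters as in the earlier sketch.

```python
import random

# Hypothetical pool of artifact IDs; in practice these would come from the LMS or registrar.
population = [f"artifact_{n}" for n in range(1, 1201)]   # e.g., 1,200 artifacts
sample = random.sample(population, k=120)                 # score a 10% random sample

raters = [f"Prof {c}" for c in "ABCDEFGH"]                # eight volunteer professors
per_rater = {r: [] for r in raters}
for i, artifact in enumerate(sample):
    per_rater[raters[i % len(raters)]].append(artifact)   # round-robin: 15 artifacts each

for rater, load in per_rater.items():
    print(rater, len(load))
```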
7
Inter-Rater Reliability
Why is IRR important?
- There is "nearly universal" agreement that reliability is an important property in educational measurement (Bresciani et al., 2009).
- We are scoring student work that cannot be scored objectively but instead requires a rating of degree.
- The goal is to minimize variation between raters by formalizing the criteria for scoring student work.
- It provides an opportunity for training.
- Less formally, it should spark robust conversation among professors about what students should know and be able to do, and about the data professors need to make student-learning action plans.
8
Inter-Rater Reliability Definition
- IRR has a wide range of applications across many situations, so there is no single definition (Gwet, 2014).
- The basic concept is to measure the intentional agreement between raters while removing the happenstance agreement due to rater guessing.
- IRR is usually reported on a scale from 0 to 1, but some measures can yield negative values between -1 and 0.
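To make the chance-correction idea concrete, Cohen's kappa (one of the consensus statistics named on the next slide) compares the observed agreement p_o with the agreement p_e expected by guessing alone. The numbers below are an invented example, not workshop data.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}
\qquad \text{e.g. } p_o = 0.80,\; p_e = 0.50
\;\Rightarrow\;
\kappa = \frac{0.80 - 0.50}{1 - 0.50} = 0.60
```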
9
Inter-Rater Reliability
Three common approaches:
1. Consensus estimates
   - Use when exact agreement is needed to provide defensible evidence for the existence of the construct. (Definition of construct: an idea or theory containing various conceptual elements, typically one considered to be subjective and not based on empirical evidence.)
   - Statistics: Cohen's Kappa, Krippendorff's Alpha, percent agreement.
2. Consistency estimates
   - Use when it is not necessary for raters to share a common interpretation of the rating scale, as long as each judge is consistent in classifying the phenomenon according to his or her own definition of the scale.
   - Differences are OK as long as the differences are predictable.
   - Statistics: correlation coefficients, Cronbach's Alpha, intra-class correlation.
3. Measurement estimates
   - Use when all the information available from all raters is important, you have many raters, and it is impossible for all raters to rate all items.
   - Statistics: factor analysis, many-facets Rasch model.
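A minimal sketch of the first two approaches for two raters, assuming NumPy and scikit-learn are available; the ratings are invented for illustration only.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

# Hypothetical ratings from two professors on the same ten artifacts (rubric levels 1-4).
rater_1 = np.array([4, 3, 3, 2, 4, 1, 3, 2, 4, 3])
rater_2 = np.array([4, 3, 2, 2, 4, 1, 3, 3, 4, 3])

# Consensus estimates: exact agreement, then chance-corrected agreement.
percent_agreement = np.mean(rater_1 == rater_2)
kappa = cohen_kappa_score(rater_1, rater_2)

# Consistency estimate: do the raters rank the work the same way,
# even if one rater is systematically harsher than the other?
correlation = np.corrcoef(rater_1, rater_2)[0, 1]

print(f"Percent agreement: {percent_agreement:.2f}")
print(f"Cohen's kappa:     {kappa:.2f}")
print(f"Correlation:       {correlation:.2f}")
```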
10
Inter-Rater Reliability
Which approach is best for student-learning assessment? In order of priority:
1. Consensus
2. Consistency
3. Measurement
11
Intro to Data Reporting Let’s start with reporting data about student learning in which a rubric was used (since we are on the topic, right?) Later we will discuss data reporting from objective measures of student learning, such as a multiple-choice tests
12
Intro to Data Reporting Let’s assume we used the VALUE Rubric for written communication
13
Written Communication Rubric
Levels: Capstone (4), Milestone (3), Milestone (2), Benchmark (1)

Context of and Purpose for Writing (includes considerations of audience, purpose, and the circumstances surrounding the writing task(s))
- Capstone 4: Demonstrates a thorough understanding of context, audience, and purpose that is responsive to the assigned task(s) and focuses all elements of the work.
- Milestone 3: Demonstrates adequate consideration of context, audience, and purpose and a clear focus on the assigned task(s) (e.g., the task aligns with audience, purpose, and context).
- Milestone 2: Demonstrates awareness of context, audience, purpose, and to the assigned task(s) (e.g., begins to show awareness of audience's perceptions and assumptions).
- Benchmark 1: Demonstrates minimal attention to context, audience, purpose, and to the assigned task(s) (e.g., expectation of instructor or self as audience).

Content Development
- Capstone 4: Uses appropriate, relevant, and compelling content to illustrate mastery of the subject, conveying the writer's understanding, and shaping the whole work.
- Milestone 3: Uses appropriate, relevant, and compelling content to explore ideas within the context of the discipline and shape the whole work.
- Milestone 2: Uses appropriate and relevant content to develop and explore ideas through most of the work.
- Benchmark 1: Uses appropriate and relevant content to develop simple ideas in some parts of the work.

Genre and Disciplinary Conventions (formal and informal rules inherent in the expectations for writing in particular forms and/or academic fields; please see glossary)
- Capstone 4: Demonstrates detailed attention to and successful execution of a wide range of conventions particular to a specific discipline and/or writing task(s), including organization, content, presentation, formatting, and stylistic choices.
- Milestone 3: Demonstrates consistent use of important conventions particular to a specific discipline and/or writing task(s), including organization, content, presentation, and stylistic choices.
- Milestone 2: Follows expectations appropriate to a specific discipline and/or writing task(s) for basic organization, content, and presentation.
- Benchmark 1: Attempts to use a consistent system for basic organization and presentation.

Sources and Evidence
- Capstone 4: Demonstrates skillful use of high-quality, credible, relevant sources to develop ideas that are appropriate for the discipline and genre of the writing.
- Milestone 3: Demonstrates consistent use of credible, relevant sources to support ideas that are situated within the discipline and genre of the writing.
- Milestone 2: Demonstrates an attempt to use credible and/or relevant sources to support ideas that are appropriate for the discipline and genre of the writing.
- Benchmark 1: Demonstrates an attempt to use sources to support ideas in the writing.

Control of Syntax and Mechanics
- Capstone 4: Uses graceful language that skillfully communicates meaning to readers with clarity and fluency, and is virtually error-free.
- Milestone 3: Uses straightforward language that generally conveys meaning to readers. The language in the portfolio has few errors.
- Milestone 2: Uses language that generally conveys meaning to readers with clarity, although writing may include some errors.
- Benchmark 1: Uses language that sometimes impedes meaning because of errors in usage.
14
Intro to Data Reporting
Report summarized (aggregated) data from the rubric rows and the total.
Here is an example from the Multi-State Collaborative to Advance Learning Outcomes Assessment, sponsored by the State Higher Education Executive Officers Association (SHEEO).
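A minimal sketch of that kind of aggregation, assuming pandas is available; the column names and scores below are hypothetical placeholders, not the Multi-State Collaborative's actual data.

```python
import pandas as pd

# Hypothetical scores: one row per student artifact, one column per rubric dimension (levels 1-4).
scores = pd.DataFrame({
    "context_and_purpose":  [4, 3, 2, 3, 4],
    "content_development":  [3, 3, 2, 2, 4],
    "genre_conventions":    [3, 2, 2, 3, 4],
    "sources_and_evidence": [4, 3, 1, 2, 3],
    "syntax_and_mechanics": [3, 3, 2, 3, 4],
})

# Summarize each rubric row (dimension): mean level and share of artifacts at level 3 or above.
summary = pd.DataFrame({
    "mean_level": scores.mean().round(2),
    "pct_at_3_or_above": ((scores >= 3).mean() * 100).round(1),
})
print(summary)

# Total score per artifact, then summarized across the sample.
totals = scores.sum(axis=1)
print("Mean total:", totals.mean(), "out of a possible", 4 * scores.shape[1])
```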
15
[Example results from the Multi-State Collaborative were shown here as a chart; it is not reproduced in this text version.]
Note: Each work product was scored on 5 dimensions of written communication using a common AAC&U VALUE Rubric. These results are not generalizable across participating states or the nation in any way. Please use appropriately.
16
Next Workshop: April 6, 2016
- Reporting your assessment data, continued
- Assessment Office's Report form for non-specialized accredited programs
- Assessment Office's Status Report form for specialized accredited programs
Final Workshop: April 27, 2016
- Focus will be action plans to improve student learning and/or the student-learning assessment process