Assessment Instruments and Rubrics Workshop Series Part 3: Test Your Rubric and Introduction to Data Reporting March 30, 2016 Dr. Summer DeProw

Workshop Agenda
Follow-up from Part 2's workshop: How did it go?
Test your rubric
Best, formal practices
Acceptable modifications
Inter-rater reliability (IRR)
Intro to data reporting (if time allows)

Follow-up
How have you progressed?
Did you ask a colleague to review your rubric?
Have you used it to score student work?
Would you like to test it with your workshop colleagues now?

Best Practices for Implementing a New Rubric
First, let's set the context:
This rubric is for program-level student learning assessment.
The intent is to report student learning on a subjectively scored essay, paper, oral presentation, capstone project, clinical-type examination, original research study, etc.
The summarized results will be used to determine the level of student learning for a particular program-level outcome (or outcomes).
Eventually, the summarized results will be reported to an accreditor (or accreditors).

Best Practices for Implementing a New Rubric
All student work must be scored by at least two professors.
Scoring should be blind.
Professors should be trained to use the rubric consistently.
Inter-rater reliability (IRR) statistics should be calculated; IRR is particularly important when a rubric is first deployed.
Low IRR statistics should guide any future modifications of the rubric and training sessions for professors.
IRR statistics should be reported with the summarized student-learning data.

Acceptable Modifications to the Best Practices
Scenario 1: You have hundreds, perhaps thousands, of student artifacts for at least two professors to score.
  Take a random sample.
  Use many professors so each scores an acceptable number without extreme fatigue (see the sketch below).
Scenario 2: You have professors who say, "I scored this work as part of my class; why do I have to score it again?"
  Take the scores from the class.
  Ask a second professor or group of professors to provide the second reads.
Scenario 3: It would be a serious burden to blind-score the students' work.
  Do the best you can and report that student identity was known during scoring.
Scenario 4: You have no idea what IRR is or how to calculate it.
  Call the Assessment Office.
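For Scenario 1, drawing the random sample and assigning double scoring can be automated in a few lines. The sketch below is a minimal Python illustration under assumed conditions; the artifact IDs, sample size, and professor names are hypothetical and not part of the workshop materials.

```python
import random

# Hypothetical pool of student artifacts and available raters
artifacts = [f"essay_{i:04d}" for i in range(1, 1201)]   # e.g., 1,200 capstone essays
professors = ["Prof A", "Prof B", "Prof C", "Prof D", "Prof E", "Prof F"]

random.seed(2016)                       # fixed seed so the sample can be reproduced
sample = random.sample(artifacts, 120)  # score a 10% random sample instead of all 1,200

# Assign each sampled artifact to two different professors for blind double scoring
assignments = []
for i, artifact in enumerate(sample):
    first = professors[i % len(professors)]
    second = professors[(i + 1) % len(professors)]
    assignments.append((artifact, first, second))

print(assignments[:3])  # preview the first few scoring assignments
```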

Inter-Rater Reliability
Why is IRR important?
There is "nearly universal" agreement that reliability is an important property in educational measurement (Bresciani et al., 2009).
We are scoring student work that cannot be scored objectively but requires a rating of degree.
The goal is to minimize variation between raters and to formalize the criteria for scoring student work.
It provides an opportunity for training.
Less formally, it should spark robust conversation among professors about what students should know and be able to do and/or the data professors need to make student-learning action plans.

Inter-Rater Reliability: Definition
IRR has a wide range of applications across a variety of situations, so there is no single definition (Gwet, 2014).
The basic concept is to measure the intentional agreement between raters while removing the happenstance agreement due to rater guessing.
IRR is usually reported on a scale from 0 to 1, but some measures can produce negative values between -1 and 0.

Inter-Rater Reliability: Three Common Approaches
1. Consensus estimates
   Use when exact agreement is needed to provide defensible evidence for the existence of the construct (definition of construct: an idea or theory containing various conceptual elements, typically one considered to be subjective and not based on empirical evidence).
   Statistics: Cohen's Kappa, Krippendorff's Alpha, percent agreement (see the sketch below).
2. Consistency estimates
   Use when it is not necessary for raters to share a common interpretation of the rating scale, as long as each judge is consistent in classifying the phenomenon according to his or her own definition of the scale.
   Differences are OK as long as the differences are predictable.
   Statistics: correlation coefficients, Cronbach's Alpha, intra-class correlation.
3. Measurement estimates
   Use when all the information available from all raters is important, you have many raters, and it is impossible for all raters to rate all items.
   Statistics: factor analysis, many-facets Rasch model.
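As a concrete illustration of the consensus estimates above, percent agreement and Cohen's Kappa for two raters can be computed with a short script. This is a minimal Python sketch; the professor names and scores are hypothetical, standing in for two professors' ratings of the same ten artifacts on a 0-4 rubric scale.

```python
from collections import Counter

def percent_agreement(rater1, rater2):
    """Proportion of artifacts on which the two raters gave the same score."""
    matches = sum(a == b for a, b in zip(rater1, rater2))
    return matches / len(rater1)

def cohens_kappa(rater1, rater2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater1)
    observed = percent_agreement(rater1, rater2)
    counts1, counts2 = Counter(rater1), Counter(rater2)
    chance = sum((counts1[c] / n) * (counts2[c] / n) for c in set(rater1) | set(rater2))
    return (observed - chance) / (1 - chance)

# Hypothetical scores (0-4 rubric scale) from two professors on the same ten essays
prof_a = [3, 2, 4, 3, 1, 2, 3, 4, 2, 3]
prof_b = [3, 2, 3, 3, 1, 2, 4, 4, 2, 3]

print(f"Percent agreement: {percent_agreement(prof_a, prof_b):.2f}")  # 0.80
print(f"Cohen's kappa:     {cohens_kappa(prof_a, prof_b):.2f}")       # approximately 0.71
```

Kappa is lower than raw percent agreement because it subtracts the agreement the two raters would reach by chance alone, which is why it is the more defensible consensus statistic to report.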

Inter-Rater Reliability
Which approach is best for student-learning assessment? In order of priority:
1. Consensus
2. Consistency
3. Measurement

Intro to Data Reporting Let’s start with reporting data about student learning in which a rubric was used (since we are on the topic, right?) Later we will discuss data reporting from objective measures of student learning, such as a multiple-choice tests

Intro to Data Reporting Let’s assume we used the VALUE Rubric for written communication

Written Communication VALUE Rubric
Performance levels: Capstone (4), Milestone (3), Milestone (2), Benchmark (1)

Context of and Purpose for Writing (includes considerations of audience, purpose, and the circumstances surrounding the writing task(s))
  4: Demonstrates a thorough understanding of context, audience, and purpose that is responsive to the assigned task(s) and focuses all elements of the work.
  3: Demonstrates adequate consideration of context, audience, and purpose and a clear focus on the assigned task(s) (e.g., the task aligns with audience, purpose, and context).
  2: Demonstrates awareness of context, audience, purpose, and the assigned task(s) (e.g., begins to show awareness of audience's perceptions and assumptions).
  1: Demonstrates minimal attention to context, audience, purpose, and the assigned task(s) (e.g., expectation of instructor or self as audience).

Content Development
  4: Uses appropriate, relevant, and compelling content to illustrate mastery of the subject, conveying the writer's understanding, and shaping the whole work.
  3: Uses appropriate, relevant, and compelling content to explore ideas within the context of the discipline and shape the whole work.
  2: Uses appropriate and relevant content to develop and explore ideas through most of the work.
  1: Uses appropriate and relevant content to develop simple ideas in some parts of the work.

Genre and Disciplinary Conventions (formal and informal rules inherent in the expectations for writing in particular forms and/or academic fields; please see glossary)
  4: Demonstrates detailed attention to and successful execution of a wide range of conventions particular to a specific discipline and/or writing task(s), including organization, content, presentation, formatting, and stylistic choices.
  3: Demonstrates consistent use of important conventions particular to a specific discipline and/or writing task(s), including organization, content, presentation, and stylistic choices.
  2: Follows expectations appropriate to a specific discipline and/or writing task(s) for basic organization, content, and presentation.
  1: Attempts to use a consistent system for basic organization and presentation.

Sources and Evidence
  4: Demonstrates skillful use of high-quality, credible, relevant sources to develop ideas that are appropriate for the discipline and genre of the writing.
  3: Demonstrates consistent use of credible, relevant sources to support ideas that are situated within the discipline and genre of the writing.
  2: Demonstrates an attempt to use credible and/or relevant sources to support ideas that are appropriate for the discipline and genre of the writing.
  1: Demonstrates an attempt to use sources to support ideas in the writing.

Control of Syntax and Mechanics
  4: Uses graceful language that skillfully communicates meaning to readers with clarity and fluency, and is virtually error-free.
  3: Uses straightforward language that generally conveys meaning to readers. The language in the portfolio has few errors.
  2: Uses language that generally conveys meaning to readers with clarity, although writing may include some errors.
  1: Uses language that sometimes impedes meaning because of errors in usage.

Intro to Data Reporting
Report summarized (aggregated) data from the rubric rows and the total.
Here is an example from the Multi-State Collaborative to Advance Learning Outcomes Assessment, sponsored by the State Higher Education Executive Officers Association (SHEEO).
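To make the row-by-row aggregation concrete, the sketch below is a minimal Python example; the dimension labels and scores are hypothetical illustrations, not MSC data. It reports the mean and score distribution for each rubric dimension plus the total.

```python
from collections import Counter
from statistics import mean

# Hypothetical rubric scores: one row per scored artifact, one column per dimension
dimensions = ["Context/Purpose", "Content Development", "Conventions",
              "Sources/Evidence", "Syntax/Mechanics"]
scores = [
    [3, 2, 3, 2, 3],
    [4, 3, 3, 3, 4],
    [2, 2, 1, 2, 2],
    [3, 3, 2, 2, 3],
]

# Summarize each rubric row (dimension): mean score and distribution of ratings
for col, name in enumerate(dimensions):
    values = [row[col] for row in scores]
    dist = dict(sorted(Counter(values).items()))
    print(f"{name:22s} mean={mean(values):.2f}  distribution={dist}")

# Summarize the total score per artifact (five dimensions, max 20)
totals = [sum(row) for row in scores]
print(f"{'Total (max 20)':22s} mean={mean(totals):.2f}")
```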

Note: Each work product was scored on 5 dimensions of written communication using a common AAC&U VALUE Rubric. These results are not generalizable across participating states or the nation in any way. Please use appropriately.

Next Workshop: April 6, 2016
Reporting your assessment data, continued
Assessment Office's Report form for non-specialized accredited programs
Assessment Office's Status Report form for specialized accredited programs

Final Workshop: April 27, 2016
Focus will be action plans to improve student learning and/or the student-learning assessment process