Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department.


Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department of Education June 22, 2015

Statewide writing assessment began in
Traditional paper-pencil assessment administered from
  o Grades 4, 7 and 10
  o Approximately 20,000 students per grade level
  o Hand scored
  o Grade-level rubrics for scoring
  o Modified holistic scoring on 4-point scale
  o Four genres – narrative, descriptive, expository, persuasive
  o Results not included as part of state accountability data
West Virginia Writing Assessment

Grade 4 Rubric West Virginia Writing Assessment

Grades 7 and 10 Rubric West Virginia Writing Assessment

Online Writing Assessment from , except grade 4.
Paper-pencil test in grade 4
  o Hand scored
Computer-based assessment in grades 7 and 10
  o Artificial intelligence engine scoring
Approximately 20,000 students per grade level
Grade-level rubrics for scoring
Analytic trait scoring on a 6-point scale
  o Five traits – Organization, Development, Sentence Structure, Word Choice, Mechanics
Four genres – narrative, descriptive, informative, persuasive
Results not included as part of state accountability data
Scores on each analytic trait added to obtain a Summative Score and Performance Level
WV Online Writing Assessment
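The summative score described on this slide is a sum of the five trait scores. A minimal sketch of that arithmetic follows. The trait names come from the slide; the function name, the sample scores, and the assumption that each trait is scored from 1 to 6 are illustrative only. The cut scores that map a summative score to a Performance Level are not given in the slides, so they are left out.

```python
# Trait names are taken from the slide. Everything else here (function name,
# the 1-6 range, the sample scores) is an illustrative assumption.
TRAITS = (
    "Organization",
    "Development",
    "Sentence Structure",
    "Word Choice",
    "Mechanics",
)


def summative_score(trait_scores):
    """Add the five analytic trait scores into one summative score."""
    missing = [t for t in TRAITS if t not in trait_scores]
    if missing:
        raise ValueError(f"missing trait scores: {missing}")
    for trait in TRAITS:
        # Assumption: the 6-point scale runs from 1 to 6.
        if not 1 <= trait_scores[trait] <= 6:
            raise ValueError(f"{trait} score out of range: {trait_scores[trait]}")
    return sum(trait_scores[trait] for trait in TRAITS)


sample = {
    "Organization": 4,
    "Development": 3,
    "Sentence Structure": 5,
    "Word Choice": 4,
    "Mechanics": 4,
}
print(summative_score(sample))
```

Under the 1-to-6 assumption, summative scores would fall between 5 and 30.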

2007 Grade 4 Writing Prompt Imagine that you are on a magic carpet that takes you anywhere you want to go. Tell about where you might go and what you might do. WV Online Writing Assessment

Grade 7 Rubric WV Online Writing Assessment

Grade 10 Rubric WV Online Writing Assessment

Initial Challenges
Bandwidth - Connectivity
Number of testing devices/computer labs
Computer classes in labs
Security updates
Length of testing window
Concerns about keyboarding skills, particularly younger students
Validity and reliability of artificial intelligence scoring engine
WV Online Writing Assessment

Actions
State and districts increased bandwidth.
State and districts increased the number of testing devices.
Nine-week testing window was established to address technology concerns and reduce daily testing load.
  o Window spanned from February to April.
From , fourth graders continued paper-pencil testing because of concerns about keyboarding skills.
State engaged teachers in reviewing computer scoring to help with teacher buy-in.
WV Online Writing Assessment

New Online Writing Assessment Field Test
Expanded to grades 3-11
  o Hand scored
Approximately 20,000 students per grade level
Grade-level rubrics for scoring
Analytic trait scoring on a 6-point scale
  o Five traits – Organization, Development, Sentence Structure, Word Choice/Grammar Usage, Mechanics
Four genres – narrative, descriptive, informative and persuasive
  o Only narrative and descriptive at grade 3
Passages added to prompts
Results not included as part of state accountability data
WV Online Writing Assessment

2008 Field Test
New prompts with passages
136 prompts field tested – 2 genres at grade 3, 4 genres at grades 4-11, 4 prompts per genre
2 operational prompts selected per genre
New grade-specific, 6-point analytic writing rubrics
All student essays were hand scored
State staff and selected teachers participated in range finding
Hand-scored essays used to train new AI scoring engine
WV Online Writing Assessment
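The total of 136 field-tested prompts can be checked with a short calculation. This is a sketch that assumes two genres at grade 3 and four genres at each of grades 4 through 11, which is the grade range that reproduces the stated total and matches the field test's expansion to grades 3-11. The variable names are illustrative.

```python
# Assumption: grade 3 has 2 genres, grades 4 through 11 have 4 genres each.
GENRES_PER_GRADE = {3: 2, **{grade: 4 for grade in range(4, 12)}}
PROMPTS_PER_GENRE = 4  # stated on the slide

total_prompts = sum(
    genres * PROMPTS_PER_GENRE for genres in GENRES_PER_GRADE.values()
)
print(total_prompts)
```

Grade 3 contributes 8 prompts and the eight grades from 4 to 11 contribute 128, which gives 136.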

Sample Grade 3 Descriptive Writing Prompt WV Online Writing Assessment

This is where you will begin typing your essay. At the end of the paragraph, hit the enter key at least once to skip a line between paragraphs. Do not hit the tab key to indent your paragraph. It will not work.

Grade 7 Rubric WV Online Writing Assessment

Grade 3 Student Survey
85 percent of grade 3 students indicated they preferred writing their essays on the computer rather than using traditional paper-pencil.
WV Online Writing Assessment

WESTEST 2 Online Writing Assessment –
Grades 3-11
  o Artificial intelligence engine scoring
Approximately 20,000 students per grade level
Grade-level rubrics for scoring
Analytic trait scoring on a 6-point scale
  o Five traits – Organization, Development, Sentence Structure, Word Choice/Grammar Usage, Mechanics
WV Online Writing Assessment

WESTEST 2 Online Writing Assessment –
Four genres – narrative, descriptive, informative and persuasive
  o Only narrative and descriptive at grade 3
Passages/prompts
Results not included as part of state accountability data
Online formative assessment practice program available for schools to use
WV Online Writing Assessment

Later Challenges
Bandwidth/Connectivity
  o Continued in some districts and schools but improved overall
Number of testing devices/computer labs
  o Continued in some districts and schools but improved overall
Computer classes in labs
  o Continued to be an issue but improved overall
Browser updates
  o Test platform only allowed the use of Internet Explorer
  o Microsoft auto updates sometimes created problems
Accuracy and reliability of AI scoring in practice program
  o Created lack of confidence in summative scoring engine
WV Online Writing Assessment

Formative Assessment Practice Program –
Writing Roadmap – shelf product
  o Shelf prompts
  o Shelf rubric
  o AI scoring
West Virginia Writes – customization of Writing Roadmap for West Virginia
  o WV passages and prompts (field tested)
  o WV writing rubrics
  o Student responses from field test used to train AI engine
WV Online Writing Assessment

AI Scoring Challenges
Teacher buy-in and understanding of AI scoring
Field testing sufficient number of prompts
  o WV lost some prompts during psychometric analysis, resulting in the need to repeat prompts in alternate years
Rubric development for use in AI scoring
Initial hand scoring
Range finding
Training sets
Sufficient number of student responses to train engine
  o Particularly finding sufficient number of student responses scored in the high range
WV Online Writing Assessment

Scoring Reliability
Validation Papers/Iterations
Second Reads
Comparability Studies
Artificial Intelligence Scoring

Importance of Comparability
Engine to Professional Hand Scorers
Engine to West Virginia Teachers
Artificial Intelligence Scoring
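The slides do not say which statistics the comparability studies reported. Exact agreement and adjacent agreement (scores within one point of each other) are common ways to compare engine scores with human scores, so the sketch below uses them as an assumption. The function name and the sample scores are made up for illustration and are not West Virginia data.

```python
def agreement_rates(engine_scores, human_scores):
    """Return (exact, adjacent) agreement rates for two paired score lists.

    Exact: the two scores are identical.
    Adjacent: the two scores differ by at most one point (includes exact).
    """
    if len(engine_scores) != len(human_scores):
        raise ValueError("score lists must have the same length")
    if not engine_scores:
        raise ValueError("score lists must not be empty")
    pairs = list(zip(engine_scores, human_scores))
    exact = sum(e == h for e, h in pairs) / len(pairs)
    adjacent = sum(abs(e - h) <= 1 for e, h in pairs) / len(pairs)
    return exact, adjacent


# Hypothetical trait scores for five essays on the 6-point scale.
engine = [4, 3, 5, 2, 6]
human = [4, 4, 3, 2, 5]
exact_rate, adjacent_rate = agreement_rates(engine, human)
print(exact_rate, adjacent_rate)
```

The same function could be run once with professional hand scorers and once with West Virginia teachers as the human side of the comparison.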

Vendor Validation WV Online Writing Assessment

WV Comparability Studies WV Online Writing Assessment

Benefits of Teacher Participation
Professional development in using rubrics for hand scoring of student essays
Improvement of instructional practices
Teacher buy-in of artificial intelligence scoring
WV Online Writing Assessment

Considerations
Involve teachers in prompt and rubric development
Pilot testing and field testing important
Sufficient number of prompts should be included in field test depending on sample size
Include teachers in range finding
Sufficient number of essays at each score point necessary to train engine, particularly for highest score point
Artificial Intelligence Scoring
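One way to act on the score-point consideration above is to count the hand-scored essays at each score point before training. This is a minimal sketch. The function name, the 1-to-6 scale, and the minimum of two essays per point are hypothetical; the slides do not state how many essays per score point the engine needed.

```python
from collections import Counter


def underrepresented_score_points(scores, minimum_per_point, scale=range(1, 7)):
    """Return {score_point: count} for points with fewer essays than the minimum.

    `scale` defaults to 1-6, which is an assumption about the 6-point rubric.
    """
    counts = Counter(scores)
    return {
        point: counts[point]
        for point in scale
        if counts[point] < minimum_per_point
    }


# Hypothetical hand scores for a small training set.
hand_scores = [1, 2, 2, 3, 3, 3, 4, 4, 5]
gaps = underrepresented_score_points(hand_scores, minimum_per_point=2)
print(gaps)
```

In this made-up set the top score point has no essays at all, which mirrors the shortage of high-range responses described earlier in the presentation.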

Considerations
Quality of training sets important
Engine must be calibrated to the scoring rubric(s)
Engine training is key
Vendor validation and read-behind studies
State comparability studies with state teachers
Ongoing engine training to account for potential drift
Provide practice program for teachers and students
Professional development for state teachers
Artificial Intelligence Scoring

Scoring Strengths and Weaknesses

Human Scoring               | Engine Scoring
Scoring accuracy dependent on training
Get tired, hungry, bored    | Doesn’t get tired, hungry, bored
Individual Scorer Bias      | No Bias
Easier to train – quicker   | More difficult to train – time-consuming
Can make inferences         | Has difficulty with inferences
Slow, expensive scoring     | Quick, less expensive scoring

Artificial Intelligence Scoring