
1 Artificial Intelligence Scoring of Student Essays: West Virginia’s Experience Vaughn G. Rhudy, Ed.D., NBCT Office of Assessment West Virginia Department of Education June 22, 2015

2 1984-2004 Statewide writing assessment began in 1984. Traditional paper-pencil assessment administered from 1984-2004. o Grades 4, 7 and 10 o Approximately 20,000 students per grade level o Hand scored o Grade-level rubrics for scoring o Modified holistic scoring on 4-point scale o Four genres – narrative, descriptive, expository, persuasive o Results not included as part of state accountability data West Virginia Writing Assessment

3 1984-2004 Grade 4 Rubric West Virginia Writing Assessment

4 1984-2004 Grades 7 and 10 Rubric West Virginia Writing Assessment

5 2005-2007 Online Writing Assessment from 2005-2007, except grade 4. Paper-pencil test in grade 4 o Hand scored Computer-based assessment in grades 7 and 10 o Artificial intelligence engine scoring Approximately 20,000 students per grade level Grade-level rubrics for scoring Analytic trait scoring on a 6-point scale o Five traits – Organization, Development, Sentence Structure, Word Choice, Mechanics Four genres – narrative, descriptive, informative, persuasive Results not included as part of state accountability data Scores on each analytic trait added to obtain a Summative Score and Performance Level WV Online Writing Assessment
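The summative score described on this slide is the sum of the five analytic trait scores. A minimal sketch of that arithmetic follows, assuming each trait is scored from 1 to 6 (the slide says only "6-point scale"). The function name and the range check are illustrative, not part of the West Virginia system, and the cut scores that map a summative score to a Performance Level are not given in the presentation, so they are left out.

```python
# Trait names as listed on the 2005-2007 slide.
TRAITS = (
    "Organization",
    "Development",
    "Sentence Structure",
    "Word Choice",
    "Mechanics",
)


def summative_score(trait_scores):
    """Add the five analytic trait scores into one summative score.

    Assumes each trait is scored 1-6 (an assumption; the slide only says
    "6-point scale"), so the result falls between 5 and 30.
    """
    missing = [t for t in TRAITS if t not in trait_scores]
    if missing:
        raise ValueError(f"missing trait scores: {missing}")
    for trait in TRAITS:
        if not 1 <= trait_scores[trait] <= 6:
            raise ValueError(f"{trait} score out of assumed 1-6 range")
    return sum(trait_scores[t] for t in TRAITS)


example = {
    "Organization": 4,
    "Development": 3,
    "Sentence Structure": 5,
    "Word Choice": 4,
    "Mechanics": 4,
}
total = summative_score(example)
```

Under the 1-to-6 assumption, the lowest possible summative score is 5 and the highest is 30.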

6 2007 Grade 4 Writing Prompt Imagine that you are on a magic carpet that takes you anywhere you want to go. Tell about where you might go and what you might do. WV Online Writing Assessment

7 2005-2007 Grade 7 Rubric WV Online Writing Assessment

8 2005-2007 Grade 10 Rubric WV Online Writing Assessment

9 Initial Challenges Bandwidth - Connectivity Number of testing devices/computer labs Computer classes in labs Security updates Length of testing window Concerns about keyboarding skills, particularly younger students Validity and reliability of artificial intelligence scoring engine WV Online Writing Assessment

10 Actions State and districts increased bandwidth. State and districts increased the number of testing devices. Nine-week testing window was established to address technology concerns and reduce daily testing load. Window spanned from February to April. From 2005-2007, fourth graders continued paper-pencil testing because of concerns about keyboarding skills. State engaged teachers in reviewing computer scoring to help with teacher buy-in. WV Online Writing Assessment

11 New Online Writing Assessment Field Test - 2008 Expanded to grades 3-11 o Hand scored Approximately 20,000 students per grade level Grade-level rubrics for scoring Analytic trait scoring on a 6-point scale o Five traits – Organization, Development, Sentence Structure, Word Choice/Grammar Usage, Mechanics Four genres – narrative, descriptive, informative and persuasive o Only narrative and descriptive at grade 3 Passages added to prompts Results not included as part of state accountability data WV Online Writing Assessment

12 2008 Field Test New prompts with passages 136 prompts field tested – 2 genres at grade 3, 4 genres at grades 4-11, 4 prompts per genre 2 operational prompts selected per genre New grade-specific, 6-point analytic writing rubrics All student essays were hand scored State staff and selected teachers participated in range-finding Hand scored essays used to train new AI scoring engine WV Online Writing Assessment

13 2009-2014 Sample Grade 3 Descriptive Writing Prompt WV Online Writing Assessment

14 This is where you will begin typing your essay. At the end of the paragraph, hit the enter key at least once to skip a line between paragraphs. Do not hit the tab key to indent your paragraph. It will not work.

15 2009-2014 Grade 7 Rubric WV Online Writing Assessment

16 Grade 3 Student Survey 85 percent of grade 3 students indicated they preferred writing their essays on the computer to using traditional paper-pencil. WV Online Writing Assessment

17 WESTEST 2 Online Writing Assessment – 2009-2014 Grades 3-11 o Artificial intelligence engine scoring Approximately 20,000 students per grade level Grade-level rubrics for scoring Analytic trait scoring on a 6-point scale o Five traits – Organization, Development, Sentence Structure, Word Choice/Grammar Usage, Mechanics WV Online Writing Assessment

18 WESTEST 2 Online Writing Assessment – 2009-2014 Four genres – narrative, descriptive, informative and persuasive o Only narrative and descriptive at grade 3 Passages/prompts Results not included as part of state accountability data Online formative assessment practice program available for schools to use WV Online Writing Assessment

19 Later Challenges Bandwidth/Connectivity o Continued in some districts and schools but improved overall Number of testing devices/computer labs o Continued in some districts and schools but improved overall Computer classes in labs o Continued to be an issue but improved overall Browser updates o Test platform only allowed the use of Internet Explorer o Microsoft auto updates sometimes created problems Accuracy and reliability of AI scoring in practice program o Created lack of confidence in summative scoring engine WV Online Writing Assessment

20 Formative Assessment Practice Program – 2009-2014 Writing Roadmap – shelf product o Shelf prompts o Shelf rubric o AI scoring West Virginia Writes – customization of Writing Roadmap for West Virginia o WV passages and prompts (field tested) o WV writing rubrics o Student responses from field test used to train AI engine WV Online Writing Assessment

21 AI Scoring Challenges Teacher buy-in and understanding of AI scoring Field testing sufficient number of prompts o WV lost some prompts during psychometric analysis resulting in the need to repeat prompts in alternate years Rubric development for use in AI scoring Initial hand scoring Range finding Training sets Sufficient number of student responses to train engine o Particularly finding sufficient number of student responses scored in the high range WV Online Writing Assessment

22 Scoring Reliability Validation Papers/Iterations Second Reads Comparability Studies Artificial Intelligence Scoring

23 Importance of Comparability Engine to Professional Hand Scorers Engine to West Virginia Teachers Artificial Intelligence Scoring

24 Vendor Validation WV Online Writing Assessment

25 WV Comparability Studies WV Online Writing Assessment
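The comparability studies compare engine scores with human scores on the same essays. The presentation transcript does not say which statistics were reported, so the sketch below assumes two measures commonly used for this kind of comparison: the exact agreement rate and the exact-or-adjacent agreement rate (scores within one point). The function name and the sample scores are hypothetical.

```python
def agreement_rates(engine_scores, human_scores):
    """Return (exact, exact_or_adjacent) agreement rates for paired scores.

    exact: share of essays where both scores are identical.
    exact_or_adjacent: share of essays where the scores differ by at most 1.
    """
    if len(engine_scores) != len(human_scores) or not engine_scores:
        raise ValueError("need two non-empty score lists of equal length")
    pairs = list(zip(engine_scores, human_scores))
    n = len(pairs)
    exact = sum(e == h for e, h in pairs) / n
    exact_or_adjacent = sum(abs(e - h) <= 1 for e, h in pairs) / n
    return exact, exact_or_adjacent


# Hypothetical trait scores for six essays, not data from the studies.
engine = [4, 3, 5, 2, 4, 6]
human = [4, 4, 5, 2, 2, 5]
exact, near = agreement_rates(engine, human)
```

The same function can be run once against professional hand scorers and once against West Virginia teachers, which are the two comparisons named on the preceding slide.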

26 Benefits of Teacher Participation Professional development in using rubrics for hand scoring of student essays Improvement of instructional practices Teacher buy-in of artificial intelligence scoring WV Online Writing Assessment

27 Considerations Involve teachers in prompt and rubric development Pilot testing and field testing important Sufficient number of prompts should be included in field test depending on sample size Include teachers in range finding Sufficient number of essays at each score point necessary to train engine, particularly for highest score point Artificial Intelligence Scoring
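The point about needing enough essays at each score point can be checked before engine training with a simple count. This is a sketch only: the minimum number of essays per score point is a hypothetical parameter (the presentation gives no figure), the 1-to-6 scale is assumed from the "6-point scale" wording, and the sample scores are invented.

```python
from collections import Counter


def underfilled_score_points(scores, min_per_point, scale=range(1, 7)):
    """Return {score_point: count} for score points with too few essays.

    min_per_point is a hypothetical threshold chosen by the user; the
    default scale of 1-6 is an assumption about the 6-point rubric.
    """
    counts = Counter(scores)
    return {
        point: counts.get(point, 0)
        for point in scale
        if counts.get(point, 0) < min_per_point
    }


# Invented hand scores for one trait in a small training set.
hand_scores = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6]
gaps = underfilled_score_points(hand_scores, min_per_point=2)
```

In this invented set the lowest and highest score points fall short, which mirrors the difficulty the presentation reports in finding enough responses in the high range.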

28 Considerations Quality of training sets important Engine must be calibrated to the scoring rubric(s) Engine training is key Vendor validation and read-behind studies State comparability studies with state teachers Ongoing engine training to account for potential drift Provide practice program for teachers and students Professional development for state teachers Artificial Intelligence Scoring

29 Scoring Strengths and Weaknesses
Human Scoring | Engine Scoring
Scoring accuracy dependent on training
Get tired, hungry, bored | Doesn’t get tired, hungry, bored
Individual Scorer Bias | No Bias
Easier to train – quicker | More difficult to train – time-consuming
Can make inferences | Has difficulty with inferences
Slow, expensive scoring | Quick, less expensive scoring
Artificial Intelligence Scoring

