Published by Joanna Hudson. Modified over 9 years ago.
1
Automatic Essay Scoring is Here and Now Online
Welcome to CIT S234
Gary Greer, University of Houston Downtown & Michelle Overstreet, The College Board
Tuesday, Oct 25, 2005 (9:15 AM - 10:15 AM)
Coral Tower, Lobby Level
2
AES, AI, ACCUPLACER/WritePlacer
When essays are scored by human experts, the scoring characteristics can be mapped by Artificial Intelligence (AI) and used in Automatic Essay Scoring (AES).
AI is used to identify essay features and internalize them into scoring models (algorithms).
The algorithms are verified in simulation and subsequently on live essays.
AES then uses the algorithms to score an essay.
3
Automatic Essay Scoring
AI maps salient characteristics of about 300 freshman essays into a linear model for each score (for example, 6s, 8s, 10s, etc.).
AES is carried out by mathematically matching live essays to these predetermined linear models to predict a score.
AES algorithms determine whether an essay's characteristics mathematically match the semantic space previously specified by human graders.
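The matching step described above can be pictured with a small sketch. Everything in it is an assumption made for illustration: the three feature names, the weights, and the biases are hypothetical toy values, not the models WritePlacer actually uses. It only shows the general idea of keeping one linear model per score level and predicting the score whose model matches the essay best.

```python
# Hypothetical essay features, each scaled to 0..1:
# (organization, development, mechanics). Not WritePlacer's real features.

# Hypothetical linear models, one per score level: score -> (weights, bias).
MODELS = {
    6: ((-1.0, -1.0, -1.0), 1.5),
    8: ((0.0, 0.0, 0.0), 0.2),
    10: ((1.0, 1.0, 1.0), -1.5),
}


def linear_match(features, weights, bias):
    """Return the linear model's output for one essay's feature vector."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def predict_score(features):
    """Predict the score whose linear model gives the largest output."""
    return max(MODELS, key=lambda score: linear_match(features, *MODELS[score]))


print(predict_score((0.9, 0.9, 0.9)))
print(predict_score((0.5, 0.5, 0.5)))
print(predict_score((0.1, 0.1, 0.1)))
```

In a real system the weights would be estimated from the human-scored training essays rather than written by hand.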
4
AES
AES therefore emulates human raters by repeatedly evaluating characteristic essay features such as Structure, Content, Style, Syntax, Discourse, and Word Choice to produce a maximum likelihood estimate of a score, using the algorithms derived from the 300 essays scored by human experts.
AES's performance has been verified in national-level studies and now awaits performance tests by users at the local level.
We conducted our local performance study with ACCUPLACER/WritePlacer.
5
WritePlacer employs AI called IntelliMetric
WritePlacer infers and internalizes the rubric and pooled judgments of human scorers by analyzing over 300 semantic, syntactic, and discourse features in five categories:
Focus and Unity
Development and Elaboration
Organization and Structure
Sentence Structure
Mechanics and Conventions
6
ACCUPLACER/WritePlacer is Online
ACCUPLACER Online offers an AES option called WritePlacer Plus.
Delivery is online, testing time is reduced, reliability is enhanced, and scoring is immediate.
At the University of Houston Downtown we asked whether this AES gives the same results as human-expert scoring. In other words, does this AES differ from human scoring?
7
We Conducted a Local Study
Research Question 1: What is the correlation between WritePlacer scores and human-expert scores? Is it significant?
Research Question 2: Do the distributions of scores differ? (Are the medians equal?)
8
Our Hypotheses
Hypothesis 1: A significant correlation exists between WritePlacer scores and human-expert scores. (H0: correlation = 0)
Hypothesis 2: The median WritePlacer score is equal to the median human-expert score. (H0: medians are equal)
9
Our Method
Participants were 112 randomly selected college freshman examinees who wrote essays.
Their essays were scored twice: first by WritePlacer's AES and second by human experts.
The correlation between the two sets of scores was obtained.
To see whether the median scores differed, a non-parametric test statistic was obtained.
10
Table 1: Frequencies of Differences
Difference   Frequency   Percent   Who scored higher?
-2           4           4%        AES
-1           7           6%        AES
0            67          60%       Identical
1            28          25%       Human
2            6           5%        Human
Total        112         100%
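The agreement rates implied by Table 1 can be recomputed directly from the frequencies. This is a minimal sketch that assumes the row with frequency 7 is the -1 difference (its label did not survive in the slide text) and that negative differences mean AES scored higher; the variable names are ours.

```python
# Table 1 frequencies keyed by score difference
# (negative: AES scored higher; positive: human scored higher).
# The -1 label for the row with frequency 7 is an assumption.
freq = {-2: 4, -1: 7, 0: 67, 1: 28, 2: 6}

total = sum(freq.values())

# Share of essays with identical scores.
exact = freq[0] / total

# Share of essays whose two scores are identical or differ by 1 point.
within_one = sum(n for d, n in freq.items() if abs(d) <= 1) / total

# Share of essays whose two scores differ by 2 points.
off_by_two = sum(n for d, n in freq.items() if abs(d) == 2) / total

print(total)
print(f"exact agreement: {exact:.1%}")
print(f"within 1 point:  {within_one:.1%}")
print(f"off by 2 points: {off_by_two:.1%}")
```

Rounded to whole percents, these are the 60%, 91%, and 9% figures reported for the 112 essays.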
11
Table 2 – Significance Tests
Medians test (Wilcoxon)
Group    n     Mean Rank   Sum of Ranks
AES      112   119.63      13398
Human    112   105.38      11802
Wilcoxon test statistic = 11802, p > .05
Correlation: rho = .724, p < .05
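The rank sums in Table 2 can be checked for internal consistency, and the reported p > .05 can be roughly reproduced with the large-sample normal approximation for a rank-sum statistic. This sketch assumes the test pooled all 224 scores and ranked them together, which is what the two sums of ranks suggest, and it omits the correction for tied scores because the raw scores are not in the slides. A tie correction would shrink the variance and the p-value, so the result below is only approximate.

```python
import math

# Values copied from Table 2.
n_aes, n_human = 112, 112
rank_sum_aes, rank_sum_human = 13398, 11802

# Pooled ranks 1..N must sum to N(N+1)/2.
N = n_aes + n_human
expected_total = N * (N + 1) // 2
print(rank_sum_aes + rank_sum_human == expected_total)

# Mean ranks are the rank sums divided by the group sizes.
print(rank_sum_aes / n_aes, rank_sum_human / n_human)

# Normal approximation for the human group's rank sum W (no tie correction).
mean_w = n_human * (N + 1) / 2
var_w = n_aes * n_human * (N + 1) / 12
z = (rank_sum_human - mean_w) / math.sqrt(var_w)

# Two-sided p-value from the standard normal distribution.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))
print(z, p_two_sided)
```

Under these assumptions the two-sided p-value comes out near .10, which agrees with the reported p > .05.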
12
Discussion of Tables
Table 1 indicates that 91% of the paired scores were identical or agreed within 1 point and that 9% differed by 2 points. (The 10 pairs, or 9%, that differed by 2 points were split 60%-40%: 6 where Human > AES and 4 where AES > Human.)
Table 2 shows inferential statistics supporting the conclusion that AI scoring assigns the same scores to essays as human experts assign to the same essays.
13
Findings
The correlation between WritePlacer scores and human-expert scores is significant: rho = .72, p < .05.
The distributions of WritePlacer scores and human-expert scores do not differ significantly: Wilcoxon W = 11802, p > .05.
14
Conclusions
Scoring essays by AES (as implemented within ACCUPLACER/WritePlacer) is consistent with scoring essays by human experts. (Interrater reliability is significant.)
AES scoring of essays is not subject to unreliability (inconsistency) due to fatigue. AES never gets tired!
AES scoring is efficient and effective.
15
Additional Issues:
1. Measurement error is eliminated.
2. Essay supplemented by MC items = increased confidence about placement.
3. Efficiency: faculty freed for instruction.
4. GMAT/MCAT/SAT are adopting AES.
5. Deep Blue learned chess moves.
16
Thank you