James Christie Automated Marking for Essay Content ~ does it work?



Essay Definition 1 of 2 … requires a response composed by the examinee, usually in the form of one or more sentences, of a nature that no single response or pattern of responses can be listed as correct, and the accuracy and quality of which can be judged subjectively only by one skilled or informed in the subject, …

Essay Definition 2 of 2 … but even an expert cannot usually classify a response as categorically right or wrong. Rather, there are different degrees of quality or merit which can be recognized. … attributed to Stalnaker, 1951

Possible criteria for automated essay marking
Ease of creating a scoring schema
Ability to score on various mark regimes
Ease of identification of non-scoring elements
Ease of modification should scoring error(s) occur
Consistent and reproducible scoring
Acceptability of results to human markers, essayists, …
Defensibility
Accuracy and precision
Coachability avoidance
Cost

Model Essay The cat sat on the mat.

Marking Schema
Mark   Item
3      cat sat mat [max:3]

Content Data Structure
1
z a a
z f cat
z f sat
z f mat

Essay Set
ALPHA – The cat sat on the mat.
BRAVO – The cat sat on the floor.
CHARLIE – The dog lay on the floor.
MODEL – The cat sat on the mat.

Process interface
What LEVEL of Diagnostics to use [0... 3] : 0
What ESSAY SET to use : catmat
Enter SCHEMA to use : catmat
ALPHA.EXT.
BRAVO.EXT.
CHARLIE.EXT.
MODEL.EXT.
Started on Thursday, February at 15:59:05
Finished on Thursday, February at 15:59:06

Schema Report
Essay set catmat
Schema Report using... catmat
Entities :
Entity ID :
Entity Type : a f f f
Part ID's : z
Essay Name :
ALPHA.EXT:   y y y y
BRAVO.EXT:   _ y y _
CHARLIE.EXT: _ _ _ _
MODEL.EXT:   y y y y
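The y/_ grid in the Schema Report can be reproduced with a very small keyword matcher. The sketch below is not SEAR's implementation; it assumes that each "f" entity is a single word looked up in the essay, and that the leading "a" entity is an aggregate that is only satisfied when every "f" entity is found. The names `schema_row` and `format_row` are hypothetical.

```python
import re

# Items taken from the Marking Schema slide.
SCHEMA_ITEMS = ["cat", "sat", "mat"]

# Essays taken from the Essay Set slide.
ESSAYS = {
    "ALPHA": "The cat sat on the mat.",
    "BRAVO": "The cat sat on the floor.",
    "CHARLIE": "The dog lay on the floor.",
    "MODEL": "The cat sat on the mat.",
}


def tokenize(text):
    """Return the set of lower-cased words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def schema_row(text, items=SCHEMA_ITEMS):
    """Return one flag per entity: the aggregate first, then each item."""
    words = tokenize(text)
    found = [item in words for item in items]
    # Assumption: the type "a" entity means "all items present".
    return [all(found)] + found


def format_row(flags):
    """Render flags the way the report does: y for found, _ for missing."""
    return " ".join("y" if flag else "_" for flag in flags)


for name, text in ESSAYS.items():
    print(f"{name}.EXT: {format_row(schema_row(text))}")
```

Exact word matching like this is easy to coach against and fails on misspellings, which is consistent with the spelling and coachability concerns listed elsewhere in the slides.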

Content Report
Essay set catmat
Content Report using... catmat
Essay Name :   Words   Sentences   Usage[%]   Coverage[%]   Part: z[ 3]   Mark[ 3]   %[100]
ALPHA.EXT:
BRAVO.EXT:
CHARLIE.EXT:
MODEL.EXT:
Started on Thursday, February at 15:59:05
Finished on Thursday, February at 15:59:06
Marked 4 file(s): scanned 4 file(s)
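The per-essay figures in the Content Report did not survive in this transcript, so the following is only an illustration of how some of its columns could be derived. It assumes one mark per schema item found, capped at the schema maximum of 3, and it does not attempt Usage or Coverage because the slides do not define them. The function name `content_row` is hypothetical.

```python
import re

ITEMS = ["cat", "sat", "mat"]  # items from the Marking Schema slide
MAX_MARK = 3                   # [max:3] from the Marking Schema slide


def content_row(text):
    """Count words and sentences, then award one mark per item found."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    vocabulary = {w.lower() for w in words}
    # Assumption: each item is worth one mark, capped at MAX_MARK.
    mark = min(sum(item in vocabulary for item in ITEMS), MAX_MARK)
    return {
        "words": len(words),
        "sentences": len(sentences),
        "mark": mark,
        "percent": round(100 * mark / MAX_MARK),
    }


for name, text in [
    ("ALPHA", "The cat sat on the mat."),
    ("BRAVO", "The cat sat on the floor."),
    ("CHARLIE", "The dog lay on the floor."),
]:
    print(name, content_row(text))
```

The values this prints are products of the stated assumptions, not a recovery of the numbers on the original slide.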

Marking Performance
Essay Set   First v Second Markers   Human v SEAR
A           0.704** / 0.700**        0.594** / 0.596**
B           0.810** / 0.740**        0.404** / 0.376**
C           /                        * / 0.394**
D           N/A                      0.238* / 0.336**
Pearson / Spearman
Significance: ** = 0.01, * = 0.05
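Each cell in the Marking Performance table pairs a Pearson and a Spearman correlation between two sets of marks. As a minimal sketch of how such a pair is computed, the pure-Python code below uses the standard definitions (Spearman taken as Pearson on average ranks). The mark lists are invented for illustration and are not data from the study; significance testing is not shown.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson applied to average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical marks for six essays (not taken from the slides).
human_marks = [12, 15, 9, 18, 14, 11]
sear_marks = [10, 14, 11, 17, 12, 9]

print(f"Pearson:  {pearson(human_marks, sear_marks):.3f}")
print(f"Spearman: {spearman(human_marks, sear_marks):.3f}")
```

Reporting both coefficients, as the table does, shows whether agreement holds for the raw marks (Pearson) as well as for the rank order of essays (Spearman).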

Commercial packages – v – SEAR
Product | Essays | Free Text | Style | Content | Grading Scheme | Plain Text | Word Processed | English | Non English
PEG: 4 point
e-Rater: 4 point
Intelligent Assessor: marks
Intelligent Essay Assessor: 4 point
Intellimetrics: 4 point
SEAR: % & marks ?

Future work [content]
maximise use of active and passive voices
cope with spelling (and grammar) errors
increased coverage of Bloom’s Taxonomy
include non-textual feature(s)
develop
– better feedback to the essayist
– better feedback to the examiner
– plagiarism detection mechanism(s)

If manual marking equals Da Vinci’s Helical Screw,

then does SEAR equal the first powered flight?

Is the future equal to the ISS?

Is this the future [for style and content]?
English: The cat sat on the mat.
Italian: Il gatto era seduto sullo zerbino.
Greek: I gata ekatse ston kanape.
Russian: Koshka sidit na matrase.
French: Le chat s’est assis sur le tapis.
German: Die Katze sass auf dem Teppich.
Dutch: De kat zat op de mat.
Spanish: El gato se sentó en la alfombra.
etc

Future work [style]
obtain marked essays for style marking
– plain ASCII essays using a common set of metrics
– word-processed essays using a common set of metrics augmented with word-processing based metrics

James R Christie Does it work? Yes, but …