Published by Caroline Johnson. Modified over 9 years ago.
1
Introduction
2
Classification based on functional role in classroom instruction
Placement assessment: administered at the beginning of instruction
Formative assessment: monitors learning progress during instruction
Diagnostic assessment: diagnoses learning difficulties during instruction
Summative assessment: assesses achievement at the end of instruction
3
How are the results of tests and assessments interpreted?
Norm-referenced: performance is described in terms of relative position in a known group
Criterion-referenced: performance is described against specific criteria (e.g., typing 40 words per minute without error)
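The two interpretations can be contrasted with a small sketch. The function names, the sample group scores, and the choice of "percentage of the group scoring strictly below" as the norm-referenced measure are illustrative assumptions; only the 40 words-per-minute, zero-error criterion comes from the slide.

```python
def percentile_rank(score, group_scores):
    """Norm-referenced: percentage of the reference group scoring strictly below `score`."""
    below = sum(1 for s in group_scores if s < score)
    return 100.0 * below / len(group_scores)


def meets_criterion(words_per_minute, errors, min_wpm=40, max_errors=0):
    """Criterion-referenced: compare against a fixed standard, ignoring other test takers."""
    return words_per_minute >= min_wpm and errors <= max_errors


# Hypothetical typing speeds (words per minute) of a known group.
group = [22, 30, 35, 38, 41, 45, 47, 52, 55, 60]

print(percentile_rank(45, group))   # 50.0: half of the group typed slower
print(meets_criterion(45, 0))       # True: at least 40 wpm with no errors
print(meets_criterion(45, 2))       # False: fast enough, but not error-free
```

The same raw result (45 words per minute) is "average" under the norm-referenced reading and "passing" under the criterion-referenced one, which is the distinction the slide draws.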
4
Fixed-Choice / Complex-Performance Assessment
Item types: Fixed-choice, Short answer, Essay, Complex-performance
Fixed-choice
▪ Factual knowledge
▪ Low-level skills (recall)
▪ Objective assessment
▪ Highly reliable
Complex-performance
▪ Critical thinking skills
▪ May extend beyond classroom
▪ Inferential skills
▪ Subjective assessment
5
Essay-type questions
Freedom of response
▪ Free to construct, relate and present ideas in own words
Assess higher-order skills
▪ Critical thinking
Freedom comes at the cost of
▪ Reliability in scoring
▪ Time for evaluation
6
Prompt of an essay
▪ A topic around which you start jotting down ideas
▪ May be a single word, a short phrase, a complete paragraph or even a picture
Trait of an essay
▪ A characteristic of the essay on which it is evaluated
▪ Scoring rubrics depend on traits
7
Ideas or content
Organization
Voice
Word choice
Sentence fluency
8
“the process of evaluating and scoring written prose via computer programs”
NLP has helped to go beyond numeric scoring to qualitative feedback
Multi-disciplinary
AEE/AES systems
▪ PEG
▪ E-rater
▪ Intelligent Essay Assessor
▪ C-rater
9
Commercial AES by Educational Testing Service (ETS), 1999
Employed in high-stakes assessment in the Graduate Management Admission Test (GMAT)
Shown to agree with expert raters
Scoring depends on tangible markers related to writing constructs
▪ Organization and development of ideas
▪ Variation in syntactic constructs
▪ Vocabulary usage
▪ Technical correctness in terms of grammar, usage and mechanics
10
Grammatical errors
▪ Automatic grammatical error detection
▪ Article and preposition errors
Discourse structure and organization
▪ Rhetorical Structure Theory motivated features
Topic-relevant word usage
▪ Content Vector Analysis (CVA)
Style-related word usage
▪ Overly repetitious word usage
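As an illustration of the "overly repetitious word usage" feature, here is a minimal frequency-threshold sketch. The slides do not say how the system detects repetition, so the stopword list, the thresholds `max_share` and `min_count`, and the function name are all assumptions made for this example.

```python
from collections import Counter

# Small hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "was", "is", "of", "to", "in"}


def overused_words(text, max_share=0.1, min_count=3):
    """Return content words whose share of all content words exceeds `max_share`."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    content = [w for w in words if w and w not in STOPWORDS]
    counts = Counter(content)
    total = len(content)
    return sorted(
        word for word, count in counts.items()
        if count >= min_count and count / total > max_share
    )


essay = ("The movie was good. The plot was good and the acting was good. "
         "Overall a good movie.")
print(overused_words(essay))  # ['good']
```

A flagged word can then be surfaced to the writer as qualitative feedback rather than folded silently into a score.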
11
Grammatical error detection
Rule-based approach
▪ Rules are defined over the syntactic parse
Statistical approach
▪ Word n-grams and POS n-grams
Discourse analysis
Linear representation of essay sentences
Segment essay into
▪ Introductory material
▪ Thesis statement
▪ Main ideas
▪ Supporting ideas
▪ Conclusion
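The statistical approach can be sketched with word bigrams: count bigrams in well-formed reference text, then flag bigrams in a student sentence that were rarely or never seen. This is a toy illustration, not the production method; the tiny corpus, the whitespace tokenizer, and the `min_count` threshold are assumptions.

```python
from collections import Counter


def bigrams(tokens):
    """Adjacent word pairs of a token list."""
    return list(zip(tokens, tokens[1:]))


def train_bigram_counts(sentences):
    """Count word bigrams over a reference corpus of well-formed sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(bigrams(sentence.lower().split()))
    return counts


def flag_unusual_bigrams(sentence, counts, min_count=1):
    """Return bigrams seen fewer than `min_count` times in the reference corpus."""
    tokens = sentence.lower().split()
    return [bg for bg in bigrams(tokens) if counts[bg] < min_count]


# Hypothetical reference corpus; a real system would use a very large one.
corpus = [
    "the student writes an essay",
    "the teacher reads an essay",
    "a student reads the book",
]
counts = train_bigram_counts(corpus)

print(flag_unusual_bigrams("the student writes a essay", counts))
# [('writes', 'a'), ('a', 'essay')]
```

With a corpus this small, unseen but correct bigrams would also be flagged; the same idea applied to POS n-grams generalizes better because tag sequences are far less sparse than word sequences.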
13
Content Vector Analysis (CVA)
[Diagram labels: Essay to be graded; Higher quality essay, Higher grade; Lower quality essay, Lower grade]
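A minimal sketch of the vector-space idea behind CVA: represent each essay as a word-count vector and assign the grade of the set of human-graded essays it most resembles under cosine similarity. Raw term counts, whitespace tokenization, pooling of essays per grade, and the sample texts are simplifying assumptions for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words vector: word -> raw count."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def cva_grade(essay, graded_essays):
    """Return the grade whose pooled essay vector is most similar to `essay`.

    `graded_essays` maps a grade to a list of human-graded essay texts.
    """
    essay_vec = term_vector(essay)
    best_grade, best_sim = None, -1.0
    for grade, texts in graded_essays.items():
        pooled = Counter()
        for text in texts:
            pooled.update(term_vector(text))
        sim = cosine(essay_vec, pooled)
        if sim > best_sim:
            best_grade, best_sim = grade, sim
    return best_grade


# Hypothetical graded essays on one topic.
graded = {
    5: ["photosynthesis converts light energy into chemical energy in chloroplasts"],
    2: ["plants are green and plants grow in the garden"],
}
print(cva_grade("chloroplasts use light energy to make chemical energy", graded))  # 5
```

Because the comparison relies on shared vocabulary, a response that paraphrases good content in different words scores low, which is one motivation for the semantics-based methods mentioned elsewhere in the deck.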
14
Collocation detection
Tests proper usage of words that depend on other words
Collocation patterns
▪ Noun-of-noun (swarm of bees)
▪ Adjective+noun (strong tea)
▪ Noun+noun (house arrest)
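The three patterns can be matched over part-of-speech tagged text and the matches checked against a list of attested collocations. In this sketch the input is assumed to be already tagged; the tag names (`NOUN`, `ADJ`, and so on), the reference set, and the function names are hypothetical.

```python
def find_collocation_candidates(tagged):
    """Find pattern matches in a list of (word, tag) pairs.

    Returns (pattern_name, phrase) tuples in order of appearance.
    """
    candidates = []
    for i in range(len(tagged) - 1):
        (w1, t1), (w2, t2) = tagged[i], tagged[i + 1]
        if t1 == "ADJ" and t2 == "NOUN":
            candidates.append(("adjective+noun", f"{w1} {w2}"))
        elif t1 == "NOUN" and t2 == "NOUN":
            candidates.append(("noun+noun", f"{w1} {w2}"))
        elif (t1 == "NOUN" and w2.lower() == "of"
              and i + 2 < len(tagged) and tagged[i + 2][1] == "NOUN"):
            candidates.append(("noun-of-noun", f"{w1} of {tagged[i + 2][0]}"))
    return candidates


def unattested(candidates, reference):
    """Phrases matching a pattern but missing from the reference collocations."""
    return [phrase for _, phrase in candidates if phrase.lower() not in reference]


# Hypothetical reference list, e.g. harvested from a large corpus.
REFERENCE = {"swarm of bees", "strong tea", "house arrest"}

tagged_sentence = [
    ("a", "DET"), ("swarm", "NOUN"), ("of", "ADP"), ("bees", "NOUN"),
    ("drank", "VERB"), ("powerful", "ADJ"), ("tea", "NOUN"),
]
found = find_collocation_candidates(tagged_sentence)
print(found)
# [('noun-of-noun', 'swarm of bees'), ('adjective+noun', 'powerful tea')]
print(unattested(found, REFERENCE))
# ['powerful tea']
```

Here "powerful tea" matches the adjective+noun pattern but is not an attested collocation, so it would be flagged as a likely misuse of "strong tea".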
15
Model is trained with human-scored essays
Training
▪ Converting each essay to a vector of linguistic features
▪ Learning of weights through regression
Different models
Topic-specific model
▪ Training is done by drawing human-scored essays on a given topic
Generic model
▪ Topic agnostic
Hybrid model
▪ Some feature weights are trained on generic essays while others are from prompt-specific essays
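The training step, learning feature weights by regression on human-scored essays, can be sketched as ordinary linear regression fitted by batch gradient descent. The two features, their 0 to 1 scaling, the synthetic scores, and the learning-rate and epoch settings are assumptions chosen so the toy example converges; the slides do not specify the regression procedure.

```python
def fit_linear_weights(features, scores, lr=0.5, epochs=2000):
    """Fit score ~ bias + sum(weight_j * feature_j) by batch gradient descent."""
    n = len(features)
    dim = len(features[0])
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, scores):
            error = bias + sum(w * xi for w, xi in zip(weights, x)) - y
            grad_b += error
            for j in range(dim):
                grad_w[j] += error * x[j]
        # Update all parameters together using the mean gradient.
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def predict(x, weights, bias):
    """Predicted essay score for one feature vector."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


# Hypothetical feature vectors: [grammar quality, vocabulary variety], scaled 0..1.
train_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
# Synthetic human scores generated as 1 + 2*grammar + 3*vocabulary.
train_scores = [1.0, 3.0, 4.0, 6.0, 3.5]

weights, bias = fit_linear_weights(train_features, train_scores)
print([round(w, 3) for w in weights], round(bias, 3))
print(round(predict([0.8, 0.6], weights, bias), 3))
```

Under this framing, a topic-specific model fits the weights on essays from one prompt, a generic model fits them on essays pooled across prompts, and a hybrid model takes some weights from each fit.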
16
Commercial AES by Pearson Knowledge Technologies, 1998
Features
▪ Automated scoring and feedback of paragraphs
▪ Grading summary writing to improve reading comprehension
▪ Performance task scoring
▪ Short answer scoring for students
17
Essay Score
Mechanics
▪ Spelling
▪ Capitalization
▪ Punctuation
Content
▪ LSA Similarity
▪ Vector Length
Lexical Sophistication
▪ Word Maturity
▪ Word Variety
▪ Confusable Word
Style, Organization, Development
▪ Inter-sentence coherence
▪ Essay coherence
▪ Topic development
Grammar
▪ N-gram features
▪ Grammatical errors
18
Short answers are not short essays
Evaluation of essays focuses on traits like grammar, style, vocabulary, organization etc.
▪ Computational syntax and stylistics
Evaluation of short answers emphasizes content
▪ Computational semantics
Short answers are harder to evaluate
▪ Smaller amount of exploitable information
19
C-rater by ETS
Grades free-text responses ranging in length from a single word or phrase to 4-5 sentences
Supports both summative and formative assessment
Performs well on tests that solicit specific information from students
Performs poorly on open-ended tasks
20
C-rater goal
▪ Map the student response onto the model of the correct answer provided by the content expert
▪ The model is built manually, but the mapping is automatic
The difficulty
▪ The question is designed to elicit from students one or more concepts that constitute the correct answer
▪ There are several ways in which a concept can be realized in natural language
The solution
▪ Correct responses are paraphrases of the model answer
21
Tries to model human graders with the following normalizations
▪ Syntactic variation
▪ Pronoun reference
▪ Morphological variation
▪ Synonymous words
▪ Typographical and spelling errors
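A rough sketch of how some of these normalizations might be chained before matching a response against a model-answer concept. It covers only spelling, morphology and synonyms; pronoun resolution is omitted, and order-insensitive set matching is used as a crude stand-in for handling syntactic variation. The suffix list, synonym table, similarity cutoff, and function names are all hypothetical, not C-rater's actual components.

```python
import difflib
import string

# Hypothetical synonym table, keyed and valued by stemmed forms.
SYNONYMS = {"soak": "absorb"}
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Crude suffix stripping that keeps at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def normalize(text, vocabulary):
    """Lowercase, drop punctuation, correct spelling, stem, map synonyms."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    tokens = []
    for word in cleaned.split():
        # Spelling: snap to the closest model-answer word if it is similar enough.
        match = difflib.get_close_matches(word, vocabulary, n=1, cutoff=0.8)
        if match:
            word = match[0]
        word = stem(word)                 # morphological variation
        word = SYNONYMS.get(word, word)   # synonymous words
        tokens.append(word)
    return tokens


def contains_concept(response, concept, vocabulary):
    """True if every normalized concept word appears in the normalized response."""
    return set(normalize(concept, vocabulary)) <= set(normalize(response, vocabulary))


VOCAB = ["plants", "absorb", "sunlight"]
CONCEPT = "plants absorb sunlight"

print(normalize("The plant absorbes sunlihgt.", VOCAB))
# ['the', 'plant', 'absorb', 'sunlight']
print(contains_concept("Plants soak up sunlight", CONCEPT, VOCAB))  # True
print(contains_concept("Plants need water", CONCEPT, VOCAB))        # False
```

Each normalization widens the set of surface forms treated as the same concept, which is how differently worded correct answers can all be credited.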
22
Content assessment
Content Vector Analysis
▪ Vector space model
Semantics-based assessment
▪ Latent Semantic Analysis
Meaning/Concept assessment
▪ Paraphrasing and textual entailment
Organizational assessment
▪ Argument structure mining
▪ Discourse structure analysis