Ontology-Based Argument Mining and Automatic Essay Scoring Nathan Ong, Diane Litman, Alexandra Brusilovsky University of Pittsburgh First Workshop on Argumentation.

Presentation transcript:

Ontology-Based Argument Mining and Automatic Essay Scoring
Nathan Ong, Diane Litman, Alexandra Brusilovsky
University of Pittsburgh
First Workshop on Argumentation Mining (52nd ACL), June 26, 2014

ArgumentPeer Project (w/ Kevin Ashley & Chris Schunn)
Teach Writing and Argumentation with AI-Supported Diagramming and Peer Review
– Diagrammatic Argument Outlines (via LASAD)
– Argumentative/Persuasive Essays (via SWoRD)
– Peer review of both diagrams and essays (via SWoRD)
Allocate to computers and humans the tasks that each does best

Argument Mining in ArgumentPeer
Expert defines diagram ontology
– Current Study, Hypothesis, Opposes, Supports, Claim, Citation
System recognizes diagram ontology elements in associated essays
System scores essays based on recognized ontology elements

Corpus
52 first-draft essays from two undergraduate psychology courses
– Written after diagramming and peer feedback
– Average length: 5.2 paragraphs, 28.6 sentences
– Expert scores: Average = 3.03

Argument Mining I/O
[Figure: diagram elements Current Study, Claim, Citation, Hypothesis, Supports, Opposes]

Essay Processing Pipeline
1. Discourse Processing
– Tag essays with discourse connective senses
– Expansion, Contingency, Comparison, Temporal
– Tagger from UPenn
2. Argument Ontology Mining
– Tag essays with diagram ontology elements
– Rule-based algorithm
3. Ontology-Based Scoring
– Use the mined argument to score the essays
– Rule-based algorithm

Example of Argument Mining
"This is the first sentence of the example essay"
Tagged as Current Study

Ordered Rule Applications
Rule 1: Opposes
Does the sentence begin with a Comparison discourse connective? – no
Does the sentence contain any of the string prefixes from {conflict, oppose} and a four-digit number (intended as a year for a citation)? – no
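
A minimal sketch of how Rule 1 might be checked, for illustration only. The slide does not say how the two questions combine, so this assumes a "yes" to either one fires the rule. The function names, the `leading_connective_sense` parameter (standing in for the discourse tagger's output), and the word tokenization are all hypothetical.

```python
import re

# Prefixes and the year pattern come from the Rule 1 questions on the slide.
OPPOSES_PREFIXES = ("conflict", "oppose")
FOUR_DIGIT_NUMBER = re.compile(r"\b\d{4}\b")


def has_prefix(sentence, prefixes):
    """Return True if any word in the sentence starts with one of the prefixes."""
    words = re.findall(r"[A-Za-z]+", sentence.lower())
    return any(word.startswith(prefix) for word in words for prefix in prefixes)


def is_opposes(sentence, leading_connective_sense=None):
    """Apply Rule 1 (Opposes) to one sentence.

    leading_connective_sense is a hypothetical input: the sense label that a
    discourse tagger assigned to a connective at the start of the sentence,
    or None if the sentence does not begin with a connective.
    """
    # Question 1: does the sentence begin with a Comparison connective?
    starts_with_comparison = leading_connective_sense == "Comparison"
    # Question 2: an {conflict, oppose} prefix AND a four-digit number (a year).
    cites_conflict = (
        has_prefix(sentence, OPPOSES_PREFIXES)
        and FOUR_DIGIT_NUMBER.search(sentence) is not None
    )
    # Assumption: either question answered "yes" triggers the tag.
    return starts_with_comparison or cites_conflict
```

With the slide's example sentence both questions come out "no", so `is_opposes("This is the first sentence of the example essay")` is false and the next rule in the ordered list would be tried.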

Example Ontology Tag
Rule 6 (broken down, yes to all questions): Current Study
Is the sentence in the first or last paragraph?
Does the sentence contain at least one word from {study, research}?
Does the sentence not contain the words from {past, previous, prior} (first letter case-insensitive)?
Does the sentence not contain the string prefixes from {hypothes, predict}?
Does the sentence not contain a four-digit number?
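
The five Rule 6 questions can be sketched as a single conjunction. This is an illustrative reading, not the authors' code: the function name, the zero-based `paragraph_index` argument, and the tokenization are assumptions, and matching {study, research} case-insensitively is a guess, since the slide only specifies case handling for {past, previous, prior}.

```python
import re

STUDY_WORDS = {"study", "research"}
PAST_WORDS = {"past", "previous", "prior"}
HYPOTHESIS_PREFIXES = ("hypothes", "predict")
FOUR_DIGIT_NUMBER = re.compile(r"\b\d{4}\b")


def is_current_study(sentence, paragraph_index, paragraph_count):
    """Apply Rule 6 (Current Study); every question must be answered yes.

    paragraph_index is the zero-based position of the sentence's paragraph
    (a hypothetical input; the slides do not describe the data structures).
    """
    words = re.findall(r"[A-Za-z]+", sentence)
    lowered = [w.lower() for w in words]
    # "First letter case-insensitive": only the first letter is normalized.
    first_letter_lowered = [w[0].lower() + w[1:] for w in words]

    in_first_or_last = paragraph_index in (0, paragraph_count - 1)
    mentions_study = any(w in STUDY_WORDS for w in lowered)  # assumed case-insensitive
    mentions_past = any(w in PAST_WORDS for w in first_letter_lowered)
    has_hypothesis_prefix = any(w.startswith(HYPOTHESIS_PREFIXES) for w in lowered)
    has_year = FOUR_DIGIT_NUMBER.search(sentence) is not None

    return (
        in_first_or_last
        and mentions_study
        and not mentions_past
        and not has_hypothesis_prefix
        and not has_year
    )
```

The three negative conditions appear to screen out sentences that other tags would claim, such as prior-work citations and hypotheses, so that only statements about the present study remain.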

Computing the Score

Scoring Example
In this document:
– 3 Current Study
– 3 Hypothesis
– 1 Opposes
– 1 Supports
– 2 Claim
– 3 Citation
CStudy = 1, Hyp = 1, Op = 1, SupOrClaim = 1, Cite = 1
AutoScore = 5
Expert score = 3
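
The arithmetic in this example is consistent with a presence-based score: each component contributes 1 if its tag occurs at least once, and Supports and Claim share a single component. The sketch below reproduces the example under that reading. The scoring rule is inferred from the numbers on this slide rather than stated in the transcript, and the function and dictionary key names are hypothetical.

```python
def auto_score(tag_counts):
    """Sum five presence indicators derived from per-essay tag counts.

    Assumed rule (inferred from the worked example): each indicator is 1 when
    the corresponding tag appears at least once, and Supports and Claim are
    merged into one indicator.
    """
    def present(tag):
        return tag_counts.get(tag, 0) > 0

    indicators = {
        "CStudy": int(present("Current Study")),
        "Hyp": int(present("Hypothesis")),
        "Op": int(present("Opposes")),
        "SupOrClaim": int(present("Supports") or present("Claim")),
        "Cite": int(present("Citation")),
    }
    return sum(indicators.values())


# Tag counts from the slide's example document.
example_counts = {
    "Current Study": 3,
    "Hypothesis": 3,
    "Opposes": 1,
    "Supports": 1,
    "Claim": 2,
    "Citation": 3,
}
print(auto_score(example_counts))  # prints 5, matching AutoScore = 5
```

Under this reading the score saturates at 5 as soon as every element type appears once. That would fit the expert giving this essay a 3 while the algorithm gives it the maximum.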

Experimental Results
Hypotheses
– Automatically generated scores should be similar to expert scores
– Automatically generated scores should correlate with expert scores
Evaluation
– Extrinsic evaluation of argument mining via essay scoring

Results
One-sample t-test: automatic scores are generally significantly different from expert scores
Algorithm tends to overscore
[Table columns: Expert Score, Average, T-value, n, P-value]
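
For reference, the one-sample t statistic is t = (mean − μ) / (s / √n), with n − 1 degrees of freedom. The sketch below computes it with the standard library. Two things here are assumptions rather than facts from the slides: that the test compares the automatic scores of essays sharing an expert score against that expert score, and the sample values themselves, which are invented for illustration and are not the study's data.

```python
import math
import statistics


def one_sample_t(sample, popmean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean, using the sample standard deviation (n - 1).
    std_err = statistics.stdev(sample) / math.sqrt(n)
    return (mean - popmean) / std_err, n - 1


# Hypothetical automatic scores for essays that an expert scored 3.
hypothetical_auto_scores = [4, 5, 5, 4, 5]
t_value, dof = one_sample_t(hypothetical_auto_scores, popmean=3)
print(f"t = {t_value:.3f}, df = {dof}")
```

A positive t means the automatic scores sit above the expert score, which is the overscoring pattern the slide reports. The p-value would come from the t distribution with the returned degrees of freedom, for example via a statistics package.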

Results
Spearman correlation between automatically generated and expert scores is significant
Thus, scores can be ranked
However, Pearson correlation is not significant
[Table: rho, p; surviving value: 2.313E-59]
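
Spearman's rho is the Pearson correlation computed on ranks, so it rewards agreement in ordering even when the relationship is not linear. A standard-library sketch of both coefficients follows. The helper names are hypothetical, and the toy data are invented to show the two measures diverging; they are not the study's scores.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Toy data: perfectly monotone but far from linear.
toy_x = [1, 2, 3, 4, 5]
toy_y = [1, 2, 3, 4, 100]
print(spearman(toy_x, toy_y), pearson(toy_x, toy_y))
```

In the toy data the ordering agrees perfectly, so rho is 1 while Pearson's r is lower. This is the same qualitative pattern as a ranking that is usable even though the raw scores do not track each other linearly.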

Conclusions
Hypothesis 2 (automatically generated scores should correlate with expert scores): supported
– The number of automatically generated tags for diagram elements is positively correlated with score
Hypothesis 1 (automatically generated scores should be similar to expert scores): not supported
– The scoring algorithm, the ontology-recognition algorithm, or both are currently not good enough

Future Work
Improve ontology-mining and scoring algorithms
– Parsing more discourse information (e.g. PDTB, RST)
– Exploiting the diagrams directly
– Data-driven algorithm development
Intrinsic as well as extrinsic evaluation
– Newly annotated essay corpus

Questions?
Acknowledgements
– National Science Foundation
More Information –