Machine Reading

Introduction. Definition: Machine Reading is a task that deals with the automatic understanding of texts, as framed by the "KnowItAll" research group at the University of Washington [1]. The target is understanding text.

Related work. Question Answering and Information Extraction typically utilize supervised learning techniques, which rely on hand-tagged training examples. In particular, Information Extraction often utilizes extraction rules learned from example extractions of each target relation, which makes it impractical to generate a set of hand-tagged examples for every relation of interest.

Difference between fact extraction and Machine Reading (MR). In MR the relations are not known in advance, so it is impractical to generate hand-tagged examples of each relation of interest; MR is therefore inherently unsupervised. It also forges and updates connections between beliefs, rather than focusing on isolated "nuggets" obtained from text.

Reference: [1] Etzioni, Oren, Michele Banko, and Michael J. Cafarella. "Machine Reading." AAAI, Vol. 6, 2006.

QA4MRE: Question Answering for Machine Reading Evaluation, run at CLEF in 2011, 2012, and 2013.

Summary. Main objective: develop a methodology for evaluating Machine Reading systems through Question Answering and Reading Comprehension tests. Systems should be able to extract knowledge from large volumes of text and use this knowledge to answer questions. The methodology should allow the comparison of systems' performance and the study of the best approaches.

QA4MRE: organization, tasks, participants, results, and the winners' methods.

CLEF 2011. Host: the University of Amsterdam. Time: 19-22 September 2011. Place: Amsterdam, The Netherlands.

CLEF 2012. Host: the University of Rome "La Sapienza". Time: 17-20 September 2012. Place: Rome, Italy.

CLEF 2013. Host: the Technical University of Valencia. Time: 23-26 September 2013. Place: Valencia, Spain.

The task: reading single documents and identifying the answers to a set of questions. Questions are posed as multiple choice with only one correct answer, and they require both semantic understanding and a reasoning process.
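A minimal sketch of how such a test could be represented in memory; the field names are illustrative, not the official QA4MRE data format.

```python
# Illustrative in-memory representation of a reading test (hypothetical field
# names, not the official QA4MRE format): one document plus multiple-choice
# questions, each with five candidate answers and exactly one correct option.
from dataclasses import dataclass
from typing import List

@dataclass
class Question:
    text: str              # the question asked about the document
    candidates: List[str]  # the five candidate answers
    correct_index: int     # index of the single correct answer

@dataclass
class ReadingTest:
    document: str              # the single test document to be read
    questions: List[Question]  # the questions attached to that document
```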

Requirements: understand the test questions; analyze the relations among entities contained in the questions and entities expressed by the candidate answers; understand the information contained in the documents; extract useful pieces of knowledge from the background collections; and select the correct answer from the five alternatives proposed.
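These requirements describe what a system must do, not how. The following is a runnable sketch of a naive word-overlap baseline; the heuristic and the option to abstain are illustrative only, not a description of any participating system.

```python
# Naive word-overlap baseline for the multiple-choice task: find the document
# sentence most similar to the question, then pick the candidate answer that
# overlaps most with that sentence, or abstain when the evidence is too weak.
def tokenize(text):
    return set(text.lower().split())

def answer_question(document, question, candidates, min_overlap=1):
    # pick the document sentence sharing the most words with the question
    q_tokens = tokenize(question)
    support = max(document.split("."), key=lambda s: len(tokenize(s) & q_tokens))
    # score each candidate by word overlap with that supporting sentence
    scores = [len(tokenize(c) & tokenize(support)) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    # leave the question unanswered when no candidate is sufficiently supported
    return best if scores[best] >= min_overlap else None
```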

Testing data: available in several languages (Arabic, Bulgarian, English, German, Italian, Romanian, and Spanish). Test sets were divided into topics: AIDS, Climate Change, Music and Society, and Alzheimer's disease. The background collections, testing documents, and candidate answers are all provided.

Background collections: 2011 vs. 2012 & 2013.

Evaluation: in order to improve results, systems might reduce the number of incorrect answers while keeping the proportion of correct ones by leaving some questions unanswered, so the evaluation considers the possibility of leaving questions unanswered. Responses are classified as R (right), W (wrong), NoA (no answer), NoA_R (unanswered, but with a right hypothetical answer), and NoA_W (unanswered, with a wrong hypothetical answer).

Difference in 2013: systems able to decide whether all candidate answers are incorrect should be rewarded over systems that merely rank answers. The tests therefore include a portion of questions (39%) where none of the original options is correct, and every question gains a new last option, "None of the above answers is correct" (NCA); a system that always selects NCA would score 0.39, so the baseline should be 0.39. Selecting NCA when no option is correct is better than simply leaving the question unanswered (NoA).

Evaluation measure: the main measure is c@1 = (nR + nU * (nR / n)) / n, where nR is the number of questions correctly answered, nU is the number of questions unanswered, and n is the total number of questions. Random choice among the five options gives an expected score of 0.2, so the baseline is 0.2.
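A small sketch of how c@1 could be computed from per-question outcomes, based on the formula above (not the official QA4MRE scoring script).

```python
# c@1 rewards unanswered questions (NoA) in proportion to the accuracy the
# system shows on the questions it does answer. Outcomes: "R", "W", "NoA".
def c_at_1(outcomes):
    n = len(outcomes)
    n_r = sum(1 for o in outcomes if o == "R")    # correctly answered
    n_u = sum(1 for o in outcomes if o == "NoA")  # left unanswered
    return (n_r + n_u * (n_r / n)) / n

# Example: 6 right, 2 wrong and 2 unanswered out of 10 questions -> 0.72
print(c_at_1(["R"] * 6 + ["W"] * 2 + ["NoA"] * 2))
```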

Evaluation measure (secondary): a secondary measure is also computed from the same quantities, where nR is the number of questions correctly answered, nU is the number of questions unanswered, and n is the total number of questions. Baseline: 0.2.

Evaluation measure for validation performance: nUR, the number of unanswered questions whose candidate answer was correct; nUW, the number of unanswered questions whose candidate answer was incorrect; nUE, the number of unanswered questions whose candidate answer was empty.
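A sketch of the bookkeeping these counts imply, assuming each unanswered question records the hypothetical answer the system would have given (or none); this is illustrative, not the official scorer.

```python
# Classify unanswered questions into nUR / nUW / nUE. Each entry pairs the
# system's hypothetical answer (or None if none was recorded) with the gold answer.
def validation_counts(unanswered):
    n_ur = n_uw = n_ue = 0
    for hypothetical, gold in unanswered:
        if hypothetical is None:
            n_ue += 1          # unanswered, no candidate answer recorded
        elif hypothetical == gold:
            n_ur += 1          # unanswered, but the candidate answer was correct
        else:
            n_uw += 1          # unanswered, and the candidate answer was wrong
    return n_ur, n_uw, n_ue
```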

Overview of results

Results in 2011

Results in 2012

Results in 2013

The use of external knowledge: in Run 1, no external knowledge is allowed other than the background collections provided by the organization.

Websites: http://clef2011.clef-initiative.eu/index.php, http://clef2012.clef-initiative.eu/index.php, http://clef2013.clef-initiative.eu/index.php

Thank you