Answer Validation Exercise. Anselmo Peñas, UNED NLP Group. 2005 Breakout session.

Presentation transcript:

Question -> QA System -> DOC + Answer -> Human assessment -> Correct
Example question: What is BMW?
Example answer: Bayerische Motoren Werke

Answer validation
Question: What is BMW?
Answer: Bayerische Motoren Werke (+ DOC)
According to DOC, is the following true? "BMW is Bayerische Motoren Werke": YES / NO
This is similar to a closed question: Is BMW Bayerische Motoren Werke?
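The reformulation on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `to_statement` and the single "What is/are X?" pattern are assumptions made for the example, not the method used in the exercise.

```python
import re


def to_statement(question: str, answer: str) -> str:
    """Turn a 'What is X?' question plus an answer into 'X is <answer>'.

    Hypothetical helper: it only handles the 'What is/are ...?' pattern
    from the slide's BMW example.
    """
    match = re.fullmatch(r"What (is|are) (.+)\?", question.strip())
    if match is None:
        raise ValueError(f"unsupported question pattern: {question!r}")
    verb, subject = match.groups()
    return f"{subject} {verb} {answer.strip()}"


print(to_statement("What is BMW?", "Bayerische Motoren Werke"))
```

The resulting statement, together with DOC, is what a validation system would be asked to judge as YES or NO.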

Exercise
1. Take all answers of all runs (in the language)
2. Reformulate them as statements
3. Give them to the participant systems
4. Perform the automatic validation: is the statement true according to the document?
5. Compare the automatic validation with the human assessment
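Step 5 can be sketched as a simple agreement score between the system's YES/NO decisions and the human assessments. The slides do not say which measure is used, so plain agreement is an assumption here, and the function name and toy data are hypothetical.

```python
def validation_agreement(system: dict, human: dict) -> float:
    """Fraction of statements where the system's YES/NO matches the human one.

    Assumption: simple agreement; the slides do not name the measure.
    """
    shared = system.keys() & human.keys()
    if not shared:
        raise ValueError("no statements in common")
    agree = sum(system[s] == human[s] for s in shared)
    return agree / len(shared)


# Toy data: True means "the statement is true according to the document".
human = {
    "BMW is Bayerische Motoren Werke": True,
    "BMW is uncertain about the future of the company": False,
}
system = {
    "BMW is Bayerische Motoren Werke": True,
    "BMW is uncertain about the future of the company": True,
}

print(validation_agreement(system, human))
```

Only statements judged by both sides are counted, so a system that skips some statements is scored on the rest.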

Why?
- A good answer validation can improve systems' performance
  Example: INAOE achieves 80% accuracy in Definitions
- Expected improvement in systems' self-scoring
- Better criteria for collaborative systems
  40% accuracy alone versus 70% accuracy in perfect combination
- Feedback for a system's component
- Deal with closed questions (and maybe inference)
- Newcomers can start with a single module
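The gap between "alone" and "perfect combination" can be illustrated with a small sketch. It assumes that perfect combination means a question counts as correct whenever at least one system answered it correctly. The system names and correctness flags are made-up toy data and do not reproduce the 40% and 70% figures.

```python
def accuracy(correct_flags: list) -> float:
    """Share of questions answered correctly."""
    return sum(correct_flags) / len(correct_flags)


def perfect_combination_accuracy(runs: dict) -> float:
    """Accuracy of an oracle that picks a correct answer whenever any run has one.

    Assumption: this is what 'perfect combination' means on the slide.
    """
    per_question = zip(*runs.values())
    return accuracy([any(flags) for flags in per_question])


# Hypothetical correctness flags for five questions.
runs = {
    "system_a": [True, False, False, True, False],
    "system_b": [False, True, False, False, False],
}

for name, flags in runs.items():
    print(name, accuracy(flags))
print("perfect combination", perfect_combination_accuracy(runs))
```

A validation module that reliably picks the right answer among candidate runs would move a combined system from the individual accuracies towards this upper bound.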

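Training data: assessments
What is BMW? (¿Qué es BMW?)
BMW is Bayerische Motoren Werke (BMW es Bayerische Motoren Werke)
a traditional manufacturer of large cars (un tradicional fabricante de automóviles grandes)
Bayerische Motoren Werke
uncertainty about the future of the British firm (incertidumbre sobre el futuro de la firma británica)
Ricardo García Galiano
Luis Martínez Osella
What are the FARC? (¿Qué son las FARC?)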

Question -> QA System -> DOC + Answer -> Human assessment
Question: What is BMW?
Answer: Uncertain about the future of the company
Given DOC, is the following true? "BMW is uncertain about the future of the company"
Human assessment: wrong
But... sure?

Thanks!