QA Systems in QALD Hybrid Task


QA Systems in QALD Hybrid Task Qingxia Liu 2016/02/29

An Overview of the Hybrid Task: QALD-4 Task 3
25 training questions, 10 test questions; no system participated.
Example: Which anti-apartheid activist was born in Mvezo?
    SELECT DISTINCT ?uri WHERE { ?uri rdf:type text:"anti-apartheid activist" . ?uri dbo:birthPlace res:Mvezo . }
Example: Which writers had influenced the philosopher that refused a Nobel Prize?
    SELECT DISTINCT ?uri WHERE { ?x rdf:type dbo:Philosopher . ?x text:"refuse" text:"Nobel Prize" . ?x dbo:influencedBy ?uri . ?uri rdf:type dbo:Writer . }
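The hybrid queries above mix ordinary triple patterns with `text:` patterns that must be answered from free text. The following is a minimal Python sketch of how the two kinds could be separated before answering; the helper name `split_hybrid_patterns` and the assumption that patterns are separated by " . " are illustrative and not part of the QALD task definition.

```python
import re

# Matches a textual term such as text:"Nobel Prize" (assumed syntax, taken from the slide).
TEXT_TERM = re.compile(r'text:"[^"]*"')

def split_hybrid_patterns(where_body):
    """Split a WHERE body into (structured, textual) triple-pattern strings.

    Hypothetical helper: assumes patterns are separated by ' . ' as on the slide.
    """
    structured, textual = [], []
    for pattern in re.split(r'\s+\.(?:\s+|$)', where_body.strip()):
        pattern = pattern.strip()
        if not pattern:
            continue  # skip the empty piece left after the final ' .'
        if TEXT_TERM.search(pattern):
            textual.append(pattern)      # needs text search
        else:
            structured.append(pattern)   # can be sent to the RDF store
    return structured, textual

body = ('?x rdf:type dbo:Philosopher . ?x text:"refuse" text:"Nobel Prize" . '
        '?x dbo:influencedBy ?uri . ?uri rdf:type dbo:Writer .')
structured, textual = split_hybrid_patterns(body)
print(structured)
print(textual)
```

Only the pattern with `text:` terms ends up in `textual`; the other three can be evaluated over DBpedia directly.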

An Overview of the Hybrid Task: QALD-5 Task 2
40 training questions, 10 test questions; 5 systems.

ISOFT
Authors: Seonyeong Park, Soonchoul Kwon, Byungsoo Kim and Gary Geunbae Lee (Pohang University of Science and Technology, South Korea)
Idea:
    semantic answer type (SAT) detection
    search on a multi-tagged text database
    decompose the question by contracting the right-most phrases
SAT, trained with a 3-level type ontology: 71.42%; with 2 levels: 84.62%

SAT classification: features are keywords and the literal answer type (named entities are not used); trained with libSVM; questions are classified based on the interrogative word
Tags: coreference and disambiguation; NL tags (POS, dependency, SRL)
For each subquery: search on sentences, with all NEs as candidates
If no good answer is found, generate SPARQL:
    comparative -> property + compare
    predicate mapping by vector (ESA lib)

ISOFT: example questions
    Who is the architect of the tallest building in Japan?
    Are there man-made lakes in Australia that are deeper than 100 meters?
    In which city were Charlie Chaplin's half brothers born?
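The ISOFT slides mention classifying questions by their interrogative word. The sketch below is a toy rule-based stand-in for that idea, not the libSVM classifier the authors trained; the cue list and the type labels are assumptions made for illustration.

```python
import re

# Hypothetical cue -> coarse answer type rules, checked in order.
RULES = [
    ("how many", "number"),
    ("who", "person"),
    ("where", "location"),
    ("when", "date"),
    ("which", "entity"),
    ("what", "entity"),
]
# A question opening with one of these words is treated as yes/no.
BOOLEAN_STARTS = ("is", "are", "was", "were", "do", "does", "did")

def coarse_answer_type(question):
    """Guess a coarse answer type from the interrogative word."""
    q = question.lower().strip()
    words = q.split()
    if words and words[0] in BOOLEAN_STARTS:
        return "boolean"
    for cue, label in RULES:
        # Whole-word match so that "were" does not trigger "where".
        if re.search(r"\b" + re.escape(cue) + r"\b", q):
            return label
    return "unknown"

for q in [
    "Who is the architect of the tallest building in Japan?",
    "Are there man-made lakes in Australia that are deeper than 100 meters?",
    "In which city were Charlie Chaplin's half brothers born?",
]:
    print(coarse_answer_type(q), "<-", q)
```

A real system would refine "entity" using the noun after "which" (here, "city"), which is where the type ontology comes in.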

HAWK: ranking techniques
Authors: Ricardo Usbeck, Axel-Cyrille Ngonga Ngomo (University of Leipzig, Germany)
    Axel-Cyrille Ngonga Ngomo is among the organizers of QALD-3, 4, 5 and 6
Main idea:
    Generate hybrid queries by BFS over the dependency tree, mapping each node to a triple pattern (with at least one variable)
    Rank queries by training 3 ranking methods

HAWK (ESWC 2015)
Example: Which anti-apartheid activist was born in Mvezo? (the anti-apartheid activist is Mandela)
Training:
    Optimal Ranking (determine bad parts)
    Feature-based Ranking (rank queries); features: number of nodes, number of triples, number of types
    Overlap-based Ranking (rank answers)
Steps:
    Remove meaningless nodes in the tree by POS tags (auxiliary tokens, e.g. "did")
    Link classes and properties: fuzzy string search, use lexicon (lemon.dbpedia)
        born(anti-apartheid activist, dbr:Mvezo); "born": dbo:birthPlace, dbo:birthDate
    BFS on the pruned tree: possible triple patterns according to node tag
        1. SELECT ?proj { ?proj text:query 'anti-apartheid activist' . ?proj dbo:birthPlace dbr:Mvezo . }
        2. SELECT ?proj { ?proj text:query 'anti-apartheid activist' . ?proj dbo:birthDate dbr:Mvezo . }
        3. SELECT ?proj { ?proj text:query 'anti-apartheid activist' . ?const dbo:birthPlace ?proj . }
    Pruning by features: drop queries that contain an unbound triple pattern, e.g. (?s, ?p, ?o), or that are unconnected, cyclic, lack a projection variable, or violate disjointness
FUSEKI: supports retrieval of text information on specific URIs
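Some of the pruning rules listed for HAWK can be expressed as simple checks over a candidate query's triple patterns. The sketch below covers three of them (unbound pattern, missing projection, unconnected patterns); the tuple representation and all function names are assumptions for illustration and do not reproduce HAWK's actual implementation. The cyclic and disjointness checks are omitted.

```python
def is_var(term):
    """A term is a variable if it starts with '?'."""
    return term.startswith("?")

def has_unbound_pattern(patterns):
    """True if some pattern is fully unbound, e.g. (?s, ?p, ?o)."""
    return any(all(is_var(t) for t in p) for p in patterns)

def lacks_projection(patterns, proj):
    """True if the projection variable appears in no pattern."""
    return not any(proj in p for p in patterns)

def is_connected(patterns):
    """True if all patterns are linked through shared subject/object nodes."""
    if not patterns:
        return False
    seen = {patterns[0][0], patterns[0][2]}
    remaining = list(patterns[1:])
    changed = True
    while changed and remaining:
        changed = False
        for p in list(remaining):
            if p[0] in seen or p[2] in seen:
                seen.update((p[0], p[2]))
                remaining.remove(p)
                changed = True
    return not remaining

def keep_query(patterns, proj="?proj"):
    """Keep a candidate only if it passes all three checks."""
    return (not has_unbound_pattern(patterns)
            and not lacks_projection(patterns, proj)
            and is_connected(patterns))

candidate_1 = [("?proj", "text:query", "anti-apartheid activist"),
               ("?proj", "dbo:birthPlace", "dbr:Mvezo")]
unconnected = [("?proj", "text:query", "anti-apartheid activist"),
               ("?const", "dbo:birthPlace", "dbr:Mvezo")]
print(keep_query(candidate_1), keep_query(unconnected), keep_query([("?s", "?p", "?o")]))
```

Candidate 1 from the slide survives, while a query whose two patterns share no node, or a fully unbound pattern, is discarded before ranking.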

HAWK: results
Optimal Ranking > overlap-based, feature-based
Without considering YAGO: 26/40, 8/10; trained on QALD-4; measured by F-measure
Errors:
    failing entity annotation, e.g. Jane_T._Austion, G8, Los_Alamos
    missing type information in the gold standard resource
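The results above are reported as F-measure. As a reminder of how that score is obtained for a single question, here is a minimal sketch assuming set-based comparison of system answers against gold answers; the function name and the example URIs are made up for illustration.

```python
def precision_recall_f1(system_answers, gold_answers):
    """Set-based precision, recall and F1 for one question."""
    system, gold = set(system_answers), set(gold_answers)
    correct = len(system & gold)
    precision = correct / len(system) if system else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical answers: one of two returned answers is correct,
# and one of two gold answers is found.
p, r, f1 = precision_recall_f1({"dbr:A", "dbr:C"}, {"dbr:A", "dbr:B"})
print(p, r, f1)
```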

YodaQA: search-based methods
Authors: Petr Baudiš and Jan Šedivý (Czech Technical University, Czech Republic)
    NLP, ML, speech recognition
Main idea:
    A modular QA system pipeline, originally built for unstructured data
    Search-based methods: generate answers by passage analysis (a passage here is a sentence)
Tricks:
    Goal: factoid single-answer questions
    List questions: top 15
    Boolean questions: always return true
    Extract focus: 6 simple hand-crafted heuristics

YodaQA example: Who wrote Ender's Game?
Question analysis: Focus: who; SV: wrote; LAT (lexical answer type): person; Clues: Ender's Game, wrote
Search (get related sentences):
    Title Text Search: title matching (logistic regression); rank entities, then take the first sentences in the top 6 docs
    Full-text Search: title + article sentences; rank entities and take the top 6 docs, then score each sentence and take the top 3 passages in each doc
    Document Search: search in wiki docs (top 20 docs); take the entity directly
Type coercion: WordNet hypernymy relation
Class and property mapping
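The full-text search step (top 6 documents, then top 3 passages per document) is a two-level top-k selection. A minimal sketch follows, assuming each document already carries a retrieval score and scored passages; the data layout, scores and function name are hypothetical and not taken from YodaQA's code.

```python
def select_passages(docs, top_docs=6, top_passages=3):
    """Keep the best passages from the best documents.

    docs: list of dicts with keys 'title', 'score' and
          'passages' (a list of (text, score) pairs).
    Returns (title, passage_text) pairs, best document first.
    """
    ranked_docs = sorted(docs, key=lambda d: d["score"], reverse=True)[:top_docs]
    selected = []
    for doc in ranked_docs:
        best = sorted(doc["passages"], key=lambda p: p[1], reverse=True)[:top_passages]
        selected.extend((doc["title"], text) for text, _ in best)
    return selected

# Toy data with made-up scores.
docs = [
    {"title": "Ender's Game", "score": 0.9,
     "passages": [("p1", 0.2), ("p2", 0.8), ("p3", 0.5), ("p4", 0.1)]},
    {"title": "Orson Scott Card", "score": 0.7, "passages": [("q1", 0.6)]},
    {"title": "Unrelated", "score": 0.1, "passages": [("r1", 0.99)]},
]
result = select_passages(docs, top_docs=2, top_passages=3)
print(result)
```

Note that a highly scored passage in a low-ranked document ("r1" here) is never considered, which is the trade-off of filtering documents first.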

Errors: missing type information in the gold standard resource

Error Cases (table columns: id, ISOFT, HAWK)
    1. Where was the "Father of Singapore" born? (unable to generate SPARQL query)
    2. Which Secretary of State was significantly involved in the United States' dominance of the Caribbean? (wrong answer)
    3. Who is the architect of the tallest building in Japan? √
    4. What is the name of the Viennese newspaper founded by the creator of the croissant? (predicate URI not found)
    5. In which city were Charlie Chaplin's half brothers born? partial
    6. Which German mathematicians were members of the von Braun rocket group?
    7. Which writers converted to Islam?
    8. Are there man-made lakes in Australia that are deeper than 100 meters?
    9. Which movie by the Coen brothers stars John Turturro in the role of a New York City playwright?
    10. Which of the volcanoes that erupted in 1550 is still active? (cannot generate appropriate query)
Property:
    How many scientists graduated from an Ivy League university?
        SELECT DISTINCT count(?uri) WHERE { ?uri rdf:type dbo:Scientist . ?uri dbo:almaMater ?university . ?university dbo:affiliation dbr:Ivy_League . }
    Which animals are critically endangered?
        SELECT DISTINCT ?uri WHERE { ?uri rdf:type dbo:Animal . ?uri dbo:conservationStatus 'CR' . }

Summary
Components (entries in the order ISOFT; HAWK; YodaQA)
    basic NLP: deep NLP; dependency tree; keywords, focus, LAT
    deep NLP: co-reference
    textual solution: phrase-based search; text:query; clue-based search
    structural solution: SPARQL template triggered by words; dependency tree node to triple; class, property mapping
    question decomposition: concatenating the two rightmost phrases and finding the answer
    errors: missing type info in gold standard resource; failing entity annotation

Thank you ~