Linguistic Resources for the 2013 TAC KBP Entity Linking Evaluation Joe Ellis (presenter), Justin Mott, Xuansong Li, Jeremy Getman, Jonathan Wright, Stephanie Strassel Linguistic Data Consortium University of Pennsylvania, USA

2013 Source Corpus

Language   Genre               Documents
English    Newswire            1,000,257
English    Web Text              999,999
English    Discussion Forums      99,063
Chinese    Newswire            2,000,256
Chinese    Web Text              815,886
Chinese    Discussion Forums     199,321
Spanish    Newswire              910,734

TAC KBP Evaluation Workshop – NIST, November 18-19, 2013

Entity Linking Overview

Stage 1: Select namestrings and reference documents
Stage 2: Link namestrings to KB or mark as NIL
Stage 3: Co-reference NIL entities (slide example: "Wendy", "Wendy Gaxiola")

Entity Linking – Stage 1

• Run named entity taggers over source corpora
  - Provides guided search through the corpus
  - Thanks KBP coordinators!
• Namestring Selection
  - Confusable
  - Ambiguous
  - Varied

Entity Linking – Stage 1: Namestring Selection

• Ratios
  - NIL & non-NIL
  - Entity types
  - Genre
• Measurable confusability
  - Multiple-entity namestrings (“Smith”)
  - Multiple-namestring entities (“Barack Obama”, “Bam-Bam”, “Bammy”)
  - NIL singletons
  - Cross-lingual
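As a rough illustration of how the confusability measures on this slide could be computed over a candidate query set, here is a minimal Python sketch. The query tuples, the entity IDs, and the convention that IDs beginning with "NIL" mark entities without a KB node are all assumptions made for this example; the slides do not describe LDC's actual tooling.

```python
from collections import defaultdict

# Hypothetical annotated queries: (namestring, entity_id).
# Assumption for this sketch: IDs starting with "NIL" have no KB node.
queries = [
    ("Smith", "E1"),
    ("Smith", "E2"),
    ("Barack Obama", "E3"),
    ("Bam-Bam", "E3"),
    ("Bammy", "E3"),
    ("Wendy", "NIL1"),
]


def confusability(queries):
    """Summarise ambiguity, variety, and NIL coverage of a query set."""
    entities_per_name = defaultdict(set)
    names_per_entity = defaultdict(set)
    mentions_per_entity = defaultdict(int)
    for name, entity in queries:
        entities_per_name[name].add(entity)
        names_per_entity[entity].add(name)
        mentions_per_entity[entity] += 1

    nil_queries = sum(1 for _, entity in queries if entity.startswith("NIL"))
    return {
        # One namestring that refers to several entities ("Smith").
        "multi_entity_namestrings": sorted(
            n for n, ents in entities_per_name.items() if len(ents) > 1
        ),
        # One entity referred to by several namestrings.
        "multi_namestring_entities": sorted(
            e for e, names in names_per_entity.items() if len(names) > 1
        ),
        # NIL entities that are mentioned by exactly one query.
        "nil_singletons": sorted(
            e for e, count in mentions_per_entity.items()
            if e.startswith("NIL") and count == 1
        ),
        # Share of queries with no KB match.
        "nil_ratio": nil_queries / len(queries),
    }


stats = confusability(queries)
```

Tracking these counts while selecting namestrings is one way to keep the ratios listed above (NIL vs. non-NIL, entity types, genre) near their targets; the same grouping could be repeated per entity type or per genre.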

Entity Linking – Stages 2 & 3: KB Linking and NIL Coref

• KB Linking
  - Review ref document and search KB for matching node
  - Multiple namestrings viewed together for quicker linking
• NIL Coreference
  - NIL queries (no KB match) require manual co-reference annotation
  - Time-limited quality control pass to enhance completeness and accuracy
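The outcome of Stages 2 and 3 can be pictured as one label per query: either a KB node or a NIL cluster shared by co-referent NIL queries. The sketch below is only an illustration of that bookkeeping. The function name `assign_links`, the judgment tuples, the KB node ID, and the `NIL0001`-style labels are hypothetical and are not taken from the slides.

```python
def assign_links(judgments):
    """Map each query to a KB node or to a shared NIL cluster label.

    judgments: list of (query_id, kb_node, nil_group) where kb_node is None
    for queries with no KB match, and nil_group is the annotator's manual
    co-reference key for those NIL queries (hypothetical representation).
    """
    nil_labels = {}
    links = {}
    for query_id, kb_node, nil_group in judgments:
        if kb_node is not None:
            # Stage 2: a matching KB node was found.
            links[query_id] = kb_node
        else:
            # Stage 3: co-referent NIL queries share one cluster label.
            if nil_group not in nil_labels:
                nil_labels[nil_group] = f"NIL{len(nil_labels) + 1:04d}"
            links[query_id] = nil_labels[nil_group]
    return links


# Hypothetical judgments; Q2 and Q3 were judged to name the same person.
judgments = [
    ("Q1", "E0000001", None),
    ("Q2", None, "wendy-gaxiola"),
    ("Q3", None, "wendy-gaxiola"),
    ("Q4", None, "other-person"),
]
links = assign_links(judgments)
```

Here `Q2` and `Q3` end up with the same NIL label while `Q4` gets its own, which is the distinction the manual co-reference pass is meant to capture.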

Delivered 2013 Resources

Corpus Title: TAC 2013 KBP English Entity Linking Evaluation Queries and Knowledge Base Links
  Type: Evaluation
  LDC Catalog: LDC2013E90
  Language: English
  Size: 803 GPE, 686 PER, 701 ORG

Corpus Title: TAC 2013 KBP Chinese Entity Linking Evaluation Queries and Knowledge Base Links
  Type: Evaluation
  LDC Catalog: LDC2013E96
  Language: Chinese, English
  Size: 714 GPE, 706 PER, 735 ORG

Corpus Title: TAC 2013 KBP Spanish Entity Linking Evaluation Queries and Knowledge Base Links
  Type: Evaluation
  LDC Catalog: LDC2013E97
  Language: Spanish, English
  Size: 660 GPE, 695 PER, 762 ORG
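For a quick sense of scale, the per-type query counts on this slide can be totalled per language. The snippet below is plain arithmetic over the figures shown above; the variable names are illustrative only.

```python
# Query counts per entity type, copied from the Delivered 2013 Resources slide.
query_counts = {
    "English": {"GPE": 803, "PER": 686, "ORG": 701},
    "Chinese": {"GPE": 714, "PER": 706, "ORG": 735},
    "Spanish": {"GPE": 660, "PER": 695, "ORG": 762},
}

# Total queries per evaluation set, and across all three sets.
totals = {lang: sum(counts.values()) for lang, counts in query_counts.items()}
grand_total = sum(totals.values())

for lang, total in totals.items():
    print(f"{lang}: {total} queries")
print(f"All languages: {grand_total} queries")
```

This gives 2,190 English, 2,155 Chinese, and 2,117 Spanish queries, 6,462 in total, with the three entity types roughly balanced within each set.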