1
Discovering Emerging Entities with Ambiguous Names
2
Content
Motivation; Approach; Experiments; Architecture; Disambiguation Confidence; Extended keyphrase model; Experiments
3
Motivation Emerging entities (EE):
Our world is highly dynamic. Every day, new songs are composed, new movies are released, new companies are founded, and there are new weddings and sports matches. New entities may appear under the same names as existing ones: when hurricane “Sandy” occurred, several singers, cities, and other entities with the name “Sandy” already existed in Wikipedia. There can also be multiple EEs, all out-of-KB, with the same name. NERD: named entity recognition and disambiguation.
4
Motivation Prior methods
Prior methods apply a threshold to the scores they compute for mapping a given mention to its candidate entities. In difficult situations, the empirical quality is not good [1]. The threshold is hard to tune in a robust manner, and it can have an adverse effect on other entity linking decisions.
5
Approach
Assessing the confidence of the NED method's mapping of mentions to in-KB entities, by perturbing the mention-entity space of the NED method. Enriching a possible EE with a keyphrase representation: the method builds a global set of keyphrases, then computes a model difference between the global model and the union of all in-KB models.
6
Architecture NED based on AIDA[2] and KORE[3]
7
Disambiguation Confidence
Normalizing Scores Perturbing Mentions Perturbing Entities
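The slide names score normalization without giving a formula. A minimal sketch, assuming the NED method returns non-negative raw scores per candidate entity, is to divide each score by the sum over all candidates of the mention. The function name and the entity identifiers below are hypothetical and chosen only for illustration.

```python
def normalized_confidence(scores):
    """Map raw candidate scores for one mention to values that sum to 1.

    `scores` is a dict {entity_id: raw_score}; scores are assumed to be
    non-negative (an assumption, not something the slides state).
    """
    if not scores:
        return {}
    total = sum(scores.values())
    if total <= 0:
        # No usable evidence: fall back to a uniform distribution (assumption).
        return {entity: 1.0 / len(scores) for entity in scores}
    return {entity: score / total for entity, score in scores.items()}


# Hypothetical candidates for the mention "Sandy".
raw = {"Sandy_(hurricane)": 3.0, "Sandy_(singer)": 1.0}
conf = normalized_confidence(raw)
```

The confidence of the chosen mapping is then the normalized score of the top-ranked candidate. This measure looks at each mention in isolation, which is presumably why the slides pair it with the perturbation-based measures.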
8
Perturbing Entities
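The transcript keeps only the title of this slide. One plausible reading, sketched below under stated assumptions, is: force a random subset of mentions onto alternative candidate entities, re-run the NED method, and count how often each remaining mention keeps its original entity. The callable `ned(mentions, forced)`, the parameters `rounds` and `frac`, and the toy data are all hypothetical; they are not the interface of AIDA or of any other real system.

```python
import random


def entity_perturbation_confidence(mentions, candidates, ned,
                                   rounds=50, frac=0.3, seed=0):
    """Estimate per-mention stability under forced alternative mappings.

    ned(mentions, forced) is a hypothetical callable that returns a dict
    {mention: entity} and respects the assignments given in `forced`.
    """
    rng = random.Random(seed)
    baseline = ned(mentions, {})
    kept = {m: 0 for m in mentions}
    seen = {m: 0 for m in mentions}
    for _ in range(rounds):
        forced = {}
        for m in mentions:
            alternatives = [e for e in candidates[m] if e != baseline[m]]
            if alternatives and rng.random() < frac:
                forced[m] = rng.choice(alternatives)
        result = ned(mentions, forced)
        for m in mentions:
            if m in forced:
                continue  # only unforced mentions are evaluated
            seen[m] += 1
            if result[m] == baseline[m]:
                kept[m] += 1
    return {m: (kept[m] / seen[m] if seen[m] else 0.0) for m in mentions}


# Toy stand-in for a NED system: picks the first candidate unless forced.
cands = {"Sandy": ["Sandy_hurricane", "Sandy_singer"], "Paris": ["Paris_city"]}


def toy_ned(mentions, forced):
    return {m: forced.get(m, cands[m][0]) for m in mentions}


conf = entity_perturbation_confidence(["Sandy", "Paris"], cands, toy_ned)
```

The toy system ignores context, so every unforced mention is perfectly stable. A joint disambiguation method would show lower stability for mentions whose mapping depends on the entities chosen for their neighbors.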
9
Extended keyphrase model
Keyphrases for Existing Entities
In-KB entities: Wikipedia categories and href anchor texts.
Harvesting keyphrases from document collections: only for the entities that high-confidence mentions are mapped to by the given NED method. Extract all sequences conforming to a set of predefined part-of-speech tag patterns, mainly proper nouns and technical terms.
Modeling Emerging Entities
Exploiting news streams: mining EE-specific keyphrases from chunks of news articles that are in the vicinity of the publication date and time of the input document.
Model difference.
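The model difference can be illustrated with keyphrase counts. The sketch below assumes each model is a bag of keyphrases with occurrence counts and that the "union" of in-KB models is the sum of their counts; both are simplifying assumptions, and the function name and sample phrases are hypothetical.

```python
from collections import Counter


def emerging_entity_model(global_counts, in_kb_models):
    """Subtract the combined in-KB keyphrase counts from the global counts.

    What remains is treated as the keyphrase model of a possible emerging
    entity for one ambiguous name.
    """
    in_kb_union = Counter()
    for model in in_kb_models:
        in_kb_union.update(model)  # adds counts phrase by phrase
    # Counter subtraction drops phrases whose remaining count is not positive.
    return global_counts - in_kb_union


# Hypothetical keyphrases seen near the name "Sandy" in recent news.
global_model = Counter({"storm surge": 5, "pop singer": 2})
in_kb = [Counter({"pop singer": 2, "storm surge": 1})]
ee_model = emerging_entity_model(global_model, in_kb)
```

Here "pop singer" is fully explained by an existing entity and disappears, while "storm surge" keeps a residual count that characterizes the possible emerging entity.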
10
Experiments Disambiguation Confidence
Confidence is evaluated by discarding the mentions with confidence below k% and computing the fraction of correctly disambiguated mentions, with respect to the ground truth, for the remaining mentions. Reported measures: the number of mentions above k% confidence and the mean average precision (MAP).
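The thresholded evaluation can be sketched as follows, assuming each prediction is a pair of a confidence value in [0, 1] and a flag saying whether the mapping matches the ground truth. The function name and the sample values are hypothetical and are not results from the experiments.

```python
def precision_at_confidence(predictions, threshold):
    """Return (precision, number of kept mentions) at a confidence cutoff.

    `predictions` is a list of (confidence, is_correct) pairs. Mentions
    with confidence below `threshold` are discarded before scoring.
    """
    kept = [is_correct for confidence, is_correct in predictions
            if confidence >= threshold]
    if not kept:
        return None, 0  # precision is undefined when nothing is kept
    return sum(kept) / len(kept), len(kept)


# Made-up predictions, for illustration only.
preds = [(0.95, True), (0.90, True), (0.60, False), (0.40, True)]
high = precision_at_confidence(preds, 0.8)
low = precision_at_confidence(preds, 0.5)
```

Raising the threshold trades coverage for precision: fewer mentions remain, and a useful confidence measure should make the remaining ones more often correct.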
11
Experiments Emerging Entity Discovery
150 Associated Press news articles published on October 1st, 2010 and 150 published on November 1st, 2010, annotated with emerging entities (EE) relative to Wikipedia.
12
Experiments Emerging Entity Discovery
D, the collection of documents; Gd, all unique mentions in document d ∈ D annotated by a human annotator with a gold standard entity; Gd|EE, the subset of Gd annotated with an emerging entity EE; Gd|KB, the subset of Gd annotated with an existing, in-KB entity; Ad, all unique mentions in document d ∈ D automatically annotated by a method. IW: Illinois Wikifier. The data contains 3,436 mentions, out of which 162 are both ambiguous and refer to an emerging entity.
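The slide defines the sets but the transcript does not preserve the measures built from them. A minimal sketch, assuming the usual set-based precision and recall over the mentions labeled as emerging entities in one document, is shown below. The argument `auto_ee` stands for a subset Ad|EE of Ad, which is notation introduced here by analogy with Gd|EE and is an assumption.

```python
def ee_precision_recall(gold_ee, auto_ee):
    """Precision and recall of emerging-entity labels for one document.

    gold_ee: mentions a human annotated as EE (Gd|EE).
    auto_ee: mentions a method labeled as EE (assumed notation: Ad|EE).
    """
    hits = len(gold_ee & auto_ee)
    precision = hits / len(auto_ee) if auto_ee else 0.0
    recall = hits / len(gold_ee) if gold_ee else 0.0
    return precision, recall


# Hypothetical mention sets for a single document.
gold = {"Sandy", "Prism"}
auto = {"Sandy", "Paris"}
p, r = ee_precision_recall(gold, auto)
```

Per-document values like these would then be averaged over all documents in D to obtain collection-level figures, if the evaluation follows the common macro-averaging convention.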
13
References
[1] B. Hachey, W. Radford, J. Nothman, M. Honnibal, and J. R. Curran. Evaluating Entity Linking with Wikipedia. Artificial Intelligence, 194(C), 2013.
[2] J. Hoffart, M. A. Yosef, I. Bordino, H. Fürstenau, M. Pinkal, M. Spaniol, B. Taneva, S. Thater, and G. Weikum. Robust Disambiguation of Named Entities in Text. EMNLP, 2011.
[3] J. Hoffart, S. Seufert, D. B. Nguyen, M. Theobald, and G. Weikum. KORE: Keyphrase Overlap Relatedness for Entity Disambiguation. CIKM, 2012.
[4] J. Hoffart, Y. Altun, and G. Weikum. Discovering Emerging Entities with Ambiguous Names. In WWW, 2014.