Relational Inference for Wikification Xiao Cheng and Dan Roth University of Illinois at Urbana-Champaign
Wikification Blumenthal (D) is a candidate for the U.S. Senate seat now held by Christopher Dodd (D), and he has held a commanding lead in the race since he entered it. But the Times report has the potential to fundamentally reshape the contest in the Nutmeg State. Blumenthal (D) is a candidate for the U.S. Senate seat now held by Christopher Dodd (D), and he has held a commanding lead in the race since he entered it. But the Times report has the potential to fundamentally reshape the contest in the Nutmeg State. Let’s start with what is wikification Read wiki titles slower
Applications Knowledge Acquisition via Grounding Coreference Resolution Learning-based multi-sieve co-reference resolution with knowledge (Ratinov et al. 2012) Information Extraction Unsupervised relation discovery with sense disambiguation (Yao et al. 2012) Automatic Event Extraction with Structured Preference Modeling (Lu and Roth, 2012 ) Text Classification Gabrilovich and Markovitch, 2007; Chang et al., 2008 Entity Linking Generally allows us to inject knowledge into text models, with access to rich structured concept data Entity Linking is very similar and we will actually evaluate on an EL task, except for the restriction on linking entities only and to cluster NIL entities
Challenges Ambiguity Concepts outside of Wikipedia (NIL) Variability Blumenthal (D) is a candidate for the U.S. Senate seat now held by Christopher Dodd (D), and he has held a commanding lead in the race since he entered it. But the Times report has the potential to fundamentally reshape the contest in the Nutmeg State. Ambiguity Concepts outside of Wikipedia (NIL) Blumenthal ? Variability Scale Millions of labels Times The New York Times The Times Connecticut CT The Nutmeg State This is done with mostly with stats before What if there is no correct answer for Blumenthal in Wikipedia? How do we link those mentions to NILs even if we know Blumenthal could refer to something we know
Challenges Blumenthal (D) is a candidate for the U.S. Senate seat now held by Christopher Dodd (D), and he has held a commanding lead in the race since he entered it. But the Times report has the potential to fundamentally reshape the contest in the Nutmeg State. State-of-the-art systems (Ratinov et al. 2011) can achieve the above with local and global statistical features Reaches bottleneck around 70%~ 85% F1 on non-wiki datasets What is missing? Quite impressive, bottleneck: adding more sophisticated contextual features does not help much We do not want to level off here
Motivating Example What are we missing with Bag of Words (BOW) models? Mubarak, the wife of deposed Egyptian President Hosni Mubarak, … , the of deposed , … Mubarak wife Egyptian President Hosni Mubarak What are we missing with Bag of Words (BOW) models? Who is Mubarak? Constraining interaction between concepts (Mubarak, wife, Hosni Mubarak) Main contribution here We will show that by identifying
Relational Inference for Wikification Mubarak, the wife of deposed Egyptian President Hosni Mubarak, … (Mubarak, wife, Hosni Mubarak) Our contribution Identify key textual relations for Wikification A global inference framework to incorporate relational knowledge Significant improvement over state-of-the-art systems Main contribution here We will show that by identifying
Talk Outline Why Wikification? Introduction Approach Evaluation Motivation Approach Wikification Pipeline Formulation Relational Analysis Evaluation Result Entity Linking We will first go over the motivating example for our work How we approach Wikification by giving its mathematical formulation, as well as how to instantiate our model with relational analysis over the text Finally we will present our result and show its robustness in an Entity Linking task
Wikification Mention Segmentation Candidate Generation Candidate Ranking Determine NILs Mention Segmentation Candidate Generation Candidate Ranking NIL Linking To incorporate the improvements, we first have to take a look at the system layout We will give a quick overview of the terminology of wikification pipeline
Wikification Pipeline 1 - Mention Segmentation ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… sub-NP (Noun Phrase) chunks NER Regular expressions Generate robust nested mentions
Wikification Pipeline 1 - Mention Segmentation ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party…
Wikification Pipeline 2 - Candidate Generation ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… k ek3 1 Slobodan_Milošević 2 Milošević_(surname) 3 Boki_Milošević 4 Alexander_Milošević … k ek4 1 Socialist_Party_(France) 2 Socialist_Party_(Portugal) 3 Socialist_Party_of_America 4 Socialist_Party_(Argentina) …
Wikification Pipeline 3 - Candidate Ranking ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… k ek3 sk3 1 Slobodan_Milošević 0.7 2 Milošević_(surname) 0.1 3 Boki_Milošević 4 Alexander_Milošević 0.05 … k ek4 sk4 1 Socialist_Party_(France) 0.23 2 Socialist_Party_(Portugal) 0.16 3 Socialist_Party_of_America 0.07 4 Socialist_Party_(Argentina) 0.06 … Assign score to candidate entities to express the model’s preference based on contextual coherence Local and global statistical features
Wikification Pipeline 4 – Determine NILs ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… k ek3 sk3 1 Slobodan_Milošević 0.7 k ek4 sk4 1 Socialist_Party_(France) 0.23 We classify the whether the top candidate belongs to Wikipedia Usually a binary classifier optimizes for evaluation metric Is the top candidate really what the text referred to?
Talk Outline Why Wikification? Introduction Approach Evaluation Motivation Approach Wikification Pipeline Formulation Relational Analysis Evaluation Result Entity Linking Next we will introduce
Formulation (0) Intuition Mubarak, the wife of deposed Egyptian President Hosni Mubarak, … (Mubarak, wife, Hosni Mubarak) Intuition Promote pairs of concepts coherent with textual relations
Formulation (1) Formulate as an Integer Linear Program (ILP): If no relation exists, collapse to the non-structured decision weight to output 𝑒 𝑖 𝑘 Whether to output 𝑘th candidate of the 𝑖th mention weight of a relation 𝑟 𝑖𝑗 (𝑘,𝑙) Whether a relation exists between 𝑒 𝑖 𝑘 and 𝑒 𝑗 𝑙 Formalize intuition, weights induced from data, small constraints Objective function collapse to the original output when there is no relation, so our model is negatives-proof True-negatives and false-negatives will not negatively impact performance of baseline
Formulation (2) ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… k ek3 sk3 1 Slobodan_Milošević 0.7 2 Milošević_(surname) 0.1 3 Boki_Milošević 4 Alexander_Milošević 0.05 … k ek4 sk4 1 Socialist_Party_(France) 0.23 2 Socialist_Party_(Portugal) 0.16 3 Socialist_Party_of_America 0.07 4 Socialist_Party_(Argentina) 0.06 … r(1,2)34 r(4,3)34 Here we give a concrete example of how the variables in the objective function is instantiated eki: whether a concept is chosen ski : score of a concept r(k,l)ij: whether a relation is present w(k,l)ij : score of a relation
Relation Identification Overall Approach Wikification Candidate Generation Candidate Ranking Determine NILs Relation Analysis Relation Identification Relation Retrieval Relational Inference Next we will describe our approach in detail by walk through an example Our contribution – relation analysis helps all stages of wikification
Relation Identification ACE style in-document coreference Extract named entity-only coreference relations with high precision Syntactico-Semantic relations (Chan & Roth ‘10) Easy to extract with high precision Aim for high recall, as false-positives will be verified and discarded Sparse, but covers ~80% relation instances in ACE2004 Type Example Premodifier Iranian Ministry of Defense Possessive NYC’s stock exchange Formulaic Chicago, Illinois Preposition President of the US Here we identify the textual relations
Relation Identification ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… Argument 1 Relation Type Argument 2 Yugoslav President apposition Slobodan Milošević coreference Milošević possessive Socialist Party There are quite some relations in this short sentence Overall it is sparse but important in text understanding, as error propagates in structured inference We do not actually care about relation types too much, as the relation will be constrained by our knowledge
Relation Identification Overall Approach Wikification Candidate Generation Candidate Ranking Determine NILs Relation Analysis Relation Identification Relation Retrieval Relational Inference Next we will describe our approach in detail by walk through an example Our contribution – relation analysis helps all stages of wikification
Relation Retrieval ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… Current approach Collect known mappings from Wikipedia page titles, hyperlinks… Limit to top-K candidates based on frequency of links (Ratinov et al. 2011) What concepts can “Socialist Party” refer to? Top-k gives a very strong baseline, single most important feature in wikification Include both explicit relations and Wikipedia hyperlinks as weak implicit relations in the form of (concept1,relationType,concept2)
Uninformative Mentions Using all possible candidates will comprise our system’s tractability
Relation Retrieval ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… What concepts can “Socialist Party” refer to? More robust candidate generation Identified relations are verified against a knowledge base (DBPedia) Retrieve relation arguments matching “(Milošević ,?,Socialist Party)” as our new candidates Instead of the mention to concept mapping space, we go to the concept to concept relational space
Relation Retrieval ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… k ek3 sk3 1 Slobodan_Milošević 0.7 … k ek4 sk4 1 Socialist_Party_(France) 0.23 … Query Pruning Only 2 queries per pair necessary due to strong baseline. Relation space too large, needs pruning Query result noisy Constrain one of the arguments to be known candidates Chances of both top answers are wrong is actually very small q1=(Socialist Party of France,?, *Milošević*) q2=(Slobodan Milošević,?,*Socialist Party*)
Relation Retrieval Argument 1 Relation Type Argument 2 Milošević possessive Socialist Party
Relation Retrieval ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… k ek3 sk3 1 Slobodan_Milošević 0.7 2 Milošević_(surname) 0.1 3 Boki_Milošević 4 Alexander_Milošević 0.05 … k ek4 sk4 1 Socialist_Party_(France) 0.23 2 Socialist_Party_(Portugal) 0.16 3 Socialist_Party_of_America 0.07 4 Socialist_Party_(Argentina) 0.06 … 21 Socialist_Party_of_Serbia 0.0 Top-k gives a very strong baseline, single most important feature in Wikification 𝑟 34 (1,21) =1
Relation Identification Overall Approach Wikification Candidate Generation Candidate Ranking Determine NILs Relation Analysis Relation Identification Relation Retrieval Relational Inference Next we will describe our approach in detail by walk through an example Our contribution – relation analysis helps all stages of wikification
𝑤 34 (1,21) =? Relational Inference ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… k ek3 sk3 1 Slobodan_Milošević 0.7 2 Milošević_(surname) 0.1 3 Boki_Milošević 4 Alexander_Milošević 0.05 … k ek4 sk4 1 Socialist_Party_(France) 0.23 2 Socialist_Party_(Portugal) 0.16 3 Socialist_Party_of_America 0.07 4 Socialist_Party_(Argentina) 0.06 … 21 Socialist_Party_of_Serbia 0.0 Top-k gives a very strong baseline, single most important feature in wikification 𝑟 34 (1,21) =1 𝑤 34 (1,21) =?
Retrieved relation tuple Relation scoring Query scoring as a tie-breaker between multiple relations Explicit relations are stronger than a hyperlink relations Normalize score for each pair of mention to [0,1] Relation query Retrieved relation tuple 𝑤= 1 𝑍 𝑓 𝑞,𝜎
𝑤 34 (1,21) =1 Relational Inference ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… k ek3 sk3 1 Slobodan_Milošević 0.7 2 Milošević_(surname) 0.1 3 Boki_Milošević 4 Alexander_Milošević 0.05 … k ek4 sk4 1 Socialist_Party_(France) 0.23 2 Socialist_Party_(Portugal) 0.16 3 Socialist_Party_of_America 0.07 4 Socialist_Party_(Argentina) 0.06 … 21 Socialist_Party_of_Serbia 0.0 Top-k gives a very strong baseline, single most important feature in wikification 𝑟 34 (1,21) =1 𝑤 34 (1,21) =1
Relational Inference - coreference ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… k ek3 sk3 1 Slobodan_Milošević 0.7 2 Milošević_(surname) 0.1 3 Boki_Milošević 4 Alexander_Milošević 0.05 … 𝑟 23 (1,1) =1 k ek2 sk2 1 Slobodan_Milošević 1.0 The semantics of coreference relation demands we assign the same output for all the mentions in the coreferent cluster
Relation Identification Overall Approach Wikification Candidate Generation Candidate Ranking Determine NILs Relation Analysis Relation Identification Relation Retrieval Relational Inference Next we will describe how we handle the NIL candidates
Determine unknown concepts (NILs) Dorothy Byrne, a state coordinator for the Florida Green Party,… nominal mention k ek1 sk1 1 Dorothy_Byrne_(British_Journalist) 0.6 2 Dorothy_Byrne_(mezzo-soprano) 0.4 k ek2 sk2 1 Green_Party_of_Florida 1.0 Top-k gives a very strong baseline, single most important feature in wikification NIL entities tends to be introduced with nominal mentions How to capture the fact: “Dorothy Byrne” does not refer to any concept in Wikipedia Identify coreferent nominal mention relations Generate better features for NIL classifier
Determine unknown concepts (NILs) Dorothy Byrne, a state coordinator for the Florida Green Party,… nominal mention k ek1 sk1 NIL 1.0 1 Dorothy_Byrne_(British_Journalist) 0.6 2 Dorothy_Byrne_(mezzo-soprano) 0.4 k ek2 sk2 1 Green_Party_of_Florida 1.0 To incorporate NIL classification into our framework for inference, we make up a NIL candidate Create NIL candidate for propagation
Talk Outline Why Wikification? Introduction Approach Evaluation Motivation Approach Wikification Pipeline Formulation Relational Analysis Evaluation Result Entity Linking Next we will introduce
Wikification Performance Result And we show that by incorporating these relational inference into the disambiguation framework, our system achieves significant improvements over the previous methods
Evaluation – TAC KBP Entity Linking Task Definition Links a named entity in a document to either a TAC Knowledge Base (TAC KB) node or NIL Cluster NIL entities Relevant Tasks Wikification Cross-document coreference Explain NIL clustering
Evaluation – TAC KBP Entity Linking Run Relational Inference (RI) Wikifier “as-is”: No retraining using TAC data Without retraining on TAC data, we improve our baseline system to a degree that is statistically indistinguishable from the top systems *Median of top 14 systems
Thank you! Conclusion Importance of linguistic and world knowledge Identification of relational information benefits Wikification Introduced an inference framework to leverage better language understandings Future work Accumulate what we know about NIL concepts Joint entity typing, coreference and disambiguation Incorporate more relations Show milosevic Thank you! *Demo will be updated in a week at: http://cogcomp.cs.illinois.edu/demo/wikify Download at: http://cogcomp.cs.illinois.edu/page/download_view/Wikifier
Back up slides BACK UP slides
Massive Textual Information How can we know more from large volumes of possibly unfamiliar raw texts?
Massive Textual Information We naturally “look up” concepts and accumulate knowledge.
Challenges Tuesday marks the 142nd anniversary of an event that forever altered the course of Chicago's development as a then-young American city ? It’s a version of Chicago – the standard classic Macintosh menu font… By the time the New Orleans Saints kicked off at Soldier Field on Sunday afternoon, their woeful history in the Windy City was fully understood.
? Challenges Ambiguity Variety Concepts outside of Wikipedia (NIL) Tuesday marks the 142nd anniversary of an event that forever altered the course of Chicago's development as a then-young American city Ambiguity Variety Concepts outside of Wikipedia (NIL) Chicago Chicago font ? Chicago The Windy City It’s a version of Chicago – the standard classic Macintosh menu font… Similar to WSD, but we have to deal with NILs By the time the New Orleans Saints kicked off at Soldier Field on Sunday afternoon, their woeful history in the Windy City was fully understood.
Organizing knowledge Wikipedia as a source of “common sense” knowledge Naturally bridges rich structured knowledge and text data Comprehensive for most purposes ~4.3 million English articles as of today Cross-lingual Estimated size of a printed version of Wikipedia as of August 2010. *Picture courtesy of Wikipedia
Motivating Example Relatively sparse, but act as hard constraints Intuitively, we need to “de-coreference” this pair of mention Opens a new dimension in text understanding helps all stages of Wikification Mubarak Hosni Mubarak Egyptian President wife Mubarak, the wife of deposed Egyptian President Hosni Mubarak, …
Wikification Approach ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… Mention Segmentation Use Shallow Parsing, NER and regular expression to generate likely mentions of concepts. Match nested mentions using dictionary. Discard unknown mentions. A traditional steps
Ranking Features What are we losing? Local features from Bag of Words (BOW) representation, such as various TFIDF windows. Global features from Bag of Concepts (BOC) representation. What are we losing? Mubarak Hosni Mubarak Egyptian President wife What are we losing?
Relation Extraction In document coreference Extract entity-only coreference relations with high precision Syntactico-Semantic relations (Chan & Roth ‘10) Easy to extract with high precision Deep parsing too slow Premodifier Iranian Ministry of Defense Possessive NYC’s stock exchange Formulaic Chicago, Illinois Preposition President of the US What are the useful relations
Relational Inference Coreference ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party Possessive Calls for the need for looking up knowledge
Relational Inference ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party How to look up knowledge and use these relations?
Wikification Approach Recall concepts Rank concepts Determine unknown concepts (NILs) Move on first
Determine NILs Dorothy Bryne, a state coordinator for the Florida Green Party Leverage nominal mention relations to construct limited context
Relational Inference Example ...like Chicago O'Hare and [La Guardia] in [New York],... Wikipedia Page Ski LaGuardia_Airport 0.36 La_Guardia 0.22 Fiorello_La_Guardia 0.18 La_Guardia,_Spain 0.11 Wikipedia Page Ski New_York 0.53 New_York_City 0.44 New_York_(magazine) 0.01 New_York_Knicks
Relational Inference Scenario 1 ...like Chicago O'Hare and [La Guardia] in [New York],... Wikipedia Page Score LaGuardia_Airport 0.36 La_Guardia 0.22 Fiorello_La_Guardia 0.18 La_Guardia,_Spain 0.11 Wikipedia Page Score New_York 0.53 New_York_City 0.44 New_York_(magazine) 0.01 New_York_Knicks Γ = 0.18+0.53 = 0.71
Relational Inference Scenario 2 ...like Chicago O'Hare and [La Guardia] in [New York],... Wikipedia Page Score LaGuardia_Airport 0.36 La_Guardia 0.22 Fiorello_La_Guardia 0.18 La_Guardia,_Spain 0.11 Wikipedia Page Score New_York 0.53 New_York_City 0.44 New_York_(magazine) 0.01 New_York_Knicks Γ = 0.36+0.44 = 0.8
Relational Inference Conclusion ...like Chicago O'Hare and [La Guardia] in [New York],... Pr(Γ= ) > Pr(Γ= )
Constraint scaling 𝑞: query 𝜎: a retrieved relation triple Relation scaling factor Query lexical similarity Normalization factor Lexical similarity as a tie-breaker between multiple relations Explicit relations are stronger than a hyperlink relations Normalize score for each pair of mention to [0,1]
Wikification Pipeline 1 - Mention Segmentation ...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party… Sub noun phrase chunks and NER segments (Ratinov et al. 2011) Regular expression capturing longer composite mentions President of the United States of America Generate robust nested mentions