Local and Global Algorithms for Disambiguation to Wikipedia
March 2011
Lev Ratinov (1), Dan Roth (1), Doug Downey (2), Mike Anderson (3)
(1) University of Illinois at Urbana-Champaign, (2) Northwestern University, (3) Rexonomy
Information overload
Organizing knowledge

Running example: "It's a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the 'N'. Chicago was used by default for Mac menus through MacOS 7.6, and OS 8 was released mid-1997. Chicago VIII was one of the early 70s-era Chicago albums to catch my ear, along with Chicago II."
Cross-document co-reference resolution
Reference resolution: disambiguation to Wikipedia
The "reference" collection has structure
(Figure: the entities of the running Chicago example, linked by the relations Used_In, Is_a, Succeeded, Released.)
Analysis of information networks
Here – Wikipedia as a knowledge resource … but we can use other resources
(Figure: relations Used_In, Is_a, Succeeded, Released.)
Talk outline
- High-level algorithmic approach: bi-partite graph matching with global and local inference.
- Local inference: experiments & results.
- Global inference: experiments & results.
- Results, conclusions.
- Demo.
Problem formulation: a matching/ranking problem
Match mentions in text documents (news, blogs, ...) to Wikipedia articles.
Local approach
Γ is a solution to the problem: a set of pairs (m, t), where m is a mention in the document and t is the matched Wikipedia title. Each pair is assigned a local score for matching the mention to the title.
Local + Global: using the Wikipedia structure
Add a "global" term that evaluates how good the structure of the solution is.
Can be reduced to an NP-hard problem
A tractable variation
1. Invent a surrogate solution Γ'; disambiguate each mention independently.
2. Evaluate the structure based on pair-wise coherence scores Ψ(t_i, t_j).
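The two steps above can be sketched in a few lines of Python. The function names and the interface are illustrative, not the paper's actual implementation:

```python
# Sketch of the tractable variation: fix a surrogate solution Gamma'
# (one title per mention, chosen independently by local score), then
# re-rank each mention's candidates by local score plus pairwise
# coherence Psi to the other mentions' surrogate titles.

def disambiguate(mentions, candidates, local_score, coherence):
    """mentions: list of mention strings.
    candidates: dict mapping mention -> list of candidate titles.
    local_score(m, t): local score of matching mention m to title t.
    coherence(t1, t2): pairwise coherence Psi(t1, t2)."""
    # Step 1: surrogate solution -- best local title per mention.
    surrogate = {m: max(candidates[m], key=lambda t: local_score(m, t))
                 for m in mentions}
    # Step 2: re-rank using coherence with the surrogate solution.
    solution = {}
    for m in mentions:
        def objective(t):
            global_term = sum(coherence(t, surrogate[m2])
                              for m2 in mentions if m2 != m)
            return local_score(m, t) + global_term
        solution[m] = max(candidates[m], key=objective)
    return solution
```

With toy scores, a "Chicago" mention whose local score slightly favors the font can be flipped to the city sense by its coherence with an unambiguous "Boston" mention.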
Talk outline (recap)
I. Baseline: P(Title | surface form)
Example: P(Title | "Chicago")
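Such a baseline is typically estimated from Wikipedia hyperlink anchor-text counts: how often each surface form links to each title. A minimal sketch, with invented counts:

```python
from collections import Counter

# Toy anchor-text statistics: how often the surface form "Chicago" is
# used as link text for each title. Counts are made up for illustration.
anchor_counts = {
    "Chicago": Counter({"Chicago_city": 9900, "Chicago_band": 80,
                        "Chicago_font": 20}),
}

def p_title_given_surface(surface, title):
    """P(title | surface form), estimated from anchor counts."""
    counts = anchor_counts.get(surface)
    if not counts:
        return 0.0  # surface form never seen as link text
    return counts[title] / sum(counts.values())
```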
II. Context(Title)
Example: Context(Charcoal) += "a font called __ is used to"
III. Text(Title)
Just the text of the page (one per title).
Putting it all together
City vs. font: (0.99 vs. 0.0001, 0.01 vs. 0.2, 0.03 vs. 0.01)
Band vs. font: (0.001 vs. 0.0001, 0.001 vs. 0.2, 0.02 vs. 0.01)
Training a ranking SVM: consider all title pairs; train a ranker on the pairs (learn to prefer the correct solution). Inference = knockout tournament.
Key: abstracts over the text – learns which scores are important.

Title | Score_Baseline | Score_Context | Score_Text
Chicago_city | 0.99 | 0.01 | 0.03
Chicago_font | 0.0001 | 0.2 | 0.01
Chicago_band | 0.001 | 0.001 | 0.02
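The knockout-tournament inference can be sketched as below. The `prefer` comparator in the usage is a stand-in that just compares summed scores; in the real system, the trained ranking SVM plays that role:

```python
# Knockout tournament: a pairwise ranker compares two candidates at a
# time; the winner advances until a single title remains.

def knockout(candidates, prefer):
    """candidates: non-empty list of titles.
    prefer(a, b): True if the ranker prefers a over b."""
    winner = candidates[0]
    for challenger in candidates[1:]:
        if prefer(challenger, winner):
            winner = challenger
    return winner
```

With the scores from the table above and a sum-of-scores comparator, Chicago_city wins the tournament.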
Example: font or city?
"It's a version of Chicago – the standard classic Macintosh menu font, with that distinctive thick diagonal in the 'N'."
Compare the mention's context against Text(Chicago_city), Context(Chicago_city) and Text(Chicago_font), Context(Chicago_font).
Lexical matching
Compare the mention's context to Text(title) and Context(title) using cosine similarity with TF-IDF weighting.
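A minimal sketch of TF-IDF weighted cosine similarity; tokenization and weighting details are simplified relative to the real system:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """docs: list of token lists. Returns one sparse TF-IDF vector
    (dict word -> weight) per document."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))  # document frequency
    vecs = []
    for d in docs:
        tf = Counter(d)
        vecs.append({w: tf[w] * math.log(n / df[w]) for w in tf})
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

For the running example, the mention context shares words like "Macintosh" and "font" with Text(Chicago_font) but not with Text(Chicago_city), so the font sense scores higher on this feature.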
Ranking – font vs. city
The two candidates' feature vectors: (0.5, 0.2, 0.1, 0.8) and (0.3, 0.2, 0.3, 0.5).
Train a ranking SVM
From the feature vectors (0.5, 0.2, 0.1, 0.8) and (0.3, 0.2, 0.3, 0.5), form the pairwise difference example [(0.2, 0, -0.2, 0.3), -1].
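Forming such a pairwise example can be sketched as follows (the `round` only keeps the toy floating-point differences tidy):

```python
# Build one ranking-SVM training example from a pair of candidates:
# the feature-vector difference, labeled by which candidate is correct.

def pairwise_example(feats_a, feats_b, a_is_correct):
    """feats_a, feats_b: equal-length feature tuples.
    Returns (feats_a - feats_b, +1 if a is the correct title else -1)."""
    diff = tuple(round(x - y, 10) for x, y in zip(feats_a, feats_b))
    label = +1 if a_is_correct else -1
    return diff, label
```

A linear SVM trained on such differences learns a weight vector w such that w . feats(correct) > w . feats(incorrect), i.e. which scores matter for ranking.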
Scaling issues – one of our key contributions
Scaling issues
The Text(title) and Context(title) data is big, and it is loaded into memory from disk.
Improving performance
Rather than computing TF-IDF weighted cosine similarity, we want to train a classifier on the fly. Because of the aggressive feature pruning, we choose PrTFIDF.
Performance (local only): ranking accuracy (solvable mentions)

Dataset | Baseline | +Local TF-IDF | +Local PrTFIDF
ACE | 94.05 | 95.67 | 96.21
MSN News | 81.91 | 84.04 | 85.10
AQUAINT | 93.19 | 94.38 | 95.57
Wikipedia Test | 85.88 | 92.76 | 93.59
Talk outline (recap)
Co-occurrence(Title_1, Title_2)
The city senses of Boston and Chicago appear together often. Rock music and albums appear together often.
Global ranking
How do we approximate the "global semantic context" of the document? (What is Γ'?)
- Use only non-ambiguous mentions for Γ'.
- Use the top baseline disambiguation for NER surface forms.
- Use the top baseline disambiguation for all surface forms.
How do we define relatedness between two titles? (What is Ψ?)
Ψ: pair-wise relatedness between two titles
- Normalized Google Distance (NGD)
- Pointwise Mutual Information (PMI)
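Both measures can be computed from the sets of Wikipedia pages that link to each title. The sketch below follows the standard definitions; the system's exact normalizations may differ:

```python
import math

def ngd(inlinks_a, inlinks_b, total_pages):
    """Normalized Google Distance over inlink sets (lower = more related)."""
    a, b = len(inlinks_a), len(inlinks_b)
    ab = len(inlinks_a & inlinks_b)
    if ab == 0:
        return float("inf")  # no common inlinks: maximally distant
    return ((math.log(max(a, b)) - math.log(ab)) /
            (math.log(total_pages) - math.log(min(a, b))))

def pmi(inlinks_a, inlinks_b, total_pages):
    """Pointwise mutual information over inlink sets (higher = more related)."""
    ab = len(inlinks_a & inlinks_b)
    if ab == 0:
        return 0.0
    p_ab = ab / total_pages
    p_a = len(inlinks_a) / total_pages
    p_b = len(inlinks_b) / total_pages
    return math.log(p_ab / (p_a * p_b))
```

Titles whose inlink sets overlap heavily (e.g. the city senses of Boston and Chicago) get a small NGD and a large PMI.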
What is the best Γ'? (ranker accuracy, solvable mentions)

Dataset | Baseline | +Global, Unambiguous | +Global, NER | +Global, All Mentions
ACE | 94.05 | 94.56 | 96.21 | 96.75
MSN News | 81.91 | 84.46 | 84.04 | 88.51
AQUAINT | 93.19 | 95.40 | 94.04 | 95.91
Wikipedia Test | 85.88 | 89.67 | 89.59 | 89.79
Results – ranker accuracy (solvable mentions)

Dataset | Baseline | Baseline + Lexical | Baseline + Global, All Mentions
ACE | 94.05 | 96.21 | 96.75
MSN News | 81.91 | 85.10 | 88.51
AQUAINT | 93.19 | 95.57 | 95.91
Wikipedia Test | 85.88 | 93.59 | 89.79
Results: Local + Global

Dataset | Baseline | Baseline + Lexical | Baseline + Lexical + Global
ACE | 94.05 | 96.21 | 97.83
MSN News | 81.91 | 85.10 | 87.02
AQUAINT | 93.19 | 95.57 | 94.38
Wikipedia Test | 85.88 | 93.59 | 94.18
Talk outline (recap)
Conclusions
- Dealt with a very large-scale knowledge acquisition and extraction problem.
- Built state-of-the-art algorithmic tools that exploit both the content and the structure of the network.
- Formulated a framework for local & global reference resolution and disambiguation into knowledge networks.
- Proposed local and global algorithms with state-of-the-art performance.
- Addressed scaling, a major issue.
- Identified key remaining challenges (next slide).
We want to know what we don't know
This is not dealt with well in the literature. Examples of mentions with no Wikipedia title: "As Peter Thompson, a 16-year-old hunter, said..", "Dorothy Byrne, a state coordinator for the Florida Green Party…"
We train a separate SVM classifier to identify such cases. The features are:
- All the baseline, lexical and semantic scores of the top candidate.
- The score assigned to the top candidate by the ranker.
- The "confidence" of the ranker in the top candidate with respect to the second-best disambiguation.
- The Good-Turing probability of out-of-Wikipedia occurrence for the mention.
Limited success; future research.
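The Good-Turing feature rests on the classical estimate that the probability mass of unseen outcomes is about N1/N: the number of outcomes observed exactly once over the total number of observations. A minimal sketch (the function name is ours):

```python
from collections import Counter

def good_turing_unseen(observations):
    """Good-Turing estimate of the probability that the next
    observation is something never seen before: N1 / N."""
    counts = Counter(observations)
    n1 = sum(1 for c in counts.values() if c == 1)  # seen exactly once
    n = len(observations)
    return n1 / n if n else 0.0
```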
Comparison to the previous state of the art (all mentions, including out-of-Wikipedia)

Dataset | Baseline | Milne & Witten | Our system (GLOW)
ACE | 69.52 | 72.76 | 77.25
MSN News | 72.83 | 68.49 | 74.88
AQUAINT | 82.64 | 83.61 | 83.94
Wikipedia Test | 81.77 | 80.32 | 90.54
Demo