Published by Celia Cator. Modified over 9 years ago.
1
AUTOMATIC GLOSS FINDING for a Knowledge Base using Ontological Constraints
Bhavana Dalvi (PhD Student, LTI)
Work done with:
Prof. William Cohen, CMU
Prof. Einat Minkov, University of Haifa
Prof. Partha Talukdar, IISc Bangalore
2
Motivation
3
Need for gloss finding
- KBs are useful for many NLP tasks, e.g. question answering
- A lot of research in fact extraction aims to populate KBs
- Glosses can help further in applications like
  - Word/entity sense disambiguation
  - Information retrieval
- Automatically constructed KBs, e.g. NELL and YAGO, lack glosses
4
Example: Gloss finding
Class constraints:
- Inclusion: Every entity that is of type “Fruit” is also of type “Food”.
- Mutual Exclusion: If an entity is of type “Food” then it cannot be of type “Organization”.
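The two constraint types can be illustrated with a small consistency check. This is a minimal sketch, not code from the talk: the toy ontology and the names `SUBSET`, `MUTEX` and `is_consistent` are assumptions made for illustration.

```python
# Hypothetical toy ontology; class names follow the slide's example.
SUBSET = {"Fruit": "Food"}                      # child -> parent (inclusion)
MUTEX = {frozenset({"Food", "Organization"})}   # pairs that cannot co-occur


def is_consistent(labels):
    """Return True if the given set of class labels violates no constraint."""
    # Inclusion: a labelled child class requires its parent class as well.
    for child, parent in SUBSET.items():
        if child in labels and parent not in labels:
            return False
    # Mutual exclusion: no mutex pair may be fully contained in the labels.
    return not any(pair <= labels for pair in MUTEX)
```

For example, `{"Fruit", "Food"}` is consistent, while `{"Fruit"}` alone (missing parent) and `{"Food", "Organization"}` (mutex pair) are not.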
5
Example: Gloss finding
7
Knowledge bases: NELL / Freebase / YAGO
Candidate glosses: DBpedia abstracts / Wiktionary definitions
8
Gloss Finding
9
Problem Definition
Input (with examples):
- KB classes: Food, Fruits, Company …
- Ontological constraints: Subset, Mutex
- Entities ‘E’ belonging to KB categories: Banana, Microsoft
- Lexical strings ‘L’ that refer to entities ‘E’: e.g. ‘MS’, ‘microsoft inc’
- Candidate glosses: e.g. G3: Apple, formerly Apple Computer Inc., is an American multinational corporation headquartered in Cupertino …
Output: a matching of candidate glosses to entities in the KB, e.g. (Apple, G3) → Company:Apple
10
Can we use existing techniques?
Problem: Match potential glosses to appropriate entities in the KB.
Entity linking:
- Assumes that glosses exist on the KB side
- Our input KB does not have glosses
- Chicken-and-egg problem
Ontology alignment:
- Both ends being matched are structured databases
- Ours is an asymmetric problem: a structured KB without glosses on one side, candidate glosses that contain text but no structure on the other
11
Proposed Gloss Finding Procedure
1. Decide the head-NP for a gloss, i.e. the NP being defined.
   G3: Apple, formerly Apple Computer Inc., is an American multinational corporation headquartered in Cupertino …
2. Select candidate glosses whose head-NP string-matches a KB entity. Each gloss then has a set of candidate KB entities.
   (Apple, G3) → (Fruit:Apple, Company:Apple)
3. Classify the head-NP into KB classes using ontological constraints.
   (Apple, G3) → Company
4. Choose the KB entity match based on the chosen KB category.
   (Apple, G3) → Company:Apple
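The candidate-selection and final-matching steps of this procedure can be sketched as below. This is a hedged illustration, not the authors' code: the toy KB, the normalisation rule and all function names are assumptions, head-NP extraction is taken as given, and the category prediction is passed in rather than computed.

```python
from collections import defaultdict

# Hypothetical toy KB: (category, entity name) pairs.
KB_ENTITIES = [("Fruit", "Apple"), ("Company", "Apple"), ("Company", "Microsoft")]


def normalize(s):
    """Assumed normalisation: trim, lowercase, underscores to spaces."""
    return s.strip().lower().replace("_", " ")


def build_index(kb_entities):
    """Map each normalised entity string to its (category, name) entries."""
    index = defaultdict(list)
    for category, name in kb_entities:
        index[normalize(name)].append((category, name))
    return index


def candidate_entities(head_np, index):
    """Step 2: KB entities whose string matches the gloss head-NP."""
    return index.get(normalize(head_np), [])


def choose_entity(candidates, predicted_category):
    """Step 4: pick the candidate whose category was chosen by the classifier."""
    for category, name in candidates:
        if category == predicted_category:
            return f"{category}:{name}"
    return None
```

With this toy KB, the head-NP "Apple" yields the candidates Fruit:Apple and Company:Apple, and a predicted category of Company selects Company:Apple.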
12
Building Classifiers
13
Training classifiers for KB Categories
Train: Unambiguous glosses
Test: Ambiguous glosses
14
Assumptions
- If a gloss has only one candidate entity matching in a KB, then the match is correct, i.e. we assume that the KB is always correct and complete in terms of senses.
  - The assumption holds for 81% of the NELL dataset.
- Given the category, a mention is unambiguous [Suchanek WWW’07, Nakashole ACL’13], i.e. we can differentiate between entities of different categories but not between entities within a category.
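Under the first assumption, the train/test split falls out of the candidate counts. A minimal sketch, assuming candidates are stored as a hypothetical dict from gloss id to a list of (category, entity) pairs:

```python
def split_glosses(gloss_candidates):
    """Split glosses into training (unambiguous) and test (ambiguous) sets.

    gloss_candidates: dict mapping gloss id -> list of (category, entity) pairs.
    Returns (train, test): train maps gloss id -> category label,
    test maps gloss id -> its list of candidates.
    """
    train, test = {}, {}
    for gloss_id, cands in gloss_candidates.items():
        if len(cands) == 1:
            # Exactly one candidate: assumed correct, its category is the label.
            train[gloss_id] = cands[0][0]
        elif len(cands) > 1:
            # Several candidates: the classifier has to decide.
            test[gloss_id] = cands
    return train, test
```

Glosses with no matching entity are dropped in this sketch, which matches the stated assumption that the KB is complete.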
15
Methods
Baselines:
- SVM learning
  - Train binary classifiers using unambiguous glosses
  - Predict categories for ambiguous glosses
- Label propagation
  - PIDGIN [Wijaya et al. CIKM’13]: graph-based label propagation method
GLOFIN: semi-supervised EM + use of ontological constraints
16
Proposed Method: GLOFIN
Initialize model with a few seeds per class
Iterate till convergence (data likelihood):
- E step: predict labels for unlabeled points
  - For each unlabeled datapoint:
    - Find P(Class | datapoint) for all classes
    - Assign a consistent bit vector of labels in accordance with the ontological constraints
- M step: recompute model parameters using seeds + predicted labels for unlabeled points
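The EM loop above can be sketched with a multinomial Naive Bayes model. This is a simplified illustration under stated assumptions, not the authors' implementation: it is a flat, hard-assignment variant (one label per datapoint, no ontological constraints), it stops when labels stop changing rather than monitoring data likelihood, and all function names are hypothetical.

```python
import math
from collections import Counter


def train_nb(docs, labels, classes, vocab, alpha=1.0):
    """M step: fit multinomial Naive Bayes with add-alpha smoothing."""
    word_counts = {c: Counter() for c in classes}
    class_counts = Counter()
    for tokens, c in zip(docs, labels):
        word_counts[c].update(tokens)
        class_counts[c] += 1
    model = {}
    for c in classes:
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        log_prior = math.log((class_counts[c] + alpha) / (len(docs) + alpha * len(classes)))
        log_lik = {w: math.log((word_counts[c][w] + alpha) / total) for w in vocab}
        model[c] = (log_prior, log_lik)
    return model


def posterior(model, tokens):
    """P(class | tokens) for all classes, via a numerically stable softmax."""
    scores = {c: lp + sum(ll[w] for w in tokens if w in ll)
              for c, (lp, ll) in model.items()}
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}


def hard_em(seed_docs, seed_labels, unlabeled_docs, classes, max_iter=20):
    """Semi-supervised hard EM: seeds stay fixed, unlabeled points are relabeled."""
    vocab = {w for d in seed_docs + unlabeled_docs for w in d}
    model = train_nb(seed_docs, seed_labels, classes, vocab)
    predicted = None
    for _ in range(max_iter):
        # E step: most probable class for each unlabeled datapoint.
        new_pred = []
        for d in unlabeled_docs:
            post = posterior(model, d)
            new_pred.append(max(post, key=post.get))
        if new_pred == predicted:  # assumed stopping rule: labels unchanged
            break
        predicted = new_pred
        # M step: refit on seeds plus currently predicted labels.
        model = train_nb(seed_docs + unlabeled_docs,
                         seed_labels + predicted, classes, vocab)
    return model, predicted
```

In the method described on the slide, the E step additionally turns the class posteriors into a bit vector that is consistent with the subset and mutex constraints; this sketch leaves that step out.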
18
Estimating class parameters and assignment probabilities
- Naïve Bayes: independent multinomial distributions per word
- K-Means: cosine similarity between centroid and datapoint
- von Mises-Fisher: data distributed on a unit hypersphere
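The K-Means and von Mises-Fisher options both score a datapoint by its direction relative to a class mean. A minimal sketch follows; the uniform class priors and the shared concentration `kappa` are simplifying assumptions made here so that the vMF normalising constants cancel, and vectors are assumed to be non-zero.

```python
import math


def unit(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return sum(a * b for a, b in zip(unit(u), unit(v)))


def centroid(vectors):
    """K-Means class parameter: component-wise mean of the member vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def vmf_posteriors(x, means, kappa=10.0):
    """Class posteriors under vMF with shared kappa and uniform priors.

    The vMF density is proportional to exp(kappa * mu . x) on the unit sphere,
    so with a shared kappa the posterior is a softmax over kappa * cosine.
    """
    scores = {c: kappa * cosine(mu, x) for c, mu in means.items()}
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}
```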
20
Mixed Integer Linear Program
Input: P(C_j | X_i), class constraints: Subset, Mutex
Output: consistent bit vector y_ji for X_i
Objective: max { likelihood of assignment - constraint violation penalty }
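For a handful of classes, the same trade-off can be illustrated by brute-force enumeration instead of a MILP solver. This is a stand-in sketch, not the formulation from the talk: the exact objective is not given on the slide, so the independent-Bernoulli log-likelihood, the fixed `penalty` per violated constraint and the function name are all assumptions.

```python
import math
from itertools import product


def best_consistent_labels(probs, subset, mutex, penalty=10.0):
    """Pick the bit vector maximising log-likelihood minus violation penalties.

    probs:  dict class -> P(class | datapoint)
    subset: list of (child, parent) pairs, child implies parent
    mutex:  list of (a, b) pairs that must not both be 1
    """
    classes = sorted(probs)
    eps = 1e-9  # guards against log(0)
    best, best_score = None, -math.inf
    for bits in product([0, 1], repeat=len(classes)):
        y = dict(zip(classes, bits))
        # Assumed likelihood: independent Bernoulli per class.
        loglik = sum(
            math.log(max(probs[c], eps)) if y[c] else math.log(max(1 - probs[c], eps))
            for c in classes
        )
        violations = sum(1 for child, parent in subset if y[child] and not y[parent])
        violations += sum(1 for a, b in mutex if y[a] and y[b])
        score = loglik - penalty * violations
        if score > best_score:
            best, best_score = y, score
    return best
```

Enumeration is exponential in the number of classes, so it only serves to show the objective on toy inputs; a real ontology needs an actual integer-programming solver or an approximation.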
22
Experiments
23
Candidate glosses
- DBpedia is a database derived from Wikipedia
- We use short abstracts (definitions of up to 500 characters, taken from the Wikipedia page)
E.g. McGill University is a research university located in Montreal Quebec Canada Founded in 1821 during the British colonial era the university bears the name of James McGill a prominent Montreal merchant from Glasgow Scotland and alumnus of Glasgow University whose bequest formed the beginning of the university.
24
Knowledge bases
25
GLOFIN vs. SVM & Label propagation
Freebase Dataset: Performance on ambiguous glosses
26
GLOFIN vs. SVM & Label propagation
NELL Dataset: Performance on ambiguous glosses
27
Are the datasets close to real world?
- A large fraction of the data is used for training:
  - 80% of NELL
  - 90% of Freebase
- In real-world scenarios, the amount of training data might be a small fraction of the dataset.
- We simulate this by using only 10% of the unambiguous glosses for training.
28
Small amount of training data
Freebase Dataset: Performance on ambiguous glosses
29
Small amount of training data
NELL Dataset: Performance on ambiguous glosses
30
Compare variants of GLOFIN
Freebase Dataset
31
Compare variants of GLOFIN
NELL Dataset
32
Some more experiments …
- Evaluating quality of automatically acquired seeds
- Manually creating gold standard for NELL dataset
- Different ways of scaling GLOFIN
- NELL to Freebase mappings via common glosses
http://www.cs.cmu.edu/~bbd
33
Conclusions and Future Work
34
Conclusions
Completely unsupervised method for gloss finding
- using unambiguous matches as training data
- hierarchical classification instead of entity linking
Our proposed method, GLOFIN:
- GLOFIN ≥ Label Propagation ≥ SVM
Variants of Hierarchical GLOFIN:
- Naïve Bayes ≥ K-Means, von Mises-Fisher
Ontological constraints help for all GLOFIN variants:
- Hierarchical GLOFIN ≥ Flat GLOFIN
In future, we would like to add new entities to the KB.
35
head-NP: McGill_University
Gloss: McGill University is a research university located in Montreal Quebec Canada Founded in 1821 during the British colonial era the university bears the name of James McGill a prominent …
Candidate NELL entities: University:E, Sports_team:E
Entity selected by GLOFIN: University:E

head-NP: Kingston_upon_Hull
Gloss: Kingston upon Hull frequently referred to as Hull is a city and unitary authority area in the ceremonial county of the East Riding of Yorkshire England It stands on the River Hull at its junction with …
Candidate NELL entities: City:E, Visual_Artist:E
Entity selected by GLOFIN: City:E

head-NP: Robert_Southey
Gloss: Robert Southey was an English poet of the Romantic school one of the so called Lake Poets and Poet Laureate for 30 years from 1813 to his death in 1843 Although his fame has been long eclipsed by that …
Candidate NELL entities: Person_Europe:E, Person_Africa:E, Politician_USA:E
Entity selected by GLOFIN: Person_Europe:E
36
Thank You
Questions?
37
Extra Slides
38
Comparison of GLOFIN Approximations
39
Eval: quality of seeds for NELL KB
- Noisy seeds: only 81% of leaf category assignments are correct
- Hierarchical labeling can help: 94% of higher-level category labels are correct
40
Creating gold standard for NELL
- Gold standard for evaluation on ambiguous glosses
- For most glosses, the precise category is part of NELL
41
NELL-Freebase mappings via common glosses
42
Pros and Cons of GLOFIN
Advantages:
- Generative EM framework that can build on SSL methods: Naïve Bayes, K-Means, vMF
- Can label unseen datapoints once the models are learnt
Limitations:
- Assumption: the input KB is complete and accurate
- All experiments are done in a transductive setting; the method needs to be extended to handle entities and categories missing from the KB
43
Future work …
Adding new entities to existing KB categories
- KBs are usually incomplete w.r.t. coverage of entities
- GLOFIN classifies mentions into categories
Introducing new clusters of entities: missing categories in the KB
- Extensions similar to Exploratory EM [Dalvi et al. ECML’13]
- New categories: entities belonging to them, along with glosses for those entities