
1 AUTOMATIC GLOSS FINDING for a Knowledge Base using Ontological Constraints. Bhavana Dalvi (PhD Student, LTI). Work done with: Prof. William Cohen, CMU; Prof. Einat Minkov, University of Haifa; Prof. Partha Talukdar, IISc Bangalore.

2 Motivation

3 Need for gloss finding
- KBs are useful for many NLP tasks, e.g. question answering.
- Much research has gone into fact extraction to populate KBs.
- Glosses can help further in applications such as word/entity sense disambiguation and information retrieval.
- Automatically constructed KBs, e.g. NELL and YAGO, lack glosses.

4 Example: Gloss finding. Class constraints:
- Inclusion: every entity of type "Fruit" is also of type "Food".
- Mutual exclusion: if an entity is of type "Food", it cannot be of type "Organization".
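As an illustration, here is a minimal sketch of how these two constraint types can be checked against a set of predicted labels; representing constraints as pairs of class names is an illustrative choice, not the paper's internal format.

```python
# Minimal sketch of the two constraint types; the pair-of-class-names encoding is
# illustrative, not the paper's data format.
SUBSET = {("Fruit", "Food")}           # (child, parent): having the child label implies the parent
MUTEX = {("Food", "Organization")}     # the two labels cannot co-occur

def is_consistent(labels):
    """Check a set of class labels against the inclusion and mutual-exclusion constraints."""
    for child, parent in SUBSET:
        if child in labels and parent not in labels:
            return False               # inclusion violated
    for a, b in MUTEX:
        if a in labels and b in labels:
            return False               # mutual exclusion violated
    return True

print(is_consistent({"Fruit", "Food"}))          # True
print(is_consistent({"Fruit"}))                  # False: Fruit implies Food
print(is_consistent({"Food", "Organization"}))   # False: mutually exclusive
```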

5 Example: Gloss finding

6

7 Knowledge Bases: NELL / Freebase / YAGO. Candidate glosses: DBpedia abstracts / Wiktionary definitions.

8 Gloss Finding

9 Problem Definition
Input (with examples):
- KB classes, e.g. Food, Fruits, Company ...
- Ontological constraints: Subset, Mutex
- Entities 'E' belonging to KB categories, e.g. Banana, Microsoft
- Lexical strings 'L' that refer to entities 'E', e.g. 'MS', 'microsoft inc'
- Candidate glosses, e.g. G3: "Apple, formerly Apple Computer Inc., is an American multinational corporation headquartered in Cupertino ..."
Output: matching candidate glosses to entities in the KB, e.g. (Apple, G3) → Company:Apple
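For concreteness, a sketch of this input and output as plain data structures; the variable names, the toy values, and the Company/Organization subset edge are illustrative assumptions, not the actual NELL/Freebase format.

```python
# Illustrative encoding of the input and output (toy values; names are mine).
kb_classes = ["Food", "Fruit", "Company", "Organization"]
subset_constraints = [("Fruit", "Food"),            # child implies parent
                      ("Company", "Organization")]  # assumed edge for the toy ontology
mutex_constraints = [("Food", "Organization")]

# Entities per KB category, and lexical strings that refer to those entities.
entities = {"Fruit": ["Apple", "Banana"], "Company": ["Apple", "Microsoft"]}
lexical_strings = {"Microsoft": ["MS", "microsoft inc"]}

# Candidate glosses, keyed by an id.
candidate_glosses = {
    "G3": "Apple, formerly Apple Computer Inc., is an American multinational "
          "corporation headquartered in Cupertino ...",
}

# Desired output: each matched gloss paired with one category-qualified KB entity.
output = {"G3": ("Company", "Apple")}
```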

10 Can we use existing techniques?
- Problem: match potential glosses to appropriate entities in the KB.
- Entity linking assumes glosses already exist on the KB side, but the input KB has no glosses: a chicken-and-egg problem.
- Ontology alignment assumes both sides being matched are structured databases; here the problem is asymmetric: a structured KB without glosses on one side, and candidate glosses that contain text but no structure on the other.

11 Proposed Gloss Finding Procedure
- Detect the head NP of each gloss, i.e. the NP being defined. G3: "Apple, formerly Apple Computer Inc., is an American multinational corporation headquartered in Cupertino ..." → head NP "Apple".
- Keep candidate glosses whose head NP string-matches a KB entity; each gloss then maps to a set of candidate KB entities: (Apple, G3) → {Fruit:Apple, Company:Apple}.
- Classify the head NP (gloss) into KB classes using the ontological constraints: (Apple, G3) → Company.
- Choose the KB entity match based on the chosen KB category: (Apple, G3) → Company:Apple.
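A minimal sketch of this procedure, assuming a trivial head-NP detector and a placeholder classifier; the real system uses proper NP detection and the GLOFIN classifier described on the following slides.

```python
# Minimal sketch of the matching pipeline. `head_np` and `classify` are stand-ins.

def head_np(gloss: str) -> str:
    # Stand-in: take the text before the first comma / "is" as the NP being defined.
    return gloss.split(",")[0].split(" is ")[0].strip()

def candidate_entities(np_string, entities_by_class):
    """All (class, entity) pairs whose entity string-matches the head NP."""
    return [(cls, ent) for cls, ents in entities_by_class.items()
            for ent in ents if ent.lower() == np_string.lower()]

def match_gloss(gloss, entities_by_class, classify):
    """classify(gloss) -> predicted KB class; returns the chosen (class, entity) or None."""
    cands = candidate_entities(head_np(gloss), entities_by_class)
    if not cands:
        return None
    predicted_class = classify(gloss)
    chosen = [c for c in cands if c[0] == predicted_class]
    return chosen[0] if chosen else None

entities_by_class = {"Fruit": ["Apple"], "Company": ["Apple", "Microsoft"]}
g3 = "Apple, formerly Apple Computer Inc., is an American multinational corporation ..."
print(match_gloss(g3, entities_by_class, classify=lambda g: "Company"))  # ('Company', 'Apple')
```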

12 Building Classifiers

13 Training classifiers for KB categories. Train: unambiguous glosses. Test: ambiguous glosses.

14 Assumptions
- If a gloss has only one candidate entity matching in the KB, that match is correct, i.e. we assume the KB is correct and complete in terms of senses. This assumption holds for 81% of the NELL dataset.
- Given the category, a mention is unambiguous [Suchanek WWW'07, Nakashole ACL'13], i.e. we can differentiate between entities of different categories but not within a category.
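A sketch of how the first assumption yields the train/test split, assuming each gloss has already been mapped to its candidate KB entities; the function and variable names are illustrative.

```python
def split_glosses(gloss_candidates):
    """gloss_candidates: dict gloss_id -> list of candidate (class, entity) pairs.
    Unambiguous glosses (exactly one candidate) become labeled training data;
    ambiguous glosses (several candidates) form the test set to be classified."""
    train, test = {}, {}
    for gloss_id, cands in gloss_candidates.items():
        if len(cands) == 1:
            train[gloss_id] = cands[0]     # treated as a correct (class, entity) label
        elif len(cands) > 1:
            test[gloss_id] = cands         # category to be predicted by the classifier
    return train, test

train, test = split_glosses({
    "G1": [("University", "McGill_University")],          # unambiguous -> train
    "G3": [("Fruit", "Apple"), ("Company", "Apple")],      # ambiguous   -> test
})
```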

15 Methods
- Baselines:
  - SVM learning: train binary classifiers using unambiguous glosses, then predict categories for ambiguous glosses.
  - Label propagation: PIDGIN [Wijaya et al. CIKM'13], a graph-based label propagation method.
- GLOFIN: semi-supervised EM plus ontological constraints.
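A sketch of the SVM baseline under assumed choices (TF-IDF bag-of-words features and scikit-learn's LinearSVC in a one-vs-rest setup); the slides do not specify the exact features or SVM implementation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

# Labeled (unambiguous) glosses and their KB categories; toy examples.
train_glosses = [
    "McGill University is a research university located in Montreal Quebec Canada",
    "Banana is an edible fruit botanically a berry produced by large herbaceous plants",
    "Microsoft Corporation is an American multinational technology company",
]
train_labels = ["University", "Food", "Company"]

ambiguous_glosses = [
    "Apple formerly Apple Computer Inc is an American multinational corporation headquartered in Cupertino",
]

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_glosses)
clf = OneVsRestClassifier(LinearSVC())   # one binary SVM per KB category
clf.fit(X_train, train_labels)

print(clf.predict(vectorizer.transform(ambiguous_glosses)))  # predicted category per ambiguous gloss
```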

16 Proposed Method: GLOFIN
Initialize the model with a few seeds per class, then iterate until convergence (of the data likelihood):
- E step: predict labels for unlabeled points. For each unlabeled datapoint, find P(class | datapoint) for all classes, then assign a consistent bit vector of labels in accordance with the ontological constraints.
- M step: recompute model parameters using the seeds plus the predicted labels for the unlabeled points.
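A high-level sketch of this EM loop; the model-specific pieces are passed in as functions, corresponding to the class-parameter estimators of slide 18 and the constrained assignment of slide 20. The function names and signatures are illustrative, not taken from the paper.

```python
# High-level sketch of the GLOFIN EM loop. Injected pieces:
#   fit(labeled)          -> class parameters (M step; Naive Bayes, K-Means, or vMF)
#   posteriors(x, params) -> P(class | x) for all classes
#   assign(posteriors)    -> constraint-consistent label bit vector (the MILP of slide 20)
def glofin_em(seeds, unlabeled, fit, posteriors, assign, n_iters=50):
    """seeds: list of (datapoint, labels) pairs; unlabeled: list of datapoints."""
    params = fit(seeds)                      # initialize the model from a few seeds per class
    predicted = []
    for _ in range(n_iters):                 # in practice: iterate until the data likelihood converges
        # E step: predict a consistent label bit vector for every unlabeled point.
        predicted = [(x, assign(posteriors(x, params))) for x in unlabeled]
        # M step: recompute model parameters from the seeds plus the predicted labels.
        params = fit(seeds + predicted)
    return params, predicted
```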

17 Proposed Method: GLOFIN (same algorithm as slide 16; the next slide details how class parameters and assignment probabilities are estimated)

18 Estimating class parameters and assignment probabilities
- Naïve Bayes: independent multinomial distributions per word.
- K-Means: cosine similarity between centroid and datapoint.
- von Mises-Fisher: data distributed on a unit hypersphere.
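A minimal sketch of the Naïve Bayes variant: per-class multinomial word distributions for the M step and unnormalized class posteriors for the E step. The smoothing constant and tokenization are illustrative choices, not the paper's exact setup.

```python
import math
from collections import Counter

def fit_naive_bayes(labeled_docs, classes, alpha=1.0):
    """labeled_docs: list of (tokens, class). Returns per-class log priors and word log-probabilities."""
    vocab = {w for tokens, _ in labeled_docs for w in tokens}
    params = {}
    for c in classes:
        docs_c = [tokens for tokens, label in labeled_docs if label == c]
        counts = Counter(w for tokens in docs_c for w in tokens)
        total = sum(counts.values()) + alpha * len(vocab)
        params[c] = {
            "log_prior": math.log((len(docs_c) + 1) / (len(labeled_docs) + len(classes))),
            "log_word": {w: math.log((counts[w] + alpha) / total) for w in vocab},
            "log_unseen": math.log(alpha / total),
        }
    return params

def class_posteriors(tokens, params):
    """Unnormalized log P(class | tokens); apply a softmax if normalized probabilities are needed."""
    return {c: p["log_prior"] + sum(p["log_word"].get(w, p["log_unseen"]) for w in tokens)
            for c, p in params.items()}

docs = [("banana is an edible fruit".split(), "Food"),
        ("microsoft is a software company".split(), "Company")]
params = fit_naive_bayes(docs, classes=["Food", "Company"])
post = class_posteriors("apple fruit tree".split(), params)
print(max(post, key=post.get))   # -> "Food"
```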

19 Proposed Method: GLOFIN (same algorithm as slide 16; the next slide details the mixed integer linear program used to assign consistent label bit vectors in the E step)

20 Mixed Integer Linear Program
Input: P(C_j | X_i) for all classes C_j, plus the class constraints (Subset, Mutex).
Output: a consistent bit vector y_ji for each datapoint X_i.
Objective: maximize { likelihood of the assignment − constraint violation penalty }.
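A sketch of this assignment step for a single datapoint using PuLP. The formulation below uses hard subset/mutex constraints instead of the slide's penalized constraint violations, which is an assumed simplification; the posteriors and constraint sets are toy values.

```python
import pulp

posteriors = {"Food": 0.2, "Fruit": 0.1, "Company": 0.7, "Organization": 0.6}
subset = [("Fruit", "Food"), ("Company", "Organization")]   # child implies parent
mutex = [("Food", "Organization")]

prob = pulp.LpProblem("consistent_assignment", pulp.LpMaximize)
y = {c: pulp.LpVariable(f"y_{c}", cat="Binary") for c in posteriors}

# Objective: total posterior mass of the selected labels.
prob += pulp.lpSum(posteriors[c] * y[c] for c in posteriors)
# Subset constraints: selecting the child forces the parent.
for child, parent in subset:
    prob += y[child] <= y[parent]
# Mutual-exclusion constraints: at most one of the two labels.
for a, b in mutex:
    prob += y[a] + y[b] <= 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print({c: int(y[c].value()) for c in posteriors})   # Company and Organization selected
```

In GLOFIN this kind of program is solved once per unlabeled datapoint inside the E step, so each gloss receives a bit vector that respects the ontology.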

21 Proposed Method: GLOFIN (recap of the full EM algorithm from slide 16)

22 Experiments

23 Candidate glosses
- DBpedia is a database derived from Wikipedia.
- We use short abstracts (definitions up to 500 characters, from the Wikipedia page).
- E.g. "McGill University is a research university located in Montreal Quebec Canada Founded in 1821 during the British colonial era the university bears the name of James McGill a prominent Montreal merchant from Glasgow Scotland and alumnus of Glasgow University whose bequest formed the beginning of the university."

24 Knowledge bases

25 GLOFIN vs. SVM & Label propagation. Freebase dataset: performance on ambiguous glosses.

26 GLOFIN vs. SVM & Label propagation. NELL dataset: performance on ambiguous glosses.

27 Are the datasets close to the real world?
- A large fraction of the data is used for training: 80% of NELL, 90% of Freebase.
- In real-world scenarios, the amount of training data might be a small fraction of the dataset.
- We simulate this by using only 10% of the unambiguous glosses for training.
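A minimal sketch of this simulated low-resource split; the slides do not say how the 10% subset is chosen, so uniform random sampling is an assumption.

```python
import random

def low_resource_split(unambiguous_glosses, fraction=0.10, seed=0):
    """Keep only `fraction` of the unambiguous glosses as training data."""
    rng = random.Random(seed)
    k = max(1, int(len(unambiguous_glosses) * fraction))
    train = rng.sample(unambiguous_glosses, k)
    held_out = [g for g in unambiguous_glosses if g not in train]
    return train, held_out
```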

28 Small amount of training data. Freebase dataset: performance on ambiguous glosses.

29 Small amount of training data. NELL dataset: performance on ambiguous glosses.

30 Comparing variants of GLOFIN: Freebase dataset.

31 Comparing variants of GLOFIN: NELL dataset.

32 Some more experiments ...
- Evaluating the quality of automatically acquired seeds
- Manually creating a gold standard for the NELL dataset
- Different ways of scaling GLOFIN
- NELL to Freebase mappings via common glosses
http://www.cs.cmu.edu/~bbd

33 Conclusions and Future Work

34 Conclusions
- A completely unsupervised method for gloss finding: unambiguous matches serve as training data, and hierarchical classification replaces entity linking.
- Our proposed method GLOFIN: GLOFIN ≥ Label Propagation ≥ SVM.
- Among variants of hierarchical GLOFIN: Naïve Bayes ≥ K-Means, von Mises-Fisher.
- Ontological constraints help for all GLOFIN variants: hierarchical GLOFIN ≥ flat GLOFIN.
- In the future, we would like to add new entities to the KB.

35 Example matches (head NP, gloss, candidate NELL entities, entity selected by GLOFIN):
- McGill_University: "McGill University is a research university located in Montreal Quebec Canada Founded in 1821 during the British colonial era the university bears the name of James McGill a prominent ..." Candidates: University:E, Sports_team:E. Selected: University:E.
- Kingston_upon_Hull: "Kingston upon Hull frequently referred to as Hull is a city and unitary authority area in the ceremonial county of the East Riding of Yorkshire England It stands on the River Hull at its junction with ..." Candidates: City:E, Visual_Artist:E. Selected: City:E.
- Robert_Southey: "Robert Southey was an English poet of the Romantic school one of the so called Lake Poets and Poet Laureate for 30 years from 1813 to his death in 1843 Although his fame has been long eclipsed by that ..." Candidates: Person_Europe:E, Person_Africa:E, Politician_USA:E. Selected: Person_Europe:E.

36 Thank You. Questions?

37 Extra Slides

38 Comparing GLOFIN Approximations

39 Evaluation: quality of seeds for the NELL KB
- Noisy seeds: only 81% of leaf category assignments are correct.
- Hierarchical labeling can help: 94% of higher-level category labels are correct.

40 Creating a gold standard for NELL
- A gold standard for evaluation on ambiguous glosses.
- For most glosses, the precise category is part of NELL.

41 NELL – Freebase mappings via common glosses

42 Pros and Cons of GLOFIN
Advantages:
- Generative EM framework that can build on SSL methods: Naïve Bayes, K-Means, vMF.
- Can label unseen datapoints once the models are learnt.
Limitations:
- Assumes the input KB is complete and accurate.
- All experiments are done in a transductive setting; needs extension for missing entities and categories in the KB.

43 Future work ...
- Adding new entities to existing KB categories: KBs are usually incomplete w.r.t. coverage of entities, and GLOFIN already classifies mentions into categories.
- Introducing new clusters of entities for categories missing from the KB, with extensions similar to Exploratory EM [Dalvi et al. ECML'13]: new categories, the entities belonging to them, and glosses for those entities.

