A Metric-based Framework for Automatic Taxonomy Induction
Hui Yang and Jamie Callan
Language Technologies Institute, Carnegie Mellon University
ACL 2009, Singapore
Roadmap
- Introduction
- Related Work
- Metric-Based Taxonomy Induction Framework
- The Features
- Experimental Results
- Conclusions
Introduction
- Semantic taxonomies, such as WordNet, play an important role in solving knowledge-rich problems.
- Limitations of manually created taxonomies:
  - Rarely complete
  - Difficult to include new terms from emerging or changing domains
  - Time-consuming to create, which may make them infeasible for specialized domains and personalized tasks
Introduction
- Automatic taxonomy induction is a solution: it can
  - Augment existing resources
  - Quickly produce new taxonomies for specialized domains and personalized tasks
- Subtasks in automatic taxonomy induction:
  - Term extraction
  - Relation formation
- This paper focuses on relation formation.
Related Work
- Pattern-based approaches
  - Define lexical-syntactic patterns for relations, and use these patterns to discover instances
  - Have been applied to extract is-a, part-of, sibling, synonym, causal, etc. relations
  - Strength: highly accurate
  - Weakness: sparse coverage of patterns
- Clustering-based approaches
  - Hierarchically cluster terms based on similarities of their meanings, usually represented by feature vectors
  - Have only been applied to extract is-a and sibling relations
  - Strength: allow discovery of relations that do not explicitly appear in text; higher recall
  - Weaknesses: generally fail to produce coherent clusters for small corpora [Pantel and Pennacchiotti 2006]; hard to label non-leaf nodes
A Unified Solution
- Combine the strengths of both approaches in a unified framework
- Flexibly incorporate heterogeneous features
- Use lexical-syntactic patterns as one type of feature in a clustering framework
- Metric-based taxonomy induction
The Framework
A novel framework, which
- Incrementally clusters terms
- Transforms taxonomy induction into a multi-criteria optimization problem
- Uses heterogeneous features
Optimization is based on two criteria:
- Minimization of taxonomy structures (Minimum Evolution Assumption)
- Modeling of term abstractness (Abstractness Assumption)
Let's Begin with Some Important Definitions
A taxonomy is a data model with:
- A concept set
- A relationship set
- A domain
More Definitions
A full taxonomy [figure: game equipment at the root, with ball and table as children, and the specific items below them]:
- AssignedTermSet = {game equipment, ball, table, basketball, volleyball, soccer, table-tennis table, snooker table}
- UnassignedTermSet = {}
More Definitions
A partial taxonomy [figure: the same game-equipment taxonomy, with only some terms attached]:
- AssignedTermSet = {game equipment, ball, table, basketball, volleyball}
- UnassignedTermSet = {soccer, table-tennis table, snooker table}
More Definitions
Ontology metric: a distance between two concepts in the taxonomy. [Figure: the game-equipment taxonomy with edge distances such as 1, 1.5, and 2; the distance d between two concepts accumulates along the path connecting them, e.g., d = 1, d = 2, and d = 4.5 in the example.]
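As a concrete reading of the figure, one natural formalization, a sketch consistent with the example distances rather than a verbatim formula from the slides, defines the ontology metric between two concepts as the accumulated edge weights along the path connecting them:

```latex
d(c_x, c_y) \;=\; \sum_{e \,\in\, \mathrm{path}(c_x,\, c_y)} w(e)
```

So, for example, a path through edges of weight 2, 1.5, and 1 gives d = 4.5.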
Assumptions
Minimum Evolution Assumption: the optimal ontology is the one that introduces the least information change.
Illustration: Minimum Evolution Assumption
[Animated sequence of figures: starting from an empty taxonomy, the terms ball, table, and game equipment are attached one at a time, keeping at each step the structure that introduces the least information change.]
Assumptions
Abstractness Assumption: each abstraction level has its own information function.
Assumptions
Abstractness Assumption [figure: the game-equipment taxonomy with its abstraction levels, game equipment at the most abstract level and ball and table below it]
Multiple Criteria Optimization
Combine two objectives via a scalarization variable:
- Minimum Evolution objective function
- Abstractness objective function
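The slide's equation is not preserved in the transcript; a standard scalarized form of such a two-criteria objective (the symbols here are illustrative, not the authors' notation) would be:

```latex
\hat{T} \;=\; \arg\min_{T}\; \lambda \, \mathcal{O}_{\mathrm{ME}}(T) \;+\; (1-\lambda)\, \mathcal{O}_{\mathrm{abs}}(T),
\qquad \lambda \in [0, 1]
```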
Estimating the Ontology Metric
- Assume the ontology metric is a linear interpolation of some underlying feature functions
- Use ridge regression to estimate and predict the ontology metric
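A minimal sketch of this estimation step, assuming numpy and illustrative names (`F`, `fit_ridge`, and `lam` are not from the slides): the metric for a term pair is modeled as a weighted sum of feature values, and the weights are fit with the closed-form L2-regularized least squares (ridge) solution.

```python
import numpy as np

def fit_ridge(F, d, lam=1.0):
    """Fit weights w so that F @ w approximates observed distances d.

    F   -- (n_pairs, n_features) matrix of feature values for term pairs
    d   -- (n_pairs,) observed ontology-metric values, e.g. from training taxonomies
    lam -- ridge regularization strength
    """
    n_features = F.shape[1]
    # Closed-form ridge solution: (F^T F + lam * I)^-1 F^T d
    return np.linalg.solve(F.T @ F + lam * np.eye(n_features), F.T @ d)

def predict_distance(w, features):
    """Predict the ontology metric for one term pair from its feature vector."""
    return float(np.asarray(features) @ w)
```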
The Features
Our framework allows a wide range of features to be used.
- Input to a feature function: two terms
- Output: a numeric score measuring the semantic distance between the two terms
We can use the following types of feature functions, though we are not restricted to these (a minimal interface sketch follows the list):
- Contextual features
- Term co-occurrence
- Lexical-syntactic patterns
- Syntactic dependency features
- Word length difference
- Definition overlap, etc.
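A minimal sketch of the feature-function contract described above (the type alias and the example feature are assumptions for illustration, not from the slides):

```python
from typing import Callable

# A feature function maps a pair of terms to a numeric semantic-distance score.
FeatureFunction = Callable[[str, str], float]

def word_length_difference(term_x: str, term_y: str) -> float:
    """Example feature: absolute difference in character length between two terms."""
    return float(abs(len(term_x) - len(term_y)))

score = word_length_difference("ball", "basketball")  # 6.0
```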
Experimental Results
- Task: reconstruct taxonomies from WordNet and ODP (not the entire WordNet or ODP, but fragments of each)
- Ground truth: 50 hypernym taxonomies from WordNet; 50 hypernym taxonomies from ODP; 50 meronym taxonomies from WordNet
- Auxiliary datasets: 1000 Google documents per term or per term pair; 100 Wikipedia documents per term
- Evaluation metric: F1-measure (averaged by leave-one-out cross-validation)
Datasets
Performance of Taxonomy Induction
Compare our system (ME) with other state-of-the-art systems:
- HE: 6 is-a patterns [Hearst 1992]
- GI: 3 part-of patterns [Girju et al. 2003]
- PR: a probabilistic framework [Snow et al. 2006]
- ME: our metric-based framework
Performance of Taxonomy Induction
- Our system (ME) consistently gives the best F1 on all three tasks.
- Systems using heterogeneous features (ME and PR) achieve a significant absolute F1 gain (>30%).
Features vs. Relations
This is the first study of the impact of using different features on taxonomy induction for different relations.
- Co-occurrence and lexico-syntactic patterns are good for is-a, part-of, and sibling relations.
- Contextual and syntactic dependency features are only good for the sibling relation.
Features vs. Abstractness
This is the first study of the impact of using different features on taxonomy induction for terms at different abstraction levels.
- Contextual, co-occurrence, lexical-syntactic pattern, and syntactic dependency features work well for concrete terms.
- Only co-occurrence works well for abstract terms.
Conclusions
This paper presents a novel metric-based taxonomy induction framework, which
- Combines the strengths of pattern-based and clustering-based approaches
- Achieves better F1 than 3 state-of-the-art systems
It is also the first study of the impact of using different features on taxonomy induction for different types of relations and for terms at different abstraction levels.
Conclusions
This work is a general framework, which
- Allows a wider range of features
- Allows different metric functions at different abstraction levels
It has the potential to learn more complex taxonomies than previous approaches.
Thank You and Questions
huiyang@cs.cmu.edu
callan@cs.cmu.edu
Extra Slides
Formal Formulation of Taxonomy Induction
The task of taxonomy induction: construct a full ontology T given a set of concepts C and an initial partial ontology T_0.
- Keep adding concepts from C into T_0 (note that T_0 could be empty)
- Until a full ontology is formed
Goal of Taxonomy Induction
Find the optimal full ontology such that the information change since T_0 is least. Note that this follows from the Minimum Evolution Assumption.
Get to the Goal
Since the optimal set of concepts is always C, concepts can be added incrementally.
Get to the Goal
- Plug in the definition of information change
- Transform into a minimization problem: the Minimum Evolution objective function
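The equations on these slides are not preserved in the transcript. A sketch of what such an objective can look like, consistent with the later slide that defines the information in a taxonomy via pairwise distances (the notation is assumed, not the authors'):

```latex
\mathrm{Info}(T) \;=\; \sum_{c_x,\, c_y \,\in\, T} d(c_x, c_y),
\qquad
\mathcal{O}_{\mathrm{ME}}(T) \;=\; \bigl|\, \mathrm{Info}(T) - \mathrm{Info}(T_0) \,\bigr|
```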
Explicitly Model Abstractness
- Model abstractness for each level by a least-squares fit
- Plug in the definition of the amount of information for an abstraction level: the Abstractness objective function
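Again, the slide's equation is missing. One plausible shape for a per-level least-squares objective (purely illustrative notation) sums, over levels, the squared deviation between the observed distances at a level and that level's own information function g_L, per the Abstractness Assumption:

```latex
\mathcal{O}_{\mathrm{abs}}(T) \;=\;
\sum_{L \,\in\, \mathrm{levels}(T)} \;
\sum_{c_x,\, c_y \,\in\, L}
\bigl( d(c_x, c_y) - g_L(c_x, c_y) \bigr)^{2}
```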
The Optimization Algorithm
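The algorithm itself is not preserved on this slide. Below is a greedy sketch that matches the incremental description elsewhere in the deck; the function and variable names are assumptions, and the authors' actual algorithm may differ. It repeatedly attaches the unassigned term, at the attachment point, that minimizes the combined objective.

```python
def induce_taxonomy(unassigned, parent_of, root, combined_objective):
    """Greedy incremental taxonomy induction sketch (names are illustrative).

    unassigned         -- terms not yet attached to the taxonomy
    parent_of          -- dict mapping each attached term to its parent
    root               -- root concept of the (possibly trivial) partial taxonomy
    combined_objective -- scores a candidate parent map (lower is better),
                          e.g. the scalarized ME + abstractness objective
    """
    unassigned = list(unassigned)
    while unassigned:
        best = None  # (score, term, parent)
        nodes = [root] + list(parent_of)  # all current attachment points
        for term in unassigned:
            for parent in nodes:
                candidate = dict(parent_of)
                candidate[term] = parent  # hypothetically attach term here
                score = combined_objective(candidate)
                if best is None or score < best[0]:
                    best = (score, term, parent)
        _, term, parent = best  # commit the cheapest insertion
        parent_of[term] = parent
        unassigned.remove(term)
    return parent_of
```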
More Definitions
Information in a taxonomy T. [Figure: the game-equipment taxonomy with edge distances (1, 1.5, 2) and pairwise metric values such as d = 1, d = 2, and d = 4.5; the information in T is defined from these pairwise distances.]
More Definitions
Information in a level L. [Figure: one abstraction level of the taxonomy, with pairwise metric values such as d = 1 and d = 2 among the terms at that level.]
Examples of Features
Contextual features:
- Global Context KL-Divergence = KL-divergence(1000 Google documents for C_x, 1000 Google documents for C_y)
- Local Context KL-Divergence = KL-divergence(left two and right two words of C_x, left two and right two words of C_y)
Term co-occurrence:
- Pointwise Mutual Information (PMI), where counts are the number of sentences containing the term(s), the number of documents containing the term(s), or n as in "Results 1-10 of about n for ..." in Google
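A minimal sketch of the PMI computation under one of the count choices above (document counts; the function and argument names are illustrative):

```python
import math

def pmi(count_x, count_y, count_xy, total):
    """Pointwise mutual information from document counts.

    count_x, count_y -- documents containing each term
    count_xy         -- documents containing both terms (must be > 0)
    total            -- total number of documents
    """
    p_x, p_y = count_x / total, count_y / total
    p_xy = count_xy / total
    return math.log(p_xy / (p_x * p_y))

# e.g. pmi(120, 80, 30, 10000) > 0 indicates the terms co-occur
# more often than chance.
```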
Syntactic dependency features:
- Minipar Syntactic Distance = average length of the syntactic paths between the terms, in syntactic parse trees of sentences containing them
- Modifier Overlap = number of overlaps between the modifiers of the terms (e.g., red apple, red pear)
- Object Overlap = number of overlaps between the objects of the terms when the terms are subjects (e.g., A dog eats apple; A cat eats apple)
- Subject Overlap = number of overlaps between the subjects of the terms when the terms are objects (e.g., A dog eats apple; A dog eats pear)
- Verb Overlap = number of overlaps between the verbs of the terms when the terms are subjects/objects (e.g., A dog eats apple; A cat eats pear)
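These overlap features all reduce to counting shared items between two extracted sets; a minimal sketch (extracting modifiers, subjects, objects, and verbs from parses is assumed to happen elsewhere):

```python
def overlap(items_x, items_y):
    """Count shared items between two feature sets extracted for two terms."""
    return len(set(items_x) & set(items_y))

# Modifier Overlap for "apple" vs. "pear", given modifiers seen in a corpus:
modifier_overlap = overlap({"red", "ripe"}, {"red", "green"})  # 1 ("red")
```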
Examples of Features
Lexical-syntactic patterns. [The pattern table on this slide is not preserved in the transcript.]
Examples of Features
Miscellaneous features:
- Definition Overlap = number of non-stopword overlaps between the definitions of two terms
- Word Length Difference
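A minimal sketch of Definition Overlap (the stopword list and whitespace tokenization are assumptions):

```python
STOPWORDS = {"a", "an", "the", "of", "and", "or", "to", "in", "is", "that"}

def definition_overlap(def_x: str, def_y: str) -> int:
    """Count non-stopword word overlaps between two term definitions."""
    words_x = {w for w in def_x.lower().split() if w not in STOPWORDS}
    words_y = {w for w in def_y.lower().split() if w not in STOPWORDS}
    return len(words_x & words_y)

# e.g. glosses for "basketball" and "volleyball" both mention "ball" and "game".
```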