Published by Jack Newman. Modified over 9 years ago.
1
The Topology of WordNet: some metrics
Ann Devitt and Carl Vogel
Computational Linguistics Group, Trinity College Dublin, Ireland
2
Ann Devitt, TCD
Introduction
- Measures
  - WordNet “sub-hierarchies”
  - Multiple inheritance
  - Branching Factor
  - Depth versus Height
  - Cluster coefficients
- Specificity pilot study
3
Terminology
- WordNet as directed acyclic graph
- Node and synset interchangeable
4
Dimensional distribution
5
Overlap between hierarchies
- 2072 synsets: more than 1 top hierarchy
- 35 synsets: more than 2 top hierarchies
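One way such overlap counts could be obtained is to record, for every node, the set of top-level nodes it descends from. The sketch below is only an illustration: the `parents` dictionary is a hypothetical toy format (node mapped to its list of hypernyms), the edges are invented, and the slides do not say how the authors computed their figures.

```python
def top_hierarchies(parents):
    """Map each node to the set of root nodes (nodes without parents) above it.

    `parents` maps node -> list of parent nodes. This input format is an
    assumption made for the sketch, not something taken from the slides.
    """
    memo = {}

    def roots_of(node):
        if node not in memo:
            node_parents = parents.get(node, [])
            if not node_parents:
                memo[node] = {node}  # a root belongs to its own hierarchy
            else:
                memo[node] = set().union(*(roots_of(p) for p in node_parents))
        return memo[node]

    return {node: roots_of(node) for node in parents}


# Toy graph with invented edges (not real WordNet data).
parents = {
    "entity": [],
    "group": [],
    "artefact": ["entity"],
    "weaponry": ["artefact", "group"],
    "beagle": ["entity"],
}
roots = top_hierarchies(parents)
overlap = {node for node, tops in roots.items() if len(tops) > 1}
```

Here `overlap` holds the nodes that sit under more than one top hierarchy; filtering on `len(tops) > 2` would give the second count on the slide.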
6
Some overlap examples
- Abstraction and Event
  - 948 synsets
  - group action
- Entity and Group
  - 250 nodes
  - weaponry
7
Multiple inheritance
- 2.6% of nodes
- Normal distribution throughout depth
- Significantly different in different taxonomies:
  - $\chi^2(8, N = 75180) = 324.27$, $p \le 0.001$
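A chi-square comparison of this kind can be sketched as a Pearson test on a contingency table of taxonomy against single versus multiple parentage. The slide does not give the table itself; 8 degrees of freedom would be consistent with, for example, a 9 x 2 table, but that layout is an assumption. The counts below are made up purely to exercise the function.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts, all rows the same length.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column categories were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


# Made-up counts: rows = two hypothetical taxonomies,
# columns = (nodes with one parent, nodes with several parents).
toy_table = [[10, 20], [20, 10]]
stat, df = chi_square(toy_table)
```

The p-value would then be read from a chi-square distribution with `df` degrees of freedom; that lookup is left out to keep the sketch free of dependencies.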
8
Specificity examples
- Parents = 1, depth < 3: damnation, office
- Parents = 1, depth > 8: beagle, palomino
- Parents > 1, depth < 3: person, artefact
- Parents > 1, depth > 8: sea bass, self-condemnation, bombardon
9
Branching Factor
- Number of children + 1
- Including leaf nodes
  - Range: 1 – 573
  - Average: 2.023
- Excluding leaf nodes:
  - Average: 5.793
  - 97% less than 20
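The definition above (number of children + 1, so a leaf scores 1) can be illustrated on a toy graph. The `children` dictionary and its contents are hypothetical; only the "+ 1" definition and the with/without-leaves split come from the slide.

```python
def branching_factors(children):
    """Branching factor per node, defined as number of children + 1."""
    return {node: len(kids) + 1 for node, kids in children.items()}


def average(values):
    values = list(values)
    return sum(values) / len(values)


# Toy hierarchy (invented): node -> list of child nodes.
children = {"a": ["b", "c", "d"], "b": [], "c": ["e"], "d": [], "e": []}

bf = branching_factors(children)
avg_all = average(bf.values())  # leaves included
avg_internal = average(bf[n] for n, kids in children.items() if kids)  # leaves excluded
```

On this toy graph the two averages are 1.8 and 3.0. As in the slide's figures, leaving out the leaf nodes raises the average because every leaf contributes the minimum value of 1.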
10
Branching factor
- Overall low branching factor
- Same distribution in all sub-hierarchies
- Large number of nodes in total
- Greater overall depth in paths
- Not a shallow structure
  - despite 55,000 leaf nodes
11
Depth vs Height
- Depth:
  - Maximum = 18
  - Normal distribution
- Height:
  - Maximum = 5
  - 93.6% 1 or 2 nodes from a leaf node
  - Zipfian distribution
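The slide does not spell out how depth and height are measured when a node has several parents or several routes to a leaf. The sketch below assumes shortest distances: depth is the number of edges from the nearest root, and height is the number of edges to the nearest leaf. Both choices, the `children` input format, and the toy graph are assumptions for illustration.

```python
from collections import deque


def bfs_distance(sources, neighbours):
    """Shortest edge distance from any source node to each reachable node."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def depth_and_height(children):
    """`children` maps every node (leaves too) to its list of child nodes."""
    parents = {node: [] for node in children}
    for node, kids in children.items():
        for kid in kids:
            parents.setdefault(kid, []).append(node)
    roots = [n for n, ps in parents.items() if not ps]
    leaves = [n for n, kids in children.items() if not kids]
    depth = bfs_distance(roots, children)   # walk downwards from the roots
    height = bfs_distance(leaves, parents)  # walk upwards from the leaves
    return depth, height


# Toy hierarchy (invented).
children = {"root": ["a", "b"], "a": ["c"], "b": [], "c": ["d"], "d": []}
depth, height = depth_and_height(children)
```

In the toy graph the root has depth 0 but height 1, because one of its children is already a leaf. Under this assumed definition, height stays small wherever leaves are plentiful, while depth keeps growing along long paths.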
12
Depth vs Height
- Reported distributions are the same across the different sub-hierarchies
- Depth is a more informative measure
13
Clustering coefficient
- Measure of graph connectivity
- Ratio:
  - $\frac{\text{number of connections between nodes}}{\text{possible number of connections}}$
  - Possible number of connections: $\frac{1}{2}\sum_i k_i (k_i - 1)$
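A ratio of this kind can be sketched as follows. The code assumes the usual Watts and Strogatz reading: $k_i$ is the number of neighbours of node $i$, edge direction is ignored, and a node with $k_i$ neighbours could have at most $k_i (k_i - 1) / 2$ links among them. The slides do not confirm these details, and the graph is a toy example.

```python
from itertools import combinations


def clustering(adjacency):
    """Per-node and pooled clustering ratios for an undirected graph.

    `adjacency` maps node -> set of neighbours; it must be symmetric.
    """
    local = {}
    linked_total = 0
    possible_total = 0
    for node, nbrs in adjacency.items():
        k = len(nbrs)
        possible = k * (k - 1) // 2  # pairs of neighbours that could be linked
        linked = sum(1 for u, v in combinations(nbrs, 2) if v in adjacency[u])
        local[node] = linked / possible if possible else 0.0
        linked_total += linked
        possible_total += possible
    pooled = linked_total / possible_total if possible_total else 0.0
    return local, pooled


# Toy graph (invented): a triangle a-b-c with an extra node d attached to a.
adjacency = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
local, pooled = clustering(adjacency)
```

Here `local["a"]` is 1/3 (one linked pair out of three possible) and `pooled` is 0.6. In a strict tree no two neighbours of a node are linked to each other, so every ratio would be 0.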
14
Cluster coefficients
- First-order measure
  - Not useful for WordNet
  - Only 62 nodes have a coefficient > 0
  - Does not form clusters readily
15
Cluster coefficients
- Second-order measure
  - Average 0.337
  - Normal distribution
  - May form clusters of wider diameter
16
Pilot Study Aims
1. Do people have a notion of generality/specificity for concepts?
2. Do people agree on what is more/less general/specific?
3. What features of WordNet do these judgments correlate with?
17
Sample ranking task I
- Axis, axis of rotation – (the center around which something rotates)
- River boat – (a boat used on rivers or to ply a river)
- Remains – (any object that is left unused or still extant; “I threw out the remains of my dinner”)
18
Sample ranking task II
- rational motive - (a motive that can be defended by reasoning or logical argument)
- disapproval - (the act of disapproving or condemning)
- harmony, concord, concordance - (agreement of opinions)
19
Do people agree on what is more/less general/specific?
YES
- Cochran Q statistic (Cochran 1950)
  - $H_0$: that any agreement between respondents is due to chance
- Overall: for 11 respondents
  - Cochran's Q: 165.859
  - Degrees of freedom: 44
  - Asymp. Sig.: .000
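For reference, Cochran's Q for a table of binary outcomes can be computed as in the sketch below. It uses the standard formula $Q = (k - 1)\,\frac{k \sum_j C_j^2 - N^2}{k N - \sum_i R_i^2}$ with $k - 1$ degrees of freedom, where $C_j$ are column totals, $R_i$ are row totals and $N$ is the grand total. How the ranking responses were coded as 0/1 is not stated on the slide, so the data layout here (one row per respondent, one column per item) and the toy values are assumptions.

```python
def cochran_q(rows):
    """Cochran's Q statistic and its degrees of freedom.

    `rows` holds one list of 0/1 outcomes per respondent (block),
    with one column per item (treatment).
    """
    k = len(rows[0])
    col_totals = [sum(col) for col in zip(*rows)]
    row_totals = [sum(row) for row in rows]
    n = sum(row_totals)
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - n * n)
    # The denominator is zero when every row is all 0s or all 1s;
    # Q is undefined in that case and this sketch does not handle it.
    denominator = k * n - sum(r * r for r in row_totals)
    return numerator / denominator, k - 1


# Made-up responses: 4 respondents, 3 items.
toy_rows = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]
q, df = cochran_q(toy_rows)
```

For the toy data this gives Q = 14/3 with 2 degrees of freedom. A result with 44 degrees of freedom, as reported on the slide, would correspond to 45 columns under this layout.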
20
What WN features correlate?
- Depth
  - Less deep = more general
- Children
  - Inconclusive
- Sisters
  - Fewer sisters = more general
- Sub-hierarchy
  - Did not seem to affect judgments
  - Did increase the difficulty of the task
21
Conclusion
- WordNet metrics
  - Inheritance: Sub-hierarchy and parentage
  - Branching Factor
  - Distance: depth and height
  - Clustering
- Pilot study
  - Suggests where to go with a larger study
22
Bibliography
- W. G. Cochran: The comparison of percentages in matched samples. Biometrika, 37:256-266, 1950
- David S. Touretzky: The Mathematics of Inheritance Systems. Los Altos, CA: Morgan Kaufmann, 1986
- D. J. Watts and S. H. Strogatz: Collective dynamics of ‘small-world’ networks. Nature, 393:440-442, 1998
23
Multiple Inheritance vs Depth