A Random Graph Walk based Approach to Computing Semantic Relatedness Using Knowledge from Wikipedia

Presenter: Ziqi Zhang, OAK Research Group, Department of Computer Science, University of Sheffield
Authors: Ziqi Zhang, Anna Lisa Gentile, Lei Xia, José Iria, Sam Chapman
In this presentation…

– Introduction to semantic relatedness
– Motivation for this research
– Methodology: random walk, Wikipedia, semantic relatedness
– Experiment and evaluation: computing semantic relatedness; semantic relatedness for named entity disambiguation
Semantic Relatedness > Introduction

Semantic relatedness (SR) measures how much words or concepts are related, encompassing all kinds of relations between them
[Diagram: example pairs whose relatedness is queried, e.g. LREC / Malta, ACL / COLING, computer science / computational linguistics, volcano ashes / airline]
– It captures a broader sense than semantic similarity
– It enables many complex NLP tasks, e.g., sense disambiguation, lexicon construction
Method and Literature > Introduction

– Typically, lexical resources (e.g., WordNet, Wikipedia) are needed to provide structural and content information about concepts
– Relatedness is computed by aggregating and balancing these "semantic" elements using a mathematical formula
– Some of the best known works: Resnik (1995), Leacock & Chodorow (1998), Strube & Ponzetto (2006), Zesch et al. (2008), Gabrilovich & Markovitch (2007)
– Recent trend: towards using collaborative lexical resources, such as Wikipedia and Wiktionary
Another SR measure, why? > Motivation

Wikipedia contains rich and diverse structural and content information about concepts and entities
On a Wiki page: title, redirect, content words, links, lists, infobox, category
– Which are useful for SR? Which are more useful than others?
– Can we combine them? How to combine them? Can we gain more if we combine them?
The Research > Motivation

This paper aims to answer these questions by proposing a method that naturally integrates diverse features in a balanced way, and by studying the importance of different features
Overview of the method > Methodology

[Diagram: the pipeline. Each input surface (e.g. "NLP", "Computational Linguistics") goes through Wiki Page Retrieval, then Feature Extraction, which produces weighted features (F.1 weight=x, F.2 weight=y, F.3 weight=z); a Random Walk over the joint feature graph of both surfaces yields the relatedness score]
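To make the pipeline concrete, here is a minimal runnable sketch of how the three stages compose. The stage functions are trivial stubs standing in for the real components; all names are illustrative, not the authors' code.

    # High-level skeleton of the pipeline in the diagram above.
    # The three stage functions are stubs, not real components.

    def wiki_page_retrieval(surface):
        return "page:" + surface                  # stub: would query Wikipedia

    def feature_extraction(page):
        return {"title": {page}, "links": set()}  # stub: weighted features F.1, F.2, ...

    def random_walk_relatedness(feats1, feats2):
        return 0.0                                # stub: walk over the joint graph

    def relatedness(surface1, surface2):
        p1, p2 = wiki_page_retrieval(surface1), wiki_page_retrieval(surface2)
        return random_walk_relatedness(feature_extraction(p1), feature_extraction(p2))

    print(relatedness("NLP", "Computational Linguistics"))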
Wiki page retrieval > Methodology

Objective: given two words/phrases, find the corresponding information pages in Wikipedia that they refer to
Problem: ambiguity of the input words (surfaces)
Solution: collect all pages (sense pages), compute pair-wise relatedness between all senses, and choose the pair with the maximum score
[Diagram: "NLP" maps to the sense pages Natural Language Processing and National Liberal Party; "Computational Linguistics" maps to Computational Linguistics (science) and Computational Linguistics (journal)]
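A minimal sketch of that sense-selection step (an assumed reading of "choose the pair with maximum score"; the helper names and the toy relatedness function are illustrative only):

    from itertools import product

    def pick_sense_pair(senses1, senses2, rel):
        """Return the (sense1, sense2) pair maximising rel(sense1, sense2)."""
        return max(product(senses1, senses2), key=lambda pair: rel(*pair))

    # Toy usage with a stand-in relatedness function (word overlap):
    senses_nlp = ["Natural Language Processing", "National Liberal Party"]
    senses_cl = ["Computational Linguistics (science)",
                 "Computational Linguistics (journal)"]
    toy_rel = lambda a, b: len(set(a.lower().split()) & set(b.lower().split()))
    print(pick_sense_pair(senses_nlp, senses_cl, toy_rel))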
Feature Extraction > Methodology

Objective: identify useful features to represent each sense of a surface for algorithmic consumption
– Page title and redirect target
– Content words from the first section, or the top n most frequent words from the entire page
– Page categories (search depth = 2)
– Outgoing link targets in list structures
– Other outgoing link targets
– Descriptive/definitive noun (the first noun phrase after "be" in the first sentence)
All features are formulated at word level
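As an illustration of "formulated at word level", a sense page's features could be bundled as word sets keyed by feature type (a hypothetical data layout; the values are invented for the example):

    # Hypothetical word-level feature bundle for one sense page,
    # following the feature types listed above; values are invented.
    features_nlp = {
        "title": {"natural", "language", "processing"},   # incl. redirect target
        "first_section_words": {"computers", "analyse", "text"},
        "categories": {"computational", "linguistics"},   # search depth = 2
        "list_out_links": {"parsing", "tagging"},
        "other_out_links": {"artificial", "intelligence"},
        "descriptive_noun": {"field"},  # first NP after "be" in the first sentence
    }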
Random Walk – Graph Construction > Methodology

Objective: plot surfaces, their senses and features on a single graph, so that senses are connected by shared features
[Diagram, repeated across several slides: the sense pages "Natural Language Processing" and "Computational Linguistics (science)" connect via has_title, has_category and has_link edges to feature nodes T1, T2, C1–C3 and L1–L5, with some category and link nodes shared between the two senses]
Intuition:
– A walker takes n steps; in each step a random route is taken
– Starting from a node, in n steps one can reach a limited set of other nodes
– The more routes connecting the desired end nodes, and the more likely those routes are to be taken, the more related the two senses are
– Routes are established by feature extraction and graph construction
– "Likelihood" is modelled by the importance of each type of feature, to be studied by experiments
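A small runnable sketch of the graph being described, using plain adjacency sets; node and edge labels mirror the diagram but are otherwise illustrative:

    # Sense nodes connect to word-level feature nodes via typed edges;
    # feature nodes shared by both senses create routes between them.
    edges = [
        ("Natural Language Processing", "T1", "has_title"),
        ("Natural Language Processing", "C1", "has_category"),
        ("Natural Language Processing", "L2", "has_link"),
        ("Natural Language Processing", "L3", "has_link"),
        ("Computational Linguistics (science)", "T2", "has_title"),
        ("Computational Linguistics (science)", "C1", "has_category"),  # shared feature
        ("Computational Linguistics (science)", "L2", "has_link"),      # shared feature
        ("Computational Linguistics (science)", "L4", "has_link"),
    ]

    # Undirected adjacency sets; a walker at a node picks one neighbour per step.
    graph = {}
    for a, b, _etype in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    print(sorted(graph["C1"]))  # C1 touches both senses -> a route between them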
Random Walk – The Math > Methodology

Random walk is simulated via matrix calculation and transformation:
– An adjacency matrix models the distribution of weights over the different features
– A t-step random walk is achieved by matrix multiplication
– Walk probabilities are translated into relatedness scores
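The slide gives the outline without the formulas; one standard formulation consistent with this description (an assumption; the exact equations are in the paper) is:

    % Assumed formulation of the weighted walk: edge weights are set by
    % feature type, rows are normalised into a transition matrix, and the
    % walk is iterated t times.
    P_{ij} = \frac{ w(\mathrm{type}(i,j)) \, A_{ij} }
                  { \sum_k w(\mathrm{type}(i,k)) \, A_{ik} },
    \qquad
    P^{(t)} = P^{t},
    \qquad
    \mathrm{rel}(s_1, s_2) \propto P^{(t)}_{s_1 s_2} + P^{(t)}_{s_2 s_1}

where A is the adjacency matrix of the joint graph, w(.) is the weight of each feature (edge) type, and P^{(t)}_{ij} is the probability of reaching node j from node i after t steps.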
Experiment & Evaluation > Experiment

The experiments are designed to achieve three objectives:
– Analyse the importance of each proposed feature
– Evaluate the effectiveness of the random walk method for computing semantic relatedness
– Evaluate the usefulness of the method for solving other NLP problems, here Named Entity Disambiguation (NED)
Feature Analysis > Experiment

A Simulated Annealing optimisation method (Nie et al., 2005) is used to perform the analysis, in which:
– 200 pairs of words from WordSim353 are used
– To begin with, each feature is treated equally by assigning it the same weight (the uniform weight model)
– SR is computed using the weight model and evaluated against the gold standard
– Hundreds of iterations are run; in each, a different weight model is generated randomly
– The weight models that contribute to the highest performance on this dataset are analysed manually, eliminating the least important features or combining them into other, semantically similar features
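A sketch of the search loop just described. The slide describes freshly random weight models per iteration, so this sketch implements that random search; the annealing acceptance schedule of Nie et al. (2005) and the real evaluation against the gold standard are stubbed out.

    import random

    FEATURES = ["title", "first_section_words", "categories",
                "descriptive_nouns", "list_out_links", "other_out_links"]

    def random_weight_model():
        w = [random.random() for _ in FEATURES]
        return {f: x / sum(w) for f, x in zip(FEATURES, w)}  # weights sum to 1

    def evaluate(weights):
        # Stub: would compute SR on the 200 WordSim353 pairs with these
        # weights and score the result against the gold standard.
        return random.random()

    best_w = {f: 1.0 / len(FEATURES) for f in FEATURES}  # start uniform
    best_score = evaluate(best_w)
    for _ in range(500):                  # "hundreds of iterations"
        cand = random_weight_model()
        score = evaluate(cand)
        if score > best_score:            # keep the best-performing model
            best_w, best_score = cand, score
    print(best_score, best_w)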
Feature Analysis – findings > Experiment

Initial (uniform) weight model:
  Weight   Feature
  0.166    Title (incl. redirect target)
  0.166    First section words
  0.166    Categories
  0.166    Descriptive nouns
  0.166    Out links in lists
  0.166    Other out links

Optimised weight model:
  Weight   Feature
  0.16     Title (incl. redirect target)
  0.28     First section words / most frequent words (top 75)
  0.11     Categories
  0.45     Out links

This achieved a best accuracy of 0.45 on the data, compared to the best in the literature of 0.5 by Zesch et al. (2008). This setting is then used for further evaluation.
Evaluating Computation of SR > Experiment

Three datasets are chosen: a different set of 153 pairs of words from WordSim353; 65 pairs from Rubenstein & Goodenough (1965), RG65; and 30 pairs from Miller & Charles (1991), MC30
Compared against: a collection of WordNet-based algorithms and other state-of-the-art methods for SR

  Method                          WordSim353-153  RG65  MC30  WordSim353-200 (feature analysis)
  Ours                            0.71            0.76  0.71  0.46
  Strube & Ponzetto (2006)        0.55            0.69  0.67  /
  Zesch et al. ESA (2008)         0.62            /     /     0.31
  Zesch et al. Wiki (2008)        0.70            0.76  0.68  0.50
  Zesch et al. Wiktionary (2008)  0.70            0.84        0.60
  Best of WordNet                 0.39            0.79  0.81  0.23
Evaluating Usefulness of SR for NED > Experiment

The NED method in a nutshell (details: Gentile et al., 2009):
– Identify surfaces of NEs that occur in a text passage and that are defined by Wikipedia; retrieve the corresponding sense pages
– Compute the SR of each pair of their underlying senses
– The sense of a surface is determined collectively by the senses of the other surfaces found in the text (its context)
– Three functions are defined to capture this collective context
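A minimal sketch of one plausible combination function (the paper defines three; this greedy variant is an illustration, not necessarily one of them): each surface takes the sense that maximises total SR to the currently chosen senses of all other surfaces.

    def disambiguate(candidates, rel):
        """candidates: {surface: [sense, ...]}; rel(a, b) -> relatedness score."""
        chosen = {s: senses[0] for s, senses in candidates.items()}  # initial guess
        for surface, senses in candidates.items():
            # Score each candidate sense against the context (other surfaces)
            def context_score(sense):
                return sum(rel(sense, chosen[o]) for o in candidates if o != surface)
            chosen[surface] = max(senses, key=context_score)
        return chosen

    # Toy usage with a stand-in relatedness function (word overlap):
    cands = {"NLP": ["Natural Language Processing", "National Liberal Party"],
             "CL": ["Computational Linguistics (science)",
                    "Computational Linguistics (journal)"]}
    toy_rel = lambda a, b: len(set(a.lower().split()) & set(b.lower().split()))
    print(disambiguate(cands, toy_rel))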
Evaluating Usefulness of SR for NED > Experiment

Dataset: 20 news stories from Cucerzan (2007); each story contains 10–50 NEs

  Method             Accuracy
  Our best           91.5
  Our baseline       68.7
  Cucerzan baseline  51.7
  Cucerzan best      91.4
Conclusion

– Computing SR isn't an easy task
– Different kinds of structural and content information in Wikipedia all contribute to the task, but with different weights
– Combining these different features in a uniform measure can improve performance

In future:
– Can we use simpler similarity functions to obtain the same results?
– Can we integrate different lexical resources?
– How do we compute relatedness/similarity of longer text passages?
Thank you!

References (complete list can be found in the paper)
Cucerzan, S. (2007). Large-scale named entity disambiguation based on Wikipedia data. In Proceedings of EMNLP'07.
Finkelstein, L., Gabrilovich, E., Matias, Y., Rivlin, E., Solan, Z., Wolfman, G., Ruppin, E. (2002). Placing search in context: the concept revisited. ACM Transactions on Information Systems, 20(1), pp. 116–131.
Gabrilovich, E., Markovitch, S. (2007). Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In Proceedings of IJCAI'07, pp. 1606–1611.
Gentile, A., Zhang, Z., Xia, L., Iria, J. (2009). Graph-based semantic relatedness for named entity disambiguation. In Proceedings of S3T.
Leacock, C., Chodorow, M. (1998). Combining local context and WordNet similarity for word sense identification. In C. Fellbaum (Ed.), WordNet: An Electronic Lexical Database, Ch. 11, pp. 265–283.
Miller, G., Charles, W. (1991). Contextual correlates of semantic similarity. Language and Cognitive Processes, 6(1):1–28.
Nie, Z., Zhang, Y., Wen, J., Ma, W. (2005). Object-level ranking: bringing order to web objects. In Proceedings of WWW'05.
Resnik, P. (1995). Using information content to evaluate semantic similarity in a taxonomy. In Proceedings of IJCAI-95, pp. 448–453.
Rubenstein, H., Goodenough, J. (1965). Contextual correlates of synonymy. Communications of the ACM, 8(10):627–633.
Strube, M., Ponzetto, S. (2006). WikiRelate! Computing semantic relatedness using Wikipedia. In Proceedings of AAAI'06.
Zesch, T., Müller, C., Gurevych, I. (2008). Using Wiktionary for computing semantic relatedness. In Proceedings of AAAI'08.