Published by Duane Rose. Modified over 9 years ago.
Evaluating Word Sense Induction and Disambiguation Methods
Ioannis P. Klapaftis, Suresh Manandhar
Poster by Sumedh Masulkar
Guided by Prof. Amitabh Mukherjee

INTRODUCTION

Word Sense Induction (WSI) is the task of automatically deducing the senses or uses of a word with multiple meanings (the target word) directly from text, without relying on external resources such as dictionaries or sense-tagged data. It is also known as unsupervised Word Sense Disambiguation (WSD), since WSI methods automatically disambiguate the ambiguous occurrences of a given word.

This paper presents a thorough description of the SemEval-2010 WSI task and a new evaluation setting for sense induction methods.

WSD was first formulated as a computational task during the early days of machine translation in the 1940s, making it one of the oldest problems in NLP.

Applications of WSI include differentiating homonyms in Web Information Retrieval (IR); such applications help to develop a theory of text-based IR. WSI is also useful in search engines, to return better results for a query, and in online translation of websites or texts, to make content available in different languages to different people.
PREVIOUS WORKS

Agirre et al. (2006) propose evaluating and optimizing the parameters of an unsupervised graph-based WSD algorithm.

Aitor Soroa and Eneko Agirre (SemEval-2007), Evaluating Word Sense Induction and Discrimination Systems. The authors propose a method for comparing sense induction and discrimination systems with one another, and with supervised and knowledge-based systems.

Ioannis P. Klapaftis and Suresh Manandhar (SemEval-2010), Evaluation Setting for Word Sense Induction and Disambiguation Systems. The F-score, in the setting of the SemEval-2007 WSI task, suffers from the matching problem, which does not allow (1) the assessment of the entire membership of clusters, or (2) the evaluation of all clusters in a given solution. In this paper, the authors present V-measure as a means of objectively assessing WSI methods in an unsupervised setting, along with a small modification to the supervised evaluation.

Marianna Apidianaki and Tim Van de Cruys, A Quantitative Evaluation of Global Word Sense Induction. The authors compare the performance of WSI algorithms to an algorithm that uses a 'global' approach, i.e. the different senses of a particular word are determined by comparing them to, and demarcating them from, the senses of other words in a full-blown word space model.

APPROACH, RESULTS (1), RESULTS (2) [panel content not recoverable]

CONCLUSIONS

This paper presented the main differences of the task from the corresponding SemEval-2007 and SemEval-2010 WSI challenges, and subsequently evaluated participating systems in terms of their unsupervised (V-Measure, paired F-Score) and supervised performance according to the skewness of the target words' sense distributions. The results seem to justify the authors' claim, and the particular evaluation measure does not offer any discriminative information among the three categories.
This is an important evaluation setting, in which the results of systems can be interpreted in terms of the number of generated clusters and the distribution of target word instances within the clusters.

CONTACT

Sumedh Masulkar
sumedh@iitk.ac.in
(+91)8005463472

As in the official evaluation, we also observe that systems generating a higher number of clusters achieve a high V-Measure, although their performance does not increase monotonically with the number of clusters.

Given that the paired F-Score seems to be more biased towards a small number of clusters than V-Measure was towards a high number of clusters, the particular evaluation measure does not offer any discriminative information among the three skewness categories.
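
The opposite biases of the two unsupervised measures noted above can be illustrated with a small sketch. It assumes the standard definitions: V-Measure as the harmonic mean of homogeneity and completeness, and paired F-Score as the F-score over pairs of instances that share a cluster versus pairs that share a gold sense. The function names and the toy data are hypothetical and are not taken from the SemEval-2010 scorer.

```python
from collections import Counter
from itertools import combinations
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """H(target | given) for two aligned label lists."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(gold, clusters):
    """Harmonic mean of homogeneity and completeness."""
    h_gold, h_clu = entropy(gold), entropy(clusters)
    homogeneity = 1.0 if h_gold == 0 else 1.0 - conditional_entropy(gold, clusters) / h_gold
    completeness = 1.0 if h_clu == 0 else 1.0 - conditional_entropy(clusters, gold) / h_clu
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


def paired_f_score(gold, clusters):
    """F-score over instance pairs: same cluster vs. same gold sense."""
    pairs = list(combinations(range(len(gold)), 2))
    same_cluster = {p for p in pairs if clusters[p[0]] == clusters[p[1]]}
    same_gold = {p for p in pairs if gold[p[0]] == gold[p[1]]}
    common = len(same_cluster & same_gold)
    if common == 0:
        return 0.0
    precision = common / len(same_cluster)
    recall = common / len(same_gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical target word with a skewed sense distribution (6 vs. 2 instances).
gold = ["sense_a"] * 6 + ["sense_b"] * 2
one_cluster = [0] * len(gold)          # everything in a single cluster
singletons = list(range(len(gold)))    # one cluster per instance

for name, solution in [("one cluster", one_cluster), ("singletons", singletons)]:
    print(name, "V-Measure:", round(v_measure(gold, solution), 3),
          "paired F-Score:", round(paired_f_score(gold, solution), 3))
```

On this toy input the one-cluster-per-instance solution scores higher on V-Measure, while the single-cluster solution scores higher on paired F-Score, which is the kind of bias that makes it useful to interpret results alongside the number of generated clusters.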