Published by Norma Dean. Modified over 9 years ago.
1
WSD for Applications
Bill Dolan
SenseEval 2004
2
Where is WSD useful?
- Lots of work in the field, but still no clear answer
- Where WSD = classical, dictionary-sense resolution
3
Intuitive Motivations
- Automates something we already do with dictionaries
- Many applications seem to require WSD
  - Information Retrieval/Question Answering
  - Cross-language information retrieval
  - Information extraction
  - Proofing tools, e.g. synonym replacement
  - Translation
4
Pragmatic Motivations
- Splitting off WSD yields a pleasing division of the NLP problem space
  - manageable in size
  - clear success metrics
  - readily available training data: annotated and unannotated
5
But where are the applications?
- Why is it so hard to find a convincing app?
- Hopeful answer: the quality bar just hasn’t been met yet
- But even experimentally, little/no evidence that WSD helps any application
- Alternatively: maybe we’re trying to automate the wrong task
- Then what is the right task?
6
An Application-centric view
- What do apps actually need?
  - Information Retrieval/Question Answering
  - Cross-language information retrieval
  - Information extraction
  - Proofing tools, e.g. synonym replacement
  - Translation
- Not a sense, a cluster of related words, etc.
- Instead: the ability to map one string into another that’s superficially distinct
  - Regardless of length or language
- Paraphrase
7
Question Answering
- The genome of the fungal pathogen that causes Sudden Oak Death has been sequenced by US scientists
- Researchers announced Thursday they've completed the genetic blueprint of the blight-causing culprit responsible for sudden oak death
- Scientists have figured out the complete genetic code of a virulent pathogen that has killed tens of thousands of California native oaks
- The East Bay-based Joint Genome Institute said Thursday it has unraveled the genetic blueprint for the diseases that cause the sudden death of oak trees
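The four sentences above report the same event in very different words. As a rough illustration (not part of the talk), the sketch below scores each pair with Jaccard word overlap, a deliberately naive surface measure chosen here only as an assumption for the demonstration. The helper names `tokens`, `jaccard`, and `pairwise_scores` are hypothetical.

```python
import re
from itertools import combinations

# The four example sentences from the slide.
SENTENCES = [
    "The genome of the fungal pathogen that causes Sudden Oak Death "
    "has been sequenced by US scientists",
    "Researchers announced Thursday they've completed the genetic blueprint "
    "of the blight-causing culprit responsible for sudden oak death",
    "Scientists have figured out the complete genetic code of a virulent "
    "pathogen that has killed tens of thousands of California native oaks",
    "The East Bay-based Joint Genome Institute said Thursday it has unraveled "
    "the genetic blueprint for the diseases that cause the sudden death of oak trees",
]


def tokens(text):
    """Lowercase the text and return its set of word types."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Jaccard overlap of word types: |A & B| / |A | B| (0.0 if both are empty)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def pairwise_scores(sentences):
    """Map each index pair (i, j), i < j, to its Jaccard score."""
    return {
        (i, j): jaccard(sentences[i], sentences[j])
        for i, j in combinations(range(len(sentences)), 2)
    }


if __name__ == "__main__":
    for (i, j), score in sorted(pairwise_scores(SENTENCES).items()):
        print(f"sentence {i + 1} vs sentence {j + 1}: {score:.2f}")
```

Every pair scores well below 0.5, and much of what is shared is function words such as "the" and "of". That is consistent with the point of these slides: matching such sentences calls for something closer to paraphrase recognition than to word-level matching.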
8
Information Extraction
- The genome of the fungal pathogen that causes Sudden Oak Death has been sequenced by US scientists
- Researchers announced Thursday they've completed the genetic blueprint of the blight-causing culprit responsible for sudden oak death
- Scientists have figured out the complete genetic code of a virulent pathogen that has killed tens of thousands of California native oaks
- The East Bay-based Joint Genome Institute said Thursday it has unraveled the genetic blueprint for the diseases that cause the sudden death of oak trees
9
Cross-lingual Information Retrieval
- The genome of the fungal pathogen that causes Sudden Oak Death has been sequenced by US scientists
- Researchers announced Thursday they've completed the genetic blueprint of the blight-causing culprit responsible for sudden oak death
- Scientists have figured out the complete genetic code of a virulent pathogen that has killed tens of thousands of California native oaks
- The East Bay-based Joint Genome Institute said Thursday it has unraveled the genetic blueprint for the diseases that cause the sudden death of oak trees
10
Proofing: rewriting tool
- The genome of the fungal pathogen that causes Sudden Oak Death has been sequenced by US scientists
- Researchers announced Thursday they've completed the genetic blueprint of the blight-causing culprit responsible for sudden oak death
- Scientists have figured out the complete genetic code of a virulent pathogen that has killed tens of thousands of California native oaks
- The East Bay-based Joint Genome Institute said Thursday it has unraveled the genetic blueprint for the diseases that cause the sudden death of oak trees
11
A different take on the problem
- What’s missing is a basic enabling technology
  - Paraphrase identification/generation capability
- The applications for WSD that have been suggested over the years really need more general paraphrase identification/generation skills
  - Resolving lexical associations is just one aspect of this
- Problem begins to look more like an MT problem
  - Map one chunk of text to another, similar or not
  - Not clear that explicit WSD is useful
12
Some Apps
- Machine Translation
  - Data-driven techniques predominate, work pretty well
  - No explicit WSD, just learned associations between bilingual pairings
  - Lexical mappings learned through statistical association: not perfect, but given the right data, pretty good
  - Different language pairs require different sense breakdowns
  - Paraphrase/MT are the same problem
- Cross-language IR
  - What else but MT?
- Proofing tools, e.g. thesaurus-level replacements
  - But often not terribly useful; as any writer knows, there’s usually no good synonym, and a complete rewrite is necessary
- Question Answering/IR
  - Map a query to a piece of semantically similar but potentially formally distinct prose
- For all of these apps, the problem is less individual words than whole sequences
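To make "lexical mappings learned through statistical association" concrete, here is a minimal sketch that scores word pairs with the Dice coefficient over sentence-aligned data. The talk does not name a specific association measure, so Dice is an assumption. The six-pair English/French corpus is invented for illustration, with function words already removed, and the names `dice_table` and `best_matches` are hypothetical.

```python
from collections import Counter
from itertools import product

# Hypothetical toy corpus: aligned English/French content words only.
PAIRS = [
    (["bank", "approved", "loan"], ["banque", "approuvé", "prêt"]),
    (["bank", "raised", "rates"], ["banque", "augmenté", "taux"]),
    (["river", "bank", "flooded"], ["rive", "fleuve", "inondée"]),
    (["repaid", "loan"], ["remboursé", "prêt"]),
    (["river", "froze"], ["fleuve", "gelé"]),
    (["steep", "bank"], ["rive", "escarpée"]),
]


def dice_table(pairs):
    """Score every co-occurring (source, target) word pair with Dice:
    2 * joint / (source_count + target_count), counting sentence pairs."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        src_set, tgt_set = set(src), set(tgt)
        src_count.update(src_set)
        tgt_count.update(tgt_set)
        joint.update(product(src_set, tgt_set))
    return {
        (s, t): 2 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in joint.items()
    }


def best_matches(table, word, k=2):
    """Return the k highest-scoring target words for a source word,
    breaking ties alphabetically so the result is deterministic."""
    scored = [(t, score) for (s, t), score in table.items() if s == word]
    return sorted(scored, key=lambda item: (-item[1], item[0]))[:k]


if __name__ == "__main__":
    table = dice_table(PAIRS)
    for word in ("loan", "river", "bank"):
        print(word, best_matches(table, word))
```

On this toy data "loan" and "river" each get a single clear match, while "bank" ties between "banque" and "rive". No sense inventory was consulted: the split falls out of the bilingual pairings, which is the sense in which a different language pair would induce a different breakdown. Real systems use far larger corpora and alignment models rather than raw Dice scores.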
13
Direction?
- The applications that have been suggested for WSD are all just aspects of the larger paraphrase problem
- Even MT is a paraphrase problem, though a bit more extreme than the monolingual case
- Focus on the broader paraphrase problem, rather than on individual words