1
Lexicons, Concept Networks, and Ontologies
Kevin Bloomquist, Dan Pratt
2
What is a Lexicon?
"Lexicon" is a general term. It covers simple word lists (base word, POS), wordnets (related words and other information, depending on the project), and ontologies (hierarchical forms).
At a bare minimum, a lexicon contains a dictionary in some machine-readable format.
Lexicons are the subject of an entire field, computational lexicology.
3
General Applications of Lexicons
Word sense disambiguation: SENSEVAL
Use of unsupervised systems
Pattern matching in information extraction
Categorizing words by the syntactic information they convey
Question answering
Use of keywords
Use of ontologies in text summarization
Speech recognition and synthesis
4
How are Lexicons Created?
Lexicons were first created from existing dictionaries that were made machine readable.
Lexicons can be extended with information derived from corpora (statistical information, etc.).
Human input is also used.
The intended use strongly influences how a lexicon is organized and what information it conveys.
5
Ontologies: semantic relations between words
The slide's example graph is for the word “oxygenate”.
It uses information about word roots and definitions to create the graph.
This graph creates “definition cycles”.
This is just one example; there are many ways to create an ontology.
The example is from Litkowski, who hypothesized that there were primitive words with similar patterns of use.
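The idea of a "definition cycle" can be illustrated with a small sketch. This is not Litkowski's procedure; it is a plain depth-first search over a toy graph in which each headword points to the words used in its definition. The function name `find_definition_cycle` and the toy entries are assumptions made up for illustration, not data from any real dictionary.

```python
def find_definition_cycle(definitions):
    """Return one cycle of headwords as a list, or None if there is none.

    `definitions` maps each headword to the words used in its definition.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {word: WHITE for word in definitions}

    def visit(word, path):
        color[word] = GRAY
        path.append(word)
        for nxt in definitions.get(word, []):
            if nxt not in definitions:
                continue  # word has no entry in this toy lexicon
            if color[nxt] == GRAY:
                # nxt is already on the current path: a cycle is closed
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt, path)
                if found:
                    return found
        path.pop()
        color[word] = BLACK
        return None

    for word in definitions:
        if color[word] == WHITE:
            cycle = visit(word, [])
            if cycle:
                return cycle
    return None


# Hypothetical toy definitions (invented for this sketch).
toy = {
    "oxygenate": ["oxygen", "treat"],
    "oxygen": ["gas"],
    "treat": ["process"],
    "process": ["treat"],
    "gas": [],
}
cycle = find_definition_cycle(toy)
print(cycle)  # ['treat', 'process', 'treat']
```

In this toy graph, "treat" and "process" are defined in terms of each other, which is the kind of loop the slide calls a definition cycle.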
6
MindNet
An application that finds relations between arbitrary sets of words.
It uses definitions to find different types of relations between words, such as synonym, antonym, goal, part, object, and subject.
It attempts to construct logical relations using a lexical database.
Now part of Microsoft; the next step is work on machine translation.
7
ACQUILEX I & II
Overall goal: develop a rich multilingual knowledge base.
The projects want to “support a ‘deep’ knowledge-intensive model of language processing.”
I: Explore creating a multilingual dictionary out of a number of machine-readable dictionaries. Some were monolingual, some bilingual.
II: Add to this by using statistical information from corpora.
The projects ended up publishing a large number of academic papers (most of which are highly specific or not readily accessible).
8
Overall Insights
One of the main problems with building lexicons is that each project develops its own format and chooses what information to include. WordNet is changing this.
Building good ontologies may be the next important step, but there may be other (better or easier) ways.
9
WordNet
10
WordNet Online
A field to type in a word.
Eight options that can be displayed or hidden.
Every definition has related words, and you can view their definitions.
11
Example of Online: Hybrid, Hybridize
12
How does WordNet work?
WordNet is a large database containing words and their definitions.
It also contains mappings between words, such as synonyms and antonyms.
It can tell how common or rare a word is in a particular sense.
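As a rough mental model only, the three ingredients named above (definitions, mappings between words, and per-sense frequency) can be sketched with a tiny in-memory structure. This is not WordNet's actual storage format or API. The `Sense` class, the helper functions, and the `tag_count` values are all assumptions invented for illustration; the two glosses are taken from the "hybrid" example in these slides.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    words: list                  # words that share this sense (synonyms)
    gloss: str                   # definition text
    tag_count: int = 0           # hypothetical usage count for this sense
    antonyms: list = field(default_factory=list)


# Toy database; the tag counts are made up.
toy_db = [
    Sense(["hybrid"], "a composite of mixed origin", tag_count=1),
    Sense(
        ["hybrid", "crossbreed", "cross"],
        "an organism that is the offspring of genetically dissimilar parents or stock",
        tag_count=2,
    ),
]


def senses_of(word, db):
    """Senses containing `word`, most frequently tagged first."""
    found = [s for s in db if word in s.words]
    return sorted(found, key=lambda s: s.tag_count, reverse=True)


def synonyms_of(word, db):
    """All other words that share at least one sense with `word`."""
    shared = {w for s in db if word in s.words for w in s.words}
    return sorted(shared - {word})


print(synonyms_of("hybrid", toy_db))         # ['cross', 'crossbreed']
print(senses_of("hybrid", toy_db)[0].gloss)  # the organism sense comes first
```

Grouping words by shared sense rather than storing one record per word is what makes synonym lookup a simple set operation in this sketch.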
13
What is not covered by WordNet?
WordNet does not include closed-class words. That means no pronouns, articles, conjunctions, prepositions, etc. The only word types covered are nouns, verbs, adjectives, and adverbs.
14
Example of how WordNet stores a word.
Index.sense:
hybrid%1:05:00::
hybrid%1:09:00::
hybrid%1:10:00::
hybrid%5:00:00:crossbred:
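The sense keys shown here can be split into their fields with a few lines of code. The sketch below assumes the layout `lemma%ss_type:lex_filenum:lex_id:head_word:head_id` described in WordNet's sense index documentation, with `ss_type` 1 to 5 standing for noun, verb, adjective, adverb, and adjective satellite. The function name `parse_sense_key` and the dictionary keys are illustrative choices, not part of WordNet.

```python
# Mapping of the ss_type digit to a part of speech (assumed from the
# WordNet sense index documentation).
SS_TYPES = {
    1: "noun",
    2: "verb",
    3: "adjective",
    4: "adverb",
    5: "adjective satellite",
}


def parse_sense_key(key):
    """Split a sense key such as 'hybrid%1:05:00::' into named fields."""
    lemma, sep, lex_sense = key.partition("%")
    if not sep:
        raise ValueError(f"not a sense key: {key!r}")
    ss_type, lex_filenum, lex_id, head_word, head_id = lex_sense.split(":")
    return {
        "lemma": lemma,
        "pos": SS_TYPES[int(ss_type)],
        "lex_filenum": int(lex_filenum),
        "lex_id": int(lex_id),
        # The head fields are only filled in for adjective satellites.
        "head_word": head_word or None,
        "head_id": int(head_id) if head_id else None,
    }


for key in ["hybrid%1:05:00::", "hybrid%5:00:00:crossbred:"]:
    print(parse_sense_key(key))
```

Read this way, the first three keys are noun senses of "hybrid" that differ in their lexicographer file number, and the last is an adjective satellite whose head word is "crossbred".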
15
Example cont.
Index.noun: hybrid n 3 ~ + ;
Index.adj: hybrid a 1 1 &
Data.adj: s 04 crossed 0 hybrid 0 interbred 0 intercrossed & a 0000 | produced by crossbreeding
16
Example cont.
Data.noun:
n 03 loanblend 0 loan-blend 0 hybrid n 0000 ;r n 0000 ;c n 0000 | a word that is composed of parts from different languages (e.g., `monolingual' has a Greek prefix and a Latin root)
n 01 hybrid n v 0103 | a composite of mixed origin; "the vice-presidency is a hybrid of administrative and legislative offices"
n 03 hybrid 0 crossbreed 0 cross n v v v 0103 ~ n 0000 ~ n 0000 ~ n 0000 | an organism that is the offspring of genetically dissimilar parents or stock; especially offspring produced by breeding plants or animals of different varieties or breeds or species; "a mule is a cross between a horse and a donkey"
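In the data entries above, everything after the `|` is the gloss: a definition, optionally followed by quoted example sentences. A minimal sketch of separating the two is given below. It assumes example sentences are delimited by straight double quotes and that the definition is whatever precedes the first quote; `split_gloss` is a hypothetical helper name, and the fields before the `|` are left unparsed because the excerpt on the slide is abbreviated.

```python
import re


def split_gloss(line):
    """Return (definition, examples) from one data-file style line."""
    _, sep, gloss = line.partition("|")
    if not sep:
        raise ValueError("no gloss separator '|' in line")
    gloss = gloss.strip()
    # Example sentences are the double-quoted stretches of the gloss.
    examples = re.findall(r'"([^"]*)"', gloss)
    quote_at = gloss.find('"')
    definition = gloss if quote_at == -1 else gloss[:quote_at]
    # Drop the '; ' that separates the definition from the first example.
    return definition.rstrip("; ").strip(), examples


line = (
    'n 01 hybrid n v 0103 | a composite of mixed origin; '
    '"the vice-presidency is a hybrid of administrative and legislative offices"'
)
definition, examples = split_gloss(line)
print(definition)  # a composite of mixed origin
print(examples)
```

Semicolons inside the definition itself are preserved, because only the trailing separator before the first quoted example is stripped.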
17
Example of Download: Hybrid
18
References
ACQUILEX I and II. Sponsored by the European Commission, centered at the University of Cambridge. Last accessed: 01/25/06.
Litkowski, Kenneth C. Computational Lexicons and Dictionaries. Part of CL Research.
Dolan, et al. MindNet. Microsoft Research.
WordNet. Princeton University.