Lexicons, Concept Networks, and Ontologies


Lexicons, Concept Networks, and Ontologies Kevin Bloomquist Dan Pratt

What is a Lexicon?
"Lexicon" is a general term. It covers simple word lists (base word plus part of speech), wordnets (related words and other information, depending on the project), and ontologies (hierarchical forms). At a bare minimum, a lexicon contains a dictionary in some machine-readable format. The study of lexicons is an entire field: computational lexicology.

General Applications of Lexicons
Word sense disambiguation (SENSEVAL), including the use of unsupervised systems.
Pattern matching in information extraction: categorizing words by the syntactic information they convey.
Question answering: use of keywords.
Text summarization: use of ontologies.
Speech recognition and synthesis.

How are Lexicons Created?
Lexicons are first built from existing dictionaries that have been made machine readable. They can then be extended with information derived from corpora (statistical information, for example) and with human input. The intended use strongly influences how a lexicon is organized and what information it conveys.

Ontologies
Ontologies capture semantic relations between words. The slide's example graph, for the word “oxygenate”, is built from information about word roots and definitions. The graph contains “definition cycles”, where words are defined in terms of one another. This is just one example; there are many ways to create an ontology. The example is from Litkowski, who hypothesized that there were primitive words that had similar patterns of use.
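To make the idea of a definition cycle concrete, here is a minimal sketch. It assumes a toy dictionary in which each headword maps to the content words of its definition; the entries are invented for illustration and are not taken from Litkowski's work, and the depth-first search is one plausible way to find a cycle, not necessarily the method he used.

```python
from typing import Dict, List, Optional

# Toy definitions (hypothetical data): headword -> content words in its definition.
DEFINITIONS: Dict[str, List[str]] = {
    "oxygenate": ["treat", "oxygen"],
    "treat": ["process"],
    "process": ["treat"],  # "treat" and "process" define each other
    "oxygen": ["gas"],
    "gas": [],
}


def find_definition_cycle(
    definitions: Dict[str, List[str]], start: str
) -> Optional[List[str]]:
    """Return the first definition cycle reachable from `start`, or None."""
    path: List[str] = []  # words on the current search path, in order
    on_path = set()       # same words, for fast membership tests
    finished = set()      # words fully explored without finding a cycle

    def visit(word: str) -> Optional[List[str]]:
        if word in on_path:
            # We came back to a word already on the path: that is a cycle.
            return path[path.index(word):] + [word]
        if word in finished:
            return None
        on_path.add(word)
        path.append(word)
        for used_word in definitions.get(word, []):
            cycle = visit(used_word)
            if cycle is not None:
                return cycle
        on_path.discard(word)
        path.pop()
        finished.add(word)
        return None

    return visit(start)


print(find_definition_cycle(DEFINITIONS, "oxygenate"))
```

Words that take part in such cycles are natural candidates for the "primitive" words mentioned above, since their meanings cannot be reduced to anything outside the cycle.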

MindNet
An application that finds relations between arbitrary sets of words. It uses definitions to find different types of relations between words, such as synonym, antonym, goal, part, object, and subject, and it attempts to construct logical relations using a lexical database.
http://atom.research.microsoft.com/mnex/InputPath.aspx?l=e&d=d
MindNet is now part of Microsoft; the next step is work on machine translation.

ACQUILEX I & II
Overall goal: develop a rich multilingual knowledge base to “support a ‘deep’ knowledge-intensive model of language processing.”
ACQUILEX I explored creating a multilingual dictionary out of a number of machine-readable dictionaries, some monolingual and some bilingual.
ACQUILEX II added to this by using statistical information from corpora.
The projects produced a large number of academic papers, most of which are highly specific or not readily accessible.

Overall Insights
One of the main problems with building lexicons is that each project develops its own format and chooses what information to include. WordNet is changing this. Building good ontologies may be the next important step, but there may be other (better or easier) ways.

WordNet

WordNet Online
The online interface has a field to type in a word and eight options that can be displayed or hidden. Every definition has related words, and you can view their definitions.

Example of Online Hybrid Hybridize

How does WordNet work?
WordNet is a large database containing words and their definitions. It also contains mappings between words, such as synonyms and antonyms, and it can tell how common or rare a word is in a particular sense.

What is not covered by WordNet?
WordNet does not include closed-class words: no pronouns, articles, conjunctions, prepositions, etc. The only types it covers are nouns, verbs, adjectives, and adverbs.

Example of how WordNet stores a word
index.sense:
  hybrid%1:05:00:: 01310936 3 0
  hybrid%1:09:00:: 05796358 2 0
  hybrid%1:10:00:: 06210172 1 0
  hybrid%5:00:00:crossbred:00 01973272 1 0
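Each index.sense line can be read as a sense key followed by a synset offset, a sense number, and a tag count. The sketch below splits such a line into named fields. The field names follow the WordNet sense-index documentation as we understand it; `SenseEntry` and `parse_sense_line` are hypothetical helper names introduced only for this illustration.

```python
from typing import NamedTuple

# Synset type codes used in the first part of a sense key.
SS_TYPES = {
    1: "noun",
    2: "verb",
    3: "adjective",
    4: "adverb",
    5: "adjective satellite",
}


class SenseEntry(NamedTuple):
    lemma: str
    ss_type: str
    lex_filenum: int
    lex_id: int
    head_word: str      # only filled in for adjective satellites
    synset_offset: str  # kept as a string to preserve leading zeros
    sense_number: int
    tag_cnt: int        # how often this sense was tagged in reference texts


def parse_sense_line(line: str) -> SenseEntry:
    """Parse one line of the form 'sense_key synset_offset sense_number tag_cnt'."""
    sense_key, synset_offset, sense_number, tag_cnt = line.split()
    lemma, lex_sense = sense_key.split("%", 1)
    ss_type, lex_filenum, lex_id, head_word, _head_id = lex_sense.split(":")
    return SenseEntry(
        lemma=lemma,
        ss_type=SS_TYPES[int(ss_type)],
        lex_filenum=int(lex_filenum),
        lex_id=int(lex_id),
        head_word=head_word,
        synset_offset=synset_offset,
        sense_number=int(sense_number),
        tag_cnt=int(tag_cnt),
    )


noun_sense = parse_sense_line("hybrid%1:05:00:: 01310936 3 0")
adj_sense = parse_sense_line("hybrid%5:00:00:crossbred:00 01973272 1 0")
print(noun_sense)
print(adj_sense)
```

The synset offset is what links this entry to the matching record in the data files for that part of speech.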

Example cont.
index.noun:
  hybrid n 3 4 @ ~ + ; 3 0 06210172 05796358 01310936
index.adj:
  hybrid a 1 1 & 1 0 01973272
data.adj:
  01973272 00 s 04 crossed 0 hybrid 0 interbred 0 intercrossed 0 001 & 01972954 a 0000 | produced by crossbreeding
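The index files list, for each word, how many synsets it belongs to, which kinds of pointers those synsets use, and the synset offsets themselves. The following is a minimal sketch of reading one such line, assuming the field order lemma, pos, synset_cnt, p_cnt, pointer symbols, sense_cnt, tagsense_cnt, synset offsets. `parse_index_line` and `POINTER_NAMES` are hypothetical names, and the pointer descriptions cover only the symbols that appear in the example.

```python
from typing import Dict, List, Union

# Readable names for the pointer symbols seen in the example lines only.
POINTER_NAMES = {
    "@": "hypernym",
    "~": "hyponym",
    "+": "derivationally related form",
    ";": "domain of synset",
    "&": "similar to",
}


def parse_index_line(line: str) -> Dict[str, Union[str, int, List[str]]]:
    """Parse one line of an index.<pos> file into a dictionary."""
    fields = line.split()
    lemma, pos = fields[0], fields[1]
    synset_cnt = int(fields[2])
    p_cnt = int(fields[3])
    ptr_symbols = fields[4:4 + p_cnt]
    rest = fields[4 + p_cnt:]  # sense_cnt, tagsense_cnt, then the offsets
    tagsense_cnt = int(rest[1])
    offsets = rest[2:2 + synset_cnt]
    return {
        "lemma": lemma,
        "pos": pos,
        "synset_cnt": synset_cnt,
        "pointers": [POINTER_NAMES.get(s, s) for s in ptr_symbols],
        "tagsense_cnt": tagsense_cnt,
        "synset_offsets": offsets,
    }


noun_entry = parse_index_line("hybrid n 3 4 @ ~ + ; 3 0 06210172 05796358 01310936")
adj_entry = parse_index_line("hybrid a 1 1 & 1 0 01973272")
print(noun_entry)
print(adj_entry)
```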

Example cont.
data.noun:
  06210172 10 n 03 loanblend 0 loan-blend 0 hybrid 0 003 @ 06203456 n 0000 ;r 08657546 n 0000 ;c 06868465 n 0000 | a word that is composed of parts from different languages (e.g., `monolingual' has a Greek prefix and a Latin root)
  05796358 09 n 01 hybrid 0 002 @ 05796126 n 0000 + 01417728 v 0103 | a composite of mixed origin; "the vice-presidency is a hybrid of administrative and legislative offices"
  01310936 05 n 03 hybrid 0 crossbreed 0 cross 0 007 @ 00004576 n 0000 + 01417728 v 0302 + 01417728 v 0201 + 01417728 v 0103 ~ 01311349 n 0000 ~ 01311480 n 0000 ~ 01311624 n 0000 | an organism that is the offspring of genetically dissimilar parents or stock; especially offspring produced by breeding plants or animals of different varieties or breeds or species; "a mule is a cross between a horse and a donkey"
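A data file record packs the synset's words, its pointers to other synsets, and its gloss into one line. The sketch below unpacks the second noun record from the example. It assumes the field order synset_offset, lex_filenum, ss_type, w_cnt (hexadecimal), word and lex_id pairs, p_cnt (decimal), four-field pointers, then the gloss after the vertical bar. `parse_data_line` is a hypothetical helper, and it ignores the verb frame fields that appear only in verb records.

```python
from typing import Any, Dict, List, Tuple


def parse_data_line(line: str) -> Dict[str, Any]:
    """Parse one synset record from a data.<pos> file (verb frames not handled)."""
    record, _, gloss = line.partition("|")
    fields = record.split()
    synset_offset, lex_filenum, ss_type = fields[0], fields[1], fields[2]

    # w_cnt is a two-digit hexadecimal count of the words in the synset.
    w_cnt = int(fields[3], 16)
    words = [fields[4 + 2 * k] for k in range(w_cnt)]  # skip each lex_id

    # p_cnt is a three-digit decimal count of the pointers that follow.
    i = 4 + 2 * w_cnt
    p_cnt = int(fields[i])
    i += 1
    pointers: List[Tuple[str, str, str, str]] = []
    for _ in range(p_cnt):
        symbol, target_offset, target_pos, source_target = fields[i:i + 4]
        pointers.append((symbol, target_offset, target_pos, source_target))
        i += 4

    return {
        "synset_offset": synset_offset,
        "lex_filenum": lex_filenum,
        "ss_type": ss_type,
        "words": words,
        "pointers": pointers,
        "gloss": gloss.strip(),
    }


SAMPLE = (
    "05796358 09 n 01 hybrid 0 002 @ 05796126 n 0000 + 01417728 v 0103 "
    '| a composite of mixed origin; "the vice-presidency is a hybrid of '
    'administrative and legislative offices"'
)
synset = parse_data_line(SAMPLE)
print(synset["words"], synset["pointers"])
print(synset["gloss"])
```

Following the `@` pointer's offset into data.noun leads to the hypernym synset, which is how the hierarchy is walked from the flat files.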

Example of Download Hybrid

References
ACQUILEX I and II. http://www.cl.cam.ac.uk/Research/NL/acquilex/acqhome.html (sponsored by the European Commission, centered at the University of Cambridge; last accessed 01/25/06)
Litkowski, Kenneth C. Computational Lexicons and Dictionaries. http://www.clres.com/online-papers/ell.doc (CL Research)
Dolan, et al. MindNet. http://research.microsoft.com/nlp/Projects/MindNet.aspx (Microsoft Research)
WordNet. http://wordnet.princeton.edu/ (Princeton University)