Use of WordNet and on-line dictionaries to build EN-SK synsets (experimental tool) Ján GENČI Technical University of Košice, Slovakia


2 Plan
– WordNet, EuroWordNet + Slovak language
– Motivation
– Solution
– Results
– Future plans

3 WordNet, EuroWordNet
– Well-known projects
– WordNet defines the meanings of English words and the relationships between them (it defines synsets)
– EuroWordNet (EWN) is a very similar multilingual project
– EWN does not include the Slovak language (Slovak WN)

4 Motivation
Text classification tasks require dimensionality reduction and intelligent search, which in turn need:
– A morphological database
– Something like WordNet

5 Our approach
We decided to try using on-line dictionaries to map Slovak meanings to WordNet synset entries.
Two approaches:
– Intersection of the translations of each member of an EN synset
– Intersection of the translations of related words

6 Architecture
[Diagram components: input word, Synset Builder, WordNet DB, local DB, Internet (on-line dictionaries)]

7 Synset “members” translation
According to WN, the word “computer” has 2 meanings, specified by 2 synsets:
– {computer, computing machine, computing device, data processor, electronic computer, information processing system}
– {calculator, reckoner, figurer, estimator, computer}
The result is formed as the intersection of the translations of the synset members.
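
The intersection step can be illustrated with a minimal Python sketch. It is not the tool's actual code. The dictionary contents, `TOY_DICTIONARY`, `translate` and `translate_synset` are hypothetical names and toy data made up for illustration, and skipping members that have no translation is an assumption rather than something the slides state.

```python
# Hypothetical toy dictionary: English term -> set of Slovak candidates.
# A real run would query an on-line dictionary instead of this table.
TOY_DICTIONARY = {
    "computer": {"počítač", "kalkulačka", "počtár"},
    "computing machine": {"počítač", "počítací stroj"},
    "data processor": {"počítač", "procesor"},
    "calculator": {"kalkulačka", "počtár"},
    "reckoner": {"počtár", "kalkulačka"},
}


def translate(term):
    """Return the candidate translations of one English term (may be empty)."""
    return set(TOY_DICTIONARY.get(term, set()))


def translate_synset(members):
    """Intersect the translations of all synset members.

    Assumption: members without any translation are skipped, so that one
    missing dictionary entry does not empty the whole result.
    """
    non_empty = [t for t in (translate(m) for m in members) if t]
    if not non_empty:
        return set()
    return set.intersection(*non_empty)


machine_sense = translate_synset(["computer", "computing machine", "data processor"])
person_sense = translate_synset(["calculator", "reckoner", "computer"])
```

With this toy data, each call keeps only the candidates shared by every translated member, so the two senses of “computer” end up with different Slovak candidates.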

8 Translation of related words
Based on the hyponym/hyperonym relationship between words:
– Related words are translated
– The result is formed as the intersection of the partial translations
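
One possible reading of this step is sketched below in Python. The slides do not say exactly which sets are intersected, so this is an assumption: the translations of the word itself are narrowed by the translations of each related word. `TOY_DICTIONARY`, `translate_by_related`, the related-word lists and all translations are hypothetical toy data, not output of WordNet or of a real dictionary.

```python
# Hypothetical toy dictionary: English term -> set of Slovak candidates.
TOY_DICTIONARY = {
    "table": {"stôl", "tabuľka"},
    "desk": {"stôl", "písací stôl"},
    "dining table": {"stôl", "jedálenský stôl"},
    "array": {"pole", "tabuľka"},
}


def translate_by_related(word, related_words, dictionary):
    """Narrow the translations of `word` using its related words.

    Assumed interpretation: each related word (hyponym or hyperonym) yields
    a partial translation, and the result is the intersection of the word's
    own translations with every non-empty partial translation.
    """
    result = set(dictionary.get(word, ()))
    for related in related_words:
        partial = set(dictionary.get(related, ()))
        if partial:  # ignore related words the dictionary does not know
            result &= partial
    return result


# Related words for two senses of "table" (toy lists chosen by hand).
furniture_sense = translate_by_related("table", ["desk", "dining table"], TOY_DICTIONARY)
data_sense = translate_by_related("table", ["array"], TOY_DICTIONARY)
```

Under this reading, related words can help when a synset has only one member, because they still supply several translations to intersect.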

9 Results
We support 4 Slovak and 2 Czech on-line dictionaries (the Slovak dictionaries seem to come from one source).
The result depends on:
– The number of members in the synset (a single member is a problem)
– Related words
– The quality(?) of the dictionary

10 Results (cont.)
– Parts of speech are sometimes mixed (nouns and adjectives)
– We implemented a “multilingual view”
– The approach is time-consuming (quite slow), so results are stored in the database
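
Storing results so that a slow on-line lookup runs only once per term could look like the following Python sketch. The slides do not describe the tool's database, so the SQLite schema, the `lookups` table and the functions `open_cache` and `cached_translate` are all assumptions made for illustration.

```python
import json
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a small cache database; the schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS lookups ("
        "term TEXT PRIMARY KEY, targets TEXT NOT NULL)"
    )
    return conn


def cached_translate(conn, term, slow_lookup):
    """Return translations of `term`, calling `slow_lookup` only on a cache miss."""
    row = conn.execute(
        "SELECT targets FROM lookups WHERE term = ?", (term,)
    ).fetchone()
    if row is not None:
        return set(json.loads(row[0]))
    targets = set(slow_lookup(term))
    # Store the result (even an empty one) as a JSON list for later reuse.
    conn.execute(
        "INSERT INTO lookups (term, targets) VALUES (?, ?)",
        (term, json.dumps(sorted(targets))),
    )
    conn.commit()
    return targets


# Usage with a stand-in for the slow on-line dictionary query.
calls = []


def fake_online_lookup(term):
    calls.append(term)
    return {"počítač"}


conn = open_cache()
first = cached_translate(conn, "computer", fake_online_lookup)
second = cached_translate(conn, "computer", fake_online_lookup)
```

The second call is answered from the database, so the stand-in lookup is invoked only once.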

11 Example: the word “computer”

15 Example: the word “table”

17 Future work (plans)
– Deal with the “dictionary problem”
– Eliminate mixed parts of speech in the results (at least for Slovak, using a morphological database)
– Connect other languages

18 Local copy of new webpage Addresses – – (new one)

19 Thank you!