Published by Jordan Fowler. Modified over 9 years ago.
1
Use of WordNet and on-line dictionaries to build EN-SK synsets (experimental tool)
Ján GENČI
Technical University of Košice, Slovakia
genci@tuke.sk
2
Plan
–WordNet, EuroWordNet + Slovak language
–Motivation
–Solution
–Results
–Future plans
3
WordNet, EuroWordNet
–Both are well-known projects
–WordNet defines the meanings of English words and the relationships between them (it defines synsets)
–EuroWordNet (EWN) is a very similar multilingual project
–EWN doesn’t contain the Slovak language (a Slovak WN)
4
Motivation
Text classification tasks require reduction of dimensionality and intelligent search:
–Morphological database
–Something like WordNet
5
Our approach
We decided to try using on-line dictionaries to map Slovak meanings to WordNet synset entries.
Two approaches:
–Intersection of the translations of each member of an EN synset
–Intersection of the translations of related words
6
Architecture
(Diagram with the components: Input word, WordNet DB, local DB, Synset Builder, Internet on-line dictionaries)
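The slide only names the components, so the following is a minimal sketch of one way they could be wired together, not the tool's actual design. The class `SynsetBuilder`, its method names, the two lookup callables and all toy data are assumptions made for illustration: the builder asks a WordNet lookup for the synsets of the input word, translates each member through the on-line dictionary (consulting the local DB first), and intersects the translations.

```python
from typing import Callable, Dict, Iterable, List, Optional, Set


class SynsetBuilder:
    """Hypothetical wiring of the components named on the architecture slide."""

    def __init__(
        self,
        wordnet_lookup: Callable[[str], List[List[str]]],
        online_lookup: Callable[[str], Iterable[str]],
        local_db: Optional[Dict[str, Set[str]]] = None,
    ) -> None:
        self.wordnet_lookup = wordnet_lookup  # stands in for the WordNet DB
        self.online_lookup = online_lookup    # stands in for an on-line dictionary
        self.local_db = {} if local_db is None else local_db  # stands in for the local DB

    def translate(self, word: str) -> Set[str]:
        # Query the on-line dictionary only if the word is not stored locally.
        if word not in self.local_db:
            self.local_db[word] = set(self.online_lookup(word))
        return self.local_db[word]

    def build(self, word: str) -> List[Set[str]]:
        # One candidate set of translations per WordNet synset of the word.
        results = []
        for members in self.wordnet_lookup(word):
            partial = [self.translate(m) for m in members]
            results.append(set.intersection(*partial) if partial else set())
        return results


# Toy stand-ins (assumed data, not output of the real tool or dictionaries).
TOY_WORDNET = {"computer": [["computer", "data processor"], ["calculator", "computer"]]}
TOY_ONLINE = {
    "computer": {"počítač", "kalkulačka"},
    "data processor": {"počítač", "procesor"},
    "calculator": {"kalkulačka"},
}
calls: List[str] = []  # records which words reached the "on-line" dictionary


def fake_wordnet(word: str) -> List[List[str]]:
    return TOY_WORDNET.get(word, [])


def fake_online(word: str) -> Set[str]:
    calls.append(word)
    return TOY_ONLINE.get(word, set())


builder = SynsetBuilder(fake_wordnet, fake_online)
```

Calling `builder.build("computer")` twice reaches the fake on-line dictionary only during the first call; the second call is answered from the local store.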
7
Synset “members” translation
According to WN, the word computer has 2 meanings, specified by 2 synsets:
–{computer, computing machine, computing device, data processor, electronic computer, information processing system}
–{calculator, reckoner, figurer, estimator, computer}
The result is formed as the intersection of the translations of the synset members.
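The intersection step can be sketched in a few lines. This is an illustration under stated assumptions: the function name `translate_synset` is hypothetical, and `TOY_DICT` is invented toy data rather than output of the dictionaries the tool queries. Members missing from the dictionary are skipped, which is a design choice of this sketch.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy EN->SK dictionary; the entries are illustrative only.
TOY_DICT: Dict[str, Set[str]] = {
    "computer": {"počítač", "kalkulačka", "počtár"},
    "computing machine": {"počítač", "počítací stroj"},
    "data processor": {"počítač", "procesor"},
    "calculator": {"kalkulačka", "počtár"},
    "reckoner": {"počtár", "kalkulačka"},
}


def translate_synset(members: Iterable[str], dictionary: Dict[str, Set[str]]) -> Set[str]:
    """Intersect the translation sets of all synset members found in the dictionary."""
    translations = [dictionary[m] for m in members if m in dictionary]
    if not translations:
        return set()
    return set.intersection(*translations)


machine_sense = translate_synset(
    ["computer", "computing machine", "data processor"], TOY_DICT
)
person_sense = translate_synset(["calculator", "reckoner", "computer"], TOY_DICT)
single_member = translate_synset(["computer"], TOY_DICT)
```

With this toy data the first synset narrows to a single translation, while a synset with only one member cannot be narrowed at all: `single_member` is just the full translation list of that word.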
8
Translation of related words
Based on the hyponym/hypernym relationship between words:
–Related words are translated
–The result is formed as the intersection of the partial translations
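The slide does not say exactly which translations are intersected, so the sketch below shows one possible reading: the translations of the input word are intersected with the translations of the words related to one of its senses. The function name `translate_via_related` and the toy dictionary are assumptions made for illustration. Related words absent from the dictionary are skipped, so that a missing entry does not empty the result.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy EN->SK dictionary; the entries are illustrative only.
TOY_DICT: Dict[str, Set[str]] = {
    "table": {"stôl", "tabuľka"},
    "desk": {"stôl", "písací stôl"},
    "array": {"pole", "tabuľka"},
}


def translate_via_related(
    word: str, related_words: Iterable[str], dictionary: Dict[str, Set[str]]
) -> Set[str]:
    """Intersect the translations of a word with those of its related words."""
    if word not in dictionary:
        return set()
    partial = [dictionary[word]]
    # Skip related words the dictionary does not know.
    partial += [dictionary[r] for r in related_words if r in dictionary]
    return set.intersection(*partial)


furniture_sense = translate_via_related("table", ["desk"], TOY_DICT)
data_sense = translate_via_related("table", ["array"], TOY_DICT)
```

In this toy setup the related word acts as a disambiguator: each sense of table keeps only the translation it shares with its related word.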
9
Results
We support 4 Slovak and 2 Czech on-line dictionaries (the Slovak dictionaries seem to come from one source).
The result depends on:
–Number of members in the synset (a single member is a problem)
–Related words
–Quality(?) of the dictionary
10
Results (cont.)
–Parts of speech are sometimes mixed (nouns and adjectives)
–We implemented a “multilingual view”
–The approach is time-consuming (quite slow), so results are stored in the database
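Storing results so that slow on-line lookups are not repeated could look roughly like the sketch below. It uses Python's standard `sqlite3` module; the table name `translation`, its columns, and the helper names `open_store` and `cached_translate` are assumptions for illustration, since the slides do not describe the tool's actual database schema.

```python
import sqlite3
from typing import Callable, Iterable, Set


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a small store for translations. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS translation ("
        "word TEXT NOT NULL, target TEXT NOT NULL, "
        "PRIMARY KEY (word, target))"
    )
    return conn


def cached_translate(
    conn: sqlite3.Connection, word: str, fetch: Callable[[str], Iterable[str]]
) -> Set[str]:
    """Return stored translations, calling the slow `fetch` only on a miss."""
    rows = conn.execute(
        "SELECT target FROM translation WHERE word = ?", (word,)
    ).fetchall()
    if rows:
        return {row[0] for row in rows}
    targets = set(fetch(word))
    conn.executemany(
        "INSERT OR IGNORE INTO translation (word, target) VALUES (?, ?)",
        [(word, t) for t in targets],
    )
    conn.commit()
    return targets


# Toy usage with a stand-in for the slow on-line lookup (assumed data).
fetch_calls = []


def slow_lookup(word: str) -> Set[str]:
    fetch_calls.append(word)
    return {"počítač"} if word == "computer" else set()


store = open_store()
first = cached_translate(store, "computer", slow_lookup)
second = cached_translate(store, "computer", slow_lookup)
```

One limitation of this sketch: a word with no translations leaves no rows behind, so it is fetched again on every call.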
11
Examples: word computer
15
Example: word table
17
Future work (plans)
–Deal with the “dictionary problem”
–Eliminate mixed parts of speech in the results (at least for Slovak, using a morphological database)
–Connect other languages
18
Local copy of new webpage
Addresses:
–http://ruzin.fei.tuke.sk/~laposp
–http://ruzin.fei.tuke.sk/~sudynova (new one)
19
Thank you!