Use of WordNet and on-line dictionaries to build EN-SK synsets (experimental tool) Ján GENČI Technical University of Košice, Slovakia


1 Use of WordNet and on-line dictionaries to build EN-SK synsets (experimental tool) Ján GENČI Technical University of Košice, Slovakia genci@tuke.sk

2 Plan: WordNet, EuroWordNet + Slovak language; Motivation; Solution; Results; Future plans

3 WordNet, EuroWordNet: Both are well-known projects. WordNet defines the meanings of English words and the relationships between them (it defines synsets). EuroWordNet (EWN) is a very similar multilingual project. EWN does not include Slovak (there is no Slovak WordNet).

4 Motivation: Text classification tasks (which require dimensionality reduction) and intelligent search need: –a morphological database –something like WordNet

5 Our approach: We decided to try using on-line dictionaries to map Slovak meanings to WordNet synset entries. Two approaches: –intersection of the translations of each member of an EN synset –intersection of the translations of related words

6 Architecture (diagram components): input word, WordNet DB, local DB, Synset Builder, Internet, on-line dictionaries

7 Synset “members” translation: According to WordNet, the word computer has 2 meanings, specified by 2 synsets: –{computer, computing machine, computing device, data processor, electronic computer, information processing system} –{calculator, reckoner, figurer, estimator, computer}. The result is formed as the intersection of the translations of the synset members.
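The intersection step can be illustrated with a minimal Python sketch. It is not the tool's actual code: the dictionary entries in `TOY_DICT` are invented for illustration, the names `translate` and `synset_translation` are hypothetical, and skipping synset members that have no dictionary entry is an assumption the slides do not confirm.

```python
from functools import reduce

# Hypothetical toy dictionary: English word -> set of Slovak translations.
# The entries are illustrative only and were not taken from a real dictionary.
TOY_DICT = {
    "computer": {"počítač", "kalkulátor", "počtár"},
    "computing machine": {"počítač", "počítací stroj"},
    "electronic computer": {"počítač", "elektronický počítač"},
    "calculator": {"kalkulačka", "kalkulátor", "počtár"},
    "reckoner": {"počtár"},
}


def translate(word, dictionary):
    """Return the set of translations for a word (empty set if unknown)."""
    return set(dictionary.get(word, set()))


def synset_translation(members, dictionary):
    """Intersect the translations of all synset members.

    Assumption: members with no dictionary entry are skipped, otherwise
    a single missing word would make every result empty.
    """
    translated = [translate(m, dictionary) for m in members]
    translated = [t for t in translated if t]
    if not translated:
        return set()
    return reduce(set.intersection, translated)


sense_machine = ["computer", "computing machine", "electronic computer"]
sense_person = ["calculator", "reckoner", "computer"]
print(synset_translation(sense_machine, TOY_DICT))
print(synset_translation(sense_person, TOY_DICT))
```

With this toy data the two senses of computer narrow to different Slovak candidates, while a synset with a single member simply returns every translation of that word, so nothing is disambiguated.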

8 Translation of related words: Based on the hyponym/hypernym relationship between words: –related words are translated –the result is formed as the intersection of the partial translations

9 Results: The tool provides access to 4 Slovak and 2 Czech on-line dictionaries (the Slovak dictionaries seem to come from one source). The result depends on: –the number of members in the synset (a single member is a problem) –the related words –the quality(?) of the dictionary

10 Results (cont.): Parts of speech are sometimes mixed (nouns and adjectives). We implemented a “multilingual view”. The approach is time-consuming (quite slow), so results are stored in the database.
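Storing results so that slow on-line lookups are not repeated could look roughly like the following Python sketch. The schema, the function names `open_cache` and `cached_translate`, and the `fetch` callback standing in for an on-line dictionary query are all assumptions; the slides do not describe the actual database layout.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a small SQLite cache for dictionary lookups."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS translations ("
        "word TEXT NOT NULL, translation TEXT NOT NULL, "
        "PRIMARY KEY (word, translation))"
    )
    # Remember which words were already looked up, even with no results.
    conn.execute("CREATE TABLE IF NOT EXISTS looked_up (word TEXT PRIMARY KEY)")
    return conn


def cached_translate(conn, word, fetch):
    """Return translations of word, calling fetch(word) only on a cache miss."""
    seen = conn.execute(
        "SELECT 1 FROM looked_up WHERE word = ?", (word,)
    ).fetchone()
    if seen is None:
        results = set(fetch(word))  # the slow on-line lookup (hypothetical)
        with conn:  # commit both inserts as one transaction
            conn.execute("INSERT INTO looked_up (word) VALUES (?)", (word,))
            conn.executemany(
                "INSERT OR IGNORE INTO translations (word, translation) "
                "VALUES (?, ?)",
                [(word, t) for t in results],
            )
    rows = conn.execute(
        "SELECT translation FROM translations WHERE word = ?", (word,)
    )
    return {row[0] for row in rows}
```

A second request for the same word is answered from the local table, so the on-line dictionary is queried at most once per word.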

11 Example: the word computer

15 Example: the word table

17 Future work (plans): –deal with the “dictionary problem” –eliminate mixed parts of speech in the results (at least for Slovak, using a morphological database) –connect other languages
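One way the planned part-of-speech filtering might work is sketched below in Python. This is only an illustration of the idea: `TOY_MORPH_DB` is a hypothetical stand-in for a morphological database, and the function `filter_by_pos` and its `keep_unknown` option are assumptions, not part of the described tool.

```python
# Hypothetical stand-in for a morphological database:
# Slovak word -> set of part-of-speech tags.
TOY_MORPH_DB = {
    "počítač": {"noun"},
    "počítačový": {"adj"},
    "stôl": {"noun"},
}


def filter_by_pos(candidates, pos, morph_db, keep_unknown=False):
    """Keep only candidate translations whose part of speech matches pos.

    Words missing from morph_db are dropped unless keep_unknown is True.
    """
    kept = set()
    for word in candidates:
        tags = morph_db.get(word)
        if tags is None:
            if keep_unknown:
                kept.add(word)
        elif pos in tags:
            kept.add(word)
    return kept


# A noun synset should not keep the adjective.
print(filter_by_pos({"počítač", "počítačový"}, "noun", TOY_MORPH_DB))
```

Whether unknown words should be kept or dropped is a design choice that depends on how complete the morphological database is.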

18 Local copy of new webpage. Addresses: –http://ruzin.fei.tuke.sk/~laposp –http://ruzin.fei.tuke.sk/~sudynova (new one)

19 Thank you!

