1
Heili Orav & Kadri Vider heili@ut.ee; kadriv@ut.ee
Concerning the Difference Between a Conception and its Application in the Case of the Estonian WordNet
2
Estonian WordNet Today Estonian wordnet contains:
3
Resources
Monolingual resources
  EKSS (Explanatory Dictionary of Estonian)
  Dictionaries of synonyms and antonyms via complex query of KeeleWeb (
Bilingual resources
  Estonian-English Dictionary
  English-Estonian Dictionary
Other tools of Estonian language technology
4
Word Sense Disambiguation
New challenge for EstWordNet
  100,000 tokens from the Corpus of Estonian Literary Language
  Morphologically disambiguated text
  Manual sense-tagging => PROBLEMS ARISE =>
5
Word senses in EstWN - too broad or too narrow?
?? over-generalisation, e.g.:
  kuduma * weave, tissue [of textiles; create a piece of cloth by interlacing strands, such as wool or cotton]
  kuduma * knit [make textiles by knitting]
  !!! Good to use translation equivalents as a test
?? over-differentiation, e.g.:
  kool 1 school [polysemic sense that applies both to the institution and the building]
  => kool 2 schoolhouse [school building]
  => kool 3 school [educational institution]
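The translation-equivalent test above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure used for EstWN: the function name `shared_equivalents` and the dictionary layout are hypothetical, and the data is limited to the two examples on this slide. The idea is that senses of one lemma that share an English equivalent are candidates for over-differentiation, while senses with disjoint equivalents are supported as separate senses.

```python
from itertools import combinations


def shared_equivalents(senses):
    """Return (sense_a, sense_b, shared) for each pair of senses of one
    lemma that have at least one translation equivalent in common.

    `senses` maps a sense label to its list of English equivalents.
    This layout is an assumption made for the sketch.
    """
    flagged = []
    for (id_a, eq_a), (id_b, eq_b) in combinations(sorted(senses.items()), 2):
        shared = set(eq_a) & set(eq_b)
        if shared:
            flagged.append((id_a, id_b, sorted(shared)))
    return flagged


# Examples taken from the slide; the sense labels are illustrative.
kool = {
    "kool 1": ["school"],        # institution and building
    "kool 2": ["schoolhouse"],   # school building
    "kool 3": ["school"],        # educational institution
}
kuduma = {
    "kuduma (weave)": ["weave", "tissue"],
    "kuduma (knit)": ["knit"],
}

print(shared_equivalents(kool))    # kool 1 and kool 3 share "school"
print(shared_equivalents(kuduma))  # no overlap: the two senses stay apart
```

A shared equivalent is only a hint for the lexicographer to review the pair; it does not decide by itself whether the senses should be merged.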
6
Metaphors
It is possible to distinguish between two main types of knowledge in the comprehension of a text (Õim, 1983): semantic knowledge is knowledge of extralinguistic reality; pragmatic knowledge is knowledge regulating communication (social norms, conventions).
Because EstWN is based on existing traditional dictionaries and a text corpus (providing usage information), one might suppose that the semantic information in the database reflects semantic knowledge.
Adding metaphors would turn the thesaurus into one that combines semantic and pragmatic knowledge, and it would increase the size of the thesaurus considerably. For this reason, we have so far tried to avoid adding metaphors.
7
CWC = Conceptual word combinations in EstWordNet
Phraseological unit - not the sum of its constituents, but a conceptual whole
Other word combinations - the meaning of the whole is the sum of its constituents
How to recognise them in text?
What is useful to annotate in a WSD task?
8
CWCs of different types
(1) Synonyms, e.g. {meenutama, meelde tuletama} ‘recall, remember’
(2) Specific hierarchical nodes, e.g. {ruumiline omadus} ‘spatial characteristic’, {üleloomulik olend} ‘supernatural creature’
(3) Technical terms, e.g. {ilmaütlev kääne} ‘abessive case’, {kreeka tähestik} ‘Greek alphabet’
(4) Explanations, e.g. {kultiveerima, kultuurina kasvatama} ‘cultivate, grow as a culture’
Only (1) and (3) are relevant from the perspective of the WSD task
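As a rough illustration of how the four types could be used to decide which CWCs to annotate, the sketch below labels the multi-word examples from this slide with their type and keeps only types (1) and (3). The names `CWCType`, `WSD_RELEVANT` and `wsd_relevant` are hypothetical and do not come from the EstWN database or its tools.

```python
from enum import Enum


class CWCType(Enum):
    SYNONYM = 1            # (1) synonyms
    HIERARCHICAL_NODE = 2  # (2) specific hierarchical nodes
    TECHNICAL_TERM = 3     # (3) technical terms
    EXPLANATION = 4        # (4) explanations


# Per the slide, only types (1) and (3) matter for the WSD task.
WSD_RELEVANT = {CWCType.SYNONYM, CWCType.TECHNICAL_TERM}

# Multi-word examples from the slide, tagged by hand for this sketch.
CWCS = [
    ("meelde tuletama", CWCType.SYNONYM),
    ("ruumiline omadus", CWCType.HIERARCHICAL_NODE),
    ("üleloomulik olend", CWCType.HIERARCHICAL_NODE),
    ("ilmaütlev kääne", CWCType.TECHNICAL_TERM),
    ("kreeka tähestik", CWCType.TECHNICAL_TERM),
    ("kultuurina kasvatama", CWCType.EXPLANATION),
]


def wsd_relevant(cwcs):
    """Keep only the word combinations whose type is relevant for WSD."""
    return [text for text, kind in cwcs if kind in WSD_RELEVANT]


print(wsd_relevant(CWCS))
```

The type of each combination still has to be assigned by a lexicographer; the filter only applies that decision.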
9
References
Kaalep, H.-J., & Muischnek, K. (2003). Inconsistent selectional criteria in semi-automatic multi-word unit extraction. In COMPLEX 2003, 7th conference on computational lexicography and corpus research (pp ). Budapest: Research Institute for Linguistics, Hungarian Academy of Sciences.
Kahusk, N., Orav, H., & Õim, H. (2001). Sensiting inflectionality: Estonian task for SENSEVAL-2. In Proceedings of SENSEVAL-2: Second International Workshop on Evaluating Word Sense Disambiguation Systems (pp ). Toulouse, France: CNRS-Institut de Recherche en Informatique de Toulouse, and Université des Sciences Sociales.
Kahusk, N., & Vider, K. (2002). Estonian wordnet benefits from word sense disambiguation. In Proceedings of the 1st international global wordnet conference (pp ). Mysore, India: Central Institute of Indian Languages.
Miller, G. A. (1979). Images and models, similes and metaphors. In A. Ortony (Ed.), Metaphor and thought (2nd ed.). Cambridge University Press.
Õim, A. (1993). Fraseoloogiasõnaraamat. Tallinn, Estonia: Eesti Teaduste Akadeemia Keele ja Kirjanduse Instituut.
Õim, H. (1983). Semantika i teoria ponimania jazyka. Analiz leksiki i tekstov direktivnogo obwenia estonskogo jazyka [Semantics and language understanding theory. Analysis of lexicon and texts of Estonian directive communication]. Doctoral dissertation, University of Tartu.