1
Semantic distinction of lexical compounds in Information Retrieval. Anselmo Peñas, Julio Gonzalo and Felisa Verdejo. Dpto. Lenguajes y Sistemas Informáticos, UNED. SEPLN 2002, Valladolid
2
Content: Phrase indexing in IR; Types of lexical compounds; Automatic classification through WordNet; Retrieval experiments; Conclusions; Future work
3
Phrase indexing in IR. Literature: phrase indexing doesn't improve retrieval; give partial credit to components; proximity is better than adjacency. Reason: compositional meaning of phrases; no phrase distinction. Our proposal: semantic distinction of phrases
4
Types of lexical compounds (in English). Endocentric compound: toothed whale. One component gives the nuclear meaning (whale): toothed whale is_a_type_of whale
5
Types of lexical compounds (in English). Appositional compound: folk song. All components add nuclear meaning: folk song is_a_type_of song; folk song is_a_type_of folk
6
Types of lexical compounds (in English). Exocentric compound: mentally retarded. No component gives the nuclear meaning: mentally retarded is_a_type_of people. However, the components retain their syntactic properties
7
Types of lexical compounds (in English). Copulative compound: Miguel de Cervantes. No component gives the nuclear meaning. Refers to entities. Loss of syntactic properties
8
Automatic classification through WordNet. Endocentric: one component is a hypernym. Appositional: all components are hypernyms. Exocentric: no component is a hypernym. Copulative: not in WordNet (Entity Recognition). Examples: folk song is_a song, folk song is_a folk: Appositional; mentally retarded: Exocentric; Miguel de Cervantes: Copulative; toothed whale is_a whale: Endocentric
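The four rules on this slide can be sketched in a few lines of Python. This is only an illustration: the dictionary TOY_HYPERNYMS is a hand-made stand-in for WordNet filled with the slide's own examples (not real WordNet data), the function name classify_compound is hypothetical, and treating "some but not all components are hypernyms" as endocentric is an assumption that only matters for compounds with more than two components.

```python
from typing import Dict, Set

# Toy stand-in for WordNet: compound -> hypernyms of the compound.
# Assumption: these entries only mirror the slide's examples.
TOY_HYPERNYMS: Dict[str, Set[str]] = {
    "toothed whale": {"whale"},
    "folk song": {"song", "folk"},
    "mentally retarded": {"people"},
}


def classify_compound(compound: str,
                      hypernyms: Dict[str, Set[str]] = TOY_HYPERNYMS) -> str:
    """Classify a compound by checking which components are its hypernyms."""
    # Not in the lexicon at all: treated as copulative (entity-like).
    if compound not in hypernyms:
        return "copulative"
    components = compound.split()
    hits = [c for c in components if c in hypernyms[compound]]
    if not hits:
        return "exocentric"      # no component is a hypernym
    if len(hits) == len(components):
        return "appositional"    # every component is a hypernym
    return "endocentric"         # some, but not all, components are hypernyms


for example in ["toothed whale", "folk song",
                "mentally retarded", "Miguel de Cervantes"]:
    print(example, "->", classify_compound(example))
```

With a real lexical database, the dictionary lookup would be replaced by a query for the compound's hypernym chain; the decision logic would stay the same.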
9
Effect on retrieval. Compare: (1) plain text; (2) all compounds; (3) only exocentric compounds. Experiment conditions: INQUERY search engine; OHSUMED collection, 380 MB; 101 queries; medical domain
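The three compared conditions can be illustrated as three ways of preparing the same text before indexing. The sketch below is an assumption-laden illustration, not the setup used with INQUERY: joining a compound with an underscore, the function name mark_compounds, the CLASSES table and the sample sentences are all hypothetical.

```python
import re
from typing import Dict

# Hypothetical table of already classified compounds (lowercase).
CLASSES: Dict[str, str] = {
    "toothed whale": "endocentric",
    "folk song": "appositional",
    "mentally retarded": "exocentric",
}


def mark_compounds(text: str, classes: Dict[str, str], condition: str) -> str:
    """Rewrite text for one condition: 'plain', 'all' or 'exocentric'."""
    if condition == "plain":
        return text
    if condition == "all":
        selected = list(classes)
    elif condition == "exocentric":
        selected = [c for c, kind in classes.items() if kind == "exocentric"]
    else:
        raise ValueError(f"unknown condition: {condition}")
    # Longest compounds first, so a longer phrase is not split by a shorter one.
    for phrase in sorted(selected, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, phrase.replace(" ", "_"), text)
    return text


sample = "folk song about a toothed whale"
for condition in ("plain", "all", "exocentric"):
    print(condition, "->", mark_compounds(sample, CLASSES, condition))
```

Under the third condition, only compounds whose meaning cannot be derived from their components are indexed as single terms; the rest remain ordinary words.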
11
Conclusions. Automatic classification of lexical compounds in WordNet with semantic criteria. Exocentric and endocentric compounds behave differently. Detecting exocentric compounds in queries seems to increase precision slightly. The results are not significant. Very few queries contain exocentric compounds (7%)
12
Future work. Try a test collection with longer queries (narrative in TREC topics). Detect exocentric compounds in a pseudo-relevance feedback framework
14
Phrase indexing. Texts: "...a guide for the fisher who..."; "...information on cat care..."; "...arboreal carnivorous called fisher cat...". Plain indexing: the query "fisher" is matched against "...a guide for the fisher who...", "...arboreal carnivorous called fisher cat..." and "...information on cat care...". Phrase indexing: the query "fisher" is matched against "...a guide for the fisher who...", "...arboreal carnivorous called fisher_cat..." and "...information on cat care..."
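The difference between the two indexing modes on this slide can be reproduced with a tiny inverted index. This is a minimal sketch over the slide's three texts; the helper names tokenize and build_index are hypothetical, and the underscore joining simply follows the fisher_cat notation shown on the slide.

```python
import re
from typing import Dict, List, Sequence, Set

# The three example texts from the slide.
TEXTS = [
    "...a guide for the fisher who...",
    "...information on cat care...",
    "...arboreal carnivorous called fisher cat...",
]


def tokenize(text: str, phrases: Sequence[str] = ()) -> List[str]:
    """Lowercase the text, join each listed phrase with '_', split into terms."""
    text = text.lower()
    for phrase in phrases:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, phrase.replace(" ", "_"), text)
    return re.findall(r"[a-z_]+", text)


def build_index(texts: Sequence[str],
                phrases: Sequence[str] = ()) -> Dict[str, Set[int]]:
    """Map each term to the set of text ids that contain it."""
    index: Dict[str, Set[int]] = {}
    for doc_id, text in enumerate(texts):
        for term in tokenize(text, phrases):
            index.setdefault(term, set()).add(doc_id)
    return index


plain_index = build_index(TEXTS)
phrase_index = build_index(TEXTS, phrases=["fisher cat"])

print("plain  :", sorted(plain_index.get("fisher", set())))
print("phrases:", sorted(phrase_index.get("fisher", set())))
```

With plain indexing the term "fisher" is also found in the text about the fisher cat; once "fisher cat" is indexed as a single term, the query "fisher" no longer reaches that text.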