Joint SemEval/CLEF tasks: Contribution of WSD to CLIR
CLEF 2007, Budapest
UBC: Agirre, Lopez de Lacalle, Otegi, Rigau
FBK: Magnini
Irion Technologies: Vossen
WSD and SemEval
Word Sense Disambiguation
  "When I went to bed at around two o'clock that night, everyone else was still out at the party."
    party:N:1 political organization
    party:N:2 social event
  Potential for more precise expansion (translation)
SemEval 2007
  Framework for semantic evaluations
  Under the auspices of SIGLEX (ACL)
  19 tasks, incl. WSD, SRL, full frames, people, …
  > 100 attendees at the ACL workshop
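As a minimal illustration of the ambiguity above, the sketch below lists the noun senses of "party" using NLTK's WordNet interface. This is an assumption for illustration only: NLTK ships WordNet 3.0, whereas the task used WordNet 1.6 senses, and it requires the nltk package and its wordnet data to be installed.

```python
from nltk.corpus import wordnet as wn

# List the noun senses of "party". A WSD system must pick one of these
# synsets for each occurrence, e.g. the "social event" sense in the
# example sentence rather than the "political organization" sense.
for synset in wn.synsets("party", pos=wn.NOUN):
    print(synset.name(), "-", synset.definition())
```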
Motivation for the task
WSD perspective
  In-vitro evaluations not fully satisfactory
  In-vivo evaluations in applications (MT, IR, …)
IR perspective
  Usefulness of WSD for IR/CLIR disputed, but …
    Real rather than artificial experiments
    Expansion rather than just WSD
    Weighted list of senses rather than best sense only
    Controlling which words to disambiguate
  WSD technology has improved
    Coarser-grained senses (90% accuracy at SemEval 2007)
Motivation for the task
Combining WSD and IR: many possible variations
  Unfeasible for a single research team
  A public common dataset allows the community to explore different combinations
Tasks where we could hope for a positive impact:
  High-recall IR scenarios
  Short-passage IR scenarios
  Q&A
  CLIR
We selected CLIR because of the previous expertise of some of the organizers.
Two-stage framework
First stage (SemEval 2007 task 01):
  Participants: submit WSD results
    Sense inventory: WordNet 1.6 (multilinguality)
  Organizers:
    Expansion / translation strategy fixed
    IR/CLIR system fixed (IR as upper bound)
Second stage (proposed CLEF 2008 track):
  Organizers: provide several WSD annotations
  Participants: submit CLIR results with/without WSD annotations
Outline
  Description of the SemEval task (1st stage)
  Evaluation of results (1st stage)
  Conclusions (1st stage)
  Next step (2nd stage)
Description of the task
Datasets: CLEF data
  Documents in English: LA94, GH95
    170,000 documents, 580 MB of raw text
  300 topics, in both English and Spanish
  Existing relevance judgments
  Due to time limitations of the exercise:
    16.6% of the document collection (we will have 100% shortly)
    Subset of the relevance judgments, 201 topics
Description of the task
Two subtasks for participants: English WSD of
  the document collection
  the topics
We limit ourselves to English for the time being.
Return WordNet 1.6 senses.
Description of the task
Steps of the CLIR/IR system
Step 1: Participants return WSD results
Step 2: Expansion / translation
  Multilingual Central Repository (based on EuroWordNet)
    5 languages tightly connected to ILI concepts (WN 1.6 synsets)
    Mappings to other WordNet versions
  Example: car, sense 1
    Expanded to synonyms: automobile
    Translated to equivalents: auto, coche
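A rough sketch of this expansion / translation step, under stated assumptions: it uses NLTK's WordNet 3.0 plus the Open Multilingual WordNet (omw-1.4 data) as a stand-in for the Multilingual Central Repository and its WordNet 1.6 ILI concepts, so synset names and coverage differ from the actual task data.

```python
from nltk.corpus import wordnet as wn

# Sense 1 of "car": expand to English synonyms and translate to Spanish
# equivalents via the cross-lingual lemmas attached to the same synset.
synset = wn.synset("car.n.01")
english_expansion = [lemma.name() for lemma in synset.lemmas()]         # e.g. automobile
spanish_translation = [lemma.name() for lemma in synset.lemmas("spa")]  # e.g. auto, coche

print(english_expansion)
print(spanish_translation)
```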
Description of the task
Steps of the CLIR/IR system
Step 3: IR/CLIR system
  Adaptation of TwentyOne (Irion)
    Pre-processing: XML
    Indexing: detected noun phrases only
    Title and description used for queries
    Stripped down to vector-space matching
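For illustration only, a bare-bones vector-space retrieval sketch (scikit-learn), not the actual TwentyOne system: documents and queries become tf-idf vectors and are ranked by cosine similarity. The toy texts below are invented; the real system indexed detected noun phrases over the LA94/GH95 collections and used the title and description fields of the CLEF topics as queries.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy collection and query, purely for illustration.
documents = [
    "new bicycle helmet law for youngsters takes effect",
    "Nestle markets food brands around the world",
]
query = ["Nestle Brands"]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(documents)   # index the collection
query_vector = vectorizer.transform(query)          # vectorize the query

scores = cosine_similarity(query_vector, doc_vectors)[0]
ranking = scores.argsort()[::-1]                    # best-matching document first
print(ranking, scores[ranking])
```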
Description of the task
Three evaluation settings
  IR with WSD of documents (English)
    WSD of the English documents
    Expansion of the senses in the documents
  IR with WSD of topics (English)
    WSD of the English topics
    Expansion of the senses in the topics
  (IR serves as an upper bound for CLIR)
  CLIR with WSD of documents
    WSD of the English documents
    Translation of the English documents
    Retrieval using the Spanish topics
  (No CLIR with WSD of topics: no Spanish WSD)
Evaluation and results
Participant systems
  Participants returned sense-tagged documents and topics
  Two systems participated:
    PUTOP, from Princeton: unsupervised
    UNIBA, from Bari: knowledge-based, using WordNet
  In-house system: ORGANIZERS, supervised, kNN classifiers
  Other baselines (see the sketch below):
    noexp: original text, no expansion
    fullexp: expand to all senses
    wsdrand: return a sense at random
    1st: return the first sense in WordNet
    wsd50: 50% best senses (in-house WSD system only)
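A small sketch to make the sense-assignment baselines concrete, assuming NLTK's WordNet 3.0 interface (the task itself used WordNet 1.6 senses) and ignoring tokenisation and part-of-speech tagging details.

```python
import random
from nltk.corpus import wordnet as wn

def first_sense(word, pos=wn.NOUN):
    """'1st' baseline: always return the first WordNet sense of the word."""
    synsets = wn.synsets(word, pos=pos)
    return synsets[0] if synsets else None

def random_sense(word, pos=wn.NOUN):
    """'wsdrand' baseline: return one of the word's senses at random."""
    synsets = wn.synsets(word, pos=pos)
    return random.choice(synsets) if synsets else None

def all_senses(word, pos=wn.NOUN):
    """'fullexp' baseline: keep every sense (full expansion); 'noexp' keeps none."""
    return wn.synsets(word, pos=pos)

print(first_sense("party"), random_sense("party"), len(all_senses("party")))
```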
Evaluation and results
S2AW and S3AW control (Senseval-2 / Senseval-3 all words)
  Indication of the performance of the WSD systems
  Not necessarily correlated with IR/CLIR results
  The supervised system (ORG) fares better

                          Prec.   Recall   Cov.
  Senseval-2 all words
    ORG                   0.584   0.577    93.61%
    UNIBA                 0.498   0.375    75.39%
    PUTOP                 0.388   0.240    61.92%
  Senseval-3 all words
    ORG                   0.591   0.566    95.76%
    UNIBA                 0.484   0.338    69.98%
    PUTOP                 0.334   0.186    55.68%
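A simplified sketch of how these three figures relate, assuming each instance receives a single sense (the official Senseval scorer also handles weighted multiple answers): precision is computed over the answered instances, recall over all gold instances, and coverage is the fraction of instances answered. The instance ids and sense labels below are invented.

```python
def wsd_scores(system, gold):
    """system, gold: dicts mapping instance id -> sense; system may skip instances."""
    answered = [i for i in gold if i in system]
    correct = sum(1 for i in answered if system[i] == gold[i])
    precision = correct / len(answered) if answered else 0.0
    recall = correct / len(gold) if gold else 0.0
    coverage = len(answered) / len(gold) if gold else 0.0
    return precision, recall, coverage

# Toy example: 3 gold instances, 2 answered, 1 correct.
print(wsd_scores({"d01.i1": "party.n.2", "d01.i2": "bank.n.1"},
                 {"d01.i1": "party.n.2", "d01.i2": "bank.n.2", "d01.i3": "car.n.1"}))
```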
Evaluation and results
Results (Mean Average Precision, MAP)
  IR: noexp best
  CLIR: fullexp best, ORG close behind; both far below monolingual IR
  Expansion and IR/CLIR system too simple

                 IRtops   IRdocs   CLIR
  noexp          0.3599   0.3599   0.1446
  fullexp        0.1610   0.1410   0.2676
  UNIBA          0.3030   0.1521   0.1373
  PUTOP          0.3036   0.1482   0.1734
  wsdrand        0.2673   0.1482   0.2617
  1st            0.2862   0.1172   0.2637
  ORG            0.2886   0.1587   0.2664
  wsd50          0.2651   0.1479   0.2640
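As a reference for the table above, a minimal sketch of Mean Average Precision: for each topic, precision is taken at every rank where a relevant document appears and averaged over the relevant set; MAP is the mean of these per-topic values. The document ids below are invented.

```python
def average_precision(ranked_docs, relevant):
    """Average precision for one topic: ranked_docs is the system ranking,
    relevant is the set of documents judged relevant for the topic."""
    hits, precisions = 0, []
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """runs: list of (ranked_docs, relevant_set) pairs, one per topic."""
    return sum(average_precision(docs, rel) for docs, rel in runs) / len(runs)

# Toy example with two topics.
print(mean_average_precision([
    (["d1", "d2", "d3"], {"d1", "d3"}),
    (["d4", "d5", "d6"], {"d5"}),
]))
```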
Analysis
Number of words in the expansion of the documents
  IR: the fewer, the better (but)
    MAP: noexp > ORG > UNIBA
    Millions of words: noexp < … < ORG
  CLIR: the more, the better (but)
    MAP: fullexp > ORG > ORG(50)
    Millions of words: fullexp > ORG(50) > ORG
  WSD allows for more informed expansion

  Millions of words           Eng.   Sp.
  No WSD      noexp              9     9
              fullexp           93    58
  UNIBA       wsdbest           19    17
              wsd50             19    17
  PUTOP       wsdbest           20    16
              wsd50             20    16
  Baseline    1st               24    20
              wsdrand           24    19
  ORG.        wsdbest           26    21
              wsd50             36    27
Conclusions
Main goals met:
  First attempt at evaluating WSD on CLIR
  Large dataset prepared and preprocessed
  WSD allows for more informed expansion
On the negative side:
  Participation low (SemEval overload; 10 teams expressed interest)
  No improvement over the baseline
  Expansion and IR/CLIR system naive
Next stage: CLEF 2008
WSD results provided:
  WSD of the whole collection
  Best WSD systems in SemEval 2007
CLEF teams will be able to try more sophisticated IR/CLIR methods
Feasibility of a Q/A exercise
Suggestions for cooperation on other tasks welcome
Thank you for your attention!
http://ixa2.si.ehu.es/semeval-clir
Description of the task
Sample topic
  C301
  Nestlé Brands
  What brands are marketed by Nestlé around the world?
  Relevant articles will report the name of goods marketed globally by Nestlé or by companies belonging to the Nestlé group. In the second case, the document must make clear reference to the parent company.
Description of the task
Sample document
  LA010194-0001
  000013
  Los Angeles Times
  January 1, 1994, Saturday, Orange County Edition
  Part A; Page 1; Column 1; Metro Desk
  330 words
  ORANGE COUNTY NEWSWATCH
  By Jerry Hick; Greg Hernandez; Nettie Mackley
  BUCKLE UP: The new bicycle helmet law for youngsters takes effect today. But that doesn't mean your child will be cited for forgetting. "We're going to educate the public first," says Fountain Valley police Lt. Robert Mosley. "We're going to have a six-month grace period where we'll just be issuing warnings." … Some local police departments will be spreading the word through flyers and programs at schools. But Mosley warns: "Kids need to follow the law." …
Description of the task
Participation
  Participants got:
    Test data: documents and topics in XML
      A file for each word with common ML features for each occurrence in documents/topics
    Training data: a file for each word with ML features and sense tags from SemCor 1.6
    Script to produce results in the desired format
    DTDs and software to check the XML
  Participants returned disambiguated documents and topics
Discussion
Reusing relevance judgments
  Not ideal, especially if WSD retrieves previously unseen documents
  But we use all relevance judgments (incl. those from other-language topics)
  Pooling in CLEF includes documents found by people