The Cross Language Image Retrieval Track: ImageCLEF Breakout session discussion.

Overview
General comments/feedback
–about the 2006 event
Proposals for 2007
–Photographic image retrieval task
–Medical image retrieval task
–Medical image annotation and object classification tasks
Participants' feedback

General comments/feedback
Tasks offered
–are they realistic/useful/beneficial?
–what are the use cases?
Collections used
Topics and assessments
Evaluation measures
Organisation of the tasks
Information provided about results
Provision of resources
–sharing resources between participants
Interactive experiments

Photographic Retrieval Task (ImageCLEFphoto)

Feedback from survey (2006) – selected results
Document Collection and Query Topics
Statements rated on a five-point scale from "strongly disagree" to "strongly agree" (response counts in parentheses):
–The IAPR TC-12 Benchmark is an appropriate collection for ImageCLEFphoto. (5, 2)
–The database represents a realistic set of real-life still natural images. (7)
–The quality of the annotations in the IAPR TC-12 collection is good and adequate for ImageCLEFphoto. (3, 3, 1)
–It was appropriate to provide a subset of the annotations (rather than the full set). (4, 1)
–The topics were realistic. (2, 2, 1, 1)
Free-text comments:
–"Images with such good annotations are not at all real-life."
–"I don't think that topics were realistic. I mean, if only titles are considered perhaps they are near to real search queries given by users, but narratives are too long and tricky (several nested negations)."
–"I'm not sure what size of or what types of topics are ideal for the evaluation."
–"As we are working in an environment of web-based cross-language image retrieval, we found that query formulation was a little bit too 'raffinate', sometimes with rather unusual terms."

Feedback from survey (2006) – selected results
Relevance Assessments & Performance Measures
Statements rated on a five-point scale from "strongly disagree" to "strongly agree" (response counts in parentheses):
–The set of performance measures (MAP, P20, GMAP, BPREF) was adequate. (6, 1)
–MAP is a good indicator of the effectiveness of an image retrieval system. (1, 2, 4)
–P20 should be adopted as the standard measure, as many online image retrieval engines (Google, Yahoo, etc.) display 20 images by default on the first page of results.
–It is appropriate to use an interactive search-and-judge tool to complement the ground truth with further relevant images that were not part of the pools.
Free-text comments:
–"I don't think MAP is a good indicator of the effectiveness of a retrieval system, but I don't know of another. The 'feeling' of the user interacting with the system should be considered in some way, but I don't know how (and I'm not sure if tests with 10 or 15 users, as in iCLEF experiments, are representative enough)."
–"I don't think P20 is a good standard measure. Although most systems show 20 results on the first page, I think that there are a lot of users who review more than 20. I don't have proof of this assertion, of course; it is only a feeling."
–"I have no idea about this… but I prefer clear and simple performance measures which everybody knows and can optimise for. Suggestion: do it the standard way, that is: MAP."
(A computational sketch of MAP, P@20 and GMAP follows below.)
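The survey items above refer to MAP, P@20 and GMAP. As a point of reference, the following is a minimal Python sketch of how these measures can be computed from a run and a set of relevance judgements; the data structures, function names and example values are illustrative assumptions, not the official trec_eval tooling (BPREF is omitted for brevity).

import math

def precision_at_k(ranking, relevant, k=20):
    """P@k: fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k

def average_precision(ranking, relevant):
    """AP: mean of the precision values at the rank of each relevant document."""
    if not relevant:
        return 0.0
    hits, precisions = 0, []
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant)

def mean_average_precision(run, qrels):
    """MAP: arithmetic mean of per-topic average precision."""
    return sum(average_precision(run[t], qrels[t]) for t in qrels) / len(qrels)

def geometric_map(run, qrels, eps=1e-5):
    """GMAP: geometric mean of per-topic AP, emphasising poorly performing topics."""
    logs = [math.log(max(average_precision(run[t], qrels[t]), eps)) for t in qrels]
    return math.exp(sum(logs) / len(logs))

# Illustrative usage with made-up rankings and judgements for two topics.
run = {"t1": ["d3", "d1", "d7", "d2"], "t2": ["d5", "d9", "d4"]}
qrels = {"t1": {"d1", "d2"}, "t2": {"d9"}}
print(mean_average_precision(run, qrels),
      geometric_map(run, qrels),
      precision_at_k(run["t1"], qrels["t1"], k=2))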

Feedback from survey (2007) – selected results
Image Annotations and Annotation Languages for 2007
Options rated on a five-point scale from "strongly disagree" to "strongly agree" (response counts in parentheses):
–English annotations (1, 3, 2)
–German annotations
–Spanish annotations (4, 1, 1)
–Annotations with a randomly selected language (English, German, Spanish), because this is the most realistic scenario (1, 2, 3)
–Visual features only, without annotations
Other target languages: French

Feedback from survey (2007) – selected results
Topic Languages for 2007 topics (number of requests in parentheses):
–English (5), French (4), German (2), Spanish (2), Italian (2), Japanese (2)
–Portuguese (1), Polish (1), Russian (1), Simplified Chinese (1), Traditional Chinese (1), Images only (1)
–Dutch, Swedish, Norwegian, Finnish, Danish, Monolingual only (no requests recorded)
Other query languages: Arabic

Proposals for 2007
Collection
–Spanish annotations will be completed
–Randomly select N images for each annotation language (and no annotation); see the sketch below
–Assess "usefulness" of annotation fields: titles/keywords only vs. descriptions
Resources
–Runs from GIFT and VIPER for topic examples (other systems?)
–Use 2006 data as training data
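The proposal to randomly select N images for each annotation language could be implemented along the lines of the sketch below; the language list, the "None" condition for unannotated images and the image identifiers are assumptions for illustration, not the actual IAPR TC-12 collection format.

import random

LANGUAGES = ["english", "german", "spanish", None]  # None = no annotation

def split_by_annotation_language(image_ids, n_per_language, seed=2007):
    """Randomly assign n_per_language images to each annotation condition."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(image_ids)
    rng.shuffle(shuffled)
    assignment = {}
    for i, language in enumerate(LANGUAGES):
        for image_id in shuffled[i * n_per_language:(i + 1) * n_per_language]:
            assignment[image_id] = language
    return assignment

# Illustrative usage: 20 made-up image IDs, 5 per condition.
split = split_by_annotation_language([f"img_{i:05d}" for i in range(20)], n_per_language=5)
print(sum(1 for lang in split.values() if lang is None))  # 5 images left unannotated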

Proposals for 2007
Topics
–"visual" topics as part of the standard ad-hoc set; make visual properties part of the topic narrative?
–Will provide 3 example images for each topic (but make sure these are removed from the database itself – not just from the set of relevant images); see the sketch below
–Support the languages suggested by participants: English, German, Spanish, Italian, French, Portuguese, Polish, Russian, Simplified/Traditional Chinese, Arabic
–Release topic narratives?
–Creating topics – what/how? realism vs. controlled parameters
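One of the topic-related proposals is to distribute three example images per topic while removing those images from the searchable collection itself, not only from the relevance sets. A minimal sketch of that filtering step follows; the topic and collection structures are invented for illustration.

def remove_topic_examples(collection_ids, topics):
    """Drop every topic's example images from the list of indexable image IDs."""
    example_ids = {img for topic in topics for img in topic["example_images"]}
    return [image_id for image_id in collection_ids if image_id not in example_ids]

# Illustrative usage with made-up identifiers.
topics = [{"id": 1, "example_images": ["img_00003", "img_00007", "img_00011"]}]
collection = [f"img_{i:05d}" for i in range(20)]
print(len(remove_topic_examples(collection, topics)))  # 17 images remain indexable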

Proposals for 2007
Tasks
–ad-hoc retrieval task? use the same topics to see how much improvement can be gained one year on
–object recognition task?
–submit at least one run using "off the shelf" tools?
Performance measures
–bridge the gap between research and the real world: time taken for retrieval / cost of resources (and copyright!) / use of "off the shelf" resources
–use measures which correlate with human satisfaction? user satisfaction vs. system effectiveness
–comparing systems: rank systems based on their average rank across different measures (see the sketch below)
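The idea of comparing systems by their average rank across several measures could look like the sketch below: each system is ranked separately under each measure (higher score = better) and systems are then ordered by the mean of those per-measure ranks. The scores and system names are invented, and ties are not handled.

from statistics import mean

def rank_by_average_rank(scores):
    """scores: {measure: {system: score}} -> [(system, average_rank)], best first."""
    per_measure_ranks = {}
    for measure, system_scores in scores.items():
        ordered = sorted(system_scores, key=system_scores.get, reverse=True)
        per_measure_ranks[measure] = {system: r for r, system in enumerate(ordered, start=1)}
    systems = next(iter(scores.values()))
    averaged = {s: mean(ranks[s] for ranks in per_measure_ranks.values()) for s in systems}
    return sorted(averaged.items(), key=lambda item: item[1])

# Illustrative usage with invented scores for three systems.
scores = {
    "MAP":  {"sysA": 0.31, "sysB": 0.28, "sysC": 0.35},
    "P@20": {"sysA": 0.45, "sysB": 0.50, "sysC": 0.40},
    "GMAP": {"sysA": 0.12, "sysB": 0.09, "sysC": 0.15},
}
print(rank_by_average_rank(scores))  # sysC first, then sysA, then sysB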

Medical Ad-hoc Retrieval Task

Automatic Annotation Tasks