Experiments on Using Semantic Distances Between Words in Image Caption Retrieval Presenter: Cosmin Adrian Bejan Alan F. Smeaton and Ian Quigley School.

Presentation transcript:

Experiments on Using Semantic Distances Between Words in Image Caption Retrieval
Alan F. Smeaton and Ian Quigley, School of Computer Applications, Dublin City University
Presenter: Cosmin Adrian Bejan

2 IR implementation - traditional approach
Represent:
- a user query = a bag of query terms
- a document = a bag of index terms
Compute:
- a degree of similarity between a document and a query, based on the overlap between them, i.e. the number of query terms they have in common.

3 Problems in IR implementation
Caused by:
- the same word describing different things ("bar", "bank")
- different words describing the same thing ("stomach pain" – "belly ache")
- natural language being fraught with ambiguities at all levels, leading to multiple interpretations of words, phrases, etc.
Common way to address these problems: query expansion.
The approach in this paper: when computing the degree of similarity between query and document, instead of basing it only on the terms the two have in common, incorporate a quantitative measure of the semantic similarity between index terms into the measure.
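To make the contrast concrete, here is a minimal Python sketch comparing plain term overlap with a score that uses word-word similarity. It is an illustration only, not the scoring formula from the paper: the aggregation (best caption match per query term, averaged), the function names, and the toy similarity values are all assumptions.

```python
def overlap_score(query, caption):
    """Traditional approach: fraction of query terms that also occur in the caption."""
    if not query:
        return 0.0
    return len(set(query) & set(caption)) / len(set(query))


def soft_match_score(query, caption, sim):
    """Assumed aggregation: average over query terms of the best
    similarity to any caption term (not the paper's formula)."""
    if not query or not caption:
        return 0.0
    return sum(max(sim(q, c) for c in caption) for q in query) / len(query)


# Hypothetical word-word similarities, invented for this example.
TOY_SIM = {
    frozenset(("stomach", "belly")): 0.9,
    frozenset(("pain", "ache")): 0.8,
}


def toy_sim(a, b):
    """Identical terms score 1.0; unknown pairs score 0.0."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset((a, b)), 0.0)


query = ["stomach", "pain"]
caption = ["belly", "ache"]
print(overlap_score(query, caption))              # no shared strings
print(soft_match_score(query, caption, toy_sim))  # related terms still contribute
```

With no strings in common the overlap score is zero, while the similarity-based score still credits the related terms.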

4 Measuring semantic distance between words
- Knowledge base: hierarchical concept graphs (HCGs) automatically constructed from WordNet.
- The similarity of two classes or synsets is defined in terms of: [formula not preserved in the transcript]
  - the information content of the class c_i
  - P(c_i), the class probability of class c_i
- Computing the similarity between two word senses (nouns) can only be done if both are in the same HCG; otherwise they are regarded as dissimilar.
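The formula on this slide did not survive in the transcript. The terms it mentions (information content, class probability) match the information-based measure commonly attributed to Resnik, in which the similarity of two classes is the largest information content, -log P(c), among the classes that subsume both. The Python sketch below assumes that reading. The toy hierarchy and the probabilities are invented for illustration and do not come from WordNet or from the paper.

```python
import math

# Hypothetical toy hierarchy: child -> parent (None marks the root of an HCG).
PARENT = {
    "entity": None, "animal": "entity", "dog": "animal", "cat": "animal",
    "vehicle": "entity", "car": "vehicle",
    "shape": None, "circle": "shape",
}

# Hypothetical class probabilities P(c); a class is at least as probable
# as any class below it, so values grow toward the root.
PROB = {
    "entity": 0.8, "animal": 0.4, "dog": 0.25, "cat": 0.15,
    "vehicle": 0.4, "car": 0.4,
    "shape": 0.2, "circle": 0.2,
}


def ancestors(c):
    """Return the class itself followed by every class above it."""
    chain = []
    while c is not None:
        chain.append(c)
        c = PARENT[c]
    return chain


def information_content(c):
    """Information content of a class: -log P(c)."""
    return -math.log(PROB[c])


def similarity(c1, c2):
    """Largest information content among shared subsumers;
    0.0 when the classes sit in different HCGs (regarded as dissimilar)."""
    common = set(ancestors(c1)) & set(ancestors(c2))
    if not common:
        return 0.0
    return max(information_content(c) for c in common)


print(similarity("dog", "cat"))     # share the specific class "animal"
print(similarity("dog", "car"))     # share only the general class "entity"
print(similarity("dog", "circle"))  # different HCGs
```

Under these assumptions, classes that meet low in the hierarchy score higher than classes that meet only near the root.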

5 Experimental Set-up
- Hand-captioned 2714 images
- Manually disambiguated polysemous words in the captions
- Manually built a collection of 60 queries
- Computed various query-caption similarity measures using word-word semantic distances.

6 Retrieval Strategies [1-2]
Notation:
- query Q = {q_1, q_2, …, q_m}
- caption C = {c_1, c_2, …, c_n}, where a q_i or a c_j is the original term, used only as a representation for its synset
- Sim(t_i, t_j) is the similarity between the sense-disambiguated forms of two terms t_i and t_j
Run1: straightforward statistically-based tf*IDF match between the word forms or strings, i.e. not using word-sense-disambiguated captions or queries.
Run2: a match where the terms in the caption and in the query are both expanded to include other word strings from their sense-disambiguated synsets (query expansion).
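The two runs can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the exact tf*IDF weighting used in the experiments is not given in the transcript, so the sketch uses raw term frequency times log(N / df), and the synset table, function names, and captions are hypothetical.

```python
import math
from collections import Counter


def tfidf_scores(query, captions):
    """Score each caption by summing tf * log(N / df) over matching query terms."""
    n = len(captions)
    df = Counter()
    for cap in captions:
        df.update(set(cap))  # document frequency counts each caption once
    scores = []
    for cap in captions:
        tf = Counter(cap)
        scores.append(sum(tf[t] * math.log(n / df[t]) for t in set(query) if t in tf))
    return scores


def expand(terms, synsets):
    """Replace each term by all word strings in its (hypothetical) synset."""
    expanded = []
    for t in terms:
        expanded.extend(synsets.get(t, [t]))
    return expanded


captions = [["dog", "on", "beach"], ["cat", "on", "sofa"], ["boat", "at", "sea"]]
synsets = {"sofa": ["sofa", "couch"], "couch": ["couch", "sofa"]}  # toy data
query = ["couch"]

# Run1-style: match on the word strings only.
print(tfidf_scores(query, captions))

# Run2-style: expand both query and captions before matching.
expanded_captions = [expand(c, synsets) for c in captions]
print(tfidf_scores(expand(query, synsets), expanded_captions))
```

In the string-only match the query "couch" finds nothing, while after expansion the caption containing "sofa" receives a non-zero score.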

7 Retrieval Strategies [3-5]
- Run3
- Run4
- Run5: considers different threshold values for each HCG, given that there is a concentration of usage of concepts from some HCGs (like entity) and hardly any use of others (like shape).
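The formulas for these runs are not preserved in the transcript, so the following Python sketch only illustrates one plausible reading of a per-HCG threshold: a word-word similarity value is kept if it reaches the cut-off set for its HCG and is discarded otherwise. The threshold values, the default, and the function name are hypothetical.

```python
# Hypothetical cut-offs: a stricter one for the heavily used "entity" HCG,
# a lenient one for the rarely used "shape" HCG.
THRESHOLDS = {"entity": 0.5, "shape": 0.1}


def apply_hcg_threshold(sim_value, hcg, thresholds, default=0.0):
    """Keep a similarity value only if it reaches the cut-off for its HCG."""
    cutoff = thresholds.get(hcg, default)
    return sim_value if sim_value >= cutoff else 0.0


print(apply_hcg_threshold(0.3, "entity", THRESHOLDS))  # below the entity cut-off
print(apply_hcg_threshold(0.3, "shape", THRESHOLDS))   # above the shape cut-off
```

The same similarity value can therefore count in one HCG and be ignored in another.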

8 Retrieval Strategies [6-8]
- Run6
- Run7
- Run8

9 Experimental Results