Accounting for the relative importance of objects in image retrieval


Accounting for the relative importance of objects in image retrieval Sung Ju Hwang and Kristen Grauman University of Texas at Austin

What can image tags tell us?
Images tagged with keywords clearly tell us which objects to search for. Example tags: Dog, Black lab, Jasper, Sofa, Self, Living room, Fedora, Explore #24. Tagged images are becoming more and more common on user community photo sites.

Content-based image retrieval
Most prior work on tagged images tries to find correspondences between nouns and objects, such as between image blobs and labels, or between faces and names.

Retrieving images with similar visual features
Visual features do not always correspond to a semantic object or concept, and proximity in the visual feature space does not mean that two images are semantically similar.

Semantic retrieval of images
By mapping each image into a semantic space using its labels, we can associate visual features with semantics. However, this semantic space does not capture which objects are more important than others. Example tag lists: (Tree, Grass, Cow) vs. (Tree, Grass, Train).

Our idea: learning a semantic space that accounts for object importance
People expect the retrieved images to share the semantics of the query image. What is the point of finding images with a similar background? We should instead find images that contain similar main objects.

Related work
Hardoon et al.

Kernel Canonical Correlation Analysis
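The transcript carries no detail for this slide, so here is a minimal sketch of plain linear CCA on synthetic data, standing in for the kernelized version the talk refers to. All names, dimensions, and the toy data are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def linear_cca(X, Y, k, reg=1e-3):
    """Project two views (e.g. visual features X and tag features Y)
    into a shared k-dimensional semantic space, via SVD of the
    whitened cross-covariance. Returns projection matrices Wx, Wy."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Regularized covariance and cross-covariance matrices.
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric positive-definite C.
        w, U = np.linalg.eigh(C)
        return U @ np.diag(w ** -0.5) @ U.T

    Cxx_h, Cyy_h = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, _, Vt = np.linalg.svd(Cxx_h @ Cxy @ Cyy_h)
    return Cxx_h @ U[:, :k], Cyy_h @ Vt[:k].T

# Toy demo: Y is a noisy linear function of X, so the two projected
# views should be highly correlated in the shared space.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Y = X @ rng.normal(size=(10, 6)) + 0.01 * rng.normal(size=(200, 6))
Wx, Wy = linear_cca(X, Y, k=2)
Zx = (X - X.mean(axis=0)) @ Wx
Zy = (Y - Y.mean(axis=0)) @ Wy
```

Retrieval in this shared space then amounts to ranking database images by their similarity to the projected query, which is what lets tag semantics influence purely visual matching.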

Performance evaluation
Normalized Discounted Cumulative Gain at top k (NDCG@k): a correct match placed at an earlier rank contributes more to the score, and a perfect ranking scores 1.
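The NDCG@k score described above can be sketched as follows (standard log2 discount; a generic formulation, not necessarily the exact variant used in the paper):

```python
import math

def ndcg_at_k(relevances, k):
    """NDCG@k for one query. `relevances` are the graded relevance
    scores of the retrieved items, in ranked order. Matches placed
    earlier count more (log2 discount); a perfect ranking scores 1.0."""
    def dcg(rels):
        return sum(r / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# A perfect ranking (most relevant items first) scores 1.0:
# ndcg_at_k([3, 2, 1, 0], 4) == 1.0
```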

Image-to-image retrieval results
Retrieval performance is measured in two ways: by the similarity of the ground-truth bounding-box labels, and by the similarity of the tag lists.

Retrieval results

More retrieval results

Tag-to-image retrieval results
Our method achieves 20% better accuracy than the word+visual baseline.

Image-to-tag auto-annotation results (dataset: PASCAL VOC 2007)

Method        K=1     K=3     K=5     K=10
Visual-only   0.0826  0.1765  0.2022  0.2095
Word+Visual   0.0818  0.1712  0.1992  0.2097
Ours          0.0901  0.1936  0.2230  0.2335
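The transcript does not state which accuracy measure underlies the table. A common auto-annotation score evaluated at several K, sketched here as an assumption, is the fraction of an image's ground-truth tags recovered among the top-K predictions:

```python
def tags_recovered_at_k(predicted, ground_truth, k):
    """Fraction of an image's ground-truth tags found among its top-k
    predicted tags (recall@k). `predicted` is ranked best-first.
    Illustrative metric only; the slide's exact measure is unspecified."""
    if not ground_truth:
        return 0.0
    return len(set(predicted[:k]) & set(ground_truth)) / len(ground_truth)

# Example: 2 of the 3 true tags appear among the top-3 predictions.
score = tags_recovered_at_k(["dog", "sofa", "person", "cat"],
                            ["dog", "person", "table"], k=3)
# score == 2 / 3
```

A score that rises with K, as in the table, is consistent with this kind of recall-style measure.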

Conclusion
We learn a semantic space that accounts for the relative importance of objects in an image, improving image-to-image, tag-to-image, and image-to-tag retrieval over the visual-only and word+visual baselines.