Stephan Gammeter, Lukas Bossard, Till Quack, Luc Van Gool.



 Introduction  Automatic object mining  Scalable object cluster retrieval  Object knowledge from the wisdom of crowds  Object-level auto-annotation  Experiments and Results  Conclusions

 Introduction  Automatic object mining  Scalable object cluster retrieval  Object knowledge from the wisdom of crowds  Object-level auto-annotation  Experiments and Results  Conclusions

 Most photo organization tools allow tagging (labeling) with keywords  Tagging is a tedious process  This motivates automated annotation

 First step : build a database by large-scale crawling of community photo collections  Second step : recognition against this database

 The crawling stage :  Create a large database of object models, where each object is represented as a cluster of images (an object cluster)  The clusters tell us what they contain (labels, GPS location, related content)  The retrieval stage :  Consists of a large-scale retrieval system based on local image features  This is the stage we optimize

 The annotation stage :  Estimates the position of the object within the image (a bounding box)  Annotates it with text, location, and related content from the database

 This is not general annotation of an image with words  The annotation happens at the object level and includes textual labels, related web sites, and GPS location  The annotation of a query image happens within seconds  Example: the building "Taipei 101"

 Introduction  Automatic object mining  Scalable object cluster retrieval  Object knowledge from the wisdom of crowds  Object-level auto-annotation  Experiments and Results  Conclusions

 A geospatial grid is overlaid over the earth, and Flickr is queried by GPS location to retrieve geo-tagged photos
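As a minimal sketch of this crawling setup (all names, the toy photo data, and the 0.1-degree cell size are illustrative assumptions, not the authors' actual crawler), geo-tagged photos can be bucketed into grid cells like this:

```python
from collections import defaultdict

def tile_id(lat, lon, step=0.1):
    """Map a GPS coordinate to a grid-cell id (the step in degrees is an assumption)."""
    return (int(lat // step), int(lon // step))

# toy geo-tagged photos (coordinates are approximate landmark locations)
photos = [
    {"id": 1, "lat": 25.0336, "lon": 121.5646},  # near Taipei 101
    {"id": 2, "lat": 25.0339, "lon": 121.5651},  # near Taipei 101
    {"id": 3, "lat": 48.8584, "lon": 2.2945},    # Eiffel Tower
]

tiles = defaultdict(list)
for p in photos:
    tiles[tile_id(p["lat"], p["lon"])].append(p["id"])
# photos 1 and 2 fall into the same cell; photo 3 into a different one
```

Each cell can then be queried and mined independently, which is what makes the crawl scale.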

 Introduction  Automatic object mining  Scalable object cluster retrieval  Object knowledge from the wisdom of crowds  Object-level auto-annotation  Experiments and Results  Conclusions

 Visual vocabulary technique : created by clustering the descriptor vectors of local visual features such as SIFT or SURF  Candidates are ranked using TF*IDF  RANSAC is used to estimate a homography between each candidate and the query image  A candidate is retained only when its number of inliers exceeds a given threshold
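A toy sketch of the quantization step (2-D descriptors and three centroids stand in for 64/128-D SURF/SIFT descriptors and a vocabulary of hundreds of thousands of k-means centroids; the names are illustrative):

```python
def quantize(descriptor, centroids):
    """Return the index of the nearest centroid, i.e. the visual word id."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)), key=lambda k: dist2(descriptor, centroids[k]))

# toy vocabulary of three visual words
centroids = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
word = quantize((0.9, 1.1), centroids)  # nearest to centroid 1
```

Once every local feature is replaced by its visual word id, images can be indexed and ranked exactly like text documents.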

D : a candidate document (candidate image), containing a set of visual words. v : a visual word (local feature). df(v) : the document frequency of visual word v. Note : we want to know which object is present in the query image, so we return a ranked list of object clusters instead of images.
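These definitions can be sketched as an inverted-index scorer that accumulates an idf weight per matching visual word and aggregates the scores at the object-cluster level (the toy data, names, and exact weighting below are illustrative, not the paper's exact formula):

```python
import math
from collections import defaultdict

# toy database: image -> (parent object cluster, set of visual words)
db = {
    "img_a": ("taipei101", {1, 2, 3}),
    "img_b": ("taipei101", {1, 2, 4}),
    "img_c": ("eiffel",    {5, 6}),
}

# inverted index: visual word v -> images containing v
index = defaultdict(set)
for img, (_, words) in db.items():
    for w in words:
        index[w].add(img)

def rank_clusters(query_words):
    """Rank object clusters (not images) by an idf-weighted word overlap."""
    n_docs = len(db)
    scores = defaultdict(float)
    for w in query_words:
        df = len(index.get(w, ()))  # document frequency df(v)
        if df == 0:
            continue
        idf = math.log(n_docs / df)
        for img in index[w]:
            scores[db[img][0]] += idf  # vote for the image's parent cluster
    return sorted(scores, key=scores.get, reverse=True)
```

Returning clusters rather than raw images is what lets the system answer "which object is this?" directly.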

 Introduction  Automatic object mining  Scalable object cluster retrieval  Object knowledge from the wisdom of crowds  Object-level auto-annotation  Experiments and Results  Conclusions

 Database :  Not organized by individual images but by object clusters  We can use the partly redundant information to :  Obtain a better understanding of the object's appearance  Segment objects  Create more compact inverted indices

 Using the feature matches from pair-wise image matching, we can derive a score for each feature  Only features that match many of their counterparts in the other images receive a high score  Since many of the photos are taken from varying viewpoints around the object, background features receive fewer matches

f : a feature. i : an image. The set of inlying feature matches for an image pair (i, j). The number of images in the current object cluster o. Two parameters, set to 1 and 1/3. Note : the bounding box is drawn around all features with confidence higher than a threshold.
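A rough sketch of this per-feature scoring (the data and the normalization below are illustrative assumptions; the confidence formula from the slide itself is not reproduced): count, for each feature, how many of the other cluster images it has an inlying match to.

```python
from collections import Counter

# hypothetical pairwise inlier matches: for each image pair,
# the (image, feature) occurrences that survived RANSAC verification
matches = {
    ("i1", "i2"): [("i1", "f_roof"), ("i1", "f_door")],
    ("i1", "i3"): [("i1", "f_roof")],
    ("i1", "i4"): [("i1", "f_roof")],
}

counts = Counter()
for pair, inliers in matches.items():
    for img, feat in inliers:
        counts[(img, feat)] += 1

n_cluster = 4  # images in the object cluster (assumption)
# normalize by the number of other images a feature could have matched
confidence = {k: c / (n_cluster - 1) for k, c in counts.items()}
# f_roof matches in all 3 other images (on the object);
# f_door matches in only 1 (likely background)
```

Thresholding these confidences is what yields the bounding box around the object.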

 Estimated bounding boxes help to compact our inverted index of visual words  Object clusters whose photos were all taken by a single user are removed

 Select the best object cluster as the final result by simple voting : retrieved images vote for their parent clusters  Since normalizing by cluster size is not feasible, only the votes of the 5 images per cluster with the highest retrieval scores are counted
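The voting rule above can be sketched as follows (the scores and cluster names are toy values):

```python
from collections import defaultdict

# retrieved short-list: (retrieval score, parent cluster)
retrieved = [(0.9, "A"), (0.8, "A"), (0.7, "B"), (0.6, "A"), (0.5, "A"),
             (0.4, "A"), (0.3, "A"), (0.2, "A")]

per_cluster = defaultdict(list)
for score, cluster in retrieved:
    per_cluster[cluster].append(score)

# only the 5 highest-scoring images per cluster may vote
votes = {c: sum(sorted(s, reverse=True)[:5]) for c, s in per_cluster.items()}
best = max(votes, key=votes.get)
# cluster "A" wins even though two of its weaker images are ignored
```

Capping the votes at 5 images keeps huge clusters from drowning out small but well-matching ones.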

 Introduction  Automatic object mining  Scalable object cluster retrieval  Object knowledge from the wisdom of crowds  Object-level auto-annotation  Experiments and Results  Conclusions

 Consists of two steps :  Bounding box estimation  Labelling  Bounding box estimation :  Estimated in the same way as for database images, by matching the query image to a number of images in the top-returned cluster  Labelling :  Simply copy the information from the object cluster to serve as labels for the query image

 Introduction  Automatic object mining  Scalable object cluster retrieval  Object knowledge from the wisdom of crowds  Object-level auto-annotation  Experiments and Results  Conclusions

 Experiments were conducted on a large dataset collected from Flickr  A challenging test set of 674 images was collected from Picasa Web Albums  The estimated bounding boxes cover on average 52% of each image

 : baseline, TF*IDF ranking on a 500K visual vocabulary as used in other work  : bounding box features + no single-user clusters  : all features + no single-user clusters  : 66% random feature subset + no single-user clusters  : 66% random feature subset

67%

 We evaluate how well our system localizes bounding boxes by measuring the intersection-over-union (IOU) of the ground-truth and hypothesized boxes: 76.1%
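The IOU measure on axis-aligned boxes, given as (x1, y1, x2, y2), can be sketched as:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))  # intersection width
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))  # intersection height
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0
```

A perfect localization gives 1.0; disjoint boxes give 0.0, so the reported 76.1% corresponds to a substantial overlap with the ground truth.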

 Introduction  Automatic object mining  Scalable object cluster retrieval  Object knowledge from the wisdom of crowds  Object-level auto-annotation  Experiments and Results  Conclusions

 We presented a full auto-annotation pipeline for holiday snaps  It performs object-level annotation with bounding boxes, relevant tags, Wikipedia articles, and GPS location