Stephan Gammeter, Lukas Bossard, Till Quack, Luc Van Gool.


- Introduction
- Automatic object mining
- Scalable object cluster retrieval
- Object knowledge from the wisdom of crowds
- Object-level auto-annotation
- Experiments and Results
- Conclusions

- Most photo organization tools allow tagging (labeling) photos with keywords
- Tagging is a tedious process
- Automated annotation

- First step: build a database by large-scale crawling of community photo collections
- Second step: recognition against the database

- The crawling stage:
  - Create a large database of object models; each object is represented as a cluster of images (an object cluster)
  - Determine what each cluster contains (labels, GPS location, related content)
- The retrieval stage:
  - Consists of a large-scale retrieval system based on local image features
  - Optimize this stage

- The annotation stage:
  - Estimates the position of the object within the image (bounding box)
  - Annotates it with text, location, and related content from the database

- Not general annotation of images with words
- Annotation happens at the object level and includes textual labels, related web sites, and GPS location
- Annotation of a query image happens within seconds
- Example labels: Building, Taipei 101

- A geospatial grid is overlaid over the earth, and Flickr is queried to retrieve geo-tagged photos (with GPS location)
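
The grid overlay can be sketched as follows. This is a minimal illustration, assuming square cells of a fixed size in degrees; the cell size, the function name `grid_cells`, and the bounding-box tuple layout are hypothetical and do not come from the presentation. The Flickr query that would be issued for each cell is left out.

```python
import math
from typing import Iterator, Tuple

# (min_lon, min_lat, max_lon, max_lat) in degrees; hypothetical layout
BBox = Tuple[float, float, float, float]


def grid_cells(cell_deg: float,
               region: BBox = (-180.0, -90.0, 180.0, 90.0)) -> Iterator[BBox]:
    """Yield non-overlapping cells that together cover `region`."""
    min_lon, min_lat, max_lon, max_lat = region
    n_lon = math.ceil((max_lon - min_lon) / cell_deg)
    n_lat = math.ceil((max_lat - min_lat) / cell_deg)
    for i in range(n_lon):
        for j in range(n_lat):
            lon0 = min_lon + i * cell_deg
            lat0 = min_lat + j * cell_deg
            # Clip the last row/column so cells never leave the region.
            yield (lon0, lat0,
                   min(lon0 + cell_deg, max_lon),
                   min(lat0 + cell_deg, max_lat))


cells = list(grid_cells(90.0))
```

Each yielded cell could then be passed as the bounding box of one geo-tagged photo query.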

- Visual vocabulary technique: the vocabulary is created by clustering the descriptor vectors of local visual features such as SIFT or SURF
- Candidate images are ranked using TF*IDF
- RANSAC is used to estimate a homography between each candidate and the query image
- A candidate is retained only when the number of inliers exceeds a given threshold

- D: candidate document (candidate image), containing a set of visual words
- v: visual word (local feature)
- df(v): document frequency of visual word v
- Note: we want to know which object is present in the query image, so we return a ranked list of object clusters instead of images
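
As an illustration of TF*IDF ranking over visual words, here is a minimal sketch using a textbook formulation (term frequency times log(N / df(v))). The exact weighting on the slide is not reproduced in this transcript, so the scoring formula, the function names `build_index` and `rank`, and the toy data are assumptions.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """docs: {image_id: list of visual word ids}.

    Returns an inverted index {word: {image_id: tf}} and {word: idf}.
    """
    index = defaultdict(dict)
    for image_id, words in docs.items():
        for word, count in Counter(words).items():
            index[word][image_id] = count / len(words)  # normalized tf
    n_docs = len(docs)
    # df(v) is the number of images that contain visual word v.
    idf = {w: math.log(n_docs / len(post)) for w, post in index.items()}
    return index, idf


def rank(query_words, index, idf):
    """Score only images sharing a visual word with the query, best first."""
    scores = defaultdict(float)
    for word, q_count in Counter(query_words).items():
        for image_id, tf in index.get(word, {}).items():
            scores[image_id] += q_count * tf * idf[word]
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


docs = {"a": [1, 2, 3], "b": [1, 4, 4], "c": [5, 6, 7]}
index, idf = build_index(docs)
ranked = rank([4, 1], index, idf)
```

Because the inverted index is walked per query word, images that share no visual word with the query are never touched, which is what keeps this kind of retrieval scalable.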

- Database:
  - Organized not by individual images but by object clusters
  - The partly redundant information can be used to:
    - Obtain a better understanding of the object's appearance
    - Segment objects
    - Create more compact inverted indices

- The feature matches from pair-wise image matching can be used to derive a score for each feature
- Only features that match many of their counterparts in other images receive a high score
- Because many of the photos are taken from varying viewpoints around the object, the background receives fewer matches

- f: feature; i: image
- The score uses the set of inlying feature matches for each image pair i, j and the number of images in the current object cluster o
- The two parameters are set to 1 and 1/3
- Note: the bounding box is drawn around all features with a confidence higher than a set threshold
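
The idea of scoring features by how often they are matched, and then boxing the confident ones, can be sketched as follows. This is not the formula from the slide (which did not survive in the transcript, including its two parameters): the confidence here is simply the fraction of other cluster images a feature matched, and the threshold value is a hypothetical choice.

```python
def feature_confidence(match_counts, n_images):
    """match_counts: {feature_id: number of other cluster images in which
    the feature had an inlying match}. Returns {feature_id: value in [0, 1]}.
    """
    if n_images < 2:
        return {f: 0.0 for f in match_counts}
    return {f: c / (n_images - 1) for f, c in match_counts.items()}


def bounding_box(positions, confidence, threshold):
    """Box (x_min, y_min, x_max, y_max) around features above `threshold`.

    positions: {feature_id: (x, y)}. Returns None if no feature qualifies.
    """
    kept = [positions[f] for f, c in confidence.items() if c > threshold]
    if not kept:
        return None
    xs = [x for x, _ in kept]
    ys = [y for _, y in kept]
    return (min(xs), min(ys), max(xs), max(ys))


positions = {0: (10, 20), 1: (50, 60), 2: (200, 5)}
conf = feature_confidence({0: 4, 1: 3, 2: 0}, n_images=5)
box = bounding_box(positions, conf, threshold=0.5)
```

In this toy example feature 2 is never matched, so it is treated as background and falls outside the box.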

- Estimated bounding boxes help to compact the inverted index of visual words
- Object clusters whose photos were all taken by a single user are removed
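
Both pruning steps can be sketched in a few lines. The data layouts (features as `(x, y, visual_word)` tuples, clusters as lists of `(image_id, user_id)` pairs) and the function names are assumptions made for illustration only.

```python
def features_in_box(features, box):
    """Keep only features inside the bounding box, so fewer entries
    need to be written to the inverted index.

    features: list of (x, y, visual_word); box: (x_min, y_min, x_max, y_max).
    """
    x0, y0, x1, y1 = box
    return [f for f in features if x0 <= f[0] <= x1 and y0 <= f[1] <= y1]


def drop_single_user_clusters(clusters):
    """clusters: {cluster_id: list of (image_id, user_id)}.

    Keeps only clusters with photos from at least two distinct users.
    """
    return {cid: images for cid, images in clusters.items()
            if len({user for _, user in images}) > 1}


feats = [(5, 5, 17), (40, 40, 3), (90, 10, 17)]
inside = features_in_box(feats, (0, 0, 50, 50))
clusters = {
    "tower": [("img1", "alice"), ("img2", "bob")],
    "garden": [("img3", "carol"), ("img4", "carol")],
}
kept = drop_single_user_clusters(clusters)
```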

- Select the best object cluster as the final result
- Simple voting: each retrieved image votes for its parent cluster
- Normalizing by cluster size is not feasible
- Only the votes of the 5 images per cluster with the highest retrieval scores are counted
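
The voting scheme with a cap of 5 images per cluster can be sketched as below. Two details are assumptions not stated on the slide: each counted image contributes exactly one vote, and ties are broken by the summed retrieval scores of the counted images. The function and argument names are hypothetical.

```python
from collections import defaultdict


def best_cluster(retrieved, image_to_cluster, votes_per_cluster=5):
    """retrieved: list of (image_id, retrieval_score).

    Returns the winning cluster id, or None if nothing was retrieved.
    """
    by_cluster = defaultdict(list)
    for image_id, score in retrieved:
        by_cluster[image_to_cluster[image_id]].append(score)

    totals = {}
    for cluster_id, scores in by_cluster.items():
        # Only the highest-scoring images of each cluster may vote.
        top = sorted(scores, reverse=True)[:votes_per_cluster]
        totals[cluster_id] = (len(top), sum(top))  # (votes, tie-breaker)

    return max(totals, key=totals.get) if totals else None


mapping = {f"a{i}": "A" for i in range(7)}
mapping.update({f"b{i}": "B" for i in range(5)})
hits_a = [(f"a{i}", 0.1) for i in range(7)]
winner_capped = best_cluster(hits_a + [(f"b{i}", 0.9) for i in range(3)], mapping)
winner_tie = best_cluster(hits_a + [(f"b{i}", 0.9) for i in range(5)], mapping)
```

The cap limits how much a very large cluster can gain from its size alone: in the second call both clusters reach 5 votes even though cluster A has 7 retrieved images.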

- Consists of two steps: bounding box estimation and labelling
- Bounding box estimation:
  - Estimated in the same way as for database images
  - The query image is matched to a number of images in the top-ranked cluster
- Labelling:
  - The information from the object cluster is simply copied to serve as labels for the query image

- Experiments were conducted on a large dataset collected from Flickr
- A challenging test set of 674 images was collected from Picasa Web Albums
- Estimated bounding boxes cover on average 52% of each image

- Baseline: TF*IDF ranking on a 500K visual vocabulary, as used in other work
- Bounding box features + no single-user clusters
- All features + no single-user clusters
- 66% random feature subset + no single-user clusters
- 66% random feature subset

- Evaluate how well the system localizes bounding boxes using the intersection-over-union (IoU) measure of the overlap between ground truth and hypothesis
- Reported value: 76.1%
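
For reference, intersection-over-union for two axis-aligned boxes can be computed as in this small sketch. The `(x_min, y_min, x_max, y_max)` box layout and the function name `iou` are assumptions.

```python
def iou(a, b):
    """Intersection-over-union of boxes given as (x_min, y_min, x_max, y_max)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # overlap area 1, union area 7
```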

- Presented a full auto-annotation pipeline for holiday snaps
- Object-level annotation with bounding box, relevant tags, Wikipedia articles, and GPS location
