Stephan Gammeter, Lukas Bossard, Till Quack, Luc Van Gool.


1 Stephan Gammeter, Lukas Bossard, Till Quack, Luc Van Gool

2  Introduction  Automatic object mining  Scalable object cluster retrieval  Object knowledge from the wisdom of crowds  Object-level auto-annotation  Experiments and Results  Conclusions


4  Most photo organization tools allow tagging (labeling) with keywords  Tagging is a tedious process  Automated annotation


6  First step: build a database by large-scale crawling of community photo collections  Second step: recognition against the database

7  The crawling stage: creates a large database of object models, where each object is represented as a cluster of images (an object cluster); tells us what each cluster contains (labels, GPS location, related content)  The retrieval stage: consists of a large-scale retrieval system based on local image features; this stage is optimized

8  The annotation stage: estimates the position of the object within the image (bounding box); annotates it with text, location, and related content from the database

9  Not general annotation of images with words  The annotation happens at the object level and includes textual labels, related web sites, and GPS location  The annotation of a query image happens within seconds  Example: Building Taipei 101


11  A geospatial grid is overlaid on the earth, and Flickr is queried to retrieve geo-tagged photos (photos with a GPS location)
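The grid overlay can be sketched as follows. This is only an illustration, assuming square tiles measured in latitude/longitude degrees; the tile size, the `BBox` layout, and the function name `grid_tiles` are hypothetical, and the actual Flickr query for each tile is left out.

```python
import math
from typing import Iterator, Tuple

# Hypothetical tile layout: (min_lon, min_lat, max_lon, max_lat) in degrees.
BBox = Tuple[float, float, float, float]


def grid_tiles(bbox: BBox, step_deg: float) -> Iterator[BBox]:
    """Yield tiles of at most step_deg x step_deg degrees that cover bbox."""
    if step_deg <= 0:
        raise ValueError("step_deg must be positive")
    min_lon, min_lat, max_lon, max_lat = bbox
    n_lon = math.ceil((max_lon - min_lon) / step_deg)
    n_lat = math.ceil((max_lat - min_lat) / step_deg)
    for i in range(n_lat):
        south = min_lat + i * step_deg
        north = min(south + step_deg, max_lat)  # clip the last row
        for j in range(n_lon):
            west = min_lon + j * step_deg
            east = min(west + step_deg, max_lon)  # clip the last column
            yield (west, south, east, north)


# Each tile would then be passed to a geo-tagged photo search (not shown).
world = (-180.0, -90.0, 180.0, 90.0)
tiles = list(grid_tiles(world, 10.0))
```

Each yielded tile is one cell of the grid; a crawler would issue one geo-restricted photo query per cell.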


13  Visual vocabulary technique: the vocabulary is created by clustering the descriptor vectors of local visual features such as SIFT or SURF  Candidate images are ranked using TF*IDF  RANSAC is used to estimate a homography between each candidate and the query image  A candidate is retained only when the number of inliers exceeds a given threshold
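A minimal sketch of how an image becomes a bag of visual words, assuming the vocabulary (the cluster centers) has already been computed, for example with k-means. The function names are hypothetical, and the brute-force nearest-neighbour search is for illustration only; a large vocabulary would need an approximate search structure.

```python
from collections import Counter
from typing import Sequence

Vector = Sequence[float]


def nearest_word(descriptor: Vector, vocabulary: Sequence[Vector]) -> int:
    """Return the index of the vocabulary entry closest to the descriptor."""
    def sqdist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(vocabulary)),
               key=lambda k: sqdist(descriptor, vocabulary[k]))


def bag_of_words(descriptors: Sequence[Vector],
                 vocabulary: Sequence[Vector]) -> Counter:
    """Count how often each visual word occurs among an image's descriptors."""
    return Counter(nearest_word(d, vocabulary) for d in descriptors)


# Toy 2-D "descriptors"; real SIFT or SURF descriptors have 128 or 64 dimensions.
vocabulary = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (9.0, 9.0), (0.0, 1.0)]
words = bag_of_words(descriptors, vocabulary)
```

The resulting word counts are what a TF*IDF ranking operates on.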

14  D: candidate document (candidate image), containing a set of visual words  v: visual word (local feature)  df(v): document frequency of visual word v  Note: we want to know which object is present in the query image, so we return a ranked list of object clusters instead of images
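The ranking formula itself did not survive in the transcript, so the sketch below uses a common textbook TF*IDF variant (term frequency times log(N / df(v)), compared by cosine similarity) rather than the slide's exact expression. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def document_frequency(docs: Dict[str, Counter]) -> Counter:
    """df(v): number of documents (images) that contain visual word v."""
    df: Counter = Counter()
    for words in docs.values():
        df.update(set(words))
    return df


def tfidf_vector(words: Counter, df: Counter, n_docs: int) -> Dict[int, float]:
    """L2-normalized TF*IDF weights; words unseen in the database are skipped."""
    total = sum(words.values())
    vec = {v: (c / total) * math.log(n_docs / df[v])
           for v, c in words.items() if df[v] > 0}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {v: w / norm for v, w in vec.items()} if norm > 0 else {}


def rank(query: Counter, docs: Dict[str, Counter]) -> List[Tuple[str, float]]:
    """Rank database images by cosine similarity of TF*IDF vectors."""
    df = document_frequency(docs)
    q = tfidf_vector(query, df, len(docs))
    scores = []
    for name, words in docs.items():
        d = tfidf_vector(words, df, len(docs))
        scores.append((name, sum(w * d.get(v, 0.0) for v, w in q.items())))
    return sorted(scores, key=lambda s: s[1], reverse=True)


docs = {"a": Counter({1: 2, 2: 1}), "b": Counter({3: 3}), "c": Counter({2: 1, 3: 1})}
ranked = rank(Counter({1: 1, 2: 1}), docs)
```

In this toy database, image "a" shares both query words and ranks first, "c" shares one, and "b" shares none. A production system would evaluate the same scores through an inverted index instead of looping over every image.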


16  Database: organized not by individual images but by object clusters  The partly redundant information can be used to: obtain a better understanding of the object's appearance; segment objects; create more compact inverted indices

17  The feature matches from pairwise image matching are used to derive a score for each feature  Only features that match many of their counterparts in other images receive a high score  Because many of the photos are taken from varying viewpoints around the object, background features receive fewer matches


19  f: feature; i: image; [symbol lost]: set of inlying feature matches for image pair i, j; [symbol lost]: number of images in the current object cluster o; [symbols lost]: parameters, set to 1 and 1/3  Note: the bounding box is drawn around all features with confidence higher than [value lost]
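The confidence formula is not recoverable from the transcript, so the sketch below substitutes a deliberately simple stand-in: a feature's confidence is the fraction of the other images in the cluster in which it has an inlier match. The function names, the data layout, and the threshold of 0.5 are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def feature_confidence(n_features: int,
                       inliers: Dict[int, Set[int]],
                       n_images: int) -> List[float]:
    """Stand-in score: share of other cluster images that match each feature.

    inliers[j] holds the indices of this image's features that are inlier
    matches with cluster image j.
    """
    counts = [0] * n_features
    for matched in inliers.values():
        for f in matched:
            counts[f] += 1
    others = max(n_images - 1, 1)  # avoid division by zero for tiny clusters
    return [c / others for c in counts]


def bounding_box(points: List[Point], confidence: List[float],
                 threshold: float) -> Optional[Box]:
    """Box around all features whose confidence exceeds the threshold."""
    kept = [p for p, c in zip(points, confidence) if c > threshold]
    if not kept:
        return None
    xs = [x for x, _ in kept]
    ys = [y for _, y in kept]
    return (min(xs), min(ys), max(xs), max(ys))


points = [(10.0, 10.0), (20.0, 30.0), (200.0, 5.0), (15.0, 25.0)]
inliers = {1: {0, 1, 3}, 2: {0, 1}, 3: {1, 2, 3}}
conf = feature_confidence(len(points), inliers, n_images=4)
box = bounding_box(points, conf, threshold=0.5)
```

The feature at (200, 5) is matched in only one other image, falls below the threshold, and is left outside the box, which is the behaviour expected for background features.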


21  The estimated bounding boxes help to compact the inverted index of visual words  Object clusters whose photos were all taken by a single user are removed

22  The best object cluster is selected as the final result  Simple voting: each retrieved image votes for its parent cluster  Normalizing by cluster size is not feasible  Only the votes of the 5 images per cluster with the highest retrieval scores are counted
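The voting rule can be sketched as below. This assumes each counted image contributes one unit vote and that ties are broken by the cluster's best retrieval score; neither detail is stated on the slide, and the function name `vote_for_clusters` is hypothetical.

```python
from typing import Dict, List, Tuple


def vote_for_clusters(retrieved: List[Tuple[str, float]],
                      cluster_of: Dict[str, str],
                      max_votes: int = 5) -> List[Tuple[str, int]]:
    """Rank clusters by votes from retrieved images.

    retrieved: (image_id, retrieval_score) pairs.
    cluster_of: maps each image_id to its parent object cluster.
    At most max_votes images per cluster are counted, best scores first.
    """
    votes: Dict[str, int] = {}
    best: Dict[str, float] = {}
    for image_id, score in sorted(retrieved, key=lambda r: r[1], reverse=True):
        cluster = cluster_of[image_id]
        best.setdefault(cluster, score)  # first seen = highest score
        if votes.get(cluster, 0) < max_votes:
            votes[cluster] = votes.get(cluster, 0) + 1
    return sorted(votes.items(),
                  key=lambda kv: (kv[1], best[kv[0]]), reverse=True)


# Cluster "A" has 7 retrieved images, cluster "B" has 3.
retrieved = [(f"a{k}", 0.9 - 0.1 * k) for k in range(7)]
retrieved += [(f"b{k}", 0.85 - 0.1 * k) for k in range(3)]
cluster_of = {img: ("A" if img.startswith("a") else "B") for img, _ in retrieved}
result = vote_for_clusters(retrieved, cluster_of)
```

Capping the count per cluster keeps very large clusters from winning merely because they contain more images, which is one way to handle the size bias without normalizing by cluster size.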



25  Consists of two steps: bounding box estimation and labelling  Bounding box estimation: estimated in the same way as for database images; the query image is matched to a number of images in the top-ranked cluster  Labelling: the information from the object cluster is simply copied to serve as labels for the query image


27  Experiments were conducted on a large dataset collected from Flickr  A challenging test set of 674 images was collected from Picasa Web Albums  Estimated bounding boxes cover on average 52% of each image


29  Compared configurations:  baseline: TF*IDF ranking on a 500K visual vocabulary, as used in other work  bounding-box features + no single-user clusters  all features + no single-user clusters  66% random feature subset + no single-user clusters  66% random feature subset

30 67%

31  We evaluate how well the system localizes bounding boxes by measuring the intersection-over-union (IoU) of the ground-truth and hypothesis boxes  76.1%
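For reference, intersection-over-union of two axis-aligned boxes can be computed as in this small sketch. The `(x_min, y_min, x_max, y_max)` box layout and the function name `iou` are assumptions.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


ground_truth = (0.0, 0.0, 2.0, 2.0)
hypothesis = (1.0, 1.0, 3.0, 3.0)
overlap = iou(ground_truth, hypothesis)  # 1 / 7
```

A value of 1 means the hypothesis box coincides with the ground truth, and 0 means the two boxes do not overlap at all.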



39  Presented a full auto-annotation pipeline for holiday snaps  Object-level annotation with bounding box, relevant tags, Wikipedia articles and GPS location


