Unsupervised Learning of Hierarchical Spatial Structures Devi Parikh, Larry Zitnick and Tsuhan Chen.

Presentation transcript:

Unsupervised Learning of Hierarchical Spatial Structures Devi Parikh, Larry Zitnick and Tsuhan Chen

2 Our visual world … hierarchical spatial patterns. What is an object? What is context? Intro Approach Results Conclusion

3 Goal: Unsupervised!

4 Related work: [Todorovic 2008], [Fidler 2007], [Zhu 2008], [Sivic 2008]. Fully unsupervised; structure and parameters learnt; from features to multiple objects.

5 Model: Rule based [Figure labels: c_1, c_2, c_3, c_4, r_1]

6 Model: Rule based [Figure labels: c_1, c_2, c_3, r_1, r_2]

7 Model: Hierarchical rule-based [Figure labels: c_1, c_2, c_3, r_1, r_2]

8 Model: Rules R; Image-parts V; Codewords C; Features F

9 Model: Notation. V = {v}: instantiated image-parts. r_v: rule corresponding to instantiated part v. Ch(r_v) = {x}: children of rule r_v, which include the instantiated children Ch(v) and the un-instantiated children.
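
The notation on this slide can be made concrete with a small data-structure sketch. Only V, r_v, Ch(r_v) and Ch(v) come from the slide; the class names, field names and the example rule below are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """A rule r together with its children Ch(r) (hypothetical representation)."""
    name: str
    children: frozenset  # Ch(r): identifiers of the rule's children


@dataclass
class ImagePart:
    """An instantiated image-part v, tied to its corresponding rule r_v."""
    rule: Rule                                       # r_v
    instantiated: set = field(default_factory=set)   # Ch(v), a subset of Ch(r_v)

    def uninstantiated(self):
        """Children of r_v that were not instantiated for this part."""
        return set(self.rule.children) - self.instantiated


# Hypothetical example: rule r1 has three children, two of which are instantiated.
r1 = Rule("r1", frozenset({"c1", "c2", "c3"}))
v = ImagePart(rule=r1, instantiated={"c1", "c3"})
missing = v.uninstantiated()
```

Here `missing` holds the un-instantiated children, so Ch(r_v) is the union of `v.instantiated` and `missing`.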

10 Model

11 Inference

12 Inference

13 Inference

14 Inference

15 Inference

16 Inference

17 Inference

18 Inference

19 Inference

20 Inference

21 Inference: Minimum cost Steiner tree [Charikar 1998]
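
The slide names the minimum cost Steiner tree problem but gives no details. As a rough illustration of the problem only, the sketch below builds a tree from a root to a set of terminals by taking the union of the cheapest root-to-terminal paths. This is a simple baseline, not the approximation algorithm cited on the slide, and the graph, node names and edge costs are made up for the example.

```python
import heapq


def shortest_paths(graph, root):
    """Dijkstra from root; graph maps node -> {neighbor: non-negative edge cost}."""
    dist = {root: 0.0}
    parent = {}
    heap = [(0.0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for w, cost in graph.get(u, {}).items():
            nd = d + cost
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                parent[w] = u
                heapq.heappush(heap, (nd, w))
    return dist, parent


def steiner_tree_baseline(graph, root, terminals):
    """Union of the cheapest root-to-terminal paths; returns (edges, total cost)."""
    dist, parent = shortest_paths(graph, root)
    edges = set()
    for t in terminals:
        if t not in dist:
            raise ValueError(f"terminal {t!r} is unreachable from the root")
        node = t
        while node != root:
            edges.add((parent[node], node))
            node = parent[node]
    total = sum(graph[u][w] for u, w in edges)
    return edges, total


# Hypothetical directed graph: "s" is an optional intermediate (Steiner) node.
graph = {
    "root": {"a": 4.0, "b": 4.0, "s": 3.0},
    "s": {"a": 2.0, "b": 2.0},
}
edges, total = steiner_tree_baseline(graph, "root", ["a", "b"])
```

In this toy graph the baseline connects both terminals directly at cost 8, while routing through `s` would cost 7. Sharing intermediate nodes in this way is what a minimum cost Steiner tree exploits and what the baseline misses.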

22 Inference

23 Inference: Generalized distance transform [Felzenszwalb et al]
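
The slide only names the generalized distance transform, so how it is used during inference is not stated here. As background, the sketch below shows the one-dimensional squared-Euclidean case of that technique: for a sampled cost function f it computes d[p] = min over q of ((p - q)^2 + f[q]) in linear time using the lower envelope of parabolas. The function name and the sample costs are assumptions for illustration, and the costs must be finite.

```python
import math


def distance_transform_1d(f):
    """Return d with d[p] = min_q ((p - q)**2 + f[q]) for finite costs f."""
    n = len(f)
    d = [0.0] * n
    v = [0] * n            # grid positions of the parabolas in the lower envelope
    z = [0.0] * (n + 1)    # boundaries between consecutive envelope parabolas
    k = 0
    z[0], z[1] = -math.inf, math.inf
    for q in range(1, n):
        while True:
            p = v[k]
            # Intersection of the parabola rooted at q with the one rooted at p.
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2 * q - 2 * p)
            if s <= z[k]:
                k -= 1     # parabola at p is hidden by the new one; drop it
            else:
                break
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = math.inf
    k = 0
    for q in range(n):
        while z[k + 1] < q:
            k += 1
        d[q] = (q - v[k]) ** 2 + f[v[k]]
    return d


# Hypothetical per-position costs, e.g. how poorly a part matches at each location.
costs = [5.0, 0.0, 7.0, 3.0, 9.0, 1.0]
transformed = distance_transform_1d(costs)
```

Each entry of `transformed` is the best trade-off between a low cost at some position q and the squared displacement from p to q, which is the usual way such a transform combines appearance and spatial terms.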

24 Learning: EM style. Initialize rules; infer rules; update parameters; modify rules.
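
The four steps on this slide can be read as a loop. The skeleton below is a sketch of that control flow only: the four callables, the fixed iteration count and the toy stand-ins are all assumptions, since the slide does not say how each step is carried out or when learning stops.

```python
def learn_rules(images, initialize, infer, update, modify, n_iters=5):
    """EM-style loop: initialize once, then alternate inference and updates."""
    rules = initialize(images)
    for _ in range(n_iters):
        parses = [infer(rules, image) for image in images]  # infer rules per image
        rules = update(rules, parses)                        # update parameters
        rules = modify(rules, parses)                        # add or remove rules/children
    return rules


# Toy stand-ins so the skeleton runs; they only record the order of the calls.
log = []


def initialize(images):
    log.append("initialize")
    return {"rules": 0}


def infer(rules, image):
    log.append("infer")
    return image  # placeholder for a parse of the image


def update(rules, parses):
    log.append("update")
    return rules


def modify(rules, parses):
    log.append("modify")
    return {"rules": rules["rules"] + 1}


result = learn_rules(["img_a", "img_b"], initialize, infer, update, modify, n_iters=2)
```

Passing the steps in as functions keeps the loop independent of any particular inference or update procedure.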

25 Learning: Initialize rules …

26 Learning: Inference …

27 Learning: Inference …

28 Learning: Add children …

29 Learning: Add children; update parameters; pruning children; removing rules …

30 Learning: Adding rules. Randomly add rules … …

31 Behavior: Competition among rules; competition with root (noise)

32 Behavior: Competition among rules; competition with root (noise); dropping children and rules; number of children; structure of DAG and tree; # rules, parameters, structure learnt automatically; multiple instantiations of rules; multiple children with same appearance

Experiment 1: Faces & Motorbikes

34 Faces & Motorbikes: SIFT (200 words); learnt 15 L1 rules, 2 L2 rules; each L1 rule: average ~7 children; each L2 rule: average ~4 children
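
"SIFT (200 words)" refers to quantizing local descriptors against a codebook of visual words. The sketch below shows that quantization step in miniature under stated assumptions: 2-D toy descriptors and a 3-word codebook stand in for 128-dimensional SIFT descriptors and the 200-word codebook, and the function names are hypothetical.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the descriptor (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(descriptor, codebook[i]))


def codeword_histogram(descriptors, codebook):
    """Count how many descriptors are assigned to each codeword."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1
    return counts


# Toy data: three codewords and four descriptors from one hypothetical image.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, 0.2), (0.9, 1.2), (4.0, 6.0), (1.1, 0.8)]
hist = codeword_histogram(descriptors, codebook)
```

A histogram like `hist` discards where each codeword occurred. The rules in this talk keep that spatial information.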

35 Example rules

36 Patches

37 Localization behavior

38 Categorization behavior [Figure: occurrence of code-words, first level rules and second level rules in Faces vs. Motorbikes]

39 Categorization behavior. Words: 94%; Tree: 100% [Figure labels: Words, Rules, Tree; Kmeans, PLSA, SVM]

40 Edge features. Words: 55%; Tree: 82%

Experiment 2: Six categories

42 Six categories: 61 L1 rules (~9 children); 12 L2 rules (~3 children). Kim 2008: 95%; Words: 87%; Tree: 95%

Experiment 3: Scene categories

44 Scene categories [Figure labels: Image, Segmentation, Mean color, Codeword]

45 Outdoor scenes [Figure labels: rules, images]

Experiment 4: Structured street scenes

47 Windows

48 Object categories

49 Object categories

50 Object categories

51 Parts of objects

52 Multiple objects

53 Street Scenes (PLSA)

54 Dataset specific rules [Figure labels: irrelevant, relevant]

55 Conclusion: Unsupervised learning of hierarchical spatial patterns (low level features, object parts, objects, regions in scene). Rule-based approach. Learning: EM style. Inference: Minimum cost Steiner tree. Features: SIFT, edges, color segments.

56 Summary [Figure labels: I, Root, Scene, Objects, Object Parts, Features]