CS294‐43: Visual Object and Activity Recognition Prof. Trevor Darrell Spring 2009 March 17 th, 2009.

Slides:



Advertisements
Similar presentations
Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?
Advertisements

Three things everyone should know to improve object retrieval
Foreground Focus: Finding Meaningful Features in Unlabeled Images Yong Jae Lee and Kristen Grauman University of Texas at Austin.
MIT CSAIL Vision interfaces Towards efficient matching with random hashing methods… Kristen Grauman Gregory Shakhnarovich Trevor Darrell.
Efficiently searching for similar images (Kristen Grauman)
Classification using intersection kernel SVMs is efficient Joint work with Subhransu Maji and Alex Berg Jitendra Malik UC Berkeley.
Part 1: Bag-of-words models by Li Fei-Fei (Princeton)
Multi-layer Orthogonal Codebook for Image Classification Presented by Xia Li.
MIT CSAIL Vision interfaces Approximate Correspondences in High Dimensions Kristen Grauman* Trevor Darrell MIT CSAIL (*) UT Austin…
The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features Kristen Grauman Trevor Darrell MIT.
1 Part 1: Classical Image Classification Methods Kai Yu Dept. of Media Analytics NEC Laboratories America Andrew Ng Computer Science Dept. Stanford University.
CS4670 / 5670: Computer Vision Bag-of-words models Noah Snavely Object
Bag-of-features models. Origin 1: Texture recognition Texture is characterized by the repetition of basic elements or textons For stochastic textures,
Global spatial layout: spatial pyramid matching Spatial weighting the features Beyond bags of features: Adding spatial information.
1 Image Retrieval Hao Jiang Computer Science Department 2009.
Fast intersection kernel SVMs for Realtime Object Detection
Discriminative and generative methods for bags of features
Bag-of-features models Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.
Beyond bags of features: Part-based models Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.
Recognition using Regions CVPR Outline Introduction Overview of the Approach Experimental Results Conclusion.
Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?
Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.
1 Image Recognition - I. Global appearance patterns Slides by K. Grauman, B. Leibe.
Lecture 28: Bag-of-words models
Agenda Introduction Bag-of-words model Visual words with spatial location Part-based models Discriminative methods Segmentation and recognition Recognition-based.
Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.
Bag-of-features models
CS294‐43: Visual Object and Activity Recognition Prof. Trevor Darrell Spring 2009.
Object Recognition by Discriminative Methods Sinisa Todorovic 1st Sino-USA Summer School in VLPR July, 2009.
Discriminative and generative methods for bags of features
Agenda Introduction Bag-of-words models Visual words with spatial location Part-based models Discriminative methods Segmentation and recognition Recognition-based.
A String Matching Approach for Visual Retrieval and Classification Mei-Chen Yeh* and Kwang-Ting Cheng Learning-Based Multimedia Lab Department of Electrical.
Large Scale Recognition and Retrieval. What does the world look like? High level image statistics Object Recognition for large-scale search Focus on scaling.
Machine learning & category recognition Cordelia Schmid Jakob Verbeek.
The Beauty of Local Invariant Features
Review: Intro to recognition Recognition tasks Machine learning approach: training, testing, generalization Example classifiers Nearest neighbor Linear.
Bag-of-features models. Origin 1: Texture recognition Texture is characterized by the repetition of basic elements or textons For stochastic textures,
Indexing Techniques Mei-Chen Yeh.
Real-time Action Recognition by Spatiotemporal Semantic and Structural Forest Tsz-Ho Yu, Tae-Kyun Kim and Roberto Cipolla Machine Intelligence Laboratory,
Unsupervised Learning of Categories from Sets of Partially Matching Image Features Kristen Grauman and Trevor Darrel CVPR 2006 Presented By Sovan Biswas.
Step 3: Classification Learn a decision rule (classifier) assigning bag-of-features representations of images to different classes Decision boundary Zebra.
Keypoint-based Recognition Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem 03/04/10.
CSE 473/573 Computer Vision and Image Processing (CVIP)
CSE 473/573 Computer Vision and Image Processing (CVIP) Ifeoma Nwogu Lecture 24 – Classifiers 1.
Bag-of-features models. Origin 1: Texture recognition Texture is characterized by the repetition of basic elements or textons For stochastic textures,
Svetlana Lazebnik, Cordelia Schmid, Jean Ponce
Category Discovery from the Web slide credit Fei-Fei et. al.
Yao, B., and Fei-fei, L. IEEE Transactions on PAMI(2012)
SVM-KNN Discriminative Nearest Neighbor Classification for Visual Category Recognition Hao Zhang, Alex Berg, Michael Maire, Jitendra Malik.
Classical Methods for Object Recognition Rob Fergus (NYU)
Deformable Part Model Presenter : Liu Changyu Advisor : Prof. Alex Hauptmann Interest : Multimedia Analysis April 11 st, 2013.
MSRI workshop, January 2005 Object Recognition Collected databases of objects on uniform background (no occlusions, no clutter) Mostly focus on viewpoint.
Visual Categorization With Bags of Keypoints Original Authors: G. Csurka, C.R. Dance, L. Fan, J. Willamowski, C. Bray ECCV Workshop on Statistical Learning.
Methods for classification and image representation
Grouplet: A Structured Image Representation for Recognizing Human and Object Interactions Bangpeng Yao and Li Fei-Fei Computer Science Department, Stanford.
A feature-based kernel for object classification P. Moreels - J-Y Bouguet Intel.
CS 1699: Intro to Computer Vision Detection II: Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 12, 2015.
Lecture 08 27/12/2011 Shai Avidan הבהרה: החומר המחייב הוא החומר הנלמד בכיתה ולא זה המופיע / לא מופיע במצגת.
CS654: Digital Image Analysis
3D Motion Data Mining Multimedia Project Multimedia and Network Lab, Department of Computer Science.
Finding Clusters within a Class to Improve Classification Accuracy Literature Survey Yong Jae Lee 3/6/08.
Lecture IX: Object Recognition (2)
Tues April 25 Kristen Grauman UT Austin
Paper Presentation: Shape and Matching
Digit Recognition using SVMS
By Suren Manvelyan, Crocodile (nile crocodile?) By Suren Manvelyan,
Finding Clusters within a Class to Improve Classification Accuracy
Approximate Correspondences in High Dimensions
CS 1674: Intro to Computer Vision Scene Recognition
Presentation transcript:

CS294‐43: Visual Object and Activity Recognition Prof. Trevor Darrell Spring 2009 March 17 th, 2009

Last Lecture – Discriminative approaches (SVM, HCRF) Classic SVM on “bags of features”: C. Dance, J. Willamowski, L. Fan, C. Bray, and G. Csurka, "Visual categorization with bags of keypoints," in ECCV International Workshop on Statistical Learning in Computer Vision, ISM + SVM + Local Kernels: M. Fritz; B. Leibe; B. Caputo; B. Schiele: Integrating Representative and Discriminant Models for Object Category Detection, ICCV'05, Beijing, China, 2005 [M. Fritz] Local SVM: H. Zhang, A. C. Berg, M. Maire, and J. Malik, "Svm-knn: Discriminative nearest neighbor classification for visual category recognition," in CVPR '06: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Washington, DC, USA: IEEE Computer Society, 2006, pp [M. Maire] “Latent” SVM with deformable parts: P. Felzenszwalb, D. Mcallester, and D. Ramanan, "A discriminatively trained, multiscale, deformable part model," in IEEE International Conference on Computer Vision and Pattern Recognition (CVPR) Anchorage, Alaska, June 2008., June Hidden Conditional Random Fields: Y. Wang and G. Mori, “Learning a Discriminative Hidden Part Model for Human Action Recognition”, Advances in Neural Information Processing Systems (NIPS), 2008

Today – Correspondence and Pyramid-based techniques C. Berg, T. L. Berg, and J. Malik, "Shape matching and object recognition using low distortion correspondences," in CVPR '05: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) [K. Kin] K. Grauman and T. Darrell, "The pyramid match kernel: discriminative classification with sets of image features," ICCV, vol. 2, 2005, pp Vol. 2 S. Lazebnik, C. Schmid, and J. Ponce, "Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories," CVPR, vol. 2, 2006, pp [L. Bourdev] S. Maji, A. C. Berg, and J. Malik, "Classification using intersection kernel support vector machines is efficient," in Computer Vision and Pattern Recognition, CVPR IEEE Conference on, 2008, pp [S. Maji] K. Grauman and T. Darrell, "Approximate correspondences in high dimensions," in In NIPS, vol A. Bosch, A. Zisserman, and X. Munoz, "Representing shape with a spatial pyramid kernel," in CIVR '07: Proceedings of the 6th ACM international conference on Image and video retrieval [D. Bellugi]

Comparing sets of local features Previous strategies: Match features individually, vote on small sets to verify [Schmid, Lowe, Tuytelaars et al.] Explicit search for one-to- one correspondences [Rubner et al., Belongie et al., Gold & Rangarajan, Wallraven & Caputo, Berg et al., Zhang et al.,…] Compare frequencies of prototype features [Csurka et al., Sivic & Zisserman, Lazebnik & Ponce]

Partially matching sets of features We introduce an approximate matching kernel that makes it practical to compare large sets of features based on their partial correspondences. Optimal match: O(m 3 ) Greedy match: O(m 2 log m) Pyramid match: O(m) (m=num pts) [Previous work: Indyk & Thaper, Bartal, Charikar, Agarwal & Varadarajan, …]

Pyramid match: main idea descriptor space Feature space partitions serve to “match” the local descriptors within successively wider regions.

Pyramid match: main idea Histogram intersection counts number of possible matches at a given partitioning.

Pyramid match kernel For similarity, weights inversely proportional to bin size (or may be learned) Normalize these kernel values to avoid favoring large sets [Grauman & Darrell, ICCV 2005] measures difficulty of a match at level number of newly matched pairs at level

Pyramid match kernel optimal partial matching Optimal match: O(m 3 ) Pyramid match: O(mL)

Highlights of the pyramid match Linear time complexity Formal bounds on expected error Mercer kernel Data-driven partitions allow accurate matches even in high-dim. feature spaces Strong performance on benchmark object recognition datasets

Recognition results: Caltech-101 dataset Data provided by Fei-Fei, Fergus, and Perona 101 categories images per class

Jain, Huynh, & Grauman (2007) Recognition results: Caltech-101 dataset

Number of training examples Combination of pyramid match and correspondence kernels Recognition results: Caltech-101 dataset Accuracy [Kapoor et al. IJCV 2009]

Spatial Pyramid Match Kernel Lazebnik, Schmid, Ponce, Dual-space Pyramid Matching Hu et al., Representing Shape with a Pyramid Kernel Bosch & Zisserman, L=0 L=1L=2 Pyramid match kernel: examples of extensions and applications by other groups Scene recognition Shape representation Medical image classification

wavesit down Single View Human Action Recognition using Key Pose Matching, Lv & Nevatia, Spatio-temporal Pyramid Matching for Sports Videos, Choi et al., From Omnidirectional Images to Hierarchical Localization, Murillo et al Pyramid match kernel: examples of extensions and applications by other groups Action recognitionVideo indexing Robot localization

Vocabulary-guided pyramid match Uniform bins Tune pyramid partitions to the feature distribution Accurate for d > 100 Requires initial corpus of features to determine pyramid structure Small cost increase over uniform bins: kL distances against bin centers to insert points Vocabulary- guided bins [Grauman & Darrell, NIPS 2006]

Approximation quality Uniform bin pyramid match Vocabulary- guided pyramid match d=8 d=128 ETH-80 images, sets of SIFT features

Approximation quality

Bin structure and match counts Data-dependent bins allow more gradual distance ranges in high dimensions d=3 d=8d=13 d=23 d=128

Recognition on the ETH-80 ETH-80 data set 8 categories 50 images each One-vs-all SVM with PMK Features: –Harris detector (vary m with saliency threshold) –PCA-SIFT descriptor Data provided by Bastian Leibe and Bernt Schiele

Recognition on the ETH-80 Recognition accuracy (%) Training time (s) Mean number of features per set (m) ComplexityKernel Pyramid match Match [Wallraven et al.]

Today – Correspondence and Pyramid-based techniques C. Berg, T. L. Berg, and J. Malik, "Shape matching and object recognition using low distortion correspondences," in CVPR '05: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) [K. Kin] K. Grauman and T. Darrell, "The pyramid match kernel: discriminative classification with sets of image features," ICCV, vol. 2, 2005, pp Vol. 2 S. Lazebnik, C. Schmid, and J. Ponce, "Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories," CVPR, vol. 2, 2006, pp [L. Bourdev] S. Maji, A. C. Berg, and J. Malik, "Classification using intersection kernel support vector machines is efficient," in Computer Vision and Pattern Recognition, CVPR IEEE Conference on, 2008, pp [S. Maji] K. Grauman and T. Darrell, "Approximate correspondences in high dimensions," in In NIPS, vol A. Bosch, A. Zisserman, and X. Munoz, "Representing shape with a spatial pyramid kernel," in CIVR '07: Proceedings of the 6th ACM international conference on Image and video retrieval [D. Bellugi]

Next Lecture – Category Discovery from the Web R. Fergus, L. Fei-Fei, P. Perona, and A. Zisserman, "Learning object categories from google's image search," ICCV vol. 2, 2005, pp Vol. 2. Available: L.-J. Li, G. Wang, and L. Fei-Fei, "Optimol: automatic online picture collection via incremental model learning," in Computer Vision and Pattern Recognition, CVPR '07. IEEE Conference on, 2007, pp Available: F. Schroff, A. Criminisi, and A. Zisserman, "Harvesting image databases from the web," in Computer Vision, ICCV IEEE 11th International Conference on, 2007, pp Available: K. Saenko and T. Darrell, "Unsupervised Learning of Visual Sense Models for Polysemous Words". Proc. NIPS, December 2008, Vancouver, Canada. T. Berg and D. Forsyth, "Animals on the Web". In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, Washington, DC, Available: