A String Matching Approach for Visual Retrieval and Classification Mei-Chen Yeh* and Kwang-Ting Cheng Learning-Based Multimedia Lab Department of Electrical.

Slides:



Advertisements
Similar presentations
Distinctive Image Features from Scale-Invariant Keypoints David Lowe.
Advertisements

DONG XU, MEMBER, IEEE, AND SHIH-FU CHANG, FELLOW, IEEE Video Event Recognition Using Kernel Methods with Multilevel Temporal Alignment.
Multiclass SVM and Applications in Object Classification
Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?
Presented by Xinyu Chang
Evaluating Color Descriptors for Object and Scene Recognition Koen E.A. van de Sande, Student Member, IEEE, Theo Gevers, Member, IEEE, and Cees G.M. Snoek,
Carolina Galleguillos, Brian McFee, Serge Belongie, Gert Lanckriet Computer Science and Engineering Department Electrical and Computer Engineering Department.
Multi-layer Orthogonal Codebook for Image Classification Presented by Xia Li.
Recognizing hand-drawn images using shape context Gyozo Gidofalvi Computer Science and Engineering Department University of California, San Diego
MIT CSAIL Vision interfaces Approximate Correspondences in High Dimensions Kristen Grauman* Trevor Darrell MIT CSAIL (*) UT Austin…
The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features Kristen Grauman Trevor Darrell MIT.
CS395: Visual Recognition Spatial Pyramid Matching Heath Vinicombe The University of Texas at Austin 21 st September 2012.
1 Part 1: Classical Image Classification Methods Kai Yu Dept. of Media Analytics NEC Laboratories America Andrew Ng Computer Science Dept. Stanford University.
Activity Recognition Aneeq Zia. Agenda What is activity recognition Typical methods used for action recognition “Evaluation of local spatio-temporal features.
Intelligent Systems Lab. Recognizing Human actions from Still Images with Latent Poses Authors: Weilong Yang, Yang Wang, and Greg Mori Simon Fraser University,
Ziming Zhang *, Ze-Nian Li, Mark Drew School of Computing Science, Simon Fraser University, Vancouver, B.C., Canada {zza27, li, Learning.
Global spatial layout: spatial pyramid matching Spatial weighting the features Beyond bags of features: Adding spatial information.
Image Indexing and Retrieval using Moment Invariants Imran Ahmad School of Computer Science University of Windsor – Canada.
Computer Vision Group University of California Berkeley Shape Matching and Object Recognition using Shape Contexts Jitendra Malik U.C. Berkeley (joint.
Effective Image Database Search via Dimensionality Reduction Anders Bjorholm Dahl and Henrik Aanæs IEEE Computer Society Conference on Computer Vision.
Event prediction CS 590v. Applications Video search Surveillance – Detecting suspicious activities – Illegally parked cars – Abandoned bags Intelligent.
Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?
A Study of Approaches for Object Recognition
CS294‐43: Visual Object and Activity Recognition Prof. Trevor Darrell Spring 2009 March 17 th, 2009.
1 Invariant Local Feature for Object Recognition Presented by Wyman 2/05/2006.
Spatial Pyramid Pooling in Deep Convolutional
Shape Classification Using the Inner-Distance Haibin Ling David W. Jacobs IEEE TRANSACTION ON PATTERN ANAYSIS AND MACHINE INTELLIGENCE FEBRUARY 2007.
CAD’11, TaipeiDepartment of Engineering Design, IIT Madras M. Ramanathan Department of Engineering Design Indian Institute of Technology Madras.
Large Scale Recognition and Retrieval. What does the world look like? High level image statistics Object Recognition for large-scale search Focus on scaling.
Matthew Brown University of British Columbia (prev.) Microsoft Research [ Collaborators: † Simon Winder, *Gang Hua, † Rick Szeliski † =MS Research, *=MS.
Overview Introduction to local features
Unsupervised Learning of Categories from Sets of Partially Matching Image Features Kristen Grauman and Trevor Darrel CVPR 2006 Presented By Sovan Biswas.
Computer vision.
Internet-scale Imagery for Graphics and Vision James Hays cs195g Computational Photography Brown University, Spring 2010.
Problem Statement A pair of images or videos in which one is close to the exact duplicate of the other, but different in conditions related to capture,
Marcin Marszałek, Ivan Laptev, Cordelia Schmid Computer Vision and Pattern Recognition, CVPR Actions in Context.
1 Action Classification: An Integration of Randomization and Discrimination in A Dense Feature Representation Computer Science Department, Stanford University.
Andrew Bender Alexander Cobian
Svetlana Lazebnik, Cordelia Schmid, Jean Ponce
Yao, B., and Fei-fei, L. IEEE Transactions on PAMI(2012)
SVM-KNN Discriminative Nearest Neighbor Classification for Visual Category Recognition Hao Zhang, Alex Berg, Michael Maire, Jitendra Malik.
Classifying Images with Visual/Textual Cues By Steven Kappes and Yan Cao.
80 million tiny images: a large dataset for non-parametric object and scene recognition CS 4763 Multimedia Systems Spring 2008.
Representations for object class recognition David Lowe Department of Computer Science University of British Columbia Vancouver, Canada Sept. 21, 2006.
In Defense of Nearest-Neighbor Based Image Classification Oren Boiman The Weizmann Institute of Science Rehovot, ISRAEL Eli Shechtman Adobe Systems Inc.
Computer Vision Group University of California Berkeley On Visual Recognition Jitendra Malik UC Berkeley.
Visual Categorization With Bags of Keypoints Original Authors: G. Csurka, C.R. Dance, L. Fan, J. Willamowski, C. Bray ECCV Workshop on Statistical Learning.
Event retrieval in large video collections with circulant temporal encoding CVPR 2013 Oral.
Histograms of Oriented Gradients for Human Detection(HOG)
Levels of Image Data Representation 4.2. Traditional Image Data Structures 4.3. Hierarchical Data Structures Chapter 4 – Data structures for.
A feature-based kernel for object classification P. Moreels - J-Y Bouguet Intel.
Hierarchical Matching with Side Information for Image Classification
CSE 185 Introduction to Computer Vision Feature Matching.
Methods for 3D Shape Matching and Retrieval
Final Project Mei-Chen Yeh May 15, General In-class presentation – June 12 and June 19, 2012 – 15 minutes, in English 30% of the overall grade In-class.
Object Recognition as Ranking Holistic Figure-Ground Hypotheses Fuxin Li and Joao Carreira and Cristian Sminchisescu 1.
Goggle Gist on the Google Phone A Content-based image retrieval system for the Google phone Manu Viswanathan Chin-Kai Chang Ji Hyun Moon.
Finding Clusters within a Class to Improve Classification Accuracy Literature Survey Yong Jae Lee 3/6/08.
CSCI 631 – Foundations of Computer Vision March 15, 2016 Ashwini Imran Image Stitching.
Learning Mid-Level Features For Recognition
Nonparametric Semantic Segmentation
Paper Presentation: Shape and Matching
ICCV Hierarchical Part Matching for Fine-Grained Image Classification
Cheng-Ming Huang, Wen-Hung Liao Department of Computer Science
CS 1674: Intro to Computer Vision Scene Recognition
Shape matching and object recognition using shape contexts
Objects as Attributes for Scene Classification
SIFT keypoint detection
Paper Reading Dalong Du April.08, 2011.
CSE 185 Introduction to Computer Vision
Presentation transcript:

A String Matching Approach for Visual Retrieval and Classification Mei-Chen Yeh* and Kwang-Ting Cheng Learning-Based Multimedia Lab Department of Electrical and Computer Engineering University of California, Santa Barbara

Representation Based on Local Features Belongie et al.; Berg. et al.; Ling & JacobsOzkan & Duyulu; Bicego et al.; Zou et al.

Matching Features are unordered Similarity is measured only on a feature subset

Alternative Representation? Globally ordered and locally unordered! Order carries useful information Incorporate order into representation

Matching Order of features is constrained Sequences may not be of equal lengths Tolerate errors Incorporate the ground distance Measure similarity based on both matched and unmatched features

Outline Introduction  Representation based on ordered features  Matching criteria Approach Experiments  Shape retrieval  Scene recognition Conclusion and future work

Approximate String Matching The distance is defined as the minimal cost of operations that transform X into Y

Edit Distance Three operations  Insertion δ (ε, a)  Deletion δ (a, ε)  Substitution δ (a, b) Application dependent Integrates the ground distance Example: Matching two shapes Shape Context Descriptor χ 2 distance d(∙,∙) δ (ε, a) = d ( 0, a ) δ (a, ε) = d ( a, 0 ) δ (a, b) = d ( a, b)

Edit Distance (cont.) Metric if each operation cost is also a metric Preserve the ordering The edit distance reflects  The similarity of the corresponding features by substitution  The dissimilarity of unmatched features by insertion/deletion Easy to incorporate the ground distance No need to create a visual alphabet

Computing the edit distance Computation O(mn) Space O(min(m, n))

Alignment of Two Sequences Cyclic sequences Search for a pattern X in a duplicate string Y-Y y1y1 y1y1 ynyn ynyn … … x1x1 xmxm min { D m, j }

Outline Introduction  Representation based on ordered features  Matching criteria Approach Experiments  Shape retrieval  Scene recognition Conclusion and future work

Shape Retrieval MPEG-7 Core Experiment CE-Shape-1 part B  1400 shapes, 70 categories  Bull’s eye test # {correct hits in top 40} # {total possible hits} Representation  Uniformly sample 100 points  60-d (5 bins, 12 orientations) shape context descriptor  χ 2 distance, δ (ε, a) = δ (a, ε) = 0.5

Shape Retrieval Same features (shape context descriptor) Different matching methods BipartiteSC+TPS [Belongie et al.] COPAP [Scott & Nowak] Ours Different features Fourier Descriptors Ours (SC+ASM) IDSC+DP [Ling & Jacobs] Find the alignment during matching No need for any transformation model

Scene Recognition Scene dataset [Lazebnik et al.] Scene dataset  15 categories, 4485 images, images per category  100 images for training, rest for testing Representation  Spatial pyramid representation  Harris-Hessian-Laplace detector + SIFT features  200 visual words Classification  SVM with specified kernel values

Spatial Pyramid Representation Level-2 partitioning (16 bags-of-features) achieves the best performance Each image is represented by 16 bags of features

Scene Recognition Spatial Pyramid Matching : bag-to-bag matching Our matching method could allow matching across bags Using the L-2 partitioning alone and the same ground distance χ 2 SPM [Lazebnik et al.] BipartiteOurs

back 86.74% 76.73%95.48% 48.71%78.69%55.87% 35.21%67.74%52.65% 58.52%73.91%65.63% 46.55%53.76%77.72%

Conclusion A globally ordered and locally unordered representation for visual data Approximate String Matching for measuring the similarity between such representations  Order is considered  Naturally integrates the ground distance between features  Similarity is derived based on both matched and unmatched features

Future Work Image registration by using correspondences found by implementing this approach. Video retrieval and video copy detection by exploring the local alignment ability of string matching approach.

Thank you More information: