Cheng-Ming Huang, Wen-Hung Liao Department of Computer Science

Presentation transcript:

Object Recognition on Mobile Platforms Using Mixture of Global Image Features Cheng-Ming Huang, Wen-Hung Liao Department of Computer Science National Chengchi University, Taipei, TAIWAN August 30, 2012

Object Recognition ‘Scene’ Recognition ?

Outline: Objective and system architecture; Global vs. local descriptors; Proposed global image descriptors (weighted gist feature, Average Effective Number of Neighbors); Experimental results (Taiwan landmarks, Oxford dataset); Conclusion

Objective Design a computationally efficient framework for automatic scene recognition on mobile platforms Constraints: limited resources (CPU, communication bandwidth, storage)

System architecture [Diagram. Labels: Front-end, Network, Back-end, Image, Feature Extraction, Feature Compression, Query Data, Matching, Post-processing, Feature Database, Image Database, Identification]

Global vs. local descriptors. Global features: provide a succinct description of the scene structure; faster to compute. Examples: histogram, gist. Local features: characterize distinct points/parts in an image; robust features require substantial computational resources. Examples: SIFT, SURF, or HoG.

The proposed global features. Weighted gist descriptor: based on the ‘gist’ descriptor, but weighted by a saliency measure of the image region. Average effective number of neighbors (AENN): tries to capture the overall structure of the scene based on the distribution of edge pixels.

The original gist feature Computed by convolving an oriented filter with the image at several different orientations and scales. The scores for the filter convolution at each orientation and scale are stored in an array, which is the gist feature for that image.

Illustration 4x4 blocks, 6 orientations and 5 frequencies (Oliva and Torralba, 2001)

Saliency Map Graph-based visual saliency, J. Harel et al. 2007.

Weighted gist descriptor

Effective Number of Neighbors (ENN), 5x5 window. Values shown for the left (L) and right (R) examples: (L): 8, (R): 8. Effective number of neighbors: (L): 8, (R): 4/2 + 4/4 = 3.

Computing Average ENN. Step 1: Edge detection. Step 2: Keep the top q percent of the edge pixels. Step 3: Compute ENN using a DxD neighborhood. Step 4: Average the ENN in an image block to form the feature vector.
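Steps 2 to 4 above can be sketched in a few lines of Python. This is only a rough illustration under several assumptions, not the authors' implementation: the input is taken to be an edge-magnitude grid already produced by Step 1, the slides do not define ENN precisely, so a plain count of neighboring edge pixels is used as a stand-in for it, and the block average is taken over all pixels in a block. The function names `keep_top_edges`, `neighbor_count`, and `block_average` are hypothetical.

```python
def keep_top_edges(magnitude, q):
    """Step 2: mask keeping roughly the strongest q percent of nonzero edge pixels.

    Ties at the threshold are all kept, so slightly more than q percent may survive.
    """
    values = sorted((v for row in magnitude for v in row if v > 0), reverse=True)
    if not values:
        return [[False for _ in row] for row in magnitude]
    keep = max(1, round(len(values) * q / 100))
    threshold = values[keep - 1]
    return [[v > 0 and v >= threshold for v in row] for row in magnitude]


def neighbor_count(mask, window):
    """Step 3 (stand-in): count other edge pixels in a window x window neighborhood.

    ASSUMPTION: this plain count replaces the ENN measure, which the slides
    do not specify. Non-edge pixels get a value of 0.
    """
    h, w = len(mask), len(mask[0])
    r = window // 2
    counts = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            total = 0
            for yy in range(max(0, y - r), min(h, y + r + 1)):
                for xx in range(max(0, x - r), min(w, x + r + 1)):
                    if mask[yy][xx] and (yy, xx) != (y, x):
                        total += 1
            counts[y][x] = total
    return counts


def block_average(values, blocks):
    """Step 4: average over a blocks x blocks grid and return a flat feature vector.

    ASSUMPTION: the mean is taken over every pixel in the block.
    Requires the grid to be at least blocks pixels in each direction.
    """
    h, w = len(values), len(values[0])
    feature = []
    for by in range(blocks):
        y0, y1 = by * h // blocks, (by + 1) * h // blocks
        for bx in range(blocks):
            x0, x1 = bx * w // blocks, (bx + 1) * w // blocks
            cells = [values[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feature.append(sum(cells) / len(cells))
    return feature


# Toy edge-magnitude grid: one strong horizontal edge and one weak pixel.
magnitude = [
    [0, 0, 0, 0],
    [9, 9, 9, 9],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
mask = keep_top_edges(magnitude, q=80)
feature = block_average(neighbor_count(mask, window=3), blocks=2)
print(feature)
```

With the settings given later in the slides, the calls would use `q=10`, `window=7`, and `blocks=8`, which yields a 64-element vector per image.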

Average ENN descriptor: some examples

Parameter Settings (I) For weighted gist: partition the image into 4x4 blocks. Use 8 orientation channels at two different frequencies and 4 orientation channels at another frequency, totaling 20 coefficients for each block. Feature dimension = 4x4x20 = 320 for each channel. For color images: 960 dimensions.
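The dimension counts above follow from simple arithmetic, which the short sketch below reproduces. The function name `gist_dimensions` is hypothetical, and the three color channels are an assumption inferred from 960 = 3 x 320.

```python
BLOCKS_PER_SIDE = 4                     # image partitioned into 4x4 blocks
ORIENTATIONS_PER_FREQUENCY = [8, 8, 4]  # 8 orientations at two frequencies, 4 at a third
COLOR_CHANNELS = 3                      # ASSUMPTION: three channels, since 960 = 3 x 320


def gist_dimensions(blocks_per_side, orientations_per_frequency, channels):
    """Return (coefficients per block, dimensions per channel, total dimensions)."""
    coeffs_per_block = sum(orientations_per_frequency)
    per_channel = blocks_per_side * blocks_per_side * coeffs_per_block
    return coeffs_per_block, per_channel, per_channel * channels


coeffs, per_channel, total = gist_dimensions(
    BLOCKS_PER_SIDE, ORIENTATIONS_PER_FREQUENCY, COLOR_CHANNELS
)
print(coeffs, per_channel, total)  # 20 320 960
```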

Parameter Settings (II) For AENN: keep the top 10% strongest edges; partition into 8x8 image blocks; neighborhood size for computing ENN: 7x7.

Classifier Support vector machine (SVM) with probability estimate output. Information fusion using linear combination
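One way to read "information fusion using linear combination" is as a weighted sum of per-class probability vectors, for instance one from a classifier trained on each descriptor. The sketch below assumes that reading. The weight `alpha`, the function names, and the example probabilities are all hypothetical, since the slides give no fusion weights.

```python
def fuse_probabilities(p_first, p_second, alpha=0.5):
    """Linearly combine two per-class probability vectors.

    alpha weights the first vector and (1 - alpha) the second.
    ASSUMPTION: the slides do not state the weights that were used.
    """
    if len(p_first) != len(p_second):
        raise ValueError("probability vectors must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(p_first, p_second)]


def predict(fused):
    """Return the index of the class with the highest fused probability."""
    return max(range(len(fused)), key=fused.__getitem__)


# Made-up probabilities for three classes from two classifiers.
p_gist = [0.6, 0.3, 0.1]
p_aenn = [0.2, 0.7, 0.1]
fused = fuse_probabilities(p_gist, p_aenn, alpha=0.5)
print(predict(fused))  # class index 1
```

Because both inputs sum to one and the weights sum to one, the fused vector is again a valid probability distribution.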

Experiment I: Taiwan Landmarks Data collection - A total of 9530 images from 50 landmarks have been gathered. On average, each landmark contains 190 images. - Test dataset - Randomly select 20 images from each category (20x50=1000) - Training dataset - The remaining 8530 images
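The per-category hold-out described above can be sketched as follows. The function name `split_per_category`, the fixed seed, and the synthetic file names are illustrative assumptions. The per-landmark counts (30 landmarks with 191 images, 20 with 190) are invented only so that the total matches the 9530 images reported.

```python
import random


def split_per_category(images_by_category, n_test=20, seed=0):
    """Hold out n_test randomly chosen images per category; the rest form the training set."""
    rng = random.Random(seed)
    train, test = [], []
    for category, images in sorted(images_by_category.items()):
        if len(images) < n_test:
            raise ValueError(f"category {category!r} has fewer than {n_test} images")
        held_out = set(rng.sample(range(len(images)), n_test))
        for i, image in enumerate(images):
            (test if i in held_out else train).append((category, image))
    return train, test


# Synthetic stand-in for the landmark collection: 50 categories, 9530 images in total.
sizes = [191] * 30 + [190] * 20
dataset = {
    f"landmark_{c:02d}": [f"img_{c:02d}_{i:03d}.jpg" for i in range(n)]
    for c, n in enumerate(sizes)
}
train, test = split_per_category(dataset, n_test=20)
print(len(test), len(train))  # 1000 8530
```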

Experiment I: Results

Comparison of individual and hybrid approaches

Experiment II: Oxford buildings dataset Publicly available dataset with ground truth Contains 5062 high resolution images 11 different categories Image quality: - Good - OK - Bad - Junk

Recognition rate using Oxford dataset. Retain only Good and OK images. Query data: randomly select 3 images from each category. Training data: the remaining images.

Other approaches using Oxford dataset (1/3) Local configuration of SIFT-like features by a shape context - Image feature: PCA-SIFT local feature - Utilize shape context to describe local configuration. The parameters are: scale of the shape context in pixels, number of segments in scale/angle

Other approaches using Oxford dataset (2/3) Improving bag-of-features for large scale image search - Using Hessian-Affine detector and SIFT - Employ bag-of-features (BOF) framework to search for approximate nearest neighbors - Methods tested include original BOF, Hamming embedding (HE), weak geometric consistency constraints (WGC), and multiple assignment (MA)

Other approaches using Oxford dataset (3/3)

Conclusion. Proposed a framework for scene recognition on mobile platforms. Formulated two global image descriptors for recognition tasks. Experimental results and comparative analysis have demonstrated the efficacy of our proposed strategy.

Thank you Q & A