Human Identity Recognition in Aerial Images
Omar Oreifej, Ramin Mehran, Mubarak Shah
CVPR 2010, June
Computer Vision Lab, UCF

Outline
- Introduction
- Challenges
- Problem Definition
- Weighted Region Matching (WRM)
  – Pre-processing steps: Human Detection, Blob Extraction, Alignment
  – Measuring the Distance Between Blobs
  – Determining the Voter's Weight
- Experiments and Results

Introduction
Identity recognition from aerial platforms is a daunting task:
– Features vary widely across poses.
– Details vanish in low-quality images.
In tracking, objects are usually assumed to have only small displacements between observations:
– Mean Shift [4]
– Kalman filter-based tracking
– With long temporal gaps, the assumptions behind these continuous-motion models break down.

Challenges
– Low-quality images
– High pose variation
– Possibly high-density crowds
We therefore employ robust region-based appearance matching.

Problem Definition
– A user identifies the target person over a short period of time.
– Humans are assumed to maintain their clothing and general appearance.
– We formulate the problem as a voter-candidate race: earlier observations of the target act as voters, and the current detections are the candidates.

Weighted Region Matching (WRM)
Equation (1) on the slide expresses the probability of a candidate given the set of voters, where P(v_i) is the voter's prior.
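The equation itself appears only as an image in the original deck and does not survive in this transcript. A plausible reconstruction, assuming the standard marginalization over voters implied by "P(v_i) is the voter's prior" (candidate c, voters v_i):

```latex
% Equation (1), reconstructed under the assumption of marginalization over voters
P(c \mid V) \;=\; \sum_{i} P(c \mid v_i)\, P(v_i)
```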

Weighted Region Matching (WRM)
Equation (1) can be rewritten in a form similar to a mixture of Gaussians (Equation (2)), where τ is a constant parameter. Two ingredients remain to be specified:
– a robust representation of the distance between every voter-candidate pair, and
– the weight of every voter.
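Equation (2) is likewise only an image on the slide. A sketch of the mixture-like form it describes, assuming P(c | v_i) decays exponentially with the voter-candidate distance d(c, v_i) at a rate set by τ, and that P(v_i) plays the role of the voter weight w_i (whether the exponent uses d or d² is not recoverable from the transcript):

```latex
% Equation (2), assumed form: weighted sum of exponentials of voter-candidate distances
P(c \mid V) \;\propto\; \sum_{i} w_i \, \exp\!\left(-\frac{d(c, v_i)}{\tau}\right)
```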

Human Detection
We train an SVM classifier on the HOG descriptor [6].
– Positive images: humans at different scales and poses.
– 6000 negative examples: background and non-human objects.
– Train on a subset of the dataset; validate on the rest.
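A minimal sketch of this detector-training stage, assuming scikit-image's `hog` and scikit-learn's `LinearSVC` stand in for the HOG descriptor [6] and the SVM on the slide; window size, HOG parameters, and the train/validation split are illustrative, not the paper's settings.

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split

def hog_features(windows):
    """Compute a HOG descriptor for each fixed-size grayscale window."""
    return np.array([
        hog(w, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
        for w in windows
    ])

def train_human_detector(pos_windows, neg_windows):
    """Train a linear SVM on HOG features of positive (human) and negative windows,
    holding out part of the data for validation as described on the slide."""
    X = np.vstack([hog_features(pos_windows), hog_features(neg_windows)])
    y = np.concatenate([np.ones(len(pos_windows)), np.zeros(len(neg_windows))])
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = LinearSVC(C=0.01).fit(X_tr, y_tr)
    print("validation accuracy:", clf.score(X_val, y_val))
    return clf
```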

Blob Extraction
– The background regions inside the detection bounding boxes carry no information about a specific person.
– Segmentation method: kernel density estimation [12, 15].
– The pdf is estimated directly from the data, without any assumption about the underlying distribution.
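A rough sketch of the idea of KDE-based foreground/background separation, assuming per-pixel colors sampled from the border of the bounding box model the background; the border heuristic, bandwidth, and threshold are assumptions for illustration, not the exact estimator of [12, 15].

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def extract_blob(box_rgb, border=3, bandwidth=0.05, thresh=-2.0):
    """Keep pixels that are unlikely under a KDE background model.

    box_rgb: (H, W, 3) float image in [0, 1] cropped to a detection box.
    The background pdf is estimated non-parametrically from the box border,
    with no assumption on its underlying distribution. The threshold value
    is an arbitrary illustrative choice.
    """
    h, w, _ = box_rgb.shape
    border_mask = np.zeros((h, w), dtype=bool)
    border_mask[:border, :] = True
    border_mask[-border:, :] = True
    border_mask[:, :border] = True
    border_mask[:, -border:] = True

    bg_samples = box_rgb[border_mask].reshape(-1, 3)
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth).fit(bg_samples)

    log_p = kde.score_samples(box_rgb.reshape(-1, 3)).reshape(h, w)
    return log_p < thresh        # True where a pixel does not fit the background
```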

Alignment
Goal: eliminate the variation caused by camera orientation and human pose. Edge detection is noisy, so we use a coarse alignment:
– an eight-point head, shoulders, and torso (HST) model;
– the model captures the basic orientation of the upper body.

Alignment
To find the best fit of the HST model over the human blobs, we train an Active Appearance Model (AAM).

Alignment
The fitted model is used to compute an affine transformation to a desired pose: all blobs are aligned to the mean pose generated from the AAM training set.
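A small sketch of this warp, assuming the AAM fit yields the eight HST landmarks of each blob and that OpenCV's `estimateAffine2D` and `warpAffine` are acceptable stand-ins for the affine alignment to the mean pose:

```python
import numpy as np
import cv2

def align_blob(blob, hst_points, mean_pose_points):
    """Warp a blob so its fitted HST landmarks map onto the mean-pose landmarks.

    blob:             (H, W, C) image of the segmented person
    hst_points:       (8, 2) landmarks fitted by the AAM on this blob
    mean_pose_points: (8, 2) landmarks of the mean pose from the AAM training set
    """
    src = np.asarray(hst_points, dtype=np.float32)
    dst = np.asarray(mean_pose_points, dtype=np.float32)
    M, _ = cv2.estimateAffine2D(src, dst)      # least-squares/RANSAC affine fit
    h, w = blob.shape[:2]
    return cv2.warpAffine(blob, M, (w, h))
```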

Measuring the Distance Between Blobs
– Treat each blob as a group of small regions, each described by features composed of:
  – histograms of the HSV channels
  – the HOG descriptor
– We apply PCA to the feature space and keep the top 30 eigenvectors.
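A sketch of such a per-region descriptor, assuming small square patches, concatenated HSV histograms and HOG, and scikit-learn's PCA for the 30-component projection; patch size and bin counts are illustrative.

```python
import numpy as np
import cv2
from skimage.feature import hog
from sklearn.decomposition import PCA

def region_features(blob_bgr, patch=16, bins=8):
    """Split an aligned blob (uint8 BGR) into small regions and describe each
    region with per-channel HSV histograms concatenated with a HOG descriptor."""
    hsv = cv2.cvtColor(blob_bgr, cv2.COLOR_BGR2HSV)
    gray = cv2.cvtColor(blob_bgr, cv2.COLOR_BGR2GRAY)
    h, w = gray.shape
    feats = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            hsv_patch = hsv[y:y + patch, x:x + patch]
            hists = [np.histogram(hsv_patch[..., c], bins=bins, range=(0, 256))[0]
                     for c in range(3)]
            g = hog(gray[y:y + patch, x:x + patch],
                    orientations=9, pixels_per_cell=(8, 8), cells_per_block=(1, 1))
            feats.append(np.concatenate(hists + [g]))
    return np.array(feats, dtype=float)

# Project all regions from all blobs onto the top 30 principal components.
# `aligned_blobs` is a hypothetical list of aligned blob images.
# all_regions = np.vstack([region_features(b) for b in aligned_blobs])
# reduced = PCA(n_components=30).fit_transform(all_regions)
```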

Measuring the Distance Between Blobs
We use the Earth Mover's Distance (EMD) [16, 14]: with each region represented as a distribution in the feature space, EMD computes the minimum cost of matching the multiple regions of two blobs.

Measuring the Distance Between Blobs
Worked example from the slide, for two distributions P = {p_i} and Q = {q_i} plotted as histograms of pixel counts per bin: the total cost of moving P onto Q is 1·1 + 2·2 = 5, giving EMD = 5/3.
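A self-contained sketch of EMD as the transportation linear program it is, using SciPy's `linprog`. The slide's actual histograms are not recoverable, so the toy `P` and `Q` below are hypothetical but chosen to reproduce the quoted cost 1·1 + 2·2 = 5 and EMD = 5/3.

```python
import numpy as np
from scipy.optimize import linprog

def emd(p, q, cost):
    """Earth Mover's Distance between two histograms of equal total mass,
    solved as a transportation LP: minimize sum_ij cost[i, j] * flow[i, j]
    subject to row sums = p and column sums = q, flow >= 0."""
    p, q, cost = np.asarray(p, float), np.asarray(q, float), np.asarray(cost, float)
    n, m = len(p), len(q)
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0          # flow leaving bin i of P
    for j in range(m):
        A_eq[n + j, j::m] = 1.0                   # flow arriving at bin j of Q
    res = linprog(cost.reshape(-1), A_eq=A_eq, b_eq=np.concatenate([p, q]),
                  bounds=(0, None), method="highs")
    return res.fun / p.sum()                      # minimum cost divided by total flow

# Hypothetical histograms matching the slide's arithmetic: 1 unit moves distance 1
# and 2 units move distance 2, so the cost is 5 and EMD = 5 / 3.
P = [3, 0, 0]
Q = [0, 1, 2]
ground = np.abs(np.subtract.outer(np.arange(3), np.arange(3)))  # |bin_i - bin_j|
print(emd(P, Q, ground))   # 1.666... = 5/3
```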

Determining the Voter's Weight
We rank the collection of input images according to their value of information. Given the set of regions from all voters, R = {r_k}:
– we assign a weight to every region such that the most consistent regions receive higher weights;
– the weights are computed with the PageRank algorithm [3].

PageRank
– Concept: a vote, based on a random-walk algorithm.
– Example graph from the slides (nodes A, B, C, D, each linking to A): PR(A) = PR(B) + PR(C) + PR(D).
(Slide credit: VisualRank: Applying PageRank to Large-Scale Image Search, 余償鑫)

In the graph G, every region from voter i is connected to the K nearest-neighbor regions of voter j, for all i ≠ j. The final weight of a region r_k combines its size with its PageRank score, and the voter's weight w_i is the normalized sum of the weights of its regions.
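A compact sketch of this region-weighting step, assuming regions are rows of the PCA-reduced feature matrix, that G links each region to the K nearest regions of other voters with similarity-weighted edges, and that a standard damped power iteration computes PR; the similarity kernel, K, and damping factor are illustrative.

```python
import numpy as np

def region_pagerank(features, voter_ids, K=5, damping=0.85, iters=100, sigma=1.0):
    """PageRank over the region graph G: each region links to the K nearest
    regions belonging to *other* voters (assumes at least two voters),
    with similarity-weighted edges."""
    X = np.asarray(features, float)
    voter_ids = np.asarray(voter_ids)
    n = len(X)
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)     # pairwise distances
    sim = np.exp(-d / sigma)                                 # assumed similarity kernel
    A = np.zeros((n, n))
    for i in range(n):
        allowed = np.where(voter_ids != voter_ids[i])[0]     # only cross-voter edges
        nearest = allowed[np.argsort(d[i, allowed])[:K]]
        A[i, nearest] = sim[i, nearest]
    A /= A.sum(axis=1, keepdims=True)                        # row-stochastic transitions
    pr = np.full(n, 1.0 / n)
    for _ in range(iters):
        pr = (1 - damping) / n + damping * (A.T @ pr)        # damped power iteration
    return pr

# Voter weight w_i, per the slide: normalized sum of its regions' weights, where a
# region's weight also factors in its size (`sizes` is a hypothetical input array).
# region_weight = sizes * pr
# w = np.array([region_weight[voter_ids == v].sum() for v in np.unique(voter_ids)])
# w /= w.sum()
```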

Matching
Substituting the distances and the weights into Equation (2), we compute, for every candidate, the probability of belonging to the target; the best match is the candidate with the highest probability.
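Putting the pieces together, a sketch of this decision rule under the Equation (2) form assumed earlier, with EMD distances between every voter-candidate pair and the PageRank-derived voter weights w_i:

```python
import numpy as np

def best_candidate(distances, voter_weights, tau=1.0):
    """Score each candidate as a weighted sum of exponentials of its distances
    to the voters (the assumed Equation (2)) and return the highest-scoring one.

    distances:     (n_voters, n_candidates) array of blob-to-blob EMDs
    voter_weights: (n_voters,) normalized weights w_i
    tau:           spread parameter (value is illustrative)
    """
    D = np.asarray(distances, float)
    w = np.asarray(voter_weights, float)
    scores = (w[:, None] * np.exp(-D / tau)).sum(axis=0)
    return int(np.argmax(scores)), scores
```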

Experiments and Results