
Sparsity and Saliency
Xiaodi Hou
K-Lab, Computation and Neural Systems, California Institute of Technology
For the Crash Course on Visual Saliency Modeling: Behavioral Findings and Computational Models, CVPR 2013

Schedule

A brief history of SPECTRAL SALIENCY DETECTION

The surprising experiment
A hypothesis on natural image statistics and visual saliency:

myFFT = fft2(inImg);
myLAmp = log(abs(myFFT));
myPhase = angle(myFFT);
mySR = myLAmp - imfilter(myLAmp, fspecial('average', 3));
salMap = abs(ifft2(exp(mySR + 1i*myPhase))).^2;
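For readers outside MATLAB, here is a NumPy transcription of the snippet above; the `uniform_filter` stand-in for `imfilter`/`fspecial('average', 3)` and the small epsilon guard inside the log are my assumptions, not part of the original code.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def spectral_residual(img):
    """Spectral residual saliency (Hou & Zhang, CVPR 07), NumPy sketch."""
    F = np.fft.fft2(img)
    log_amp = np.log(np.abs(F) + 1e-12)   # epsilon avoids log(0)
    phase = np.angle(F)
    # Spectral residual: log amplitude minus its 3x3 local average
    residual = log_amp - uniform_filter(log_amp, size=3)
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    return sal
```

In practice the raw map is blurred with a Gaussian before display, exactly as the MATLAB pipeline does downstream.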

Is “spectral residual” really necessary?
Spectral residual reconstruction vs. unit-amplitude reconstruction.
[Guo et al., CVPR 08]:
– Phase-only Fourier Transform (PFT): all you need is the phase!
– Phase spectrum of the Quaternion Fourier Transform (PQFT): processes the grayscale image, the color-opponent images, and the frame-difference image in a single quaternion transform.
Also see:
– [Bian et al., ICONIP 09]
– [Schauerte et al., ECCV 12]
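The PFT variant is even shorter: drop the amplitude spectrum entirely and keep only the phase. A NumPy sketch (the function name is mine):

```python
import numpy as np

def pft(img):
    """Phase-only Fourier Transform saliency, in the spirit of
    Guo et al., CVPR 08: keep the phase, discard the amplitude."""
    F = np.fft.fft2(img)
    phase = np.angle(F)
    sal = np.abs(np.fft.ifft2(np.exp(1j * phase))) ** 2
    return sal
```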

Extensions on Spectral Saliency
Quaternion algebra

Feature Integration Theory:
– [R, G, B]: three scalar features, each in R^1.
Quaternion Fourier Transform [Guo et al., CVPR 08]:
– All channels are combined into a single transform.
– [RG, BY, I]: 3D feature vector; [RG, BY, I, M]: 4D feature vector.
– Quaternion sum: behaves like addition in R^4.
– Quaternion product (assuming the left-hand rule):

  ×   1   i   j   k
  1   1   i   j   k
  i   i  -1   k  -j
  j   j  -k  -1   i
  k   k   j  -i  -1
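The quaternion sum is componentwise addition in R^4; the product follows the multiplication table. A minimal sketch of the product using the standard Hamilton convention (the slide's left-hand convention would flip some signs):

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples:
    a sketch of the algebra used by PQFT-style methods."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,   # real part
            pw*qx + px*qw + py*qz - pz*qy,   # i component
            pw*qy - px*qz + py*qw + pz*qx,   # j component
            pw*qz + px*qy - py*qx + pz*qw)   # k component
```

For example, `qmul((0,1,0,0), (0,0,1,0))` realizes i×j = k from the table.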

Extensions on Spectral Saliency
Spectral saliency in the real domain

Image Signature (SIG) [Hou et al., PAMI 12]:
ImageSignature = sign(dct2(img));
– Theoretical justifications (discussed later).
– The simplest form.
QDCT [Schauerte et al., ECCV 12]:
– Extends the Image Signature to the quaternion DCT.
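The one-line MATLAB definition translates directly to SciPy. In the sketch below, the reconstruction-and-squaring step reflects how the PAMI 12 saliency map is formed from the signature; the function name is mine.

```python
import numpy as np
from scipy.fft import dctn, idctn

def image_signature_saliency(img):
    """Image Signature saliency sketch (Hou et al., PAMI 12):
    keep only the sign of the DCT spectrum, reconstruct, and square."""
    sig = np.sign(dctn(img, norm='ortho'))    # the image signature
    recon = idctn(sig, norm='ortho')          # reconstruction from signs
    return recon ** 2                          # raw map (blurred in practice)
```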

Extensions on Spectral Saliency
Saliency in videos

PQFT [Guo et al., CVPR 2008]:
– Compute the frame difference as a “motion channel”.
– Apply spectral saliency (separately, or via the quaternion transform).
Phase Discrepancy [Zhou and Hou, ACCV 2010]:

mMap1 = abs(ifft2((Amp2-Amp1).*exp(1i*Phase1)));
mMap2 = abs(ifft2((Amp1-Amp2).*exp(1i*Phase2)));

– Compensates camera ego-motion to suppress the background.
– In the limit, phase discrepancy reduces to spectral saliency.
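A NumPy transcription of the two phase-discrepancy lines (variable and function names are mine):

```python
import numpy as np

def phase_discrepancy(frame1, frame2):
    """Phase discrepancy motion maps (Zhou & Hou, ACCV 10), NumPy sketch:
    swap amplitude differences between frames while keeping each phase."""
    F1, F2 = np.fft.fft2(frame1), np.fft.fft2(frame2)
    A1, A2 = np.abs(F1), np.abs(F2)
    P1, P2 = np.angle(F1), np.angle(F2)
    m1 = np.abs(np.fft.ifft2((A2 - A1) * np.exp(1j * P1)))
    m2 = np.abs(np.fft.ifft2((A1 - A2) * np.exp(1j * P2)))
    return m1, m2
```

Identical frames produce all-zero maps, which is the expected behavior for a static scene.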

Extensions on Spectral Saliency
Scales and spectral saliency

Scale is an ill-defined problem; spectral saliency has no explicit scale parameter.
– Scale is the input size: [32x24], [64x48], and [128x96] are reasonable choices.
Multi-scale spectral saliency:
– [Schauerte et al., ECCV 12]
– [Li et al., PAMI 13]
(Figure: the same image processed at 64x48 vs. 681x511.)
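One way to realize a multi-scale variant, sketched under my own assumptions: the three working sizes come from the slide, but the Gaussian smoothing and uniform averaging below are illustrative choices, not a published recipe.

```python
import numpy as np
from scipy.fft import dctn, idctn
from scipy.ndimage import zoom, gaussian_filter

def multiscale_signature(img, sizes=((24, 32), (48, 64), (96, 128))):
    """Multi-scale spectral saliency sketch: run the sign-of-DCT map at
    several small working sizes, then upsample and average the results."""
    h, w = img.shape
    acc = np.zeros((h, w))
    for sh, sw in sizes:
        small = zoom(img, (sh / h, sw / w), order=1)       # downsample
        sal = idctn(np.sign(dctn(small, norm='ortho')), norm='ortho') ** 2
        sal = gaussian_filter(sal, sigma=2)                # smooth per scale
        acc += zoom(sal, (h / sh, w / sw), order=1)        # upsample back
    return acc / len(sizes)
```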

Extensions on Spectral Saliency
More caveats on scales

Small-object (sparsity) assumption.
Eye tracking vs. object masks (Ali will talk about this).
Can spectral methods produce masks?
– Yes, by filtering the amplitude spectrum (HFT) [Li et al., PAMI 13].
– But the “good performance” holds only in a limited sense:
  Better than other spectral methods on a salient-object dataset.
  Lower AUC than the original spectral methods on an eye-tracking dataset.
  Lower AUC than full-resolution methods on a salient-object dataset.
(Figure: HFT vs. SIG.)

A mini guide to PERFORMANCE EVALUATION

Performance Evaluation
Preliminaries

Datasets:
– Freshly baked results on the Bruce dataset.
– Judd / Kootstra results from [Schauerte et al., ECCV 2012].
AUC score (0.5 == chance):
– Center-bias normalized [Tatler et al., Vision Research 2005].
Image size:
– [64x48] for all methods.
Benchmarking procedure:
– Adaptive blurring based on [Hou et al., PAMI 2012].
Platform and timing:
– Single-threaded MATLAB on an Intel Sandy Bridge i7-2600K.
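A sketch of the center-bias-normalized (shuffled) AUC idea referenced above: negatives are fixations drawn from other images, so a pure center prior scores near chance. The function name and the fixed threshold grid are my assumptions.

```python
import numpy as np

def shuffled_auc(sal_map, fix_pts, other_fix_pts, n_thresh=100):
    """Shuffled AUC sketch in the spirit of Tatler et al., 2005.
    fix_pts: (row, col) fixations on this image (positives).
    other_fix_pts: fixations borrowed from other images (negatives)."""
    pos = np.array([sal_map[y, x] for y, x in fix_pts])
    neg = np.array([sal_map[y, x] for y, x in other_fix_pts])
    thresholds = np.linspace(sal_map.min(), sal_map.max(), n_thresh)
    tpr = np.array([(pos >= t).mean() for t in thresholds])[::-1]
    fpr = np.array([(neg >= t).mean() for t in thresholds])[::-1]
    # trapezoidal integration of the ROC curve
    return float(np.sum((fpr[1:] - fpr[:-1]) * (tpr[1:] + tpr[:-1]) / 2.0))
```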

Performance Evaluation
Quaternion vs. Feature Integration Theory

Is quaternion algebra necessary?
– Same color space for both: [RG, BY, Grayscale] (OPPO).
[Schauerte et al., ECCV 2012] report a consistent ~1% advantage of PFT over PQFT on all three datasets (perhaps due to different implementations of PQFT).

Performance Evaluation
On the choice of color spaces

RGB, CIE-Lab, CIE-Luv, OPPO.
SIG on each color channel, with uniform channel weights.
[Schauerte et al., ECCV 2012]:
– Performance is consistent across variants of spectral saliency.
– Performance fluctuates slightly across datasets.
How about combining all color channels together?

Performance Evaluation
Squeezing every last drop out of spectral saliency

AUC contribution of each additional step (results from [Schauerte et al., ECCV 2012]):

                                               Bruce    Judd    Kootstra
  SIG (Luv)
  Q-DCT (Luv)                                  ( )      ( )     ( )
  Multi-scale Q-DCT (Luv)                      ( )      ( )     ( )
  Best: M-Q-DCT, non-uniform colors and axes   ( )      ( )     ( )

Conclusions


A quantitative analysis of THE MECHANISMS OF SPECTRAL SALIENCY

In search of a theory of spectral saliency
Previous attempts

From qualitative hypotheses:
– Spectral Residual [Hou et al., CVPR 07]: the smoothed amplitude spectrum represents the background.
– Spectral Whitening [Bian et al., ICONIP 09]: taking the phase spectrum is similar to Gabor filtering plus normalization.
– Hypercomplex Fourier Transform [Li et al., PAMI 13]: the background corresponds to spikes in the amplitude spectrum.
Toward a theory, we need:
– Necessity.
– Sufficiency.

In search of a theory of spectral saliency
What do we expect from a saliency algorithm?

Image = Foreground + Background.
A saliency map should detect the spatial support (the mask) of the foreground.
Note: the image may contain negative values.

In search of a theory of spectral saliency
Spectral saliency and low/high-frequency components?

Evidence that low- and high-frequency components represent different content of the image:
– Related to Hybrid Images and the Gist of the Scene?
Smoothed high-frequency components: the saliency map.
The low-frequency component.

In search of a theory of spectral saliency
Spectral saliency and low/high-frequency components?

Let me construct a counter-example:
– A background containing both low and high frequencies.
– A 256x256 image with a 30x30 foreground square.
(Figure: input image, low-frequency components, high-frequency components.)

In search of a theory of spectral saliency
But wait, how did you generate that background?

Randomly select 10,000 (out of 65,536) frequency components and linearly combine them with Gaussian weights.
(Figure: DCT spectrum of the background, synthesized image, saliency map.)
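The construction described above can be sketched as follows; the talk does not specify the weighting beyond “Gaussian weight”, so unit-variance Gaussian coefficients are an assumption.

```python
import numpy as np
from scipy.fft import idctn

def synth_background(n_components=10_000, size=256, seed=0):
    """Sketch of the slide's background: pick n_components of the
    size*size DCT components at random, give them Gaussian weights,
    and invert the transform."""
    rng = np.random.default_rng(seed)
    spectrum = np.zeros(size * size)
    idx = rng.choice(size * size, n_components, replace=False)
    spectrum[idx] = rng.normal(size=n_components)   # Gaussian weights
    return idctn(spectrum.reshape(size, size), norm='ortho')
```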

In search of a theory of spectral saliency
But… why not just a Gaussian-noise background?

Because it didn't work: a full-spectrum Gaussian-noise background is not spectrally sparse, and the foreground no longer pops out.
(Figure: DCT spectrum of the background, image with Gaussian-noise background, saliency map.)

More observations on spectral saliency

Spectral saliency does not care how we choose those 10,000 (out of 65,536) frequency components.
(Figure: DCT spectrum of the background, square frequency-component image, saliency map.)

More observations on spectral saliency

Spectral saliency is blind to a big foreground:
– The background uses 10,000 frequency components.
– The foreground is a 150x150 square.
(Figure: big-foreground image, raw saliency map, saliency map.)

More observations on spectral saliency

A spiky background distracts spectral saliency:
– The background uses 10,000 frequency components plus 10,000 random spikes.
(Figure: spiky image, raw saliency map, smoothed saliency map.)

More observations on spectral saliency

Spectral saliency detects “invisible” foregrounds:
– The background is built from 10,000 random DCT components.
– A super-weak foreground patch is superimposed (weight on the order of machine epsilon; in MATLAB, eps == 2.2204e-16).
(Figure: background image, weighted foreground image, smoothed saliency map.)

Characterizing the properties of spectral saliency

Observations:
– Background and saliency:
  Depends on the number of DCT components.
  Invariant to which components are selected.
  Affected by construction noise (spikes).
– Foreground and saliency:
  Size matters.
  Detects “invisible” foregrounds.
Candidate hypotheses:
– The smoothed amplitude spectrum represents the background [Hou et al., CVPR 07].
– Spectral saliency is, approximately, a contrast detector [Li et al., PAMI 13].
– Spikes in the amplitude spectrum determine the foreground-background composition [Li et al., PAMI 13].
– Spectral saliency is equivalent to Gabor filtering plus normalization [Bian et al., ICONIP 09].
Whyyyyy?????

SALIENCY AND SPARSITY

A quantitative analysis of spectral saliency

Image Signature [Hou et al., PAMI 12]:
– Saliency as the problem of a small foreground on a simple background:
  “small” in terms of spatial sparsity; “simple” in terms of spectral sparsity.

ImageSignature = sign(dct2(img));

In the pixel domain: x = f + b.
In the DCT (Discrete Cosine Transform) domain: X = F + B.

The structure of the proof

Proposition 1:
– The signature of the foreground-only image is highly correlated with the signature of the entire image: sign(DCT(f)) ≈ sign(DCT(x)).
Proposition 2:
– The reconstruction energy of the signature of the foreground-only image stays within the foreground region.
Pipeline: x = f + b → dct → X → sign → X-SIG → idct → SAL; compare with f → sign(dct) → F-SIG → idct → f-SAL.
More details in the paper: X. Hou, J. Harel, and C. Koch, “Image Signature: Highlighting Sparse Salient Regions,” PAMI 2012.
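The two propositions are easy to probe numerically. The sketch below builds a spectrally sparse background plus a spatially sparse foreground and measures (1) the sign agreement between the two signatures and (2) where the reconstruction energy lands; all sizes and component counts are illustrative.

```python
import numpy as np
from scipy.fft import dctn, idctn

rng = np.random.default_rng(0)
n = 64

# Spectrally sparse background: 200 of 4096 DCT components
B = np.zeros((n, n))
idx = rng.choice(n * n, 200, replace=False)
B.flat[idx] = rng.normal(size=200)
b = idctn(B, norm='ortho')

# Spatially sparse foreground: an 8x8 patch
f = np.zeros((n, n))
f[10:18, 10:18] = rng.normal(size=(8, 8))
x = f + b

# Proposition 1: signatures of f and x agree on most components
agree = np.mean(np.sign(dctn(f, norm='ortho')) == np.sign(dctn(x, norm='ortho')))

# Proposition 2: reconstruction energy concentrates on the foreground
sal = idctn(np.sign(dctn(x, norm='ortho')), norm='ortho') ** 2
```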

Spectral properties of the foreground
Uncertainty principles, from Heisenberg to compressive sensing

Heisenberg uncertainty: signals cannot be sparse in both the spatial and the spectral domain!
(Figure: a single spike and its flat amplitude spectrum; a Dirac comb and its amplitude spectrum, which is again a Dirac comb. See Mallat, Academic Press 08.)
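Both figure examples can be verified in a few lines (a 1-D illustration of my own construction):

```python
import numpy as np

n = 64

# A single spike: maximally sparse in space, maximally spread in frequency
spike = np.zeros(n)
spike[5] = 1.0
amp_spike = np.abs(np.fft.fft(spike))   # perfectly flat amplitude spectrum

# A Dirac comb with period 8: its spectrum is again a comb,
# nonzero only at multiples of n/8 = 8
comb = np.zeros(n)
comb[::8] = 1.0
amp_comb = np.abs(np.fft.fft(comb))
```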

Spectral properties of the foreground
Uncertainty principles, from Heisenberg to compressive sensing

Uniform Uncertainty Principle:
– The inequality holds in probability.
– Almost always true for realistic sparse signals (Dirac-comb-like signals are rare).
– Gives tight bounds on the sparsity of natural signals in the spatial and Fourier domains, very close to experimental data.
E. Candes and T. Tao, “Near Optimal Signal Recovery From Random Projections: Universal Encoding Strategies?”

Spectral saliency, explained
Theory meets the empirical observations

A sparse background:
– Related to the number of DCT components.
– Invariant to the specific components selected.
– Related to construction noise.
A small foreground:
– Related to the foreground size.
– Invariant to the foreground intensity.

Related works
From saliency to background modeling

Robust PCA [Candes et al., JACM 11]:
– Surveillance video = low-rank background + sparse foreground.
– Faces = intrinsic face images + specularities/shadows.
EXACT solutions for 250 frames, in 36 minutes.
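For concreteness, here is a toy Principal Component Pursuit solver in the inexact-ALM style; the parameter defaults follow common choices in the literature, not the authors' released code, and this sketch omits stopping criteria.

```python
import numpy as np

def rpca(D, lam=None, n_iter=100):
    """Toy Principal Component Pursuit via inexact ALM:
    decompose D into L (low rank) + S (sparse).
    lam = 1/sqrt(max(m, n)) is the paper's default weight."""
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    mu = 1.25 / np.linalg.norm(D, 2)   # common ALM initialization
    mu_bar = mu * 1e7
    Y = np.zeros_like(D)
    S = np.zeros_like(D)
    L = np.zeros_like(D)
    for _ in range(n_iter):
        # L-step: singular value thresholding
        U, sig, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        L = U @ np.diag(np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # S-step: elementwise soft thresholding
        R = D - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
        # dual update
        Y += mu * (D - L - S)
        mu = min(mu * 1.5, mu_bar)
    return L, S
```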

Beyond saliency maps
Saliency as an image descriptor

d = sum(sign(dct2(x1)) ~= sign(dct2(x2)));

KNN on the FERET face database:
– Poses at 20, 10, 0, -10, and -20 degrees; expression; illumination.
– 700 training images, 700 testing images.
– 98.86% accuracy.
Hou et al., rejected unpublished work.
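The descriptor distance above transcribes directly (a NumPy version with a name of my choosing); a 1-NN classifier would then assign the label of the training image minimizing this distance.

```python
import numpy as np
from scipy.fft import dctn

def signature_distance(x1, x2):
    """Hamming distance between image signatures, mirroring the MATLAB
    one-liner d = sum(sign(dct2(x1)) ~= sign(dct2(x2)))."""
    return int(np.sum(np.sign(dctn(x1, norm='ortho'))
                      != np.sign(dctn(x2, norm='ortho'))))
```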

Conclusions

The devil is in the details:
– Qualitative descriptions are hypotheses, not theories.
The devil is in the counter-examples:
– Algorithm, know your limits!
The devil is in the sparsity.

THANK YOU!