FOCUS PRIOR ESTIMATION FOR SALIENT OBJECT DETECTION


FOCUS PRIOR ESTIMATION FOR SALIENT OBJECT DETECTION Xiaoli Sun, Associate Professor; Xiujun Zhang, PhD student. Shenzhen University, Shenzhen, Guangdong, China. xlsun@szu.edu.cn. Sept. 20, 2017

Content
1. Motivation
2. Related Work
3. Our Model
4. Evaluation

Motivation Depth of field is widely used in photography. For the salient object detection task, the focus prior is very important, because salient objects are almost always photographed in focus. This reflects the physiological mechanism of the human eye, which makes the focus prior a strong indicator of the salient object. However, state-of-the-art methods do not consider the focus prior well. Focus is a naturally strong cue for the salient object detection task.

Motivation An inspiring example: As shown in Fig.\ref{fig:focusexample}, the input image has clear contrast between the object and the background. By setting the depth of field, the photographer put the meat in focus and the dog in blur, indicating that the meat is the truly attractive object. Yet seven state-of-the-art methods fail on this case because they do not consider the focus prior: none of their results is satisfactory. Focus is a strong indicator for salient object detection, but it is easily ignored and has not been well studied.

Motivation Our motivation:
1. Study how to estimate the focus prior for salient object detection.
2. The focus prior should be independent of any specific model and easy to integrate into other models.
3. The model should be lightweight and fast.

Related Work Defocus Map Estimation:
Elder and Zucker, TPAMI, 1998
S.J. Zhuo et al., PR, 2011
Y. Zhang et al., CVPR, 2013
Brown and Tai, ICIP, 2009
Ali Karaali et al., ICIP, 2014
......
All are gradient-based methods.

Related Work Focusness Estimation: P. Jiang et al., CVPR, 2013 were the first to propose the focus prior for salient object detection. However, their focusness estimation is closely integrated with their own model and is hard to reuse in other methods.

Our Model Dataset Construction for Evaluation: The new dataset consists of 231 images. In every image the salient object is in focus and the background is blurred.

Our Model Sparse Dictionary Learning for Defocus:

$$\min_{D,\,X}\ \sum_i \| y_i - D x_i \|_2^2 \quad \text{s.t.}\quad \| x_i \|_0 \le T_0 \qquad (1)$$

Given an input matrix $Y = [y_1, \ldots, y_n]$, each vector $y_i$ can be represented by a set of dictionary atoms as above, where $D \in {R^{d \times n}}$ is an over-complete dictionary learned from $Y$ and $x_i$ is the coefficient vector used to reconstruct $y_i$. The $l_2$ norm forces the representation error to be small enough to accurately recover the original signal. The $l_0$ norm of $x_i$ limits the number of dictionary atoms in $D$ used to reconstruct $y_i$, i.e., it enforces sparsity.

Our Model Our Learned Sparse Dictionary:
1. A slight blur with a Gaussian kernel (sigma = 2) is applied to the images of the proposed dataset.
2. Overlapping 8*8 image patches are extracted from them, each forming an input vector of dimension d = 64.
3. Our sparse dictionary D with 64 atoms is trained according to Eq. 1.
4. Based on our sparse dictionary D, each image patch can be reconstructed from a few atoms together with their non-zero coefficients.
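The training steps above can be sketched with scikit-learn's mini-batch dictionary learner. This is a minimal illustration, not the authors' implementation: a random array stands in for a dataset image, and the sparsity level of 4 non-zero coefficients is an assumed value.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.feature_extraction.image import extract_patches_2d
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
image = rng.random((64, 64))                 # stand-in for a dataset image
blurred = gaussian_filter(image, sigma=2)    # step 1: slight Gaussian blur

# step 2: overlapping 8x8 patches, flattened to d = 64 vectors
patches = extract_patches_2d(blurred, (8, 8))
Y = patches.reshape(len(patches), -1)
Y -= Y.mean(axis=1, keepdims=True)           # remove per-patch DC component

# step 3: learn a dictionary D with 64 atoms (Eq. 1);
# OMP enforces the l0 sparsity constraint when coding
learner = MiniBatchDictionaryLearning(n_components=64,
                                      transform_algorithm="omp",
                                      transform_n_nonzero_coefs=4,
                                      random_state=0)
D = learner.fit(Y).components_               # 64 atoms, each of dimension 64
print(D.shape)
```

Each row of `D` is one learned atom; step 4 (reconstruction from a few atoms) is what the transform step of the learner performs.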

Our Model Focus prior estimation map:

$$F_i = \| x_i \|_0 \qquad (2)$$

Here, $y_i$ denotes the i-th patch. The focus strength for the i-th patch, denoted as $F_i$, is defined as the number of non-zero elements in $x_i$.
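Eq. 2 can be sketched as follows: each patch is sparse-coded against the learned dictionary, and its focus strength is the count of non-zero coefficients. Here random arrays stand in for the learned dictionary and the real patches, and the cap of 8 non-zeros is an assumed budget.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.default_rng(1)
D = rng.standard_normal((64, 64))              # stand-in learned dictionary
D /= np.linalg.norm(D, axis=1, keepdims=True)  # unit-norm atoms
Y = rng.standard_normal((10, 64))              # 10 flattened 8x8 patches

# sparse-code each patch with OMP, allowing at most 8 non-zero coefficients
X = sparse_encode(Y, D, algorithm="omp", n_nonzero_coefs=8)

# Eq. 2: focus strength F_i = ||x_i||_0, the number of non-zeros in x_i
focus = np.count_nonzero(X, axis=1)
```

Arranging the per-patch values of `focus` back on the image grid yields the focus prior estimation map.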

Our Model Enhancement by Objectness Analysis: When glancing at a scene, objects attract human attention much more easily than flat regions. We use the objectness proposal method in [21] to enhance the focus prior map.
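One simple way to realize such an enhancement is to accumulate scored proposal windows into a pixel-level objectness map and weight the focus prior with it. This is a sketch under the assumption that each proposal is a scored rectangle; the slide uses the proposal method of [21], and the exact combination rule is not shown here.

```python
import numpy as np

def objectness_heatmap(boxes, scores, shape):
    """Accumulate scored proposal boxes (x0, y0, x1, y1) into a
    normalized pixel-level objectness map."""
    heat = np.zeros(shape)
    for (x0, y0, x1, y1), s in zip(boxes, scores):
        heat[y0:y1, x0:x1] += s
    if heat.max() > 0:
        heat /= heat.max()
    return heat

focus_map = np.full((32, 32), 0.5)          # stand-in focus prior map
boxes = [(4, 4, 20, 20), (10, 8, 28, 24)]   # hypothetical proposals
scores = [0.9, 0.6]                         # hypothetical objectness scores
obj = objectness_heatmap(boxes, scores, focus_map.shape)
enhanced = focus_map * obj  # suppress focus responses in flat background
```

Pixels covered by no proposal are driven toward zero, so flat background regions are suppressed even if they contain spurious focus responses.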

Our Model

Evaluation
Datasets: ASD, the proposed dataset
Evaluation categories:
1. Comparison with the other focus map for salient object detection
2. Improvement to state-of-the-art methods for salient object detection

Evaluation Comparison with the Other Focus Map for Salient Object Detection: To the best of our knowledge, the work in \cite{UFO2013CVPR} is the first to estimate focusness for salient object detection. Fig.\ref{fig:focuscompare} shows the comparison results with \cite{UFO2013CVPR}. As can be seen, our method produces a more accurate focus prior map than \cite{UFO2013CVPR}. In particular, benefiting from the object-level analysis, the final results in Fig.\ref{fig:focuscompare} (d) are much cleaner. In some cases, the final results are very close to the ground truths, even without introducing contrast features in color, shape, etc. This comparison experiment confirms the effectiveness and importance of our focus prior map.

Evaluation Improvement to State-of-the-art Methods for Salient Object Detection: Our focus prior map estimates the in-focus object of a given image. It can easily be integrated into other methods and helps improve their performance. We have integrated our focus prior map into four state-of-the-art methods: wCtr\cite{wei2014rbd}, SF\cite{sf2012cvpr}, GS\cite{wei2012geodesic}, and MR\cite{yang2013saliency}. The experimental results on the ASD dataset\footnote{\url{http://ivrgwww.epfl.ch/supplementary_material/RK_CVPR09/}.} are shown in Fig.\ref{fig:pr} (a): the performance of every method is improved. To better show the effect of our method, Fig.\ref{fig:pr} (b) gives the comparison on the proposed dataset, where the effect is even more obvious: the performance of all methods improves by a large margin after integrating our focus prior map.
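The precision-recall curves in Fig.\ref{fig:pr} follow the standard protocol of sweeping a binarization threshold over the saliency map and comparing against the binary ground truth. A minimal sketch (the function name and toy data are illustrative, not from the paper):

```python
import numpy as np

def pr_curve(saliency, gt, thresholds=np.linspace(0.0, 1.0, 11)):
    """Precision and recall of a saliency map, swept over thresholds."""
    gt = gt.astype(bool)
    precision, recall = [], []
    for t in thresholds:
        pred = saliency >= t
        tp = np.logical_and(pred, gt).sum()          # true positives
        precision.append(tp / max(pred.sum(), 1))    # guard empty prediction
        recall.append(tp / max(gt.sum(), 1))
    return np.array(precision), np.array(recall)

gt = np.zeros((16, 16)); gt[4:12, 4:12] = 1   # toy ground-truth mask
sal = gt * 0.8 + 0.1                          # toy saliency map
p, r = pr_curve(sal, gt)
```

Plotting `r` against `p` for each method, with and without the focus prior, gives curves of the kind shown in Fig.\ref{fig:pr}.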

Conclusions
1. We have proposed a novel method for estimating the focus prior map of any given image.
2. Sparse representation is used to learn the defocus dictionary, and the number of non-zero coefficients describes the focus strength of each patch.
3. Object-level analysis is introduced to boost the performance, since the focus prior is meaningful for objects in focus.

Thank You!