A Model of Saliency-Based Visual Attention for Rapid Scene Analysis
黃文中, 2009-01-08

Outline:
- Introduction
- The Model
- Results
- Conclusion

Introduction


- Many visual processes are expensive.
- Humans don't process the whole visual field.
- How do we decide what to process?
- How can we use insights about this to make machine vision more efficient?

- Salience ~ visual prominence.
- Must be cheap to calculate.
- Related to features collected at very early stages of visual processing.
- Colour, orientation, intensity change, and motion are all important indicators of salience.

The saliency map is a topographically arranged map that represents the visual saliency of a corresponding visual scene.

Two types of stimuli:
- Bottom-up: depends only on the instantaneous sensory input, without taking the internal state of the organism into account.
- Top-down: takes the internal state into account, such as the goals the organism has at the time, its personal history, and its experiences.

The Model


- Extraction: extract feature vectors at locations over the image plane.
- Activation: form an "activation map" (or maps) using the feature vectors.
- Normalization/combination: normalize the activation map (or maps), then combine the maps into a single map.

The pipeline:
1. Nine spatial scales are created using dyadic Gaussian pyramids.
2. Each feature is computed by a set of linear "center-surround" operations akin to visual receptive fields.
3. Normalization.
4. Across-scale combination into three "conspicuity maps".
5. Linear combination to create the saliency map.
6. Winner-take-all.

The original image is decomposed into sets of lowpass and bandpass components via Gaussian and Laplacian pyramids. The Gaussian pyramid consists of lowpass-filtered (LPF) copies of the image; the Laplacian pyramid consists of bandpass-filtered (BPF) components, the differences between adjacent Gaussian levels.
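A minimal sketch of this decomposition in Python with SciPy (illustrative, not the authors' implementation; the Gaussian width and interpolation order are assumptions):

```python
from scipy.ndimage import gaussian_filter, zoom

def gaussian_pyramid(image, levels=9):
    """Dyadic Gaussian pyramid: lowpass filter, then subsample by 2 per level."""
    pyr = [image.astype(float)]
    for _ in range(1, levels):
        lpf = gaussian_filter(pyr[-1], sigma=1.0)  # lowpass-filtered (LPF) copy
        pyr.append(lpf[::2, ::2])                  # dyadic subsampling
    return pyr

def laplacian_pyramid(gauss_pyr):
    """Bandpass (BPF) levels: differences between adjacent Gaussian levels."""
    lap = []
    for fine, coarse in zip(gauss_pyr[:-1], gauss_pyr[1:]):
        up = zoom(coarse, 2, order=1)[:fine.shape[0], :fine.shape[1]]
        lap.append(fine - up)       # bandpass component at this scale
    lap.append(gauss_pyr[-1])       # coarsest lowpass residual
    return lap
```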


Intensity image: $I = (r + g + b)/3$.

Color channels (broadly tuned): $R = r - (g + b)/2$, $G = g - (r + b)/2$, $B = b - (r + g)/2$, $Y = (r + g)/2 - |r - g|/2 - b$ (negative values are set to zero).

Local orientation information: $O(\sigma, \theta)$, obtained by filtering $I$ with oriented Gabor pyramids, $\theta \in \{0^\circ, 45^\circ, 90^\circ, 135^\circ\}$.
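A direct NumPy sketch of these channels (the Gabor-based orientation channel is omitted for brevity):

```python
import numpy as np

def feature_channels(rgb):
    """Intensity and broadly tuned color channels from an H x W x 3 float image."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    I = (r + g + b) / 3.0                                  # intensity image
    R = np.clip(r - (g + b) / 2.0, 0.0, None)              # red
    G = np.clip(g - (r + b) / 2.0, 0.0, None)              # green
    B = np.clip(b - (r + g) / 2.0, 0.0, None)              # blue
    Y = np.clip((r + g) / 2.0 - np.abs(r - g) / 2.0 - b, 0.0, None)  # yellow
    return I, R, G, B, Y
```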


Pipeline, step 2: center-surround feature computation.

The center-surround difference between a fine "center" scale $c$ and a coarser "surround" scale $s$, denoted $\ominus$, is obtained by interpolation of the coarser map to the finer scale and point-by-point subtraction, with $c \in \{2, 3, 4\}$ and $s = c + \delta$, $\delta \in \{3, 4\}$.

Intensity contrast: $\mathcal{I}(c,s) = |I(c) \ominus I(s)|$

Color double-opponency: $\mathcal{RG}(c,s) = |(R(c) - G(c)) \ominus (G(s) - R(s))|$ and $\mathcal{BY}(c,s) = |(B(c) - Y(c)) \ominus (Y(s) - B(s))|$

Orientation feature maps: $\mathcal{O}(c,s,\theta) = |O(c,\theta) \ominus O(s,\theta)|$
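A sketch of the $\ominus$ operator on the pyramids built above (`center_surround` is a hypothetical helper name; linear interpolation is an assumption):

```python
import numpy as np
from scipy.ndimage import zoom

def center_surround(pyr, c, s):
    """|pyr[c] (-) pyr[s]|: interpolate the surround map up to the finer
    center scale, subtract point by point, and take the absolute value."""
    center = pyr[c]
    surround = zoom(pyr[s], 2 ** (s - c), order=1)
    surround = surround[:center.shape[0], :center.shape[1]]
    return np.abs(center - surround)

# Example: the six intensity-contrast maps.
# intensity_maps = [center_surround(I_pyr, c, c + d) for c in (2, 3, 4) for d in (3, 4)]
```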

Pipeline, step 3: normalization.

Map normalization operator $N(\cdot)$:

1) Normalize the values in the map to a fixed range $[0, M]$, in order to eliminate modality-dependent amplitude differences.
2) Find the location of the map's global maximum $M$ and compute the average $\bar{m}$ of all its other local maxima.
3) Globally multiply the map by $(M - \bar{m})^2$.

This method is called global non-linear normalization.
Pros:
1) Computationally very simple.
2) Non-iterative, so it easily allows a real-time implementation.
Cons:
1) Not very biologically plausible, since global computations are used.
2) Not robust to noise, which can be stronger than the signal.
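A sketch of $N(\cdot)$ following the three steps above; the neighborhood used to detect local maxima is an assumption, since the slides do not specify one:

```python
import numpy as np
from scipy.ndimage import maximum_filter

def normalize_map(m, M=1.0, size=7):
    """Global non-linear normalization N(.)."""
    m = m - m.min()
    m = M * m / (m.max() + 1e-12)            # 1) rescale to the fixed range [0, M]
    peaks = m[(maximum_filter(m, size=size) == m) & (m > 0)]
    mbar = (peaks.sum() - M) / max(peaks.size - 1, 1)  # 2) mean of the other local maxima
    return m * (M - mbar) ** 2               # 3) global multiplication by (M - mbar)^2
```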

Non-classical surround inhibition:
- Interactions occur within each individual feature map rather than between maps.
- Inhibition is strongest at a particular distance from the center and weakens at both shorter and longer distances.
- The structure of these non-classical interactions can be coarsely modeled by a two-dimensional difference-of-Gaussians (DoG) connection pattern.

  24


Pipeline, step 4: across-scale combination into conspicuity maps.

The across-scale addition "$\oplus$" consists of reduction of each map to scale 4 and point-by-point addition; the feature maps are combined into three conspicuity maps:

$\bar{\mathcal{I}} = \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} N(\mathcal{I}(c,s))$

$\bar{\mathcal{C}} = \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} \big[ N(\mathcal{RG}(c,s)) + N(\mathcal{BY}(c,s)) \big]$

$\bar{\mathcal{O}} = \sum_{\theta \in \{0^\circ, 45^\circ, 90^\circ, 135^\circ\}} N\Big( \bigoplus_{c=2}^{4} \bigoplus_{s=c+3}^{c+4} N(\mathcal{O}(c,s,\theta)) \Big)$
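A sketch of $\oplus$, reusing the `normalize_map` sketch from above (cropping to a common size is an implementation choice, not from the paper):

```python
from scipy.ndimage import zoom

def across_scale_add(maps_with_scales, target_scale=4):
    """Rescale each normalized (map, scale) pair to scale 4 and add point by point."""
    resized = []
    for fmap, scale in maps_with_scales:
        factor = 2.0 ** (scale - target_scale)   # e.g. a scale-2 map shrinks by 4
        resized.append(zoom(normalize_map(fmap), factor, order=1))
    h = min(r.shape[0] for r in resized)
    w = min(r.shape[1] for r in resized)
    return sum(r[:h, :w] for r in resized)
```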

Pipeline, step 5: linear combination into the saliency map.

The three conspicuity maps are normalized and summed into the final input $S$ to the saliency map:

$S = \frac{1}{3} \big( N(\bar{\mathcal{I}}) + N(\bar{\mathcal{C}}) + N(\bar{\mathcal{O}}) \big)$

The weight of each channel is tunable.
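A one-line sketch of this combination, again assuming the `normalize_map` helper sketched earlier:

```python
def saliency(I_bar, C_bar, O_bar, weights=(1/3, 1/3, 1/3)):
    """Final input S to the saliency map; per-channel weights are tunable."""
    wI, wC, wO = weights
    return wI * normalize_map(I_bar) + wC * normalize_map(C_bar) + wO * normalize_map(O_bar)
```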

Pipeline, step 6: winner-take-all.

At any given time, only one location is selected from the early representation and copied into the central representation.

1) The focus of attention (FOA) is shifted to the location of the winner neuron.
2) The global inhibition of the winner-take-all (WTA) network is triggered and completely inhibits (resets) all WTA neurons.
3) Local inhibition is transiently activated in the saliency map (SM), in an area with the size and new location of the FOA ("inhibition of return").
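A greedy functional stand-in for the WTA-plus-inhibition-of-return loop (a real WTA is a dynamical neural circuit; `foa_radius` is an assumed value):

```python
import numpy as np

def scan_path(sal, n_fixations=5, foa_radius=20):
    """Repeatedly pick the most salient location as the FOA, then zero a
    disc around it so attention can move on (inhibition of return)."""
    sal = sal.copy()
    ys, xs = np.mgrid[:sal.shape[0], :sal.shape[1]]
    fixations = []
    for _ in range(n_fixations):
        y, x = np.unravel_index(np.argmax(sal), sal.shape)  # the WTA winner
        fixations.append((y, x))
        sal[(ys - y) ** 2 + (xs - x) ** 2 <= foa_radius ** 2] = 0.0
    return fixations
```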

Results


Conclusion

The paper proposed a conceptually simple computational model for saliency-driven focal visual attention. The framework can consequently be easily tailored to arbitrary tasks through the implementation of dedicated feature maps.

References:
[1] L. Itti, C. Koch, and E. Niebur, "A Model of Saliency-Based Visual Attention for Rapid Scene Analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 11, pp. 1254-1259, Nov. 1998.
[2] L. Itti and C. Koch, "A saliency-based search mechanism for overt and covert shifts of visual attention," Vision Research, vol. 40, no. 10-12, pp. 1489-1506, May 2000.
[3] H. Greenspan, S. Belongie, R. Goodman, P. Perona, S. Rakshit, and C. H. Anderson, "Overcomplete Steerable Pyramid Filters and Rotation Invariance," Proc. IEEE Computer Vision and Pattern Recognition, Seattle, WA, pp. 222-228, June 1994.