Slide 1: Visual Attention and Recognition Through Neuromorphic Modeling of "Where" and "What" Pathways
Zhengping Ji, Embodied Intelligence Laboratory, Computer Science and Engineering, Michigan State University, East Lansing, USA

Slide 2: Outline
- Attention and recognition: a chicken-and-egg problem
- Motivation: brain-inspired, neuromorphic modeling of the brain's visual pathways
- Saliency-based attention
- Where-What Network (WWN):
  - how to integrate saliency-based attention and top-down attention control
  - how attention and recognition help each other
- Conclusions and future work

Slide 3: What is attention?

Slide 4: Bottom-up Attention (Saliency)

Slide 5: Bottom-up Attention (Saliency)

Slide 6: Attention Shifting

Slide 7: Attention Shifting

Slide 8: Attention Shifting

Slide 9: Attention Shifting

Slide 10: Spatial Top-down Attention Control

Slide 11: Spatial Top-down Attention Control (e.g., pay attention to the center)

Slide 12: Object-based Top-down Attention Control

Slide 13: Object-based Top-down Attention Control (e.g., pay attention to the square)

Slide 14: The Chicken-and-Egg Problem
- Without attention, recognition cannot do well:
  - recognition requires attended areas for further processing.
- Without recognition, attention is limited:
  - attention draws not only on bottom-up saliency cues, but also on top-down object-dependent signals and top-down spatial control.

Slide 15: Problem

Slide 16: Challenge
- High-dimensional input space
- Background noise
- Large variance:
  - scale
  - shape
  - illumination
  - viewpoint
  - ...

Slide 17: Saliency-based Attention (I)
A naive approach: pick the attention window by guessing. Example system built on an IHDR tree, with two parts:
- Boundary detection part: maps two visual images to the correct road-boundary type for each sub-window (reinforcement learning).
- Action generation part: maps the road-boundary type to the correct heading direction (supervised learning).
[Figure: desired path with attention sub-windows Win1 through Win6 and labels e1 through e6.]

Slide 18: Saliency-based Attention (II)
Low-level image processing (Itti, Koch & Niebur, 1998).
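To make the low-level stage concrete, here is a minimal sketch of center-surround saliency in the spirit of Itti, Koch & Niebur (1998), reduced to a single intensity channel; the full model also uses color-opponency and orientation (Gabor) channels plus a winner-take-all network. The function name and sigma values are illustrative assumptions, not taken from the talk.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def saliency_map(image, center_sigmas=(1, 2), surround_sigmas=(4, 8)):
        """Sum of center-surround contrasts across scales, normalized to [0, 1]."""
        img = np.asarray(image, dtype=float)
        sal = np.zeros_like(img)
        for c in center_sigmas:
            center = gaussian_filter(img, sigma=c)        # fine-scale "center"
            for s in surround_sigmas:
                surround = gaussian_filter(img, sigma=s)  # coarse "surround"
                sal += np.abs(center - surround)          # on-off and off-on contrast
        sal -= sal.min()
        return sal / sal.max() if sal.max() > 0 else sal

    # The most salient location would be attended first:
    # r, c = np.unravel_index(np.argmax(saliency_map(frame)), frame.shape)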

Slide 19: Review
- Attention and recognition: a chicken-and-egg problem
- Motivation: brain-inspired, neuromorphic modeling of the brain's visual pathways
- Saliency-based attention
- Where-What Network (WWN):
  - how to integrate saliency-based attention and top-down attention control
  - how attention and recognition help each other
- Conclusions and future work

Slide 20: Biological Motivations

Slide 21: Challenge: Foreground Teaching
- How does a neuron separate a foreground from a complex background?
- No teacher is needed to hand-segment the foreground.
- Fixed foreground, changing background:
  - e.g., while a baby tracks an object.
- The background weights average out, so the background has no effect during neuronal competition (a toy demonstration follows below).
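A toy numerical sketch (my illustration, not from the slides) of why the background weights average out: under incremental Hebbian-style averaging, weight components that always see the same foreground converge to that foreground, while components that see a fresh random background each frame converge to the background's featureless mean.

    import numpy as np

    rng = np.random.default_rng(0)
    fg = rng.random(100)                 # fixed foreground patch (flattened 10x10)
    w = np.zeros(200)                    # weights: 100 foreground + 100 background dims
    for t in range(1, 5001):
        x = np.concatenate([fg, rng.random(100)])   # same foreground, fresh background
        w += (x - w) / t                 # incremental mean of the inputs seen so far
    print(np.abs(w[:100] - fg).max())    # ~0: foreground weights match the object
    print(w[100:].std())                 # ~0: background weights flatten to ~0.5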

Slide 22: Novelty
- Bottom-up attention:
  - Koch & Ullman 1985; Itti, Koch & Niebur 1998; Baker et al. 2001; etc.
- Position-based top-down control:
  - Olshausen et al. 1993; Tsotsos et al. 1995; Mozer et al. 1996; Schill et al. 2001; Rao et al. 2004; etc.
- Object-based top-down control:
  - Deco & Rolls 2004 (no performance evaluation); etc.
- Our work:
  - saliency is based on developed (learned) features;
  - both bottom-up and top-down control;
  - top-down signal: object, position, or none;
  - attention and recognition form a single process.

Slide 23: ICDL Architecture
[Figure: Image (40×40) → V1 (11×11 receptive fields) → V2 (21×21 receptive fields) → motors. The "where" motor outputs a pixel-based position (r, c) over the 40×40 image; the "what" motor is global; the foreground size is fixed at 20×20.]

Slide 24: Multi-level Receptive Fields
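A minimal sketch of the staged receptive fields, assuming the sizes shown on slide 23 (a 40×40 image with 11×11 fields into V1 and 21×21 fields into V2); patch extraction is done with sliding windows, one patch per neuron position.

    import numpy as np
    from numpy.lib.stride_tricks import sliding_window_view

    image = np.random.rand(40, 40)
    v1_inputs = sliding_window_view(image, (11, 11))   # shape (30, 30, 11, 11)
    # Each V1 neuron at grid position (i, j) sees the 11x11 patch v1_inputs[i, j];
    # V2 neurons would in turn pool V1 responses through larger 21x21 windows.
    print(v1_inputs.shape)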

Slide 25: Layer Computation
- Compute the pre-response of cell (i, j) at time t.
- Sort: z_1 ≥ z_2 ≥ ... ≥ z_k ≥ ... ≥ z_m.
- Only the top-k neurons respond, to keep selectiveness and long-term memory (sketched below).
- The response range is normalized.
- Update the local winners.
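A minimal sketch of the top-k competition step above; the linear rescaling against the (k+1)-th pre-response is an assumption, since the slides do not give the normalization formula.

    import numpy as np

    def top_k_responses(z, k):
        """Keep only the k largest pre-responses; zero the rest and
        rescale the survivors into (0, 1]."""
        z = np.asarray(z, dtype=float).ravel()
        order = np.argsort(z)[::-1]        # so that z_1 >= z_2 >= ... >= z_m
        out = np.zeros_like(z)
        baseline = z[order[k]] if k < z.size else z.min()  # (k+1)-th largest value
        top = order[:k]
        span = z[order[0]] - baseline
        out[top] = (z[top] - baseline) / span if span > 0 else 1.0
        return out

    print(top_k_responses([0.2, 0.9, 0.5, 0.7], k=2))  # [0.  1.  0.  0.5]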

Slide 26: In-place Learning Rule
- Do not use back-propagation:
  - not biologically plausible;
  - does not give long-term memory.
- Do not use any distribution model (e.g., a Gaussian mixture):
  - avoids the high complexity of covariance matrices.
- New Hebbian-like rule (sketched below):
  - automatic plasticity scheduling: only winners update;
  - minimum error toward the target at every incremental estimation stage (local first principal component).
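A minimal sketch of a Hebbian-like in-place update with plasticity scheduling. The amnesic learning-rate schedule below is an assumption borrowed from related in-place learning work, not a formula given on the slide; only winning neurons would call this.

    import numpy as np

    def inplace_update(w, x, y, age, amnesic=2.0):
        """Move weight vector w toward the response-weighted input y * x.
        age counts how many times this neuron has won so far."""
        age += 1
        lr = (1.0 + amnesic) / (age + amnesic)   # plasticity decays with firing age
        w = (1.0 - lr) * w + lr * y * x          # retention term + Hebbian increment
        return w, age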

Slide 27: Top-down Attention
- Recruit and identify class-invariant features.
- Recruit and identify position-invariant features.

Slide 28: Experiment
- Foreground objects (20×20) are defined by the "what" motor.
- Attended areas are defined by the "where" motor.
- Backgrounds are randomly selected patches (40×40) from natural images.
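A minimal sketch of how such a training sample could be composed: a 20×20 foreground pasted at a supervised location on a random 40×40 background patch. The uniform location choice and the random-array stand-ins are my assumptions.

    import numpy as np

    rng = np.random.default_rng(1)

    def compose(foreground, background, r, c):
        """Paste a 20x20 foreground onto a 40x40 background at top-left (r, c)."""
        img = background.copy()
        img[r:r + 20, c:c + 20] = foreground
        return img

    fg = rng.random((20, 20))            # stands in for one object image
    bg = rng.random((40, 40))            # stands in for a natural-image patch
    r, c = rng.integers(0, 21, size=2)   # top-left corner; 40 - 20 = 20 is the max
    sample = compose(fg, bg, r, c)       # "where" label: (r, c); "what" label: object id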

Slide 29: Developed Layer 1
Bottom-up synaptic weights of neurons in Layer 1, developed from randomly selected patches of natural images.

Slide 30: Developed Layer 2
Bottom-up synaptic weights of neurons in Layer 2. These are not intuitive to interpret directly, which motivates the response-weighted stimuli on the next slide.

Slide 31: Response-Weighted Stimuli for Layer 2
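Response-weighted stimuli are a standard way to visualize what a deeper neuron prefers when its raw weights are unreadable: average the input images, weighted by how strongly the neuron fired to each. The slides do not give a formula, so the weighted mean below is an assumption.

    import numpy as np

    def response_weighted_stimulus(images, responses):
        """images: array (n, H, W); responses: (n,) firing of one neuron per image."""
        r = np.asarray(responses, dtype=float)
        return np.tensordot(r, images, axes=1) / (r.sum() + 1e-12)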

Slide 32: Experimental Result I
Recognition rate with incremental learning.

Slide 33: Experimental Result II
(a) Examples of input images; (b) responses of the attention ("where") motors when supervised by the "what" motors; (c) responses of the attention ("where") motor when "what" supervision is not available.

Slide 34: Summary
- The "what" motor helps direct the network's attention to features of a particular object.
- The "where" motor helps direct attention to positional information (accuracy improves from 45% to 100% when "where" information is present).
- Saliency-based bottom-up attention, location-based top-down attention, and object-based top-down attention are all integrated through the top-k spatial competition rule.

Slide 35: Problems
- The accuracy of the "where" motor is not good: 45.53%.
- Layer 1 was developed offline.
- More layers are needed to handle more positions.
- The "where" motor should be given externally, instead of as a retina-based representation.
- No internal iterations, especially when the number of hidden layers is larger than one.
- No cross-level projections.

Slide 36: Fully Implemented WWN (Original Design)
[Figure: architecture diagram. Image (40×40) → V1 (40×40) → V2 (40×40) → V3 → V4 (40×40), with receptive fields of 11×11, 21×21, and 31×31; MT, LIP, and PP feed the "where" motor, which outputs (r, c) over 25 centers for a fixed-size foreground; IT (40×40) feeds the global "what" motor over 4 objects.]

Slide 37: Problems
- The accuracies of the "where" and "what" motors are not good: 25.53% for the "what" motor and 4.15% for the "where" motor.
- Too many parameters to tune.
- Training is extremely slow.
- How to do the internal iterations:
  - "sweeping" way: always use the most recently updated weights and responses;
  - alternatively, always use the weights and responses from iteration p-1, where p is the current iteration number.
- The response should not be normalized within each lateral-inhibition neighborhood.

Slide 38: Modified Simple Architecture
[Figure: Image (40×40) → V1 (11×11 receptive fields) → V2 (21×21 receptive fields) → motors. The "what" motor covers 5 objects (global); the "where" motor outputs (r, c) over 5 centers with retina-based supervision; foreground size fixed at 20×20.]

Slide 39: Advantages
- Internal iterations are not necessary.
- The network runs much faster.
- It is easier to track neural representations and evaluate performance.
- Performance evaluation:
  - the "what" motor reaches 100% accuracy on a disjoint test;
  - the "where" motor reaches 41.09% accuracy on a disjoint test.

Slide 40: Problems
[Figure panels: top-down projection from the motor + bottom-up responses; top-down responses; total responses.] The total response is dominated by the top-down projection.

Slide 41: Solution
- Sparsify the bottom-up responses by keeping only the local top-k winners of the bottom-up responses (sketched below).
- The performance of the "where" motor increases from around 40% to 91%.
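A minimal sketch of the fix: sparsify the bottom-up map with a local top-k mask before adding the top-down projection, so the top-down term can bias but no longer dominate. The equal-weight sum and the neighborhood radius are assumptions.

    import numpy as np

    def combine(bottom_up, top_down, k=1, radius=2):
        """Zero every bottom-up response that is not a local top-k winner,
        then add the top-down projection."""
        bu = np.asarray(bottom_up, dtype=float)
        sparse = np.zeros_like(bu)
        H, W = bu.shape
        for i in range(H):
            for j in range(W):
                patch = bu[max(0, i - radius):i + radius + 1,
                           max(0, j - radius):j + radius + 1]
                if bu[i, j] >= np.sort(patch, axis=None)[-k]:   # among local top-k?
                    sparse[i, j] = bu[i, j]
        return sparse + np.asarray(top_down, dtype=float)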

Slide 42: Fully Implemented WWN (Latest)
[Figure: Image (40×40) → V1 (40×35) → V2 (40×40) → V4 (40×40), with 11×11 and 21×21 receptive fields; MT (40×40) feeds the "where" motor, which outputs (r, c) over a 3×3 grid of centers for a fixed 20×20 foreground; the "what" motor covers 5 objects. Both motors are smoothed by a Gaussian. Each cortex uses the modified ADAST circuit.]

Slide 43: Modified ADAST
[Figure: laminar circuit per cortex: previous cortex → L4 → L2/3 → next cortex, with L6 (ranking) and L5 (ranking).]

Slide 44: Other Improvements
- Smooth the external motors with a Gaussian function (see the sketch below).
- "Where" motors are evaluated by regression error.
- The local top-k is adapted by neuron position.
- The network does not converge through internal iterations:
  - the learning rate for top-down excitation is adapted over the internal iterations.
- Use context information.
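A minimal sketch of the first improvement: smear a one-hot motor supervision signal with a Gaussian so that the supervised unit's neighbors are also weakly activated. The 3×3 grid matches slide 42; the sigma value and the one-hot encoding are assumptions.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    where_target = np.zeros((3, 3))      # 3x3 grid of "where" centers (slide 42)
    where_target[1, 2] = 1.0             # supervised position as a one-hot map
    smoothed = gaussian_filter(where_target, sigma=0.6)
    smoothed /= smoothed.max()           # keep the peak at 1 at the supervised center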

Slide 45: Layer 1 – Bottom-up Weights

Slide 46: Layer 2 – Response-weighted Stimuli

Slide 47: Layer 3 (Where) – Top-down Weights

Slide 48: Layer 3 (What) – Top-down Weights

Slide 49: Test Samples
[Figure rows: input; "where" motor (ground truth); "what" motor (ground truth); "where" output (saliency-based); "where" output ("what"-supervised); "what" output (saliency-based); "what" output ("where"-supervised).]

Slide 50: Performance Evaluation
Average error for the "where" and "what" motors (250 test samples):

                                           | Without supervision | Supervise "Where" | Supervise "What"
  "Where" motor (regression error: MSE)    | [value missing] px  | N/A               | 4.137 pixels
  "What" motor (classification error: %)   | 12.7%               | 12.1%             | N/A

Slide 51: Discussions