
Understanding early visual coding from information theory
By Li Zhaoping
Lecture at the EU Advanced Course in Computational Neuroscience, Arcachon, France, August
Reading materials: download from
Contact:

Facts: neurons in early visual stages (retina, V1) have particular receptive fields. E.g., retinal ganglion cells have a center-surround structure; V1 cells are orientation selective; color-sensitive cells have, e.g., red-center green-surround receptive fields; some V1 cells are binocular and others monocular; etc.

Question: Can one understand, or derive, these receptive field structures from first principles, e.g., information theory?

Example: visual input of 1000x1000 pixels at 20 images per second is many megabytes of raw data per second, yet there is an information bottleneck at the optic nerve. Solution (infomax): recode the data into a new format such that the data rate is reduced without losing much information, by exploiting the redundancy between pixels. Roughly 1 byte per pixel at the receptors; perhaps 0.1 byte per pixel at the retinal ganglion cells?
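
As a sanity check on these numbers, here is the arithmetic as a minimal Python sketch (the per-pixel byte counts are the slide's illustrative figures, not measurements):

```python
# Back-of-the-envelope data rates for the optic-nerve bottleneck.
pixels = 1000 * 1000          # "pixels" per image
frames_per_s = 20             # images per second
bytes_at_receptors = 1.0      # ~1 byte per pixel at the photoreceptors
bytes_at_ganglion = 0.1       # ~0.1 byte per pixel at the ganglion cells?

raw_rate = pixels * frames_per_s * bytes_at_receptors / 1e6    # MB/s
coded_rate = pixels * frames_per_s * bytes_at_ganglion / 1e6   # MB/s
print(f"raw: {raw_rate:.0f} MB/s, after retinal recoding: {coded_rate:.0f} MB/s")
```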

Consider redundancy and the encoding of stereo signals. The redundancy shows up in the correlation matrix between the two eyes, $R^S_{ab} = \langle S_a S_b \rangle$ with $a, b \in \{L, R\}$:
$$R^S = \langle S^2 \rangle \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix}, \qquad 0 \le r \le 1.$$
Assume the signal $(S_L, S_R)$ is Gaussian; it then has probability distribution
$$P(S_L, S_R) \propto \exp\!\Big[-\tfrac{1}{2} \sum_{ab} S_a (R^S)^{-1}_{ab} S_b\Big].$$
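
A quick numerical illustration of this setup (a sketch: the value of r and the sample size are arbitrary choices):

```python
import numpy as np

# Draw Gaussian stereo signals (S_L, S_R) with unit variance and
# inter-ocular correlation r, then verify the correlation matrix R^S.
rng = np.random.default_rng(0)
r = 0.7
R_S = np.array([[1.0, r],
                [r,   1.0]])                  # R^S = <S_a S_b>, with <S^2> = 1
S = rng.multivariate_normal([0.0, 0.0], R_S, size=100_000)
print(np.cov(S.T))                            # ~ [[1, 0.7], [0.7, 1]]
```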

The encoding
$$O_\pm \propto S_\pm \equiv (S_L \pm S_R)/\sqrt{2}$$
gives zero correlation in the output signal $(O_+, O_-)$, leaving the output probability factorized: $P(O_+, O_-) = P(O_+)\,P(O_-)$. The transform from S to O is linear. $O_+$ is binocular; $O_-$ is more monocular-like. Note: $S_+$ and $S_-$ are the eigenvectors, or principal components, of the correlation matrix $R^S$, with eigenvalues $\langle S^2 \rangle (1 \pm r)$.
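
Continuing the sketch above, the $\pm$ rotation decorrelates the samples and exposes the eigenvalues $1 \pm r$:

```python
import numpy as np

rng = np.random.default_rng(0)
r = 0.7
R_S = np.array([[1.0, r], [r, 1.0]])
S = rng.multivariate_normal([0.0, 0.0], R_S, size=100_000)

# O_± = (S_L ± S_R)/sqrt(2): rotation onto the principal components of R^S.
K = np.array([[1.0,  1.0],
              [1.0, -1.0]]) / np.sqrt(2)
O = S @ K.T                                   # columns: O_+, O_-
print(np.cov(O.T))                            # ~ diag(1 + r, 1 - r), off-diagonal ~ 0
```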

In reality, there is input noise $N_{L,R}$ and output noise $N_{o,\pm}$, hence
$$O_\pm = V_\pm (S_\pm + N_\pm) + N_{o,\pm},$$
with effective (input-referred) output noise
$$N_\pm + N_{o,\pm}/V_\pm.$$
The input $S_{L,R} + N_{L,R}$ has
$$I_{\rm in} = \tfrac{1}{2}\log_2\!\big(1 + \langle S^2 \rangle / \langle N^2 \rangle\big)$$
bits of information about the signal $S_{L,R}$ (per channel).

Whereas the outputs $O_\pm$ have
$$I_\pm = \tfrac{1}{2}\log_2\!\Big(1 + \frac{V_\pm^2 \langle S_\pm^2 \rangle}{V_\pm^2 \langle N^2 \rangle + \langle N_o^2 \rangle}\Big)$$
bits of information about the signal $S_{L,R}$. Note: the redundancy between $S_L$ and $S_R$ causes higher and lower signal powers, $\langle S_+^2 \rangle = \langle S^2 \rangle(1+r)$ and $\langle S_-^2 \rangle = \langle S^2 \rangle(1-r)$, in $O_+$ and $O_-$ respectively, leading to a higher information rate $I_+$ and a lower one $I_-$. If cost $\sim \langle O^2 \rangle$, the gain in information per unit cost is smaller in the $O_+$ channel than in the $O_-$ channel.

Hence, gain control on $O_\pm$ is motivated: $O_\pm \to V_\pm O_\pm$. To balance the cost against the information extraction, optimize by finding the gains $V_\pm$ such that
$$E_\pm = \text{cost} - \lambda I_\pm = \langle O_\pm^2 \rangle - \lambda I_\pm$$
is minimized. In the high-S/N regime this gives $V_\pm^2 \langle S_\pm^2 \rangle \approx$ constant, i.e., the weaker $O_-$ channel receives the larger gain.
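
The optimization can be checked numerically. A minimal sketch, with all powers and the trade-off parameter lambda chosen for illustration (they are not values from the lecture):

```python
import numpy as np

# Minimize E(V^2) = <O^2> - lambda * I for each channel, where
# <O^2> = V^2 (<S^2> + <N^2>) + <No^2> and
# I = 0.5 * log2(1 + V^2 <S^2> / (V^2 <N^2> + <No^2>)).
S2 = {"+": 1.7, "-": 0.3}        # <S_±^2> = <S^2>(1 ± r) with <S^2> = 1, r = 0.7
N2, No2, lam = 0.01, 0.01, 1.0   # input/output noise powers, cost trade-off

def E(V2, s2):
    cost = V2 * (s2 + N2) + No2
    info = 0.5 * np.log2(1.0 + V2 * s2 / (V2 * N2 + No2))
    return cost - lam * info

V2_grid = np.linspace(1e-4, 20.0, 200_000)
for name, s2 in S2.items():
    V2_opt = V2_grid[np.argmin(E(V2_grid, s2))]
    print(f"O{name}: V^2 = {V2_opt:.2f}, output signal power = {V2_opt * s2:.2f}")
# The weaker O_- channel gets the larger gain; the output signal powers end
# up far closer to equal than the input powers' (1+r):(1-r) = 5.7:1 ratio.
```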

This (approximately) equalizes the output powers, $\langle O_+^2 \rangle \approx \langle O_-^2 \rangle$ --- whitening. When the output noise $N_o$ is negligible, the output $O$ and the input $S+N$ convey similar amounts of information about the signal $S$, but the output uses much less power, with small gains $V_\pm$.

$\langle O_+^2 \rangle \approx \langle O_-^2 \rangle$ --- whitening also means that the output correlation matrix
$$R^o_{ab} = \langle O_a O_b \rangle$$
is proportional to the identity matrix (since $\langle O_+ O_- \rangle = 0$). Any rotation (unitary or orthonormal transform) $O \to UO$:
- preserves the decorrelation $\langle O_1 O_2 \rangle = 0$;
- leaves the output cost $\mathrm{Tr}(R^o)$ unchanged;
- leaves the amount of information extracted, $I = \tfrac{1}{2}\log_2\big[\det R^o / \det R^{\rm noise}\big]$, unchanged.
Here Tr and det denote the trace and determinant of a matrix.
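
A short check of these invariances (a sketch; the rotation angle is arbitrary):

```python
import numpy as np

R_o = np.eye(2)                                  # whitened output: R^o ∝ identity
th = np.deg2rad(30.0)
U = np.array([[np.cos(th),  np.sin(th)],
              [-np.sin(th), np.cos(th)]])        # any orthonormal rotation

R_rot = U @ R_o @ U.T
print(R_rot)                                     # still the identity: decorrelation kept
print(np.trace(R_o), np.trace(R_rot))            # cost Tr(R^o) unchanged
print(np.linalg.det(R_o), np.linalg.det(R_rot))  # det, hence information, unchanged
```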

Both encoding schemes,
$$O_\pm = V_\pm (S_L \pm S_R)/\sqrt{2}$$
and
$$\begin{pmatrix} O_1 \\ O_2 \end{pmatrix} = U(\theta) \begin{pmatrix} O_+ \\ O_- \end{pmatrix}, \qquad U(\theta) = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix},$$
with the former a special case ($\theta = 0$) of the latter, are optimal in making the output decorrelated (non-redundant), in extracting information from the signal $S$, and in reducing cost. In general, the two different outputs prefer different eyes. In particular, $\theta = 45^\circ$ gives $O_{1,2} \sim (V_+ \pm V_-)S_L + (V_+ \mp V_-)S_R$, each dominated by one eye. The visual cortex indeed has a whole spectrum of neural ocularity.

Summary of the coding steps:
1. Input: $S + N$, with signal correlation (input statistics) $R^S$; get the eigenvectors (principal components) $S'$ of $R^S$.
2. Rotation of coordinates: $S + N \to S' + N' = K_o(S + N)$.
3. Gain control $V$ on each principal component: $S' + N' \to O = V(S' + N') + N_o$.
4. Rotation $U'$ (multiplexing) of $O$: $O \to O' = U'O = U'VK_o S + \text{noise}$.
The receptive field (encoding kernel) is $U'VK_o$: neural output $= U'VK_o \cdot$ sensory input $+$ noise.
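
The whole chain, assembled for the two-channel stereo case (a sketch: the noise levels, r, and the multiplexing angle are illustrative, and the gains use the high-S/N whitening limit):

```python
import numpy as np

rng = np.random.default_rng(1)
r, th = 0.7, np.deg2rad(45.0)
R_S = np.array([[1.0, r], [r, 1.0]])
S = rng.multivariate_normal([0.0, 0.0], R_S, size=100_000)
N = 0.1 * rng.standard_normal(S.shape)                     # input noise

K_o = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)     # rotation to principal components
V = np.diag([1/np.sqrt(1 + r), 1/np.sqrt(1 - r)])          # whitening gains (high-S/N limit)
U_p = np.array([[np.cos(th),  np.sin(th)],
                [-np.sin(th), np.cos(th)]])                # multiplexing rotation U'

kernel = U_p @ V @ K_o            # encoding kernel: one receptive field per row
O = (S + N) @ kernel.T + 0.05 * rng.standard_normal(S.shape)   # + output noise N_o
print(kernel)                     # each output neuron's weights onto (S_L, S_R)
print(np.cov(O.T))                # outputs ~decorrelated, with ~equal power
```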

Variations in optimal coding:
- factorial codes;
- minimum-entropy, or minimum description length, codes;
- independent components analysis;
- redundancy reduction;
- sparse coding;
- maximum-entropy codes;
- predictive codes;
- minimum-predictability codes, or least mutual information between output channels.
They are all related!

Another example: visual space coding, i.e., spatial receptive fields. The signal at spatial location $x$ is $S_x = S(x)$. The signal correlation $R^S_{x,x'} = \langle S_x S_{x'} \rangle = R^S(x - x')$ is translation invariant. The principal components $S_k$ are the Fourier transform of $S_x$. The eigenvalue spectrum is the power spectrum, which for natural images falls off as
$$\langle |S_k|^2 \rangle \propto 1/k^2.$$
Assuming white noise (noise power = constant), the high-S/N region is at low frequency, i.e., at small $k$. Gain control: $V(k) \sim \langle |S_k|^2 \rangle^{-1/2} \sim k$ --- whitening in space. At high $k$, where S/N is small, $V(k)$ instead decays quickly with $k$ to cut down the noise, according to
$$V(k) \propto \frac{\langle |S_k|^2 \rangle}{\langle |S_k|^2 \rangle + \langle N^2 \rangle}\,\big(\langle |S_k|^2 \rangle + \langle N^2 \rangle\big)^{-1/2}.$$

The result is a band-pass filter. Let the multiplexing rotation be the inverse Fourier transform, $U_{x'k} \propto e^{ikx'}$. The full encoding transform is then
$$O_{x'} = \sum_k U_{x'k} V(k) \sum_x e^{-ikx} S_x + \text{noise} = \sum_x K(x' - x)\, S_x + \text{noise},$$
where $K(x' - x) = \sum_k V(k)\, e^{ik(x'-x)}$ is a center-surround receptive field.
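
A one-dimensional sketch of this filter under the assumptions above ($1/k^2$ signal spectrum, white noise; all constants illustrative):

```python
import numpy as np

n = 256
k = np.fft.fftfreq(n) * n                 # integer spatial frequencies
k[0] = 1e-9                               # avoid division by zero at k = 0
S2 = 1e4 / k**2                           # signal power spectrum ~ 1/k^2
N2 = 100.0                                # white noise power

# Whitening ~ (S2 + N2)^(-1/2) at high S/N, attenuated where S/N is small:
V = (S2 / (S2 + N2)) / np.sqrt(S2 + N2)

kernel = np.fft.fftshift(np.real(np.fft.ifft(V)))   # K(x) = sum_k V(k) e^{ikx}
c = n // 2
print(f"center: {kernel[c]:.4f}, surround minimum: {kernel.min():.4f}")
# Positive center, negative surround: a band-pass, center-surround filter.
```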

Understanding adaptation to input strength: when the overall input strength is lowered, the peak of $V(k)$ moves to a lower spatial frequency $k$, and the band-pass filter becomes a low-pass (smoothing) filter.
[Figure: signal and noise power spectra, crossing where S/N ~ 1; the center-surround receptive field at high S/N becomes a smoothing receptive field at lower S/N.]
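
The same machinery shows the adaptation effect directly; here the overall signal strength A is dropped by a factor of 100 (illustrative numbers again):

```python
import numpy as np

k = np.arange(1, 129)
N2 = 100.0                                # fixed noise power
for A, label in [(1e4, "high S/N"), (1e2, "low  S/N")]:
    S2 = A / k**2                         # overall input strength A scales the signal
    V = (S2 / (S2 + N2)) / np.sqrt(S2 + N2)
    print(f"{label}: V(k) peaks at k = {k[np.argmax(V)]}")
# high S/N: peak at k > 1 (band-pass); low S/N: peak at k = 1 (low-pass).
```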

Another example: optimal color coding. This is analogous to stereo coding, but with three input channels: red, green, blue. For simplicity, focus only on red and green. Input signal: $S_r, S_g$; input correlation $R^S_{rg} > 0$. Eigenvectors:
- $S_r + S_g$: the luminance channel, with higher S/N;
- $S_r - S_g$: the chromatic channel, with lower S/N.
Gain control on $S_r + S_g$: a lower gain, maintained out to higher spatial frequencies $k$. Gain control on $S_r - S_g$: a higher gain at low $k$, which then decays as $k$ increases (the chromatic channel is cut off at a lower spatial frequency).
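
A sketch of why the chromatic channel is cut off earlier, assuming both channels share the $1/k^2$ spectrum shape but differ in strength (numbers illustrative):

```python
import numpy as np

k = np.arange(1, 129)
N2 = 1.0                                   # same white noise in both channels
lum_power = 1e4 / k**2                     # strong luminance channel (S_r + S_g)
chrom_power = 1e2 / k**2                   # weaker chromatic channel (S_r - S_g)

# Spatial frequency where S/N drops to 1 (the edge of the useful band):
print("luminance cutoff k:", k[np.argmax(lum_power <= N2)])    # ~100
print("chromatic cutoff k:", k[np.argmax(chrom_power <= N2)])  # ~10
```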

Multiplexing in the color space:
[Figure: combining the luminance and chromatic channels yields, e.g., a red-center green-surround opponent receptive field.]

How can one understand the orientation-selective receptive fields in V1? Recall the retinal encoding transform:
$$O_{x'} = \sum_k U_{x'k} V(k) \sum_x e^{-ikx} S_x = \sum_k V(k) \sum_x e^{ik(x'-x)} S_x + \text{noise}.$$
If one changes the multiplexing filter $U_{x'k}$ so that it is block diagonal --- for each output cell $x'$ it is limited to a frequency band, in both frequency magnitude and orientation --- one obtains V1 receptive fields.
[Figure: the matrix $U_{x'k}$, with axes $k$ and $x'$, rearranged into block-diagonal form; each block covers a different frequency band.]
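
A sketch of this construction: keep one block of $U_{x'k}$ (a band limited in frequency magnitude and in orientation) and transform it back to space. The band edges are arbitrary choices:

```python
import numpy as np

n = 64
kx = np.fft.fftfreq(n) * n
KX, KY = np.meshgrid(kx, kx)
k_mag = np.hypot(KX, KY)                       # frequency magnitude |k|
k_ang = np.arctan2(KY, KX)                     # frequency orientation

band = (k_mag > 6) & (k_mag < 12)              # limited in |k| ...
ang = np.abs(((k_ang + np.pi/2) % np.pi) - np.pi/2)
wedge = ang < np.pi / 8                        # ... and in orientation (mod pi)
U_block = (band & wedge).astype(float)         # one block of U_{x'k}

rf = np.fft.fftshift(np.real(np.fft.ifft2(U_block)))   # the receptive field
print(rf.shape)     # an oriented, band-limited (V1-like) pattern in space
```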

[Figure: the block-diagonal $U_{x'k}$ with different frequency bands: lower-frequency $k$ bands for the chromatic channels; higher-frequency $k$ bands for the luminance channel only.]
V1 cortical color coding: in V1, color-tuned cells have larger receptive fields and double opponency; orientation-tuned cells carry the higher-frequency luminance band.

Question: if retinal ganglion cells have already done a good job of optimal coding with their center-surround receptive fields, why change to an orientation-selective code? As we know, such a change does not significantly improve the coding efficiency or sparseness. Answer? (Refs: Olshausen, Field, Simoncelli, etc.) Why is there a large expansion in the number of cells in V1? This expansion increases redundancy: responses of different V1 cells are highly correlated. What, then, is the functional role of V1? It should be beyond encoding for information efficiency; some cognitive function beyond the economy of information bits must be attributed to V1 to understand it.