Image Stabilization by Bayesian Dynamics
Yoram Burak
Sloan-Swartz annual meeting, July 2009
What does neural activity represent?
- In Bayesian models: probabilities
- Direction of motion: a single, static variable
- Accumulated evidence in area LIP (Shadlen and Newsome, 2001)
- What about multi-dimensional, dynamic quantities?
Foveal vision and fixational drift
- Fixational drift is large in the fovea (cone separation: 0.5 arcmin)
- Drift between micro-saccades: ~20 receptive fields
- Drift between spikes (100 Hz): ~2-4 receptive fields
- Downstream areas require knowledge of the trajectory to interpret spikes
- Image from: X. Pitkow
Joint decoding of image and position
- Bayesian approach: track the joint probability of image and position
- Discrimination task between two images (X. Pitkow et al., PLoS Biology, 2007): N × 2 probabilities, where N is the number of positions
- Unconstrained image, 30 × 30 binary pixels: N × 2^900 probabilities
- Can the brain apply a Bayesian approach to this problem?
Can the brain apply a Bayesian approach to this problem?
- Decoding strategy
- Performance in parameter space
- What are the biological implications?
Decoding strategy
- Factorized representation, chosen to minimize D_KL
- Discards information about correlations
- Exact if the trajectory is known
- Update dynamics: evidence, diffusion
- Retinal encoding model:
- evidence: Poisson spiking (rate λ_1 for on pixels, λ_0 for off pixels)
- diffusion: random walk (diffusion coefficient D)
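The equations on this slide did not survive extraction. As a hedged sketch of what its labels describe, and not the author's own formulas, one consistent way to write them is shown below. The symbols s_i (binary pixel i), x (image position on the retina), n_j (spike count of ganglion cell j in a bin Δt) and Q (the factorized approximation) are notation assumed here; the slide does not say in which direction D_KL is taken, and the 4Dt convention for a two-dimensional random walk is likewise an assumption.

```latex
% Factorized representation (assumed notation): one factor per pixel, one for position
\[
P(s, x \mid \text{spikes}) \;\approx\; Q(s, x) \;=\; p(x) \prod_i q_i(s_i)
\]

% Evidence: Poisson spiking of ganglion cell j, which sees pixel j - x
\[
P(n_j \mid s, x) \;=\; \frac{(\lambda\,\Delta t)^{n_j}}{n_j!}\, e^{-\lambda \Delta t},
\qquad
\lambda =
\begin{cases}
\lambda_1 & \text{if } s_{j-x} = 1 \\
\lambda_0 & \text{if } s_{j-x} = 0
\end{cases}
\]

% Diffusion: random walk of the image position with diffusion coefficient D
\[
\left\langle \lvert x(t) - x(0) \rvert^2 \right\rangle \;=\; 4 D t
\]
```

If the trajectory x(t) is known, the likelihood above factorizes over pixels, which is consistent with the slide's remark that the factorized form is exact in that case.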
Decoding strategy: neural implementation
- Factorized representation: discards information about correlations
- Two populations: where, what
- For 30 × 30 pixels: N × 2^900 → N + 900 quantities
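To make the saving concrete, the following sketch counts the numbers each representation has to store. It assumes one probability per (position, image) pair for the exact joint posterior, and one number per position plus one per pixel for the factorized form; the value of N used here is hypothetical.

```python
# Bookkeeping for the exact joint posterior versus the factorized one.
N_PIXELS = 30 * 30  # binary pixels in the unconstrained image (from the slide)


def joint_table_size(n_positions):
    """Entries in a full table over (position, image) pairs."""
    return n_positions * 2 ** N_PIXELS


def factorized_size(n_positions):
    """Entries when position and each pixel are tracked separately (assumed)."""
    return n_positions + N_PIXELS


n_positions = 100  # hypothetical number of candidate positions N
print("decimal digits in joint table size:", len(str(joint_table_size(n_positions))))
print("factorized quantities:", factorized_size(n_positions))
```

The joint table is far too large to represent explicitly, whereas the factorized count grows only linearly with the number of pixels and positions.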
Update rules
- Update of what neurons: ganglion cell input, multiplicative gating, nonlinearity
- Update of where neurons: ganglion cell input, multiplicative gating, plus diffusion
- [Diagrams: ganglion cells and where neurons feeding what neurons; ganglion cells and what neurons feeding where neurons]
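The update equations themselves are not in the extracted text. The sketch below is one possible factorized (mean-field) update that has the ingredients named on the slide: ganglion-cell evidence reaches the what population weighted by the where probabilities, reaches the where population weighted by the what estimates, and the where distribution then diffuses. It is an illustration under stated assumptions (periodic boundaries, positions on the pixel grid, D in pixel²/s, all function and variable names hypothetical), not the author's decoder.

```python
import numpy as np


def decoder_step(log_odds, p_where, spikes, D, dt, rate_on=100.0, rate_off=10.0):
    """One time step of an illustrative mean-field what/where update.

    log_odds : (m, m) log-odds that each image pixel is on   ("what")
    p_where  : (m, m) probability of each image shift         ("where")
    spikes   : (m, m) ganglion-cell spike counts in this time bin
    """
    m = log_odds.shape[0]
    q_on = 0.5 * (1.0 + np.tanh(0.5 * log_odds))  # numerically stable sigmoid
    # Poisson log-likelihood ratio (on vs. off pixel) for each ganglion cell.
    evidence = spikes * np.log(rate_on / rate_off) - (rate_on - rate_off) * dt

    new_log_odds = log_odds.copy()
    log_where = np.log(p_where + 1e-300)  # small floor avoids log(0)
    for dx in range(m):
        for dy in range(m):
            # Bring ganglion cell r = i + x into register with image pixel i.
            aligned = np.roll(evidence, shift=(-dx, -dy), axis=(0, 1))
            # What update: evidence gated multiplicatively by the where probability.
            new_log_odds += p_where[dx, dy] * aligned
            # Where update: evidence gated multiplicatively by the what estimate.
            log_where[dx, dy] += np.sum(q_on * aligned)

    p_new = np.exp(log_where - log_where.max())
    p_new /= p_new.sum()
    # Diffusion of the where distribution: nearest-neighbour hops, requires 4*D*dt <= 1.
    hop = D * dt
    p_new = (1.0 - 4.0 * hop) * p_new + hop * (
        np.roll(p_new, 1, axis=0) + np.roll(p_new, -1, axis=0)
        + np.roll(p_new, 1, axis=1) + np.roll(p_new, -1, axis=1)
    )
    return new_log_odds, p_new
```

When `p_where` is concentrated on the true position and `D` is zero, the what update reduces to plain per-pixel evidence accumulation, which is the known-trajectory case. Both populations are updated from the previous step's values, a design choice made here for simplicity.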
Demo
- Image: m × m binary pixels
- Image motion on the retina: 2D diffusion (D)
- Poisson spikes: 100 Hz (on pixels), 10 Hz (off pixels)
- Decoder
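The demo setup can be imitated with a short simulation. This is a minimal sketch, assuming periodic boundaries, one ganglion cell per pixel, positions measured in pixels, D in pixel²/s, and a random walk discretised into single-pixel hops; the function name and its parameters other than the 100 Hz and 10 Hz rates are hypothetical.

```python
import numpy as np


def simulate_retina(image, D, rate_on=100.0, rate_off=10.0, dt=1e-3, steps=200, seed=0):
    """Simulate Poisson spikes from a binary image diffusing over the retina.

    Returns the image position at each step and the spike counts of an
    m x m array of ganglion cells in each time bin.
    """
    rng = np.random.default_rng(seed)
    m = image.shape[0]
    p_hop = 4.0 * D * dt  # hop probability per step; gives <r^2> = 4 D t
    if p_hop > 1.0:
        raise ValueError("dt is too large for this diffusion coefficient")

    hops = np.array([[1, 0], [-1, 0], [0, 1], [0, -1]])
    pos = np.array([0, 0])
    trajectory = np.empty((steps, 2), dtype=int)
    spikes = np.empty((steps, m, m), dtype=int)
    for t in range(steps):
        if rng.random() < p_hop:
            pos = (pos + hops[rng.integers(4)]) % m  # periodic boundary
        trajectory[t] = pos
        # Ganglion cell r sees image pixel r - pos.
        shifted = np.roll(image, shift=(int(pos[0]), int(pos[1])), axis=(0, 1))
        rates = np.where(shifted == 1, rate_on, rate_off)
        spikes[t] = rng.poisson(rates * dt)
    return trajectory, spikes


# Example: a 10 x 10 random binary image drifting with a hypothetical D.
image = np.random.default_rng(42).integers(0, 2, size=(10, 10))
trajectory, spikes = simulate_retina(image, D=20.0)
print(trajectory[-1], spikes.sum())
```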
Demo
Can the brain apply a Bayesian approach to this problem?
- Decoding strategy
- Performance in parameter space
- What are the biological implications?
Performance
- [Figure: convergence time [s] and accuracy vs. D]
- Performance degrades with larger D (and smaller λ)
Performance
- [Figure: convergence time [s] and accuracy vs. D, for m = 5, 10, 30, 50, 100]
- Faster and more accurate for larger images
Demo
Performance
- [Figure: convergence time [s] and accuracy vs. D/m, for m × m pixels]
- Scales with linear image size m
- Analytical scaling: D*
Performance
- Performance improves with image size
- Success for images of 10 × 10 pixels or larger
- Prediction for psychophysics: degradation in high-acuity tasks when the visual scene contains little background detail
Temporal response of ganglion cells
- Common view: fixational motion is important to activate cells, due to their biphasic response f(t)
- [Figure: f(t) vs. t, time scale 50 ms]
- Temporal response makes decoding much more difficult
- Non-Markovian: need history
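To illustrate why a biphasic response makes the spiking depend on the history of the trajectory, the sketch below builds a hypothetical biphasic kernel as the difference of two unit-area alpha functions, so that its integral is zero. The functional form and the time constants (10 ms and 20 ms) are assumptions chosen only to give a response lasting a few tens of milliseconds; they are not taken from the slides.

```python
import numpy as np


def biphasic_kernel(t, tau_fast=0.010, tau_slow=0.020):
    """Hypothetical biphasic filter: fast positive lobe, slower negative lobe."""
    def alpha(tau):
        return (t / tau**2) * np.exp(-t / tau)  # unit area on [0, inf)
    return alpha(tau_fast) - alpha(tau_slow)


dt = 1e-3                                  # 1 ms time bins
t = np.arange(0.0, 0.3, dt)
f = biphasic_kernel(t)

# Light seen by one ganglion cell: the pixel under it switches from off to on
# at 100 ms because the image has drifted, then stays on.
stimulus = np.zeros(600)
stimulus[100:] = 1.0

# Filtered drive: a causal convolution of the stimulus history with f(t).
drive = np.convolve(stimulus, f)[: stimulus.size] * dt

print("kernel area:", f.sum() * dt)
print("peak drive:", drive.max(), "late drive:", drive[-1])
```

With a balanced kernel the drive is transient: it responds when the pixel under the cell changes and decays back toward zero while the pixel stays constant. The instantaneous position therefore does not determine the firing rate; the recent trajectory does.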
Temporal response of ganglion cells
- Approach: choose a decoder that is Bayes optimal if the trajectory is known
- [Diagram labels: What, Ganglion, “filtered trajectory”, Where]
- [Figure: convergence time [s] and accuracy vs. D, history-dependent decoder / naive decoder]
Temporal response of ganglion cells
- Is fixational motion beneficial?
- Known trajectory, perfect inhibitory balance
- [Figure: convergence time [s] vs. D]
- Optimal D: an order of magnitude smaller than the biological value
Can the brain apply a Bayesian approach to this problem?
- Decoding strategy
- Performance in parameter space
- What are the biological implications?
Network architecture
- Each ganglion cell innervates multiple what and where cells (spread: ~10 arcmin)
- Reciprocal, multiplicative gating
- [Diagram labels: Where, What, Ganglion]
Activity
- What neurons: slow dynamics, evidence accumulation
- Where neurons: fewer; highly dynamic activity; tonic and sparse in retinal stabilization conditions
Where in the brain?
- Monocular? If so, suggests LGN or V1
- LGN? Modulatory inputs to relay cells (gating?)
- V1? Lateral connectivity in the where network, increase in number of neurons
Summary
- Strategy for stabilization of foveal vision
- Explicit representation of the stabilized image
- “What” and “where” populations
- Good performance at 1 arcmin resolution
- Problem is easier for large images and for coarser reconstruction
- Factorized Bayesian approach to multi-dimensional inference
- Network architecture: many-to-one inputs from the retina, multiplicative gating (what/where)
Acknowledgments
- Uri Rokni
- Haim Sompolinsky
- Markus Meister
- Special thanks: the Swartz Foundation