Natural Stimulus Statistics Alter the Receptive Field Structure of V1 Neurons
Stephen V. David, William E. Vinje, Jack L. Gallant
J Neurosci, Aug 2004
presented by Yan Karklin – CNS meeting – 9/7/05
executive summary
goal: model V1 neurons’ responses to natural stimuli
approach:
- phase-separated spatio-temporal Fourier domain filters
- data: natural vision movies, flashing gratings, hybrid stimuli
evaluation: receptive field analysis; PSTH prediction
results:
- prediction accuracy 0-90% (correlation coefficient; mean = 42%)
- for 55% of neurons, natural stimuli yield better-predicting models
- for 7% of neurons, natural stimuli yield worse-predicting models
- natural stimuli give better predictions for natural stimuli; gratings give better predictions for gratings
- temporal and spatial inhibitory components account for most of the prediction improvement
setup
stimuli:
- natural vision movies (nat)
  - natural images (landscapes, man-made objects, people)
  - modeled saccades: uniform distribution of saccade directions, empirical distribution of velocities/lengths
  - circular patches sampled along the saccade path
- gratings (syn): flashed with random orientation, spatial frequency, and phase (a generator sketch follows this slide)
- natural image sequences (hyb): natural spatial statistics, no temporal correlation
neurons:
- parafoveal V1, 2 awake monkeys, extracellular recording
- estimate the CRF, then present stimuli at 2-4x the CRF diameter
- hold-out set reserved for testing prediction accuracy
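A minimal sketch of how the flashed-grating (syn) stimulus might be generated: each frame is an independent sinusoidal grating with random orientation, spatial frequency, and phase. The 18x18 frame size matches the downsampling described later; the spatial-frequency range is an illustrative assumption, not the paper's value.

```python
import numpy as np

def random_grating_frame(size=18, sf_range=(0.5, 4.0), rng=np.random):
    """One flashed grating: random orientation, spatial frequency, phase."""
    theta = rng.uniform(0, np.pi)        # orientation
    sf = rng.uniform(*sf_range)          # cycles per frame width (assumed range)
    phase = rng.uniform(0, 2 * np.pi)    # spatial phase
    y, x = np.mgrid[0:size, 0:size] / size
    u = x * np.cos(theta) + y * np.sin(theta)  # coordinate along grating axis
    return np.sin(2 * np.pi * sf * u + phase)

# flashed sequence: each frame drawn independently (no temporal correlation)
movie = np.stack([random_grating_frame() for _ in range(100)])
```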
neural response
model
- phase-separated Fourier transform of the stimulus
- linear kernel, estimated by reverse correlation; time/space separable
- threshold rectification of the output
- also tried: time/space inseparable kernels; expansive, sigmoidal non-linearities
(a sketch of the representation and rectified output follows this slide)
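A minimal sketch of the phase-separated Fourier representation, assuming it half-rectifies the real and imaginary parts of each frame's 2D FFT into four non-negative phase channels; the exact channel construction is my reading of the model, not code from the paper.

```python
import numpy as np

def phase_sep_fourier(frame):
    """Split 2D FFT coefficients into four half-rectified phase channels."""
    F = np.fft.fft2(frame)
    re, im = F.real, F.imag
    channels = np.stack([np.maximum(re, 0), np.maximum(-re, 0),
                         np.maximum(im, 0), np.maximum(-im, 0)])
    return channels.reshape(-1)          # non-negative feature vector

def model_response(features, kernel, threshold=0.0):
    """Linear kernel followed by threshold (half-wave) rectification."""
    return max(np.dot(kernel, features) - threshold, 0.0)
```

With weights tied symmetrically across the four phase channels, this representation can approximate phase-invariant (energy-like) responses, which is presumably the motivation for separating phase.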
model estimation
1. phase-separated Fourier domain representation
2. linear filter estimation by cross-correlation
3. correction for stimulus correlations
4. time-space separation (see next slide)
5. final rate output
(steps 2-3 are sketched below)
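A minimal sketch of steps 2-3, assuming the standard normalized-reverse-correlation form: cross-correlate the feature matrix with the response, then divide out the stimulus autocorrelation. The ridge term `lam` is an illustrative stabilizer standing in for the paper's regularization (next slide).

```python
import numpy as np

def estimate_kernel(S, r, lam=1e-3):
    """S: (time, features) stimulus in Fourier-domain features; r: (time,) response."""
    raw = S.T @ r                        # step 2: raw cross-correlation
    C = S.T @ S                          # stimulus autocorrelation
    C += lam * (np.trace(C) / C.shape[0]) * np.eye(C.shape[0])  # assumed ridge
    return np.linalg.solve(C, raw)       # step 3: decorrelated kernel
```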
model estimation (cont'd)
- estimate the space-time inseparable STRF
- regularize (jack-knife, shrinkage estimator)
- convert to a space-time separable STRF (SVD gives an approximate solution; sketch below)
- iteratively estimate the spatial and temporal functions
- stimuli downsampled to 18x18; responses smoothed and binned at 14 ms
- STRF: phase-separated Fourier domain (spatial), ~200 ms (temporal)
- 10% hold-out set for testing (i.e. comparing one fitted model to another), used only after all parameters are fixed
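A minimal sketch of the SVD step: the leading singular vectors of the inseparable STRF (lags x spatial features) give the best rank-1, i.e. separable, approximation in the least-squares sense; the paper's subsequent iterative refinement of the two factors is omitted here.

```python
import numpy as np

def separate_space_time(strf):
    """strf: (n_lags, n_spatial_features) inseparable estimate."""
    U, s, Vt = np.linalg.svd(strf, full_matrices=False)
    temporal = U[:, 0] * np.sqrt(s[0])   # temporal response function
    spatial = Vt[0] * np.sqrt(s[0])      # spatial receptive field
    return temporal, spatial             # np.outer(temporal, spatial) ~ strf
```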
one neuron’s STRF
[figure: STRF estimated with natural vision movies vs. estimated with gratings]
another neuron’s STRF
[figure: STRF estimated with natural vision movies, with residual spatial bias corrected, vs. estimated with gratings]
which parts of the spatial STRFs differ when estimated with natural images vs. gratings?
- computed the correlation between the nat STRF and the normalized syn STRF over:
  - the whole spatial STRF
  - the positive component only
  - the negative component only
(a sketch of the comparison follows this slide)
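A minimal sketch of the component-wise comparison, assuming "positive/negative component only" means correlating the half-rectified maps; the paper's exact masking convention may differ.

```python
import numpy as np

def component_correlations(strf_nat, strf_syn):
    """Correlate two spatial STRFs: overall, positive-only, negative-only."""
    nat, syn = strf_nat.ravel(), strf_syn.ravel()

    def corr(a, b):
        return np.corrcoef(a, b)[0, 1]

    pos = corr(np.maximum(nat, 0), np.maximum(syn, 0))  # excitatory regions
    neg = corr(np.minimum(nat, 0), np.minimum(syn, 0))  # inhibitory regions
    return corr(nat, syn), pos, neg
```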
predicting response to natural vision movies (neuron 1)
predicting response to natural vision movies (neuron 2)
what makes prediction better?
- prediction evaluated on natural vision movies using models estimated on nat, syn, and hyb stimuli (metric sketched after this slide)
- 24 neurons (55%) predicted significantly better by nat-estimated models; 3 neurons (7%) significantly worse
[figure: per-neuron scatter plots comparing prediction accuracy across training stimuli; reported mean r values: 0.42, 0.34, 0.19, 0.30, 0.35]
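A minimal sketch of the accuracy measure: Pearson correlation between the model's predicted rate and the recorded PSTH on the held-out natural-movie segment (the 14 ms binning comes from the estimation slide; any smoothing is assumed already applied).

```python
import numpy as np

def prediction_accuracy(predicted_rate, psth):
    """Both arrays are (n_bins,) over the hold-out set, e.g. 14 ms bins."""
    return np.corrcoef(predicted_rate, psth)[0, 1]
```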
temporal responses
[figure: temporal responses, normalized and averaged across all neurons, for natural vision movies vs. gratings; relative amount of inhibition summarized as the integral of the temporal response]
(a sketch of the inhibition measure follows this slide)
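A minimal sketch of one way to quantify the "relative amount of inhibition" from a temporal response function: normalize, then compare the rectified negative and positive areas. The paper reports an integral of the temporal response, so the peak normalization here is an assumption.

```python
import numpy as np

def relative_inhibition(temporal):
    """Fraction of the temporal response's area that is inhibitory."""
    t = temporal / np.abs(temporal).max()   # assumed peak normalization
    excitation = np.sum(np.maximum(t, 0))   # positive-lobe area
    inhibition = np.sum(np.maximum(-t, 0))  # negative-lobe area
    return inhibition / (excitation + inhibition)
```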
what affects neural response properties?
- it's possible that natural spatial statistics alone are enough to change a neuron's estimated properties
- test with a new class of stimuli: natural image sequences (hyb)
  - natural spatial statistics, no temporal correlation (construction sketched after this slide)
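A minimal sketch of the hyb construction, under the assumption that presenting natural patches in random order suffices: spatial statistics are untouched, while frame-to-frame (temporal) correlations are destroyed.

```python
import numpy as np

def hybrid_sequence(natural_frames, rng=np.random):
    """natural_frames: (n_frames, h, w) patches sampled from natural images."""
    order = rng.permutation(len(natural_frames))
    return natural_frames[order]         # temporally decorrelated sequence
```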
executive summary (recap)
goal: model V1 neurons’ responses to natural stimuli
approach:
- phase-separated spatio-temporal Fourier domain filters
- data: natural vision movies, flashing gratings, hybrid stimuli
evaluation: receptive field analysis; PSTH prediction
results:
- prediction accuracy 0-90% (correlation coefficient; mean = 42%)
- for 55% of neurons, natural stimuli yield better-predicting models
- for 7% of neurons, natural stimuli yield worse-predicting models
- natural stimuli give better predictions for natural stimuli; gratings give better predictions for gratings
- temporal and spatial inhibitory components account for most of the prediction improvement
how good are the predictions?
mean prediction accuracy (correlation r) by training and test stimulus:

                 trained on nat   trained on syn
tested on nat         0.42             0.19
tested on syn         0.11             0.31