
1 How do neurons deal with uncertainty?
Sophie Deneve, Group for Neural Theory, Ecole Normale Supérieure, Paris, France.

2 Dealing with uncertainties: two approaches
Signal processing view: the state of the world or body determines a sensory variable, corrupted by sensor noise; efficient encoding (information maximization) maps it onto a neural representation (tuning curves, the neural code); efficient decoding (ideal observer) turns this representation, corrupted by neural noise, into a behavioral estimate and, through effector noise, a behavioral response.

3 Bayesian perception and action
Velocity perception as Bayesian inference: Weiss, Simoncelli and Adelson, 2002. Inferring 3D structures from 2D images: Knill and Richards, 1996. Bayesian integration for optimal motor strategies: Kording and Wolpert, 2004. Multisensory integration: Van Beers, Sittig and Gon, 1999; Ernst and Banks, 2002. The Bayesian framework has recently been used to explain a wide range of sensory and motor phenomena, and to build comprehensive theories of perception and action. Indeed, to infer useful sensory variables and compute appropriate strategies, we must integrate multiple noisy sensory cues, both quickly and accurately. Often these cues are unreliable or ambiguous, and must be combined with prior knowledge about the environment, the body, and what is likely to occur and what is not. Bayesian inference consists of computing the posterior, and it seems that animals and humans indeed use these strategies.

4 Dealing with uncertainties: two approaches
Bayesian models view: from the state of the world or body and the sensory variable, inference and prediction (likelihood; models, probabilities, priors…) yield posterior probabilities, the internal beliefs. These beliefs drive decisions and explorations: behavioral strategies chosen to maximize utility/reward.

5 Two parts of the lecture
Part 1: How can variables encoded by populations of noisy neurons be estimated? How can they be combined? Part 2: Are probability distributions and belief states encoded in single neurons or in neural populations?

6 Population coding

7 Population coding (Georgopoulos et al., 1986)

8 Population Codes
[Figure: left, tuning curves, activity (0–100) vs. direction (deg, −100 to 100); right, the average pattern of activity evoked by a stimulus s, activity vs. preferred direction (deg). What is s?]

9 Poisson Variability in Cortex
[Figure: spike rasters for trials 1–4, and the variance of the spike count plotted against the mean spike count; for cortical neurons the variance grows roughly in proportion to the mean, as expected for Poisson variability.]

10 Noisy population codes
[Figure: a noisy pattern of activity r, activity (20–80) vs. preferred direction (deg, −100 to 100). What are s and its estimate?] Assuming independent Poisson noise, the response distribution is $P(\mathbf{r} \mid s) = \prod_i \frac{e^{-f_i(s)} f_i(s)^{r_i}}{r_i!}$, where $f_i(s)$ is the tuning curve of neuron i.
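To make this concrete, here is a minimal sketch (not from the original slides) of sampling a noisy population code with independent Poisson noise; the Gaussian tuning shape and all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed Gaussian tuning curves over direction (deg)
prefs = np.linspace(-100, 100, 41)           # preferred directions
def tuning(s, gain=50.0, width=25.0, base=2.0):
    """Mean activity f_i(s) of each neuron for stimulus direction s."""
    return gain * np.exp(-(prefs - s) ** 2 / (2 * width ** 2)) + base

s_true = 20.0
r = rng.poisson(tuning(s_true))              # one noisy population response
print(r)

# Poisson hallmark: across trials, the variance of each count tracks its mean
trials = rng.poisson(tuning(s_true), size=(1000, prefs.size))
print(np.allclose(trials.mean(0), trials.var(0), rtol=0.2))   # roughly True
```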

11 Population vector
[Figure: noisy activity (20–100) vs. preferred direction (deg, −100 to 100).] The population vector estimate sums each neuron's preferred-direction vector $\mathbf{P}_i$, weighted by its response: $\hat{\mathbf{P}} = \sum_i r_i \mathbf{P}_i$; the estimated direction is the angle of $\hat{\mathbf{P}}$.
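A hedged sketch of this decoder for a full circle of preferred directions; the circular Gaussian tuning and all parameters are assumed.

```python
import numpy as np

rng = np.random.default_rng(1)
prefs_deg = np.linspace(-180, 180, 60, endpoint=False)   # preferred directions
prefs_rad = np.deg2rad(prefs_deg)

def tuning(s_deg, gain=40.0, width=40.0, base=2.0):
    d = (prefs_deg - s_deg + 180) % 360 - 180    # circular distance
    return gain * np.exp(-d ** 2 / (2 * width ** 2)) + base

r = rng.poisson(tuning(35.0))                    # noisy response to 35 deg
# Population vector: preferred-direction unit vectors weighted by spike counts
vec = np.stack([np.cos(prefs_rad), np.sin(prefs_rad)]) @ r
s_hat = np.rad2deg(np.arctan2(vec[1], vec[0]))
print(f"population vector estimate: {s_hat:.1f} deg")
```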

12 Maximum likelihood
[Figure: noisy activity vs. preferred direction of motion (deg, −45 to 45).] The optimal estimator is maximum likelihood (ML): $\hat{s}_{ML} = \arg\max_s P(\mathbf{r} \mid s)$. If this procedure is repeated over many trials, that is, over many different noisy responses to the same stimulus, the estimate will have a certain variance, which defines its precision. The variance of the ML estimate is the minimum variance that can be achieved by an unbiased estimator, which is called, in statistical terms, the Cramér-Rao bound. This bound can be derived mathematically from the shape of the tuning curves of each neuron in the population and from the covariance of the neural noise. Once we have computed this bound, we can measure the quality of any estimate by how much larger its variance is than the ML performance. Maximum performance for an unbiased estimator: the Fisher information.
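A minimal sketch of ML decoding by grid search under the independent-Poisson model above (all parameters assumed for illustration).

```python
import numpy as np

rng = np.random.default_rng(2)
prefs = np.linspace(-45, 45, 30)
def tuning(s, gain=60.0, width=15.0, base=2.0):
    d = np.subtract.outer(s, prefs)              # works for scalar s or a grid
    return gain * np.exp(-d ** 2 / (2 * width ** 2)) + base

r = rng.poisson(tuning(5.0))                     # noisy response to s = 5 deg

s_grid = np.linspace(-45, 45, 901)
F = tuning(s_grid)                               # (901, 30) mean rates
logL = (r * np.log(F) - F).sum(axis=1)           # Poisson log likelihood + const
print(f"ML estimate: {s_grid[np.argmax(logL)]:.1f} deg")
```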

13 Maximum likelihood
[Figure: a second noisy population response decoded by ML, activity vs. preferred direction of motion (deg, −45 to 45).]

14 Fisher Information
[Figure: a tuning curve, activity (20–80) vs. direction (deg, −100 to 100).] For one neuron with Poisson noise: $I_i(s) = \frac{f_i'(s)^2}{f_i(s)}$, where $f_i'(s)$ is the derivative of the tuning curve and $f_i(s)$ the tuning curve (mean activity). For n independent neurons: $I(s) = \sum_{i=1}^n \frac{f_i'(s)^2}{f_i(s)}$. A large slope is good! The more neurons, the better! A small variance is good!
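A small numerical check of these formulas under assumed Gaussian tuning curves, using finite differences for $f_i'(s)$.

```python
import numpy as np

prefs = np.linspace(-100, 100, 81)
def tuning(s, gain=50.0, width=25.0, base=2.0):
    return gain * np.exp(-(s - prefs) ** 2 / (2 * width ** 2)) + base

def fisher_info(s, eps=1e-3):
    f = tuning(s)                                          # f_i(s)
    df = (tuning(s + eps) - tuning(s - eps)) / (2 * eps)   # f_i'(s)
    return np.sum(df ** 2 / f)            # sum over independent Poisson neurons

I = fisher_info(0.0)
print(f"I(s) = {I:.2f}; Cramer-Rao bound on the std: {I ** -0.5:.3f} deg")
```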

15 Variance of the maximum likelihood estimate
For a population of neurons with independent Poisson noise, the ML estimate is unbiased and its variance reaches the Cramér-Rao bound (asymptotically): $\mathrm{Var}(\hat{s}_{ML}) = \frac{1}{I(s)} = \left( \sum_i \frac{f_i'(s)^2}{f_i(s)} \right)^{-1}$.

16 Recurrent network
[Figure: noisy activity vs. preferred direction of motion (deg, −45 to 45).] The question is whether maximum likelihood, which is a very nonlinear computation, could be implemented in a biologically realistic fashion. We showed that, in fact, maximum likelihood estimation can be performed quite naturally by an interconnected cortical network with local excitation (lateral connections pooling activity locally) and global inhibition, which ensures stability. These networks take noisy population codes as input…

17 Recurrent network
[Figure: the network converges to a smooth hill of activity vs. preferred direction of motion (deg, −45 to 45).] …and converge to smooth hills of activity (noiseless population codes). These smooth hills can peak anywhere on the neural layer, so these networks are said to have a continuous attractor. The position of the peak of the stable hill is an estimate of the variable encoded in the noisy input pattern. We found that we could tune these continuous attractor networks so that the variance of their estimate reaches the Cramér-Rao bound: interconnected cortical networks can act as ideal observers of noisy population codes. In this particular example, the network recovers the best estimate of the eye-centered position of the stimulus given its noisy visual input. If the network is a line attractor, and if a condition relating the network parameters and the derivative of the tuning curve to the covariance of the neural noise holds, the network estimate achieves the ML variance.
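A hedged sketch in the spirit of the networks described here (all parameters assumed): recurrent dynamics with local pooling plus divisive global inhibition clean a noisy population code into a smooth hill, whose peak is the network's estimate.

```python
import numpy as np

rng = np.random.default_rng(3)
prefs = np.linspace(-45, 45, 60)
d = prefs[:, None] - prefs[None, :]
W = np.exp(-d ** 2 / (2 * 10.0 ** 2))            # local excitatory pooling

# Noisy input: Poisson response to a hill centered at 5 deg
o = rng.poisson(40 * np.exp(-(prefs - 5.0) ** 2 / (2 * 12.0 ** 2)) + 2).astype(float)

mu, S = 0.05, 1.0
for _ in range(30):
    u = W @ o                                    # lateral pooling of activity
    o = u ** 2 / (S + mu * np.sum(u ** 2))       # squaring + divisive inhibition

print(f"network estimate (peak of the smooth hill): {prefs[np.argmax(o)]:.1f} deg")
```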

18–24 Line attractor networks
[Animation: the population activity (activity 20–100 vs. preferred direction of motion, deg, −45 to 45) is replotted as a point in the space spanned by the responses of neuron 1, neuron 2 and neuron 3; the set of smooth hills traces out a one-dimensional manifold in this space, and the network dynamics project noisy responses onto it.]

25–27 Line attractor networks
[Figure: in the (neuron 1, neuron 2, neuron 3) response space, the recurrent dynamics project the noisy response onto the stable manifold; the direction of projection is determined by the covariance of the neural noise and the shape of the stable manifold.]

28 Cue integration

29 Multi-sensory integration
Visual capture: the ventriloquism effect.

30–32 Combining cues from several modalities: Gaussian distributions
[Animation: probability vs. position of the object, x; two unimodal likelihoods (e.g. visual and auditory), with estimates $\hat{x}_1$ and $\hat{x}_2$, are combined into a narrower bimodal posterior.]

33 Analogy with the center-of-mass
The combined estimate behaves like a center of mass of the unimodal estimates, weighted by their reliabilities. Visual more reliable (visual capture): the bimodal estimate lies close to the visual estimate $x_{vis}$ and far from the auditory estimate $x_{aud}$. Auditory more reliable (auditory capture): the bimodal estimate lies close to $x_{aud}$.
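In equations, the optimal Gaussian combination is the inverse-variance weighted average $\hat{x} = \frac{\hat{x}_{vis}/\sigma_{vis}^2 + \hat{x}_{aud}/\sigma_{aud}^2}{1/\sigma_{vis}^2 + 1/\sigma_{aud}^2}$. A tiny numeric sketch with made-up numbers:

```python
# Inverse-variance weighting: Bayesian combination of two Gaussian cues
x_vis, var_vis = 10.0, 1.0     # visual estimate and its variance (assumed)
x_aud, var_aud = 0.0, 4.0      # auditory estimate and its variance (assumed)

w_vis = (1 / var_vis) / (1 / var_vis + 1 / var_aud)
x_bimodal = w_vis * x_vis + (1 - w_vis) * x_aud
var_bimodal = 1 / (1 / var_vis + 1 / var_aud)

print(x_bimodal, var_bimodal)  # 8.0, 0.8: visual capture, and a lower variance
```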

34 Ernst and Banks, 2002
[Figure, from Ernst and Banks, Nature, 2002: visual width $x_v$, haptic width $x_t$, bimodal width; discrimination threshold (STD, 0.05–0.2) vs. visual noise level (67–200%), with the measured bimodal STD against the prediction of the optimal model; and the measured visual weight (0.4–1) against the optimal prediction.] Do humans use such an optimal strategy to integrate inputs from different sensory modalities? Evidence for this comes from Ernst and Banks. In this experiment, subjects had to estimate the width of a bar presented visually (through random-dot stereograms) or haptically (through a force-feedback device). Different levels of noise were added to the visual display in order to change the reliability of the visual input compared to the haptic input. For four levels of visual noise, they measured the discrimination thresholds of subjects presented with visual bars, haptic bars, and bimodal (visual and haptic) bars. They compared the measured bimodal discrimination threshold (the pink diamonds in the graph) with the threshold predicted if visual and haptic cues were combined optimally: the optimal combination model predicts the bimodal threshold very well. They also measured the weight given to the visual modality in estimating the width of bimodal bars. An optimal combination model predicts that this weight depends on the reliability of the visual modality, and the measured visual weights match the optimal predictions quite well. Thus, humans seem to combine visual and haptic cues optimally, taking into account the reliability of each cue before combining them. From: Ernst and Banks, Nature, 2002.

35 Prior, likelihood and posterior
[Figure: neural response (20–100) vs. preferred position (x).] Prior: $P(x)$. Posterior (Bayes' theorem): $P(x \mid \mathbf{r}) \propto P(\mathbf{r} \mid x)\, P(x)$.

36 Prior, likelihood and posterior
[Figure: neural response (20–100) vs. preferred position (x).] With a flat prior, the posterior is proportional to the likelihood: $P(x \mid \mathbf{r}) \propto P(\mathbf{r} \mid x)$.

37 Gain as encoding certainty
[Figure: visual input and auditory input, each a hill of activity vs. preferred direction of motion (deg, −45 to 45), and their product.] The gain (overall amplitude) of each population code encodes the certainty of the corresponding cue; combining the cues amounts to taking the product of the encoded likelihoods.

38 Integrating several noisy population codes
[Figure: visual layer (activity vs. preferred eye-centered position, −45 to 45), auditory layer (vs. preferred head-centered position), and eye position layer (vs. preferred eye position).] This uncertainty can be decreased by integrating information from different modalities. For example, consider the task of localizing a bee around the head. The bee can be seen, heard, or felt if it lands on the skin, and these different sensory inputs can be combined to get a better idea of where the bee is. In visual cortex, cells have eye-centered receptive fields, and the position of the object is represented by a noisy hill of activity peaking at its eye-centered position. In auditory and tactile cortex, however, the neurons' receptive fields are anchored to the head: the position of an object is represented by a noisy hill of activity peaking at its head-centered position. These two pieces of information cannot be combined without knowing the eye position, which relates the two frames of reference. Noisy population codes for eye position have likewise been observed in various brain areas.

39 Maximum likelihood across layers
[Figure: visual, auditory and eye position layers as before (activity 10–20).] To combine the information contained in these three population codes about the position of the object, the optimal approach is to find the most probable positions and eye position that could have produced the responses on the sensory and postural layers. This is the Bayesian estimate of position.

40 Multidirectional sensory prediction
[Figure: visual, auditory and eye position layers as before.] Instead of spatial alignment of sensory maps, we propose another interpretation of multisensory areas: as computational intermediates in a multidirectional sensory prediction. For example, visuo-tactile areas would serve as intermediates for predicting the position of a tactile input from its visual position, and simultaneously predicting the visual position from the tactile position. The intuition is that if this multidirectional sensory prediction is iterated, it will eventually converge to the best prediction of position.

41 How do we do coordinate transforms?
[Figure: visual layer (activity 10–20 vs. preferred eye-centered position, −45 to 45), auditory layer (vs. preferred head-centered position), eye position layer (vs. preferred eye position).]

42 Not like this!
[Figure: simply summing (+) the visual, auditory and eye position layers does not implement the coordinate transform.]

43 Look-up table
[Figure: visual, auditory and eye position layers connected through an intermediate look-up table.]

44 Radial basis function map
[Figure: visual, auditory and eye position layers connected through an intermediate radial basis function map; the iterative basis function (IBF) network.]
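A hedged sketch of the basis-function idea (not the authors' implementation): intermediate units tuned to conjunctions of eye-centered position and eye position are pooled along the diagonals h = r + e to yield a head-centered code; for product units this pooling reduces to a discrete convolution.

```python
import numpy as np

positions = np.arange(-45, 46, dtype=float)   # deg, common grid (assumed)
def hill(center, width=8.0, gain=20.0):
    return gain * np.exp(-(positions - center) ** 2 / (2 * width ** 2))

vis = hill(10.0)         # eye-centered (retinal) input, peak at 10 deg
eye = hill(-15.0)        # eye position input, peak at -15 deg

# Basis layer: units tuned to conjunctions of retinal and eye position
basis = np.outer(vis, eye)                    # basis[i, j] = vis_i * eye_j

# Head-centered readout pools basis units along the diagonals r + e = h,
# which for these product units is a discrete convolution of the two hills
head = np.convolve(vis, eye)
h_grid = np.arange(-90, 91, dtype=float)
print(f"head-centered peak: {h_grid[np.argmax(head)]:.0f} deg")  # ~ 10 + (-15) = -5
```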

45–48 Bimodal visuo-tactile input
[Animation: visual, tactile and eye position layers, and the trajectory of the network state in the (neuron 1, neuron 2, neuron 3) response space, with the resulting estimates.] We first tested the network by initializing the activity of the input maps to noisy population codes and letting it iterate over time. These frames show cases where the network receives bimodal (visual and tactile) input, including one with a weak tactile input.

49 Unimodal visual input
[Figure: same layout.] This is an instance where we tested the network with unimodal input: there is no tactile input, and the network must rely solely on the visual and postural inputs.

50 Result: the IBF network is an optimal Bayesian estimator
Performance: the variance of the estimates is less than 4% worse than the variance of the best (Bayesian) estimator. We tested whether these estimates were Bayesian estimates of position by comparing their errors with the errors made by a Bayesian estimator. We found that the network estimates were, in all conditions, less than 4% worse than the best possible estimator. Thus, information from different sensory inputs can be integrated quasi-optimally with this network. We were able to prove this result for all IBF networks, regardless of the computation performed.

51 From: Ernst and Banks, Nature, 2002
[Figure: tactile layer, visual layer and eye position layer; the network applied to the Ernst and Banks task.]

52 Ernst and Banks, 2002
[Figure, from Ernst and Banks, Nature, 2002: discrimination threshold (STD, 0.05–0.2) vs. visual noise level (67–200%): unimodal tactile STD, unimodal visual STD, measured bimodal STD, and the bimodal STD predicted by the optimal model; and visual weight (0.2–1) vs. visual noise level, measured against the optimal prediction.]

53 Why does it work? Reason 1: line attractor networks work for different gains
[Figure: hills of activity with different gains G vs. preferred direction of motion (deg, −45 to 45).]

54 Line attractor networks: different gains
[Figure: the hills of activity (activity 20–100 vs. preferred direction of motion) for different gains, replotted in the (neuron 1, neuron 2, neuron 3) response space.]

55 Why does it work? Reason 2: for independent Poisson noise, summing the activities is equivalent to multiplying the posteriors
[Figure: visual and auditory inputs (activity vs. preferred direction of motion, deg, −45 to 45), their sum (activity up to 200), and the product of the two posteriors.]

56 Generalization: implicit probabilistic encoding
Linear combinations correspond to optimal cue integration as long as the distribution of the neural noise belongs to the exponential family with linear sufficient statistics, $P(\mathbf{r} \mid s) = \phi(\mathbf{r})\, \exp\!\big(\mathbf{h}(s)^{\top} \mathbf{r}\big)$. We then have $P(s \mid \mathbf{r}_1 + \mathbf{r}_2) \propto P(s \mid \mathbf{r}_1)\, P(s \mid \mathbf{r}_2)$. Beck, Latham and Pouget, 2007.
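A hedged numerical check of the Poisson special case (tuning shapes and gains are assumptions): decoding the summed activity of two populations gives the same posterior as multiplying their individual posteriors, provided the baseline scales with the gain so that the sum stays in the same family.

```python
import numpy as np

rng = np.random.default_rng(4)
prefs = np.linspace(-100, 100, 101)

def tuning(s_grid, gain):
    # Baseline scales with gain so that summed responses stay in the family
    bump = np.exp(-np.subtract.outer(s_grid, prefs) ** 2 / (2 * 15.0 ** 2))
    return gain * (bump + 0.025)

s_true, g_vis, g_aud = 5.0, 20.0, 8.0
r_vis = rng.poisson(tuning(s_true, g_vis))    # visual population response
r_aud = rng.poisson(tuning(s_true, g_aud))    # auditory population response

s = np.linspace(-50, 50, 401)
def log_post(r, gain):                        # flat prior, independent Poisson
    F = tuning(s, gain)
    L = (r * np.log(F) - F).sum(axis=1)
    return L - L.max()

prod = log_post(r_vis, g_vis) + log_post(r_aud, g_aud)   # multiply posteriors
summed = log_post(r_vis + r_aud, g_vis + g_aud)          # decode summed activity
print(np.allclose(prod - prod.max(), summed, atol=1e-6)) # True: same posterior
```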

57 Structure of the iterative basis function network
[Figure: the multisensory representation in the intermediate layer.] Of course, we are interested in the nature of the spatial representation in the intermediate layer. What are the predictions of the model for the receptive fields of multisensory neurons?

58 Receptive fields in different modalities are roughly aligned in space.
Visual and tactile receptive fields in VIP: Is multi-sensory integration simply realized by the convergence of all sensory inputs on a common spatial map?

59 Frame of reference of a multisensory cell's receptive field: the sensory alignment hypothesis
Well, no, it does not work. Either the visual receptive field remains eye-centered, in which case the tactile receptive field is forced to move on the skin to keep the alignment, or the tactile receptive field remains skin-centered, in which case the visual receptive field has to shift on the retina. In both cases, multisensory integration by spatial map alignment requires a change of coordinates of at least one sensory input to remain valid for all eye positions.

60 Model prediction: partially shifting receptive fields
Are these receptive fields eye-centered? No. Here is a plot of the response of one intermediate unit as a function of the eye-centered position of a visual stimulus (on the left) and of a tactile stimulus (on the right), for three eye positions. If the visual receptive field were eye-centered, it would not depend on eye position and the three curves should peak at the same position, which is not the case. Thus, the receptive field shifts on the retina. Are these receptive fields head-centered? No. If the receptive fields were head-centered, they should shift by an amount equal and opposite to the eye shift, and peak at the positions of the dotted lines. They do not. In fact, the receptive field moves partially with the eye, and the response is gain-modulated by eye position. Finally, are the visual and tactile receptive fields aligned in this visuo-tactile cell? Not even that! On these plots the visual and tactile responses look similar, but if you look carefully, the two receptive fields do not remain in register across eye movements. The tactile receptive field tends to be closer to a head-centered frame of reference. The spatial discrepancy is illustrated on this graph: with the eyes looking straight ahead the receptive fields are aligned, but not when the monkey looks to the left, where the tactile receptive field shifts less than it should. Partially shifting visual receptive fields have been reported by Duhamel et al. in VIP. Consistent with our model, visual receptive fields shift partially, but the tactile receptive fields of the same neurons remain anchored to the skin. This is in opposition to the traditional view that sensory inputs have to be remapped into a common frame of reference to be integrated. Duhamel et al, 1997, 2001.

61 The modality dominance influences the receptive field shift
Visual dominant vs. tactile dominant: [figure comparing the network's estimate with the visual and tactile estimates in each case]. Another way to manipulate the representation of the intermediate cells is to change the overall strength of the connections coming from the different modalities. If one multiplies the weights of the connections coming from the visual layer by 2, then, as plotted on this graph, the intermediate cells will tend to be dominated by the eye-centered input and have eye-centered receptive fields. On the other hand, if one multiplies the weights of the connections coming from the tactile layer, the multisensory units will tend to have head-centered receptive fields. The question is how cranking up the weights of one modality changes the outcome of the network as a multisensory integrator. We have seen previously that, for the network to work properly, the gain of each sensory input should be proportional to its reliability. Thus, increasing the weights, which is equivalent to increasing the gain of the sensory input in that modality, results in the network giving more confidence to this modality. If we increase the visual weights, the network gives more prior confidence to the visual modality and the bimodal estimate of position is closer to the visual input. If, on the other hand, we increase the tactile weights, the network gives more confidence to the tactile modality and the bimodal estimate is closer to the tactile estimate. Thus, there is a correspondence between the representation in multisensory cells and the confidence they give to each sensory modality: multisensory cells are dominated by the frame of reference of the input they trust most.

62 Visual and tactile receptive fields of a VIP cell
(Avillac et al, 2001)

63 Shift ratio as a function of eye-centered dominance
[Figure: shift ratio (0.5–1) of visual and of tactile (auditory) receptive fields, down to head-centered, as a function of eye-centered dominance (the confidence given to the visual modality, 4 to 0.25), with the regimes corresponding to LIP and SC (Jay and Sparks, 1986; Stricanne et al, 1995), VIP (Duhamel et al, 1997; Avillac et al, 2001) and PMv (Graziano and Gross, 1998).] We can interpret experimental results obtained from multisensory cells in various brain areas in terms of the confidence given to the visual modality. The plot summarizes how much the receptive fields shift with the eye as a function of the strength of the connection weights coming from the visual layer, which we call the eye-centered dominance, and of the modality of the input; the blue curve corresponds to the tactile receptive field and the pink curve to the visual receptive field. When the eye-centered dominance is strong, that is, when the network trusts the visual modality more than the tactile modality, the visual and tactile receptive fields tend to be eye-centered. When the eye-centered dominance is weak and the network trusts the tactile modality more, the visual and tactile receptive fields tend to be head-centered. For intermediate eye-centered dominance, when both modalities are trusted roughly equally, the visual receptive fields tend to be more eye-centered than the tactile receptive fields; as a consequence, there is a discrepancy between the spatial representations in the visual and tactile maps. In VIP, visual receptive fields are partially shifting and tactile receptive fields are mainly head-centered; thus VIP could lie in the part of the state space where the eye-centered dominance is weak, meaning that this area gives more confidence to the tactile modality in its multisensory integration. In the superior colliculus and in LIP, auditory receptive fields are partially shifting and visual receptive fields are eye-centered; this could correspond to the part of the space where more confidence is given to the visual modality than to the auditory modality.

64 Taking time into account
Inference and predictions need to take place on short time scales. Events start and end unpredictably, objects move… We need to estimate the states of the relevant variables on-line. There is no time to converge to an attractor. Spike-per-spike computation (and coding?).

65 Explicit encoding of probabilities by population codes
Convolutional codes (Zemel, Dayan, Pouget, Sahani). Basis function coding (Andersen, Barber, Eliasmith). Time? Inference is hard. Experimental evidence is lacking.

66–72 It is not enough to store past input to combine it with new input.
[Animation: a murder-case example; the posterior P(guilty) over three suspects (Dupont, Durand, Smith) is updated as successive noisy reports arrive.]

73–75 It is not enough to store past input to combine it with new input.
[Animation: a second example; the probability P(search) over two locations (London, Paris) is updated as new input arrives.]

76 Hidden Markov Model
The hidden variable (the cause, $x_t$) generates the observations ($o_t$). Inference can be performed recursively: $P(x_t \mid o_{1:t}) \propto P(o_t \mid x_t) \sum_{x_{t-1}} P(x_t \mid x_{t-1})\, P(x_{t-1} \mid o_{1:t-1})$. Here the observations are spikes: 0 (no spike) or 1 (spike).
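A minimal simulation of this recursion for a binary hidden state observed through one spike train (all rates are assumed; the notation $r_{on}$, $r_{off}$, $q^{on}$, $q^{off}$ anticipates the examples below).

```python
import numpy as np

rng = np.random.default_rng(5)
dt = 0.001                    # bin size (s); assumed
r_on, r_off = 1.0, 1.0        # appearance/disappearance rates (Hz); assumed
q_on, q_off = 40.0, 10.0      # firing rate given state present/absent (Hz); assumed

T = 5000
x = np.zeros(T, dtype=int)    # hidden state: present (1) or absent (0)
spikes = np.zeros(T, dtype=int)
p = np.full(T, 0.5)           # posterior P(x_t = 1 | spikes up to t)

for t in range(1, T):
    # Hidden dynamics: small switching probability per bin
    switch = (r_on if x[t-1] == 0 else r_off) * dt
    x[t] = 1 - x[t-1] if rng.random() < switch else x[t-1]
    # Observation: at most one spike per small bin (Bernoulli approximation)
    spikes[t] = int(rng.random() < (q_on if x[t] else q_off) * dt)
    # Prediction step: the state may have switched since the last bin
    prior = p[t-1] * (1 - r_off * dt) + (1 - p[t-1]) * r_on * dt
    # Update step: multiply by the likelihood of spike / no spike
    lik1 = q_on * dt if spikes[t] else 1 - q_on * dt
    lik0 = q_off * dt if spikes[t] else 1 - q_off * dt
    p[t] = lik1 * prior / (lik1 * prior + lik0 * (1 - prior))

print(f"final posterior P(present) = {p[-1]:.2f}, true state = {x[-1]}")
```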

77 Hidden Markov Model, example 1: detecting the presence of a stimulus
Hidden variable: the stimulus is present or absent. Transition probabilities: the probability that the stimulus appears ($r_{on}$) or disappears ($r_{off}$). Observation model: the firing rate of neuron i given that the stimulus is present ($q_i^{on}$) or absent ($q_i^{off}$).

78 Hidden Markov Model, example 2: tracking the state of the arm during movement
Hidden variable: the position, speed and acceleration of the arm. Transition model: the noisy dynamics of the arm. Observation model: the tuning curve of neuron i.

79 Evidence for “explicit” encoding of probability in neural activity: Shadlen et al
Gold and Shadlen, 2004 Rightward motion: saccade right Leftward motion: saccade left

80 Evidence for “explicit” encoding of probability in neural activity: Shadlen et al
[Figure: in the brain, motion evidence from medio-temporal cortex (MT) feeds the lateral intra-parietal cortex (LIP); LIP firing rate vs. time ramps up for the preferred direction, with a slope that increases with motion coherence, and down for the non-preferred direction.]

81 Evidence for “explicit” encoding of probability in neural activity: Shadlen et al
[Figure: MT and LIP firing rates vs. time.]

82–90 LIP/MT model (sequential probability ratio test)
[Animation: MT responses to leftward and rightward motion are accumulated over time; when the accumulated evidence for one direction sufficiently exceeds the other (Left>Right or Right>Left), the corresponding saccade (left or right) is triggered.]

91 Evidence for “explicit” encoding of probability in neural activity: Shadlen et al
Assume the direction of motion never changes during the trial. Going to the log (replacing products by sums), the log odds of the two directions is updated additively: $L_t = L_{t-1} + \sum_i \log \frac{P(o_i(t) \mid x=1)}{P(o_i(t) \mid x=0)}$. In the limit of a small temporal discretization step dt: $\frac{dL}{dt} = \sum_i \Big[ o_i(t) \log \frac{q_i^1}{q_i^0} - \big(q_i^1 - q_i^0\big) \Big]$, where $q_i^1$ is the firing rate of neuron i when x is 1 and $q_i^0$ its firing rate when x is 0.
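A hedged simulation of this accumulation for a single pair of rates (parameters assumed): with the state fixed for the whole trial, the log odds drifts linearly toward the correct side, like the ramping LIP responses.

```python
import numpy as np

rng = np.random.default_rng(6)
dt, T = 0.001, 2000
q1, q0 = 40.0, 20.0           # firing rates (Hz) given x = 1 / x = 0; assumed
x = 1                         # true (fixed) state for the whole trial

spikes = (rng.random(T) < (q1 if x else q0) * dt).astype(float)
# Log-odds increments: +log(q1/q0) per spike, -(q1-q0)*dt per silent bin
L = np.cumsum(spikes * np.log(q1 / q0) - (q1 - q0) * dt)
print(f"log odds after {T * dt:.1f} s: {L[-1]:.2f}")  # drifts positive for x = 1
```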

92–94 Bayesian temporal integration in single neurons
[Figure: a binary hidden variable that switches on at rate $r_{on}$ (rate of switching on) and off at rate $r_{off}$ (rate of switching off), observed by a single neuron through its synaptic spike trains.]

95 Bayesian integration corresponds to leaky synaptic integration.
[Figure: the log odds vs. time, computed by integrating the synaptic input with a leak.]
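For reference, a hedged reconstruction of the corresponding log-odds dynamics (this form follows from the derivation sketched above; the exact equation on the slide is not recoverable from the transcript):

```latex
\frac{dL}{dt} \;=\;
\underbrace{r_{on}\,\bigl(1 + e^{-L}\bigr) \;-\; r_{off}\,\bigl(1 + e^{L}\bigr)}_{\text{nonlinear leak toward the prior}}
\;+\;
\underbrace{\sum_i \Bigl[\, o_i(t)\, \log\tfrac{q_i^{on}}{q_i^{off}} \;-\; \bigl(q_i^{on} - q_i^{off}\bigr) \Bigr]}_{\text{synaptic input}}
```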

96 Accumulation of evidence results in linear ramps followed by a saturation.
[Figure: log odds ratio (−10 to 5) vs. time (100–400); A: slow dynamics, B: fast dynamics.] If the stimulus varies very slowly, the neuron is an integrator; for faster stimuli, the integration is leaky.

97 Importance of using the right temporal statistics
[Figure: the motion alternates (right, left, right, left); the log odds must track the switches, crossing the Left>Right and Right>Left boundaries.]

98 Generalization to multiple states
Multiple choices. Continuous variables. Sequences of discrete states. [Figure: an input layer feeding a recurrent layer through feedforward weights W, with recurrent weights M.]

99 Alternative approaches
Linearization of the recurrent equations: recurrent networks of leaky integrate-and-fire neurons (Rao, 2001). Membrane potentials are log probabilities; spike generation exponentiates (Rao, 2003). Firing rates are probabilities; multiplicative dendrites (Beck and Pouget, 2007). Generalization of line attractor networks (Deneve et al, 2007). Deterministic spike generation mechanism: spikes signal increases in probability (Deneve, 2007).

100 Optimal filtering of noisy sensory input in the presence of movement
[Figure: position input, speed input and force (efferent motor command) population codes, with spike counts over preferred arm position, preferred arm speed and preferred force, feeding an internal model over time.] This is a very simplified example of the problem posed by sensory noise in the presence of movement. Consider a subject instructed to move his arm along a treadmill in the dark, and to estimate the position of the arm at the end of the movement. To a first approximation, the hand movement can be described by a linear dynamical equation, the position and speed at time t being a function of the position and speed at time t−1 and the force exerted at time t−1, plus a motor noise term N(t). Suppose that there are somatosensory neurons representing the hand speed and location as noisy population codes, and that an estimate of the force exerted on the arm is available, for example in the form of efferent motor commands. How can these noisy population codes be integrated to get the best estimate of hand location? One possibility is to average all spikes emitted by position cells during the movement, but that would not work, given that the location varies over time. Another possibility is to consider only the spikes received during a very short time window, when the arm is approximately static, and obtain an estimate by maximum likelihood; but during such a short window only a few spikes have been emitted, so the estimate will be very unreliable. One could also take advantage of the fact that the dynamical equations and the force exerted on the arm are available: if the initial state of the arm is known, an on-line estimate of the arm position at time 1 can be obtained by applying the dynamical equations to the position and speed at time 0, then at time 2 by applying them to the state estimate at time 1, and so on. However, due to the motor noise N(t), this forward estimate accumulates errors over time, getting further and further from the real arm position for longer movements. The optimal solution is to use the forward estimate, corrected at each time step by the sensory measurements obtained from the somatosensory population codes. This process is called Kalman filtering. The relative contributions of the forward estimate and the sensory feedback are weighted by a Kalman gain matrix that can be derived from the sensory and motor noise.
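A hedged one-dimensional sketch of this predict-and-correct scheme as a standard Kalman filter; the dynamics, noise levels and the scalar sensory reading (standing in for the decoded population code) are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
dt, T = 0.01, 500
A = np.array([[1, dt], [0, 1]])       # state transition for (position, speed)
B = np.array([0.5 * dt**2, dt])       # effect of the force (control input)
Q = np.diag([1e-6, 1e-4])             # motor noise covariance (assumed)
R = np.array([[0.05]])                # sensory noise variance (assumed)
H = np.array([[1.0, 0.0]])            # only position is measured

x = np.zeros(2)                       # true state
xh, P = np.zeros(2), np.eye(2) * 0.01 # estimate and its covariance
for t in range(T):
    u = np.sin(2 * np.pi * t * dt)    # known efferent motor command
    x = A @ x + B * u + rng.multivariate_normal(np.zeros(2), Q)
    z = H @ x + rng.normal(0, np.sqrt(R[0, 0]), 1)     # noisy sensory reading
    # Forward (prediction) step using the internal model
    xh = A @ xh + B * u
    P = A @ P @ A.T + Q
    # Correction step: the Kalman gain weights prediction vs. sensory feedback
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    xh = xh + (K @ (z - H @ xh)).ravel()
    P = (np.eye(2) - K @ H) @ P

print(f"true position {x[0]:.3f}, estimate {xh[0]:.3f}")
```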

101 Network model
[Figure: position input and speed input population codes project to a sensorimotor map over arm position and arm speed through feedforward connections W; the map also receives the efferent motor command and carries predictive (lateral) connections M.]

102 [Figure: spike-count population codes (# spikes vs. preferred arm position and vs. preferred arm speed) at successive time steps.]

103 The position and speed tuning curves of sensorimotor neurons reflect the reliability and covariance of the different sensory and motor variables.

104 Prediction: partially shifting representation of position and speed
Individual cells shift their velocity tuning curves with position. [Figure: activity vs. speed (5–50 cm/sec) with the arm at 0 cm and at 20 cm; the population vector still points in the right direction!] What does this model predict for the neural representation in sensorimotor cells? On this plot we show the speed tuning curve of a basis function cell for two different arm positions. On the left, the response to speed at the beginning of the movement: the cell appears tuned to speed and gain-modulated by arm position. In the center, the response to speed after 400 ms of movement: the cell is still tuned to speed, but its speed tuning curve is no longer invariant to arm position; it shifts with the arm, and can be said to be partially shifting. Finally, on the right, the same plot after 400 ms of movement but with sensory inputs four times more reliable, as would be the case if the movement were made in full view, with visual input about arm position and speed added to the somatosensory input: this time the speed tuning curve is even less invariant, and shifts more with arm position. Thus, basis function cells have tuning curves that are not invariant, and whose amount of shift depends on the sensory and motor context. This is because these cells must take into account the reliabilities of the sensory inputs and the correlations between the input variables to perform optimal filtering. The correlation between arm position and speed varies during the movement, and so does the neural representation in sensorimotor cells. This versatility of neural representations in basis function maps might explain why it is so difficult to find invariant tuning curves in sensorimotor cells, and in part why the nature of the neural representation in primary motor cortex is so controversial.

105 Prediction: partially shifting representation of position and speed
[Figure: as in the previous slide.] The cell's selectivity is described by the 2D tuning curve over position and speed: the tuning curves shift, while the population estimate remains unbiased.

106 Spike generation
Rate coding: information is in the spike count; spike times are random (Poisson). Deterministic spike generation rule: each spike signals a change in the belief state.

107 LIP/MT model
[Figure: firing rate vs. time for rightward and leftward motion, with the Left>Right and Right>Left decision boundaries.]

108 Noisy reports on a murder case?
[Figure: the posterior P(guilty) over the suspects Dupont, Durand and Smith, updated as a sequence of noisy reports comes in: Durand, Durand, Smith, Durand, Dupont.]

109 LIP/MT model
[Figure: as in slide 107, with the accumulator implemented as an integrate-and-fire unit.]

110 Deterministic firing
[Figure: synaptic input over time; the resulting log odds (−2 to 2) and the output spikes $O_t$ (time up to 2000).]

111 Mechanism for spike generation in Bayesian neurons
[Figure: log odds (−4 to 4) vs. time (500–3000).] The neuron integrates its input (what it knows) and integrates its own output (what it has told in the past); it fires when the difference between the two, that is, what needs to be said versus what has already been told, exceeds a threshold.

112 Mechanism for spike generation in Bayesian neurons
[Figure: log odds (−4 to 4) vs. time (500–3000), and traces over time (100–400).] Two equivalent implementations of the rule: an integrator with an adapting threshold, or an integrate-and-fire neuron with an adaptive time constant.
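A hedged sketch of this rule (all input statistics assumed): the neuron tracks the log odds $L_t$ from its inputs, keeps a running account $G_t$ of what its own spikes have already signaled, and fires whenever $L_t - G_t$ crosses a threshold.

```python
import numpy as np

rng = np.random.default_rng(8)
dt, T = 0.001, 5000
r_on, r_off = 1.0, 1.0                  # assumed switching rates (Hz)
q_on, q_off = 50.0, 10.0                # assumed input rates (Hz)
w = np.log(q_on / q_off)                # weight of each input spike
theta = 1.0                             # firing threshold (log-odds units)

# Surrogate input spike train (stimulus on for the whole window, for brevity)
inputs = rng.random(T) < q_on * dt

L = G = 0.0
out_spikes = []
for t in range(T):
    # Log-odds dynamics: nonlinear leak toward the prior + synaptic drive
    L += dt * (r_on * (1 + np.exp(-L)) - r_off * (1 + np.exp(L))
               - (q_on - q_off)) + w * inputs[t]
    # What has been told so far decays under the same prior dynamics
    G += dt * (r_on * (1 + np.exp(-G)) - r_off * (1 + np.exp(G)))
    if L - G > theta:                   # fire only when the news is worth a spike
        G += theta                      # the spike updates what has been told
        out_spikes.append(t * dt)

print(f"{len(out_spikes)} output spikes in {T * dt:.0f} s")
```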

113 Neurons cannot all be integrators
[Figure: a chain of three neurons, each shown as its log odds $L_t$ (−10 to 10) vs. time (1000–5000).]

114 Spikes deterministically signal an increase in probability
[Figure: the same chain of three neurons; each neuron's log odds $L_t$ vs. time, with spikes emitted whenever the encoded probability increases by a fixed amount.]

115 Generalization to continuous variables
[Figure: a variable x evolving over time, encoded by a recurrent population with feedforward weights W and lateral weights M; each neuron has a threshold for spiking.]

116 Implication 1: spikes signal increases in probability
[Figure: the probability P(x_t) over time, and the spikes transmitted at a synapse.] The integration takes place at any time and everywhere… A spike signals an increase in the probability of a binary variable.

117 Implication 2: firing statistics are similar to Poisson, but the fluctuations purely reflect input noise
[Figure: output spikes $O_t$ over time (1000–3000); the variance of the spike count vs. its mean follows the Poisson diagonal, and the ISI histogram is Poisson-like.] Yet the spike train is a deterministic function of the input.

118 Implication 3: firing precision depends on the proportion of meaningful fluctuations
[Figure: spike rasters across trials vs. time (up to 2000) for two input conditions.]

119 Conclusion
Information encoded in noisy population codes can be decoded using recurrent line attractor networks. However, in order to combine cues and to integrate information over time, it is necessary to represent probability distributions, not only estimates of sensory and motor variables. This representation could be implicit (i.e. gain encoding), but this implies strong limitations on the statistical problems that can be solved. The representation could also be more explicit; this seems to account for the integrate-and-fire dynamics of biological neurons, but it puts into question the traditional concepts of "neural signal" and "neural noise". Spikes signaling deterministic changes in the belief state share properties with Poisson-distributed "rate models", but their fluctuations merely reflect fluctuations in the input. This also implies whole new sets of predictions for the link between biophysics and behavior.

120 Towards the Bayesian Brain?
Bayesian models view: from the state of the world or body, inference and prediction (likelihood; models, probabilities, priors…) yield internal beliefs, which drive decisions and explorations: behavioral strategies that maximize utility/reward.

121 Expectation-Maximization
Repeat: (1) Expectation step: compute the expected value of the state given the observations and the current set of parameters. (2) Maximization step: choose the parameters maximizing the probability of the observations given the expected value of the state.
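A hedged offline sketch of EM for the two-state spiking model used earlier (forward-backward E-step, then an M-step re-estimating the firing rates; the switching rates are taken as known and all values are illustrative; the on-line EM mentioned later is the recursive variant of this).

```python
import numpy as np

rng = np.random.default_rng(9)
dt, T = 0.001, 20000
r_on = r_off = 1.0                       # known switching rates (Hz)
q_true = np.array([10.0, 40.0])          # true firing rates (Hz), to be recovered

# Simulate the hidden state and the spikes
x = np.zeros(T, dtype=int)
for t in range(1, T):
    p_switch = (r_on if x[t-1] == 0 else r_off) * dt
    x[t] = 1 - x[t-1] if rng.random() < p_switch else x[t-1]
spikes = rng.random(T) < q_true[x] * dt

A = np.array([[1 - r_on * dt, r_on * dt],
              [r_off * dt, 1 - r_off * dt]])   # transitions, rows = from-state
q = np.array([20.0, 25.0])                     # initial guess for (q0, q1)
for _ in range(20):
    # E-step: forward-backward smoothing of P(x_t | all spikes)
    lik = np.where(spikes[:, None], q * dt, 1 - q * dt)      # (T, 2)
    alpha = np.zeros((T, 2)); beta = np.ones((T, 2))
    alpha[0] = 0.5 * lik[0]; alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = lik[t] * (alpha[t-1] @ A)
        alpha[t] /= alpha[t].sum()
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (lik[t+1] * beta[t+1])
        beta[t] /= beta[t].sum()
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    # M-step: rate = expected spike count / expected time in each state
    q = (gamma * spikes[:, None]).sum(0) / (gamma.sum(0) * dt)

print(q)   # should approach (10, 40)
```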

122–126 Learning
[Animation: the expected state E(x_t) over time is refined as EM iterates; forward and backward log odds ($L_f$, $L_b$) are combined into $L_{tot}$. Repeat: expectation step, compute the expected value of the state given the observations and the current parameters; maximization step, choose the parameters maximizing the probability of the observations given the expected state.]

127 Slow time constant: Learning, adaptation
Dynamics at multiple time scales in spiking neurons Time (hours) Slow time constant: Learning, adaptation (on-line EM) 2 2.5 3 3.5 4 4.5 5 5.5 6 x 10 0.008 0.01 0.012 0.04 0.06 0.2 0.4 -0.1 -0.05 2 2.5 3 3.5 4 4.5 5 5.5 6 -10 -5 10 Fast time constant: Inference, spiking Time (hours) Lt Ot

128 Learning in networks
h(cuddly zebra) = 1, h(dangerous tiger) = 1, s(striped patch) = 1

129 Learning in networks
[Figure: divisive inhibition; response curves as a function of contrast (0.1–0.5).]

