Modelling and Control Issues Arising in the Quest for a Neural Decoder Computation, Control, and Biological Systems Conference VIII, July 30, 2003 Albert.

1 Modelling and Control Issues Arising in the Quest for a Neural Decoder. Computation, Control, and Biological Systems Conference VIII, July 30, 2003. Albert E. Parker, Complex Biological Systems, Department of Mathematical Sciences, Center for Computational Biology, Montana State University. Collaborators: Tomas Gedeon, Alex Dimitrov, John Miller, and Zane Aldworth.

2 Talk Outline: The Neural Coding Problem; A Clustering Problem; The Dynamical System; The Role of Bifurcation Theory; A new algorithm to solve the Neural Coding Problem.

3 The Neural Coding Problem GOAL: To understand the neural code. EASIER GOAL: We seek an answer to the question, How does neural activity represent information about environmental stimuli? “The little fly sitting in the fly’s brain trying to fly the fly”

4 Looking for the dictionary to the neural code … Inputs: stimuli X. Outputs: neural responses Y. Encoding takes X to Y; decoding takes Y to X.

5 … but the dictionary is not deterministic! Given a stimulus, an experimenter observes many different neural responses: $X \to Y_i \mid X$, $i = 1, 2, 3, 4$.

6 … but the dictionary is not deterministic! Given a stimulus, an experimenter observes many different neural responses: $X \to Y_i \mid X$, $i = 1, 2, 3, 4$. Neural coding is stochastic!!

7 Similarly, neural decoding is stochastic: $Y \to X_i \mid Y$, $i = 1, 2, \ldots, 9$.

8 Probability Framework: environmental stimuli X and neural responses Y. Encoder: P(Y|X). Decoder: P(X|Y).
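
The encoder and decoder on this slide are tied together by Bayes' rule, P(X|Y) = P(Y|X) P(X) / P(Y). The following is a minimal sketch of that relation, assuming discrete stimuli and responses stored as NumPy arrays; the function name `decoder_from_encoder` and the toy numbers are hypothetical and are not taken from the talk.

```python
import numpy as np

def decoder_from_encoder(p_y_given_x, p_x):
    """Bayes' rule: P(x|y) = P(y|x) P(x) / P(y).

    p_y_given_x has shape (L, K) with rows summing to 1; p_x has shape (L,).
    Returns P(X|Y) with shape (L, K); each column sums to 1.
    """
    joint = p_y_given_x * p_x[:, None]   # p(x, y) = P(y|x) P(x)
    p_y = joint.sum(axis=0)              # P(y), assumed strictly positive
    return joint / p_y[None, :]

# Hypothetical toy numbers: 2 stimuli, 2 responses.
p_x = np.array([0.5, 0.5])
encoder = np.array([[0.9, 0.1],
                    [0.2, 0.8]])
decoder = decoder_from_encoder(encoder, p_x)
```

In practice the difficulty named on the next slide applies: the encoder itself has to be estimated from limited data before a step like this is meaningful.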

9 The Neural Coding Problem: How to determine the encoder P(Y|X) or the decoder P(X|Y)? Common Approaches: parametric estimations, linear methods Difficulty: There is never enough data.

10 One Approach: Cluster the responses. Stimuli X (L objects $\{x_i\}$) and responses Y (K objects $\{y_i\}$) are related by p(X,Y). The clustering $q(Y_N|Y)$ maps the responses to the clustered responses $Y_N$ (N objects $\{y_{N_i}\}$).

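12 One Approach: Cluster the responses. Stimuli X (L objects $\{x_i\}$) and responses Y (K objects $\{y_i\}$) are related by p(X,Y), with encoder P(Y|X) and decoder P(X|Y). The clustering $q(Y_N|Y)$ maps the responses to the clustered responses $Y_N$ (N objects $\{y_{N_i}\}$).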

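14 One Approach: Cluster the responses. Stimuli X (L objects $\{x_i\}$) and responses Y (K objects $\{y_i\}$) are related by p(X,Y), with encoder P(Y|X) and decoder P(X|Y). The clustering $q(Y_N|Y)$ maps the responses to the clustered responses $Y_N$ (N objects $\{y_{N_i}\}$), which gives the clustered encoder $P(Y_N|X)$ and decoder $P(X|Y_N)$.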

15 One Approach: Cluster the responses. $q(Y_N|Y)$ is a stochastic clustering of the responses. To address the insufficient data problem, one clusters the outputs Y into clusters $Y_N$ so that the information that one can learn about X by observing $Y_N$, $I(X;Y_N)$, is as close as possible to the mutual information $I(X;Y)$.
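
To make the comparison between $I(X;Y_N)$ and $I(X;Y)$ concrete, here is a small sketch assuming a discrete joint distribution stored as an array `pxy` of shape (L, K) and a clustering `q` of shape (N, K) whose columns sum to 1. The helper names and the toy distribution are hypothetical; information is measured in bits.

```python
import numpy as np

def mutual_information(pab):
    """Mutual information in bits of a joint distribution given as a 2-D array."""
    pa = pab.sum(axis=1, keepdims=True)
    pb = pab.sum(axis=0, keepdims=True)
    mask = pab > 0                        # 0 log 0 is taken as 0
    return float(np.sum(pab[mask] * np.log2(pab[mask] / (pa @ pb)[mask])))

def clustered_joint(pxy, q):
    """p(x, nu) = sum_y q(nu|y) p(x, y)."""
    return pxy @ q.T

# Hypothetical toy joint: two groups of stimuli, two groups of responses.
pxy = np.array([[0.20, 0.05, 0.00, 0.00],
                [0.05, 0.20, 0.00, 0.00],
                [0.00, 0.00, 0.20, 0.05],
                [0.00, 0.00, 0.05, 0.20]])
# A deterministic clustering that merges responses 1-2 and responses 3-4.
q = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])

i_full = mutual_information(pxy)
i_clustered = mutual_information(clustered_joint(pxy, q))
```

Because $Y_N$ is computed from Y alone, $I(X;Y_N)$ can never exceed $I(X;Y)$; the clustering problem is to lose as little as possible.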

16 Two optimization problems which use this approach:
Information Bottleneck Method (Tishby, Pereira, Bialek 1999): $\min I(Y;Y_N)$ constrained by $I(X;Y_N) \ge I_0$, or $\max \left( -I(Y;Y_N) + \beta I(X;Y_N) \right)$.
Information Distortion Method (Dimitrov and Miller 2001): $\max H(Y_N|Y)$ constrained by $I(X;Y_N) \ge I_0$, or $\max \left( H(Y_N|Y) + \beta I(X;Y_N) \right)$.
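
The two Lagrangian cost functions can be evaluated directly for a given clustering. The sketch below assumes the same array conventions as elsewhere in these notes (joint `pxy` of shape (L, K), clustering `q` of shape (N, K) with columns summing to 1); the function names and the toy joint are hypothetical, and values are in bits.

```python
import numpy as np

def mutual_information(pab):
    """Mutual information in bits of a joint distribution given as a 2-D array."""
    pa = pab.sum(axis=1, keepdims=True)
    pb = pab.sum(axis=0, keepdims=True)
    mask = pab > 0
    return float(np.sum(pab[mask] * np.log2(pab[mask] / (pa @ pb)[mask])))

def conditional_entropy(q, py):
    """H(Y_N|Y) in bits for q of shape (N, K) and p(Y) of shape (K,)."""
    safe_q = np.where(q > 0, q, 1.0)      # 0 log 0 is taken as 0
    return float(-np.sum(py[None, :] * q * np.log2(safe_q)))

def information_distortion(q, pxy, beta):
    """H(Y_N|Y) + beta * I(X;Y_N)."""
    py = pxy.sum(axis=0)
    return conditional_entropy(q, py) + beta * mutual_information(pxy @ q.T)

def information_bottleneck(q, pxy, beta):
    """-I(Y;Y_N) + beta * I(X;Y_N)."""
    py = pxy.sum(axis=0)
    joint_yn_y = q * py[None, :]          # p(nu, y) = q(nu|y) p(y)
    return -mutual_information(joint_yn_y) + beta * mutual_information(pxy @ q.T)

# Hypothetical toy joint with two clear groups.
pxy = np.array([[0.20, 0.05, 0.00, 0.00],
                [0.05, 0.20, 0.00, 0.00],
                [0.00, 0.00, 0.20, 0.05],
                [0.00, 0.00, 0.05, 0.20]])
q_uniform = np.full((2, 4), 0.5)
q_hard = np.array([[1.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 1.0]])

f_id_uniform = information_distortion(q_uniform, pxy, beta=2.0)
f_id_hard = information_distortion(q_hard, pxy, beta=2.0)
f_ib_uniform = information_bottleneck(q_uniform, pxy, beta=2.0)
f_ib_hard = information_bottleneck(q_hard, pxy, beta=2.0)
```

Since $I(Y;Y_N) = H(Y_N) - H(Y_N|Y)$, the two cost functions differ only by the term $H(Y_N)$.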

17 In General: We have developed an approach to solve optimization problems of the form $\max_{q \in \Delta} G(q)$ constrained by $D(q) \ge D_0$ or (using the method of Lagrange multipliers) $\max_{q \in \Delta} F(q, \beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$ where $\beta \in [0, \infty)$. $\Delta$ is a subset of valid stochastic clusterings in $\mathbb{R}^{NK}$. G and D are sufficiently smooth in $\Delta$. G and D have symmetry: they are invariant to relabelling of the classes of $Y_N$.

18 Symmetry: invariance to relabelling of the clusters of $Y_N$. [Figure: a clustering $q(Y_N|Y)$ of K objects $\{y_i\}$ into N objects $\{y_{N_i}\}$, with clusters labelled class 1 and class 2.]

19 Symmetry: invariance to relabelling of the clusters of $Y_N$. [Figure: the same clustering $q(Y_N|Y)$ with the labels swapped to class 2 and class 1.]
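
The relabelling symmetry can be checked numerically: permuting the rows of a clustering `q` (that is, renaming the classes) should leave the cost function unchanged. This is a sketch for the Information Distortion cost, assuming strictly positive arrays; the random test data, the seed, and the name `objective` are illustrative choices only.

```python
import numpy as np

def objective(q, pxy, beta):
    """F(q, beta) = H(Y_N|Y) + beta * I(X;Y_N) in nats.

    Assumes every entry of q (shape (N, K)) and pxy (shape (L, K)) is positive.
    """
    px = pxy.sum(axis=1)
    py = pxy.sum(axis=0)
    entropy = -np.sum(py[None, :] * q * np.log(q))
    pxn = pxy @ q.T                       # p(x, nu)
    pn = pxn.sum(axis=0)                  # p(nu)
    info = np.sum(pxn * np.log(pxn / (px[:, None] * pn[None, :])))
    return float(entropy + beta * info)

rng = np.random.default_rng(0)
pxy = rng.random((5, 6)) + 0.1            # hypothetical joint, 5 stimuli x 6 responses
pxy /= pxy.sum()
q = rng.random((3, 6)) + 0.1              # hypothetical clustering into 3 classes
q /= q.sum(axis=0, keepdims=True)

perm = [2, 0, 1]                          # one element of the permutation group S_3
f_original = objective(q, pxy, beta=1.5)
f_relabelled = objective(q[perm], pxy, beta=1.5)
```

Any of the N! permutations of the class labels gives the same value, which is the symmetry that the bifurcation analysis later exploits.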

20 An annealing algorithm to solve $\max_{q \in \Delta} (G(q) + \beta D(q))$. Let $q_0$ be the maximizer of $\max_q G(q)$, and let $\beta_0 = 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some K.
1. Perform $\beta$-step: Let $\beta_{k+1} = \beta_k + d_k$ where $d_k > 0$.
2. The initial guess for $q_{k+1}$ at $\beta_{k+1}$ is $q_{k+1}^{(0)} = q_k + \eta$ for some small perturbation $\eta$.
3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ to get the maximizer $q_{k+1}$, using initial guess $q_{k+1}^{(0)}$.
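
The three steps can be sketched for the Information Distortion cost $H(Y_N|Y) + \beta I(X;Y_N)$. This is only an illustration under stated assumptions: the inner optimization is replaced by a simple fixed-point iteration on the stationarity condition $q(\nu|y) \propto \exp(\beta \, \partial_q I(X;Y_N) / p(y))$, which finds stationary points but is not guaranteed to find the maximizer; the step $d_k$ is constant; and the joint distribution is a hypothetical strictly positive toy example. All function names are made up for this sketch.

```python
import numpy as np

def distortion_gradient(q, pxy):
    """Gradient of I(X;Y_N) with respect to q(nu|y), shape (N, K), in nats."""
    px = pxy.sum(axis=1)
    pxn = pxy @ q.T                        # p(x, nu), shape (L, N)
    pn = pxn.sum(axis=0)                   # p(nu)
    log_ratio = np.log(pxn / (px[:, None] * pn[None, :]))
    return log_ratio.T @ pxy               # sum_x p(x, y) log(p(x, nu) / (p(x) p(nu)))

def fixed_point_step(q, pxy, beta):
    """One update q(nu|y) <- softmax over nu of beta * gradient / p(y)."""
    py = pxy.sum(axis=0)
    logits = beta * distortion_gradient(q, pxy) / py[None, :]
    logits -= logits.max(axis=0, keepdims=True)   # numerical stability
    q_new = np.exp(logits)
    return q_new / q_new.sum(axis=0, keepdims=True)

def anneal(pxy, n_classes, beta_max, d_beta, rng, n_inner=200, noise=1e-3):
    """Follow solutions from beta = 0 up to beta_max; returns [(beta_k, q_k)]."""
    n_responses = pxy.shape[1]
    # The uniform clustering maximizes H(Y_N|Y), so it is q_0 at beta_0 = 0.
    q = np.full((n_classes, n_responses), 1.0 / n_classes)
    beta = 0.0
    path = [(beta, q)]
    while beta < beta_max:
        beta = min(beta + d_beta, beta_max)            # step 1: beta-step
        guess = q + noise * rng.random(q.shape)        # step 2: perturbed guess
        guess /= guess.sum(axis=0, keepdims=True)
        for _ in range(n_inner):                       # step 3: inner solve
            guess = fixed_point_step(guess, pxy, beta)
        q = guess
        path.append((beta, q))
    return path

# Hypothetical strictly positive joint with two groups of responses.
counts = np.array([[8.0, 2.0, 1.0, 1.0],
                   [2.0, 8.0, 1.0, 1.0],
                   [1.0, 1.0, 8.0, 2.0],
                   [1.0, 1.0, 2.0, 8.0]])
pxy = counts / counts.sum()
path = anneal(pxy, n_classes=2, beta_max=3.0, d_beta=0.5,
              rng=np.random.default_rng(0))
```

The perturbation in step 2 matters: without it the uniform clustering remains a fixed point for every $\beta$, and the iteration would never leave the symmetric solution.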

21 Application of the annealing method to the Information Distortion problem $\max_{q \in \Delta} (H(Y_N|Y) + \beta I(X;Y_N))$ when p(X,Y) is defined by four Gaussian blobs. [Figure: p(X,Y) over 52 stimuli X and 52 responses Y; the clustering $q(Y_N|Y)$ assigns the 52 responses Y to 4 clusters $Y_N$.]
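
A test distribution of this kind can be built as follows. The slide gives only the grid size (52 by 52) and the number of blobs (4); the placement of the blobs along the diagonal, their common width, and the function name `four_blob_joint` are assumptions made for this sketch.

```python
import numpy as np

def four_blob_joint(size=52, n_blobs=4, width=2.5):
    """Joint p(X, Y) on a size x size grid as a sum of isotropic Gaussian bumps.

    Blob centres are spaced evenly along the diagonal (an assumption).
    """
    idx = np.arange(size)
    centers = (np.arange(n_blobs) + 0.5) * size / n_blobs   # 6.5, 19.5, 32.5, 45.5
    x, y = np.meshgrid(idx, idx, indexing="ij")
    pxy = np.zeros((size, size))
    for c in centers:
        pxy += np.exp(-((x - c) ** 2 + (y - c) ** 2) / (2.0 * width ** 2))
    return pxy / pxy.sum()                                   # normalize to a distribution

pxy = four_blob_joint()
```

With well separated blobs, a clustering of the 52 responses into 4 classes can recover nearly all of $I(X;Y)$, which is why this is a convenient test case.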

22 Evolution of the optimal clustering: Observed Bifurcations for the Four Blob problem. We just saw the optimal clusterings $q^*$ at some $\beta^* = \beta_{\max}$. What do the clusterings look like for $\beta < \beta_{\max}$?

23 Why are there only 3 bifurcations observed? In general, are there only N-1 bifurcations?
What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type?
How many bifurcating solutions are there?
What do the bifurcating branches look like? Are they subcritical or supercritical?
What is the stability of the bifurcating branches?
Is there always a bifurcating branch which contains solutions of the optimization problem?
Are there bifurcations after all of the classes have resolved?
[Figures: conceptual bifurcation structure; observed bifurcations for the 4 blob problem, plotted as $q^*$ against $\beta$.]

24 Bifurcation theory in the presence of symmetries enables us to answer the questions previously posed …

25 Recall the Symmetries: To better understand the bifurcation structure, we capitalize on the symmetries of the function $G(q) + \beta D(q)$. [Figure: a clustering $q(Y_N|Y)$ of K objects $\{y_i\}$ into N objects $\{y_{N_i}\}$, with clusters labelled class 1 and class 3.]

26 Recall the Symmetries: To better understand the bifurcation structure, we capitalize on the symmetries of the function $G(q) + \beta D(q)$. [Figure: the same clustering $q(Y_N|Y)$ with the labels swapped to class 3 and class 1.]

27 The symmetry group of all permutations on N symbols is $S_N$.

28 Formulate a Dynamical System. Goal: To solve $\max_{q \in \Delta} (G(q) + \beta D(q))$ for each $\beta$, incremented in sufficiently small steps, as $\beta \to \infty$. Method: Study the equilibria of the gradient flow of the Lagrangian $\mathcal{L}$. Equilibria of this system are possible solutions of the maximization problem (they satisfy the necessary conditions of constrained optimality). The Jacobian $\Delta_{q,\lambda} \mathcal{L}(q^*, \beta^*)$ is symmetric, so its eigenvalues are real and only bifurcations of equilibria can occur.
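
One way to write out the system described on this slide is sketched below. It assumes that only the normalization constraints $\sum_\nu q(\nu|y_k) = 1$ are carried by Lagrange multipliers $\lambda_k$, and that the nonnegativity constraints on q are inactive; the slide itself does not show these formulas, so the exact form is an assumption.

```latex
% Lagrangian for max_{q in Delta} (G(q) + beta D(q)), with one multiplier
% lambda_k per response y_k enforcing sum_nu q(nu | y_k) = 1.
\mathcal{L}(q, \lambda, \beta) = G(q) + \beta D(q)
  + \sum_{k=1}^{K} \lambda_k \Bigl( \sum_{\nu=1}^{N} q(\nu \mid y_k) - 1 \Bigr)

% Gradient flow, and the conditions satisfied at its equilibria
% (the necessary conditions of constrained optimality):
\begin{pmatrix} \dot{q} \\ \dot{\lambda} \end{pmatrix}
  = \nabla_{q,\lambda} \mathcal{L}(q, \lambda, \beta),
\qquad
\nabla_{q} \mathcal{L} = 0, \quad \nabla_{\lambda} \mathcal{L} = 0 .
```

The Jacobian of this vector field is the Hessian of $\mathcal{L}$ in $(q, \lambda)$, which is why it is symmetric.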

29 Observed Bifurcation Structure

30 Observed Bifurcation Structure Group Structure

31 Observed Bifurcation Structure: The Equivariant Branching Lemma shows that the bifurcation structure contains the branches …

32 Observed Bifurcation Structure and Group Structure: The Smoller-Wasserman Theorem shows additional structure …

33 Observed Bifurcation Structure. Theorem: There are exactly K/N bifurcations on the branch $(q_{1/N}, \beta)$ for the Information Distortion problem. For the four blob problem (K = 52, N = 4), there are 13 bifurcations on the first branch.

35 Why are there only 3 bifurcations observed? In general, are there only N-1 bifurcations? There are N-1 symmetry breaking bifurcations from $S_M$ to $S_{M-1}$ for $M \le N$.
What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type?
How many bifurcating solutions are there? There are at least N from the first bifurcation, at least N-1 from the next one, etc.
What do the bifurcating branches look like? They are subcritical or supercritical depending on the sign of the bifurcation discriminator $\zeta(q^*, \beta^*, u_k)$.
What is the stability of the bifurcating branches?
Is there always a bifurcating branch which contains solutions of the optimization problem? No.
Are there bifurcations after all of the classes have resolved? In general, no.
[Figures: conceptual bifurcation structure; observed bifurcations for the 4 blob problem, plotted as $q^*$ against $\beta$.]

36 Continuation techniques provide numerical confirmation of the theory

37

38 A closer look …

39 Bifurcation from $S_4$ to $S_3$ …

40 The bifurcation from $S_4$ to $S_3$ is subcritical … (the theory predicted this since the bifurcation discriminator $\zeta(q_{1/4}, \beta^*, u) < 0$)

41 Additional structure!!

42

43 Conclusions … We have a complete theoretical picture of how the clusterings evolve for any problem of the form $\max_{q \in \Delta} (G(q) + \beta D(q))$ subject to the assumptions stated earlier.
o When clustering to N classes, there are N-1 bifurcations.
o In general, there are only pitchfork and saddle-node bifurcations.
o We can determine whether pitchfork bifurcations are subcritical or supercritical (1st or 2nd order phase transitions).
o We know the explicit bifurcating directions.
SO WHAT?? There are theoretical consequences … This yields a new and improved algorithm for solving the neural coding problem …

44 A numerical algorithm to solve $\max (G(q) + \beta D(q))$. Let $q_0$ be the maximizer of $\max_q G(q)$, $\beta_0 = 1$ and $\Delta s > 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some K.
1. Perform $\beta$-step: solve for $(\partial_\beta q_k, \partial_\beta \lambda_k)$ and select $\beta_{k+1} = \beta_k + d_k$ where $d_k = \Delta s \, \mathrm{sgn}(\cos \theta) / \left( \|\partial_\beta q_k\|^2 + \|\partial_\beta \lambda_k\|^2 + 1 \right)^{1/2}$.
2. The initial guess for $(q_{k+1}, \lambda_{k+1})$ at $\beta_{k+1}$ is $(q_{k+1}^{(0)}, \lambda_{k+1}^{(0)}) = (q_k, \lambda_k) + d_k (\partial_\beta q_k, \partial_\beta \lambda_k)$.
3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ using pseudoarclength continuation to get the maximizer $q_{k+1}$ and the vector of Lagrange multipliers $\lambda_{k+1}$, using initial guess $(q_{k+1}^{(0)}, \lambda_{k+1}^{(0)})$.
4. Check for bifurcation: compare the sign of the determinant of an identical block of each of $\Delta_q [G(q_k) + \beta_k D(q_k)]$ and $\Delta_q [G(q_{k+1}) + \beta_{k+1} D(q_{k+1})]$. If a bifurcation is detected, then set $q_{k+1}^{(0)} = q_k + d_k u$ where u is the bifurcating direction and repeat step 3.
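
Two of the ingredients above are simple enough to sketch in isolation: the step length $d_k$ from step 1 and the determinant sign test from step 4. The sketch assumes that the tangent vector $(\partial_\beta q_k, \partial_\beta \lambda_k)$, the value of $\cos \theta$, and the Hessian blocks have already been computed elsewhere; the function names and the example matrices are hypothetical.

```python
import numpy as np

def step_length(dq, dlam, delta_s, cos_theta):
    """d_k = delta_s * sgn(cos theta) / sqrt(||dq||^2 + ||dlam||^2 + 1)."""
    norm = np.sqrt(np.sum(dq ** 2) + np.sum(dlam ** 2) + 1.0)
    return delta_s * np.sign(cos_theta) / norm

def sign_change_detected(block_k, block_k1):
    """True if the determinants of the two Hessian blocks have opposite signs.

    numpy.linalg.slogdet returns (sign, log|det|), which avoids overflow
    for large matrices. A sign change means an odd number of eigenvalues
    crossed zero between the two steps; an even number would go unnoticed.
    """
    sign_k, _ = np.linalg.slogdet(block_k)
    sign_k1, _ = np.linalg.slogdet(block_k1)
    return bool(sign_k * sign_k1 < 0)

# Hypothetical tangent vector with ||dq||^2 + ||dlam||^2 + 1 = 9.
d_forward = step_length(np.array([2.0, 2.0]), np.array([0.0]),
                        delta_s=0.3, cos_theta=0.8)
d_backward = step_length(np.array([2.0, 2.0]), np.array([0.0]),
                         delta_s=0.3, cos_theta=-0.5)

# Hypothetical Hessian blocks: one eigenvalue crosses zero between steps.
before = np.diag([-1.0, -2.0])
after = np.diag([-1.0, 0.5])
crossed = sign_change_detected(before, after)
unchanged = sign_change_detected(before, before)
```

The sign of $\cos \theta$ lets the step run backwards in $\beta$, which is what allows the continuation to follow a branch around a fold or along a subcritical branch.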

45

46 Application to cricket sensory data. [Figure panels: optimal quantizer; typical spike patterns; $E(X|Y_N)$: stimulus means conditioned on each of the classes.]

