1
Symmetry breaking clusters when deciphering the neural code September 12, 2005 Albert E. Parker Department of Mathematical Sciences Center for Computational Biology Montana State University Collaborators: Tomas Gedeon, Alex Dimitrov, John Miller, and Zane Aldworth
2
Talk Outline: Deciphering the Neural Code; A Clustering Problem; The Dynamical System; Bifurcations; Theoretical Results; Numerical Results
3
Deciphering the Neural Code: How does neural activity represent information about environmental stimuli? “The little fly sitting in the fly’s brain trying to fly the fly”
4
Inputs: stimuli X. Outputs: neural responses Y. Looking for the dictionary to the neural code … encoding (X to Y) and decoding (Y to X).
5
… but the dictionary is not deterministic! Given a stimulus, an experimenter observes many different neural responses: $X \to Y_i \mid X$, $i = 1, 2, 3, 4$
6
… but the dictionary is not deterministic! Given a stimulus, an experimenter observes many different neural responses: $X \to Y_i \mid X$, $i = 1, 2, 3, 4$. Neural encoding is stochastic!!
7
Similarly, neural decoding is stochastic: $Y \to X_i \mid Y$, $i = 1, 2, \ldots, 9$
8
Probability Framework: environmental stimuli X, neural responses Y. Encoder: P(Y|X). Decoder: P(X|Y).
9
Deciphering the Neural Code = Determining the encoder P(Y|X) or the decoder P(X|Y). Common Approaches: parametric estimations, linear methods. Difficulty: There is never enough data.
10
One Approach: Cluster the responses. [Diagram: stimuli X (L objects $\{x_i\}$) and responses Y (K objects $\{y_i\}$) are related by p(X,Y); q(Z|Y) maps the responses to clustered responses Z (N objects $\{z_i\}$).]
12
One Approach: Cluster the responses. [Diagram: stimuli X (L objects $\{x_i\}$) and responses Y (K objects $\{y_i\}$) are related by p(X,Y), with encoder P(Y|X) and decoder P(X|Y); q(Z|Y) maps the responses to clustered responses Z (N objects $\{z_i\}$).]
14
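One Approach: Cluster the responses. [Diagram: stimuli X (L objects $\{x_i\}$) and responses Y (K objects $\{y_i\}$) are related by p(X,Y), with encoder P(Y|X) and decoder P(X|Y); q(Z|Y) maps the responses to clustered responses Z (N objects $\{z_i\}$), which relate to the stimuli through P(Z|X) and P(X|Z).]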
15
One Approach: Cluster the responses. q(Z|Y) is a stochastic clustering of the responses. The outputs Y are clustered in Z so that the information that one can learn about X by observing Z, I(X;Z), is as close as possible to the mutual information I(X;Y). [Diagram: stimuli X (L objects $\{x_i\}$) and responses Y (K objects $\{y_i\}$) are related by p(X,Y); q(Z|Y) maps the responses to clustered responses Z (N objects $\{z_i\}$).]
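To make the comparison between I(X;Z) and I(X;Y) concrete, here is a minimal sketch in Python. The 4-by-4 joint distribution `p_xy` and the two clusterings are hypothetical toy values, not data from the talk; the function names are likewise illustrative.

```python
import numpy as np

def mutual_information_bits(p_joint):
    """Mutual information (bits) of a joint distribution stored as a 2-D array."""
    p_joint = np.asarray(p_joint, dtype=float)
    p_row = p_joint.sum(axis=1, keepdims=True)
    p_col = p_joint.sum(axis=0, keepdims=True)
    product = p_row @ p_col            # outer product of the two marginals
    mask = p_joint > 0                 # 0 * log 0 is taken to be 0
    return float(np.sum(p_joint[mask] * np.log2(p_joint[mask] / product[mask])))

def clustered_joint(p_xy, q_zy):
    """p(X,Z) = sum_y p(X,y) q(Z|y), with q_zy[z, y] = q(z|y)."""
    return p_xy @ q_zy.T

# Hypothetical toy joint p(X,Y): 4 stimuli, 4 responses, two obvious groups.
p_xy = np.array([[0.20, 0.05, 0.00, 0.00],
                 [0.05, 0.20, 0.00, 0.00],
                 [0.00, 0.00, 0.20, 0.05],
                 [0.00, 0.00, 0.05, 0.20]])

# A clustering that respects the two groups, and the uniform clustering.
q_grouped = np.array([[1.0, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0, 1.0]])
q_uniform = np.full((2, 4), 0.5)

i_xy = mutual_information_bits(p_xy)
i_grouped = mutual_information_bits(clustered_joint(p_xy, q_grouped))
i_uniform = mutual_information_bits(clustered_joint(p_xy, q_uniform))
print(i_xy, i_grouped, i_uniform)
```

The uniform clustering carries no information about X, while the grouped clustering recovers part of I(X;Y); no clustering can exceed I(X;Y).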
16
Two optimization problems which use this approach, optimizing at a distortion level $D(Y,Z) \le D_0$. Rate Distortion Theory (Shannon, 1950s), Minimal Informative Compression: $\min I(X;Z)$ constrained by $D(X,Z) \le D_0$. Deterministic Annealing (Rose, 1990s), A Clustering Algorithm: $\max H(Z|X)$ constrained by $D(X,Z) \le D_0$. Relationship between these formulations: $I(X;Z) = H(Z) - H(Z|X)$.
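The identity relating the two formulations can be checked numerically. The sketch below uses a hypothetical 3-by-2 joint distribution p(X,Z) chosen only for illustration, and computes H(Z|X) through the chain rule H(Z|X) = H(X,Z) - H(X).

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy (bits) of a distribution given as an array of any shape."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Hypothetical joint distribution p(X,Z): rows index X, columns index Z.
p_xz = np.array([[0.30, 0.10],
                 [0.05, 0.25],
                 [0.10, 0.20]])
p_x = p_xz.sum(axis=1)
p_z = p_xz.sum(axis=0)

h_z = entropy_bits(p_z)
h_z_given_x = entropy_bits(p_xz) - entropy_bits(p_x)    # chain rule
i_direct = float(np.sum(p_xz * np.log2(p_xz / np.outer(p_x, p_z))))

print(i_direct, h_z - h_z_given_x)
```

Because H(Z) does not appear in the deterministic annealing objective, maximizing H(Z|X) and minimizing I(X;Z) differ exactly by that term.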
17
Examples: Information Bottleneck Method (Tishby, Pereira, Bialek 1999): $\min I(Y;Z)$ constrained by $I(X;Z) \ge I_0$; Lagrangian form $\max\, -I(Y;Z) + \beta I(X;Z)$. Information Distortion Method (Dimitrov and Miller 2001): $\max H(Z|Y)$ constrained by $I(X;Z) \ge I_0$; Lagrangian form $\max H(Z|Y) + \beta I(X;Z)$.
18
A basic annealing algorithm to solve $\max_q (G(q) + \beta D(q))$
Let $q_0$ be the maximizer of $\max_q G(q)$, and let $\beta_0 = 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta_k D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some K.
1. Perform β-step: Let $\beta_{k+1} = \beta_k + d_k$ where $d_k > 0$.
2. The initial guess for $q_{k+1}$ at $\beta_{k+1}$ is $q_{k+1}^{(0)} = q_k + \eta$ for some small perturbation $\eta$.
3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ to get the maximizer $q_{k+1}$, using initial guess $q_{k+1}^{(0)}$.
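The three steps above can be sketched for the Information Distortion cost H(Z|Y) + β I(X;Z). This is a minimal illustration, not the author's implementation: the inner optimizer is assumed to be a simple alternating fixed-point update, q(z|y) proportional to exp(β Σ_x p(x|y) log p(x|z)), and the toy joint `p_xy`, the step size `d_beta`, and the perturbation size `eta` are hypothetical choices.

```python
import numpy as np

def info_xz_bits(p_xy, q):
    """I(X;Z) in bits for the clustering q[z, y] = q(z|y)."""
    p_xz = p_xy @ q.T
    p_x = p_xz.sum(axis=1, keepdims=True)
    p_z = p_xz.sum(axis=0, keepdims=True)
    mask = p_xz > 0
    return float(np.sum(p_xz[mask] * np.log2(p_xz[mask] / (p_x * p_z)[mask])))

def solve_fixed_beta(p_xy, q_init, beta, n_iter=200):
    """Step 3 (assumed solver): alternating fixed-point updates at fixed beta."""
    p_x_given_y = p_xy / p_xy.sum(axis=0)           # column y holds p(x|y)
    q = q_init
    for _ in range(n_iter):
        p_xz = p_xy @ q.T
        p_x_given_z = p_xz / p_xz.sum(axis=0)       # column z holds p(x|z)
        score = np.log(p_x_given_z).T @ p_x_given_y  # score[z, y]
        logits = beta * score
        logits -= logits.max(axis=0)                # numerical stability
        q = np.exp(logits)
        q /= q.sum(axis=0)                          # each column sums to one
    return q

def anneal(p_xy, n_clusters, beta_max=10.0, d_beta=0.1, eta=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    n_responses = p_xy.shape[1]
    q = np.full((n_clusters, n_responses), 1.0 / n_clusters)  # q_0 at beta_0 = 0
    history = [(0.0, info_xz_bits(p_xy, q))]
    n_steps = int(round(beta_max / d_beta))
    for k in range(1, n_steps + 1):
        beta = k * d_beta                           # step 1: beta-step
        guess = q + eta * rng.random(q.shape)       # step 2: perturbed guess
        guess /= guess.sum(axis=0)
        q = solve_fixed_beta(p_xy, guess, beta)     # step 3: optimize
        history.append((beta, info_xz_bits(p_xy, q)))
    return q, history

# Hypothetical toy joint with two groups of responses (all entries positive).
p_xy = np.array([[0.20, 0.03, 0.01, 0.01],
                 [0.03, 0.20, 0.01, 0.01],
                 [0.01, 0.01, 0.20, 0.03],
                 [0.01, 0.01, 0.03, 0.20]])
q_final, history = anneal(p_xy, n_clusters=2)
labels = q_final.argmax(axis=0)
print(labels, history[-1])
```

In this toy run the clustering stays uniform for small β and splits the responses into the two groups once β passes a critical value, which is the kind of transition the rest of the talk analyzes. The perturbation in step 2 matters: the uniform clustering remains a stationary point for every β, so without it the iteration would never leave that branch.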
19
Application of the annealing method to the Information Distortion problem $\max_q (H(Z|X) + \beta I(X;Z))$ when p(X,Y) is defined by four Gaussian blobs. [Figure: the joint distribution p(X,Y) over L=52 inputs and K=52 outputs (axes labeled "Y, Inputs" and "X, Outputs"), and the clustering q(Z|X) of the K=52 outputs into N=4 clustered outputs (axes labeled "X, Outputs" and "Z, Clustered Outputs").]
20
Evolution of the optimal clustering: Observed Bifurcations for the Four Blob problem. We just saw the optimal clusterings $q^*$ at some $\beta^* = \beta_{\max}$. What do the clusterings look like for $\beta < \beta_{\max}$? [Figure: bifurcation diagram; axis label: I(Y,Z) in bits.]
21
Why are there only 3 bifurcations observed? In general, are there only N-1 bifurcations?
What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type?
How many bifurcating branches are there?
What do the bifurcating branches look like? Are they subcritical or supercritical?
What is the stability of the bifurcating branches?
Is there always a bifurcating branch which contains solutions of the optimization problem?
Are there bifurcations which alter the classes after all of the classes have resolved?
[Figure panels: Conceptual Bifurcation Structure of $q^*$; Observed Bifurcations for the 4 Blob Problem, axis label I(Y,Z) in bits.]
22
A General Problem: To determine the bifurcations of solutions to clustering problems of the form $\max_q G(q)$ constrained by $D(q) \ge I_0$, where:
q is a vector of conditional probabilities in $\mathbb{R}^{NK}$, with $q = (q_1^T \ldots q_N^T)^T$ and $q_i \in \mathbb{R}^K$.
$G = \sum_i g(q_i)$ and $D = \sum_i d(q_i)$ are sufficiently smooth on the space of valid conditional probabilities. This implies that: 1. G and D have symmetry: they are invariant to re-labeling of the classes of Z; 2. The Hessians $d^2 G$ and $d^2 D$ are block diagonal.
The Hessians $d^2 G$ and $d^2 D$ satisfy a set of generic regularity conditions at bifurcation.
[Diagram: q(Z|X) maps K objects X to N clusters Z.]
23
A similar formulation: Using the method of Lagrange multipliers, the goal of determining the bifurcation structure of solutions of the optimization problem can be rephrased as finding the bifurcation structure of stationary points of the problem $\max_q (G(q) + \beta D(q))$ where $\beta \in [0, \infty)$.
q is a vector of conditional probabilities in $\mathbb{R}^{NK}$, with $q = (q_1^T \ldots q_N^T)^T$ and $q_i \in \mathbb{R}^K$.
$G = \sum_i g(q_i)$ and $D = \sum_i d(q_i)$ are sufficiently smooth on the space of valid conditional probabilities.
The Hessians $d^2 G$ and $d^2 D$ satisfy a set of generic regularity conditions at bifurcation.
[Diagram: q(Z|X) maps K objects X to N clusters Z.]
24
The Dynamical System
Goal: To solve $\max_q (G(q) + \beta D(q))$ for each $\beta$, incremented in sufficiently small steps, as $\beta \to \infty$.
Method: Study the equilibria of the gradient flow $(\dot q, \dot\lambda) = \nabla_{q,\lambda}\mathcal{L}(q, \lambda, \beta)$ of the Lagrangian $\mathcal{L}$.
Equilibria of this system are possible solutions of the maximization problem (they satisfy the necessary conditions of constrained optimality).
If $w^T d^2_q (G(q^*) + \beta D(q^*)) w < 0$ for every $w \in \ker J$, then $q^*$ is a maximizer of the optimization problem.
The Jacobian of the flow, the Hessian $\Delta_{q,\lambda}\mathcal{L}(q^*, \beta^*)$, is symmetric, and so only bifurcations of equilibria can occur.
The first equilibrium is $q^*(\beta_0 = 0) \equiv 1/N$.
25
The Symmetries: To better understand the bifurcation structure, we capitalize on the symmetries of the function $G(q) + \beta D(q)$. [Diagram: a clustering q(Z|X) of K objects $\{x_i\}$ into N objects $\{z_i\}$, with two of the classes labeled class 1 and class 3.]
26
The Symmetries: To better understand the bifurcation structure, we capitalize on the symmetries of the function $G(q) + \beta D(q)$. [Diagram: the same clustering q(Z|X) of K objects $\{x_i\}$ into N objects $\{z_i\}$, with the two classes now labeled class 3 and class 1.]
27
The symmetry group of all permutations on N symbols is $S_N$.
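The relabeling symmetry can be verified directly on an example. The sketch below evaluates the Information Distortion cost F(q, β) = H(Z|Y) + β I(X;Z) for every relabeling of N = 3 classes; the random joint distribution, the random clustering, and the value of β are hypothetical choices made only for illustration.

```python
import numpy as np
from itertools import permutations

def cost(p_xy, q, beta):
    """F(q, beta) = H(Z|Y) + beta * I(X;Z) in bits, with q[z, y] = q(z|y) > 0."""
    p_y = p_xy.sum(axis=0)
    h_z_given_y = -np.sum(p_y * q * np.log2(q))
    p_xz = p_xy @ q.T
    p_x = p_xz.sum(axis=1, keepdims=True)
    p_z = p_xz.sum(axis=0, keepdims=True)
    i_xz = np.sum(p_xz * np.log2(p_xz / (p_x * p_z)))
    return float(h_z_given_y + beta * i_xz)

rng = np.random.default_rng(0)
p_xy = rng.random((5, 6))
p_xy /= p_xy.sum()                       # hypothetical joint p(X,Y)
q = rng.random((3, 6)) + 0.1
q /= q.sum(axis=0)                       # hypothetical clustering q(Z|Y)
beta = 2.5

# Relabel the classes of Z in every possible way (all of S_3).
values = [cost(p_xy, q[list(perm), :], beta) for perm in permutations(range(3))]
print(max(values) - min(values))
```

Every relabeling permutes the rows of q, and the cost is a sum over classes, so all 3! = 6 values agree up to floating-point rounding.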
28
Equivariant Branching Lemma: The subgroups of $S_N$ with one-dimensional fixed point spaces determine the bifurcation structure.
29
A partial subgroup lattice for $S_4$ and the corresponding bifurcating directions
30
A partial subgroup lattice for $S_4$ and the bifurcating directions corresponding to subgroups isomorphic to $S_2 \times S_2$.
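The dimension of a fixed point space can be computed with the averaging projector over the subgroup. The sketch below is a simplification and an assumption of this example: it lets $S_4$ act on $\mathbb{R}^4$ by permuting coordinates (one coordinate per class) and restricts to the sum-zero subspace, rather than using the block action on q described in the talk. The helper names are illustrative.

```python
import numpy as np

def generate_group(generators, n):
    """Close a set of permutations (tuples of length n) under composition."""
    identity = tuple(range(n))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = tuple(g[s[i]] for i in range(n))   # composition: g after s
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def dim_fix(group, n):
    """dim Fix(H) on the sum-zero subspace of R^n, via the averaging projector."""
    projector = np.zeros((n, n))
    for g in group:
        perm_matrix = np.zeros((n, n))
        perm_matrix[list(g), np.arange(n)] = 1.0   # column i has its 1 in row g[i]
        projector += perm_matrix
    projector /= len(group)
    # The trace is the dimension of the fixed space in R^n; the all-ones
    # direction is always fixed and lies outside the sum-zero subspace.
    return int(round(np.trace(projector))) - 1

n = 4
s4 = generate_group([(1, 0, 2, 3), (1, 2, 3, 0)], n)      # swap and 4-cycle
s3 = generate_group([(1, 0, 2, 3), (0, 2, 1, 3)], n)      # fixes the last class
s2xs2 = generate_group([(1, 0, 2, 3), (0, 1, 3, 2)], n)   # two disjoint swaps

for name, group in [("S4", s4), ("S3", s3), ("S2 x S2", s2xs2)]:
    print(name, len(group), dim_fix(group, n))
```

In this simplified setting both $S_3$ and $S_2 \times S_2$ have one-dimensional fixed point spaces, while the full group $S_4$ fixes only the origin, which is the situation the Equivariant Branching Lemma requires.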
31
Symmetry Breaking Bifurcations [Figure: bifurcation diagram of $q^*$.]
37
[Figure panels: Observed Bifurcation Structure; Group Structure.]
38
The Equivariant Branching Lemma shows that the bifurcation structure contains the branches … [Figure: Observed Bifurcation Structure of $q^*$.]
39
The subgroups $\{S_2 \times S_2\}$ give additional structure … [Figure panels: Group Structure; Observed Bifurcation Structure of $q^*$.]
41
Theorem: There are exactly K bifurcations on the branch $(q_{1/N}, \beta)$ whenever the Hessian $\Delta G(q_{1/N})$ is nonsingular. There are K = 52 bifurcations on the first branch. [Figure: Observed Bifurcation Structure of $q^*$.]
42
Why are there only 3 bifurcations observed? In general, are there only N-1 bifurcations?
What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type?
How many bifurcating solutions are there?
What do the bifurcating branches look like? Are they subcritical or supercritical?
What is the stability of the bifurcating branches?
Is there always a bifurcating branch which contains solutions of the optimization problem?
Are there bifurcations which alter the classes after all of the classes have resolved?
[Figure panels: Conceptual Bifurcation Structure of $q^*$; Observed Bifurcations for the 4 Blob Problem.]
43
Why are there only 3 bifurcations observed? In general, are there only N-1 bifurcations? There are N-1 symmetry breaking bifurcations from $S_M$ to $S_{M-1}$ for $M \le N$.
What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type?
How many bifurcating solutions are there? There are at least N from the first bifurcation ($S_N \to S_{N-1}$), at least N-1 from the next one ($S_{N-1} \to S_{N-2}$), etc., as well as branches with symmetry breaking from $S_M \to S_m \times S_n$ for all (m, n) where m + n = M.
What do the bifurcating branches look like? They are subcritical or supercritical depending on the sign of the bifurcation discriminator $\zeta(q^*, \beta^*, m, n)$.
What is the stability of the bifurcating branches?
Is there always a bifurcating branch which contains solutions of the optimization problem? Yes for the constrained problem, no for the annealing problem.
Are there bifurcations which alter the classes after all of the classes have resolved? Generically, no.
[Figure panels: Conceptual Bifurcation Structure of $q^*$; Observed Bifurcations for the 4 Blob Problem.]
44
[Figure panels: symmetry breaking pitchfork-like bifurcation; impossible scenario; saddle-node bifurcation; impossible scenario; non-generic.]
45
Continuation techniques provide numerical confirmation of the theory
46
[Figure: bifurcation diagram of $q^*$; axis label: I(Y,Z) in bits.]
47
Bifurcating branches with symmetry $S_2 \times S_2$. [Figure: bifurcation diagram of $q^*$; axis label: I(Y,Z) in bits.]
48
A closer look … [Figure: bifurcation diagram of $q^*$; axis label: I(Y,Z) in bits.]
49
Bifurcation from $S_4$ to $S_3$ … [Figure: bifurcation diagram of $q^*$; axis label: I(Y,Z) in bits.]
50
The bifurcation from $S_4$ to $S_3$ is subcritical … (the theory predicted this since the bifurcation discriminator $\zeta(q_{1/4}, \beta^*, m, n) < 0$). [Figure: axis label: I(Y,Z) in bits.]
51
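Theorem: The bifurcation discriminator $\zeta(q^*, \beta^*, m, n)$ of the pitchfork-like branch $(q^*, \lambda^*, \beta^*) + (tu, 0, \beta(t))$ with symmetry $S_m \times S_n$ determines the direction of the branch. If $\zeta(q^*, \beta^*, m, n) > 0$, then the branch is supercritical.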
52
Additional structure!! [Figure: axis label: I(Y,Z) in bits.]
54
Conclusions … We have a complete theoretical picture of how the clusterings evolve for any problem of the form $\max_q (G(q) + \beta D(q))$ subject to the assumptions stated earlier. SO WHAT?? There are theoretical consequences for the “Rate Distortion Curve” … This yields a new and improved algorithm for solving the neural coding problem …
55
A numerical algorithm to solve $\max_q (G(q) + \beta D(q))$
Let $q_0$ be the maximizer of $\max_q G(q)$, $\beta_0 = 1$ and $\Delta s > 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta_k D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some K.
1. Perform β-step: solve the tangent system for $(\partial_\beta q_k, \partial_\beta \lambda_k)$ and select $\beta_{k+1} = \beta_k + d_k$ where $d_k = \Delta s \, \mathrm{sgn}(\cos\theta) / (\|\partial_\beta q_k\|^2 + \|\partial_\beta \lambda_k\|^2 + 1)^{1/2}$.
2. The initial guess for $(q_{k+1}, \lambda_{k+1})$ at $\beta_{k+1}$ is $(q_{k+1}^{(0)}, \lambda_{k+1}^{(0)}) = (q_k, \lambda_k) + d_k (\partial_\beta q_k, \partial_\beta \lambda_k)$.
3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ using pseudoarclength continuation to get the maximizer $q_{k+1}$ and the vector of Lagrange multipliers $\lambda_{k+1}$, using initial guess $(q_{k+1}^{(0)}, \lambda_{k+1}^{(0)})$.
4. Check for bifurcation: compare the sign of the determinant of an identical block of each of the Hessians $\Delta_q [G(q_k) + \beta_k D(q_k)]$ and $\Delta_q [G(q_{k+1}) + \beta_{k+1} D(q_{k+1})]$. If a bifurcation is detected, then set $q_{k+1}^{(0)} = q_k + d_k u$, where u is a bifurcating direction, and repeat step 3.
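The β-step of this algorithm solves a linear system for a tangent vector to the solution branch. One way such a system arises is by differentiating the stationarity condition with respect to β. The derivation below is a sketch under an assumption that is not stated on the slides: that the Lagrangian has the form $\mathcal{L}(q,\lambda,\beta) = G(q) + \beta D(q) + \sum_k \lambda_k \bigl(\sum_\nu q_{\nu k} - 1\bigr)$, with $\Delta_{q,\lambda}\mathcal{L}$ denoting its Hessian in $(q, \lambda)$.

```latex
\nabla_{q,\lambda}\mathcal{L}\bigl(q(\beta),\lambda(\beta),\beta\bigr) = 0
\quad\Longrightarrow\quad
\Delta_{q,\lambda}\mathcal{L}(q,\lambda,\beta)
\begin{pmatrix} \partial_\beta q \\ \partial_\beta \lambda \end{pmatrix}
= -\,\partial_\beta \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)
= -\begin{pmatrix} \nabla_q D(q) \\ 0 \end{pmatrix}
```

The right-hand side follows because β multiplies only D in the assumed Lagrangian, and the constraint equations do not depend on β. The system has a unique solution when the Hessian is nonsingular, which fails exactly at the singular points where bifurcations can occur.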
58
Application to cricket sensory data. [Figure panels: Y, neural responses; Z, optimal clustering; E(Y|Z), stimulus means conditioned on each of the classes.]
60
More about Bifurcations
Theorem: All symmetry breaking bifurcations are pitchfork-like. Outline of proof: $\beta'(0) = 0$ since $\partial^2_{xx} r(0,0) = 0$.
Theorem: Generically, bifurcations which alter the classes do not occur after all of the classes have resolved. That is, only saddle-node bifurcations are possible, and these do not alter the class structure because of their explicit bifurcating direction.
Theorem: If $d^2 D(q^*)$ is positive definite on $\ker d^2 F(q^*, \beta^*)$, then the singularity $(q^*, \lambda^*, \beta^*)$ is a bifurcation. In particular, if $d^2 G(q^*)$ is negative definite on $\ker d^2 F(q^*, \beta^*)$, then $d^2 D(q^*)$ is positive definite on $\ker d^2 F(q^*, \beta^*)$.
Theorem: A symmetry breaking bifurcating direction u is an eigenvector of $d^2_{q,\lambda}\mathcal{L}((q^*, \lambda^*) + tu, \beta^* + \beta(t))$ for small t. If the corresponding eigenvalue is positive, then the branch consists of stationary points which are not solutions of the optimization problem.
Theorem: Subcritical bifurcating branches may be solutions of either the constrained problem or the annealing problem. Solutions of the constrained problem need not be solutions of the annealing problem. Solutions of the annealing problem are always solutions of the constrained problem.
Theorem: If there exists a saddle-node bifurcation of solutions to the Information Bottleneck problem at $I_0 = I^*$, then $R_I(I_0)$ is neither concave nor convex in any neighborhood of $I^*$. Similarly, the existence of a saddle-node bifurcation of solutions to the Information Distortion problem at $I_0 = I^*$ implies that $R_H(I_0)$ is neither concave nor convex in any neighborhood of $I^*$.
61
Continuation
A local maximum $q_k^*(\beta_k)$ of the optimization problem is an equilibrium of the gradient flow.
The initial condition $q_{k+1}^{(0)}(\beta_{k+1}^{(0)})$ is sought in the tangent direction $\partial_\beta q_k$, which is found by solving a linear matrix system.
The continuation algorithm used to find $q_{k+1}^*(\beta_{k+1})$ is based on Newton’s method.
Parameter continuation follows the dashed (---) path; pseudoarclength continuation follows the dotted (…) path.
62
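The Groups
Let $\mathcal{P}$ be the finite group of n × n “block” permutation matrices which represents the action of $S_N$ on q and $F(q, \beta)$. For example, if N = 3, one element of $\mathcal{P}$ permutes $q(Z_1|y)$ with $q(Z_2|y)$ for every y.
$F(q, \beta)$ is $\mathcal{P}$-invariant means that for every $\rho \in \mathcal{P}$, $F(\rho q, \beta) = F(q, \beta)$.
Let $\Gamma$ be the finite group of (n+K) × (n+K) block permutation matrices which represents the action of $S_N$ on $(q, \lambda)$ and $\nabla_{q,\lambda}\mathcal{L}(q, \lambda, \beta)$.
$\nabla_{q,\lambda}\mathcal{L}(q, \lambda, \beta)$ is $\Gamma$-equivariant means that for every $\gamma \in \Gamma$, $\gamma \nabla_{q,\lambda}\mathcal{L}(q, \lambda, \beta) = \nabla_{q,\lambda}\mathcal{L}(\gamma (q, \lambda), \beta)$.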
63
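Notation and Definitions
The symmetry of a point x is measured by its isotropy subgroup $\Sigma_x = \{\gamma \in \Gamma : \gamma x = x\}$.
An isotropy subgroup $\Sigma$ is a maximal isotropy subgroup of $\Gamma$ if there does not exist an isotropy subgroup $\Omega$ of $\Gamma$ such that $\Sigma \subsetneq \Omega \subsetneq \Gamma$.
At bifurcation, the fixed point subspace of the isotropy subgroup $\Sigma$ of $(q^*, \beta^*)$ is $\mathrm{Fix}(\Sigma) = \{w : \sigma w = w \text{ for every } \sigma \in \Sigma\}$.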
64
Equivariant Branching Lemma
One of the Existence Theorems we use to describe a bifurcation in the presence of symmetries is the Equivariant Branching Lemma (Vanderbauwhede and Cicogna 1980-1).
Idea: The bifurcation structure of local solutions is described by the isotropy subgroups $\Sigma$ of $\Gamma$ which have dim Fix($\Sigma$) = 1.
System: $\dot x = r(x, \beta)$, where $r(x, \beta)$ is G-equivariant for some compact Lie group G and Fix(G) = {0}.
Let H be an isotropy subgroup of G such that dim Fix(H) = 1. Assume $\partial_x \partial_\beta r(0,0) \ne 0$ (crossing condition).
Then there is a unique smooth solution branch $(t x_0, \beta(t))$ to r = 0 such that $x_0 \in$ Fix(H) and the isotropy subgroup of each solution is H.
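A one-dimensional example illustrates the kind of branch the lemma produces. The two vector fields below are hypothetical textbook normal forms, not systems from the talk: both are equivariant under x to -x, both have the trivial solution x = 0, and their nontrivial branches x = t satisfy β'(0) = 0 (pitchfork-like). The sign of β''(0) is used here to distinguish supercritical from subcritical branches.

```python
import numpy as np

def r_super(x, beta):
    """Hypothetical normal form with a supercritical pitchfork at (0, 0)."""
    return beta * x - x**3

def r_sub(x, beta):
    """Hypothetical normal form with a subcritical pitchfork at (0, 0)."""
    return beta * x + x**3

def branch_beta(r, t):
    """beta(t) on the nontrivial branch x = t, for r of the form beta*x + r(x, 0)."""
    return -r(t, 0.0) / t

def classify(r, h=1e-3):
    """Estimate beta'(0) and beta''(0) by central differences, using beta(0) = 0."""
    first = (branch_beta(r, h) - branch_beta(r, -h)) / (2 * h)
    second = (branch_beta(r, h) + branch_beta(r, -h)) / h**2
    kind = "supercritical" if second > 0 else "subcritical"
    return first, second, kind

xs = np.linspace(-1.0, 1.0, 11)
equivariant = np.allclose(r_super(-xs, 0.7), -r_super(xs, 0.7))
print(equivariant, classify(r_super), classify(r_sub))
```

In the supercritical case the nontrivial solutions exist for β above the bifurcation value, and in the subcritical case they exist below it, which is the distinction the bifurcation discriminator makes for the clustering problem.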
65
Smoller-Wasserman Theorem
Another Existence Theorem: the Smoller-Wasserman Theorem (1985-6).
For variational problems (where the vector field r is the gradient of some function), there is a bifurcating solution tangential to Fix(H) for every maximal isotropy subgroup H, not only those with dim Fix(H) = 1.
dim Fix(H) = 1 implies that H is a maximal isotropy subgroup.