1
Symmetry Breaking Bifurcations of the Information Distortion. Dissertation Defense, April 8, 2003. Albert E. Parker III. Complex Biological Systems, Department of Mathematical Sciences, Center for Computational Biology, Montana State University.
2
Goal: Solve the Information Distortion Problem. The goal of my thesis is to solve the Information Distortion problem, an optimization problem of the form $\max_{q \in \Delta} G(q)$ constrained by $D(q) \le D_0$, where $\Delta$ is a subset of $\mathbb{R}^n$. $G$ and $D$ are sufficiently smooth in $\Delta$. $G$ and $D$ have symmetry: they are invariant to some group action. Problems of this form arise in the study of clustering problems or optimal source coding systems.
3
Goal: Another Formulation. Using the method of Lagrange multipliers, the goal of finding solutions of the optimization problem can be rephrased as finding stationary points of the problem $\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$, where $\beta \in [0,\infty)$. $\Delta$ is a subset of $\mathbb{R}^{NK}$. $G$ and $D$ are sufficiently smooth in $\Delta$. $G$ and $D$ have symmetry: they are invariant to some group action.
4
How: Determine the Bifurcation Structure. We have described the bifurcation structure of stationary points to any problem of the form $\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$, where $\beta \in [0,\infty)$. $\Delta$ is a linear subset of $\mathbb{R}^{NK}$. $G$ and $D$ are sufficiently smooth in $\Delta$. $G$ and $D$ have symmetry: they are invariant to some group action.
5
Thesis Topics: The Data Clustering Problem; The Neural Coding Problem; Information Theory / Probability Theory; Optimization Theory; Dynamical Systems; Bifurcation Theory with Symmetries; Group Theory; Continuation Techniques.
6
Outline of this talk: The Data Clustering Problem; A Class of Optimization Problems; Bifurcation with Symmetries; Numerical Results.
7
The Data Clustering Problem. Data classification: identifying all of the books printed in 2002 which address the martial art Kempo. Data compression: converting a bitmap file to a JPEG file. [Diagram: a clustering $q(Y_N|Y)$ maps $Y$ ($K$ objects $\{y_i\}$) to $Y_N$ ($N$ objects $\{y_{N_i}\}$).]
8
A Symmetry: invariance to relabelling of the clusters of $Y_N$. [Diagram: the clustering $q(Y_N|Y)$ from $Y$ ($K$ objects $\{y_i\}$) to $Y_N$ ($N$ objects $\{y_{N_i}\}$), with the two clusters labelled class 1 and class 2.]
9
A Symmetry: invariance to relabelling of the clusters of $Y_N$. [Diagram: the same clustering with the labels swapped, class 2 and class 1.]
10
Requirements of a Clustering Method. (1) The original data is represented reasonably well by the clusters: choosing a cost function $D(Y,Y_N)$, called a distortion function, rigorously defines what we mean by "the data is represented reasonably well". (2) Fast implementation.
11
Examples optimizing at a distortion level $D(Y,Y_N) \le D_0$. Deterministic Annealing (Rose 1998), a fast clustering algorithm: $\max H(Y_N|Y)$ constrained by $D(Y,Y_N) \le D_0$. Rate Distortion Theory (Shannon ~1950), minimum informative compression: $\min I(Y,Y_N)$ constrained by $D(Y,Y_N) \le D_0$.
12
Inputs and Outputs and Clustered Outputs. The Information Distortion method clusters the outputs $Y$ into clusters $Y_N$ so that the information that one can learn about $X$ by observing $Y_N$, $I(X;Y_N)$, is as close as possible to the mutual information $I(X;Y)$. The corresponding information distortion function is $D_I(Y;Y_N) = I(X;Y) - I(X;Y_N)$. [Diagram: inputs $X$ ($L$ objects $\{x_i\}$) and outputs $Y$ ($K$ objects $\{y_i\}$) are related by $p(X,Y)$; the clustering $q(Y_N|Y)$ maps $Y$ to the clusters $Y_N$ ($N$ objects $\{y_{N_i}\}$).]
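The information distortion defined on this slide can be evaluated numerically. The following is a minimal sketch, assuming that `p_xy` is an L x K joint probability table, that `q` is an N x K table whose columns sum to one, and that $Y_N$ depends on $X$ only through $Y$; the function names are illustrative and do not come from the talk.

```python
import numpy as np

def mutual_information(p_joint):
    """I(A;B) in bits for a joint probability table p_joint[a, b]."""
    p_joint = np.asarray(p_joint, dtype=float)
    pa = p_joint.sum(axis=1, keepdims=True)   # marginal of A, shape (|A|, 1)
    pb = p_joint.sum(axis=0, keepdims=True)   # marginal of B, shape (1, |B|)
    mask = p_joint > 0                        # skip zero entries (0 * log 0 = 0)
    ratio = p_joint[mask] / (pa @ pb)[mask]
    return float(np.sum(p_joint[mask] * np.log2(ratio)))

def information_distortion(p_xy, q):
    """D_I = I(X;Y) - I(X;Y_N) for a clustering q[nu, y] = q(Y_N = nu | Y = y)."""
    p_xy = np.asarray(p_xy, dtype=float)
    q = np.asarray(q, dtype=float)
    # Assumption: p(x, nu) = sum_y p(x, y) q(nu | y)
    p_x_yn = p_xy @ q.T
    return mutual_information(p_xy) - mutual_information(p_x_yn)
```

With a noiseless `p_xy`, a clustering that keeps every output in its own class gives zero distortion, while the uniform clustering loses all of $I(X;Y)$.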
13
Two optimization problems which use the information distortion function. Information Distortion Method (Dimitrov and Miller 2001): $\max H(Y_N|Y)$ constrained by $D_I(Y,Y_N) \le D_0$; in Lagrangian form, $\max H(Y_N|Y) + \beta I(X;Y_N)$. Information Bottleneck Method (Tishby, Pereira, Bialek 1999): $\min I(Y,Y_N)$ constrained by $D_I(Y,Y_N) \le D_0$; in Lagrangian form, $\max -I(Y,Y_N) + \beta I(X;Y_N)$.
14
An annealing algorithm to solve $\max_q F(q,\beta) = \max_q (G(q) + \beta D(q))$. Let $q_0$ be the maximizer of $\max_q G(q)$, and let $\beta_0 = 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some $K$.
1. Perform $\beta$-step: let $\beta_{k+1} = \beta_k + d_k$, where $d_k > 0$.
2. The initial guess for $q_{k+1}$ at $\beta_{k+1}$ is $q_{k+1}^{(0)} = q_k + \eta$ for some small perturbation $\eta$.
3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ to get the maximizer $q_{k+1}$, using initial guess $q_{k+1}^{(0)}$.
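The three steps above can be sketched as a short loop. This is an illustration under stated assumptions, not the implementation used in the thesis: `solve` is a hypothetical caller-supplied local optimizer, the step $d_k$ is held constant, and the demonstration uses a one-dimensional toy cost with $G(q) = -q^4/4 - q^2/2$ and $D(q) = q^2/2$ (invariant under $q \to -q$) in place of the Information Distortion cost function.

```python
import numpy as np

def anneal(solve, q0, beta_max, d_beta, noise=1e-3, seed=0):
    """Follow maximizers of F(q, beta) = G(q) + beta * D(q) from beta = 0 to beta_max.

    solve(guess, beta) is a caller-supplied local optimizer (hypothetical name).
    """
    rng = np.random.default_rng(seed)
    q, beta = np.asarray(q0, dtype=float), 0.0
    path = [(beta, q)]
    while beta < beta_max:
        beta = min(beta + d_beta, beta_max)               # step 1: beta-step
        guess = q + noise * rng.standard_normal(q.shape)  # step 2: perturbed initial guess
        q = solve(guess, beta)                            # step 3: local optimization
        path.append((beta, q))
    return path

def toy_solve(q, beta, lr=0.1, iters=2000):
    """Gradient ascent on the toy cost F = -q**4/4 + (beta - 1) * q**2 / 2."""
    for _ in range(iters):
        q = q + lr * (-q**3 + (beta - 1.0) * q)
    return q

path = anneal(toy_solve, q0=np.zeros(1), beta_max=2.0, d_beta=0.25)
```

In this toy problem $q = 0$ stops being a maximizer at $\beta = 1$. Without the perturbation in step 2, gradient ascent started exactly at $q = 0$ would stay there, which illustrates why the initial guess is perturbed.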
15
Application of the annealing method to the Information Distortion problem $\max_q (H(Y_N|Y) + \beta I(X;Y_N))$ when $p(X,Y)$ is defined by four Gaussian blobs. [Diagram: inputs $X$ (52 objects) and outputs $Y$ (52 objects) are related by $p(X,Y)$; the clustering $q(Y_N|Y)$ maps $Y$ to $Y_N$ ($N$ objects); here $I(X;Y_N) = D(q(Y_N|Y))$.]
16
Observed Bifurcations for the Four Blob problem: We just saw the optimal clusterings $q^*$ at some $\beta^* = \beta_{\max}$. What do the clusterings look like for $\beta < \beta_{\max}$?
17
Bifurcations of $q^*(\beta)$. [Figure panels: Observed Bifurcations for the 4 Blob Problem; Conceptual Bifurcation Structure (axis label $q^*$).]
18
Questions. Why are there only 3 bifurcations observed? In general, are there only $N-1$ bifurcations? What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type? How many bifurcating solutions are there? What do the bifurcating branches look like? Are they subcritical or supercritical? What is the stability of the bifurcating branches? Is there always a bifurcating branch which contains solutions of the optimization problem? Are there bifurcations after all of the classes have resolved? [Figure panels: Conceptual Bifurcation Structure (axis label $q^*$); Observed Bifurcations for the 4 Blob Problem.]
19
Bifurcations with symmetry. To better understand the bifurcation structure, we capitalize on the symmetries of the function $G(q) + \beta D(q)$. The "obvious" symmetry is that $G(q) + \beta D(q)$ is invariant to relabelling of the $N$ classes of $Y_N$. The symmetry group of all permutations on $N$ symbols is $S_N$. [Figure: switch labels 1 and 3.]
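The relabelling symmetry can be checked numerically for the Information Distortion cost function $H(Y_N|Y) + \beta I(X;Y_N)$. This is a minimal sketch, assuming `q` is stored as an N x K array with one row per class, so that relabelling classes is a permutation of rows; the function name `cost` is illustrative.

```python
import numpy as np

def cost(q, p_xy, beta):
    """F(q, beta) = H(Y_N|Y) + beta * I(X;Y_N), in nats.

    q[nu, y] = q(Y_N = nu | Y = y); p_xy[x, y] is the joint distribution of X and Y.
    """
    tiny = 1e-300                                   # guards log(0)
    p_y = p_xy.sum(axis=0)                          # marginal p(y), shape (K,)
    h = -np.sum(p_y * q * np.log(np.clip(q, tiny, None)))   # H(Y_N|Y)
    p_xn = p_xy @ q.T                               # p(x, nu)
    px = p_xn.sum(axis=1, keepdims=True)
    pn = p_xn.sum(axis=0, keepdims=True)
    mi = np.sum(p_xn * np.log(np.clip(p_xn, tiny, None) / np.clip(px * pn, tiny, None)))
    return float(h + beta * mi)

rng = np.random.default_rng(1)
p_xy = rng.random((5, 6))
p_xy /= p_xy.sum()
q = rng.random((4, 6))
q /= q.sum(axis=0)                                  # each column is a distribution over classes
q_relabelled = q[[2, 1, 0, 3]]                      # switch labels 1 and 3
difference = abs(cost(q, p_xy, 1.7) - cost(q_relabelled, p_xy, 1.7))
```

Here `difference` is zero up to floating-point rounding, because both terms of the cost are sums over classes and do not depend on the order of the class labels.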
20
Symmetry Breaking Bifurcations q*
21
Symmetry Breaking Bifurcations q*
22
Symmetry Breaking Bifurcations q*
23
Symmetry Breaking Bifurcations q*
24
Symmetry Breaking Bifurcations q*
25
Existence Theorems for Bifurcating Branches. Given a bifurcation at a point fixed by $S_N$: Equivariant Branching Lemma (Vanderbauwhede and Cicogna 1980-1): there are $N$ bifurcating branches, each of which has symmetry $S_{N-1}$. The Smoller-Wasserman Theorem (Smoller and Wasserman 1985-6): there are bifurcating branches which have symmetry $\langle \gamma^p \rangle$ for every prime $p \mid N$, $p < N$.
26
Existence Theorems for Bifurcating Branches. Given a bifurcation at a point fixed by $S_{N-1}$: Equivariant Branching Lemma (Vanderbauwhede and Cicogna 1980-1): gives $N-1$ bifurcating branches which have symmetry $S_{N-2}$. The Smoller-Wasserman Theorem (Smoller and Wasserman 1985-6): gives bifurcating branches which have symmetry $\langle \gamma^p \rangle$ for every prime $p \mid N-1$, $p < N-1$. When $N = 4$, $N-1 = 3$, so there are no bifurcating branches given by the SW Theorem.
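The arithmetic condition in the Smoller-Wasserman statement (a prime $p$ that divides $M$ with $p < M$) is easy to enumerate. A small sketch, with the hypothetical helper name `sw_primes`:

```python
def sw_primes(m):
    """Primes p with p | m and p < m (the condition quoted for the SW Theorem)."""
    def is_prime(p):
        return p >= 2 and all(p % d for d in range(2, int(p ** 0.5) + 1))
    return [p for p in range(2, m) if m % p == 0 and is_prime(p)]

primes_for_s4 = sw_primes(4)   # bifurcation at a point fixed by S_4
primes_for_s3 = sw_primes(3)   # bifurcation at a point fixed by S_3
```

For $M = 4$ the list is `[2]`, and for $M = 3$ it is empty, which matches the remark that the SW Theorem gives no branches when $N - 1 = 3$. The list is empty exactly when $M$ is prime.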
27
Bifurcation Structure corresponds with Group Structure
28
A partial subgroup lattice for $S_4$ and the corresponding bifurcating directions given by the Equivariant Branching Lemma
29
A partial subgroup lattice for $S_4$ and the corresponding bifurcating directions given by the Smoller-Wasserman Theorem
30
q* Conceptual Bifurcation Structure
31
The Equivariant Branching Lemma shows that the bifurcation structure from $S_M$ to $S_{M-1}$ is … [Figure panels: Conceptual Bifurcation Structure (axis label $q^*$); Group Structure.]
32
The Equivariant Branching Lemma shows that the bifurcation structure from $S_M$ to $S_{M-1}$ is … [Figure panels: Conceptual Bifurcation Structure (axis label $q^*$); Group Structure.]
33
The Smoller-Wasserman Theorem shows additional structure … [Figure panels: Conceptual Bifurcation Structure (axis label $q^*$); Group Structure.]
34
The Smoller-Wasserman Theorem shows additional structure … 3 branches from the $S_4$ to $S_3$ bifurcation only. [Figure panels: Conceptual Bifurcation Structure (axis label $q^*$); Group Structure.]
35
If we stay on a branch which is fixed by $S_M$, how many bifurcations are there? [Figure: Conceptual Bifurcation Structure (axis label $q^*$).]
36
Theorem: There are exactly $K/N$ bifurcations on the branch $(q_{1/N}, \beta)$ for the Information Distortion problem. There are 13 bifurcations on the first branch (for the Four Blob problem, $K/N = 52/4 = 13$). [Figure panels: Conceptual Bifurcation Structure (axis label $q^*$); Group Structure.]
37
Bifurcation theory in the presence of symmetries enables us to answer the questions previously posed …
39
Questions and answers. Why are there only 3 bifurcations observed? In general, are there only $N-1$ bifurcations? There are $N-1$ symmetry breaking bifurcations from $S_M$ to $S_{M-1}$ for $M \le N$. What kinds of bifurcations do we expect: pitchfork-like, transcritical, saddle-node, or some other type? How many bifurcating solutions are there? There are at least $N$ from the first bifurcation, at least $N-1$ from the next one, etc. What do the bifurcating branches look like? They are subcritical or supercritical depending on the sign of the bifurcation discriminator $\zeta(q^*, \beta^*, u_k)$. What is the stability of the bifurcating branches? Is there always a bifurcating branch which contains solutions of the optimization problem? No. Are there bifurcations after all of the classes have resolved? In general, no. [Figure panels: Conceptual Bifurcation Structure (axis label $q^*$); Observed Bifurcations for the 4 Blob Problem.]
40
We can explain the bifurcation structure of all problems of the form $\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$, where $\beta \in [0,\infty)$. $\Delta$ is a subset of $\mathbb{R}^{NK}$. $G$ and $D$ are sufficiently smooth in $\Delta$. $G$ and $D$ are invariant to relabelling of the classes of $Y_N$. The blocks of the Hessian $\nabla^2_q (G + \beta D)$ at bifurcation satisfy a set of generic conditions. This class of problems includes the Information Distortion problem.
41
[Diagram of singularity scenarios: symmetry breaking bifurcation; impossible scenario; saddle-node bifurcation; impossible scenario; non-generic. Labels: chapter 6, chapter 8, chapter 4.]
42
Continuation techniques provide numerical confirmation of the theory
43
Previously Observed Bifurcation Structure for the Four Blob problem:
44
Equivariant Branching Lemma: Previous vs. Actual Bifurcation Structure. We used continuation techniques and the theory of bifurcations with symmetries on the 4 Blob Problem using the Information Distortion method to get this picture. [Figure panels: previous results; actual structure. Legend: singularity of $F$; singularity of $\mathcal{L}$ (marked *).]
45
q*
46
Smoller-Wasserman Theorem: there are bifurcating branches with symmetry = q*
47
A closer look … q*
48
Bifurcation from $S_4$ to $S_3$ …
49
The bifurcation from $S_4$ to $S_3$ is subcritical … (the theory predicted this since the bifurcation discriminator $\zeta(q_{1/4}, \beta^*, u) < 0$)
50
Bifurcation from $S_3$ to $S_2$ …
51
The bifurcation from $S_3$ to $S_2$ is subcritical …
52
Bifurcation from $S_2$ to $S_1$ …
53
The bifurcation from $S_2$ to $S_1$ …
54
What are these branches ??? q*
56
Conclusions. Theorem: In general, either symmetry breaking bifurcations or saddle-node bifurcations can occur. Outline of proof: the Equivariant Branching Lemma, the Smoller-Wasserman Theorem, and the following singularity structure. [Diagram of singularity scenarios: symmetry breaking bifurcation; impossible scenario; saddle-node bifurcation; impossible scenario; non-generic.]
57
Conclusions. Theorem: All symmetry breaking bifurcations from $S_M$ to $S_{M-1}$ are pitchfork-like, and there exist $M$ bifurcating branches, for which we have explicit directions.
58
Conclusions. Theorem: The bifurcation discriminator of the pitchfork-like branch $(q^*, \lambda^*, \beta^*) + (t u, 0, \beta(t))$ is $\zeta(q^*, \beta^*, u_k)$ [formula shown on the slide]. If $\zeta(q^*, \beta^*, u_k) > 0$, then the branch is supercritical.
59
Conclusions. Theorem: Solutions of the optimization problem do not always persist from bifurcation. Theorem: In general, bifurcations do not occur after all of the classes have resolved.
60
A numerical algorithm to solve $\max(G(q) + \beta D(q))$. Let $q_0$ be the maximizer of $\max_q G(q)$, $\beta_0 = 1$ and $\Delta s > 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some $K$.
1. Perform $\beta$-step: solve the tangent system [matrix equation shown on the slide] for $(\partial_\beta q_k, \partial_\beta \lambda_k)$ and select $\beta_{k+1} = \beta_k + d_k$, where $d_k = (\Delta s \, \mathrm{sgn}(\cos\theta)) / (\|\partial_\beta q_k\|^2 + \|\partial_\beta \lambda_k\|^2 + 1)^{1/2}$.
2. The initial guess for $(q_{k+1}, \lambda_{k+1})$ at $\beta_{k+1}$ is $(q_{k+1}^{(0)}, \lambda_{k+1}^{(0)}) = (q_k, \lambda_k) + d_k (\partial_\beta q_k, \partial_\beta \lambda_k)$.
3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ using pseudoarclength continuation to get the maximizer $q_{k+1}$ and the vector of Lagrange multipliers $\lambda_{k+1}$, using initial guess $(q_{k+1}^{(0)}, \lambda_{k+1}^{(0)})$.
4. Check for bifurcation: compare the sign of the determinant of an identical block of each of $\nabla^2_q [G(q_k) + \beta_k D(q_k)]$ and $\nabla^2_q [G(q_{k+1}) + \beta_{k+1} D(q_{k+1})]$. If a bifurcation is detected, then set $q_{k+1}^{(0)} = q_k + d_k u$, where $u$ is a bifurcating direction, and repeat step 3.
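Step 4 above compares determinant signs at consecutive continuation steps. This is a minimal sketch of that test alone, assuming the relevant Hessian block is available as a NumPy array; the toy block `toy_block(beta)` is hypothetical and only stands in for a block of the Hessian.

```python
import numpy as np

def sign_det(block):
    """Sign of det(block), computed stably via the log-determinant."""
    sign, _ = np.linalg.slogdet(block)
    return sign

def bifurcation_between(block_prev, block_next):
    """True if the determinant changes sign (or vanishes) between two steps."""
    return sign_det(block_prev) * sign_det(block_next) <= 0

def toy_block(beta):
    """Hypothetical 2 x 2 block whose determinant -2 * (beta - 1) vanishes at beta = 1."""
    return np.array([[beta - 1.0, 0.0], [0.0, -2.0]])

crossed = bifurcation_between(toy_block(0.9), toy_block(1.1))
not_crossed = bifurcation_between(toy_block(0.5), toy_block(0.9))
```

A determinant changes sign only when an odd number of eigenvalues cross zero between the two steps, so a sign test of this kind can miss crossings of even multiplicity.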
62
Details …: The Dynamical System; Types of Singularities; Continuation Techniques; The Explicit Group of Symmetries; Explicit Existence Theorems for bifurcating branches.
63
A Class of Problems. $\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$. $G$ and $D$ are sufficiently smooth in $\Delta$. $G$ and $D$ must be invariant under relabelling of the classes.
64
The Dynamical System. Goal: To determine the bifurcation structure of solutions to $\max_q (G(q) + \beta D(q))$ for $\beta \in [0,\infty)$. Method: Study the equilibria of the flow [gradient flow equation shown on the slide]. The Jacobian with respect to $q$ of the $K$ constraints $\{\sum_{Y_N} q(Y_N|y) - 1\}$ is $J = (I_K \; I_K \; \dots \; I_K)$. If $w^T \nabla^2_q F(q^*,\beta) \, w < 0$ for every $w \in \ker J$, then $q^*(\beta)$ is a maximizer of $\max_q (G(q) + \beta D(q))$. The first equilibrium is $q^*(\beta_0 = 0) \equiv 1/N$.
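The second-order condition on this slide can be tested numerically by restricting the Hessian to $\ker J$. This is a minimal sketch, assuming $q$ is stacked class by class into a vector of length $NK$ so that $J$ is the $K \times NK$ matrix of $N$ identity blocks; the function names are illustrative.

```python
import numpy as np

def constraint_jacobian(n_classes, k):
    """J = (I_K I_K ... I_K), the K x NK Jacobian of the normalization constraints."""
    return np.hstack([np.eye(k)] * n_classes)

def kernel_basis(j, tol=1e-12):
    """Orthonormal basis of ker J (as columns), from the SVD of J."""
    _, s, vt = np.linalg.svd(j)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

def is_constrained_maximizer(hessian, j):
    """Check w^T H w < 0 for every nonzero w in ker J."""
    z = kernel_basis(j)
    reduced = z.T @ hessian @ z              # Hessian restricted to ker J
    return bool(np.all(np.linalg.eigvalsh(reduced) < 0))

j = constraint_jacobian(3, 2)
# A Hessian that is indefinite on the whole space but negative definite on ker J
h = -np.eye(6) + j.T @ j
ok = is_constrained_maximizer(h, j)
```

The example shows that only the restriction to $\ker J$ matters: `h` has positive eigenvalues, yet the test passes because those directions violate the constraints.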
65
Properties of the Dynamical System. In our dynamical system, the Hessian [expression shown on the slide] determines the stability of equilibria and the location of bifurcation.
67
Investigating the Dynamical System. How: Use numerical continuation in a constrained system to choose $\beta$ and to choose an initial guess to find the equilibria $q^*(\beta)$. Use bifurcation theory with symmetries to understand bifurcations of the equilibria.
68
Continuation. A local maximum $q_k^*(\beta_k)$ of $F$ is an equilibrium of the gradient flow. The initial condition $q_{k+1}^{(0)}(\beta_{k+1}^{(0)})$ is sought in the tangent direction $\partial_\beta q_k$, which is found by solving the matrix system [shown on the slide]. The continuation algorithm used to find $q_{k+1}^*(\beta_{k+1})$ is based on Newton's method. Parameter continuation follows the dashed (---) path; pseudoarclength continuation follows the dotted (…) path.
69
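The Groups. Let $\mathcal{P}$ be the finite group of $n \times n$ "block" permutation matrices which represents the action of $S_N$ on $q$ and $F(q,\beta)$. For example, if $N = 3$, the matrix $\rho = \begin{pmatrix} 0 & I_K & 0 \\ I_K & 0 & 0 \\ 0 & 0 & I_K \end{pmatrix}$ permutes $q(Y_{N_1}|y)$ with $q(Y_{N_2}|y)$ for every $y$. $F(q,\beta)$ is $\mathcal{P}$-invariant means that for every $\rho \in \mathcal{P}$, $F(\rho q, \beta) = F(q,\beta)$. Let $\Gamma$ be the finite group of $(n+K) \times (n+K)$ block permutation matrices which represents the action of $S_N$ on $(q,\lambda)$ and $\nabla_{q,\lambda} \mathcal{L}(q,\lambda,\beta)$. $\nabla_{q,\lambda} \mathcal{L}(q,\lambda,\beta)$ is $\Gamma$-equivariant means that for every $\gamma \in \Gamma$, $\gamma \nabla_{q,\lambda} \mathcal{L}(q,\lambda,\beta) = \nabla_{q,\lambda} \mathcal{L}(\gamma (q,\lambda), \beta)$.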
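The block permutation matrices described on this slide can be built with a Kronecker product. This is a minimal sketch, assuming $q$ is stacked class by class into a vector of $N$ blocks of length $K$; `block_permutation` is an illustrative name.

```python
import itertools
import numpy as np

def block_permutation(perm, k):
    """NK x NK matrix that moves block perm[i] of a stacked vector into block i."""
    n = len(perm)
    small = np.eye(n)[list(perm)]        # N x N permutation matrix
    return np.kron(small, np.eye(k))     # replace each entry by a K x K block

k = 2
swap_1_2 = block_permutation((1, 0, 2), k)     # N = 3: exchange classes 1 and 2
q_vec = np.arange(6.0)                         # blocks [0, 1], [2, 3], [4, 5]
q_swapped = swap_1_2 @ q_vec

# The whole group for N = 3: one matrix per element of S_3
group = [block_permutation(p, k) for p in itertools.permutations(range(3))]
```

Each matrix is orthogonal, and the list `group` has $3! = 6$ elements, one for each relabelling of the three classes.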
70
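Notation and Definitions. The symmetry of $(q,\lambda)$ is measured by its isotropy subgroup $\Sigma_{(q,\lambda)} = \{\gamma \in \Gamma : \gamma (q,\lambda) = (q,\lambda)\}$. An isotropy subgroup $\Sigma$ is a maximal isotropy subgroup of $\Gamma$ if there does not exist an isotropy subgroup $\Omega$ of $\Gamma$ such that $\Sigma \subsetneq \Omega \subsetneq \Gamma$. At bifurcation, the fixed point subspace of $\Sigma_{(q^*,\lambda^*)}$ is $\mathrm{Fix}(\Sigma_{(q^*,\lambda^*)}) = \{w \in \ker \nabla^2_{q,\lambda} \mathcal{L}(q^*,\lambda^*,\beta^*) : \gamma w = w \text{ for every } \gamma \in \Sigma_{(q^*,\lambda^*)}\}$.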
71
Equivariant Branching Lemma. One of the Existence Theorems we use to describe a bifurcation in the presence of symmetries is the Equivariant Branching Lemma (Vanderbauwhede and Cicogna 1980-1). Idea: The bifurcation structure of local solutions is described by the isotropy subgroups $\Sigma$ of $\Gamma$ which have $\dim \mathrm{Fix}(\Sigma) = 1$. System: $\dot{x} = r(x,\beta)$. $r(x,\beta)$ is $G$-equivariant for some compact Lie group $G$. $\mathrm{Fix}(G) = \{0\}$. Let $H$ be an isotropy subgroup of $G$ such that $\dim \mathrm{Fix}(H) = 1$. Assume $\partial_\beta \partial_x r(0,0) \, x_0 \ne 0$ (crossing condition). Then there is a unique smooth solution branch $(t x_0, \beta(t))$ to $r = 0$ such that $x_0 \in \mathrm{Fix}(H)$ and the isotropy subgroup of each solution is $H$.
72
Symmetry Breaking from $S_M$ to $S_{M-1}$. From bifurcation, the Equivariant Branching Lemma shows that the following solutions emerge. A stationary point $q^*$ is M-uniform if there exist $1 \le M \le N$ and a $K \times 1$ vector $P$ such that $q(y_{N_i}|Y) = P$ for $M$ and only $M$ of the classes $\{y_{N_i}\}_{i=1}^{N}$ of $Y_N$. These $M$ classes of $Y_N$ are unresolved classes. The classes of $Y_N$ that are not unresolved are called resolved. The first equilibrium, $q^* \equiv 1/N$, is N-uniform. Theorem: $q^*$ is M-uniform if and only if $q^*$ is fixed by $S_M$.
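The definition of M-uniform can be checked directly on a clustering stored as an array. This is a minimal sketch, assuming `q` is an N x K array with one row per class and that two classes count as identical when their rows agree within a tolerance; `uniformity_orders` is an illustrative name.

```python
import numpy as np

def uniformity_orders(q, tol=1e-9):
    """Sizes of the groups of identical rows (classes) of q, largest first.

    q is M-uniform for every M that appears in the returned list.
    """
    q = np.asarray(q, dtype=float)
    unassigned = list(range(q.shape[0]))
    sizes = []
    while unassigned:
        ref = unassigned[0]
        group = [i for i in unassigned if np.allclose(q[i], q[ref], atol=tol, rtol=0.0)]
        sizes.append(len(group))
        unassigned = [i for i in unassigned if i not in group]
    return sorted(sizes, reverse=True)

uniform = np.full((4, 3), 0.25)                 # q = 1/N with N = 4
partly_resolved = np.array([[0.2, 0.2, 0.3],
                            [0.2, 0.2, 0.3],
                            [0.2, 0.2, 0.3],
                            [0.4, 0.4, 0.1]])   # three identical classes, one resolved
```

Here `uniformity_orders(uniform)` returns `[4]`, matching the statement that $q^* \equiv 1/N$ is N-uniform, and `uniformity_orders(partly_resolved)` returns `[3, 1]`.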
73
Kernel of the Hessian at Symmetry Breaking Bifurcation. Theorem: $\dim \ker \nabla^2_q F(q^*,\beta) = M$ with basis vectors $\{v_i\}_{i=1}^{M}$. Theorem: $\dim \ker \nabla^2_{q,\lambda} \mathcal{L}(q^*,\lambda,\beta) = M-1$ with basis vectors [shown on the slide]. Point: Since the bifurcating solutions whose existence is guaranteed by the EBL and the SW Theorem are tangential to $\ker \nabla^2_{q,\lambda} \mathcal{L}(q^*,\lambda,\beta)$, we know the explicit form of the bifurcating directions.
74
Symmetry Breaking Bifurcation from M-uniform solutions. Assumptions: Let $q^*$ be M-uniform. Call the $M$ identical blocks of $\nabla^2_q F(q^*,\beta)$: $B$. Call the other $N-M$ blocks of $\nabla^2_q F(q^*,\beta)$: $\{R_\nu\}$. We assume that $B$ has a single nullvector $v$ and that $R_\nu$ is nonsingular for every $\nu$. If $M < N$, then $B \sum_\nu R_\nu^{-1} + M I_K$ is nonsingular. Theorem: Let $(q^*, \lambda^*, \beta^*)$ be a singular point of the flow such that $q^*$ is M-uniform. Then there exist $M$ bifurcating (M-1)-uniform solutions $(q^*, \lambda^*, \beta^*) + (t u_k, 0, \beta(t))$, where $u_k$ is given by [expression shown on the slide].
76
Some of the bifurcating branches when $N = 4$ are given by the following isotropy subgroup lattice for $S_4$
77
For the 4 Blob problem: the isotropy subgroups and bifurcating directions of the observed bifurcating branches.
Isotropy group: $S_4$, $S_3$, $S_2$, 1.
Bifurcating direction: $(-v,-v,3v,-v,0)^T$, $(-v,2v,0,-v,0)^T$, $(-v,0,0,v,0)^T$, … no more bifurcations!
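The three directions listed above follow a common pattern: $(M-1)v$ in one unresolved class, $-v$ in the other unresolved classes, and zero in the resolved classes and in the final (Lagrange multiplier) block. This is a minimal sketch that reproduces them, assuming this pattern read off from the slide is the general form; `bifurcating_direction` is an illustrative name, and classes are indexed from 0.

```python
import numpy as np

def bifurcating_direction(v, unresolved, k, n_classes):
    """Stacked direction with (M-1)v in class k, -v in the other unresolved
    classes, 0 in resolved classes, and a trailing zero block for the multipliers."""
    v = np.asarray(v, dtype=float)
    unresolved = set(unresolved)
    assert k in unresolved, "k must index an unresolved class"
    m = len(unresolved)
    blocks = []
    for c in range(n_classes):
        if c == k:
            blocks.append((m - 1) * v)
        elif c in unresolved:
            blocks.append(-v)
        else:
            blocks.append(np.zeros_like(v))
    blocks.append(np.zeros_like(v))      # Lagrange multiplier component
    return np.concatenate(blocks)

v = np.array([1.0])
from_s4 = bifurcating_direction(v, {0, 1, 2, 3}, k=2, n_classes=4)   # (-v,-v,3v,-v,0)
from_s3 = bifurcating_direction(v, {0, 1, 3}, k=1, n_classes=4)      # (-v,2v,0,-v,0)
from_s2 = bifurcating_direction(v, {0, 3}, k=3, n_classes=4)         # (-v,0,0,v,0)
```

In each direction the class blocks sum to zero, so the $q$-component lies in $\ker J$ and moving along it preserves the normalization constraints to first order.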
78
Smoller-Wasserman Theorem. The other Existence Theorem: the Smoller-Wasserman Theorem (1985-6). For variational problems, where $r(x,\beta) = \nabla_x f(x,\beta)$, there is a bifurcating solution tangential to $\mathrm{Fix}(H)$ for every maximal isotropy subgroup $H$, not only those with $\dim \mathrm{Fix}(H) = 1$. $\dim \mathrm{Fix}(H) = 1$ implies that $H$ is a maximal isotropy subgroup.
79
Other branches. The Smoller-Wasserman Theorem shows that (under the same assumptions as before) if $M$ is composite, then there exist bifurcating solutions with isotropy group $\langle \gamma^p \rangle$ for every element $\gamma$ of order $M$ in $\Gamma$ and every prime $p \mid M$, $p < M$. Furthermore, $\dim \mathrm{Fix} \langle \gamma^p \rangle = p - 1$.
80
Bifurcating branches from a 4-uniform solution are given by the following isotropy subgroup lattice for $S_4$
81
Maximal isotropy subgroup for $S_4$
82
Issues: $S_M$. The full lattice of subgroups of the group $S_M$ is not known for arbitrary $M$. The lattice of maximal subgroups of the group $S_M$ is not known for arbitrary $M$.
83
More about the Bifurcation Structure. Theorem: All symmetry breaking bifurcations from $S_M$ to $S_{M-1}$ are pitchfork-like. Outline of proof: $\beta'(0) = 0$ since $\partial^2_{xx} r(0,0) = 0$. Theorem: The bifurcation discriminator of the pitchfork-like branch $(q^*, \lambda^*, \beta^*) + (t u_k, 0, \beta(t))$ is $\zeta(q^*, \beta^*, u_k)$ [formula shown on the slide]. If $\zeta(q^*, \beta^*, u_k) > 0$, then the branch is supercritical. Theorem: Generically, bifurcations do not occur after all of the classes have resolved. Theorem: If $\dim(\ker \nabla^2_{q,\lambda} \mathcal{L}(q^*,\lambda,\beta)) = 1$, and if a crossing condition is satisfied, then saddle-node bifurcation must occur.