1
A Class of Problems
We use numerical continuation and bifurcation theory with symmetries to analyze a class of optimization problems of the form $\max_{q \in \Delta} F(q, \beta) = \max_{q \in \Delta} (G(q) + \beta D(q))$, where $\Delta$ is the set of valid conditional probabilities $q(z|y)$. The goal is to solve for $\beta = B \in (0, \infty)$, where:
G and D are infinitely differentiable in $\Delta$.
G is strictly concave.
D is convex.
G and D must be invariant under relabeling of the classes.
The Hessian of F is block diagonal with N blocks $\{B_\nu\}$, and $B_\nu = B_\eta$ if $q(z_\nu|y) = q(z_\eta|y)$ for every $y \in Y$.
Problems in this class:
Deterministic Annealing (Rose 1998): $\max H(Z|Y) - \beta D(Y,Z)$. Clustering algorithm.
Rate Distortion Theory (Shannon ~1950): $\max -I(Y,Z) - \beta D(Y,Z)$. Optimal source coding.
Information Distortion (Dimitrov and Miller 2001): $\max H(Z|Y) + \beta I(X,Z)$. Used in neural coding.
Information Bottleneck Method (Tishby, Pereira, Bialek 2000): $\max -I(Y,Z) + \beta I(X,Z)$. Used for document classification, gene expression, neural coding and spectral analysis.
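As an illustration of one member of this class, the following is a minimal sketch of the Information Distortion cost $H(Z|Y) + \beta I(X,Z)$. The function names, the array layout (`q[nu, y]` for $q(z_\nu|y)$, `p_xy[x, y]` for $p(x,y)$), the use of base-2 logarithms, and the block-structured toy distribution are all assumptions made for this example, not details taken from the slides.

```python
import numpy as np

def _sum_p_log_ratio(p, num, den):
    """Sum of p * log2(num / den) over entries where p > 0 (0 log 0 := 0)."""
    mask = p > 0
    return np.sum(p[mask] * np.log2(num[mask] / den[mask]))

def information_distortion_cost(q, p_xy, beta):
    """F(q, beta) = H(Z|Y) + beta * I(X;Z), in bits.

    q[nu, y] holds q(z_nu | y); p_xy[x, y] holds p(x, y) (assumed layout).
    """
    p_y = p_xy.sum(axis=0)
    p_x = p_xy.sum(axis=1)
    # H(Z|Y) = -sum_y p(y) sum_z q(z|y) log2 q(z|y)
    h_z_given_y = -_sum_p_log_ratio(p_y * q, q, np.ones_like(q))
    # p(x, z) = sum_y p(x, y) q(z|y)
    p_xz = p_xy @ q.T
    p_z = p_xz.sum(axis=0)
    i_xz = _sum_p_log_ratio(p_xz, p_xz, np.outer(p_x, p_z))
    return h_z_given_y + beta * i_xz

# Toy joint distribution: four 2x2 blocks on the diagonal (hypothetical data).
p_xy = np.kron(np.eye(4), np.ones((2, 2))) / 16.0
N = 4
q_uniform = np.full((N, 8), 1.0 / N)                 # q(z|y) = 1/N
q_blocks = np.kron(np.eye(4), np.ones((1, 2)))       # deterministic quantizer

f_uniform = information_distortion_cost(q_uniform, p_xy, beta=5.0)
f_blocks = information_distortion_cost(q_blocks, p_xy, beta=2.0)
```

For the uniform quantizer $I(X,Z) = 0$, so the cost equals $\log_2 N$ for every $\beta$; for the deterministic quantizer that matches the blocks, $H(Z|Y) = 0$ and the cost is $\beta I(X,Z)$.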
2
A good communication system has p(X,Y) like:
[Figure: joint distribution p(X,Y), axes X and Y, with four input/output classes labeled 1 to 4]
$2^{H(X)}$ input sequences.
$2^{H(Y)}$ output sequences.
$2^{I(X,Y)}$ distinguishable input/output classes of (x,y) pairs.
Size of an input/output class: $2^{H(X|Y) + H(Y|X)}$ pairs.
Rate Distortion
How well is the source X represented by Z? Z is a representation of X using N symbols (or clusters).
[Diagram: X, with distribution p(X), mapped to Z]
Information Distortion
Goal: Determine the input/output classes of (x,y) pairs.
Idea: We seek to quantize (X,Y) into clusters which correspond with the input/output classes.
Method: We determine a quantizer, $Q^*$, between X and Z, a representation of Y using N elements, such that $F(Q^*, B)$ is a maximum for some $B \in (0, \infty)$.
[Diagram: input source X to output source Y via P(Y|X); Y to clustered outputs Z via $q^*(Z|Y)$; X to Z via $Q^*(Z|X)$]
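The counting on this slide can be checked numerically on a small example. The sketch below assumes a hypothetical joint distribution with four equally likely 2x2 blocks (not the distribution from the slides) and uses the identity $H(X,Y) = I(X,Y) + H(X|Y) + H(Y|X)$, so that the number of classes times the class size equals $2^{H(X,Y)}$.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy in bits of a probability array (0 log 0 := 0)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Hypothetical joint distribution: four 2x2 blocks on the diagonal.
p_xy = np.kron(np.eye(4), np.ones((2, 2))) / 16.0

H_X = entropy_bits(p_xy.sum(axis=1))
H_Y = entropy_bits(p_xy.sum(axis=0))
H_XY = entropy_bits(p_xy)

I_XY = H_X + H_Y - H_XY                      # mutual information
class_exponent = (H_XY - H_Y) + (H_XY - H_X)  # H(X|Y) + H(Y|X)

num_classes = 2.0 ** I_XY
class_size = 2.0 ** class_exponent
```

In this toy case there are four distinguishable classes of four (x,y) pairs each, which together account for all $2^{H(X,Y)} = 16$ pairs with nonzero probability.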
3
Some nice properties of the problem
The feasible region $\Delta$, a product of simplices, is nice.
Lemma: $\Delta$ is the convex hull of vertices($\Delta$).
The optimal quantizer $q^*$ is DETERMINISTIC.
Theorem: The extrema of the problem lie generically on the vertices of $\Delta$.
Corollary: The optimal quantizer is invariant to small perturbations in the model.
Solution of the problem when p(X,Y) := 4 Gaussian blobs.
[Figure panels: p(X,Y); I(X,Z) vs. N]
4
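The Dynamical System
Goal: To efficiently solve $\max_{q \in \Delta} (G(q) + \beta D(q))$ for each $\beta$, incremented in sufficiently small steps, as $\beta \to B$.
Method: Study the equilibria of the gradient flow $(\dot{q}, \dot{\lambda}) = \nabla_{q,\lambda} \mathcal{L}(q, \lambda, \beta)$, where $\mathcal{L}$ is the Lagrangian and $\lambda$ holds the multipliers of the K constraints.
The Jacobian with respect to q of the K constraints $\{\sum_z q(z|y) - 1\}$ is $J = (I_K \; I_K \; \dots \; I_K)$.
The first equilibrium is $q^*(\beta_0 = 0) \equiv 1/N$.
The Hessian $\Delta_{q,\lambda} \mathcal{L}$ determines stability and location of bifurcation.
Assumptions:
Let $q^*$ be a local solution to the optimization problem and fixed by $S_M$.
Call the M identical blocks of the Hessian $\Delta_q F(q^*, \beta)$: B. Call the other N-M blocks of $\Delta_q F(q^*, \beta)$: $\{R_\nu\}$.
At a singularity $(q^*, \lambda^*, \beta^*)$, B has a single nullvector v and $R_\nu$ is nonsingular for every $\nu$.
If M < N, then $B \sum_\nu R_\nu^{-1} + M I_K$ is nonsingular.
Theorem: If $\Delta_{q,\lambda} \mathcal{L}(q^*, \lambda^*, \beta^*)$ is singular then $\Delta_q F(q^*, \beta^*)$ is singular.
Theorem: $(q^*, \lambda^*, \beta^*)$ is a bifurcation of equilibria of the flow if and only if $\Delta_{q,\lambda} \mathcal{L}(q^*, \lambda^*, \beta^*)$ is singular.
Theorem: If $(q^*, \lambda^*, \beta^*)$ is a bifurcation of equilibria of the flow, then $\beta^* \ge 1$.
Theorem: $\dim(\ker \Delta_q F(q^*, \beta^*)) = M$ with basis vectors $w_1, w_2, \dots, w_M$.
Theorem: $\dim(\ker \Delta_{q,\lambda} \mathcal{L}(q^*, \lambda^*, \beta^*)) = M-1$ with basis vectors [equation not preserved in the transcript].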
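The constraint Jacobian on this slide can be sketched in a few lines. The example assumes that q is stored as one long vector, class by class (first $q(z_1|y)$ for all K values of y, then $q(z_2|y)$, and so on); that ordering and the function name are assumptions for illustration, chosen so that J is N copies of the K x K identity side by side.

```python
import numpy as np

def constraint_jacobian(N, K):
    """J = (I_K I_K ... I_K): N identity blocks of size K, shape (K, N*K)."""
    return np.hstack([np.eye(K)] * N)

N, K = 4, 3                           # hypothetical sizes
J = constraint_jacobian(N, K)

# Uniform quantizer q(z|y) = 1/N, stacked class by class.
q_uniform = np.full(N * K, 1.0 / N)

# J @ q sums q(z|y) over z for each y, so the constraints read J @ q - 1 = 0.
constraint_values = J @ q_uniform - 1.0
```

Because the constraints are linear in q, J does not depend on q or on $\beta$, which is why it appears as a constant border in the Hessian of the Lagrangian.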
5
Investigating the Dynamical System
How:
Use numerical continuation in a constrained system to choose $\beta$ and to choose an initial guess to find the equilibria $q^*(\beta)$.
Use bifurcation theory with symmetries to understand bifurcations of the equilibria.
Continuation
A local maximum $q_k^*(\beta_k)$ of the optimization problem is an equilibrium of the gradient flow.
The initial condition $q_{k+1}^{(0)}(\beta_{k+1}^{(0)})$ is sought in the tangent direction $\partial_\beta q_k$, which is found by solving the matrix system [equation not preserved in the transcript].
The continuation algorithm used to find $q_{k+1}^*(\beta_{k+1})$ is based on Newton's method.
6
Bifurcations of $q^*(\beta)$
[Figure panels: Observed Bifurcations for the 4 Blob Problem; Conceptual Bifurcation Structure; axis label $q^*(Y_N|Y)$]
Bifurcations with symmetry
To better understand the bifurcation structure, we capitalize on the symmetries of the optimization function $F(q, \beta)$.
The "obvious" symmetry is that $F(q, \beta)$ is invariant to relabeling of the N classes of Z.
The symmetry group of all permutations on N symbols is $S_N$.
The action of $S_N$ on $\Delta$ and $\nabla_{q,\lambda} \mathcal{L}(q, \lambda, \beta)$ is represented by the finite Lie group [equation not preserved in the transcript], where P is a "block permutation" matrix.
The symmetry of a point is measured by its isotropy group, the subgroup of this group which fixes it.
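The relabeling symmetry can be made concrete with a small numerical check. The sketch below builds a block permutation matrix as a Kronecker product of an N x N permutation matrix with $I_K$ and confirms that $H(Z|Y)$ is unchanged by it. This construction, the class-by-class stacking of q, and the random test data are assumptions for illustration; the exact matrix group on the slide was not preserved.

```python
import numpy as np

def block_permutation(perm, K):
    """Matrix that moves the K-block of class `src` to class `perm[src]`."""
    N = len(perm)
    S = np.zeros((N, N))
    for src, dst in enumerate(perm):
        S[dst, src] = 1.0
    return np.kron(S, np.eye(K))

def conditional_entropy(q_vec, p_y, N):
    """H(Z|Y) in bits for q stacked class by class (all entries > 0)."""
    q = q_vec.reshape(N, len(p_y))     # row nu holds q(z_nu | y) over y
    return -np.sum(p_y * q * np.log2(q))

N, K = 4, 3                            # hypothetical sizes
rng = np.random.default_rng(0)
q = rng.random((N, K)) + 0.1           # strictly positive entries
q /= q.sum(axis=0)                     # normalize over z for each y
q_vec = q.reshape(-1)
p_y = np.full(K, 1.0 / K)

P = block_permutation([2, 0, 3, 1], K) # an arbitrary relabeling of classes
h_before = conditional_entropy(q_vec, p_y, N)
h_after = conditional_entropy(P @ q_vec, p_y, N)

# The uniform quantizer is fixed by every relabeling (isotropy group S_N).
q_uniform = np.full(N * K, 1.0 / N)
uniform_is_fixed = np.allclose(P @ q_uniform, q_uniform)
```

A generic q is moved by P even though the cost is not, whereas the uniform quantizer is left in place, which is the sense in which it has full $S_N$ symmetry.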
7
Bifurcation Structure
The Equivariant Branching Lemma gives the existence of bifurcating solutions for every isotropy subgroup which fixes a one dimensional subspace of $\ker \Delta_{q,\lambda} \mathcal{L}(q^*, \lambda^*, \beta^*)$.
Theorem: Let $(q^*, \lambda^*, \beta^*)$ be a singular point of the flow such that $q^*$ is fixed by $S_M$. Then there exist M bifurcating solutions, $(q^*, \lambda^*, \beta^*) + (t u_k, 0, \beta(t))$, each with isotropy group $S_{M-1}$, where [definition of $u_k$ not preserved in the transcript].
What do the bifurcations look like?
Let $T(q^*, \beta^*)$ = [equation not preserved in the transcript].
Transcritical or Degenerate?
Theorem: If $T(q^*, \beta^*) \ne 0$ and M > 2, then the bifurcation at $(q^*, \beta^*)$ is transcritical. If $T(q^*, \beta^*) = 0$, it is degenerate.
Branch Orientation?
Theorem: If $T(q^*, \beta^*) > 0$ or if $T(q^*, \beta^*) < 0$, then the branch is supercritical or subcritical respectively. If $T(q^*, \beta^*) = 0$, then $\partial^4_{qqqq} F(q, \beta)$ dictates orientation.
Branch Stability?
Theorem: If $T(q^*, \beta^*) \ne 0$, then all branches fixed by $S_{M-1}$ are unstable.
8
[Figure: Partial lattice of the isotropy subgroups of $S_4$ (and associated bifurcating directions)]
For the 4 blob problem: the isotropy subgroups and bifurcating directions of the observed bifurcating branches.
isotropy group: $S_4$, bif direction: $(-v, -v, 3v, -v, 0)^T$
isotropy group: $S_3$, bif direction: $(-v, 2v, 0, -v, 0)^T$
isotropy group: $S_2$, bif direction: $(-v, 0, 0, v, 0)^T$
isotropy group: 1 ... No more bifs!
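The chain $S_4$, $S_3$, $S_2$, 1 can be read off from the listed directions by counting which relabelings leave each direction unchanged. The sketch below works with the scalar coefficients of v only (dropping the trailing multiplier component) and assumes, as an interpretation of the slide, that each direction is acted on by permutations of the classes that were still identical before that bifurcation: all four classes, then classes 1, 2, 4, then classes 1, 4 (0-based indices in the code).

```python
from itertools import permutations

def stabilizer_order(coeffs, classes):
    """Count permutations of `classes` that leave the coefficient tuple fixed."""
    count = 0
    for image in permutations(classes):
        permuted = list(coeffs)
        for src, dst in zip(classes, image):
            permuted[dst] = coeffs[src]
        if permuted == list(coeffs):
            count += 1
    return count

# (coefficients of v per class, classes permuted by the parent isotropy group)
observed = [
    ((-1, -1, 3, -1), (0, 1, 2, 3)),  # bifurcation from the S_4 solution
    ((-1, 2, 0, -1), (0, 1, 3)),      # bifurcation from the S_3 solution
    ((-1, 0, 0, 1), (0, 3)),          # bifurcation from the S_2 solution
]

orders = [stabilizer_order(c, cls) for c, cls in observed]
coefficient_sums = [sum(c) for c, _ in observed]
```

The stabilizer orders 6, 2, 1 are the orders of $S_3$, $S_2$ and the trivial group, matching the pattern that a solution fixed by $S_M$ bifurcates to branches fixed by $S_{M-1}$. Each coefficient tuple sums to zero, so moving along the direction preserves $\sum_z q(z|y) = 1$.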
9
Other Branches
The Smoller-Wasserman Theorem ascertains the existence of bifurcating branches for every maximal isotropy subgroup.
Theorem: If M is a composite number, then there exist bifurcating solutions with isotropy group $\langle \gamma^p \rangle$ for every element $\gamma$ of order M in $S_M$ and every prime $p \mid M$. The bifurcating direction is in the p-1 dimensional subspace of $\ker \Delta_{q,\lambda} \mathcal{L}(q^*, \lambda^*, \beta^*)$ which is fixed by $\langle \gamma^p \rangle$.
We have never numerically observed solutions fixed by $\langle \gamma^p \rangle$ and so perhaps they are unstable.
An example of redundancy: $(1423)^2 = (1324)^2 = (12)(34)$.
The full lattice of subgroups of the group $S_M$ is not known for arbitrary M.
[Figure: Lattice of the maximal isotropy subgroups in $S_4$]
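The redundancy example can be verified by composing permutations directly. The sketch below represents a permutation of {1, 2, 3, 4} as a dictionary; the helper names are made up for this example.

```python
def cycle_to_map(cycle, n=4):
    """Permutation of {1..n} given by a single cycle, e.g. (1, 4, 2, 3)."""
    mapping = {i: i for i in range(1, n + 1)}
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        mapping[a] = b
    return mapping

def compose(f, g):
    """Permutation x -> f(g(x)) (apply g first, then f)."""
    return {x: f[g[x]] for x in g}

gamma_a = cycle_to_map((1, 4, 2, 3))
gamma_b = cycle_to_map((1, 3, 2, 4))

square_a = compose(gamma_a, gamma_a)
square_b = compose(gamma_b, gamma_b)

# (12)(34) as a product of two disjoint transpositions.
double_swap = compose(cycle_to_map((1, 2)), cycle_to_map((3, 4)))
```

Both 4-cycles have the same square, so with p = 2 they generate the same subgroup of order 2, which is the redundancy the slide refers to.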
10
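The efficient algorithm to solve $\max F(q, \beta)$
Let $q_0$ be the maximizer of $\max_q G(q)$, $\beta_0 = 1$ and $\Delta s > 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q (G(q) + \beta_k D(q))$. Iterate the following steps until $\beta_K = B$ for some K.
1. Perform $\beta$-step: solve [equation not preserved in the transcript] for $(\partial_\beta q_k, \partial_\beta \lambda_k)$ and select $\beta_{k+1} = \beta_k + d_k$, where $d_k = \Delta s / (\|\partial_\beta q_k\|^2 + \|\partial_\beta \lambda_k\|^2 + 1)^{1/2}$.
2. The initial guess for $q_{k+1}$ at $\beta_{k+1}$ is $q_{k+1}^{(0)} = q_k + d_k \, \partial_\beta q_k$.
3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ to get the maximizer $q_{k+1}^*$, using initial guess $q_{k+1}^{(0)}$.
4. Check for bifurcation: compare the sign of the determinant of an identical block of each of the Hessians $\Delta_q [G(q_k) + \beta_k D(q_k)]$ and $\Delta_q [G(q_{k+1}) + \beta_{k+1} D(q_{k+1})]$. If a bifurcation is detected, then set $q_{k+1}^{(0)} = q_k + d_k u$, where u is given by [equation not preserved in the transcript], and repeat step 3.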
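Two pieces of the loop above are simple enough to sketch on their own: the step length from step 1 and the determinant sign test from step 4. The function names and the toy inputs are assumptions for illustration; the tangent vectors would in practice come from the matrix system that the transcript did not preserve, and the optimization in step 3 is not shown.

```python
import numpy as np

def beta_step(dq_dbeta, dlam_dbeta, delta_s):
    """Step 1: d_k = delta_s / sqrt(||dq||^2 + ||dlam||^2 + 1)."""
    norm_sq = np.sum(dq_dbeta ** 2) + np.sum(dlam_dbeta ** 2) + 1.0
    return delta_s / np.sqrt(norm_sq)

def initial_guess(q_k, dq_dbeta, d_k):
    """Step 2: predictor q_k + d_k * dq/dbeta along the tangent."""
    return q_k + d_k * dq_dbeta

def bifurcation_detected(block_prev, block_next):
    """Step 4: a determinant sign change in one of the identical Hessian blocks."""
    sign_prev = np.sign(np.linalg.det(block_prev))
    sign_next = np.sign(np.linalg.det(block_next))
    return bool(sign_prev != sign_next)

# Hypothetical tangent vectors and step parameter.
dq = np.array([3.0, 0.0])
dlam = np.array([4.0])
delta_s = 0.1
d_k = beta_step(dq, dlam, delta_s)
q_guess = initial_guess(np.array([0.5, 0.5]), dq, d_k)

# Length of the full step (d_k*dq, d_k*dlam, d_k) in (q, lambda, beta) space.
step_length = d_k * np.sqrt(np.sum(dq ** 2) + np.sum(dlam ** 2) + 1.0)

crossed = bifurcation_detected(np.diag([-1.0, -2.0]), np.diag([-1.0, 0.5]))
not_crossed = bifurcation_detected(np.diag([-1.0, -2.0]), np.diag([-3.0, -0.5]))
```

With this choice of $d_k$, the combined step in $(q, \lambda, \beta)$ has length $\Delta s$, so steep parts of the solution branch are traversed with smaller increments of $\beta$.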