We use numerical continuation and bifurcation theory with symmetries to analyze a class of optimization problems of the form max F(q, β) = max (G(q) + β D(q)).


1 A Class of Problems
We use numerical continuation and bifurcation theory with symmetries to analyze a class of optimization problems of the form max F(q, β) = max (G(q) + β D(q)), where the maximum is taken over q ∈ Δ. The goal is to solve for β = B ∈ (0, ∞), where:
G and D are infinitely differentiable in Δ.
G is strictly concave.
D is convex.
G and D must be invariant under relabeling of the classes.
The Hessian of F is block diagonal with N blocks {B_ν}, and B_ν = B_η if q(z_ν|y) = q(z_η|y) for every y ∈ Y.

Problems in this class:
Deterministic Annealing (Rose 1998): max H(Z|Y) - β D(Y,Z). Clustering algorithm.
Rate Distortion Theory (Shannon ~1950): max -I(Y,Z) - β D(Y,Z). Optimal source coding.
Information Distortion (Dimitrov and Miller 2001): max H(Z|Y) + β I(X,Z). Used in neural coding.
Information Bottleneck Method (Tishby, Pereira, Bialek 2000): max -I(Y,Z) + β I(X,Z). Used for document classification, gene expression, neural coding and spectral analysis.
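As a concrete instance of F(q, β) = G(q) + β D(q), the Information Distortion cost H(Z|Y) + β I(X,Z) can be evaluated directly from a joint distribution. The sketch below is illustrative only: the array layout (q[z, y] = q(z|y), p_xy[x, y] = p(x, y)), the function names and the use of base-2 logarithms are assumptions, not taken from the slides.

```python
import numpy as np

def conditional_entropy(q, p_y):
    """H(Z|Y) in bits, with q[z, y] = q(z|y) and p_y[y] = p(y)."""
    safe_q = np.where(q > 0, q, 1.0)  # log2(1) = 0, so zero entries add nothing
    return -np.sum(p_y * np.sum(q * np.log2(safe_q), axis=0))

def mutual_information(p_ab):
    """I(A;B) in bits for a joint distribution p_ab[a, b]."""
    p_a = p_ab.sum(axis=1, keepdims=True)
    p_b = p_ab.sum(axis=0, keepdims=True)
    mask = p_ab > 0
    return np.sum(p_ab[mask] * np.log2(p_ab[mask] / (p_a @ p_b)[mask]))

def information_distortion_cost(q, p_xy, beta):
    """F(q, beta) = H(Z|Y) + beta * I(X;Z) for the quantizer q(z|y)."""
    p_y = p_xy.sum(axis=0)
    p_xz = p_xy @ q.T  # p(x, z) = sum_y p(x, y) q(z|y)
    return conditional_entropy(q, p_y) + beta * mutual_information(p_xz)

# Toy joint distribution: two perfectly distinguishable input/output classes.
p_xy = np.array([[0.5, 0.0],
                 [0.0, 0.5]])
uniform_q = np.full((2, 2), 0.5)   # q(z|y) = 1/N
deterministic_q = np.eye(2)        # each y assigned to exactly one class

cost_uniform = information_distortion_cost(uniform_q, p_xy, beta=2.0)
cost_deterministic = information_distortion_cost(deterministic_q, p_xy, beta=2.0)
```

For this toy distribution the uniform quantizer maximizes the entropy term while the deterministic one maximizes the information term, which is the trade-off that β controls.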

2 A good communication system has p(X,Y) like:
[Figure: p(X,Y), axes X and Y]
2^H(X) input sequences
2^H(Y) output sequences
2^I(X,Y) distinguishable input/output classes of (x,y) pairs
Size of an input/output class: 2^(H(X|Y) + H(Y|X)) pairs

Rate Distortion
How well is the source X represented by Z? Z is a representation of X using N symbols (or clusters).
[Diagram labels: X, p(X), Z]

Information Distortion
Goal: Determine the input/output classes of (x,y) pairs.
Idea: We seek to quantize (X,Y) into clusters which correspond with the input/output classes.
Method: We determine a quantizer, Q*, between X and Z, a representation of Y using N elements, such that F(Q*, B) is a maximum for some B ∈ (0, ∞).
[Diagram labels: X (input source), P(Y|X), Y (output source), q*(Z|Y), Z (clustered outputs), Q*(Z|X)]

3 Some nice properties of the problem
The feasible region Δ, a product of simplices, is nice.
Lemma: Δ is the convex hull of vertices(Δ).
The optimal quantizer q* is DETERMINISTIC.
Theorem: The extrema of the optimization problem lie generically on the vertices of Δ.
Corollary: The optimal quantizer is invariant to small perturbations in the model.
Solution of the problem when p(X,Y) := 4 Gaussian blobs.
[Figure panels: p(X,Y); I(X,Z) vs. N]
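Because the feasible region is a product of K simplices (one per y), its vertices are exactly the deterministic quantizers, in which every y is assigned to a single class. The following sketch enumerates them for small N and K; the function name and the q[z, y] array layout are illustrative assumptions.

```python
import itertools
import numpy as np

def deterministic_quantizers(N, K):
    """Yield every vertex of the product of K simplices of dimension N - 1.

    Each vertex is an N x K array q with q[z, y] = 1 when y is assigned
    to class z and 0 otherwise, so every column sums to one.
    """
    for assignment in itertools.product(range(N), repeat=K):
        q = np.zeros((N, K))
        q[list(assignment), np.arange(K)] = 1.0
        yield q

vertices = list(deterministic_quantizers(N=3, K=2))
print(len(vertices))  # 9, i.e. N**K
```

The count N**K grows quickly, which is one reason a brute-force search over vertices is not a practical substitute for following solutions in β.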

4 The Dynamical System
Goal: To efficiently solve max over q ∈ Δ of (G(q) + β D(q)) for each β, incremented in sufficiently small steps, as β → B.
Method: Study the equilibria of the gradient flow.
The Jacobian with respect to q of the K constraints {Σ_z q(z|y) - 1} is J = (I_K I_K … I_K).
The first equilibrium is q*(β_0 = 0) ≡ 1/N.
The Hessian at an equilibrium determines stability and the location of bifurcation.

Assumptions:
Let q* be a local solution to the optimization problem and fixed by S_M.
Call the M identical blocks of Δ_q F(q*, β): B. Call the other N-M blocks of Δ_q F(q*, β): {R_ν}.
At a singularity (q*, λ*, β*), B has a single nullvector v and R_ν is nonsingular for every ν.
If M < N, then B Σ_ν R_ν^(-1) + M I_K is nonsingular.

Theorem: If Δ_{q,λ} L(q*, λ*, β*) is singular then Δ_q F(q*, β*) is singular.
Theorem: (q*, λ*, β*) is a bifurcation of equilibria of the flow if and only if Δ_{q,λ} L(q*, λ*, β*) is singular.
Theorem: If (q*, λ*, β*) is a bifurcation of equilibria of the flow, then β* ≥ 1.
Theorem: dim(ker Δ_q F(q*, β*)) = M with basis vectors w_1, w_2, …, w_M.
Theorem: dim(ker Δ_{q,λ} L(q*, λ*, β*)) = M-1 with basis vectors […]
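The constraint Jacobian J = (I_K I_K … I_K) can be written down explicitly once an ordering of q is fixed. The sketch below assumes q is stored as a vector of N consecutive blocks of length K (block ν holding q(z_ν|y) for all y); that storage order and the function names are assumptions made for illustration.

```python
import numpy as np

def constraint_jacobian(N, K):
    """J = (I_K I_K ... I_K): N copies of the K x K identity, side by side."""
    return np.hstack([np.eye(K)] * N)

def constraint_residual(q, N, K):
    """Value of the K constraints sum_z q(z|y) - 1 for a stacked vector q."""
    return constraint_jacobian(N, K) @ q - np.ones(K)

N, K = 4, 3
J = constraint_jacobian(N, K)
q_uniform = np.full(N * K, 1.0 / N)   # the first equilibrium, q = 1/N
residual = constraint_residual(q_uniform, N, K)
```

Since the constraints are linear in q, J does not depend on q, and the uniform quantizer satisfies all K constraints exactly.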

5 Investigating the Dynamical System
How: Use numerical continuation in a constrained system to choose β and to choose an initial guess to find the equilibria q*(β). Use bifurcation theory with symmetries to understand bifurcations of the equilibria.

Continuation
A local maximum q_k*(β_k) of the optimization problem is an equilibrium of the gradient flow.
The initial condition q_{k+1}^(0)(β_{k+1}^(0)) is sought in the tangent direction ∂_β q_k, which is found by solving the matrix system […]
The continuation algorithm used to find q_{k+1}*(β_{k+1}) is based on Newton's method.
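The matrix system itself does not survive in the transcript. One standard way to obtain a tangent direction, assumed here rather than taken from the slides, is to differentiate the equilibrium condition of a Lagrangian L(q, λ, β) = G(q) + β D(q) + Σ_y λ_y (Σ_z q(z|y) - 1) with respect to β. That gives a bordered linear system in (∂_β q, ∂_β λ) whose right-hand side is -(∇D(q), 0). The function name and the toy problem below are hypothetical.

```python
import numpy as np

def tangent_direction(hess_F, J, grad_D):
    """Solve [[hess_F, J^T], [J, 0]] (dq, dlam) = -(grad_D, 0).

    hess_F : Hessian of F = G + beta * D with respect to q at the equilibrium
    J      : Jacobian of the equality constraints
    grad_D : gradient of D with respect to q at the equilibrium
    Assumes the bordered matrix is nonsingular, i.e. we are away from
    a bifurcation.
    """
    n, K = hess_F.shape[0], J.shape[0]
    bordered = np.block([[hess_F, J.T],
                         [J, np.zeros((K, K))]])
    rhs = -np.concatenate([grad_D, np.zeros(K)])
    solution = np.linalg.solve(bordered, rhs)
    return solution[:n], solution[n:]

# Toy check with N = 2, K = 1: G(q) = -(q1^2 + q2^2), D(q) = q1,
# constraint q1 + q2 = 1. The maximizer is q1 = 1/2 + beta/4, q2 = 1/2 - beta/4.
hess_F = -2.0 * np.eye(2)
J = np.array([[1.0, 1.0]])
grad_D = np.array([1.0, 0.0])
dq, dlam = tangent_direction(hess_F, J, grad_D)
```

For the toy problem the closed-form maximizer has derivative (1/4, -1/4) in β, which is what the linear solve returns.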

6 Bifurcations of q*(β)
[Figure panels: Observed Bifurcations for the 4 Blob Problem; Conceptual Bifurcation Structure; axes β and q*(Y_N|Y)]

Bifurcations with symmetry
To better understand the bifurcation structure, we capitalize on the symmetries of the optimization function F(q, β).
The "obvious" symmetry is that F(q, β) is invariant to relabeling of the N classes of Z.
The symmetry group of all permutations on N symbols is S_N.
The action of S_N on q and Δ_{q,λ} L(q, λ, β) is represented by the finite Lie group Γ […], where P is a "block permutation" matrix.
The symmetry of a solution is measured by its isotropy group, the subgroup of Γ which fixes it.
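A block permutation matrix relabels the N classes by moving whole length-K blocks of q. The sketch below builds one as a Kronecker product and checks both invariance of a relabeling-invariant cost and the notion of a solution being fixed by a subgroup. The stacked storage order of q, the function names and the toy entropy-like cost are assumptions for illustration.

```python
import numpy as np

def block_permutation(perm, K):
    """Matrix that moves block i of a stacked vector q to position perm[i]."""
    N = len(perm)
    S = np.zeros((N, N))
    S[perm, np.arange(N)] = 1.0       # S[perm[i], i] = 1
    return np.kron(S, np.eye(K))      # permute blocks, keep order inside blocks

def toy_cost(q):
    """A relabeling-invariant stand-in for F: an unweighted entropy-like sum."""
    return -np.sum(q * np.log(q))

K = 2
P_cycle = block_permutation([1, 2, 0], K)
moved = P_cycle @ np.arange(6.0)      # blocks (0,1), (2,3), (4,5)

# A quantizer whose first two classes are identical: fixed by swapping them.
q_sym = np.array([0.2, 0.3, 0.2, 0.3, 0.6, 0.4])
P_swap = block_permutation([1, 0, 2], K)
is_fixed_by_swap = bool(np.allclose(P_swap @ q_sym, q_sym))
is_fixed_by_cycle = bool(np.allclose(P_cycle @ q_sym, q_sym))
cost_is_invariant = bool(np.isclose(toy_cost(P_cycle @ q_sym), toy_cost(q_sym)))
```

Here q_sym is fixed by the swap of its two equal blocks but not by the 3-cycle, so its isotropy group is a copy of S_2, while the cost takes the same value on every relabeling.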

7 Bifurcation Structure
The Equivariant Branching Lemma gives the existence of bifurcating solutions for every isotropy subgroup which fixes a one-dimensional subspace of ker Δ_{q,λ} L(q*, λ, β).
Theorem: Let (q*, λ*, β*) be a singular point of the flow such that q* is fixed by S_M. Then there exist M bifurcating solutions, (q*, λ*, β*) + (t u_k, 0, β(t)), each with isotropy group S_{M-1}, where […]

What do the bifurcations look like?
Let T(q*, β*) = […]
Transcritical or Degenerate? Theorem: If T(q*, β*) ≠ 0 and M > 2, then the bifurcation at (q*, β*) is transcritical. If T(q*, β*) = 0, it is degenerate.
Branch Orientation? Theorem: If T(q*, β*) > 0 or if T(q*, β*) < 0, then the branch is supercritical or subcritical respectively. If T(q*, β*) = 0, then ∂^4_{qqqq} F(q, β) dictates orientation.
Branch Stability? Theorem: If T(q*, β*) ≠ 0, then all branches fixed by S_{M-1} are unstable.

8 Partial lattice of the isotropy subgroups of S_4 (and associated bifurcating directions)
For the 4 blob problem: the isotropy subgroups and bifurcating directions of the observed bifurcating branches

isotropy group    bif direction
S_4               (-v, -v, 3v, -v, 0)^T
S_3               (-v, 2v, 0, -v, 0)^T
S_2               (-v, 0, 0, v, 0)^T
1                 … No more bifs!
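The three listed directions share a pattern: among the M blocks that the current solution treats identically, one block receives (M-1)v, the others receive -v, the remaining blocks receive 0, and the final (multiplier) component is 0. The sketch below encodes that pattern as inferred from these examples only; it is not a formula stated in the transcript, and the function and argument names are hypothetical.

```python
import numpy as np

def bifurcating_direction(v, fixed_blocks, k, N):
    """Direction breaking S_M down to S_{M-1}, following the listed examples.

    v            : nullvector of the repeated Hessian block (length K)
    fixed_blocks : zero-based indices of the M blocks permuted by S_M
    k            : the block in fixed_blocks that is singled out
    N            : total number of classes
    Returns a vector of length (N + 1) * K; the last K entries, for the
    multipliers, are zero.
    """
    if k not in fixed_blocks:
        raise ValueError("k must be one of the fixed blocks")
    v = np.asarray(v, dtype=float)
    M = len(fixed_blocks)
    u = np.zeros((N + 1, v.size))
    for nu in fixed_blocks:
        u[nu] = (M - 1) * v if nu == k else -v
    return u.ravel()

v = [1.0]  # K = 1 keeps the output easy to compare with the table
from_S4 = bifurcating_direction(v, [0, 1, 2, 3], k=2, N=4)
from_S3 = bifurcating_direction(v, [0, 1, 3], k=1, N=4)
from_S2 = bifurcating_direction(v, [0, 3], k=3, N=4)
```

In each case the block entries sum to zero, so moving along the direction keeps Σ_z q(z|y) = 1 to first order.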

9 Other Branches
The Smoller-Wasserman Theorem ascertains the existence of bifurcating branches for every maximal isotropy subgroup.
Theorem: If M is a composite number, then there exist bifurcating solutions with isotropy group <γ^p> for every element γ of order M in Γ and every prime p | M. The bifurcating direction is in the (p-1)-dimensional subspace of ker Δ_{q,λ} L(q*, λ, β) which is fixed by <γ^p>.
We have never numerically observed solutions fixed by <γ^p>, and so perhaps they are unstable.
An example of redundancy: (1423)^2 = (1324)^2 = (12)(34)
The full lattice of subgroups of the group S_M is not known for arbitrary M.
[Figure: Lattice of the maximal isotropy subgroups in S_4]
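The redundancy example can be checked by composing permutations directly. The sketch below represents a permutation of {1, …, n} as a dictionary; the helper names are illustrative.

```python
def cycle_to_map(cycle, n):
    """Permutation of {1, ..., n} given by a single cycle, as a dict."""
    mapping = {i: i for i in range(1, n + 1)}
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        mapping[a] = b
    return mapping

def compose(f, g):
    """Permutation obtained by applying g first and then f."""
    return {i: f[g[i]] for i in g}

gamma_a = cycle_to_map([1, 4, 2, 3], 4)   # (1423)
gamma_b = cycle_to_map([1, 3, 2, 4], 4)   # (1324)
double_swap = compose(cycle_to_map([1, 2], 4), cycle_to_map([3, 4], 4))  # (12)(34)

square_a = compose(gamma_a, gamma_a)
square_b = compose(gamma_b, gamma_b)
print(square_a == square_b == double_swap)  # True
```

Two different elements of order 4 therefore generate the same order-2 subgroup when squared, which is why counting elements overcounts the distinct branches.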

10 The efficient algorithm to solve max F(q, β)
Let q_0 be the maximizer of max_q G(q), β_0 = 1 and Δs > 0. For k ≥ 0, let (q_k, β_k) be a solution to max_q (G(q) + β D(q)). Iterate the following steps until β_K = B for some K.
1. Perform β-step: solve […] for (∂_β q_k, ∂_β λ_k) and select β_{k+1} = β_k + d_k, where d_k = Δs / (||∂_β q_k||^2 + ||∂_β λ_k||^2 + 1)^(1/2).
2. The initial guess for q_{k+1} at β_{k+1} is q_{k+1}^(0) = q_k + d_k ∂_β q_k.
3. Optimization: solve max_q (G(q) + β_{k+1} D(q)) to get the maximizer q*_{k+1}, using initial guess q_{k+1}^(0).
4. Check for bifurcation: compare the sign of the determinant of an identical block of each of Δ_q [G(q_k) + β_k D(q_k)] and Δ_q [G(q_{k+1}) + β_{k+1} D(q_{k+1})]. If a bifurcation is detected, then set q_{k+1}^(0) = q_k + d_k u, where u is given by […], and repeat step 3.
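Steps 1, 2 and 4 reduce to a few lines of linear algebra once the tangent (∂_β q_k, ∂_β λ_k) and the Hessian blocks are available. The sketch below takes those as inputs and leaves the optimizer of step 3 out; the function names and the numbers in the example are hypothetical.

```python
import numpy as np

def step_length(dq_dbeta, dlam_dbeta, delta_s):
    """d_k = delta_s / sqrt(||dq/dbeta||^2 + ||dlam/dbeta||^2 + 1)."""
    norm_sq = np.dot(dq_dbeta, dq_dbeta) + np.dot(dlam_dbeta, dlam_dbeta) + 1.0
    return delta_s / np.sqrt(norm_sq)

def predict(q_k, beta_k, dq_dbeta, dlam_dbeta, delta_s):
    """Steps 1 and 2: next beta and the initial guess for q at that beta."""
    d_k = step_length(dq_dbeta, dlam_dbeta, delta_s)
    return q_k + d_k * dq_dbeta, beta_k + d_k

def determinant_sign_changed(block_prev, block_next):
    """Step 4: has det of the repeated Hessian block changed sign?"""
    sign_prev = np.sign(np.linalg.det(block_prev))
    sign_next = np.sign(np.linalg.det(block_next))
    return bool(sign_prev != sign_next)

# Hypothetical numbers, chosen so the arithmetic is easy to follow.
q_k = np.array([0.5, 0.5])
dq = np.array([2.0, -2.0])
dlam = np.array([0.0])
q_guess, beta_next = predict(q_k, beta_k=1.0, dq_dbeta=dq, dlam_dbeta=dlam, delta_s=0.3)

block_before = np.array([[-1.0, 0.0], [0.0, -1.0]])   # det = 1
block_after = np.array([[-1.0, 0.0], [0.0, 0.5]])     # det = -0.5
crossed = determinant_sign_changed(block_before, block_after)
```

Dividing by the norm of the tangent keeps the step of roughly constant length Δs along the solution curve in (q, λ, β) space, so β advances more slowly where the solution changes quickly. A determinant sign test detects only an odd number of eigenvalue crossings between two steps, so in practice the step Δs has to be small enough for that to be a safe indicator.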

