Continuation and Symmetry Breaking Bifurcation of the Information Distortion Function September 19, 2002 Albert E. Parker Complex Biological Systems Department.


Continuation and Symmetry Breaking Bifurcation of the Information Distortion Function September 19, 2002 Albert E. Parker Complex Biological Systems Department of Mathematical Sciences Center for Computational Biology Montana State University Collaborators: Tomas Gedeon Alexander Dimitrov John P. Miller Zane Aldworth Bryan Roosien

Outline
• Our Problem
• A Class of Problems
• Continuation
• Bifurcation with Symmetries
• How we can efficiently solve the Class of Problems

The Neural Coding Problem: We want to understand the neural code. We seek an answer to the question: How does neural activity represent information about environmental stimuli? “The little fly sitting in the fly’s brain trying to fly the fly”

The mathematical problem: Optimizing the Information Distortion Function
$\max F(q,\beta) = \max \big( H(Z|Y) + \beta I(X,Z) \big)$
We are really just interested in $\max I(X,Z)$. $H(Z|Y)$ is the conditional entropy of $Z|Y$. $I(X,Z)$ is the mutual information between $X$ and $Z$.
Diagram: X (environmental stimuli) → Y (neural responses) via Q(Y|X); Y → Z (neural responses in N clusters) via q(Z|Y).
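The objective on this slide can be evaluated directly for small discrete distributions. The following is a minimal sketch, assuming the Markov chain X → Y → Z, so that p(x,z) = Σ_y p(x,y) q(z|y); the function name `information_distortion`, the array layout (`p_xy[x][y]`, `q_zy[z][y]`), and the toy numbers are illustrative choices, not taken from the slides.

```python
import math

def information_distortion(p_xy, q_zy, beta):
    """Return (F, H(Z|Y), I(X;Z)) for a joint p_xy[x][y] and a quantizer q_zy[z][y]."""
    n_x, n_y, n_z = len(p_xy), len(p_xy[0]), len(q_zy)
    p_x = [sum(p_xy[x]) for x in range(n_x)]
    p_y = [sum(p_xy[x][y] for x in range(n_x)) for y in range(n_y)]
    # H(Z|Y) = -sum_y p(y) sum_z q(z|y) log q(z|y); terms with q = 0 contribute 0
    h_zy = -sum(p_y[y] * q_zy[z][y] * math.log(q_zy[z][y])
                for z in range(n_z) for y in range(n_y) if q_zy[z][y] > 0)
    # p(x,z) = sum_y p(x,y) q(z|y), assuming Z depends on X only through Y
    p_xz = [[sum(p_xy[x][y] * q_zy[z][y] for y in range(n_y)) for z in range(n_z)]
            for x in range(n_x)]
    p_z = [sum(p_xz[x][z] for x in range(n_x)) for z in range(n_z)]
    # I(X;Z) = sum_{x,z} p(x,z) log( p(x,z) / (p(x) p(z)) )
    i_xz = sum(p_xz[x][z] * math.log(p_xz[x][z] / (p_x[x] * p_z[z]))
               for x in range(n_x) for z in range(n_z) if p_xz[x][z] > 0)
    return h_zy + beta * i_xz, h_zy, i_xz

# Toy joint distribution of stimulus X and response Y (hypothetical numbers)
p_xy = [[0.4, 0.1], [0.1, 0.4]]
uniform = [[0.5, 0.5], [0.5, 0.5]]   # q(z|y) = 1/N: maximal entropy, no information
hard = [[1.0, 0.0], [0.0, 1.0]]      # deterministic clustering z = y
f_u, h_u, i_u = information_distortion(p_xy, uniform, beta=2.0)
f_h, h_h, i_h = information_distortion(p_xy, hard, beta=2.0)
```

The two quantizers show the trade-off: the uniform one maximizes H(Z|Y) but carries no information about X, while the hard one has zero conditional entropy and recovers I(X;Y).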

Annealing

Application of the method to 4 Gaussian blobs (figure: random clusters)

Similar Problems
• Information Bottleneck Method (Tishby, Pereira, Bialek 2000): $\max\ -I(Y,Z) + \beta I(X,Z)$. Used for document classification, gene expression, neural coding and spectral analysis.
• Deterministic Annealing (Rose 1998): $\max\ H(Z|Y) + \beta D(Y,Z)$. Clustering algorithm.
• Rate Distortion Theory (Shannon ~1950): $\max\ -I(Y,Z) + \beta D(Y,Z)$.

The Class of Problems
$\max F(q,\beta) = \max \big( G(q) + \beta D(q) \big)$
To apply the bifurcation theory, the above problem must satisfy:
• $G$ and $D$ are infinitely differentiable on the constraint set $\Delta$ of conditional probabilities $q(z|y)$.
• $G$ is strictly concave.
• $G$ and $D$ must be invariant under relabeling of the classes.
• The Hessian of $F$ is block diagonal with $N$ blocks, and $B_\nu = B_\eta$ if $q(z_\nu|y) = q(z_\eta|y)$ for every $y \in Y$.

The Dynamical System
Goal: To efficiently solve $\max_{q \in \Delta} \big( G(q) + \beta D(q) \big)$ for each $\beta$, incremented in sufficiently small steps, as $\beta \to \infty$.
Method: Study the equilibria of the gradient flow $(\dot q, \dot\lambda) = \nabla_{q,\lambda} \mathcal{L}(q,\lambda,\beta)$, where $\mathcal{L}$ is the Lagrangian of the constrained problem and $\lambda$ is the vector of Lagrange multipliers.
• The Jacobian with respect to $q$ of the $K$ constraints $\{\sum_z q(z|y) - 1\}$ is $J = (I_K\ I_K\ \dots\ I_K)$.
• If $w^T \nabla^2_q F(q^*,\beta)\, w < 0$ for every $w \in \ker J$, where $\nabla^2_q F$ is the Hessian of $F$, then $q^*(\beta)$ is a maximizer of the constrained problem.
• The first equilibrium is $q^*(\beta_0 = 0) \equiv 1/N$.
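The claim that the uniform quantizer is the first equilibrium can be checked numerically for the information distortion case. This is a minimal sketch, assuming G(q) = H(Z|Y) as on the earlier slide, so that at β = 0 the objective is just G; the helper names and the marginal `p_y` are hypothetical. It uses the fact that the orthogonal projection onto ker J, for J = (I_K … I_K), subtracts the mean over classes for each y.

```python
import math

def grad_cond_entropy(p_y, q_zy):
    """Gradient of H(Z|Y) with respect to q(z|y): -p(y) * (log q(z|y) + 1)."""
    return [[-p_y[y] * (math.log(q_zy[z][y]) + 1.0) for y in range(len(p_y))]
            for z in range(len(q_zy))]

def project_onto_ker_J(g):
    """Project onto ker J, J = (I_K ... I_K): subtract the mean over classes z for each y."""
    n_z, n_y = len(g), len(g[0])
    means = [sum(g[z][y] for z in range(n_z)) / n_z for y in range(n_y)]
    return [[g[z][y] - means[y] for y in range(n_y)] for z in range(n_z)]

def hessian_diagonal(p_y, q_zy):
    """The Hessian of H(Z|Y) is diagonal with entries -p(y) / q(z|y)."""
    return [[-p_y[y] / q_zy[z][y] for y in range(len(p_y))] for z in range(len(q_zy))]

N, p_y = 3, [0.3, 0.7]                      # hypothetical: 3 classes, 2 responses
q_uniform = [[1.0 / N] * len(p_y) for _ in range(N)]

# Constrained stationarity: the gradient has no component inside ker J
residual = project_onto_ker_J(grad_cond_entropy(p_y, q_uniform))
# Second-order condition: a negative diagonal Hessian is negative definite on ker J
curvature = hessian_diagonal(p_y, q_uniform)
```

A vanishing `residual` together with strictly negative `curvature` is what the second-order condition on the slide asks for at q* ≡ 1/N.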

Investigating the Dynamical System
How:
• Use numerical continuation in a constrained system to choose $\beta$ and to choose an initial guess to find the equilibria $q^*(\beta)$.
• Use bifurcation theory with symmetries to understand bifurcations of the equilibria.

Properties of the Dynamical System
In our dynamical system the Hessian determines the stability of equilibria and the location of bifurcation.
Theorem: $(q^*,\beta^*)$ is a bifurcation of equilibria of the gradient flow if and only if $\nabla^2_{q,\lambda}\mathcal{L}(q^*,\beta^*)$ is singular.
Theorem: Let $(q^*,\beta^*)$ be a local solution to the optimization problem. If $\nabla^2_{q,\lambda}\mathcal{L}(q^*,\beta^*)$ is singular, then $\nabla^2_q F(q^*,\beta^*)$ is singular.

Continuation
• A local maximum $q_k^*(\beta_k)$ of the optimization problem is an equilibrium of the gradient flow.
• The initial condition $q_{k+1}^{(0)}(\beta_{k+1}^{(0)})$ is sought in the tangent direction $\partial_\beta q_k$, which is found by solving a linear matrix system.
• The continuation algorithm used to find $q_{k+1}^*(\beta_{k+1})$ is based on Newton’s method.
• We cannot use Newton’s method directly, since we have a constrained problem. Instead, we use an Augmented Lagrangian or an implicit solution method.
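A sketch of where the tangent system comes from, assuming the Lagrangian $\mathcal{L}(q,\lambda,\beta)$ is built from $F = G + \beta D$ with constraints that do not depend on $\beta$. The notation $\nabla^2_{q,\lambda}\mathcal{L}$ for the Hessian of the Lagrangian is chosen here for illustration. Differentiating the equilibrium condition along the solution branch with respect to $\beta$ gives a linear system for the tangent:

```latex
\begin{align*}
\nabla_{q,\lambda}\mathcal{L}\big(q(\beta),\lambda(\beta),\beta\big) &= 0
  && \text{(equilibrium along the branch)} \\
\nabla^2_{q,\lambda}\mathcal{L}(q_k,\lambda_k,\beta_k)
  \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix}
  &= -\,\partial_\beta \nabla_{q,\lambda}\mathcal{L}(q_k,\lambda_k,\beta_k)
  && \text{(chain rule in } \beta\text{)} \\
  &= -\begin{pmatrix} \nabla_q D(q_k) \\ 0 \end{pmatrix}
  && \text{(since } F = G + \beta D\text{)}
\end{align*}
```

The system is solvable for a unique tangent exactly when $\nabla^2_{q,\lambda}\mathcal{L}$ is nonsingular, which is why singularity of this matrix marks the points where continuation needs extra care.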

Bifurcation! What to do? How to continue after bifurcation? (Figure: $q^*$ plotted against $\beta$.)

We do not want to get stuck on a local maximum. (Figure labels: local max, global max.)

Bifurcations of $q^*(\beta)$
Figure panels: Conceptual Bifurcation Structure; Observed Bifurcations for the 4 Blob Problem. Axes: $q^*(Y_N|Y)$ against $\beta$.

Questions …
• How to detect bifurcation?
• What kinds of bifurcations do we expect? And how many bifurcating solutions are there?
• How to choose a direction to take after bifurcation is detected?
• Explain the nice bifurcation structure that is observed numerically.
1. Why are there only N-1 bifurcations observed?
2. Why are there no bifurcations observed after all N classes have resolved?

Bifurcations with symmetry
To better understand the bifurcation structure, we capitalize on the symmetries of the optimization function $F(q,\beta)$. The “obvious” symmetry is that $F(q,\beta)$ is invariant to relabeling of the $N$ classes of $Z$. The symmetry group of all permutations on $N$ symbols is $S_N$. We need to define the action of $S_N$ on $q$ and $F(q,\beta)$ …

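The Groups
Let $P$ be the finite group of $n \times n$ “block” permutation matrices which represents the action of $S_N$ on $q$ and $F(q,\beta)$. For example, if $N=3$,
$\rho = \begin{pmatrix} 0 & I_K & 0 \\ I_K & 0 & 0 \\ 0 & 0 & I_K \end{pmatrix}$
permutes $q(z_1|y)$ with $q(z_2|y)$ for every $y$.
$F(q,\beta)$ is $P$-invariant means that for every $\rho \in P$, $F(\rho q,\beta) = F(q,\beta)$.
Let $\Gamma$ be the finite group of $(n+K) \times (n+K)$ block permutation matrices which represents the action of $S_N$ on $(q,\lambda)$ and $\nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$.
$\nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta)$ is $\Gamma$-equivariant means that for every $\gamma \in \Gamma$, $\gamma \nabla_{q,\lambda}\mathcal{L}(q,\lambda,\beta) = \nabla_{q,\lambda}\mathcal{L}(\gamma (q,\lambda),\beta)$.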
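The block permutation action can be made concrete for a small case. The sketch below assumes $q$ is stored as a flat vector with one block of length $K$ per class, $q = (q(z_1|\cdot), q(z_2|\cdot), \dots)$; the helper names and the numbers are hypothetical, and invariance is checked only for $G = H(Z|Y)$ rather than for the full $F$.

```python
import math

def block_permutation(perm, k):
    """Build the n x n matrix (n = len(perm) * k) that moves class block j to position perm[j]."""
    n = len(perm) * k
    rho = [[0] * n for _ in range(n)]
    for j, target in enumerate(perm):
        for i in range(k):
            rho[target * k + i][j * k + i] = 1
    return rho

def matvec(a, v):
    """Plain matrix-vector product on nested lists."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def cond_entropy(p_y, q):
    """H(Z|Y) for flat q with index z*K + y, so i % K recovers y."""
    k = len(p_y)
    return -sum(p_y[i % k] * q_i * math.log(q_i) for i, q_i in enumerate(q) if q_i > 0)

K, p_y = 2, [0.3, 0.7]
# Blocks: q(z1|.) = (0.2, 0.5), q(z2|.) = (0.3, 0.1), q(z3|.) = (0.5, 0.4)
q = [0.2, 0.5, 0.3, 0.1, 0.5, 0.4]
rho = block_permutation([1, 0, 2], K)    # swap classes 1 and 2, keep class 3
rho_q = matvec(rho, q)
```

Applying `rho` twice returns the original vector, and `cond_entropy` takes the same value at `q` and `rho_q`, which is the $P$-invariance property stated on the slide for this choice of $G$.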

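Bifurcations with symmetry
The symmetry of $(q,\lambda)$ is measured by its isotropy subgroup $\Sigma_{q,\lambda} = \{\gamma \in \Gamma : \gamma (q,\lambda) = (q,\lambda)\}$.
An isotropy subgroup $\Sigma$ is a maximal isotropy subgroup of $\Gamma$ if there does not exist an isotropy subgroup $\Omega$ of $\Gamma$ such that $\Sigma \subsetneq \Omega \subsetneq \Gamma$.
At bifurcation, the fixed point subspace of $\Sigma_{q^*,\lambda^*}$ is $\mathrm{Fix}(\Sigma_{q^*,\lambda^*}) = \{ w \in \ker \nabla^2_{q,\lambda}\mathcal{L}(q^*,\lambda^*,\beta^*) : \gamma w = w \text{ for every } \gamma \in \Sigma_{q^*,\lambda^*} \}$.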

Bifurcations with symmetry
One of the tools we use to describe a bifurcation in the presence of symmetries is the Equivariant Branching Lemma (Vanderbauwhede and Cicogna).
Idea: The bifurcation structure of local solutions is described by the isotropy subgroups $\Sigma$ of $\Gamma$ which have $\dim \mathrm{Fix}(\Sigma) = 1$.
• System: $\dot x = r(x,\beta)$.
• $r(x,\beta)$ is $G$-equivariant for some compact Lie group $G$.
• $\mathrm{Fix}(G) = \{0\}$.
• Let $H$ be an isotropy subgroup of $G$ such that $\dim \mathrm{Fix}(H) = 1$.
• Assume $\partial_\beta \partial_x r(0,0)\, x_0 \neq 0$ for nontrivial $x_0 \in \mathrm{Fix}(H)$ (crossing condition).
Then there is a unique smooth solution branch $(t x_0, \beta(t))$ to $r = 0$ such that $x_0 \in \mathrm{Fix}(H)$ and the isotropy subgroup of each solution is $H$.

Bifurcations with symmetry
The other tool: Smoller-Wasserman Theorem (1985-6)
• For variational problems, where $r(x,\beta) = \nabla_x f(x,\beta)$ for some scalar function $f$, there is a bifurcating solution tangential to $\mathrm{Fix}(H)$ for every maximal isotropy subgroup $H$, not only those with $\dim \mathrm{Fix}(H) = 1$.
• $\dim \mathrm{Fix}(H) = 1$ implies that $H$ is a maximal isotropy subgroup.

What do the bifurcations look like?
From bifurcation, the Equivariant Branching Lemma shows that the following solutions emerge:
• An equilibrium $q^*$ is called M-uniform if $\nabla^2_q F(q^*,\beta)$ has $M$ blocks that are identical.
• The $M$ classes of $Z$ corresponding to these $M$ identical blocks are called unresolved classes.
• The classes of $Z$ that are not unresolved are called resolved.
• The first equilibrium, $q^* \equiv 1/N$, is N-uniform.
Theorem: $q^*$ is M-uniform if and only if $q^*$ is fixed by $S_M$.

What do the bifurcations look like?
Theorem: $\dim \ker \nabla^2_q F(q^*,\beta) = M$ with basis vectors $\{v_i\}_{i=1}^{M}$.
Theorem: $\dim \ker \nabla^2_{q,\lambda}\mathcal{L}(q^*,\lambda,\beta) = M-1$ with basis vectors $w_i = (v_i - v_M,\ 0)^T$, $i = 1,\dots,M-1$.
Point: Since the bifurcating solutions whose existence is guaranteed by the EBL and the SW Theorem are tangential to $\ker \nabla^2_{q,\lambda}\mathcal{L}(q^*,\lambda,\beta)$, we know the explicit form of the bifurcating directions.

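What do the bifurcations look like?
Assumptions:
• Let $q^*$ be M-uniform.
• Call the $M$ identical blocks of $\nabla^2_q F(q^*,\beta)$: $B$. Call the other $N-M$ blocks of $\nabla^2_q F(q^*,\beta)$: $\{R_\nu\}$. We assume that $B$ has a single nullvector $v$ and that $R_\nu$ is nonsingular for every $\nu$.
• If $M<N$, then $B \sum_\nu R_\nu^{-1} + M I_K$ is nonsingular.
Theorem: Let $(q^*,\lambda^*,\beta^*)$ be a singular point of the flow such that $q^*$ is M-uniform. Then there exist $M$ bifurcating (M-1)-uniform solutions $(q^*,\lambda^*,\beta^*) + (t u_k, 0, \beta(t))$, where $u_k$ has $(M-1)v$ in the block of the $k$-th unresolved class, $-v$ in the blocks of the other $M-1$ unresolved classes, and $0$ in the blocks of the resolved classes.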

Example: Some of the bifurcating branches when $N=4$ are given by the following isotropy subgroup lattice for $S_4$ (figure).

For the 4 Blob problem: The isotropy subgroups and bifurcating directions of the observed bifurcating branches
isotropy group: $S_4$ → $S_3$ → $S_2$ → $1$
bif direction: $(-v,-v,3v,-v,0)^T$ → $(-v,2v,0,-v,0)^T$ → $(-v,0,0,v,0)^T$ → … No more bifs!
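The three directions listed for the 4 Blob problem all follow one pattern: $(M-1)v$ in one unresolved class, $-v$ in the other unresolved classes, and $0$ in resolved classes. A small sketch that generates them is given below; the function names, the zero-based class indices, and the numeric stand-in for the nullvector $v$ are hypothetical, and the trailing $0$ for the Lagrange multiplier component is omitted.

```python
def bifurcating_direction(k, unresolved, n_classes, v):
    """Return u_k as a list of class blocks: (M-1)v in class k, -v in the other
    unresolved classes, 0 in resolved classes, where M = len(unresolved)."""
    if k not in unresolved:
        raise ValueError("k must be one of the unresolved classes")
    m = len(unresolved)
    blocks = []
    for z in range(n_classes):
        if z == k:
            coeff = m - 1
        elif z in unresolved:
            coeff = -1
        else:
            coeff = 0
        blocks.append([coeff * v_i for v_i in v])
    return blocks

def sum_over_classes(blocks):
    """J u for J = (I_K ... I_K): add the class blocks componentwise."""
    return [sum(column) for column in zip(*blocks)]

v = [1.0, -2.0, 0.5]                                    # stand-in for the nullvector of B
u_s4 = bifurcating_direction(2, {0, 1, 2, 3}, 4, v)     # (-v, -v, 3v, -v)
u_s3 = bifurcating_direction(1, {0, 1, 3}, 4, v)        # (-v, 2v, 0, -v)
u_s2 = bifurcating_direction(3, {0, 3}, 4, v)           # (-v, 0, 0, v)
```

Each direction sums to zero over the classes, so it lies in ker J and a small step along it keeps the constraints $\sum_z q(z|y) = 1$ satisfied.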

Are there other branches?
The Smoller-Wasserman Theorem shows that (under the same assumptions as before) if $M$ is composite, then there exist bifurcating solutions with isotropy group $\langle \gamma^p \rangle$ for every element $\gamma$ of order $M$ in $\Gamma$ and every prime $p \mid M$. Furthermore, $\dim \mathrm{Fix}\langle \gamma^p \rangle = p-1$.
We have never numerically observed solutions fixed by $\langle \gamma^p \rangle$, so perhaps they are unstable.

Bifurcating branches from a 4-uniform solution are given by the following isotropy subgroup lattice for $S_4$ (figure).

Maximal isotropy subgroup for $S_4$ (figure).

Issues: The full lattice of subgroups of the group $S_M$ is not known for arbitrary $M$.

Bifurcation Type?
• Pitchfork? Conjecture: There are only pitchfork bifurcations. (Show that $\beta'(0) = 0$ for any $M$, where $\beta'(t)$ depends on $\partial^3_{qqq} F(q,\beta)$; we have this result for $M = 2, 3$.)
• Subcritical or Supercritical? If not pitchfork, $\beta'(0) > 0$ or $\beta'(0) < 0$ answers this question. If pitchfork, one needs to examine $\beta''(0)$, which depends on $\partial^4_{qqqq} F(q,\beta)$.
• Stability? (Is the bifurcating solution a maximum of the optimization problem?)

Answered Questions
• How to detect bifurcation? Look for singularity of $B$.
• What kinds of bifurcations do we expect? And how many bifurcating solutions are there? $M$ bifurcating (M-1)-uniform solutions.
• How to choose a direction to take after bifurcation is detected? $((M-1)v, -v, -v, -v, \dots, -v, 0)^T$
• Explain the nice bifurcation structure that is observed numerically.
1. Why are there only N-1 bifurcations observed? There are only $N$ different types of M-uniform solutions for $M \le N$.
2. Why are there no bifurcations observed after all $N$ classes have resolved? For 1-uniform solutions, $\nabla^2_{q,\lambda}\mathcal{L}(q^*,\lambda,\beta)$ is nonsingular.

The efficient algorithm
Let $q_0$ be the maximizer of $\max_q G(q)$, $\beta_0 = 1$ and $\Delta s > 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q G(q) + \beta_k D(q)$. Iterate the following steps until $\beta_K = \beta_{\max}$ for some $K$.
1. Perform $\beta$-step: solve the linear tangent system for $(\partial_\beta q_k, \partial_\beta \lambda_k)$ and select $\beta_{k+1} = \beta_k + d_k$, where $d_k = \Delta s / (\|\partial_\beta q_k\|^2 + \|\partial_\beta \lambda_k\|^2 + 1)^{1/2}$.
2. The initial guess for $q_{k+1}$ at $\beta_{k+1}$ is $q_{k+1}^{(0)} = q_k + d_k\, \partial_\beta q_k$.
3. Optimization: solve $\max_q G(q) + \beta_{k+1} D(q)$ to get the maximizer $q_{k+1}$, using initial guess $q_{k+1}^{(0)}$.
4. Check for bifurcation: compare the sign of the determinant of an identical block of each of $\nabla^2_q [G(q_k) + \beta_k D(q_k)]$ and $\nabla^2_q [G(q_{k+1}) + \beta_{k+1} D(q_{k+1})]$. If a bifurcation is detected, then set $q_{k+1}^{(0)} = q_k + d_k u$, where $u$ is in $\mathrm{Fix}(H)$, and repeat step 3.
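The four steps above can be sketched on a toy problem. This is not the information distortion problem: it is a hypothetical scalar, unconstrained example with $G(q) = -q^2/2 - q^4/4$ (strictly concave) and $D(q) = q^2/2$, chosen because it has a $q \to -q$ symmetry and a pitchfork at $\beta = 1$. There is no multiplier $\lambda$, so $d_k$ uses only $\partial_\beta q_k$; Newton's method stands in for the optimizer, the start $\beta_0 = 0$ is picked so that $q_0 = 0$ maximizes $G$, and the bifurcating direction is taken as $u = 1$.

```python
import math

def grad(q, beta):
    """dF/dq for F = G + beta * D with G = -q^2/2 - q^4/4 and D = q^2/2."""
    return -q - q ** 3 + beta * q

def hess(q, beta):
    """d2F/dq2; a sign change along the path signals a bifurcation."""
    return -1.0 - 3.0 * q ** 2 + beta

def newton(q, beta, tol=1e-12, max_iter=100):
    """Step 3: solve grad(q, beta) = 0 starting from the initial guess q."""
    for _ in range(max_iter):
        g = grad(q, beta)
        if abs(g) < tol:
            break
        q -= g / hess(q, beta)
    return q

def continue_branch(beta_max=3.0, ds=0.3):
    q, beta = 0.0, 0.0                       # q_0 = 0 maximizes G
    path, bifurcations = [(beta, q)], []
    while beta < beta_max:
        # Step 1: tangent from hess * dq = -d(grad)/d(beta) = -D'(q) = -q
        dq = -q / hess(q, beta)
        d = ds / math.sqrt(dq * dq + 1.0)
        beta_new = beta + d
        # Steps 2 and 3: predictor along the tangent, then corrector
        q_new = newton(q + d * dq, beta_new)
        # Step 4: a sign change of the Hessian means a bifurcation was crossed
        if hess(q, beta) * hess(q_new, beta_new) < 0:
            bifurcations.append((beta, beta_new))
            q_new = newton(q + d * 1.0, beta_new)   # restart along u = 1
        q, beta = q_new, beta_new
        path.append((beta, q))
    return path, bifurcations

path, bifurcations = continue_branch()
beta_end, q_end = path[-1]
```

On this toy problem the trivial branch $q = 0$ loses stability at $\beta = 1$, and after the restart along $u$ the iterates follow the branch $q = \sqrt{\beta - 1}$, where the Hessian is negative again, so the tracked solution is a maximum.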