Proving Non-Reconstruction on Trees by an Iterative Algorithm. Elitza Maneva, University of Barcelona. Joint work with N. Bhatnagar, Hebrew University.
Optimal Algorithm for Reconstruction: Belief Propagation computes the distribution at the root given the boundary
Random variable $L_R(n)$: colors at the vertices at level $n$ when the root has color R. For what degree is $\lim_{n\to\infty} E[\Pr[\text{R at the root} \mid L_R(n)]] = 1/q$?
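Random variable $L_R(n)$: colors at the vertices at level $n$. For what degree is $\lim_{n\to\infty} d_{TV}(L_R(n), L_G(n)) = 0$? Total variation distance: $d_{TV}(\mu,\nu) = \frac{1}{2}\sum_{L \in D} |\mu(L) - \nu(L)|$, summing over all boundary configurations $L$ in the common domain $D$.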
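A minimal sketch of the total variation distance between two boundary laws, for the smallest case one can write out by hand. It assumes proper colorings with q = 3, d = 2 and depth n = 1, so each child is uniform over the two colors different from the root; the helper names `total_variation` and `level_one_law` are illustrative only.

```python
from itertools import product

def total_variation(mu, nu):
    """d_TV between two distributions given as dicts outcome -> probability."""
    support = set(mu) | set(nu)
    return 0.5 * sum(abs(mu.get(x, 0.0) - nu.get(x, 0.0)) for x in support)

def level_one_law(root, colors="RGB", d=2):
    """Law of the d children of a root with the given color (proper coloring).

    Each child is uniform over the colors different from the root, independently.
    """
    allowed = [c for c in colors if c != root]
    p = 1.0 / len(allowed) ** d
    return {"".join(leaves): p for leaves in product(allowed, repeat=d)}

L_R = level_one_law("R")  # supported on GG, GB, BG, BB
L_G = level_one_law("G")  # supported on RR, RB, BR, BB
print(total_variation(L_R, L_G))  # 0.75: the two laws overlap only on BB
```

The non-reconstruction question asks whether this quantity tends to 0 as the depth grows; the depth-1 value only illustrates the definition.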
Interest in reconstruction (in chronological order). Probability: extremality of the free-boundary Gibbs measure (since the 1960s). Phylogeny: reconstructing the ancestry tree of a collection of species. Physics: replica symmetry breaking (dynamical transition) in spin glasses. Computer Science: Glauber dynamics, MCMC, message-passing algorithms.
Space of solutions of random Constraint Satisfaction Problems: $n$ variables, $dn$ constraints chosen at random. [Figure: the density axis $d$, starting at 0, divided into Easy, Hard and Unsat regions, with thresholds $d_r$ and $d_s$.]
Tree: $\lim_{n\to\infty} d_{TV}(L_R(n), L_G(n)) = 0$ versus $\lim_{n\to\infty} d_{TV}(L_R(n), L_G(n)) > 0$. Random graph: [Figure: the solution space of the random graph in each of the two cases; the correspondence with the tree is labeled "Conjecture".]
Other models
–Potts model: $q$ colors, parameter $\lambda \in [0,1]$. Color of child node = same as parent's color with prob. $\lambda$, every other color with prob. $(1-\lambda)/(q-1)$.
–Asymmetric channels: $q$ colors, $q \times q$ matrix $M$, $M_{i,j} = \Pr[\text{child gets color } j \mid \text{parent is color } i]$.
–Same on Galton-Watson trees (i.e. random degrees).
–3-SAT. [Figure: variables $x_1, \dots, x_5$.]
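The Potts channel described above can be written out as a matrix. The sketch below writes `lam` for the Potts parameter (the probability that a child copies its parent's color; the symbol is an assumption of this example) and checks numerically that the non-trivial eigenvalue of the channel is (q·lam − 1)/(q − 1). The function names are illustrative only.

```python
def potts_channel(q, lam):
    """Row-stochastic q x q matrix with M[i][j] = Pr[child = j | parent = i]."""
    off = (1.0 - lam) / (q - 1)
    return [[lam if i == j else off for j in range(q)] for i in range(q)]

def second_eigenvalue(q, lam):
    """Eigenvalue of the Potts channel on vectors orthogonal to all-ones."""
    return (q * lam - 1.0) / (q - 1)

def mat_vec(M, v):
    """Matrix-vector product M v for lists of lists."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# lam = 0 is proper coloring: a child never copies its parent's color.
M = potts_channel(3, 0.0)
v = [1.0, -1.0, 0.0]            # orthogonal to the all-ones vector
print(mat_vec(M, v))             # equals second_eigenvalue(3, 0.0) * v
print(second_eigenvalue(3, 0.0))
```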
RSB, clustering of solutions and reconstruction highlights
[Mezard-Parisi-Zecchina '03] Survey Propagation algorithm and satisfiability threshold calculation for 3-SAT and coloring, based on the Replica Symmetry Breaking Ansatz.
[Mezard-Montanari '06] The dynamic replica symmetry breaking threshold is the same as the threshold for reconstruction on the tree.
[Achlioptas, Coja-Oghlan '08] For sufficiently large $q$, there exists a sequence $\varepsilon_q \to 0$ s.t. the space of colorings is clustered for random graphs of average degree $(1+\varepsilon_q)\, q \log q < d < (2-\varepsilon_q)\, q \log q$.
[Sly '08] For sufficiently large $q$: $q(\log q + \log\log q + 1 - \ln 2 - o(1)) < d_r < q(\log q + \log\log q + 1 + o(1))$.
Easy lower bound by coupling
$d_{\text{reconstruction}} > q-1$
Upper bound: boundary forcing the root. $d_{\text{reconstruction}} \le \text{soln}\left( \sum_{j=0}^{q-1} (-1)^j \binom{q-1}{j} \frac{(q-1-jx)^d}{(q-1)^d} = x \right)$
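One way to explore the forcing equation numerically is to iterate its left-hand side as a map on x, the probability that a vertex is forced by the boundary. The sketch below assumes the iteration starts from x = 1 (at the boundary every color is known) and treats a limit above a small tolerance as "forcing survives"; the function names, the iteration count and the tolerance are choices of this example, and near the critical degree the iteration may converge too slowly for them to be reliable.

```python
from math import comb

def forced_root_probability(x, q, d):
    """Inclusion-exclusion probability that the d children force the root,
    when each child is independently forced with probability x."""
    total = 0.0
    for j in range(q):  # j = 0, ..., q-1
        total += (-1) ** j * comb(q - 1, j) * ((q - 1 - j * x) / (q - 1)) ** d
    # Clamp to [0, 1] to absorb floating-point cancellation error.
    return min(1.0, max(0.0, total))

def limiting_forcing_probability(q, d, iterations=200):
    """Iterate x -> forced_root_probability(x), starting from x = 1."""
    x = 1.0
    for _ in range(iterations):
        x = forced_root_probability(x, q, d)
    return x

def smallest_forcing_degree(q, d_max=200, tol=1e-6):
    """Smallest d <= d_max whose iteration stays above tol (None if none does)."""
    for d in range(1, d_max + 1):
        if limiting_forcing_probability(q, d) > tol:
            return d
    return None

print(limiting_forcing_probability(3, 2))   # decays to 0: the map is x -> x^2 / 2
print(limiting_forcing_probability(3, 10))  # stays close to 1
```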
The bounds for coloring
[Zdeborova-Krzakala '07]
–Heuristic algorithm for computing the threshold for general models
–Heuristic analysis predicted the asymptotic result of Sly
[Sly '08] [Bhatnagar-Vera-Vigoda '08] For $q$ sufficiently large:
–Hard step (need large $q$): after sufficiently many iterations $d_{TV} < 2/q$.
–Easy step: given that $d_{TV} < 2/q$, $d_{TV}$ goes to 0.
[Bhatnagar-Maneva '09]
–Rigorous algorithm for getting upper bounds on $d_{TV}$ for general models
–Concrete bounds on the threshold for the Potts model with small $q$.
Table columns: $q$, Lower bound, Upper bound, [ZK07] (heur.)
BP: recursion on the distribution at the root given the boundary. Here we need: a recursion on the distribution over possible distributions at the root when the boundary is chosen at random.
Some notation: $f:(\mathbb{R}^q)^d \to \mathbb{R}^q$, $f_i(\eta^1, \eta^2, \dots, \eta^d) = \prod_{t=1}^{d} \Big( \sum_{j \ne i} \eta^t_j \Big)$, $\|\eta\| = \sum_i \eta_i$.
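For proper colorings, the map f is the belief propagation update at a vertex with d children: coordinate i is the product over children of the mass the child puts on colors other than i. A minimal sketch, with child messages given as lists of length q (the name `eta` and the helper `normalize` are assumptions of this example):

```python
def f(etas):
    """f_i(eta^1, ..., eta^d) = prod over children t of sum_{j != i} eta^t_j."""
    q = len(etas[0])
    out = []
    for i in range(q):
        prod = 1.0
        for eta in etas:
            prod *= sum(eta[j] for j in range(q) if j != i)
        out.append(prod)
    return out

def norm(eta):
    """||eta|| = sum of the coordinates."""
    return sum(eta)

def normalize(eta):
    """Rescale a non-zero, non-negative vector to a probability vector."""
    s = norm(eta)
    return [x / s for x in eta]

# Two leaves both colored 0: the root cannot be 0, and is 1 or 2 equally likely.
print(normalize(f([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])))  # [0.0, 0.5, 0.5]
# Leaves colored 0 and 1: the root is forced to color 2.
print(normalize(f([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])))  # [0.0, 0.0, 1.0]
```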
Recursion on the tree depth. $Q^G_n$: random $q$-dim vector, $Q^G_n(R) := \Pr[R \mid L]$, where $L \sim L_G(n)$. $\Pr[Q^G_{n+1} = \eta] = \frac{1}{(q-1)^d} \sum_{c_1,\dots,c_d \in \{1,\dots,q\}\setminus G} \Pr\big[f(Q^{c_1}_n, \dots, Q^{c_d}_n) \propto \eta\big]$
Population dynamics. $\Pr[Q^G_{n+1} = \eta] = \frac{1}{(q-1)^d} \sum_{c_1,\dots,c_d \in \{1,\dots,q\}\setminus G} \Pr\big[f(Q^{c_1}_n, \dots, Q^{c_d}_n) \propto \eta\big]$
Keep "populations" of $N$ samples each from the distributions of $Q^R_n$, $Q^G_n$, etc.
Generating the population of $Q^G_{n+1}$:
–Choose $d$ colors $c_1, \dots, c_d$ from $\{1, \dots, q\}\setminus G$ independently
–Choose $\eta^1, \dots, \eta^d$ at random from the populations for $Q^{c_1}_n, \dots, Q^{c_d}_n$ respectively
–Save $f(\eta^1, \dots, \eta^d)/\|f(\eta^1, \dots, \eta^d)\|$ into the population for $Q^G_{n+1}$
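The three steps above can be sketched as follows for proper colorings. Colors are numbered 0, ..., q−1; the depth-0 populations consist of unit vectors (the root is itself the boundary), which is an assumption of this example, as are the population size, the seed and all function names. The output is a Monte Carlo estimate, not a rigorous bound.

```python
import random

def f(etas):
    """Coloring BP update: f_i = prod over children of sum_{j != i} eta_j."""
    q = len(etas[0])
    out = [1.0] * q
    for eta in etas:
        for i in range(q):
            out[i] *= sum(eta[j] for j in range(q) if j != i)
    return out

def normalize(v):
    s = sum(v)
    return [x / s for x in v]

def initial_populations(q, N):
    """Depth 0: Q^c_0 is the unit vector on color c."""
    return [[[1.0 if i == c else 0.0 for i in range(q)] for _ in range(N)]
            for c in range(q)]

def population_step(populations, d, rng):
    """One level of the recursion: build the populations for depth n + 1."""
    q, N = len(populations), len(populations[0])
    new_populations = []
    for g in range(q):
        others = [c for c in range(q) if c != g]
        samples = []
        for _ in range(N):
            colors = [rng.choice(others) for _ in range(d)]       # child colors
            etas = [rng.choice(populations[c]) for c in colors]   # child messages
            samples.append(normalize(f(etas)))
        new_populations.append(samples)
    return new_populations

rng = random.Random(0)
pops = initial_populations(q=3, N=200)
for _ in range(5):
    pops = population_step(pops, d=2, rng=rng)
# Empirical estimate of E[Pr[root = 0 | L]] for L drawn given root color 0.
estimate = sum(eta[0] for eta in pops[0]) / len(pops[0])
print(estimate)
```

The normalization never divides by zero here: every sample in the population for color c puts positive mass on c, and the children's colors differ from the new root color g, so coordinate g of f is positive.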
Recursions on the tree depth
Conditional recursion: $Q^G_n$ random $q$-dim vector, $Q^G_n(R) := \Pr[R \mid L]$, where $L \sim L_G(n)$. $\Pr[Q^G_{n+1} = \eta] = \frac{1}{(q-1)^d} \sum_{c_1,\dots,c_d \in \{1,\dots,q\}\setminus G} \Pr\big[f(Q^{c_1}_n, \dots, Q^{c_d}_n) \propto \eta\big]$
Unconditional recursion: $Q_n$ random $q$-dim vector, $Q_n(R) := \Pr[R \mid L]$, where $L$ is a random boundary. $\Pr[Q_{n+1} = \eta] \propto E\big[\, \|f(Q_n^{(1)}, \dots, Q_n^{(d)})\| \cdot \mathrm{Ind}[f(Q_n^{(1)}, \dots, Q_n^{(d)}) \propto \eta] \,\big]$
Discrete surveys algorithm. $\Pr[Q_{n+1} = \eta] \propto E\big[\, \|f(Q_n^{(1)}, \dots, Q_n^{(d)})\| \cdot \mathrm{Ind}[f(Q_n^{(1)}, \dots, Q_n^{(d)}) \propto \eta] \,\big]$
Keep a "survey" of the distribution of $Q_n$.
Generate the survey of $Q_{n+1}$ by applying the recursion to the survey of $Q_n$.
[Figure: the distribution of $Q_n$ and a survey of $Q_n$, with axes labeled R and G.]
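Definition of a discrete survey. Let $\mathcal{P}$ be the space of $q$-dim probability vectors. Let $S = (S_1, \dots, S_k) \subset \mathcal{P}$, with convex hull $\mathrm{conv}(S)$. Let $\sigma_1, \dots, \sigma_k$ be functions $\sigma_i : \mathrm{conv}(S) \to [0,1]$ s.t. for every $\eta \in \mathrm{conv}(S)$:
1. $\sum_i \sigma_i(\eta) = 1$
2. $\eta = \sum_i \sigma_i(\eta)\, S_i$
(the $\sigma_i$ define a convex decomposition of $\eta$).
Let $P$ be a random element of $\mathcal{P}$ with support in $\mathrm{conv}(S)$. Let $C$ be a random element of $S$ with $\Pr[C = S_i] = E[\sigma_i(P)]$. Then we say that $C$ is a survey of $P$ on the skeleton $(S, \sigma_1, \dots, \sigma_k)$.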
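A toy instance of the definition, assuming q = 2 so that a probability vector (p, 1 − p) is represented by the single number p. The skeleton is a grid on [0, 1] and the decomposition functions are piecewise-linear "hat" weights between neighboring grid points; this particular skeleton, the sample values and all names are illustrative choices, not the ones used in the talk.

```python
def hat_weights(p, grid):
    """Convex decomposition of p over a sorted grid: weights on the two
    neighboring grid points, zero elsewhere. They sum to 1 and average to p."""
    weights = [0.0] * len(grid)
    for i in range(len(grid) - 1):
        lo, hi = grid[i], grid[i + 1]
        if lo <= p <= hi:
            t = (p - lo) / (hi - lo)
            weights[i], weights[i + 1] = 1.0 - t, t
            return weights
    raise ValueError("p lies outside the convex hull of the skeleton")

def survey(samples, grid):
    """Pr[C = S_i] = E[sigma_i(P)], with P uniform over the given samples."""
    probs = [0.0] * len(grid)
    for p in samples:
        for i, w in enumerate(hat_weights(p, grid)):
            probs[i] += w / len(samples)
    return probs

grid = [0.0, 0.25, 0.5, 0.75, 1.0]   # skeleton S_1, ..., S_5
samples = [0.1, 0.35, 0.8]           # empirical law of P
probs = survey(samples, grid)

def g(p):
    """A convex test function."""
    return (p - 0.5) ** 2

mean_P = sum(samples) / len(samples)
mean_C = sum(w * s for w, s in zip(probs, grid))
E_g_P = sum(g(p) for p in samples) / len(samples)
E_g_C = sum(w * g(s) for w, s in zip(probs, grid))
print(mean_P, mean_C)   # the survey preserves the mean
print(E_g_P, E_g_C)     # and can only increase the expectation of a convex g
```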
Properties of discrete surveys
–Transitivity: If $C$ is a survey of $P$ and $D$ is a survey of $C$, then $D$ is a survey of $P$.
–Mixing: If $C_1$, $C_2$ are surveys of $P_1$ and $P_2$ respectively, then the r.v. with distribution the mixture $p\,C_1 + (1-p)\,C_2$ is a survey of the r.v. with distribution $p\,P_1 + (1-p)\,P_2$.
–For any multi-affine function $f : \mathcal{P}^d \to \mathcal{P}$, if $C_1, \dots, C_d$ are surveys of $P_1, \dots, P_d$, then the r.v. $D$ defined by $\Pr[D = \eta] \propto E\big[\, \|f(C_1, \dots, C_d)\| \times \mathrm{Ind}[f(C_1, \dots, C_d) \propto \eta] \,\big]$ is a survey of the r.v. $Q$ defined by $\Pr[Q = \eta] \propto E\big[\, \|f(P_1, \dots, P_d)\| \times \mathrm{Ind}[f(P_1, \dots, P_d) \propto \eta] \,\big]$.
–For a convex function $g$ on $\mathcal{P}$, if $C$ is a survey of $P$ then $E[g(P)] \le E[g(C)]$.
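The last property follows from the definition in one line. Writing $S_1, \dots, S_k$ for the skeleton points and $\sigma_i$ for the decomposition functions (symbol names assumed here), the argument is Jensen's inequality applied to the convex decomposition of $P$:

```latex
\begin{align*}
E[g(P)]
  &= E\Big[ g\Big( \sum_{i=1}^{k} \sigma_i(P)\, S_i \Big) \Big]
     && \text{since } P = \sum_i \sigma_i(P)\, S_i \\
  &\le E\Big[ \sum_{i=1}^{k} \sigma_i(P)\, g(S_i) \Big]
     && \text{convexity of } g,\ \sigma_i(P) \ge 0,\ \sum_i \sigma_i(P) = 1 \\
  &= \sum_{i=1}^{k} E[\sigma_i(P)]\, g(S_i)
   = \sum_{i=1}^{k} \Pr[C = S_i]\, g(S_i)
   = E[g(C)].
\end{align*}
```

This is what makes a survey useful for upper bounds: a convex quantity evaluated on the survey can only overestimate the same quantity for the true distribution.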
Manual part of the algorithm
Selection of skeletons of small size $k$
–complexity of the algorithm: $O(nk^d)$
–$k$ generally needs to be exponential in $q$
–the skeleton can be refined progressively
Examples on the Potts model (parameter $\lambda$):
–$q=3$, $d=3$, $\lambda=0$: $n=14$, $k=19$ was enough
–$q=3$, $d=2$, $\lambda=0.79$: $n<100$, $k\le 208$ was enough
–$q=3$, $d=3$, $\lambda=0.7$: $n<100$, $k\le 85$ was enough
–$q=3$, $d=2$ or $3$, $\lambda=0.74$: $n<100$, $k\le 61$ was enough
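A proof that small $d_{TV}$ implies $d_{TV} \to 0$. There is no general strategy. For the Potts model, due to [Sly '09]: with $x_n := E_{L \sim L_R(n)}\big[\Pr[R \mid L] - 1/q\big] = q\, E_L\big[(\Pr[R \mid L] - 1/q)^2\big]$ we have $x_{n+1} \le d\lambda_2^2\, x_n + c_2(q,d,\lambda)\, x_n^2 + \dots + c_d(q,d,\lambda)\, x_n^d$, where $\lambda$ is the Potts parameter and $\lambda_2 = (q\lambda - 1)/(q-1)$ is the second eigenvalue of the Potts channel. Thus, when $d\lambda_2^2 < 1$, we can find $\delta > 0$ and $c < 1$ such that if $x_n < \delta$ then $x_{n+1} < c\, x_n$.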
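The last step can be made explicit. The sketch below writes $a$ for the coefficient of the linear term and $c_k$ for the higher-order coefficients, and assumes $0 \le a < 1$, $0 \le x_n < \delta \le 1$ and $\sum_k |c_k| > 0$; the particular choice of $\delta$ is one convenient option, not necessarily the one used in the original argument.

```latex
\begin{align*}
x_{n+1}
  &\le a\, x_n + \sum_{k=2}^{d} c_k\, x_n^{k}
   \le \Big( a + \sum_{k=2}^{d} |c_k|\, \delta^{k-1} \Big) x_n
   \le \Big( a + \delta \sum_{k=2}^{d} |c_k| \Big) x_n , \\
\delta &:= \min\Big\{ 1,\ \frac{1-a}{2 \sum_{k=2}^{d} |c_k|} \Big\}
  \quad\Longrightarrow\quad
  x_{n+1} \le \frac{1+a}{2}\, x_n .
\end{align*}
```

With $c := (1+a)/2 < 1$, once $x_n$ drops below $\delta$ it stays below $\delta$ and decays geometrically, so $x_n \to 0$.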
Bound of Formentin and Külske '09. $\alpha$: stationary vector of the positive matrix $M$. $S(p \mid \alpha) := \sum_i p(i) \log\big( p(i)/\alpha(i) \big)$. $L(p) := S(p \mid \alpha) + S(\alpha \mid p)$. $M^{rev}(i,j) := \alpha(j)\, M(j,i)/\alpha(i)$. $c(M) := \sup_p L(pM^{rev})/L(p)$. Theorem: if $E[d]\, c(M) < 1$ then there is no reconstruction.
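The quantities in this bound are easy to evaluate for a single probability vector p. The sketch below computes the ratio L(p M^rev)/L(p) for one strictly positive p; since c(M) is a supremum over all p, any such ratio is only a lower estimate of c(M) and does not by itself certify non-reconstruction. The function names and the example vector are assumptions of this example.

```python
from math import log

def relative_entropy(p, a):
    """S(p | a) = sum_i p(i) log(p(i) / a(i)); a must be strictly positive."""
    return sum(pi * log(pi / ai) for pi, ai in zip(p, a) if pi > 0)

def sym_divergence(p, alpha):
    """L(p) = S(p | alpha) + S(alpha | p); p must be strictly positive."""
    return relative_entropy(p, alpha) + relative_entropy(alpha, p)

def reversed_chain(M, alpha):
    """M_rev(i, j) = alpha(j) M(j, i) / alpha(i)."""
    q = len(M)
    return [[alpha[j] * M[j][i] / alpha[i] for j in range(q)] for i in range(q)]

def row_times_matrix(p, M):
    """Row vector p times matrix M."""
    q = len(M)
    return [sum(p[i] * M[i][j] for i in range(q)) for j in range(q)]

def contraction_ratio(p, M, alpha):
    """L(p M_rev) / L(p) for one p different from alpha."""
    M_rev = reversed_chain(M, alpha)
    return sym_divergence(row_times_matrix(p, M_rev), alpha) / sym_divergence(p, alpha)

# 3-coloring channel: symmetric, so alpha is uniform and M_rev equals M.
M = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
alpha = [1.0 / 3] * 3
print(contraction_ratio([0.5, 0.3, 0.2], M, alpha))
```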
Important Questions For the Potts model better bounds were obtained by [Formentin-Külske ‘09]. Could they be tight? Can their method be generalized to models with hard constraints? The design of the Survey Propagation algorithm also includes a discretization step - could this step be done in a controlled manner too? How are reconstruction on trees and clustering of solutions related?
[Mezard, Montanari `05] The dynamical transition at d corresponds to the phase transition for reconstruction on the tree.
[Allan Sly `08] For $q$-coloring: $d_r \le q(\log q + \log\log q + 1 + o(1))$ (also [Zdeborova, Krzakala '07]) and $d_r \ge q(\log q + \log\log q + 1 - \ln 2 - o(1))$. For constant $q$ it is open. Estimates can be obtained with population dynamics.
What phenomena on the tree are described by the other transitions? Can we make population dynamics rigorous, or find rigorous approximations for it? (it would imply that is an upper bound on the threshold, by the results of [Franz, Leone `03] and [Talagrand Panchenko `03]) About clustering: are "phase" and "cluster" really the same thing?