Proving Non-Reconstruction on Trees by an Iterative Algorithm
Elitza Maneva, University of Barcelona. Joint work with N. Bhatnagar, Hebrew University.

[Figure: colors broadcast down a tree; given the colors at the bottom boundary, what is the color "?" at the root?]

Optimal Algorithm for Reconstruction: Belief Propagation computes the distribution at the root given the boundary

Random variable $L_R(n)$: colors at the vertices at level $n$. For what degree is $\lim_{n\to\infty} E\big[\Pr[R \text{ at the root} \mid L_R(n)]\big] = 1/q$?

Random variable $L_R(n)$: colors at the vertices at level $n$. For what degree is $\lim_{n\to\infty} d_{TV}(L_R(n), L_G(n)) = 0$? Total variation distance: $d_{TV}(\mu,\nu) = \tfrac{1}{2}\sum_{L \in D} |\mu(L) - \nu(L)|$.
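For concreteness, a minimal sketch of this distance for distributions over a finite set of boundary configurations (the dictionary representation and names here are illustrative, not from the talk):

```python
# Total variation distance d_TV(mu, nu) = 1/2 * sum_L |mu(L) - nu(L)|
# for two distributions given as dicts mapping outcome -> probability.
def d_tv(mu, nu):
    support = set(mu) | set(nu)
    return 0.5 * sum(abs(mu.get(L, 0.0) - nu.get(L, 0.0)) for L in support)

# Example: two distributions over three boundary configurations.
mu = {"LLL": 0.5, "LLR": 0.3, "LRR": 0.2}
nu = {"LLL": 0.4, "LLR": 0.4, "LRR": 0.2}
print(d_tv(mu, nu))  # 0.1
```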

Interest in reconstruction (in chronological order):
– Probability: extremality of the free-boundary Gibbs measure (since the 1960s)
– Phylogeny: reconstructing the ancestry tree of a collection of species
– Physics: replica symmetry breaking (dynamical transition) in spin glasses
– Computer Science: Glauber dynamics, MCMC, message-passing algorithms

Space of solutions of random Constraint Satisfaction Problems: $n$ variables, $dn$ constraints are chosen at random. [Figure: as $d$ grows from $0$ the problem passes from Easy to Hard to Unsat, with thresholds $d_r$ and $d_s$.]

Tree: $\lim_{n\to\infty} d_{TV}(L_R(n), L_G(n)) = 0$
Random graph (conjecture): $\lim_{n\to\infty} d_{TV}(L_R(n), L_G(n)) > 0$

Other models
Potts model: $q$ colors, parameter $\lambda \in [0,1]$; the color of a child node = same as the parent's color with prob. $\lambda$, every other color with prob. $(1-\lambda)/(q-1)$.
Asymmetric channels: $q$ colors, $q \times q$ matrix $M$ with $M_{i,j} = \Pr[\text{child gets color } j \mid \text{parent has color } i]$.
The same questions make sense on Galton-Watson trees (i.e. random degrees).
3-SAT: [Figure: the factor graph of a formula on variables $x_1, \dots, x_5$.]
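The Potts channel above is easy to simulate directly; the following is a minimal sketch, assuming the parametrization just stated (the names q, d, lam and the level-by-level representation are my own):

```python
import random

def broadcast_level(colors, q, d, lam):
    """Given the colors at one level, generate the colors of the children one level down."""
    nxt = []
    for c in colors:
        for _ in range(d):
            if random.random() < lam:
                nxt.append(c)  # child keeps the parent's color with prob. lam
            else:
                nxt.append(random.choice([j for j in range(q) if j != c]))
    return nxt

def broadcast(q, d, lam, depth, root_color=0):
    level = [root_color]
    for _ in range(depth):
        level = broadcast_level(level, q, d, lam)
    return level  # a sample of the boundary L_{root_color}(depth)

leaves = broadcast(q=3, d=3, lam=0.7, depth=5)
```

Setting lam = 0 forbids repeating the parent's color, which recovers the coloring model.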

RSB, clustering of solutions and reconstruction: highlights
[Mezard-Parisi-Zecchina '03] Survey Propagation algorithm and satisfiability threshold calculation for 3-SAT and coloring, based on the Replica Symmetry Breaking ansatz.
[Mezard-Montanari '06] The dynamic replica symmetry breaking threshold is the same as the threshold for reconstruction on the tree.
[Achlioptas-Coja-Oghlan '08] For sufficiently large $q$, there exists a sequence $\epsilon_q \to 0$ s.t. the space of colorings is clustered for random graphs of average degree $(1+\epsilon_q)\, q \log q < d < (2-\epsilon_q)\, q \log q$.
[Sly '08] For sufficiently large $q$: $q(\log q + \log\log q + 1 - \ln 2 - o(1)) < d_r < q(\log q + \log\log q + 1 + o(1))$

Easy lower bound by coupling: $d_{\text{reconstruction}} > q - 1$

Upper bound: boundary forcing the root
$d_{\text{reconstruction}} \le \mathrm{soln}\Big( \sum_{j=0}^{q-1} (-1)^j \binom{q-1}{j} \frac{(q-1-jx)^d}{(q-1)^d} = x \Big)$
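On my reading of this slide, $x$ is the probability that the boundary at depth $n$ forces the root's color, $x = 0$ is always a fixed point, and reconstruction by forcing becomes possible once a nonzero fixed point appears. A sketch that scans for the smallest such degree under that reading (all names are mine):

```python
from math import comb

def g(x, q, d):
    """One application of the fixed-point map from the slide."""
    return sum((-1) ** j * comb(q - 1, j) * ((q - 1 - j * x) / (q - 1)) ** d
               for j in range(q))  # j = 0, ..., q-1

def has_nonzero_fixed_point(q, d, iters=2000, tol=1e-12):
    x = 1.0                      # start from "boundary always forces the root"
    for _ in range(iters):
        x = g(x, q, d)
    return x > tol

q = 5
d_force = min(d for d in range(2, 200) if has_nonzero_fixed_point(q, d))
print(d_force)  # smallest degree at which the boundary can force the root
```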

The bounds for coloring
[Zdeborova-Krzakala '07]
– Heuristic algorithm for computing the threshold for general models
– Heuristic analysis predicted the asymptotic result of Sly [Sly '08]
[Bhatnagar-Vera-Vigoda '08] For $q$ sufficiently large:
– Hard step (needs large $q$): after sufficiently many iterations $d_{TV} < 2/q$.
– Easy step: given that $d_{TV} < 2/q$, $d_{TV}$ goes to 0.
[Bhatnagar-Maneva '09]
– Rigorous algorithm for obtaining upper bounds on $d_{TV}$ for general models
– Concrete bounds on the threshold for the Potts model with small $q$.
[Table: lower and upper bounds on the threshold for small $q$, compared with the heuristic values of [ZK07].]

Here we need: a recursion on the distribution over the possible distributions at the root, when the boundary is chosen at random. (BP is a recursion on the distribution at the root given a fixed boundary.)

Some notation: $f : (\mathbb{R}^q)^d \to \mathbb{R}^q$ is defined by $f_i(\nu_1, \nu_2, \dots, \nu_d) = \prod_{t=1}^{d} \big( \sum_{j \neq i} \nu_t^j \big)$, and $\|\nu\| = \sum_i \nu^i$.
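A direct NumPy transcription of this notation, reused by the sketches below (an illustration, not the authors' code):

```python
import numpy as np

def f(nus):
    """f_i(nu_1, ..., nu_d) = prod_t sum_{j != i} nu_t^j; nus is a d x q array of messages."""
    nus = np.asarray(nus, dtype=float)
    totals = nus.sum(axis=1, keepdims=True)  # ||nu_t|| for each child t
    return np.prod(totals - nus, axis=0)     # entry i: product over children of the mass off color i

def normalize(v):
    return v / v.sum()  # v / ||v||
```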

Recursion on the tree depth
$Q_G^n$: a random $q$-dim vector with $Q_G^n(R) := \Pr[R \mid L]$, where $L \sim L_G(n)$.
$\Pr[Q_G^{n+1} = \sigma] = \sum_{c_1,\dots,c_d \in \{1,\dots,q\} \setminus G} \frac{1}{(q-1)^d} \Pr\big[ f(Q_{c_1}^n, \dots, Q_{c_d}^n) \propto \sigma \big]$

Population dynamics
$\Pr[Q_G^{n+1} = \sigma] = \sum_{c_1,\dots,c_d \in \{1,\dots,q\} \setminus G} \frac{1}{(q-1)^d} \Pr\big[ f(Q_{c_1}^n, \dots, Q_{c_d}^n) \propto \sigma \big]$
Keep "populations" of $N$ samples each from the distributions of $Q_R^n, Q_G^n$, etc. Generating the population of $Q_G^{n+1}$:
– Choose $d$ colors $c_1, \dots, c_d$ from $\{1, \dots, q\} \setminus G$ independently
– Choose $\nu_1, \dots, \nu_d$ at random from the populations for $Q_{c_1}^n, \dots, Q_{c_d}^n$ respectively
– Save $f(\nu_1, \dots, \nu_d)/\|f(\nu_1, \dots, \nu_d)\|$ into the population for $Q_G^{n+1}$
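A sketch of one update step as just described, reusing f and normalize from the previous snippet; pops[c] is a list of N sample vectors from the distribution of $Q_c^n$ (the data layout is my own choice):

```python
import random

def population_step(pops, G, q, d, N):
    """Build the population for Q_G^{n+1} from the level-n populations."""
    other_colors = [c for c in range(q) if c != G]
    new_pop = []
    for _ in range(N):
        cs = [random.choice(other_colors) for _ in range(d)]  # colors of the d children
        nus = [random.choice(pops[c]) for c in cs]            # one sample per child
        new_pop.append(normalize(f(nus)))                     # f(...)/||f(...)||
    return new_pop
```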

Recursions on the tree depth
Conditional recursion: $Q_G^n$ is a random $q$-dim vector with $Q_G^n(R) := \Pr[R \mid L]$, where $L \sim L_G(n)$:
$\Pr[Q_G^{n+1} = \sigma] = \sum_{c_1,\dots,c_d \in \{1,\dots,q\} \setminus G} \frac{1}{(q-1)^d} \Pr\big[ f(Q_{c_1}^n, \dots, Q_{c_d}^n) \propto \sigma \big]$
Unconditional recursion: $Q^n$ is a random $q$-dim vector with $Q^n(R) := \Pr[R \mid L]$, where $L$ is a random boundary:
$\Pr[Q^{n+1} = \sigma] \propto E\big[ \|f(Q^n_{(1)}, \dots, Q^n_{(d)})\| \cdot \mathrm{Ind}[ f(Q^n_{(1)}, \dots, Q^n_{(d)}) \propto \sigma ] \big]$

Discrete surveys algorithm
$\Pr[Q^{n+1} = \sigma] \propto E\big[ \|f(Q^n_{(1)}, \dots, Q^n_{(d)})\| \cdot \mathrm{Ind}[ f(Q^n_{(1)}, \dots, Q^n_{(d)}) \propto \sigma ] \big]$
Keep a "survey" of the distribution of $Q^n$; generate the survey of $Q^{n+1}$ by applying the recursion to the survey of $Q^n$. [Figure: the distribution of $Q^n$ over the simplex with corners R, G, and the discrete survey approximating it.]

Definition of a discrete survey
Let $\mathcal{P}$ be the space of $q$-dim probability vectors. Let $S = (S_1, \dots, S_k) \subset \mathcal{P}$, with convex hull $\Delta$. Let $\gamma_1, \dots, \gamma_k$ be functions $\gamma_i : \Delta \to [0,1]$ s.t. for every $\sigma \in \Delta$:
1. $\sum_i \gamma_i(\sigma) = 1$
2. $\sigma = \sum_i \gamma_i(\sigma) S_i$ (the $\gamma_i$ define a convex decomposition of $\sigma$).
Let $P$ be a random element of $\mathcal{P}$ with support in $\Delta$. Let $C$ be a random element of $S$ with $\Pr[C = S_i] = E[\gamma_i(P)]$. Then we say that $C$ is a survey of $P$ on the skeleton $(S, \gamma_1, \dots, \gamma_k)$.
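As a concrete illustration: for $q = 2$ a probability vector is determined by its first coordinate $p \in [0,1]$, so a skeleton can be a grid on $[0,1]$, with $\gamma$ splitting each point between its two neighbouring grid points. This is one possible convex decomposition, not necessarily the one used in the talk:

```python
import numpy as np

skeleton = np.linspace(0.0, 1.0, 5)  # S_1, ..., S_k with k = 5

def gamma(p):
    """Convex decomposition of p between its two neighbouring skeleton points."""
    w = np.zeros(len(skeleton))
    i = np.searchsorted(skeleton, p)
    if i == 0:
        w[0] = 1.0
    else:
        lo, hi = skeleton[i - 1], skeleton[i]
        w[i - 1] = (hi - p) / (hi - lo)
        w[i] = (p - lo) / (hi - lo)
    return w  # sums to 1 and satisfies sum_i w[i] * skeleton[i] == p

def survey(samples):
    """Pr[C = S_i] = E[gamma_i(P)], estimated from samples of P."""
    return sum(gamma(p) for p in samples) / len(samples)

weights = survey([0.12, 0.4, 0.77])  # a distribution over the skeleton points
```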

Properties of discrete surveys
– Transitivity: if $C$ is a survey of $P$ and $D$ is a survey of $C$, then $D$ is a survey of $P$.
– Mixing: if $C_1, C_2$ are surveys of $P_1$ and $P_2$ respectively, then the r.v. with distribution the mixture $p\,C_1 + (1-p)\,C_2$ is a survey of the r.v. with distribution $p\,P_1 + (1-p)\,P_2$.
– For any multi-affine function $f : \mathcal{P}^d \to \mathcal{P}$, if $C_1, \dots, C_d$ are surveys of $P_1, \dots, P_d$, then the r.v. $D$ defined by $\Pr[D = \sigma] \propto E\big[ \|f(C_1,\dots,C_d)\| \cdot \mathrm{Ind}[f(C_1,\dots,C_d) \propto \sigma] \big]$ is a survey of the r.v. $Q$ defined by $\Pr[Q = \sigma] \propto E\big[ \|f(P_1,\dots,P_d)\| \cdot \mathrm{Ind}[f(P_1,\dots,P_d) \propto \sigma] \big]$.
– For a convex function $g$ on $\mathcal{P}$: if $C$ is a survey of $P$, then $E[g(P)] \le E[g(C)]$.
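The multi-affine property is what licenses the following sketch of one survey-update step for the unconditional recursion: enumerate all $d$-tuples of skeleton points, push each through $f$ (from the notation snippet above), and weight it by $\|f(\cdot)\|$ times the tuple's probability. The final collapse back onto the skeleton would use a $\gamma$-decomposition as in the definition; all names here are mine. The enumeration is the source of the $O(k^d)$ cost per level mentioned on the next slide.

```python
import itertools
import numpy as np

def survey_step(S, w, d):
    """One level of the recursion applied to a survey (S, w): S is a list of
    q-dim skeleton vectors, w their probabilities."""
    atoms, masses = [], []
    for idx in itertools.product(range(len(S)), repeat=d):
        nus = [S[i] for i in idx]
        v = f(nus)                                     # unnormalized vector at the root
        mass = np.prod([w[i] for i in idx]) * v.sum()  # tuple probability times ||f(...)||
        atoms.append(normalize(v))
        masses.append(mass)
    masses = np.array(masses) / sum(masses)
    return atoms, masses  # to be collapsed back onto the skeleton via gamma
```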

Manual part of the algorithm
Selection of skeletons of small size $k$:
– complexity of the algorithm: $O(n k^d)$
– $k$ generally needs to be exponential in $q$
– the skeleton can be refined progressively
Examples on the Potts model:
– $q=3$, $d=3$, $\lambda=0$: $n=14$, $k=19$ was enough
– $q=3$, $d=2$, $\lambda=0.79$: $n<100$, $k \le 208$ was enough
– $q=3$, $d=3$, $\lambda=0.7$: $n<100$, $k \le 85$ was enough
– $q=3$, $d=2$ or $3$, $\lambda=0.74$: $n<100$, $k \le 61$ was enough

A proof that "$d_{TV}$ small" implies $d_{TV} \to 0$
There is no general strategy. For the Potts model, due to [Sly '09], with
$x_n := E_{L \sim L_R(n)}\big[ \Pr[R \mid L] - 1/q \big] = q\, E_L\big[ (\Pr[R \mid L] - 1/q)^2 \big]$,
we have $x_{n+1} \le d \lambda^2 x_n + c_2(q,d,\lambda)\, x_n^2 + \dots + c_d(q,d,\lambda)\, x_n^d$.
Thus we can find $\epsilon > 0$ and $c < 1$ such that if $x_n < \epsilon$ then $x_{n+1} < c\, x_n$.
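A toy numerical check of the contraction logic, with made-up coefficients standing in for $c_j(q,d,\lambda)$, which the slide does not give explicitly:

```python
d, lam = 3, 0.3
cs = [0.5, 0.4]  # hypothetical c_2, c_3

def step(x):
    # x_{n+1} bound: linear term d*lam^2*x plus higher-order terms
    return d * lam ** 2 * x + sum(c * x ** (j + 2) for j, c in enumerate(cs))

# When d*lam^2 < 1, for small enough x the higher-order terms are negligible,
# so step(x) < c*x for some c < 1 and x_n decays geometrically to 0.
x = 0.01
for _ in range(20):
    x = step(x)
print(x)
```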

Bound of Formentin and Külske '09
$\alpha$: stationary vector of a positive matrix $M$
$S(p \mid \alpha) := \sum_i p(i) \log\big( p(i)/\alpha(i) \big)$
$L(p) := S(p \mid \alpha) + S(\alpha \mid p)$
$M_{\mathrm{rev}}(i,j) := \alpha(j)\, M(j,i) / \alpha(i)$
$c(M) := \sup_p L(p M_{\mathrm{rev}}) / L(p)$
Theorem: if $E[d]\, c(M) < 1$ then there is no reconstruction.
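A Monte Carlo sketch of $c(M)$: sample probability vectors $p$ and record the largest observed ratio $L(pM_{\mathrm{rev}})/L(p)$. Random sampling only lower-bounds the supremum, so this is a sanity check rather than a certified bound (the code is my own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def c_of_M(M, n_samples=20_000):
    M = np.asarray(M, dtype=float)
    # Stationary vector alpha: left eigenvector of M for the top eigenvalue.
    vals, vecs = np.linalg.eig(M.T)
    alpha = np.real(vecs[:, np.argmax(np.real(vals))])
    alpha = alpha / alpha.sum()
    M_rev = alpha[None, :] * M.T / alpha[:, None]  # M_rev(i,j) = alpha(j) M(j,i) / alpha(i)

    def L(p):  # symmetrized relative entropy S(p|alpha) + S(alpha|p)
        return np.sum(p * np.log(p / alpha)) + np.sum(alpha * np.log(alpha / p))

    best = 0.0
    for _ in range(n_samples):
        p = rng.dirichlet(np.ones(len(alpha)))  # a random point in the simplex
        best = max(best, L(p @ M_rev) / L(p))
    return best

# Potts-like channel: keep the color w.p. 0.6, switch to either other w.p. 0.2.
M = [[0.6, 0.2, 0.2], [0.2, 0.6, 0.2], [0.2, 0.2, 0.6]]
print(c_of_M(M))  # E[d] * c(M) < 1 would certify non-reconstruction
```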

Important Questions
– For the Potts model better bounds were obtained by [Formentin-Külske '09]. Could they be tight?
– Can their method be generalized to models with hard constraints?
– The design of the Survey Propagation algorithm also includes a discretization step; could this step be done in a controlled manner too?
– How are reconstruction on trees and clustering of solutions related?

[Mezard, Montanari '05] The dynamical transition corresponds to the phase transition for reconstruction on the tree. (?)

[Allan Sly '08] For $q$-coloring:
$d_r \le q(\log q + \log\log q + 1 + o(1))$ (also [Zdeborova, Krzakala '07])
$d_r \ge q(\log q + \log\log q + 1 - \ln 2 - o(1))$
For constant $q$ it is open. Estimates can be obtained with population dynamics.

What phenomena on the tree are described by the other transitions? Can we make population dynamics rigorous, or find rigorous approximations for it? (That would imply that the tree value is an upper bound on the threshold, by the results of [Franz, Leone '03] and [Talagrand, Panchenko '03].) About clustering: are "phase" and "cluster" really the same thing?