Cluster Variation Method for Correlation Function of Probabilistic Model with Loopy Graphical Structure Kazuyuki Tanaka Graduate School of Information Sciences Tohoku University, Sendai 980-8579, Japan kazu@statp.is.tohoku.ac.jp http://www.statp.is.tohoku.ac.jp/~kazu/index-e.html First of all, I am grateful to Profs. Toshiyuki Tanaka, Shiro Ikeda and Welling for giving me the chance to talk at this workshop. My talk is about the cluster variation method for the correlation function of a probabilistic model with a loopy graphical structure. 3 March, 2003 University of Glasgow
Introduction Cluster Variation Method (CVM): Stat. Phys. [R. Kikuchi 1951]; NIPS [J. S. Yedidia et al, 2000], [H. J. Kappen et al, 2001]. Approximate marginal probability in probabilistic model. Linear Response Theory (LRT): MFA + LRT: [H. J. Kappen et al 1998], [T. Tanaka 1998]. Correlation between any pair of nodes. CVM + LRT → General CVM Approximate Formula of Correlation. The cluster variation method is a familiar approximation method in statistical physics. Yedidia et al and Kappen et al proposed applying the cluster variation method to neural computation. The cluster variation method gives the approximate marginal probabilities of a probabilistic model with high accuracy. However, it is difficult to calculate a long-range correlation with the cluster variation method alone. It is well known that a long-range correlation can be obtained by combining the mean-field approximation (MFA) with linear response theory. In this talk, we give a general approximate formula for the correlation between any pair of nodes by applying the cluster variation method to linear response theory.
Linear Response Theory
Linear Response Theory
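The identity behind linear response theory can be checked numerically on a small model. The sketch below is only an illustration under stated assumptions: the nodes take the values -1 and +1, the distribution has the pairwise form P(x) proportional to exp(sum of J_ij x_i x_j + sum of h_i x_i), and the 4-node loop with its couplings J and fields h is hypothetical rather than taken from the slides. It compares the response of each average to a small change of the external field h_j with the exact covariance obtained by enumerating all states.

```python
import itertools
import numpy as np

def exact_moments(J, h):
    """Exact mean and covariance of x in {-1,+1}^n.

    Assumed model: P(x) proportional to exp(0.5 * x.J.x + h.x),
    with J symmetric and zero on the diagonal.
    """
    n = len(h)
    states = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))
    log_w = 0.5 * np.einsum("si,ij,sj->s", states, J, states) + states @ h
    w = np.exp(log_w - log_w.max())          # subtract the max for stability
    p = w / w.sum()
    mean = p @ states
    second = states.T @ (states * p[:, None])
    return mean, second - np.outer(mean, mean)

def response_matrix(J, h, eps=1e-5):
    """Central finite difference of the averages with respect to each field."""
    n = len(h)
    chi = np.zeros((n, n))
    for j in range(n):
        dh = np.zeros(n)
        dh[j] = eps
        m_plus, _ = exact_moments(J, h + dh)
        m_minus, _ = exact_moments(J, h - dh)
        chi[:, j] = (m_plus - m_minus) / (2.0 * eps)
    return chi

# Hypothetical example: four nodes on a single loop.
J = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    J[i, j] = J[j, i] = 0.5
h = np.array([0.1, -0.2, 0.3, 0.0])

mean, cov = exact_moments(J, h)
chi = response_matrix(J, h)
max_gap = float(np.max(np.abs(chi - cov)))
print("largest |response - covariance| entry:", max_gap)
```

The response matrix and the covariance agree up to finite-difference error, which is the relation that lets an approximation for the averages be turned into an approximation for the correlation.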
Final Result This is the final result of the present talk. The correlation function is obtained by calculating the inverse of the matrix G. The matrix G is constructed from short-range correlations. B is the set of basic clusters in the cluster variation method. No basic cluster may be a subcluster of another element of the set of basic clusters. C is the set of the basic clusters and their subclusters. Mu is the Möbius function. The cluster variation method is specified by defining the set of basic clusters. The matrix A is constructed from the beliefs of the probabilistic model P(x) and is calculated by generalized belief propagation.
Basic Cluster and Subcluster Example We denote the set of all nodes by Omega and select the set of basic clusters shown in the middle picture. The subclusters are 2, 3, 4, 5 and 6. The set of the basic clusters and their subclusters is shown in the right picture.
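The construction of the cluster set can be sketched in a few lines. This is a minimal illustration under two assumptions that the slides do not confirm: the subclusters are generated as the nonempty overlaps of the basic clusters, and the Möbius function reduces to the usual Kikuchi counting numbers, c(g) = 1 minus the sum of c(a) over all clusters a that strictly contain g. The 3x3 grid with four 2x2 plaquettes is a hypothetical example, not the picture on the slide.

```python
from itertools import combinations

def cluster_closure(basic):
    """Basic clusters plus all nonempty intersections of them (assumed rule)."""
    clusters = {frozenset(b) for b in basic}
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(clusters), 2):
            overlap = a & b
            if overlap and overlap not in clusters:
                clusters.add(overlap)
                changed = True
    return clusters

def counting_numbers(clusters):
    """Kikuchi counting numbers: c(g) = 1 - sum of c(a) over a strictly containing g."""
    numbers = {}
    # Larger clusters first, so every strict superset is already assigned.
    for g in sorted(clusters, key=len, reverse=True):
        numbers[g] = 1 - sum(c for a, c in numbers.items() if g < a)
    return numbers

# Hypothetical example: nodes 0..8 on a 3x3 grid, four 2x2 plaquettes.
basic = [{0, 1, 3, 4}, {1, 2, 4, 5}, {3, 4, 6, 7}, {4, 5, 7, 8}]
clusters = cluster_closure(basic)
numbers = counting_numbers(clusters)

for g in sorted(clusters, key=lambda s: (-len(s), sorted(s))):
    print(sorted(g), numbers[g])
```

In this example the closure adds the four pairs shared by neighbouring plaquettes and the central node. A useful sanity check is that the counting numbers of all clusters containing a given node sum to one, so each node is counted exactly once.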
Probabilistic Model Joint Probability Distribution Example Each node takes only two states. We consider a probabilistic model whose joint probability distribution is P(x). The function mu is called the Möbius function.
Cluster Variation Method Kullback-Leibler Divergence and Kikuchi Free Energy The probability distribution P(x) minimizes the free energy functional F[P]. In the cluster variation method, the free energy functional F[P] is approximated by a linear combination of the free energy functionals of the basic clusters and their subclusters. The CVM approximate marginal probabilities are determined by minimizing the approximate free energy, with the normalization conditions and the consistency conditions imposed as constraints.
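The relation between the Kullback-Leibler divergence and the free energy can be written out as follows. This is a sketch in assumed notation, not the formula on the slide: we write P(x) = exp(-E(x))/Z, where the energy E and the normalization Z are symbols introduced here, Q is a trial distribution, and c_gamma denotes the usual Kikuchi counting numbers, which the slides express through the Möbius function mu.

```latex
% Assumed notation: P(x) = e^{-E(x)} / Z, trial distribution Q(x).
\begin{align}
D[Q \| P] &= \sum_{x} Q(x) \ln \frac{Q(x)}{P(x)} = F[Q] + \ln Z \;\ge\; 0, \\
F[Q] &= \sum_{x} Q(x) E(x) + \sum_{x} Q(x) \ln Q(x).
\end{align}
% Equality holds only for Q = P, so F[Q] is minimized by P, with F[P] = -\ln Z.
% Kikuchi-type approximation of the entropy term over the cluster set C
% (c_\gamma are assumed to be the standard counting numbers):
\begin{equation}
\sum_{x} Q(x) \ln Q(x) \;\simeq\; \sum_{\gamma \in C} c_{\gamma}
\sum_{x_{\gamma}} Q_{\gamma}(x_{\gamma}) \ln Q_{\gamma}(x_{\gamma}).
\end{equation}
% Constraints: normalization and consistency between a cluster \alpha
% and each of its subclusters \gamma \subset \alpha.
\begin{equation}
\sum_{x_{\gamma}} Q_{\gamma}(x_{\gamma}) = 1, \qquad
Q_{\gamma}(x_{\gamma}) = \sum_{x_{\alpha \setminus \gamma}} Q_{\alpha}(x_{\alpha}).
\end{equation}
```

The Lagrange multipliers attached to the consistency constraints are the quantities that the linear response calculation later expands in.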
Marginal Distribution in CVM Present Probabilistic Model Probabilistic Model with External Field In order to use linear response theory, we have to consider two probabilistic models. One of them is the present probabilistic model. The other is obtained by applying an external field factor to the present probabilistic model. On this slide, we show the CVM approximate marginal probabilities. The lambdas are Lagrange multipliers and are determined so as to satisfy the consistency conditions and a relationship between the Lagrange multipliers.
Linear Response in CVM Expanding the average of $x_{\alpha}$ in powers of the Lagrange multipliers and retaining only the first-order terms, we get the first equation. We regard the first equation as a system of linear equations for the Lagrange multipliers and express its solution as the second equation. By substituting the solution into the relationship between the Lagrange multipliers and by using the consistency relations, we obtain the last equation. We regard the last equation as a system of linear equations for the deviations of the averages.
Correlation Function in CVM By solving the system of linear equations for the deviations of the averages and by applying the linear response formula to the solutions, we obtain the correlation function as the inverse of the matrix G.
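The structure "correlation equals the inverse of a matrix built from local quantities" is easiest to see in the mean-field case cited in the introduction. The sketch below uses the mean-field approximation, not the CVM matrix G of this talk. It assumes nodes with values -1 and +1, the mean-field equations m_i = tanh(h_i + sum of J_ij m_j), and a hypothetical weakly coupled 4-node loop. Differentiating the mean-field equations with respect to the fields gives the matrix G_MF = diag(1/(1 - m_i^2)) - J, whose inverse is the mean-field linear response estimate of the correlation.

```python
import numpy as np

def mean_field_magnetization(J, h, iters=500):
    """Fixed-point iteration of m = tanh(h + J m); assumes weak couplings."""
    m = np.zeros(len(h))
    for _ in range(iters):
        m = np.tanh(h + J @ m)
    return m

def mean_field_correlation(J, h):
    """Mean-field linear response: correlation ~ inverse of diag(1/(1-m^2)) - J."""
    m = mean_field_magnetization(J, h)
    G_mf = np.diag(1.0 / (1.0 - m**2)) - J
    return np.linalg.inv(G_mf), m

# Hypothetical example: four nodes on a single loop with weak couplings.
J = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    J[i, j] = J[j, i] = 0.2
h = np.array([0.1, -0.2, 0.3, 0.0])

chi, m = mean_field_correlation(J, h)

# Cross-check: numerically differentiate the mean-field averages.
eps = 1e-5
chi_fd = np.zeros_like(chi)
for j in range(len(h)):
    dh = np.zeros(len(h))
    dh[j] = eps
    m_plus = mean_field_magnetization(J, h + dh)
    m_minus = mean_field_magnetization(J, h - dh)
    chi_fd[:, j] = (m_plus - m_minus) / (2.0 * eps)

mf_gap = float(np.max(np.abs(chi - chi_fd)))
print("largest |inverse matrix - finite difference| entry:", mf_gap)
```

The inverse matrix is nonzero for every pair of nodes, including pairs that are not directly coupled, which is how linear response yields long-range correlations from purely local inputs.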
Numerical Experiments Cluster Variation Method Exact We present some numerical experiments. We compute some correlations by means of the cluster variation method and compare the CVM results with the exact ones. In this example, the correlation between every pair of nodes is obtained with high accuracy.
Conclusions Cluster Variation Method + Linear Response Theory → General CVM Approximate Formula for Correlation. Extension of [H. J. Kappen et al 1998] and [T. Tanaka 1998]. Other Related Previous Work: CVM + LRT → General CVM Approximate Formula for Fourier Transform of Correlation of Probabilistic Model on Regular Lattice [K. Tanaka, T. Horiguchi and T. Morita 1991]. In this talk, we have given a general formula for the correlation by combining the cluster variation method with linear response theory. The framework is an extension of Kappen et al 1998 and Tanaka 1998. In earlier related work, we gave a general CVM approximate formula for the Fourier transform of the correlation of a probabilistic model on any regular lattice.