An Introduction to Variational Methods for Graphical Models


An Introduction to Variational Methods for Graphical Models. Michael I. Jordan, Zoubin Ghahramani, Tommi S. Jaakkola, and Lawrence K. Saul. Presenter: 邱炫盛, NTNU Speech Lab.

Outline: Introduction; Exact Inference; Basics of Variational Methodology; …

Introduction. The problem of probabilistic inference in graphical models is the problem of computing a conditional probability distribution over the hidden (query) nodes given the values of the observed (evidence) nodes.

Exact Inference: the junction tree algorithm. The algorithm operates on graphical models via two preprocessing steps, moralization and triangulation. Graphical models come in two kinds: directed (and acyclic) Bayesian networks, specified by local conditional probabilities, and undirected Markov random fields, specified by potentials on the cliques.

Exact Inference: directed graphical models. A directed graphical model is specified numerically by associating a local conditional probability with each node in the graph: the probability of the node given the values of its parents.

Exact Inference: directed graph. Joint probability: the product of the local conditional probabilities.
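
The factorization the slide points to is the standard one for directed models; writing S_i for the node variables and π_i for the indices of the parents of node i (the paper's notation, reproduced here rather than read off the slide):

```latex
P(S) \;=\; \prod_{i} P\bigl(S_i \mid S_{\pi_i}\bigr)
```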

Exact Inference: undirected graphical models. An undirected graphical model is specified numerically by associating "potentials" with the cliques of the graph. A potential is a function on the set of configurations of a clique (that is, a setting of values for all of the nodes in the clique). A clique is a (maximal) complete subgraph.

Exact Inference: undirected graph. Joint probability: the normalized product of the clique potentials, where the normalizer is the partition function.
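
The corresponding standard formula for undirected models, with ψ_C denoting the potential on clique C and Z the partition function (again the standard definition rather than a formula quoted from the slide):

```latex
P(S) \;=\; \frac{1}{Z} \prod_{C} \psi_{C}(S_C),
\qquad
Z \;=\; \sum_{S} \prod_{C} \psi_{C}(S_C)
```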

Exact Inference. The junction tree algorithm compiles directed graphical models into undirected graphical models. The first step, moralization, converts the directed graph into an undirected graph (and is skipped when the input is already undirected). Because the variables in each local conditional probability do not always appear together within a clique, we "marry" the parents of every node with undirected edges and then drop the arrows, producing the moral graph.
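
A minimal Python sketch of the moralization step, assuming a toy graph representation (a dict mapping each node to the set of its parents); this is only an illustration of the idea, not code from the presentation:

```python
def moralize(parents):
    """Return the moral graph of a DAG as an undirected adjacency dict.

    `parents` maps each node to the set of its parent nodes.
    Moralization "marries" the parents of every node (connects them
    pairwise) and then drops the edge directions.
    """
    nodes = set(parents) | {p for ps in parents.values() for p in ps}
    undirected = {v: set() for v in nodes}

    for child, ps in parents.items():
        ps = list(ps)
        # Drop the arrows: connect each parent to its child.
        for p in ps:
            undirected[child].add(p)
            undirected[p].add(child)
        # Marry the parents: connect them pairwise.
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                undirected[ps[i]].add(ps[j])
                undirected[ps[j]].add(ps[i])
    return undirected


# Example: the v-structure A -> C <- B becomes the triangle A-B-C.
print(moralize({"C": {"A", "B"}}))
```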

Exact Inference: triangulation. Triangulation takes a moral graph as input and produces as output an undirected graph in which additional edges have (possibly) been added; these extra edges are what allow the recursive calculation. A graph is not triangulated if it contains a cycle of four or more nodes with no chord, where a chord is an edge between non-neighboring nodes on the cycle.
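
A small sketch of triangulation by node elimination: when a node is eliminated, fill-in edges are added so that its remaining neighbors form a clique. The elimination ordering is supplied by the caller and is an assumption of this illustration (choosing a good ordering is itself a hard problem):

```python
def triangulate(adj, order):
    """Triangulate an undirected graph (adjacency dict of sets) by
    eliminating nodes in `order`, adding fill-in edges among each
    eliminated node's not-yet-eliminated neighbors."""
    adj = {v: set(ns) for v, ns in adj.items()}   # working copy
    fill = {v: set(ns) for v, ns in adj.items()}  # triangulated result
    eliminated = set()

    for v in order:
        neighbors = [u for u in adj[v] if u not in eliminated]
        # Connect the remaining neighbors of v pairwise (fill-in edges).
        for i in range(len(neighbors)):
            for j in range(i + 1, len(neighbors)):
                a, b = neighbors[i], neighbors[j]
                adj[a].add(b); adj[b].add(a)
                fill[a].add(b); fill[b].add(a)
        eliminated.add(v)
    return fill


# Example: the 4-cycle A-B-C-D gains the chord B-D when A is eliminated first.
cycle = {"A": {"B", "D"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"A", "C"}}
print(triangulate(cycle, ["A", "B", "C", "D"]))
```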

Exact Inference: the 4-cycle example. Triangulating the 4-cycle produces the cliques ABD and BCD, which share the separator BD.

Exact Inference. Once a graph has been triangulated, it is possible to arrange the cliques of the graph into a data structure known as a junction tree, which satisfies the running intersection property: if a node appears in any two cliques in the tree, it appears in all cliques that lie on the path between those two cliques (so the cliques assign the same marginal probability to the nodes they have in common). Because of the running intersection property, local consistency implies global consistency in a junction tree.
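
A sketch of a check for the running intersection property on a candidate clique tree, using an illustrative encoding (cliques as frozensets, tree edges as index pairs) that is not part of the original slides:

```python
from itertools import combinations

def has_running_intersection(cliques, tree_edges):
    """True if, for every pair of cliques, their common nodes appear in
    every clique on the (unique) tree path between them.
    Assumes `tree_edges` connects the clique indices into a tree."""
    adj = {i: set() for i in range(len(cliques))}
    for a, b in tree_edges:
        adj[a].add(b); adj[b].add(a)

    def path(src, dst, seen=None):
        seen = seen or {src}
        if src == dst:
            return [src]
        for nxt in adj[src] - seen:
            rest = path(nxt, dst, seen | {nxt})
            if rest:
                return [src] + rest
        return None

    for i, j in combinations(range(len(cliques)), 2):
        shared = cliques[i] & cliques[j]
        if not all(shared <= cliques[k] for k in path(i, j)):
            return False
    return True


# The two cliques from the triangulated 4-cycle, joined by separator {B, D}.
cliques = [frozenset("ABD"), frozenset("BCD")]
print(has_running_intersection(cliques, [(0, 1)]))  # True
```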

Exact Inference: the QMR-DT database, a diagnostic aid for internal medicine.

Basics of Variational Methodology. Variational methods are approximation methods that convert a complex problem into a simpler one. The decoupling is achieved via an expansion of the problem to include additional parameters. The terminology "variational" comes from the roots of these techniques in the calculus of variations.

Basics of Variational Methodology: the logarithm example. Here λ is a variational parameter: as λ varies, the family of such lines forms an upper envelope of the logarithm function, so the minimum over these bounds recovers the exact value.
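
The bound the slide alludes to is the standard logarithm example from the paper; equality is attained at λ = 1/x:

```latex
\ln x \;\le\; \lambda x - \ln \lambda - 1
\qquad\Longrightarrow\qquad
\ln x \;=\; \min_{\lambda > 0}\,\bigl\{\lambda x - \ln \lambda - 1\bigr\}
```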

Basics of Variational Methodology.

Basics of Variational Methodology: the logistic regression model. The logistic function is log-concave, so convex duality yields a linear upper bound on its logarithm.

Basics of Variational Methodology. Taking the exponential of both sides then yields the final bound on the logistic function itself.
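
Putting the two slides together, assuming the standard derivation from the paper: the log of the logistic function g(x) = 1/(1 + e^{-x}) is concave, its conjugate is the binary entropy H(λ), and exponentiating the resulting linear bound gives the bound on g itself:

```latex
\ln g(x) \;\le\; \lambda x - H(\lambda),
\qquad
H(\lambda) = -\lambda \ln \lambda - (1 - \lambda)\ln(1 - \lambda),
\qquad
g(x) \;\le\; e^{\lambda x - H(\lambda)}
```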

Basics of Variational Methodology: convex duality. A concave function can be represented via a conjugate or dual function, which yields an upper bound; although the bound is linear in the transformed representation, it can correspond to a non-linear bound on the original function.
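
The conjugate relationships behind the slide, stated for a concave function f (this is the standard convex-duality form used in the paper; any fixed λ gives an upper bound):

```latex
f^{*}(\lambda) \;=\; \min_{x}\bigl\{\lambda^{T} x - f(x)\bigr\},
\qquad
f(x) \;=\; \min_{\lambda}\bigl\{\lambda^{T} x - f^{*}(\lambda)\bigr\}
\;\le\; \lambda^{T} x - f^{*}(\lambda)
```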

Basics of Variational Methodology. To summarize: if the function is already convex or concave, we simply calculate the conjugate function; if it is not, we look for an invertible transformation that renders the function convex or concave.

Basics of Variational Methodology: approximation for joint and conditional probabilities. Consider a directed graph and an upper bound of the variational form above. Let E (the evidence nodes) and H (the hidden nodes) be disjoint, and treat the right-hand side as a function to be minimized with respect to λ. The best global bounds are obtained when the probabilistic dependencies in the distribution are reflected in dependencies in the approximation.
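
One way to write what the slide describes, with P(H, E | λ) standing for the joint in which each transformed conditional has been replaced by its variational upper bound (the specific symbols are illustrative):

```latex
P(E) \;=\; \sum_{\{H\}} P(H, E)
\;\le\; \sum_{\{H\}} P(H, E \mid \lambda),
\qquad
\lambda^{*} \;=\; \arg\min_{\lambda} \sum_{\{H\}} P(H, E \mid \lambda)
```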

Basics of Variational Methodology. First, obtain a lower bound on the likelihood P(E) by fitting the variational parameters. Then substitute these parameters into the parameterized variational form for P(H,E). Finally, utilize that variational form as an efficient inference engine for calculating an approximation to P(H|E).
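
With the fitted parameters λ* substituted back in, the conditional is approximated in the obvious normalized way (again, the notation is illustrative rather than quoted from the slides):

```latex
P(H \mid E) \;\approx\; \frac{P(H, E \mid \lambda^{*})}{\sum_{\{H\}} P(H, E \mid \lambda^{*})}
```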

Basics of Variational Methodology: the sequential approach. Variational transformations are introduced for the nodes in a particular order; the goal is to transform the network until the resulting transformed network is amenable to exact methods. One can begin with the untransformed graph and introduce variational transformations one node at a time, or begin with a completely transformed graph and re-introduce exact conditional probabilities one node at a time.

Basics of Variational Methodology: the QMR-DT network.

Basics of Variational Methodology: the block approach. …