Probabilistic graphical models

Graphical models are "a marriage between probability theory and graph theory" (Michael Jordan, 1998): a compact representation of joint probability distributions.
Graphs:
– Nodes: random variables (each with a probability distribution over a fixed alphabet)
– Edges (arcs), or their absence: conditional independence assumptions

Classification of probabilistic graphical models

|            | Linear                                      | Branching             | Application                   |
|------------|---------------------------------------------|-----------------------|-------------------------------|
| Directed   | Markov chain (HMM)                          | Bayesian network (BN) | AI, statistics                |
| Undirected | Linear-chain conditional random field (CRF) | Markov network (MN)   | Physics (Ising), image/vision |

Both directed and undirected arcs: chain graphs.

Bayesian Network Structure
Directed acyclic graph G
– Nodes X1, …, Xn represent random variables
– G encodes the local Markov assumptions: Xi is independent of its non-descendants given its parents
[Figure: example DAG over nodes A, B, C, D, E, F, G]

A simple example
We want to model whether the grass is wet. The grass can get wet if:
– it rains
– the sprinklers are on
Whether it rains depends on whether it is cloudy, and when it is cloudy we are less likely to turn on the sprinklers.

A Simple Example
Variables: Cloudy (C), Rain (R), Sprinkler (S), GrassWet (W)

| C | R | S | W | Prob. |
|---|---|---|---|-------|
| F | F | F | F | 0.01  |
| F | F | F | T | 0.04  |
| F | F | T | F | 0.05  |
| F | F | T | T | 0.01  |
| F | T | F | F | 0.02  |
| F | T | F | T | 0.07  |
| F | T | T | F | 0.20  |
| F | T | T | T | 0.10  |
| T | F | F | F | 0.01  |
| T | F | F | T | 0.07  |
| T | F | T | F | 0.13  |
| T | F | T | T | 0.04  |
| T | T | F | F | 0.06  |
| T | T | F | T | 0.05  |
| T | T | T | F | 0.10  |
| T | T | T | T | 0.04  |

15 independent parameters (the 16 entries must sum to 1).
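As a sketch of how unwieldy the full joint already is at four variables, the snippet below (illustrative, not from the slides) stores the table as a Python dict and answers a query by summing matching rows.

```python
# Full joint over (C, R, S, W): 16 rows, 15 independent parameters
# (the entries are constrained to sum to 1).
joint = {
    (False, False, False, False): 0.01, (False, False, False, True): 0.04,
    (False, False, True,  False): 0.05, (False, False, True,  True): 0.01,
    (False, True,  False, False): 0.02, (False, True,  False, True): 0.07,
    (False, True,  True,  False): 0.20, (False, True,  True,  True): 0.10,
    (True,  False, False, False): 0.01, (True,  False, False, True): 0.07,
    (True,  False, True,  False): 0.13, (True,  False, True,  True): 0.04,
    (True,  True,  False, False): 0.06, (True,  True,  False, True): 0.05,
    (True,  True,  True,  False): 0.10, (True,  True,  True,  True): 0.04,
}
assert abs(sum(joint.values()) - 1.0) < 1e-9

# Any query is a sum over matching rows, e.g. the marginal P(W = T):
p_wet = sum(p for (c, r, s, w), p in joint.items() if w)
print(f"P(W=T) = {p_wet:.2f}")
```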

Bayesian Network
Conditional probability distribution (CPD) at each node; values T (true), F (false).

P(C, S, R, W) = P(C) · P(S|C) · P(R|C,S) · P(W|C,S,R)
             = P(C) · P(S|C) · P(R|C) · P(W|S,R)

9 independent parameters (1 + 2 + 2 + 4), versus 15 for the full joint.
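For contrast, a minimal sketch of the factored representation. The factorization is the slide's; the specific CPD numbers below are assumed for illustration only.

```python
from itertools import product

# Assumed CPD values (for illustration; not taken from the slides).
p_c = 0.5                                              # P(C=T): 1 parameter
p_s_given_c = {True: 0.1, False: 0.5}                  # P(S=T | C=c): 2 parameters
p_r_given_c = {True: 0.8, False: 0.2}                  # P(R=T | C=c): 2 parameters
p_w_given_sr = {(True, True): 0.99, (True, False): 0.90,
                (False, True): 0.90, (False, False): 0.0}  # P(W=T | s, r): 4 parameters

def bernoulli(p_true, value):
    return p_true if value else 1.0 - p_true

def joint(c, s, r, w):
    """P(C,S,R,W) = P(C) * P(S|C) * P(R|C) * P(W|S,R)."""
    return (bernoulli(p_c, c)
            * bernoulli(p_s_given_c[c], s)
            * bernoulli(p_r_given_c[c], r)
            * bernoulli(p_w_given_sr[(s, r)], w))

# The 9 CPD parameters determine all 16 joint entries, which sum to 1.
total = sum(joint(*v) for v in product([True, False], repeat=4))
assert abs(total - 1.0) < 1e-9
```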

Training a Bayesian network: frequencies
Known: the frequencies Pr(c, s, r, w) for all assignments (c, s, r, w). Each CPD is then a ratio of marginals, e.g. P(s | c) = Pr(c, s) / Pr(c).
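A small sketch of that estimation, assuming the frequencies are given as a dict; to keep it short, the stand-in table covers only (C, S), but the ratio-of-marginals step is the same for every CPD.

```python
# Frequencies would be tallied from data; here a small stand-in over (C, S)
# keeps the example short: freqs[(c, s)] ~ Pr(c, s).
freqs = {(True, True): 0.05, (True, False): 0.45,
         (False, True): 0.25, (False, False): 0.25}

def marginal(table, keep):
    """Sum a frequency table {assignment_tuple: prob} down to the given indices."""
    out = {}
    for assignment, p in table.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out

# A CPD is a ratio of marginals: P(s | c) = Pr(c, s) / Pr(c).
p_c = marginal(freqs, [0])
p_s_given_c = {(c, s): freqs[(c, s)] / p_c[(c,)] for (c, s) in freqs}
print(p_s_given_c[(True, True)])   # P(S=T | C=T) = 0.05 / 0.50 = 0.1
```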

Application: Recommendation Systems
Given user preferences, suggest recommendations (e.g., Amazon.com).
Input: movie preferences of many users.
Solution: model correlations between movie features
– Users who like comedy often like drama
– Users who like action often do not like cartoons
– Users who like Robert De Niro films often like Al Pacino films
Given a user's preferences, we can then predict the probability that new movies match them.

Application: modeling DNA motifs
– Profile model: no dependencies between positions
– Markov model: dependencies between adjacent positions
– Bayesian network model: non-local dependencies

A DNA profile
Aligned instances of the motif:
TATAAA
TATAAT
TATAAA
TATTAA
TTAAAA
TAGAAA
[Figure: independent variables A1, …, A6, each with its own nucleotide distribution θ1, …, θ6 over {T, C, A, G}]
The nucleotide distributions at different sites are independent!
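A sketch of the profile model in code (an assumed implementation, not from the slides): per-site distributions θi are estimated from the aligned sequences, and a new sequence is scored as a product of independent per-site probabilities.

```python
from collections import Counter

# Aligned motif instances from the slide.
seqs = ["TATAAA", "TATAAT", "TATAAA", "TATTAA", "TTAAAA", "TAGAAA"]
L = len(seqs[0])

# theta[i][x] ~ P(A_i = x): each site gets its own independent distribution.
theta = []
for i in range(L):
    counts = Counter(s[i] for s in seqs)
    theta.append({x: counts[x] / len(seqs) for x in "TCAG"})

def profile_prob(seq):
    """P(seq) = prod_i theta_i(seq[i]); sites are treated as independent."""
    p = 1.0
    for i, x in enumerate(seq):
        p *= theta[i].get(x, 0.0)
    return p

print(profile_prob("TATAAA"))
```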

Mixture of profiles model
[Figure: hidden mixture variable Z with children A1, …, A6; each mixture component z has its own site distributions θi^z]
The nucleotide distributions at different sites are conditionally independent given Z, but marginally dependent!
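A hedged sketch of the mixture likelihood: P(seq) = Σz P(z) Πi θi^z(seq[i]). The two components and all parameter values below are invented for illustration; only the model structure comes from the slide.

```python
# Two invented components; each is a profile over 2 sites (kept short).
pi = {"strong": 0.7, "weak": 0.3}   # mixture weights P(Z = z)
theta = {
    "strong": [{"T": 0.9, "A": 0.1}, {"A": 0.8, "T": 0.2}],
    "weak":   [{"T": 0.3, "A": 0.7}, {"A": 0.4, "T": 0.6}],
}

def mixture_prob(seq):
    """P(seq) = sum_z P(z) * prod_i theta_i^z(seq[i]).
    Sites are independent given Z, but become dependent once Z is summed out."""
    total = 0.0
    for z, weight in pi.items():
        p = weight
        for i, x in enumerate(seq):
            p *= theta[z][i].get(x, 0.0)
        total += p
    return total

print(mixture_prob("TA"))   # marginal over the hidden component
```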

Tree model 11 A1A1 22 A2A2 33 A3A3 44 A4A4 55 A5A5 66 A6A6 The nt-distributions at different sites are pairwisely dependent !

Undirected graphical models (e.g., Markov networks)
– Useful when edge directionality cannot be assigned
– Simpler interpretation of structure
– Simpler inference
– Simpler independence structure
– Harder to learn

Markov network
– Nodes correspond to random variables
– Local factor models are attached to sets of nodes: factor entries are positive, do not have to sum to 1, and represent affinities
[Figure: network over A, B, C, D with a factor on each edge]

| A  | C  | φ1[A,C] |
|----|----|---------|
| a0 | c0 | 4       |
| a0 | c1 | 12      |
| a1 | c0 | 2       |
| a1 | c1 | 9       |

| A  | B  | φ2[A,B] |
|----|----|---------|
| a0 | b0 | 30      |
| a0 | b1 | 5       |
| a1 | b0 | 1       |
| a1 | b1 | 10      |

| C  | D  | φ3[C,D] |
|----|----|---------|
| c0 | d0 | 30      |
| c0 | d1 | 5       |
| c1 | d0 | 1       |
| c1 | d1 | 10      |

| B  | D  | φ4[B,D] |
|----|----|---------|
| b0 | d0 | 100     |
| b0 | d1 | 1       |
| b1 | d0 | 1       |
| b1 | d1 | 1000    |

Markov network
Represents the joint distribution:
– Unnormalized measure: P̃(a, b, c, d) = φ1[a, c] · φ2[a, b] · φ3[c, d] · φ4[b, d]
– Partition function: Z = Σ over all (a, b, c, d) of P̃(a, b, c, d)
– Probability: P(a, b, c, d) = P̃(a, b, c, d) / Z
[Figure: the same network over A, B, C, D]
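To make the definitions concrete, a brute-force sketch that plugs in the factor tables from the earlier slide, computes Z by enumerating all assignments, and normalizes.

```python
from itertools import product

# Factor tables from the slide; keys are (value of first var, value of second).
phi1 = {(0, 0): 4,   (0, 1): 12, (1, 0): 2, (1, 1): 9}     # phi1[A, C]
phi2 = {(0, 0): 30,  (0, 1): 5,  (1, 0): 1, (1, 1): 10}    # phi2[A, B]
phi3 = {(0, 0): 30,  (0, 1): 5,  (1, 0): 1, (1, 1): 10}    # phi3[C, D]
phi4 = {(0, 0): 100, (0, 1): 1,  (1, 0): 1, (1, 1): 1000}  # phi4[B, D]

def unnormalized(a, b, c, d):
    """P~(a,b,c,d) = phi1[a,c] * phi2[a,b] * phi3[c,d] * phi4[b,d]."""
    return phi1[(a, c)] * phi2[(a, b)] * phi3[(c, d)] * phi4[(b, d)]

# Partition function: sum the unnormalized measure over all 2^4 assignments.
Z = sum(unnormalized(a, b, c, d) for a, b, c, d in product((0, 1), repeat=4))

def prob(a, b, c, d):
    return unnormalized(a, b, c, d) / Z

print(Z, prob(0, 0, 0, 0))
```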

Markov Network Factors
A factor is a function from value assignments of a set of random variables D to positive real numbers.
– The set of variables D is the scope of the factor
Factors generalize the notion of CPDs
– Every CPD is a factor (with additional normalization constraints)

Markov Network Factors
[Figure: two graphs over A, B, C, D]
– Left (4-cycle A–B–C–D–A): maximal cliques {A,B}, {B,C}, {C,D}, {A,D}
– Right (with the additional edge A–C): maximal cliques {A,B,C}, {A,C,D}

Pairwise Markov networks
A pairwise Markov network over a graph H has:
– A set of node potentials {φ[Xi] : i = 1, …, n}
– A set of edge potentials {φ[Xi, Xj] : (Xi, Xj) ∈ H}
Example: grid-structured Markov network
[Figure: 3×4 grid of variables X11, …, X34, with edges between neighboring nodes]

Application: Image analysis
The image segmentation problem
– Task: partition an image into distinct parts of the scene
– Example: separate water, sky, and background

Markov Network for Segmentation
Grid-structured Markov network; random variable Xi corresponds to pixel i
– Domain is {1, …, K}; the value represents the region assigned to pixel i
– Neighboring pixels are connected in the network
Appearance distribution
– wi^k: extent to which pixel i "fits" region k (e.g., difference from the typical pixel of region k)
– Introduce the node potential exp(−wi^k · 1{Xi = k})
Edge potentials
– Encode a contiguity preference via the edge potential exp(λ · 1{Xi = Xj}) for λ > 0

Markov Network for Segmentation
Solution: inference
– Find the most likely joint assignment to the Xi variables
[Figure: 3×4 grid Markov network; node potentials encode the appearance distribution, edge potentials the contiguity preference]
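A toy end-to-end sketch under stated assumptions: the appearance scores w are random stand-ins, and MAP inference is approximated with iterated conditional modes (ICM), one simple local-search choice; the slides themselves do not fix an inference algorithm.

```python
import random

H, W, K = 3, 4, 2          # 3x4 pixel grid (as in the figure), K=2 regions
LAM = 0.5                  # contiguity strength lambda > 0
random.seed(0)

# w[i][j][k]: how poorly pixel (i,j) "fits" region k (random stand-in scores).
w = [[[random.random() for _ in range(K)] for _ in range(W)] for _ in range(H)]

def neighbors(i, j):
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= i + di < H and 0 <= j + dj < W:
            yield i + di, j + dj

def local_log_score(x, i, j, k):
    """Log of the node potential exp(-w_ij^k) times the adjacent edge
    potentials exp(LAM * 1{x_i = x_j}) when pixel (i,j) takes label k."""
    return -w[i][j][k] + sum(LAM for ni, nj in neighbors(i, j) if x[ni][nj] == k)

# ICM: sweep the grid, moving each pixel to its locally best label, until stable.
x = [[0] * W for _ in range(H)]
changed = True
while changed:
    changed = False
    for i in range(H):
        for j in range(W):
            best = max(range(K), key=lambda k: local_log_score(x, i, j, k))
            if best != x[i][j]:
                x[i][j], changed = best, True

print(x)   # approximate MAP segmentation into K regions
```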