Christopher M. Bishop, Pattern Recognition and Machine Learning 1.

Presentation on theme: "Christopher M. Bishop, Pattern Recognition and Machine Learning 1."— Presentation transcript:

1 Christopher M. Bishop, Pattern Recognition and Machine Learning 1

2 Outline
• Introduction
• Directed Graphs
• Undirected Graphs
• Factor Graphs
• Summary

4 Introduction
• A graph consists of nodes (vertices) connected by edges (links, arcs)
• Graphs provide a simple and clear way to visualize the structure of a probabilistic model
• Complex computations can be expressed in terms of graphical manipulations

5 Probabilistic Graphical Models
• There are two main types: directed and undirected graphical models
• Each node represents a random variable, and the edges represent probabilistic relationships between these variables
[figure: a directed graph and an undirected graph]

6 Outline
• Introduction
• Directed Graphs
• Undirected Graphs
• Factor Graphs
• Summary

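7 Directed Graphical Models
• An example: [figure: directed graph over nodes a, b, c]
• Definition: for a graph with K nodes, the joint distribution is given by $p(\mathbf{x}) = \prod_{k=1}^{K} p(x_k \mid \mathrm{pa}_k)$, where $\mathrm{pa}_k$ denotes the set of parents of $x_k$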
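
The directed factorization can be sketched in a few lines of Python. This is only an illustration: the three-node graph, the binary variables, and every number in the tables are made up here and are not taken from the slides. It evaluates p(x) as the product of each node's conditional given its parents and checks that the result sums to one.

```python
from itertools import product

# Hypothetical graph a -> b, a -> c, b -> c over binary variables.
parents = {"a": (), "b": ("a",), "c": ("a", "b")}

# cpt[node][parent_values] = P(node = 1 | parents). Numbers are made up.
cpt = {
    "a": {(): 0.3},
    "b": {(0,): 0.2, (1,): 0.7},
    "c": {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9},
}

def joint(assignment):
    """Return prod_k p(x_k | pa_k) for a full assignment {node: 0 or 1}."""
    p = 1.0
    for node, pa in parents.items():
        p_one = cpt[node][tuple(assignment[q] for q in pa)]
        p *= p_one if assignment[node] == 1 else 1.0 - p_one
    return p

# A product of normalized conditionals is itself normalized.
total = sum(joint(dict(zip(parents, values)))
            for values in product((0, 1), repeat=len(parents)))
```

Because each conditional is normalized on its own, no global normalization constant is needed, which is one practical difference from the undirected case.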

8 An Example
[figure: directed graph over nodes x1, x2, x3, x4, x5, x6, x7]

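9 Conditional Independence (1)
• $a$ is conditionally independent of $b$ given $c$: $p(a \mid b, c) = p(a \mid c)$
• A shorthand notation: $a \perp\!\!\!\perp b \mid c$
• There are three types of conditional independence structure in directed graphs: tail-to-tail, head-to-tail, and head-to-head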

10 Conditional Independence (2)
[figure: two graphs over nodes a, b, c in which c is tail-to-tail; when c is observed, the path between a and b is blocked]

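11 Conditional Independence (3)
• Definition: sets A and B are d-separated by C if every path from a node in A to a node in B is blocked, in which case $A \perp\!\!\!\perp B \mid C$. A path is blocked if it contains a head-to-tail or tail-to-tail node that is in C, or a head-to-head node such that neither it nor any of its descendants is in C
[figure: graphs over nodes a, b, c showing the head-to-tail and head-to-head cases; observing a head-to-head node induces dependence between a and b]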
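
The head-to-head case is the least intuitive of the three, so a small numerical check may help. The sketch below assumes a graph a -> c <- b over binary variables with made-up probabilities (none of these numbers come from the slides). It measures how far p(a, b) is from p(a) p(b) when c is unobserved, and how far p(a, b | c) is from p(a | c) p(b | c) when c is observed.

```python
from itertools import product

# Head-to-head graph a -> c <- b, binary variables, made-up numbers.
p_a = {0: 0.6, 1: 0.4}
p_b = {0: 0.7, 1: 0.3}
p_c1 = {(0, 0): 0.1, (0, 1): 0.8, (1, 0): 0.8, (1, 1): 0.9}  # P(c = 1 | a, b)

def joint(a, b, c):
    """p(a, b, c) = p(a) p(b) p(c | a, b)."""
    pc = p_c1[(a, b)]
    return p_a[a] * p_b[b] * (pc if c == 1 else 1.0 - pc)

def marginal(**fixed):
    """Sum the joint over every variable that is not pinned in `fixed`."""
    total = 0.0
    for a, b, c in product((0, 1), repeat=3):
        values = {"a": a, "b": b, "c": c}
        if all(values[k] == v for k, v in fixed.items()):
            total += joint(a, b, c)
    return total

# c unobserved: p(a, b) equals p(a) p(b), so the path through c is blocked.
gap_unobserved = abs(marginal(a=1, b=1) - marginal(a=1) * marginal(b=1))

# c observed: p(a, b | c) differs from p(a | c) p(b | c), so the path is unblocked.
pc = marginal(c=1)
gap_observed = abs(marginal(a=1, b=1, c=1) / pc
                   - (marginal(a=1, c=1) / pc) * (marginal(b=1, c=1) / pc))
```

With these tables `gap_unobserved` is zero up to rounding while `gap_observed` is clearly non-zero, which is the "explaining away" effect behind the head-to-head rule.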

12 D-separation: an example
[figure: directed graph over nodes a, b, c, e, f]

13 Application: an Example
• Hidden Markov model:

14 Outline
• Introduction
• Directed Graphs
• Undirected Graphs
• Factor Graphs
• Summary

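15 Undirected Graphical Models
• If every path from a node in set A to a node in set B passes through at least one node in a third set C, then C separates A and B
• A and B are then conditionally independent given C: $A \perp\!\!\!\perp B \mid C$
[figure: node sets A and B separated by C]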

16 Conditional Independence
• The computers can infect each other via the hubs, and the hubs can infect each other via the computers
[figure: undirected graph over computers C1, C2 and hubs H1, H2]
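
Graph separation in an undirected model reduces to a reachability test: remove the conditioning set and see whether the two sets can still reach each other. The sketch below is one way to do that with a breadth-first search. The function name `separated` is hypothetical, and the adjacency assumes that each computer is linked to both hubs, which the transcript does not state explicitly.

```python
from collections import deque

def separated(adjacency, A, B, C):
    """Return True if every path from a node in A to a node in B passes through C."""
    A, B, C = set(A), set(B), set(C)
    seen = set(A - C)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        if node in B:
            return False  # reached B without crossing C
        for neighbor in adjacency[node]:
            if neighbor not in seen and neighbor not in C:
                seen.add(neighbor)
                queue.append(neighbor)
    return True

# Assumed layout: computers C1, C2 each linked to hubs H1, H2.
adjacency = {
    "C1": {"H1", "H2"},
    "C2": {"H1", "H2"},
    "H1": {"C1", "C2"},
    "H2": {"C1", "C2"},
}

computers_given_hubs = separated(adjacency, {"C1"}, {"C2"}, {"H1", "H2"})
computers_given_one_hub = separated(adjacency, {"C1"}, {"C2"}, {"H1"})
hubs_given_computers = separated(adjacency, {"H1"}, {"H2"}, {"C1", "C2"})
```

Under that assumed layout, observing both hubs separates the computers, while observing only one hub leaves a path open through the other.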

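17 Cliques
• Definition: a clique is a subset of the nodes in which every pair of nodes is connected by a link, so the subset is fully connected
• Maximal clique: a clique to which no further node can be added without it ceasing to be a clique
• We can define the factors in the decomposition of the joint distribution as functions of the variables in the cliques
[figure: undirected graph over nodes x1, x2, x3, x4]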
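
Maximal cliques can be enumerated with the basic Bron-Kerbosch recursion. The sketch below is a minimal version without pivoting. The four-node graph is an assumption chosen for illustration (all links present except x1-x4), since the figure itself is not preserved in the transcript.

```python
def maximal_cliques(adjacency):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        # current: clique built so far; candidates: nodes that can extend it;
        # excluded: nodes already handled, used to detect non-maximal cliques.
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adjacency[v], excluded & adjacency[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adjacency), set())
    return cliques

# Assumed graph: links x1-x2, x1-x3, x2-x3, x2-x4, x3-x4 (no x1-x4 link).
adjacency = {
    "x1": {"x2", "x3"},
    "x2": {"x1", "x3", "x4"},
    "x3": {"x1", "x2", "x4"},
    "x4": {"x2", "x3"},
}
cliques = maximal_cliques(adjacency)
```

For this assumed graph the two maximal cliques are {x1, x2, x3} and {x2, x3, x4}; the pair {x1, x2} is a clique but not a maximal one.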

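18 Undirected Factorization
• Consider factorizations of the form $p(\mathbf{x}) = \frac{1}{Z} \prod_{C} \psi_C(\mathbf{x}_C)$, where $\psi_C(\mathbf{x}_C) \geq 0$ is a non-negative potential function over the maximal clique $C$ and $Z = \sum_{\mathbf{x}} \prod_{C} \psi_C(\mathbf{x}_C)$ is the normalization constant
• An example: [figure: undirected graph over nodes x1, x2, x3, x4]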
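
Unlike directed conditionals, clique potentials are not normalized, so an undirected joint needs an explicit normalization constant Z. The sketch below computes Z by brute force for a small model. The two cliques and the potential function `psi` are hypothetical choices for illustration only, and brute-force summation is feasible only because there are just 16 joint states.

```python
import math
from itertools import product

variables = ("x1", "x2", "x3", "x4")
# Assumed maximal cliques of a four-node graph with the x1-x4 link missing.
cliques = (("x1", "x2", "x3"), ("x2", "x3", "x4"))

def psi(values):
    """Hypothetical potential: exp(number of agreeing pairs); always positive."""
    agree = sum(values[i] == values[j]
                for i in range(len(values))
                for j in range(i + 1, len(values)))
    return math.exp(agree)

def unnormalized(assignment):
    """Product of the clique potentials for a full assignment."""
    p = 1.0
    for clique in cliques:
        p *= psi(tuple(assignment[v] for v in clique))
    return p

states = [dict(zip(variables, vals)) for vals in product((0, 1), repeat=4)]
Z = sum(unnormalized(s) for s in states)  # normalization constant

def joint(assignment):
    """p(x) = (1 / Z) * prod_C psi_C(x_C)."""
    return unnormalized(assignment) / Z
```

The cost of computing Z grows exponentially with the number of variables, which is why the potentials are usually left unnormalized until a specific marginal is needed.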

19 An Example
• Markov random field:

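20 Directed versus Undirected (1)
• Moralization converts a directed graph into an undirected one: add links between all pairs of parents of each node, then drop the arrows; the result is the moral graph
• We have to discard some conditional independence properties to complete this conversion
[figure: directed graph over nodes x1, x2, x3, x4 and its moral graph]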
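
Moralization itself is a short graph transformation: link every pair of parents that share a child, then drop the edge directions. The sketch below assumes a directed graph in which x1, x2, and x3 are all parents of x4; the exact graph on the slide is not preserved, so this layout is a guess at a typical example, and the function name `moralize` is hypothetical.

```python
from itertools import combinations

def moralize(parents):
    """Return the undirected adjacency of the moral graph of a DAG.

    `parents` maps each node to a tuple of its parent nodes.
    """
    adjacency = {node: set() for node in parents}
    for child, pa in parents.items():
        # Keep every original edge, ignoring its direction.
        for p in pa:
            adjacency[child].add(p)
            adjacency[p].add(child)
        # "Marry" the parents: link every pair that shares this child.
        for p, q in combinations(pa, 2):
            adjacency[p].add(q)
            adjacency[q].add(p)
    return adjacency

# Assumed directed graph: x1, x2, x3 are all parents of x4.
parents = {"x1": (), "x2": (), "x3": (), "x4": ("x1", "x2", "x3")}
moral = moralize(parents)
```

In this assumed example the moral graph is fully connected, so it encodes no conditional independence at all, even though x1, x2, and x3 are marginally independent in the directed graph. That is the loss the slide refers to.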

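21 Directed versus Undirected (2)
• P: the set of all distributions over a given set of variables
• D, U: the subsets of P that can be represented as a perfect map by a directed graph and by an undirected graph, respectively
[figure: Venn diagram of P containing D and U]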

22 Outline
• Introduction
• Directed Graphs
• Undirected Graphs
• Factor Graphs
• Summary

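23 Factor Graphs (1)
• A factor graph is a more general representation: it has a node for every factor in the joint distribution in addition to the nodes for the variables
• It allows us to be more explicit about the details of the factorization
• An example: [figure: factor graph with variable nodes x1, x2, x3 and factor nodes fa, fb, fc, fd]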

24 Factor Graphs (2)
• Definition: given a factor graph, the joint probability distribution is given by $p(\mathbf{x}) = \prod_{s} f_s(\mathbf{x}_s)$, where $\mathbf{x}_s$ denotes the subset of the variables connected to the factor $f_s$
• Each factor $f_s$ is a function of its corresponding set of variables $\mathbf{x}_s$

25 Factor Graphs (3)

26 Factor Graphs (4)
• Directed and undirected graphs are special cases of factor graphs

27 Sum-Product Algorithm (1)
• Goal: obtain an efficient, exact inference algorithm for finding marginals, and allow computations to be shared efficiently when several marginals are required
• By definition, the marginal is $p(x) = \sum_{\mathbf{x} \setminus x} p(\mathbf{x})$

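28 Sum-Product Algorithm (2)
• On a tree-structured factor graph the joint distribution can be written as $p(\mathbf{x}) = \prod_{s \in \mathrm{ne}(x)} F_s(x, X_s)$, where
• $\mathrm{ne}(x)$: the set of factor nodes that are neighbors of $x$
• $X_s$: the set of all variables in the subtree connected to $x$ via the factor node $f_s$
• $F_s(x, X_s)$: the product of all the factors in the group associated with factor $f_s$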

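29 Sum-Product Algorithm (3)
• $\mu_{f_s \to x}(x) \equiv \sum_{X_s} F_s(x, X_s)$ can be viewed as the message from the factor node $f_s$ to the variable node $x$, so that $p(x) = \prod_{s \in \mathrm{ne}(x)} \mu_{f_s \to x}(x)$
• $F_s(x, X_s)$, which describes a factor sub-graph, can itself be factorized: $F_s(x, X_s) = f_s(x, x_1, \ldots, x_M)\, G_1(x_1, X_{s1}) \cdots G_M(x_M, X_{sM})$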

30 Sum-Product Algorithm (4)
[figure: factor node $f_s$ linked to $x$ and to variable nodes $x_1, \ldots, x_M$, with the subtree term $G_1(x_1, X_{s1})$]

31 Sum-Product Algorithm (5)

32 Sum-Product Algorithm (6)

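33 Sum-Product Algorithm (7)
• Messages:
• Variable node → factor node: take the product of the incoming messages along all of the other links, $\mu_{x_m \to f_s}(x_m) = \prod_{l \in \mathrm{ne}(x_m) \setminus f_s} \mu_{f_l \to x_m}(x_m)$
• Factor node → variable node: take the product of the incoming messages along all of the other links, multiply by the factor, and sum over the variables associated with the incoming messages, $\mu_{f_s \to x}(x) = \sum_{x_1} \cdots \sum_{x_M} f_s(x, x_1, \ldots, x_M) \prod_{m \in \mathrm{ne}(f_s) \setminus x} \mu_{x_m \to f_s}(x_m)$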

34 Sum-Product Algorithm (8)
• The sum-product algorithm can be viewed purely in terms of messages sent out by factor nodes to other factor nodes

35 Sum-Product Algorithm – an Example
[figure: factor graph with variable nodes x1, x2, x3, x4 and factor nodes fa, fb, fc; one node is marked as the root]
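
The two message types can be sketched as a pair of mutually recursive functions on a tree-structured factor graph. The factor scopes below, fa(x1, x2), fb(x2, x3), fc(x2, x4), are an assumption based on the node labels of the example, and the table entries are made up. For brevity this sketch recomputes messages on demand instead of caching them in a leaves-to-root and root-to-leaves schedule, so it shows the message definitions rather than the efficient two-pass procedure.

```python
from itertools import product

STATES = (0, 1)
variables = ("x1", "x2", "x3", "x4")
# Assumed scopes; table values are made up and unnormalized.
factors = {
    "fa": (("x1", "x2"), {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 1.0}),
    "fb": (("x2", "x3"), {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 4.0}),
    "fc": (("x2", "x4"), {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 2.0, (1, 1): 1.0}),
}

def var_to_factor(var, target):
    """Variable -> factor: product of messages from all other neighbouring factors."""
    msg = {s: 1.0 for s in STATES}
    for name, (scope, _) in factors.items():
        if name != target and var in scope:
            incoming = factor_to_var(name, var)
            for s in STATES:
                msg[s] *= incoming[s]
    return msg

def factor_to_var(name, target):
    """Factor -> variable: multiply factor by incoming messages, sum out the rest."""
    scope, table = factors[name]
    incoming = {v: var_to_factor(v, name) for v in scope if v != target}
    msg = {s: 0.0 for s in STATES}
    for values, f in table.items():
        term = f
        for v, s in zip(scope, values):
            if v != target:
                term *= incoming[v][s]
        msg[values[scope.index(target)]] += term
    return msg

def marginal(var):
    """Normalized product of all factor-to-variable messages arriving at `var`."""
    unnorm = var_to_factor(var, target=None)  # no factor is excluded
    z = sum(unnorm.values())
    return {s: unnorm[s] / z for s in STATES}

def brute_force_marginal(var):
    """Reference marginal obtained by summing the full joint."""
    unnorm = {s: 0.0 for s in STATES}
    for values in product(STATES, repeat=len(variables)):
        x = dict(zip(variables, values))
        p = 1.0
        for scope, table in factors.values():
            p *= table[tuple(x[v] for v in scope)]
        unnorm[x[var]] += p
    z = sum(unnorm.values())
    return {s: unnorm[s] / z for s in STATES}
```

The recursion terminates because the graph is a tree: a leaf variable has no other neighbouring factors, so its outgoing message is all ones. Comparing `marginal(v)` against `brute_force_marginal(v)` for each variable is a quick sanity check on the message definitions.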

36 Max-Sum Algorithm (1)
• Find a setting of the variables that has the largest probability
• Find the value of that probability

37 Max-Sum Algorithm (2)
• Compare this with the marginal:
• The computation is similar to the sum-product algorithm, except that the summations are replaced by maximizations

38 Max-Sum Algorithm (3)
• The max-product algorithm:

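39 Max-Sum Algorithm (4)
• It is convenient to work with the logarithm of the joint distribution, which turns products into sums and leaves the location of the maximum unchanged because the logarithm is monotonic
• The max-sum algorithm: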

40 Max-Sum Algorithm (5)
• We can find the maximum by propagating messages from the leaves to a root node
• Now we want to find the configuration of the variables for which the joint distribution attains this maximum value

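41 Max-Sum Algorithm (6)
• An example: [figure: chain factor graph x1, f1,2, x2, f2,3, x3, ..., x N-1, f N-1,N, xN]
• Once we know $x_N^{\max}$, we can propagate a message back down the chain using $x_{n-1}^{\max} = \phi(x_n^{\max})$, where $\phi(x_n)$ records the value of $x_{n-1}$ that achieved the maximum for each value of $x_n$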

42 Max-Sum Algorithm (7)
• This is known as back-tracking
• It can be extended to a general tree-structured factor graph
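
On a chain, max-sum with back-tracking amounts to a forward pass that stores, for every state of each node, which state of the previous node achieved the maximum, followed by a backward pass that reads those stored choices. The sketch below assumes a chain of pairwise factors only; the function name `max_sum_chain` and the factor tables are hypothetical and made up for illustration.

```python
import math

def max_sum_chain(log_factors, n_states):
    """Max-sum on a chain x_1 - x_2 - ... - x_N with pairwise factors.

    log_factors[n][i][j] = ln f(x_{n+1} = i, x_{n+2} = j), zero-based n.
    Returns the most probable configuration and its unnormalized log value.
    """
    message = [0.0] * n_states  # message arriving at the first node (a leaf)
    back_pointers = []
    for lf in log_factors:
        new_message, phi = [], []
        for j in range(n_states):
            scores = [lf[i][j] + message[i] for i in range(n_states)]
            best = max(range(n_states), key=scores.__getitem__)
            phi.append(best)            # best previous state for this state j
            new_message.append(scores[best])
        back_pointers.append(phi)
        message = new_message
    # Root: choose the best final state, then back-track down the chain.
    last = max(range(n_states), key=message.__getitem__)
    best_log_value = message[last]
    path = [last]
    for phi in reversed(back_pointers):
        path.append(phi[path[-1]])
    path.reverse()
    return path, best_log_value

# Made-up pairwise factors for a chain of four binary variables.
raw = [
    [[1.0, 3.0], [2.0, 1.0]],
    [[4.0, 1.0], [1.0, 2.0]],
    [[1.0, 2.0], [5.0, 1.0]],
]
log_factors = [[[math.log(v) for v in row] for row in f] for f in raw]
path, best_log_value = max_sum_chain(log_factors, n_states=2)
```

Storing the back-pointers matters when several configurations tie for the maximum: maximizing each variable independently could mix states from different optimal configurations, whereas back-tracking always returns one consistent configuration.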

43 Examples
• A Markov chain:
• A hidden Markov model:

44 Outline
• Introduction
• Directed Graphs
• Undirected Graphs
• Factor Graphs
• Summary

45 Summary
• The author introduces three types of probabilistic graphs: directed graphs, undirected graphs, and factor graphs
• Graphical models combine probability theory and graph theory
• The idea is to factorize a complicated system into simple components

