SURVEY: Foundations of Bayesian Networks
O, Jangmin — 2002/10/29 (last modified 2002/10/29). Copyright (c) 2002 by SNU CSE Biointelligence Lab.
Contents
- From DAG to Junction Tree
- From Elimination Tree to Junction Tree
- Junction Tree Algorithms
- Learning Bayesian Networks
Typical Example of DAG — [figure: a simple DAG over the vertices A, B, C, D, F, G]
1. Topological Sort
Algorithm 4.1 [Topological sort]
Begin with all vertices unnumbered. Set counter i := 1.
While any vertices remain: select any vertex that has no parents; number the selected vertex as i; delete the numbered vertex and all edges incident to it from the graph; increment i by 1.
Objective: obtain a well-ordering.
Well-ordering: the predecessors of any node receive lower numbers than the node itself.
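Algorithm 4.1 can be sketched directly in Python. The parent sets in the example DAG below are an assumption (chosen so that the resulting numbering matches the slides), since the original figure is not recoverable:

```python
# Topological sort by repeated removal of parentless vertices
# (a sketch of Algorithm 4.1; the example parent sets are assumptions).

def topological_sort(parents):
    """parents maps each vertex to the set of its parents."""
    parents = {v: set(ps) for v, ps in parents.items()}  # work on a copy
    order, i = {}, 1
    while parents:
        # pick any vertex with no remaining parents
        v = next(v for v, ps in parents.items() if not ps)
        order[v] = i
        del parents[v]
        for ps in parents.values():
            ps.discard(v)           # delete the edges out of v
        i += 1
    return order

dag = {"A": [], "B": ["A"], "C": ["A"], "D": ["A", "B"],
       "F": ["B", "C"], "G": ["F"]}
print(topological_sort(dag))
# -> {'A': 1, 'B': 2, 'C': 3, 'D': 4, 'F': 5, 'G': 6}
```

Every parent is numbered before its children, which is exactly the well-ordering property the slide states.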
1. Topological Sort (1)-(6) — [figure sequence: the simple DAG is numbered step by step, yielding A = 1, B = 2, C = 3, D = 4, F = 5, G = 6]
2. Moral Graph
Making the moral graph of a DAG:
Add an undirected edge between every pair of nodes that share a common child ("marry the parents").
Then drop the directions of all edges.
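The two moralization steps can be sketched as follows; the example DAG is the same assumed parent structure as before, not the original figure:

```python
from itertools import combinations

def moralize(parents):
    """Return an undirected adjacency map: marry co-parents, drop directions."""
    und = {v: set() for v in parents}
    for child, ps in parents.items():
        for p in ps:                       # drop the direction of each arc
            und[child].add(p); und[p].add(child)
        for u, w in combinations(ps, 2):   # marry parents of the same child
            und[u].add(w); und[w].add(u)
    return und

dag = {"A": [], "B": ["A"], "C": ["A"], "D": ["A", "B"],
       "F": ["B", "C"], "G": ["F"]}
moral = moralize(dag)   # e.g. B and C, the parents of F, become adjacent
```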
2. Moral Graph (1)-(2) — [figure sequence: the simple DAG with co-parents joined by undirected edges and all edge directions removed]
Junction tree
Definition: a tree whose nodes are sets C1, C2, ... such that for every pair Ci, Cj, the intersection Ci ∩ Cj is contained in every node on the path between Ci and Cj.
Corollaries: for an undirected graph, the following are equivalent: decomposable; chordal; has a junction tree of cliques; admits a perfect numbering.
Perfect numbering: ne(vj) ∩ {v1, ..., vj-1} induces a complete subgraph.
3. Maximum Cardinality Search (1)
Algorithm 4.9 [Maximum Cardinality Search]
Set Output := 'G is chordal'. Set counter i := 1. Set L := ∅. For all v ∈ V, set c(v) := 0.
While L ≠ V:
Set U := V \ L. Select any vertex v ∈ U maximizing c(v) and label it vi.
If Πi := ne(vi) ∩ L is not complete in G, set Output := 'G is not chordal'.
Otherwise, set c(w) := c(w) + 1 for each vertex w ∈ ne(vi) ∩ U.
Set L := L ∪ {vi}. Increment i by 1.
Report Output.
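A runnable sketch of Algorithm 4.9. The adjacency map is the moral graph of the running example, reconstructed from the cliques shown later in the deck (an assumption about the lost figure):

```python
# Maximum Cardinality Search (sketch of Algorithm 4.9).
def max_cardinality_search(adj):
    """Return (numbering, Pi sets, chordality flag) for undirected adj."""
    c = {v: 0 for v in adj}
    L, order, pis, chordal, i = set(), {}, {}, True, 1
    while L != set(adj):
        v = max((v for v in adj if v not in L), key=lambda v: c[v])
        order[v] = i
        pi = adj[v] & L                     # already-numbered neighbours
        pis[v] = pi
        # Pi must induce a complete subgraph for G to be chordal
        if any(w not in adj[u] for u in pi for w in pi if u != w):
            chordal = False
        for w in adj[v] - L:
            c[w] += 1
        L.add(v)
        i += 1
    return order, pis, chordal

moral = {"A": {"B", "C", "D"}, "B": {"A", "C", "D", "F"},
         "C": {"A", "B", "F"}, "D": {"A", "B"},
         "F": {"B", "C", "G"}, "G": {"F"}}
order, pis, chordal = max_cardinality_search(moral)
```

On this graph the search reproduces the labelling from the slides and reports that the graph is chordal.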
3. Maximum Cardinality Search (2)-(8) — [figure sequence on the moral graph; final labelling: A: 1, Π = ∅; B: 2, Π = {A}; C: 3, Π = {A, B}; D: 4, Π = {A, B}; F: 5, Π = {B, C}; G: 6, Π = {F}; Output = 'G is chordal']
4. Cliques of Chordal Graph (1)
Algorithm 4.11 [Finding the Cliques of a Chordal Graph]
From the numbering (v1, ..., vk) obtained by maximum cardinality search, let πi := |Πi|, the cardinality of Πi = ne(vi) ∩ {v1, ..., vi-1}.
Ladder nodes: vi is a ladder node if i = k, or if i < k and πi+1 < 1 + πi.
Define the cliques Cj := {vj} ∪ Πj for each ladder node vj.
C1, C2, ... possess the RIP (running intersection property).
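Algorithm 4.11 sketched on the MCS output from the slides (the `order` and `pis` tables below are copied from the worked example):

```python
# Extract cliques from an MCS numbering via ladder nodes (Algorithm 4.11).
def chordal_cliques(order, pis):
    seq = sorted(order, key=order.get)          # v1, ..., vk
    k = len(seq)
    cliques = []
    for i, v in enumerate(seq, start=1):
        pi = len(pis[v])
        # ladder node: i = k, or pi_{i+1} < 1 + pi_i
        if i == k or len(pis[seq[i]]) < 1 + pi:
            cliques.append({v} | pis[v])
    return cliques

order = {"A": 1, "B": 2, "C": 3, "D": 4, "F": 5, "G": 6}
pis = {"A": set(), "B": {"A"}, "C": {"A", "B"},
       "D": {"A", "B"}, "F": {"B", "C"}, "G": {"F"}}
cliques = chordal_cliques(order, pis)
# yields {A,B,C}, {A,B,D}, {B,C,F}, {F,G}, matching the next slide
```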
4. Cliques of Chordal Graph (2) — [figure: the labelled graph with its cliques C1 = {A, B, C}, C2 = {A, B, D}, C3 = {B, C, F}, C4 = {F, G}]
Running Intersection Property
RIP, definition: given (C1, C2, ..., Ck), for all 1 < j ≤ k there is an i < j such that Cj ∩ (C1 ∪ ... ∪ Cj-1) ⊆ Ci.
5. Junction Tree Construction (1)
Algorithm 4.8 [Junction Tree Construction]
Given the cliques (C1, ..., Cp) of a chordal graph ordered with the RIP:
Associate a node of the tree with each clique Cj.
For j = 2, ..., p, add an edge between Cj and Ci, where i is any one value in {1, ..., j-1} such that Cj ∩ (C1 ∪ ... ∪ Cj-1) ⊆ Ci.
5. Junction Tree Construction (2)-(5) — [figure sequence: the junction tree is built with nodes C1 = ABC, C2 = ABD, C3 = BCF, C4 = FG and edges C1-C2, C1-C3, C3-C4]
Contents — next section: From Elimination Tree to Junction Tree
Triangulation (1)
When is triangulation needed? When MCS (Maximum Cardinality Search) fails, i.e. the graph is not chordal.
Triangulation introduces fill-in edges and produces a perfect numbering.
Optimal triangulation is NP-hard; the size of each clique matters (it determines the cost of inference).
Triangulation (2)
Algorithm 4.13 [One-step Look Ahead Triangulation]
Start with all vertices unnumbered; set counter i := k.
While there are still unnumbered vertices:
Select an unnumbered vertex v to optimize the criterion c(v), or select v := σ(i) for a given elimination order σ. Label it with the number i.
Form the set Ci consisting of vi and its unnumbered neighbours.
Fill in edges where none exist between pairs of vertices in Ci.
Eliminate vi and decrement i by 1.
Triangulation (3)-(7) — [figure sequence: elimination with σ = (A, B, C, D, F, G) produces the elimination sets C6 = {F, G}, C5 = {B, C, F}, C4 = {A, B, D}, C3 = {A, B, C}, C2 = {A, B}]
Triangulation (8) — [figure: final step, C1 = {A}]
Properties of the elimination sets:
Cj contains vj, and vj ∉ Cl for all l < j.
(C1, ..., Ck) has the RIP.
The cliques of the triangulated graph G' are contained in (C1, ..., Ck).
Elimination Tree Construction (1)
Algorithm 4.14 [Elimination Tree Construction]
Associate a node of the tree with each set Ci.
For j = 1, ..., k, if Cj contains more than one vertex, add an edge between Cj and Ci, where i is the largest index of a vertex in Cj \ {vj}.
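Algorithm 4.14 applied to the elimination sets from the triangulation slides can be sketched as a short script:

```python
# Elimination tree from the elimination sets C1..C6 of the slides:
# C_j is joined to C_i where i is the largest index in C_j \ {v_j}.
sigma = ["A", "B", "C", "D", "F", "G"]
C = {1: {"A"}, 2: {"A", "B"}, 3: {"A", "B", "C"},
     4: {"A", "B", "D"}, 5: {"B", "C", "F"}, 6: {"F", "G"}}
index = {v: i for i, v in enumerate(sigma, start=1)}

etree_edges = []
for j, Cj in C.items():
    rest = Cj - {sigma[j - 1]}            # C_j \ {v_j}
    if rest:                              # C_j has more than one vertex
        i = max(index[v] for v in rest)   # largest index among the rest
        etree_edges.append((j, i))
print(etree_edges)
```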
Elimination Tree Construction (2)-(7) — [figure sequence: the elimination tree over the nodes A, B:A, C:AB, D:AB, F:BC, G:F; applying the rule yields the edges C2-C1, C3-C2, C4-C2, C5-C3, C6-C5]
From etree to jtree (1)
Lemma 4.16: let C1, ..., Ck be a sequence of sets with the RIP. Assume that Ct ⊆ Cp for some p ≠ t, and that p is minimal with this property for fixed t. Then:
(i) if t > p, then C1, ..., Ct-1, Ct+1, ..., Ck has the RIP;
(ii) if t < p, then C1, ..., Ct-1, Cp, Ct+1, ..., Cp-1, Cp+1, ..., Ck has the RIP.
Naive removal of a redundant elimination set may destroy the RIP; the lemma shows how to remove it safely.
From etree to jtree (2)-(3) — [figure sequence: condition (ii) with t = 1, p = 2 removes the redundant set C1 = {A}; then condition (ii) with t = 2, p = 3 removes C2 = B:A]
MST for making jtree (1)
Algorithm: from the elimination sets (C1, ..., Ck),
Remove the redundant Ci's.
Make the junction graph: if |Ci ∩ Cj| > 0, add an edge between Ci and Cj with weight |Ci ∩ Cj|.
Construct the MST (here: Maximum-Weight Spanning Tree).
The resulting tree is a junction tree, and the clique set has the RIP.
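The algorithm can be sketched with a greedy Kruskal-style maximum-weight spanning tree (the union-find bookkeeping is an implementation choice, not from the slides):

```python
from itertools import combinations

# Junction tree via maximum-weight spanning tree of the junction graph.
def mwst_junction_tree(cliques):
    # junction-graph edges, heaviest first
    edges = sorted(((len(cliques[i] & cliques[j]), i, j)
                    for i, j in combinations(range(len(cliques)), 2)
                    if cliques[i] & cliques[j]),
                   reverse=True)
    parent = list(range(len(cliques)))          # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]       # path halving
            x = parent[x]
        return x

    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                            # keep edge, merge components
            parent[ri] = rj
            tree.append((i, j, w))
    return tree

cliques = [{"A", "B", "C"}, {"A", "B", "D"}, {"B", "C", "F"}, {"F", "G"}]
mwst = mwst_junction_tree(cliques)
```

On the running example this keeps ABC-ABD, ABC-BCF and BCF-FG, the same junction tree as before.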
MST for making jtree (2) — [figure: junction graph over ABC, ABD, BCF, FG with edge weights |Ci ∩ Cj| (ABC-ABD: 2, ABC-BCF: 2, ABD-BCF: 1, BCF-FG: 1); the maximum-weight spanning tree keeps ABC-ABD, ABC-BCF, BCF-FG]
MST for making jtree (3)
Optimal jtree (for a fixed elimination ordering): assign a cost to each edge e = (v, w) and use this cost to break ties when constructing the MST (minimum cost preferred).
Contents — next section: Junction Tree Algorithms
Collect phase
Messages flow from the leaves to the root. Each clique projects (marginalizes) its potential onto the separator shared with its neighbour toward the root; that neighbour absorbs the message, turning its initial potential into an updated potential.
Distribute phase
Messages flow from the root back to the leaves. After distribution, the potential φj* of each clique Cj contains the marginal distribution of that clique.
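The two phases can be sketched on a minimal two-clique junction tree. The chain A → B → C, its probability tables, and the dense-dict representation are all illustrative assumptions, not the deck's example:

```python
from itertools import product

def marginalize(vars_, table, keep):
    """Sum a clique table over all variables not in `keep`."""
    kept = [v for v in vars_ if v in keep]
    out = {}
    for assign, p in table.items():
        key = tuple(a for v, a in zip(vars_, assign) if v in kept)
        out[key] = out.get(key, 0.0) + p
    return kept, out

def multiply_in(vars_, table, msg_vars, msg):
    """Multiply a separator message into a clique table."""
    idx = [vars_.index(v) for v in msg_vars]
    return {a: p * msg[tuple(a[i] for i in idx)] for a, p in table.items()}

# Clique C1 over (A, B) holds P(A) P(B|A); clique C2 over (B, C) holds P(C|B).
phi1 = {(a, b): [0.5, 0.5][a] * [[0.6, 0.4], [0.1, 0.9]][a][b]
        for a, b in product((0, 1), repeat=2)}
phi2 = {(b, c): [[0.7, 0.3], [0.2, 0.8]][b][c]
        for b, c in product((0, 1), repeat=2)}

# Collect: the leaf C2 sends its projection onto the separator {B} to root C1.
sep_vars, msg = marginalize(("B", "C"), phi2, {"B"})
phi1 = multiply_in(("A", "B"), phi1, sep_vars, msg)

# Distribute: the root projects its updated potential back onto {B};
# C2 absorbs the ratio of new to old separator, after which phi2 = P(B, C).
_, new_sep = marginalize(("A", "B"), phi1, {"B"})
phi2 = {a: p * new_sep[(a[0],)] / msg[(a[0],)] for a, p in phi2.items()}
```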
Contents — next section: Learning Bayesian Networks
Learning Paradigm — three axes:
Known structure (Ks) or unknown structure (Us).
Full observability (Fo) or partial observability (Po).
Frequentist (Fr) or Bayesian (Ba).
The slide titles below abbreviate these axes, e.g. 'Ks, Fo, Fr' = known structure, full observability, frequentist.
Ks, Fo, Fr (1)
Given a training set D = {D1, ..., DM}, compute the MLE (maximum-likelihood estimate) of the parameters of each CPD (conditional probability distribution).
The likelihood decomposes into one term per node, so each CPD can be estimated independently.
Ks, Fo, Fr (2)
Multinomial distributions θijk = P(Xi = k | Pa(Xi) = j) for tabular CPDs.
Log-likelihood: L(θ) = Σi Σj Σk Nijk log θijk, where Nijk is the number of cases with Xi = k and parent configuration j.
MLE constraint: Σk θijk = 1.
Ks, Fo, Fr (3)
MLE of the multinomial distribution: constrained optimization with one Lagrange multiplier per constraint Σk θijk = 1.
Setting the derivative with respect to θijk to zero gives θ̂ijk = Nijk / Σk' Nijk'.
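The closed-form MLE is just normalized counting, as the sketch below shows. The tiny data set and the variable names are illustrative assumptions:

```python
from collections import Counter

# MLE for a tabular CPD: theta_hat_ijk = N_ijk / sum_k' N_ijk'.
def mle_cpd(data, node, parents):
    """data: list of dicts mapping variable name -> observed value."""
    counts = Counter((tuple(d[p] for p in parents), d[node]) for d in data)
    totals = Counter(tuple(d[p] for p in parents) for d in data)
    return {(j, k): n / totals[j] for (j, k), n in counts.items()}

data = [{"A": 0, "B": 0}, {"A": 0, "B": 1},
        {"A": 0, "B": 1}, {"A": 1, "B": 1}]
cpd = mle_cpd(data, "B", ["A"])   # two of the three A=0 cases have B=1
```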
Ks, Fo, Fr (4)
Conditional linear Gaussian distributions: the MLE reduces to a linear regression of each node on its continuous parents, for every configuration of its discrete parents.
Ks, Fo, Ba (1)
Frequentist: point estimation of the parameters.
Bayesian: distributional estimation — a posterior distribution over the parameters.
Ks, Fo, Ba (2)
Multinomial distributions: two assumptions on the prior.
Global independence: the parameters of different nodes are a priori independent.
Local independence: the parameters of different parent configurations of a node are a priori independent.
Global independence + likelihood equivalence lead to a Dirichlet prior, the conjugate prior for the multinomial.
Ks, Fo, Ba (3)
Remark on the Bayesian approach: posterior ∝ likelihood × prior, i.e. P(θ|D) ∝ P(D|θ) P(θ).
Conjugate priors: the posterior has the same form as the prior distribution. Many exponential-family likelihoods admit conjugate priors.
Ks, Fo, Ba (4)
Multinomial distributions with a Dirichlet prior on tabular CPDs: θij is a multinomial r.v. with ri possible values.
Posterior distribution: Dirichlet with parameters αijk + Nijk.
Posterior mean: (αijk + Nijk) / Σk' (αijk' + Nijk').
Ks, Fo, Ba (5)
Dirichlet distribution: hyperparameter αijk, a positive number; αijk − 1 acts as a pseudo-count, the number of imaginary cases.
Posterior distribution: the pseudo-counts and the observed counts combine by simple addition.
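The "simple sum" update can be sketched for one parent configuration; the prior and counts below are illustrative numbers:

```python
# Dirichlet-multinomial update: Dir(alpha_1..alpha_r) prior + counts
# (N_1..N_r) -> Dir(alpha_1 + N_1, ...) posterior; mean shown below.
def dirichlet_posterior_mean(alpha, counts):
    post = [a + n for a, n in zip(alpha, counts)]   # simple sum
    total = sum(post)
    return [a / total for a in post]

mean = dirichlet_posterior_mean([1, 1], [3, 5])     # -> [0.4, 0.6]
```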
Ks, Fo, Ba (6)
Gaussian distributions: the conjugate analysis is analogous (e.g. a Normal prior on the mean).
Ks, Po, Fr (1)
With hidden variables h and visible (observed) variables V, the log-likelihood is Σm log Σh P(h, Vm | θ).
It is not decomposable into a sum of local terms, one per node, so use the EM algorithm.
Ks, Po, Fr (2)
EM algorithm: from Jensen's inequality, for any distribution q(h),
log P(V | θ) ≥ Σh q(h) log [P(h, V | θ) / q(h)],
with the constraint Σh q(h) = 1.
Ks, Po, Fr (3)
Maximizing the bound w.r.t. q (E-step) gives q(h) = P(h | Vm, θ), which makes the bound tight.
Ks, Po, Fr (4)
Maximizing w.r.t. θ (M-step): after q is set to P(h | Vm, θ), maximize the expected complete-data log-likelihood.
Iterate until convergence:
E-step: calculate the expected complete-data log-likelihood.
M-step: find θ* maximizing the expected complete-data log-likelihood.
Ks, Po, Fr (5)
Multinomial distribution:
E-step: compute the expected counts E[Nijk] by running inference on each case.
M-step: re-estimate θijk = E[Nijk] / Σk' E[Nijk'].
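A minimal EM sketch for the simplest partially observed network H → V, with H hidden and both variables binary. The parameters are pi = P(H=1) and q[h] = P(V=1 | H=h); the data and initial values are illustrative assumptions:

```python
def em(data, pi, q, iters=50):
    for _ in range(iters):
        # E-step: responsibilities r_m = P(H=1 | V=v_m) by Bayes' rule
        resp = []
        for v in data:
            lik1 = pi * (q[1] if v else 1 - q[1])
            lik0 = (1 - pi) * (q[0] if v else 1 - q[0])
            resp.append(lik1 / (lik1 + lik0))
        # M-step: the MLE formulas with expected counts in place of
        # observed counts
        pi = sum(resp) / len(data)
        q = {1: sum(r * v for r, v in zip(resp, data)) / sum(resp),
             0: sum((1 - r) * v for r, v in zip(resp, data))
                / sum(1 - r for r in resp)}
    return pi, q

pi_hat, q_hat = em([1, 1, 1, 0, 1, 0, 0, 1], pi=0.4, q={0: 0.3, 1: 0.8})
```

Each iteration is guaranteed not to decrease the observed-data likelihood, which is the convergence argument the preceding slides develop.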
Ks, Po, Ba (1)
Gibbs sampling: a stochastic version of EM.
Variational Bayes: approximate P(θ, H | V) ≈ q(θ | V) q(H | V).
Us, Fo, Fr (1)
Issues: the hypothesis space, the evaluation (scoring) function, and the search algorithm.
Us, Fo, Fr (2)
Search space: DAGs. The number of DAGs grows as O(2^(n^2)); 10 nodes already give on the order of 10^18 DAGs.
Exhaustively finding the optimal DAG is therefore hopeless.
Us, Fo, Fr (3)
Search algorithm: local search with operators that add, delete, or reverse a single arc.
Pseudo-code for hill-climbing (nbd(G) is the neighborhood of G, i.e. the models reachable by a single local change):
Choose G somehow.
While not converged:
For each G' in nbd(G), compute score(G').
G* := arg max_G' score(G').
If score(G*) > score(G) then G := G*, else converged := true.
Us, Fo, Fr (4)
Search algorithm: the PC algorithm.
Starts with a fully connected undirected graph and applies CI (conditional independence) tests: if X ⊥ Y | S for some conditioning set S, the arc between X and Y is removed.
Us, Fo, Fr (5)
Scoring function: the MLE alone would select the fully connected graph, so use score(G) ∝ P(D|G) P(G).
The marginal likelihood automatically penalizes complex models: a model with more parameters spreads its probability mass widely, leaving little mass in the region where the data actually lie.
Us, Fo, Fr (6)
Scoring function: under global independence and conjugate priors, the integration can be done in closed form, and the score decomposes into a factored form (one term per family).
Us, Fo, Fr (7)
Scoring function: under non-conjugate priors, approximation is needed.
The Laplace approximation leads to the BIC (Bayesian Information Criterion):
log P(D|G) ≈ log P(D | G, θ̂G) − (d/2) log M,
where θ̂G is the ML estimate of the parameters and d is the dimension (number of free parameters) of the model; this applies in particular to the multinomial case.
Us, Fo, Fr (8)
Scoring function: advantage of a decomposed score.
When two graphs mismatch in a single arc, their marginal likelihoods differ in at most two local terms; all other family terms cancel. Ex) two graphs over X1, X2, X3, X4 that differ in only one arc.
Us, Fo, Fr (9)
Scoring function: the marginal likelihood for the multinomial distribution with a Dirichlet prior is the Bayesian Dirichlet (BD) score,
P(D|G) = Πi Πj [Γ(αij) / Γ(αij + Nij)] Πk [Γ(αijk + Nijk) / Γ(αijk)],
where αij = Σk αijk and Nij = Σk Nijk; the posterior mean of the parameters uses the same counts.
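The BD score for a single family can be sketched in log space with `lgamma`; the dict shapes (parent configuration → list over child values) are an assumption:

```python
from math import lgamma

# Log BD marginal likelihood for one family (a node with its parents):
# sum_j [ lnG(a_ij) - lnG(a_ij + N_ij)
#         + sum_k ( lnG(a_ijk + N_ijk) - lnG(a_ijk) ) ]
def log_bd_family(alpha, counts):
    s = 0.0
    for j, a in alpha.items():
        n = counts.get(j, [0] * len(a))
        s += lgamma(sum(a)) - lgamma(sum(a) + sum(n))
        s += sum(lgamma(ai + ni) - lgamma(ai) for ai, ni in zip(a, n))
    return s

# Sanity check: one binary observation under a uniform Dir(1, 1) prior
# has marginal probability 1/2, i.e. log 0.5.
lp = log_bd_family({"j0": [1, 1]}, {"j0": [1, 0]})
```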
Us, Fo, Ba (1)
The posterior over all models is intractable, so focus on some features via Bayesian model averaging of f(G) (e.g. f(G) = 1 if G contains a certain edge), which needs P(G|D).
Direct integration is intractable. Solution: MCMC, e.g. the Metropolis-Hastings algorithm — only the ratio R of posteriors is needed, so the integration (normalizing constant) is avoided.
Us, Fo, Ba (2)
Calculation of P(G|D) by sampling G. Pseudo-code for the MC3 algorithm (u.a.r. means uniformly at random):
Choose G somehow.
While not converged:
Pick a G' u.a.r. from nbd(G).
Compute R = P(G'|D) q(G|G') / [P(G|D) q(G'|G)].
Sample u ~ Uniform(0, 1).
If u < min{1, R} then G := G'.
Us, Po, Fr (1)
Partially observable (hidden variables): the marginal likelihood is intractable to compute and does not decompose into a product of local terms.
Solutions: approximating the marginal likelihood, or structural EM.
Us, Po, Fr (2)
Approximating the marginal likelihood: the Candidate's method,
P(D|G) = P(D | θ, G) P(θ | G) / P(θ | D, G), evaluated at a convenient θ (e.g. the MLE of the parameters).
The first numerator term comes from a BN inference algorithm, the second is trivial, and the denominator can be estimated by Gibbs sampling.
Us, Po, Fr (3)
Structural EM. Idea: decompose the expected complete-data log-likelihood (BIC score) and run the structure search inside EM (running a full EM for the parameter MLE inside every search step would be a high-cost process).
Us, Po, Ba (1)
Combined MCMC: MCMC for Bayesian model averaging over structures, combined with MCMC over the values of the unobserved nodes.
Conclusion
Does structure learning carry important meaning? On paper, yes; in engineering practice, less so.
What can AI do for humans? What can humans do for machine learning algorithms?