
1. Introduction to Bayesian Networks
Instructor: Dan Geiger
Web page: www.cs.technion.ac.il/~dang/courseBN
Email: dang@cs.technion.ac.il
Phone: 829 4339
Office: Taub 616
What is all the fuss about? A happy marriage between probability theory and graph theory. The successful children: fault-detection algorithms, error-detecting codes, and models of complex systems. Applications in a wide variety of fields.

2. Course Information
Meetings:
- Lecture: Mondays 10:30–12:30
- Tutorial: Wednesdays 16:30–17:30
Grade:
- 50% for 4 question sets. These question sets are obligatory. Each contains 6 problems. Submit in pairs within two weeks.
- 50% for a one-hour lecture (priority to graduate students).
Prerequisites:
- Data Structures 1 (cs234218)
- Algorithms 1 (cs234247)
- Probability (any course)
Information and handouts:
- www.cs.technion.ac.il/~dang/courseBN

3. Lecture Plan
- Mathematical Foundations (5-6 weeks, including 3 student lectures; based on Pearl's Chapter 3 plus papers).
1. Properties of conditional independence (soundness and completeness of marginal independence, graphoid axioms and their interpretation as "irrelevance", incompleteness of conditional independence, no disjunctive axioms possible).
2. Properties of graph separation (Paz and Pearl 85, Theorem 3); soundness and completeness of saturated independence statements. Undirected graphs as I-maps of probability distributions. Markov blankets, pairwise independence basis. Representation theorems (Pearl and Paz, from each basis to I-maps).
3. Markov networks, HC representation theorem, completeness theorem. Markov chains.
4. Bayesian networks, d-separation, soundness, completeness.
5. Chordal graphs as the intersection of Bayesian networks and Markov networks. Equivalence of their 4 definitions.
- Combinatorial Optimization of Exact Inference in Graphical Models (3 weeks, including 2 student lectures).
1. Variable elimination; greedy algorithms for optimization.
2. Clique tree algorithm. Conditioning.
3. Treewidth. Feedback vertex set.
- Learning (3 weeks, including 2 student lectures).
1. The ML method and the EM algorithm.
2. Chow and Liu's algorithm; the TAN model.
3. K2 measure, score equivalence, Chickering's theorem, Dirichlet priors, characterization theorem.

4. What Is It All About?
How to use graphs to represent probability distributions over thousands of random variables? How to encode conditional independence in directed and undirected graphs? How to use such representations to compute the probability of events of interest efficiently? How to learn such models from data?

5. Properties of Independence
Definition: I(X,Y) iff Pr(X=x, Y=y) = Pr(X=x) Pr(Y=y) for all values x and y.
Properties:
- Symmetry: I(X,Y) ⇒ I(Y,X)
- Decomposition: I(X,YW) ⇒ I(X,Y)
- Mixing: I(X,Y) and I(XY,W) ⇒ I(X,YW)
Are there more properties of independence?
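To make the defining equation concrete, here is a minimal Python sketch (not part of the course material) that checks marginal independence on a tabulated joint distribution. The example table and the numerical tolerance are assumptions made for illustration only.

```python
import numpy as np

def marginally_independent(P, tol=1e-12):
    """Check I(X,Y): Pr(X=x, Y=y) == Pr(X=x) * Pr(Y=y) for all x, y.
    P is a 2-d table with P[x, y] = Pr(X=x, Y=y)."""
    px = P.sum(axis=1)  # marginal Pr(X=x)
    py = P.sum(axis=0)  # marginal Pr(Y=y)
    return np.allclose(P, np.outer(px, py), atol=tol)

# A joint built as an outer product of marginals is independent by construction.
P = np.outer([0.3, 0.7], [0.6, 0.4])
print(marginally_independent(P))  # True
```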

6. Properties of Conditional Independence
Definition: I(X,Z,Y) iff Pr(X=x, Y=y | Z=z) = Pr(X=x | Z=z) Pr(Y=y | Z=z) for all values x, y, and z.
Properties:
- Symmetry: I(X,Z,Y) ⇒ I(Y,Z,X)
- Decomposition: I(X,Z,YW) ⇒ I(X,Z,Y)
- Mixing: I(X,Z,Y) and I(XY,Z,W) ⇒ I(X,Z,YW)
Are there more properties of independence?
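The conditional version can be checked the same way, one conditioning value at a time. This sketch extends the previous one; the 3-d table layout P[x, y, z] is an assumption for the example.

```python
import numpy as np

def conditionally_independent(P, tol=1e-12):
    """Check I(X,Z,Y): Pr(X,Y | Z=z) factorizes for every z with Pr(Z=z) > 0.
    P is a 3-d table with P[x, y, z] = Pr(X=x, Y=y, Z=z)."""
    for z in range(P.shape[2]):
        pz = P[:, :, z].sum()         # Pr(Z=z)
        if pz == 0:
            continue                  # nothing to check for an impossible z
        joint = P[:, :, z] / pz       # Pr(X=x, Y=y | Z=z)
        px = joint.sum(axis=1)        # Pr(X=x | Z=z)
        py = joint.sum(axis=0)        # Pr(Y=y | Z=z)
        if not np.allclose(joint, np.outer(px, py), atol=tol):
            return False
    return True
```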

7. A Simple Markov Network
[Figure: four nodes A, B, C, D arranged in a cycle, with edge factors f1(a,c), f2(c,b), f3(b,d), and f4(d,a).]
The probability function represented by this graph satisfies I(A,{C,D},B) and I(C,{A,B},D).
In large graphs, how do we compute P(A|B)? How do we learn the best graph from sample data?
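As a baseline answer to the computational question, here is a brute-force sketch: multiply the four edge factors into the full joint table, then sum out the remaining variables. The factor values are arbitrary assumptions chosen only to make the example runnable; the joint table grows exponentially in the number of variables, which is exactly why the course studies better algorithms such as variable elimination and clique trees.

```python
import numpy as np

# Hypothetical binary factor tables for the cycle A-C-B-D-A (the values are
# arbitrary assumptions; any nonnegative numbers would do).
f1 = np.array([[2.0, 1.0], [1.0, 3.0]])  # f1(a, c)
f2 = np.array([[1.0, 2.0], [2.0, 1.0]])  # f2(c, b)
f3 = np.array([[3.0, 1.0], [1.0, 2.0]])  # f3(b, d)
f4 = np.array([[1.0, 1.0], [2.0, 1.0]])  # f4(d, a)

# Joint: P(a,b,c,d) proportional to f1(a,c) * f2(c,b) * f3(b,d) * f4(d,a).
joint = np.einsum('ac,cb,bd,da->abcd', f1, f2, f3, f4)
joint /= joint.sum()

def p_a_given_b(b):
    """P(A | B=b): sum out C and D, then renormalize over A."""
    pab = joint.sum(axis=(2, 3))  # P(a, b)
    return pab[:, b] / pab[:, b].sum()

print(p_a_given_b(0))  # distribution over A given B=0
```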

8. Relations to Some Other Courses
- Introduction to Artificial Intelligence (cs236501)
- Introduction to Machine Learning (cs236756)
- Introduction to Neural Networks (cs236950)
- Algorithms in Computational Biology (cs236522)
- Error-correcting codes
- Data mining
Tell me who your friends are, and I will tell you who you are.

9. Possible Recitation Topics
- Mathematical Foundations
1. Bayes rule and independence for multivariate distributions.
2. Cover and Thomas, pages 12-20.
3. Cover and Thomas, pages 20-28.
4. Properties of independence using entropy (e.g., Studeny).
- Graphical Models
1. D-separation (possibly with deterministic nodes).
2. HMMs (Rabiner's tutorial).
3. MAP via AND/OR trees.
4. Darwiche's algorithm.
5. Mini-buckets and other approximations.
6. Software for Bayesian networks and HMMs (Java, Matlab).
7. The ML method and the EM algorithm.
8. The EM algorithm.
9. Noisy-OR gates, context-sensitive independence.

