Introduction to Bayesian Networks
Instructor: Dan Geiger
Web page:
Phone:
Office: Taub 616.

What is all the fuss about? A happy marriage between probability theory and graph theory. Its successful offspring: algorithms for fault diagnosis, error-detecting codes, and models of complex systems. Applications in a wide variety of fields.

2 Course Information

Meetings:
- Lecture: Mondays 10:30-12:30
- Tutorial: Wednesdays 16:30-17:30

Grade:
- 50% for 4 question sets. These question sets are obligatory. Each contains 6 problems. Submit in pairs within two weeks.
- 50% for a one-hour lecture (priority to graduate students).

Prerequisites:
- Data Structures 1 (cs234218)
- Algorithms 1 (cs234247)
- Probability (any course)

Information and handouts:

3 Lectures Plan

Mathematical Foundations (5-6 weeks, including 3 student lectures; based on Pearl's Chapter 3 plus papers):
1. Properties of conditional independence (soundness and completeness of marginal independence, graphoid axioms and their interpretation as "irrelevance", incompleteness of conditional independence, no disjunctive axioms possible).
2. Properties of graph separation (Paz and Pearl 85, Theorem 3); soundness and completeness of saturated independence statements. Undirected graphs as I-maps of probability distributions. Markov blankets, pairwise-independence basis. Representation theorems (Pearl and Paz, from each basis to I-maps).
3. Markov networks, HC representation theorem, completeness theorem. Markov chains.
4. Bayesian networks, d-separation, soundness, completeness.
5. Chordal graphs as the intersection of Bayesian networks and Markov networks. Equivalence of their 4 definitions.

Combinatorial Optimization of Exact Inference in Graphical Models (3 weeks, including 2 student lectures):
1. Variable elimination; greedy algorithms for optimization.
2. Clique tree algorithm. Conditioning.
3. Treewidth. Feedback Vertex Set.

Learning (3 weeks, including 2 student lectures):
1. The ML method and the EM algorithm.
2. Chow and Liu's algorithm; the TAN model.
3. K2 measure, score equivalence, Chickering's theorem, Dirichlet priors, characterization theorem.

4 What is it all about?

How to use graphs to represent probability distributions over thousands of random variables?
How to encode conditional independence in directed and undirected graphs?
How to use such representations for efficient computation of the probabilities of events of interest?
How to learn such models from data?
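A standard illustration of the first question (added here for concreteness; it is not on the original slide): for binary variables X1, ..., Xn arranged in a chain X1 → X2 → ... → Xn, the directed-graph factorization

Pr(x1, ..., xn) = Pr(x1) Pr(x2 | x1) ... Pr(xn | x(n-1))

requires only 1 + 2(n-1) parameters, whereas an unrestricted joint distribution over n binary variables requires 2^n - 1. This gap is what makes graph representations practical at the scale of thousands of variables.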

5 Properties of Independence

I(X,Y) iff Pr(X=x, Y=y) = Pr(X=x) Pr(Y=y) for all values x, y.

Properties:
- Symmetry: I(X,Y) ⇒ I(Y,X)
- Decomposition: I(X,YW) ⇒ I(X,Y)
- Mixing: I(X,Y) and I(XY,W) ⇒ I(X,YW)

Are there more properties of independence?
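As a quick numerical companion (added here; the joint table below is made up), a claimed independence can be verified directly from a joint probability table:

```python
from itertools import product

# A made-up joint over two binary variables, built as a product of
# marginals, so I(X,Y) holds by construction.
p_x = {0: 0.3, 1: 0.7}
p_y = {0: 0.6, 1: 0.4}
joint = {(x, y): p_x[x] * p_y[y] for x, y in product(p_x, p_y)}

def independent(joint, tol=1e-12):
    """Check Pr(X=x, Y=y) = Pr(X=x) Pr(Y=y) for every pair (x, y)."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    marg_x = {x: sum(joint[x, y] for y in ys) for x in xs}
    marg_y = {y: sum(joint[x, y] for x in xs) for y in ys}
    return all(abs(joint[x, y] - marg_x[x] * marg_y[y]) <= tol
               for x in xs for y in ys)

print(independent(joint))  # True: this table factorizes into its marginals
```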

6 Properties of Conditional Independence

I(X,Z,Y) if and only if Pr(X=x, Y=y | Z=z) = Pr(X=x | Z=z) Pr(Y=y | Z=z) for all values x, y, z.

Properties:
- Symmetry: I(X,Z,Y) ⇒ I(Y,Z,X)
- Decomposition: I(X,Z,YW) ⇒ I(X,Z,Y)
- Mixing: I(X,Z,Y) and I(XY,Z,W) ⇒ I(X,Z,YW)

Are there more properties of independence?
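For intuition, Decomposition follows from the definition in one line by summing out W (this derivation is added here; it is not on the slide): Pr(x, y | z) = Σ_w Pr(x, y, w | z) = Σ_w Pr(x | z) Pr(y, w | z) = Pr(x | z) Pr(y | z), where the middle equality applies I(X,Z,YW).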

7 A simple Markov network

[Figure: four nodes A, B, C, D on a cycle A - C - B - D - A, with one factor per edge: f1(a,c), f2(c,b), f3(b,d), f4(d,a).]

The probability function represented by this graph satisfies I(A,{C,D},B) and I(C,{A,B},D).

In large graphs, how do we compute P(A|B)? How do we learn the best graph from sample data?
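For small graphs, the first question can be answered by brute-force enumeration over the joint state space. A minimal sketch (with arbitrary, made-up factor values; none of these numbers come from the course) multiplies the four edge factors and normalizes:

```python
from itertools import product

# Arbitrary nonnegative edge factors for the cycle A-C-B-D-A (made-up values).
f1 = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}  # f1(a, c)
f2 = {(0, 0): 1.0, (0, 1): 4.0, (1, 0): 2.0, (1, 1): 1.0}  # f2(c, b)
f3 = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}  # f3(b, d)
f4 = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 2.0, (1, 1): 1.0}  # f4(d, a)

def weight(a, b, c, d):
    """Unnormalized probability: the product of the four edge factors."""
    return f1[a, c] * f2[c, b] * f3[b, d] * f4[d, a]

def p_a_given_b(b):
    """P(A = a | B = b), by summing the weights over C and D and normalizing."""
    unnorm = {a: sum(weight(a, b, c, d) for c, d in product((0, 1), repeat=2))
              for a in (0, 1)}
    z = sum(unnorm.values())
    return {a: w / z for a, w in unnorm.items()}

print(p_a_given_b(1))
```

Enumeration takes time exponential in the number of variables, which is exactly why the lecture plan above covers variable elimination and clique trees.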

8 Relations to Some Other Courses

- Introduction to Artificial Intelligence (cs236501)
- Introduction to Machine Learning (cs236756)
- Introduction to Neural Networks (cs236950)
- Algorithms in Computational Biology (cs236522)
- Error-correcting codes
- Data mining

"Tell me who your friends are, and I will tell you who you are."

9 Possible Recitation Topics

Mathematical foundations:
1. Bayes rule and independence for multivariate distributions
2. Properties of independence using entropy (Cover and Thomas; e.g. Studeny)

Graphical models:
1. d-separation (maybe with deterministic nodes)
2. HMMs (Rabiner's tutorial)
3. MAP via And-Or trees
4. Darwiche's algorithm
5. Mini-buckets and other approximations
6. Software for Bayesian networks and HMMs (Java, Matlab)
7. The ML method and the EM algorithm
8. Noisy-OR gates, context-sensitive independence