Generalizing Variable Elimination in Bayesian Networks. University of Seoul Graduate School, Department of Electronic, Electrical and Computer Engineering. G20124901, 박민규.

Background knowledge

Bayes’ theorem From the product rule, P(A,B) = P(A|B) P(B) = P(B|A) P(A); Bayes’ theorem follows: P(A|B) = P(B|A) P(A) / P(B).

Bayesian Networks A probabilistic graphical model that represents a set of random variables and their conditional dependencies via a directed acyclic graph.

Bayesian Networks Independence assumption – If there are n binary random variables, the complete joint distribution is specified by 2^n - 1 probabilities. For n = 5, 2^n - 1 = 31, but only 10 values were needed; for n = 10, only 21 values are needed. Bayesian networks have these independence assumptions built in.
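
A minimal sketch of this parameter count, assuming all variables are binary and a five-variable network with the structure of the DogProblem example used later in these slides (the slide's own numbers may refer to a different network); the variable names and dictionary layout are illustrative:

    # Parameters needed for the full joint vs. the factored (network) form.
    # Structure assumed: family-out -> light-on, family-out -> dog-out,
    # bowel-problem -> dog-out, dog-out -> hear-bark; all variables binary.
    parents = {
        "family-out": [], "bowel-problem": [],
        "light-on": ["family-out"],
        "dog-out": ["family-out", "bowel-problem"],
        "hear-bark": ["dog-out"],
    }

    n = len(parents)
    full_joint = 2 ** n - 1                                  # 31 free parameters
    factored = sum(2 ** len(ps) for ps in parents.values())  # one P(X = true | parent values) per parent configuration
    print(full_joint, factored)                              # 31 10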

Bayesian Networks Independence assumption – d-Separation: d-separation is a graphical test of independence between variables in a directed acyclic graph.

Bayesian Networks Independence assumption – Q: Is BP (bowel-problem) dependent on FO (family-out)? The only path between them passes through a converging node (dog-out), so with no evidence the path is blocked and BP and FO are marginally independent.

Bayesian Networks Independence assumption – Q: Is BP independent of FO given evidence? Once the converging node (dog-out) or one of its descendants (hear-bark) is observed, the path is unblocked and BP and FO become dependent; a small d-separation check is sketched below.
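
A small sketch of a d-separation test, using the standard moralized-ancestral-graph criterion, that answers the two questions above. The DogProblem structure and all names here are assumptions of this sketch, not taken from the slides:

    from collections import deque
    from itertools import combinations

    def d_separated(parents, xs, ys, zs):
        """True if the node sets xs and ys are d-separated given zs in the DAG
        described by `parents` (node -> set of its parents), tested on the
        moralized ancestral graph."""
        # 1. Keep only xs, ys, zs and their ancestors.
        relevant, frontier = set(), deque(set(xs) | set(ys) | set(zs))
        while frontier:
            v = frontier.popleft()
            if v not in relevant:
                relevant.add(v)
                frontier.extend(parents[v])
        # 2. Moralize: link each node to its parents, "marry" co-parents, drop directions.
        adj = {v: set() for v in relevant}
        for v in relevant:
            for p in parents[v]:
                adj[v].add(p)
                adj[p].add(v)
            for p, q in combinations(parents[v], 2):
                adj[p].add(q)
                adj[q].add(p)
        # 3. Delete the conditioning nodes; d-separated iff xs cannot reach ys.
        seen, frontier = set(), deque(set(xs) - set(zs))
        while frontier:
            v = frontier.popleft()
            if v in seen or v in zs:
                continue
            seen.add(v)
            if v in ys:
                return False
            frontier.extend(adj[v] - set(zs))
        return True

    # Charniak's DogProblem structure (the same network used in the JavaBayes demo later).
    parents = {
        "family-out": set(), "bowel-problem": set(),
        "light-on": {"family-out"},
        "dog-out": {"family-out", "bowel-problem"},
        "hear-bark": {"dog-out"},
    }
    print(d_separated(parents, {"bowel-problem"}, {"family-out"}, set()))          # True: marginally independent
    print(d_separated(parents, {"bowel-problem"}, {"family-out"}, {"hear-bark"}))  # False: dependent given evidence below the converging node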

Bayesian Networks Independence assumption – Joint distribution: obtained from the chain rule. – Marginal independence: A ⊥ B ⇔ P(A|B) = P(A), P(B|A) = P(B). – Conditional independence: A ⊥ B|C ⇔ P(A|B,C) = P(A|C), P(B|A,C) = P(B|C).
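
As a small worked illustration (with three generic variables A, B, C, not from the slides) of how the chain rule combines with a conditional independence:

    P(A, B, C) = P(A) P(B | A) P(C | A, B)        (chain rule, any ordering)
    If C ⊥ A | B, then P(C | A, B) = P(C | B), so
    P(A, B, C) = P(A) P(B | A) P(C | B)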

Paper

Bayesian Networks

Each node of this graph represents a random variable X_i in X. The parents of X_i are denoted by pa(X_i); the children of X_i are denoted by ch(X_i). The parents of children of X_i that are not children themselves are denoted by spo(X_i) – these are the “spouses” of X_i in the polygamous society of Bayesian networks.
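
A quick sketch of the pa / ch / spo notation on a small dictionary-based graph; the structure (again the DogProblem graph) and the function names are illustrative only, not the paper's code:

    # `parents` maps each node to the set pa(node).
    parents = {
        "family-out": set(), "bowel-problem": set(),
        "light-on": {"family-out"},
        "dog-out": {"family-out", "bowel-problem"},
        "hear-bark": {"dog-out"},
    }

    def pa(x):
        return parents[x]

    def ch(x):
        return {v for v, ps in parents.items() if x in ps}

    def spo(x):
        # Parents of x's children that are not themselves children of x (and not x itself).
        return {p for c in ch(x) for p in parents[c]} - ch(x) - {x}

    print(pa("dog-out"))      # {'family-out', 'bowel-problem'} (set order may vary)
    print(ch("family-out"))   # {'light-on', 'dog-out'}
    print(spo("family-out"))  # {'bowel-problem'}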

Bayesian Networks The semantics of a Bayesian network model is determined by the Markov condition: every variable is independent of its non-descendant non-parents given its parents. This condition leads to a unique joint probability density.
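
The slide's displayed formula does not appear in the transcript; the unique joint density implied by the Markov condition is the usual Bayesian network factorization:

    p(X_1, …, X_n) = Π_i p(X_i | pa(X_i))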

Bayesian Networks Given a set of query variables X_q and evidence E, the posterior probability of X_q given E is:
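
The equation itself is missing from the transcript; its standard form is:

    p(X_q | E) = p(X_q, E) / p(E),  where
    p(X_q, E) = Σ_{X \ (X_q ∪ E)} Π_i p(X_i | pa(X_i))
    (each observed variable is held fixed at its observed value).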

Standard variable elimination Variable elimination is an algebraic scheme for inference in Bayesian networks. Variable and bucket elimination are essentially identical. – The basic principles of bucket manipulation have been studied extensively by Dechter.

Standard variable elimination Variables in X_q necessarily belong to X_R, but not all observed variables belong to X_R. – X_R: the set of requisite variables, where probability densities must be restricted to domains containing no evidence.

Standard variable elimination Denote by N the number of requisite variables that are not observed and are not in X_q. Now suppose that these variables are ordered in some special way, so that we have an ordering {X_1, X_2, X_3, …, X_N}.

Standard variable elimination Because X_1 can only appear in densities p(X_j | pa(X_j)) for X_j ∈ {X_1} ∪ ch(X_1), we can move the summation over X_1 inward:
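
The displayed equation is missing from the transcript; in the standard derivation this step reads:

    p(X_q, E) = Σ_{X_N} … Σ_{X_2} [ Π_{X_j ∉ {X_1} ∪ ch(X_1)} p(X_j | pa(X_j)) ] · Σ_{X_1} p(X_1 | pa(X_1)) Π_{X_j ∈ ch(X_1)} p(X_j | pa(X_j))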

Standard variable elimination At this point, we have “used” the densities for X_j ∈ {X_1} ∪ ch(X_1). To visualize operations more clearly, we can define the following un-normalized density:
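
Again the formula itself is missing from the transcript; the un-normalized density produced by summing out X_1 is usually written as:

    p(ch(X_1) | pa(X_1), spo(X_1)) = Σ_{X_1} p(X_1 | pa(X_1)) Π_{X_j ∈ ch(X_1)} p(X_j | pa(X_j))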

Standard variable elimination Think of the various densities as living in a “pool” of densities. We collect the densities that contain X_1, take them off the pool, construct a new (un-normalized) density p(ch(X_1) | pa(X_1), spo(X_1)), and add this density to the pool.
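
A compact sketch of this pool-of-densities procedure, as bare-bones variable elimination over binary factors; the factor representation and all names are mine, not the paper's implementation:

    from itertools import product

    # A factor is (variables, table): `table` maps a tuple of 0/1 values,
    # ordered like `variables`, to a number.
    def multiply(f, g):
        fvars, ftab = f
        gvars, gtab = g
        vars_ = list(dict.fromkeys(fvars + gvars))           # ordered union of variables
        tab = {}
        for values in product([0, 1], repeat=len(vars_)):
            a = dict(zip(vars_, values))
            tab[values] = (ftab[tuple(a[v] for v in fvars)]
                           * gtab[tuple(a[v] for v in gvars)])
        return vars_, tab

    def sum_out(var, f):
        fvars, ftab = f
        keep = [v for v in fvars if v != var]
        tab = {}
        for values, p in ftab.items():
            key = tuple(val for v, val in zip(fvars, values) if v != var)
            tab[key] = tab.get(key, 0.0) + p
        return keep, tab

    def eliminate(pool, order):
        """For each variable in `order`: take every density mentioning it off the
        pool, multiply them, sum the variable out, and put the result back."""
        pool = list(pool)
        for var in order:
            used = [f for f in pool if var in f[0]]
            pool = [f for f in pool if var not in f[0]]
            combined = used[0]
            for f in used[1:]:
                combined = multiply(combined, f)
            pool.append(sum_out(var, combined))
        result = pool[0]
        for f in pool[1:]:
            result = multiply(result, f)
        return result

    # Tiny example: P(B) in the chain A -> B with P(A=1)=0.3, P(B=1|A=1)=0.9, P(B=1|A=0)=0.2.
    p_a = (["A"], {(1,): 0.3, (0,): 0.7})
    p_b = (["B", "A"], {(1, 1): 0.9, (1, 0): 0.2, (0, 1): 0.1, (0, 0): 0.8})
    print(eliminate([p_a, p_b], order=["A"]))   # (['B'], {(0,): 0.59, (1,): 0.41}), up to float rounding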

Generalizing variable elimination Junction tree algorithms. Bucket elimination algorithms. Updating procedure – updating buckets right above the root; updating buckets away from the root.

JavaBayes

GUI

Edit

DogProblem (Charniak's example network with variables family-out, bowel-problem, dog-out, light-on, and hear-bark)

Query p(d) = p(d ∩ b ∩ f) + p(d ∩ b^c ∩ f^c) + p(d ∩ b^c ∩ f) + p(d ∩ b ∩ f^c) = p(d | b, f)p(f)p(b) + p(d | b^c, f^c)p(f^c)p(b^c) + p(d | b^c, f)p(f)p(b^c) + p(d | b, f^c)p(f^c)p(b) = 0.99 * 0.15 * 0.01 + 0.30 * 0.85 * 0.99 + 0.90 * 0.15 * 0.99 + 0.97 * 0.85 * 0.01 = 0.001485 + 0.25245 + 0.13365 + 0.008245 = 0.39583

Observe bowel-problem = true p(d | b) = p(d ∩ f | b) + p(d ∩ f^c | b) = p(d | b, f)p(f) + p(d | b, f^c)p(f^c) = 0.99 * 0.15 + 0.97 * 0.85 = 0.1485 + 0.8245 = 0.973 (using the fact that f is independent of b, so p(f | b) = p(f)).
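
A quick numeric check of the two computations above. The values 0.90 and 0.30 for p(d | b^c, f) and p(d | b^c, f^c) are the standard values of Charniak's example and are reconstructed here, since the transcript dropped several numbers:

    # p(d | b, f) table keyed by (b, f); marginals p(b), p(f).
    p_b, p_f = 0.01, 0.15
    p_d_given = {(1, 1): 0.99, (1, 0): 0.97, (0, 1): 0.90, (0, 0): 0.30}

    prior = sum(p_d_given[b, f]
                * (p_b if b else 1 - p_b)
                * (p_f if f else 1 - p_f)
                for b in (0, 1) for f in (0, 1))
    posterior = sum(p_d_given[1, f] * (p_f if f else 1 - p_f) for f in (0, 1))
    print(round(prior, 5), round(posterior, 5))   # 0.39583 0.973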

Thank you.