1 Inference Algorithm for Similarity Networks Dan Geiger & David Heckerman Presentation by Jingsong Wang USC CSE BN Reading Club 2008-03-17

2 The secured building story A guard of a secured building expects four types of persons to approach the building's entrance: executives, regular workers, approved visitors, and spies. As a person approaches the building, the guard can note the person's gender, whether or not the person wears a badge, and whether or not the person arrives in a limousine. We assume that only executives arrive in limousines and that male and female executives wear badges just as regular workers do (to serve as role models). Furthermore, we assume that spies are mostly men. Spies always wear badges in an attempt to fool the guard. Visitors don't wear badges because they don't have one. Female workers tend to wear badges more often than male workers do. The task of the guard is to identify the type of person approaching the building.
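As a rough illustration, the story can be encoded as a hypothesis variable plus three observables. The Python sketch below is not from the paper; the variable names and all probability values are made-up assumptions that merely respect the qualitative constraints of the story.

```python
# Hypothetical encoding of the secured-building story.
# All numbers are illustrative assumptions, not values from the paper.

HYPOTHESES = ["executive", "worker", "visitor", "spy"]   # the hypothesis variable h
GENDERS = ["male", "female"]

# P(male | h): spies are mostly men.
P_male_given_h = {"executive": 0.5, "worker": 0.5, "visitor": 0.5, "spy": 0.9}

# P(limousine | h): only executives arrive in limousines.
P_limo_given_h = {"executive": 0.7, "worker": 0.0, "visitor": 0.0, "spy": 0.0}

# P(badge | h, gender): executives follow the same pattern as workers,
# female workers wear badges more often than male workers,
# spies always wear badges, visitors never do.
P_badge_given_h_gender = {
    ("executive", "male"): 0.6, ("executive", "female"): 0.9,
    ("worker", "male"): 0.6,    ("worker", "female"): 0.9,
    ("visitor", "male"): 0.0,   ("visitor", "female"): 0.0,
    ("spy", "male"): 1.0,       ("spy", "female"): 1.0,
}
```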

3 Definition of Similarity Network Distinguished Variable Hypothesis Cover –A cover of a set of hypotheses H is a collection {A_1, ..., A_k} of nonempty subsets of H whose union is H. –Each cover is a hypergraph, called a similarity hypergraph, where the A_i are hyperedges and the hypotheses are nodes. –A cover is connected if the similarity hypergraph is connected.
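One way to check the "connected" condition is to treat each hyperedge A_i as linking all of its hypotheses and verify that a single component remains. This is a minimal sketch under that reading; the function name, the union-find helper, and the example cover are assumptions, not taken from the paper.

```python
def is_connected_cover(hypotheses, cover):
    """Return True if the similarity hypergraph defined by `cover`
    (a list of subsets of `hypotheses` whose union is `hypotheses`)
    is connected, i.e. any two hypotheses are joined by a chain of
    overlapping hyperedges."""
    parent = {h: h for h in hypotheses}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    for hyperedge in cover:
        members = list(hyperedge)
        for h in members[1:]:
            union(members[0], h)            # hypotheses sharing an A_i are linked

    return len({find(h) for h in hypotheses}) == 1

# Hypothetical cover of the four hypotheses from the story (connected).
print(is_connected_cover(
    ["executive", "worker", "visitor", "spy"],
    [{"executive", "worker"}, {"worker", "visitor"}, {"visitor", "spy"}]))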

4 Definition of Similarity Network Similarity Network –Let P(h, u_1, ..., u_n) be a probability distribution and A_1, ..., A_k be a connected cover of the values of h. A directed acyclic graph D_i is called a local network of P associated with A_i if D_i is a Bayesian network of P(h, v_1, ..., v_m | [[A_i]]), where {v_1, ..., v_m} is the set of all variables in {u_1, ..., u_n} that "help to discriminate" the hypotheses in A_i. The set of k local networks is called a similarity network of P.

5 A similarity network representation

6 Definition of Similarity Network Subset Independence Hypothesis-specific Independence

7 Definition of Similarity Network The practical solution for constructing the similarity hypergraph is to choose a connected cover by grouping together hypotheses that are "similar" to each other by some criterion under our control (e.g., spies and visitors). This choice tends to maximize the number of subset independence assertions encoded in a similarity network. Hence the name of this representation.

8 Two Types of Similarity Networks “helps to discriminate” Related Relevant Define event e to be [[A_i]] –A disjunction over a subset of the values of h

9 Two Types of Similarity Networks Type 1 –A similarity network constructed by including in each local network D_i only those variables u that satisfy related(u, h | [[A_i]]) is said to be of type 1. Type 2 –A similarity network constructed by including in each local network D_i only those variables u that satisfy relevant(u, h | [[A_i]]) is said to be of type 2.

10 Two Types of Similarity Networks Theorem 1 –Let P(u_1, ..., u_n | e) be a probability distribution where U = {u_1, ..., u_n} and e is a fixed event. Then, u_i and u_j are unrelated given e iff there exists a partition U_1, U_2 of U such that u_i ∈ U_1, u_j ∈ U_2, and P(U_1, U_2 | e) = P(U_1 | e) P(U_2 | e).
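On a small, fully tabulated distribution, the condition of Theorem 1 can be brute-forced by trying every partition that separates u_i from u_j. The sketch below is only illustrative; the function name, its argument encoding, and the tolerance are assumptions.

```python
from itertools import combinations

def unrelated(joint, names, i, j, tol=1e-9):
    """Brute-force check of Theorem 1's condition on a small joint table.

    joint: dict mapping full value assignments (tuples ordered as in `names`,
           including zero-probability ones) to P(assignment | e); the fixed
           event e is assumed to have been conditioned on already.
    Returns True iff some partition U1, U2 of the variables puts names[i] in
    U1 and names[j] in U2 with P(U1, U2 | e) = P(U1 | e) * P(U2 | e)."""
    n = len(names)

    def marginal(idx):
        m = {}
        for assign, p in joint.items():
            key = tuple(assign[k] for k in idx)
            m[key] = m.get(key, 0.0) + p
        return m

    others = [k for k in range(n) if k not in (i, j)]
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            u1 = sorted((i,) + extra)                  # side containing u_i
            u2 = [k for k in range(n) if k not in u1]  # side containing u_j
            m1, m2 = marginal(u1), marginal(u2)
            if all(abs(p - m1[tuple(a[k] for k in u1)] *
                           m2[tuple(a[k] for k in u2)]) < tol
                   for a, p in joint.items()):
                return True
    return False
```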

11 Two Types of Similarity Networks Theorem 2 –Let P(u_1, ..., u_n | e) be a probability distribution where e is a fixed event. Then, for every u_i and u_j, relevant(u_i, u_j | e) implies related(u_i, u_j | e).

12 Inference Using Similarity Networks The main task similarity networks are designed for is to compute the posterior probability of each hypothesis given a set of observations, as in diagnosis. Under reasonable assumptions, the posterior probability of each hypothesis can be computed in each local network, and the results can then be combined coherently according to the axioms of probability theory.

13 Inference Using Similarity Networks Assumption: P is strictly positive. We will remove this assumption later, at the cost of an inference algorithm that operates only on type 1 similarity networks and has higher complexity.

14 Inference Using Similarity Networks The inference problem –Compute P(h_j | v_1, ..., v_m) INFER procedure –Two parameters: a query and a Bayesian network
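The paper treats INFER as a black-box routine that answers a conditional-probability query against a given Bayesian network. As a stand-in, a brute-force enumeration version might look like the sketch below; the network encoding via per-variable CPT callables is an assumption, not the paper's interface.

```python
from itertools import product

def infer(query, evidence, variables, domains, cpts):
    """Enumeration stand-in for the INFER(query, BN) procedure.

    variables: variable names covering the whole network
    domains:   dict name -> list of possible values
    cpts:      dict name -> callable(assignment) -> P(name = value | parents),
               where `assignment` is a dict fixing the variable and its parents
    query, evidence: dicts {name: value}
    Returns P(query | evidence) by summing the joint over free variables."""

    def joint(assignment):
        p = 1.0
        for v in variables:
            p *= cpts[v](assignment)
        return p

    def total(fixed):
        free = [v for v in variables if v not in fixed]
        s = 0.0
        for values in product(*(domains[v] for v in free)):
            s += joint({**fixed, **dict(zip(free, values))})
        return s

    denominator = total(dict(evidence))
    numerator = total({**evidence, **query})
    return numerator / denominator if denominator > 0 else 0.0
```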

15 Inference Using Similarity Networks

16

17 Inference Using Similarity Networks Theorem 3 –Let P(h, u_1, ..., u_n) be a probability distribution and A = {A_1, ..., A_k} be a partition of the values of h. Let S be a similarity network based on A. Let v_1, ..., v_m be a subset of variables whose values are given. There exists a single solution for the set of equations defined by lines 7 and 8 of the above algorithm, and this solution uniquely determines the conditional probability P(h | v_1, ..., v_m). Complexity

18 Inferential And Diagnostic Completeness Inferentially Complete Diagnostically Complete

19 Inferential And Diagnostic Completeness Theorem 4 (restricted inferential completeness) Theorem 5 (diagnostic completeness)

20 Inferential And Diagnostic Completeness Hypothesis-specific Bayesian multinet of P Similarity network to Bayesian multinet conversion

21

22 Inferential And Diagnostic Completeness Hypothesis-specific Bayesian Multinet Inference Algorithm –For each hypothesis h_i: B_i = INFER(P(v_1, ..., v_l | h_i), M_i) –For each hypothesis h_i: compute P(h_i | v_1, ..., v_l) by combining B_i with the prior P(h_i) via Bayes' rule
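Putting the two loops together, a hedged sketch of the combination step: compute each likelihood B_i in its local network with any INFER routine (for example the enumeration stand-in sketched earlier), then combine via Bayes' rule, P(h_i | v_1, ..., v_l) ∝ P(h_i) · B_i. The `infer_likelihood` callable and the zero-normalizer handling are assumptions for illustration.

```python
def multinet_posterior(hypotheses, priors, local_nets, observations, infer_likelihood):
    """Hypothesis-specific Bayesian multinet inference (sketch).

    For each hypothesis h_i, B_i = P(v_1, ..., v_l | h_i) is obtained from
    its local network M_i; the posteriors then follow from Bayes' rule:
        P(h_i | v_1, ..., v_l) = P(h_i) * B_i / sum_j P(h_j) * B_j
    infer_likelihood(observations, net) stands for INFER run on a local network."""
    B = {h: infer_likelihood(observations, local_nets[h]) for h in hypotheses}
    weighted = {h: priors[h] * B[h] for h in hypotheses}
    z = sum(weighted.values())
    return {h: (w / z if z > 0 else 0.0) for h, w in weighted.items()}
```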