IMPORTANCE SAMPLING ALGORITHM FOR BAYESIAN NETWORKS

Presentation transcript:

IMPORTANCE SAMPLING ALGORITHM FOR BAYESIAN NETWORKS
By Sonal Junnarkar, Friday, 05 October 2001
REFERENCES
1) AIS-BN: An Adaptive Importance Sampling Algorithm for Evidential Reasoning in Large Bayesian Networks, by J. Cheng and M. J. Druzdzel
2) Simulation Approaches to General Probabilistic Inference on Belief Networks, by R. D. Shachter and M. A. Peot

BAYES' THEOREM
Theorem: P(h | D) = P(D | h) P(h) / P(D)
P(h) ≡ Prior probability of hypothesis h. Measures the initial belief (background knowledge) before any data are observed (hence "prior").
P(D) ≡ Prior probability of the training data D. Measures the probability of obtaining sample D (the marginal probability of D).
P(h | D) ≡ Probability of h given D. The "|" denotes conditioning, so P(h | D) is a conditional (a.k.a. posterior) probability.
P(D | h) ≡ Probability of D given h. Measures the probability of observing D given that h is correct (the "generative" model).
P(h ∧ D) ≡ Joint probability of h and D. Measures the probability of observing D and of h being correct.
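A quick numeric illustration of the theorem (not part of the original slides; the probabilities below are made up):

# Minimal numeric illustration of Bayes' theorem with made-up values.
# Hypothesis h: "patient has the disease"; data D: "test is positive".
p_h = 0.01              # P(h): prior probability of the hypothesis
p_d_given_h = 0.95      # P(D | h): probability of the data if h is true
p_d_given_not_h = 0.05  # P(D | not h)

# P(D) by total probability, then the posterior P(h | D)
p_d = p_d_given_h * p_h + p_d_given_not_h * (1 - p_h)
p_h_given_d = p_d_given_h * p_h / p_d

print(f"P(D)   = {p_d:.4f}")        # 0.0590
print(f"P(h|D) = {p_h_given_d:.4f}")  # about 0.161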

Bayesian Networks: Model Uncertainty in Intelligent Systems
Examples (shown as diagrams on the slide):
Simple (naive) Bayesian network: a class variable Y with children X1, X2, X3, ..., Xn, parameterized by P(Y) and P(xi | y) for each child.
"Sprinkler" BBN: X1 Season (Spring, Summer, Fall, Winter); X2 Sprinkler (On, Off); X3 Rain (None, Drizzle, Steady, Downpour); X4 Ground (Wet, Dry); X5 (Slippery, Not-Slippery).
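For reference, the sprinkler network's structure can be written down as a small data structure. This is a sketch only: the slide lists the variables and their state spaces, and the edges below follow Pearl's standard version of the example rather than anything stated on the slide.

# Structure of the "sprinkler" network (Pearl, 1988); CPT entries omitted.
sprinkler_bn = {
    "Season":    {"parents": [],                     "states": ["Spring", "Summer", "Fall", "Winter"]},
    "Sprinkler": {"parents": ["Season"],             "states": ["On", "Off"]},
    "Rain":      {"parents": ["Season"],             "states": ["None", "Drizzle", "Steady", "Downpour"]},
    "Ground":    {"parents": ["Sprinkler", "Rain"],  "states": ["Wet", "Dry"]},
    "Slippery":  {"parents": ["Ground"],             "states": ["Slippery", "Not-Slippery"]},
}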

WHAT IS SAMPLING?
SAMPLING: generalizing results about a population by selecting units from it and studying only that sample.
POPULATION: the group of people, items, or units under investigation. (Example: analog-to-digital conversion, where a continuous signal is represented by discrete samples.)
Probabilistic sampling method: any sampling method that uses some form of random selection.
IMPORTANCE SAMPLING (a.k.a. biased sampling):
Advantage: reduces the variance and error of the resulting estimates.
Importance function: when a finite-dimensional integral is estimated by sampling, a well-chosen importance function reduces the sampling variance.
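To make the variance-reduction claim concrete, here is a small self-contained illustration (not from the slides) of importance sampling for a one-dimensional tail probability; the shifted normal N(4, 1) plays the role of the importance function.

import math, random

# Estimate the tail probability P(Z > 4) for Z ~ N(0, 1); true value ~3.17e-05.
random.seed(0)
N = 100_000

def normal_pdf(x, mu=0.0, sigma=1.0):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Naive Monte Carlo: almost no samples ever land in the tail, so the
# estimate is dominated by a handful of hits (high variance).
naive = sum(1.0 for _ in range(N) if random.gauss(0, 1) > 4) / N

# Importance sampling: draw from N(4, 1), which covers the tail,
# and reweight each sample by p(x) / q(x).
total = 0.0
for _ in range(N):
    x = random.gauss(4, 1)
    if x > 4:
        total += normal_pdf(x) / normal_pdf(x, mu=4)
importance = total / N

print(f"naive MC:            {naive:.2e}")
print(f"importance sampling: {importance:.2e}")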

GENERAL IMPORTANCE SAMPLING ALGORITHM
1. Order the nodes in topological order.
2. Initialize the importance function Pr0(X\E), the total number of samples m, the sampling (update) interval l, and a score array for every node.
3. k <- 0, T <- {} (the set of samples used to update the importance function)
4. For i <- 1 to m do
5.   if (i mod l == 0) then
6.     k <- k + 1
7.     Update the importance function Prk(X\E) based on T
8.   end if
9.   Si <- generate a sample according to Prk(X\E)
10.  T <- T ∪ {Si}
11.  Calculate Score(Si, Pr(X\E, E=e), Prk(X\E)) and add it to the corresponding entries of the score arrays, according to the instantiated states.
12. End for
13. Normalize the score arrays for each node.
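The following Python sketch implements a simplified version of this loop for a discrete Bayesian network. Assumptions: the importance function stays fixed at the prior CPTs (so the update in step 7 is effectively a no-op, as in plain forward/likelihood-style importance sampling), and the network representation (a dict with "parents", "states", and "cpt" per node) is illustrative rather than taken from the referenced papers.

import random
from collections import defaultdict

def cpt_lookup(bn, node, assignment):
    # Distribution over `node`'s states given its parents' values in `assignment`.
    key = tuple(assignment[p] for p in bn[node]["parents"])
    return bn[node]["cpt"][key]

def importance_sampling(bn, order, evidence, m=10_000, seed=0):
    rng = random.Random(seed)
    # One score array per non-evidence node (step 2).
    scores = {x: defaultdict(float) for x in order if x not in evidence}
    for _ in range(m):                                   # steps 4-12
        sample, q = dict(evidence), 1.0
        for x in order:                                  # topological order (step 1)
            if x not in evidence:
                dist = cpt_lookup(bn, x, sample)
                states = bn[x]["states"]
                sample[x] = rng.choices(states, weights=dist)[0]
                q *= dist[states.index(sample[x])]       # probability under the importance function
        # Joint probability Pr(X\E = sample, E = e): one CPT entry per node.
        p = 1.0
        for x in order:
            dist = cpt_lookup(bn, x, sample)
            p *= dist[bn[x]["states"].index(sample[x])]
        w = p / q                                        # sample score (importance weight)
        for x in scores:                                 # step 11: add score to sampled state
            scores[x][sample[x]] += w
    # Step 13: normalize the score arrays to obtain estimates of P(X = x | E = e).
    return {x: {s: v / sum(d.values()) for s, v in d.items()}
            for x, d in scores.items()}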

TERMS IN IMPORTANCE SAMPLING
Importance function: a probability distribution over the domain of the given system; samples are generated from this function.
Probability distribution over all the variables of a Bayesian network model:
Pr(X) = Π (i = 1 to n) Pr(Xi | Pa(Xi))
(the product, over all nodes, of the probability of each node given its parents), where Pa(Xi) denotes the parents of node Xi.
Probability distribution of the query nodes (nodes other than the evidence nodes):
Pr(X\E, E = e) = Π (Xi ∈ X\E) Pr(Xi | Pa(Xi))
(\ denotes set difference)
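Using the same illustrative network representation (and the cpt_lookup helper) from the sketch above, the factorization amounts to multiplying one CPT entry per node:

def joint_probability(bn, order, assignment):
    # Pr(X) = product over i of Pr(Xi | Pa(Xi)); restricting `order` to the
    # non-evidence nodes gives the second formula on the slide.
    p = 1.0
    for node in order:
        dist = cpt_lookup(bn, node, assignment)
        p *= dist[bn[node]["states"].index(assignment[node])]
    return p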

TERMS IN IMPORTANCE SAMPLING (contd.)
Sample score: Score(Si) = Pr(X\E = Si, E = e) / Prk(X\E = Si), i.e., the model probability of the sample together with the evidence, divided by the probability of generating that sample from the current importance function.
Revised importance distribution: an approximation to the posterior probability. In the Self-Importance Sampling (SIS) algorithm, this function is updated in Step 7 as Prk+1(X\E) ∝ Prk(X\E) + Pr(X\E): the conditional probability tables (CPTs) are revised periodically so that the sampling distribution gradually approaches the posterior distribution.
Why is this form of importance sampling biased? Because the same data are used both to update the importance function and to compute the estimator; this reuse introduces bias into the estimator.
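A hedged sketch of the Step 7 update for a single node's sampling distribution is shown below. The exact revision rule used by SIS differs in its details (it operates on CPT entries), so treat this as an illustration of "previous distribution plus current estimate, renormalized" only.

def update_importance(prev_dist, weighted_counts, states):
    # Pr_{k+1} proportional to Pr_k + (estimate of Pr from the weighted samples),
    # then renormalized. `prev_dist` is a list aligned with `states`;
    # `weighted_counts` maps a state to its accumulated sample score.
    total = sum(weighted_counts.get(s, 0.0) for s in states) or 1.0
    estimate = [weighted_counts.get(s, 0.0) / total for s in states]
    mixed = [p + e for p, e in zip(prev_dist, estimate)]
    z = sum(mixed)
    return [m / z for m in mixed]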

INTRODUCING PARALLELIZATION IN SIS
Different techniques:
1. Use multiple threads for sample generation. If the total number of samples is 100, each of 10 threads generates 10 samples, with a sampling interval of 10. Problem: updating the importance distribution function, since that update is performed after each sampling interval. (Already implemented in the new code.)
2. Calculate the probabilities of independent nodes in parallel. Starting from the root node, for every sample generated, calculate the probabilities of conditionally independent nodes simultaneously. (Conditionally independent nodes: nodes that are not ancestors or descendants of one another.)
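A sketch of the first technique, assuming the non-adaptive importance_sampling function from the earlier sketch: split the m samples across workers, give each its own seed, and merge the per-node estimates. (In CPython, processes rather than threads are needed for a CPU-bound sampler; the function names and merge rule here are illustrative, not the implementation referred to on the slide.)

from concurrent.futures import ProcessPoolExecutor

def parallel_importance_sampling(bn, order, evidence, m=10_000, workers=4):
    # importance_sampling must be defined at module level so worker
    # processes can import it; call this under `if __name__ == "__main__":`.
    per_worker = m // workers
    with ProcessPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(importance_sampling, bn, order, evidence,
                               per_worker, seed) for seed in range(workers)]
        results = [f.result() for f in futures]
    # Simplification: average the per-worker posterior estimates. An exact
    # merge would combine the unnormalized score arrays instead; with equal
    # sample counts and a fixed importance function the average remains a
    # consistent estimate.
    merged = {}
    for x in results[0]:
        merged[x] = {s: sum(r[x].get(s, 0.0) for r in results) / len(results)
                     for s in results[0][x]}
    return merged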

Conditional Independence
Variable (node): conditionally independent of its non-descendants given its parents.
Result: the chain rule for probabilistic inference.
Bayesian network, probabilistic semantics: each node is a variable; each edge corresponds to one axis of a conditional probability table (CPT).
Example (network diagram on the slide): X1 Age, X2 Gender, X3 Exposure-To-Toxics, X4 Smoking, X5 Cancer, X6 Serum Calcium, X7 Lung Tumor.