IMPORTANCE SAMPLING ALGORITHM FOR BAYESIAN NETWORKS

1 IMPORTANCE SAMPLING ALGORITHM FOR BAYESIAN NETWORKS
By Sonal Junnarkar, Friday, 05 October 2001
REFERENCES
1) AIS-BN: An Adaptive Importance Sampling Algorithm for Evidential Reasoning in Large Bayesian Networks, by Cheng and Druzdzel
2) Simulation Approaches to General Probabilistic Inference on Belief Networks, by R. D. Shachter and M. A. Peot

2 BAYES'S THEOREM
P(h): prior probability of hypothesis h. Measures initial belief (background knowledge) before any information is obtained (hence "prior").
P(D): prior probability of the training data D. Measures the probability of obtaining sample D.
P(h | D): probability of h given D. The bar "|" denotes conditioning, so P(h | D) is a conditional (also called posterior) probability.
P(D | h): probability of D given h. Measures the probability of observing D given that h is correct (the "generative" model).
P(h ∧ D): joint probability of h and D. Measures the probability of observing D and of h being correct.
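As a quick sanity check on these definitions, here is a minimal numeric sketch of Bayes's theorem in Python; all probability values are invented for illustration.

```python
# Invented example probabilities for a hypothesis h and data D.
p_h = 0.01          # P(h): prior probability of hypothesis h
p_d_given_h = 0.9   # P(D | h): probability of observing D if h is correct
p_d = 0.05          # P(D): prior probability of observing the data D

# Bayes's theorem: P(h | D) = P(D | h) * P(h) / P(D)
p_h_given_d = p_d_given_h * p_h / p_d

# Joint probability: P(h ^ D) = P(D | h) * P(h)
p_h_and_d = p_d_given_h * p_h

print(round(p_h_given_d, 4))  # 0.18
print(round(p_h_and_d, 4))    # 0.009
```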

3 Bayesian Networks: Model Uncertainty in Intelligent Systems
Examples:
- Simple Bayesian network: a class node Y with prior P(Y) and conditional distributions P(x1 | y), P(x2 | y), P(x3 | y), ..., P(xn | y).
- "Sprinkler" BBN (figure): X1 = Season (Spring, Summer, Fall, Winter), X2 = Sprinkler (On, Off), X3 = Rain (None, Drizzle, Steady, Downpour), X4 = Ground (Wet, Dry), X5 = Ground state (Slippery, Not-Slippery).
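One way to see the "Sprinkler" network concretely is to write down its structure and one CPT in code. Only the node names and states come from the slide; the edges below are the usual textbook structure for this example, and the probability numbers are invented.

```python
# Assumed parent lists for the "Sprinkler" network (the slide gives only
# nodes and states; these edges are the conventional textbook structure).
parents = {
    "Season": [],
    "Sprinkler": ["Season"],
    "Rain": ["Season"],
    "Ground": ["Sprinkler", "Rain"],
    "Slippery": ["Ground"],
}

# One CPT: P(Sprinkler = On | Season).  Values are invented for illustration.
p_sprinkler_on = {"Spring": 0.4, "Summer": 0.7, "Fall": 0.2, "Winter": 0.05}

print(parents["Ground"])          # ['Sprinkler', 'Rain']
print(p_sprinkler_on["Summer"])   # 0.7
```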

4 WHAT IS SAMPLING?
SAMPLING: generalizing results about a population by selecting and studying a subset of units (a sample) from it.
POPULATION: the group of people, items, or units under investigation.
Example: analog-to-digital conversion, where a continuous signal is studied through discrete samples.
IMPORTANCE SAMPLING (aka biased sampling): a probabilistic sampling method, i.e., any sampling method that uses some form of random selection.
Advantage: reduces variance and error in the result. The importance function, a density over the domain of a finite-dimensional integral, helps reduce the sampling variance.

5 GENERAL IMPORTANCE SAMPLING ALGORITHM
1. Order the nodes in topological order.
2. Initialize the importance function Pr_0(X\E), the total number of samples m, the sampling interval l, and score arrays for every node.
3. k <- 0, T <- {} (empty set)
4. for i <- 1 to m do
5.   if (i mod l == 0) then
6.     k <- k + 1
7.     update the importance function Pr_k(X\E) based on T
8.   end if
9.   S_i <- generate a sample according to Pr_k(X\E)
10.  T <- T ∪ {S_i}
11.  calculate Score(S_i, Pr(X\E, e), Pr_k(X\E)) and add it to the corresponding entry of each score array according to the instantiated states
12. end for
13. Normalize the score arrays for each node.
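The loop above can be sketched in Python. The helpers `sample_from`, `update_fn`, and `weight` are illustrative stand-ins for sampling from Pr_k, updating it from the samples so far, and computing the score ratio; for brevity the per-node score arrays are collapsed into a single list of weighted samples, normalized at the end. The toy run at the bottom estimates P(X = 1) for a biased coin.

```python
import random

def importance_sampling(sample_from, update_fn, weight, q0, m=100, l=10):
    q = q0                      # current importance function Pr_k
    samples, scores = [], []    # T and the (collapsed) score array
    k = 0
    for i in range(1, m + 1):
        if i % l == 0:
            k += 1
            q = update_fn(q, samples)     # revise Pr_k from samples so far
        s = sample_from(q)                # draw a sample according to Pr_k
        samples.append(s)
        scores.append((s, weight(s, q)))  # score = Pr(s, e) / Pr_k(s)
    total = sum(w for _, w in scores)
    return [(s, w / total) for s, w in scores]  # normalized scores

# Toy run: true distribution P(X=1) = 0.3, importance function fixed at 0.5.
random.seed(0)  # fixed seed so the run is reproducible
est = importance_sampling(
    sample_from=lambda q: 1 if random.random() < q else 0,
    update_fn=lambda q, samples: q,   # no adaptation in this toy run
    weight=lambda s, q: (0.3 if s == 1 else 0.7) / (q if s == 1 else 1 - q),
    q0=0.5, m=10000, l=100)
p_one = sum(w for s, w in est if s == 1)
print(round(p_one, 2))  # approximately 0.3
```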

6 TERMS IN IMPORTANCE SAMPLING
Importance function: a probability density function over the domain of the given system; samples are generated from this function.
Probability distribution over all the variables of a Bayesian network model:
Pr(X) = Π_{i=1 to n} Pr(X_i | Pa(X_i))
(the product of the probability of each node given its parents, where Pa(X_i) denotes the parents of node X_i).
Probability distribution of the query nodes (the nodes other than the evidence nodes):
Pr(X\E, E = e) = Π_{X_i ∈ X\E} Pr(X_i | Pa(X_i))
(\ denotes set difference).
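The factorization Pr(X) = Π Pr(X_i | Pa(X_i)) can be evaluated directly for a full instantiation. The two-node network A -> B and its CPT values below are invented for illustration.

```python
# CPTs for a toy network A -> B, as functions of a full assignment.
# All probability numbers are invented for illustration.
def p_a(assign):
    return 0.2 if assign["A"] else 0.8

def p_b_given_a(assign):
    if assign["A"]:
        return 0.9 if assign["B"] else 0.1
    return 0.3 if assign["B"] else 0.7

def joint(assign):
    # Pr(X) = product over nodes of Pr(X_i | Pa(X_i))
    return p_a(assign) * p_b_given_a(assign)

print(round(joint({"A": True, "B": True}), 4))   # 0.18  (= 0.2 * 0.9)
```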

7 TERMS IN IMPORTANCE SAMPLING (contd.)
The sample score is the likelihood ratio of the target distribution to the importance function:
Score(S_i) = Pr(X\E = S_i, E = e) / Pr_k(X\E = S_i)
The revised importance distribution is an approximation to the posterior probability. In the self-importance sampling (SIS) algorithm, this function is updated in step 7 as
Pr_{n+1}(X\E) ∝ Pr_n(X\E) + Pr(X\E)
i.e., the conditional probability tables (CPTs) are revised periodically so that the sampling distribution gradually approaches the posterior distribution.
Why is importance sampling biased here? The same data are used both to update the importance function and to compute the estimator, and this reuse introduces bias into the estimator.
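A small sketch of the two formulas above, over an invented two-state domain: the score is the ratio of target probability to importance probability, and the update blends the old importance function with the new estimate and renormalizes.

```python
def score(p_target, p_importance):
    # Sample score: Pr(sample, e) / Pr_k(sample)
    return p_target / p_importance

def update_importance(pr_n, pr_hat):
    # Pr_{n+1}(x) proportional to Pr_n(x) + estimated Pr(x), renormalized
    mixed = {x: pr_n[x] + pr_hat[x] for x in pr_n}
    z = sum(mixed.values())
    return {x: v / z for x, v in mixed.items()}

pr_n = {"a": 0.5, "b": 0.5}    # current importance function (invented)
pr_hat = {"a": 0.9, "b": 0.1}  # estimate from the latest samples (invented)
print(update_importance(pr_n, pr_hat))  # roughly {'a': 0.7, 'b': 0.3}
print(score(0.3, 0.5))                  # 0.6
```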

8 INTRODUCING PARALLELIZATION IN SIS
Different techniques:
1. Use multiple threads for sample generation. If the total number of samples is 100, then generate 10 samples per thread, with a sampling interval of 10. Problem: updating the importance distribution function, since that function is updated only after each sampling interval. (*Already implemented in the new code.)
2. Calculate the probabilities of independent nodes in parallel. Start from the root node and, for every sample generated, calculate the probabilities of conditionally independent nodes¹ simultaneously.
¹Conditionally independent nodes: nodes that are not ancestors or descendants of each other.
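Technique 1 can be sketched with a thread pool: each worker draws one batch of samples with its own RNG, so no sampler state is shared. `draw_sample` is an illustrative stand-in for sampling from the current importance function; as noted above, that function could only be refreshed between batches, not within one.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def draw_sample(rng):
    # Stand-in for drawing one sample from the current importance function.
    return 1 if rng.random() < 0.3 else 0

def sample_batch(seed, n):
    rng = random.Random(seed)  # per-thread RNG: no shared sampler state
    return [draw_sample(rng) for _ in range(n)]

total_samples, num_threads = 100, 10
per_thread = total_samples // num_threads   # 10 samples per thread
with ThreadPoolExecutor(max_workers=num_threads) as ex:
    batches = ex.map(sample_batch, range(num_threads),
                     [per_thread] * num_threads)
    samples = [s for batch in batches for s in batch]

print(len(samples))  # 100
```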

9 Conditional Independence
A variable (node) is conditionally independent of its non-descendants given its parents.
Result: the chain rule for probabilistic inference.
Bayesian network probabilistic semantics: a node is a variable; an edge is one axis of a conditional probability table (CPT).
Example (figure): a network with nodes X1 = Age, X2 = Gender, X3 = Exposure-To-Toxics, X4 = Smoking, X5 = Cancer, X6 = Serum Calcium, X7 = Lung Tumor.

