Adaptive Importance Sampling on Bayesian Networks (AIS-BN)

Kansas State University, Department of Computing and Information Sciences
Laboratory for Knowledge Discovery in Databases (KDD)
KDD Group Research Seminar, Fall 2001 – Presentation 2b of 11
Friday, 05 October 2001
Julie A. Stilson
http://www.cis.ksu.edu/~jas3466

Reference: Cheng, J. and Druzdzel, M. J. (2000). "AIS-BN: An Adaptive Importance Sampling Algorithm for Evidential Reasoning in Large Bayesian Networks." Journal of Artificial Intelligence Research, 13, 155-188.
Outline

Basic Algorithm
– Definitions
– Updating the importance function
– Example using Sprinkler-Rain
Why Adaptive Importance Sampling?
– Heuristic initialization
– Sampling with unlikely evidence
Different Importance Sampling Algorithms
– Forward Sampling (FS)
– Logic Sampling (LS)
– Self-Importance Sampling (SIS)
– Differences between SIS and AIS-BN
Gathering Results
– How RMSE values are collected
– Sample results for FS and AIS-BN
Definitions

Importance Conditional Probability Tables (ICPTs)
– Probability tables that represent the learned importance function
– Initially equal to the network's CPTs
– Updated after each updating interval (see below)
Learning Rate
– The rate at which the true importance function is being learned
– Learning rate after k updates = a · (b/a)^(k/kmax)
– a = initial learning rate, b = learning rate in the last step, k = number of updates made so far, kmax = total number of updates that will be made
Frequency Table
– Stores the frequency with which each instantiation of each query node occurs
– Used to update the importance function
Updating Interval
– AIS-BN updates the importance function after this many samples
– If 1000 total samples are to be taken and the updating interval is 100, then 10 updates will be made
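As a quick illustration, here is a minimal Python sketch of this learning-rate schedule; the function name and the example values (a = 0.4, b = 0.1, kmax = 10) are illustrative choices, not values taken from the slide.

```python
def learning_rate(a, b, k, kmax):
    """Exponential decay from a (at k = 0) down to b (at k = kmax):
    rate = a * (b / a) ** (k / kmax)."""
    return a * (b / a) ** (k / kmax)

# Illustrative schedule: 10 updates, decaying from 0.4 to 0.1.
schedule = [round(learning_rate(0.4, 0.1, k, 10), 3) for k in range(11)]
print(schedule)  # starts at 0.4, ends at 0.1
```

Because the exponent k/kmax runs from 0 to 1, the rate starts at exactly a and ends at exactly b, decaying geometrically in between.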
Kansas State University Department of Computing and Information Sciences Laboratory for Knowledge Discovery in Databases (KDD) k := number of updates so far, m := desired number of samples, l := updating interval for (int i = 1, i <= m, i++) { if (i mod l == 0) { k++; Update importance function Pr^k(X\E) based on total samples } generate a sample according to Pr^k(X\E), add to total samples totalweight += Pr(s,e) / Pr^k(s) } totalweight = 0; T = null; for (int i = 1; i <= m, i++) generate a sample according to Pr^kmax(X\E), add to total samples totalweight += Pr(s,e) / Pr^kmax(s) compute RMSE value of s using totalweight } Basic Algorithm
Updating the Importance Function

Theorem: if Xi ∈ X and Xi ∉ Anc(E), then Pr(Xi | Pa(Xi), E) = Pr(Xi | Pa(Xi))
– Proved using d-separation
– Only ancestors of evidence nodes need to have their importance function learned
– The ICPT tables of all other nodes do not change throughout sampling
Algorithm for updating the importance function:
– Sample l points independently according to the current importance function Pr^k(X\E)
– For every query node Xi that is an ancestor of evidence, estimate Pr'(xi | pa(Xi), e) from those samples
– Update Pr^k(X\E) according to the following formula:
  Pr^(k+1)(xi | pa(Xi), e) = Pr^k(xi | pa(Xi), e) + LRate · (Pr'(xi | pa(Xi), e) − Pr^k(xi | pa(Xi), e))
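The update rule is a one-liner in code. Below is a hedged Python sketch where icpt and estimate are dictionaries mapping an (xi, pa) instantiation to a probability; the names and data layout are mine, not the paper's.

```python
def update_icpt(icpt, estimate, lrate):
    """Move each ICPT entry toward the frequency estimate Pr'(xi|pa(Xi),e)
    by a fraction lrate, per Pr^(k+1) = Pr^k + lrate * (Pr' - Pr^k)."""
    return {key: p + lrate * (estimate[key] - p) for key, p in icpt.items()}

# Example: an entry at 0.10 moves toward an observed frequency of 0.60.
new = update_icpt({("rain", "cloudy"): 0.10}, {("rain", "cloudy"): 0.60}, 0.4)
print(new)  # {("rain", "cloudy"): 0.30}
```

Note that with LRate = 1 the ICPT would jump straight to the sampled frequencies; the decaying learning rate instead blends new evidence into the current estimate, which damps sampling noise.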
Example Using Sprinkler-Rain

Network structure (the diagram on the slide): Cloudy → Sprinkler, Cloudy → Rain; Sprinkler and Rain → Ground
States: Cloudy: Yes, No; Sprinkler: On, Off; Rain: Yes, No; Ground: Wet, Dry
– Imagine Ground is evidence, instantiated to Wet
– It becomes more probable that the sprinkler is on and that it is raining
– The ICPT tables update the probabilities of the ancestors of the evidence node to reflect this

P(Cloudy):
Cloudy  Clear
.5      .5

P(Sprinkler | C):
C        On   Off
Cloudy   .1   .9
Clear    .5   .5

P(Rain | C):
C        Rain  No rain
Cloudy   .8    .2
Clear    .2    .8

P(Ground | S, R):
S    R        Wet   Dry
On   Rain     .99   .01
On   No rain  .9    .1
Off  Rain     .9    .1
Off  No rain  0     1
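To make the example concrete, the CPTs above can be encoded directly; this minimal sketch draws one forward sample from the network in topological order (the variable and function names are illustrative, and no evidence handling is done yet).

```python
import random

# CPTs from the slide, keyed by the parent instantiation (True = cloudy/on/rain).
P_CLOUDY = 0.5                                    # P(Cloudy = cloudy)
P_SPRINKLER = {True: 0.1, False: 0.5}             # P(Sprinkler = on | Cloudy)
P_RAIN = {True: 0.8, False: 0.2}                  # P(Rain = yes | Cloudy)
P_WET = {(True, True): 0.99, (True, False): 0.9,  # P(Ground = wet | S, R)
         (False, True): 0.9, (False, False): 0.0}

def forward_sample():
    """Draw one joint sample, parents before children."""
    c = random.random() < P_CLOUDY
    s = random.random() < P_SPRINKLER[c]
    r = random.random() < P_RAIN[c]
    g = random.random() < P_WET[(s, r)]
    return {"cloudy": c, "sprinkler": s, "rain": r, "wet": g}

print(forward_sample())
```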
Why Adaptive Importance Sampling?

Heuristic Initialization: Parents of Evidence Nodes
– Changes the probabilities of the parents of an evidence node to a uniform distribution when the probability of that evidence is sufficiently small
– Parents of evidence nodes are the nodes most affected by the instantiation of evidence
– A uniform starting point helps the importance function be learned faster
Heuristic Initialization: Extremely Small Probabilities (see the sketch after this list)
– Extremely low probabilities would usually not be sampled often
– This makes the true importance function slow to learn
– AIS-BN raises extremely low probabilities to a set threshold and lowers extremely high probabilities accordingly
Sampling with Unlikely Evidence
– With unlikely evidence, the importance function is very different from the CPTs
– It is difficult to sample accurately without changing the probability distributions
– AIS-BN performs better than other sampling algorithms when evidence is unlikely
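Here is a minimal sketch of the small-probability heuristic, assuming a simple "clamp low entries, then renormalize" reading of the bullet above; the threshold value 0.04 is illustrative and the exact rule in the paper may differ.

```python
def threshold_probs(dist, theta=0.04):
    """Raise entries below theta up to theta, then renormalize so the
    distribution still sums to 1 (which lowers the large entries)."""
    clamped = [max(p, theta) for p in dist]
    total = sum(clamped)
    return [p / total for p in clamped]

# Example: an extreme distribution becomes easier to sample from.
print(threshold_probs([0.001, 0.999]))  # ~[0.038, 0.962]
```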
Different Importance Sampling Algorithms

Forward Sampling / Likelihood Weighting (FS)
– Similar to AIS-BN, but the importance function is never learned
– Performs well under most circumstances
– Does poorly when evidence is unlikely (see the sketch after this list)
Logic Sampling (LS)
– The network is sampled randomly, without regard to evidence
– Samples that do not match the evidence are then discarded
– The simplest importance sampling algorithm
– Also performs poorly with unlikely evidence
– Inefficient when many nodes are evidence
Self-Importance Sampling (SIS)
– Also updates an importance function
– Does not draw its samples from the learned importance function
– Updates to the importance function do not use sampling information
– For large numbers of samples, performs worse than FS
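For contrast with AIS-BN's adaptive loop, here is a minimal likelihood-weighting (FS) sketch on the Sprinkler-Rain network from the earlier slide, estimating P(Rain = yes | Ground = wet): the evidence node is clamped rather than sampled, each sample is weighted by P(evidence | parents), and the sampling distribution itself never adapts. The function name is illustrative.

```python
import random

def lw_rain_given_wet(n=10000):
    """Likelihood weighting (FS) on Sprinkler-Rain:
    estimate P(Rain = yes | Ground = wet)."""
    p_wet = {(True, True): 0.99, (True, False): 0.9,
             (False, True): 0.9, (False, False): 0.0}
    num = den = 0.0
    for _ in range(n):
        c = random.random() < 0.5                  # P(Cloudy)
        s = random.random() < (0.1 if c else 0.5)  # P(Sprinkler | C)
        r = random.random() < (0.8 if c else 0.2)  # P(Rain | C)
        w = p_wet[(s, r)]        # weight = P(evidence | parents)
        num += w * r
        den += w
    return num / den

print(lw_rain_given_wet())  # roughly 0.71 with these CPTs
```

With likely evidence most weights are large and this works well; with very unlikely evidence almost every weight is near zero, which is exactly the failure mode AIS-BN's learned importance function is designed to avoid.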
Gathering Results

Relative Root Mean Square Error
– P(i) is the exact probability of sample i
– P^(i) is the estimated probability of sample i, taken from the frequency table
– M := arity, T := number of samples
RMSE Collection
– A relative RMSE is computed for each sample
– Each RMSE value is stored in an output file: printings.txt
Graphing Results
– Open the output file in Excel
– Graph the results using "Chart"
Example Chart (figure on the slide)
– ALARM network, 10000 samples
– Compares FS and AIS-BN
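The formula itself did not survive extraction, but one standard reading consistent with the definitions above is the root mean square of the relative errors over the M states, sqrt((1/M) · Σᵢ ((P^(i) − P(i)) / P(i))²). The sketch below computes that quantity; this reading is an assumption, not necessarily the paper's exact error measure.

```python
from math import sqrt

def relative_rmse(exact, estimated):
    """Root mean squared relative error over M states:
    sqrt( (1/M) * sum_i ((P_hat(i) - P(i)) / P(i))^2 )."""
    m = len(exact)
    return sqrt(sum(((ph - p) / p) ** 2
                    for p, ph in zip(exact, estimated)) / m)

# Example: exact vs. frequency-table estimates for a 3-state node.
print(relative_rmse([0.2, 0.3, 0.5], [0.25, 0.28, 0.47]))
```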