
UNIVERSITY OF SOUTH CAROLINA Department of Computer Science and Engineering Conflicts in Bayesian Networks January 23, 2007 Marco Valtorta

Example: Case Study #4 — Bayesian Network Fragment Matching
1) Report Date: 1 April. FBI: Abdul Ramazi is the owner of the Select Gourmet Foods shop in Springfield Mall, Springfield, VA. First Union National Bank lists Select Gourmet Foods as holding an account into which six checks totaling $35,000 have been deposited in the past four months; the checks are recorded as having been drawn on accounts at the Pyramid Bank of Cairo, Egypt, and the Central Bank of Dubai, United Arab Emirates. Both of these banks have just been listed as possible conduits in money-laundering schemes.
[Figure: the report is matched against a partially-instantiated Bayesian network fragment drawn from a BN fragment repository]

Example: Case Study #4 — Bayesian Network Fragment Composition
[Figure: the matched fragments are composed into a situation-specific scenario]

Value of Information
An item of information is useful if acquiring it leads to a better decision, that is, to a more useful action.
An item of information is useless if the actions taken after acquiring it are no more useful than the actions taken before; in particular, it is useless if the actions taken after acquiring it are the same as before.
In the absence of a detailed model of the utility of actions, the decrease in uncertainty about a variable of interest is taken as a proxy for the increase in utility: the best item of information to acquire is the one that most reduces the uncertainty about the variable of interest.
Since the value of the new item of information is not yet known, we average over its possible values.
Uncertainty is measured by entropy; reduction in uncertainty is measured by reduction in entropy, as in the short example below.
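A quick numeric illustration (numbers chosen for this note, not from the slides): a binary variable with distribution (0.5, 0.5) has entropy 1 bit, the maximum, while one with distribution (0.99, 0.01) has entropy of only about 0.08 bits. An observation that, on average, moves the analyst's belief from the first distribution toward the second removes roughly 0.92 bits of uncertainty, and an item of information is valuable exactly to the extent that it produces such shifts in expectation.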

Example: Case Study #4 — Computing Value of Information and Surprise
This is the output of the VOI program on a situation-specific scenario for Case Study #4 (Sign of the Crescent). Variable Travel (which represents suspicious travel) is significant for determining the state of variable Suspect (whether Ramazi is a terrorist), even when it is already known that Ramazi has performed suspicious banking transactions.
[Figure annotations: "Ramazi performed illegal banking transactions." "Is Ramazi a terrorist?" "Would it help to know whether he traveled to sensitive locations? Yes."]

Value of Information: Formal Definition
Let V be a variable whose value affects the actions to be taken by an analyst. For example, V indicates whether a bomb is placed on a particular airliner.
Let p(v) be the probability that variable V has value v. The entropy of V is:
H(V) = -\sum_v p(v) \log p(v)
Let T be a variable whose value we may acquire (by expending resources). For example, T indicates whether a passenger is a known terrorist. The entropy of V given that T has value t is:
H(V \mid T = t) = -\sum_v p(v \mid T = t) \log p(v \mid T = t)
The expected entropy of V given T is:
H(V \mid T) = \sum_t p(t) \, H(V \mid T = t)
The value of information is:
VOI(T) = H(V) - H(V \mid T)
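A minimal sketch of these formulas in code, assuming a small joint distribution given as an explicit table; the variable names and numbers below are illustrative, not taken from the case study:

    import math

    def entropy(dist):
        """Entropy in bits of a distribution given as {value: probability}."""
        return -sum(p * math.log2(p) for p in dist.values() if p > 0)

    def value_of_information(joint):
        """VOI(T) = H(V) - H(V | T), with the joint given as {(v, t): probability}."""
        p_v, p_t = {}, {}
        for (v, t), p in joint.items():
            p_v[v] = p_v.get(v, 0.0) + p
            p_t[t] = p_t.get(t, 0.0) + p
        # Expected entropy of V given T: sum over t of p(t) * H(V | T = t)
        h_v_given_t = sum(
            pt * entropy({v: joint.get((v, t), 0.0) / pt for v in p_v})
            for t, pt in p_t.items())
        return entropy(p_v) - h_v_given_t

    # Hypothetical joint: V = "suspect is a terrorist", T = "suspicious travel".
    joint = {("yes", "yes"): 0.08, ("yes", "no"): 0.02,
             ("no", "yes"): 0.12, ("no", "no"): 0.78}
    print(value_of_information(joint))  # about 0.14 bits: T is worth observing

In a real system the explicit joint table would be replaced by queries to the Bayesian network engine, which returns p(v), p(t), and p(v | t) directly.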

Surprise Detection
Surprise is the situation in which the evidence (a set of findings) and a situation-specific scenario are incompatible.
Since situation-specific scenarios are Bayesian networks, it is very unusual for an outright inconsistency to occur.
In some cases, however, the evidence is very unlikely in a given scenario; this may be because a rare case has been found, or because the scenario cannot explain the evidence.
To distinguish these two situations, we compare the probability of the evidence in the situation-specific scenario to the probability of the evidence in a scenario in which all events are probabilistically independent and occur with the same prior probability as in the situation-specific scenario.

Example: Case Study #4 — Computing Surprise
This shows the output of the surprise detection program. In this case, the user is informed that no conflict is detected, i.e., the scenario is likely to be a good interpretive model for the evidence received.

    The VALUE OF INFORMATION of the test node C for the target node A is 0.0
    Parsing the XMLBIF file 'ssn.xml'... done!
    PROBABILITY FOR JOINT FINDINGS = 5.0E-4
    Prior probability for NODE: Suspicious Person=yes is 0.01
    Prior probability for NODE: Unusual Activities=yes is 0.0656
    Prior probability for NODE: Stolen Weapons=yes is 0.05
    PROBABILITY FOR INDIVIDUAL FINDINGS = 3.28E-5
    No conflict was detected.

Surprise Detection: Formal Definition
Let the evidence be a set of findings: e = \{f_1, f_2, \ldots, f_m\}
The probability of the evidence in the situation-specific scenario is P_M(e), where P_M is the distribution represented in the situation-specific scenario.
The probability of the evidence in the model in which all variables are independent is
P_I(e) = \prod_{i=1}^{m} P_M(f_i)
The evidence is surprising if P_I(e) > P_M(e)
The conflict index is defined as
conf(e) = \log_2 \frac{P_I(e)}{P_M(e)}
The probability under P_M of drawing an evidence set whose conflict index exceeds conf(e) is at most P_M(e) / P_I(e) = 2^{-conf(e)}
Proof [Laskey, 1991]: the expectation under P_M of the ratio P_I(e') / P_M(e') is \sum_{e'} P_M(e') \frac{P_I(e')}{P_M(e')} = \sum_{e'} P_I(e') = 1, so the bound follows from Markov's inequality.
If the conflict index is high, it is unlikely that the findings could have been generated by sampling the situation-specific scenario.
It is then reasonable to inform the analyst that no good explanatory model of the findings exists, and that we are in the presence of a novel or surprising situation.
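A minimal sketch of the conflict index, assuming the scenario's inference engine has already supplied P_M(e) and the marginal P_M(f_i) of each finding; the numbers reuse the surprise-detection output shown above:

    import math

    def conflict_index(p_joint, finding_marginals):
        """conf(e) = log2( prod_i P(f_i) / P(e) ); positive values signal conflict."""
        p_independent = math.prod(finding_marginals)
        return math.log2(p_independent / p_joint)

    # P(e) = 5.0E-4 and the three finding priors from the program output above.
    conf = conflict_index(5.0e-4, [0.01, 0.0656, 0.05])
    print(conf)  # about -3.9: negative, so no conflict is detected

By the bound above, a conflict index of c would be exceeded by chance with probability at most 2^{-c}, so a large positive value is strong evidence that the scenario does not explain the findings.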

The Independent Straw Model
In the absence of conflict, the joint probability of all evidence variables is greater than the product of the probabilities of the individual evidence variables. This is normally the case because findings that support a common explanation are positively correlated: P(x \mid y) > P(x), and since P(x, y) = P(x \mid y) P(y), it follows that P(x, y) > P(x) P(y).
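A small worked instance (numbers chosen for illustration, not from the case study): let P(y) = 0.1, P(x \mid y) = 0.9, and P(x \mid \neg y) = 0.1, so that the marginal is P(x) = 0.9 \cdot 0.1 + 0.1 \cdot 0.9 = 0.18. Then P(x, y) = P(x \mid y) P(y) = 0.09, five times the straw model's value P(x) P(y) = 0.018: the two findings hang together, and no conflict is signaled.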

Straw Models in Diagnosis
A bipartite straw model is obtained by eliminating some of the variables from a given model. In diagnosis by heuristic classification, one can divide the variables into three sets: Target, Evidence, and Other.

How to Compute the Conflict Index (I)
The marginal probability of each finding is the normal output of any probability-update algorithm.

How to Compute the Conflict Index (II)
The probability of the evidence is a by-product of probability update as computed by the variable elimination or junction tree algorithms.

P(e) from the Variable Elimination Algorithm
Variable (bucket) elimination processes one bucket per variable, in reverse order of the elimination ordering. Each bucket collects the conditional probability tables, with the evidence variables instantiated to their observed values, together with the messages whose scope contains the bucket's variable; summing that variable out of the product of the bucket's contents produces a new message, which is placed in the bucket of the highest-ordered remaining variable in its scope:
H_n(u) = \sum_{x_n} \prod_{i=1}^{j} C_i(x_n, u_{S_i})
For example, a posterior such as P(\delta \mid \epsilon = \text{yes}, \theta = \text{yes}) is obtained by summing the product of all the tables over every variable except \delta, which yields the unnormalized potential P(\delta, e). The probability of the evidence is recovered from the final normalization step: P(e) is the sum of that unnormalized potential, so if posteriors are obtained by multiplying by a normalizing constant k, then P(e) = 1/k.
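A minimal sketch of this recovery of P(e), on an assumed three-variable chain A → B → C (binary, with made-up tables, not the slides' network) and evidence C = yes; A and B are summed out bucket by bucket, and the scalar left at the end is P(e):

    from itertools import product

    vals = ("yes", "no")
    # Illustrative CPTs for the chain A -> B -> C.
    p_a = {"yes": 0.2, "no": 0.8}
    p_b_given_a = {("yes", "yes"): 0.7, ("no", "yes"): 0.3,   # key: (b, a)
                   ("yes", "no"): 0.1, ("no", "no"): 0.9}
    p_c_given_b = {("yes", "yes"): 0.9, ("no", "yes"): 0.1,   # key: (c, b)
                   ("yes", "no"): 0.2, ("no", "no"): 0.8}

    def prob_evidence(c_obs):
        """P(C = c_obs) by bucket elimination: eliminate A, then B."""
        # Bucket A: message m1(b) = sum_a P(a) * P(b | a)
        m1 = {b: sum(p_a[a] * p_b_given_a[(b, a)] for a in vals) for b in vals}
        # Bucket B: sum_b m1(b) * P(c_obs | b); the remaining scalar is P(e).
        return sum(m1[b] * p_c_given_b[(c_obs, b)] for b in vals)

    p_e = prob_evidence("yes")
    # Sanity check against brute-force summation of the full joint.
    brute = sum(p_a[a] * p_b_given_a[(b, a)] * p_c_given_b[("yes", b)]
                for a, b in product(vals, vals))
    assert abs(p_e - brute) < 1e-12
    print(p_e)  # 0.354 with the tables above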

Sensitivity Analysis
Sensitivity analysis assesses how much the posterior probability of some event of interest changes with respect to the value of some parameter in the model.
We assume that the event of interest is the value of a target variable. The parameter is either a conditional probability or an unconditional prior probability.
If the sensitivity of the target variable having a particular value is low, then the analyst can be confident in the results, even if the analyst is not very confident in the precise value of the parameter.
If the sensitivity of the target variable to a parameter is very high, it is necessary to inform the analyst of the need to qualify the conclusion reached, or to expend more resources to become more confident in the exact value of the parameter.

Example: Case Study #4 — Computing Sensitivity
This is the output of the Sensitivity Analysis program on a situation-specific scenario for Case Study #4. In the context of the information already acquired, i.e., travel to dangerous places, large transfers of money, etc., the parameter that links financial irregularities to being a suspect is much more important for assessing the belief in Ramazi being a terrorist than the parameter that links dangerous travel to being a suspect. The analyst may want to concentrate on assessing the first parameter precisely.

Sensitivity Analysis: Formal Definition
Let the evidence be a set of findings: e = \{f_1, f_2, \ldots, f_m\}
Let t be a parameter in the situation-specific scenario. Then [Castillo et al., 1997; Jensen, 2000] the probability of the evidence is a linear function of t:
P(e)(t) = \alpha t + \beta
\alpha and \beta can be determined by computing P(e) for two values of t.
More generally, if t is a set of parameters, then P(e)(t) is linear in each individual parameter in t, i.e., it is a multilinear function of t.
Recall that
P(V = v \mid e) = \frac{P(V = v, e)}{P(e)}
Then
P(V = v \mid e)(t) = \frac{\alpha_1 t + \beta_1}{\alpha t + \beta}
We can therefore compute the sensitivity of a target variable V to a parameter t by repeating the same computation with two values of t for each of two evidence sets, viz. e and e \cup \{V = v\}.
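A minimal sketch of this two-point method, assuming a black-box routine that returns P(e) from the network for a given value of the parameter t; the two linear stand-ins below replace real engine calls and are purely illustrative:

    def linear_coefficients(prob_e, t0=0.0, t1=1.0):
        """Fit P(e)(t) = alpha * t + beta from two evaluations of prob_e."""
        y0, y1 = prob_e(t0), prob_e(t1)
        alpha = (y1 - y0) / (t1 - t0)
        return alpha, y0 - alpha * t0

    def posterior_in_t(prob_ev, prob_e):
        """Return f(t) = P(V=v | e)(t) = (alpha1*t + beta1) / (alpha*t + beta)."""
        a1, b1 = linear_coefficients(prob_ev)  # evidence set e union {V = v}
        a, b = linear_coefficients(prob_e)     # evidence set e
        return lambda t: (a1 * t + b1) / (a * t + b)

    # Hypothetical stand-ins for network queries at parameter value t:
    prob_e = lambda t: 0.30 * t + 0.10       # P(e)(t)
    prob_ev = lambda t: 0.25 * t + 0.02      # P(V = v, e)(t)

    posterior = posterior_in_t(prob_ev, prob_e)
    print(posterior(0.5), posterior(0.9))  # how the posterior moves with t

The derivative of this ratio with respect to t quantifies the sensitivity: if it is small near the elicited value of t, the conclusion is robust to imprecision in that parameter.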