Uncertainty  Most intelligent systems have some degree of uncertainty associated with them.  The types of uncertainty that can occur in KBS may be caused.

Slides:



Advertisements
Similar presentations
Introductory Mathematics & Statistics for Business
Advertisements

Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 14 From Randomness to Probability.
PROBABILITY. Uncertainty  Let action A t = leave for airport t minutes before flight from Logan Airport  Will A t get me there on time ? Problems :
Uncertainty Everyday reasoning and decision making is based on uncertain evidence and inferences. Classical logic only allows conclusions to be strictly.
FT228/4 Knowledge Based Decision Support Systems
1 Chapter 3 Probability 3.1 Terminology 3.2 Assign Probability 3.3 Compound Events 3.4 Conditional Probability 3.5 Rules of Computing Probabilities 3.6.
COUNTING AND PROBABILITY
Copyright © Cengage Learning. All rights reserved.
B. Ross Cosc 4f79 1 Uncertainty Knowledge can have uncertainty associated with it - Knowledge base: rule premises, rule conclusions - User input: uncertain,
NIPRL Chapter 1. Probability Theory 1.1 Probabilities 1.2 Events 1.3 Combinations of Events 1.4 Conditional Probability 1.5 Probabilities of Event Intersections.
5/17/20151 Probabilistic Reasoning CIS 479/579 Bruce R. Maxim UM-Dearborn.
AI – CS364 Uncertainty Management Introduction to Uncertainty Management 21 st September 2006 Dr Bogdan L. Vrusias
Chapter 10: Hypothesis Testing
For Thursday, Feb. 20 Bring Conditional Probability Worksheet to class Exam #1:March 4 Project Presentations Start March 6 Team Homework #4 due Feb. 25.
Fundamentals of Forensic DNA Typing Slides prepared by John M. Butler June 2009 Appendix 3 Probability and Statistics.
EPIDEMIOLOGY AND BIOSTATISTICS DEPT Esimating Population Value with Hypothesis Testing.
1 Chapter 12 Probabilistic Reasoning and Bayesian Belief Networks.
I The meaning of chance Axiomatization. E Plurbus Unum.
Lecture 05 Rule-based Uncertain Reasoning
Probability (cont.). Assigning Probabilities A probability is a value between 0 and 1 and is written either as a fraction or as a proportion. For the.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 14 From Randomness to Probability.
For Monday after Spring Break Read Homework: –Chapter 13, exercise 6 and 8 May be done in pairs.
Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc. Chapter 8 Tests of Hypotheses Based on a Single Sample.
Chapter 6 Probability.
Lecture II.  Using the example from Birenens Chapter 1: Assume we are interested in the game Texas lotto (similar to Florida lotto).  In this game,
Basic Concepts and Approaches
Problem A newly married couple plans to have four children and would like to have three girls and a boy. What are the chances (probability) their desire.
1-1 Copyright © 2015, 2010, 2007 Pearson Education, Inc. Chapter 13, Slide 1 Chapter 13 From Randomness to Probability.
Dr. Gary Blau, Sean HanMonday, Aug 13, 2007 Statistical Design of Experiments SECTION I Probability Theory Review.
Theory of Probability Statistics for Business and Economics.
 Basic Concepts in Probability  Basic Probability Rules  Connecting Probability to Sampling.
10.2 Tests of Significance Use confidence intervals when the goal is to estimate the population parameter If the goal is to.
Probability. Statistical inference is based on a Mathematics branch called probability theory. If a procedure can result in n equally likely outcomes,
BINOMIALDISTRIBUTION AND ITS APPLICATION. Binomial Distribution  The binomial probability density function –f(x) = n C x p x q n-x for x=0,1,2,3…,n for.
Testing Theories: The Problem of Sampling Error. The problem of sampling error It is often the case—especially when making point predictions—that what.
1 Reasoning Under Uncertainty Artificial Intelligence Chapter 9.
Topic 2: Intro to probability CEE 11 Spring 2002 Dr. Amelia Regan These notes draw liberally from the class text, Probability and Statistics for Engineering.
Uncertainty Management in Rule-based Expert Systems
Uncertainty in Expert Systems
Making sense of randomness
1 Chapter 12 Probabilistic Reasoning and Bayesian Belief Networks.
Fall 2002Biostat Statistical Inference - Confidence Intervals General (1 -  ) Confidence Intervals: a random interval that will include a fixed.
From Randomness to Probability Chapter 14. Dealing with Random Phenomena A random phenomenon is a situation in which we know what outcomes could happen,
Introduction  Probability Theory was first used to solve problems in gambling  Blaise Pascal ( ) - laid the foundation for the Theory of Probability.
Reasoning with Uncertainty دكترمحسن كاهاني
Probability. What is probability? Probability discusses the likelihood or chance of something happening. For instance, -- the probability of it raining.
Textbook Basics of an Expert System: – “Expert systems: Design and Development,” by: John Durkin, 1994, Chapters 1-4. Uncertainty (Probability, Certainty.
Introduction Remember that probability is a number from 0 to 1 inclusive or a percent from 0% to 100% inclusive that indicates how likely an event is to.
Chapter 4 Probability Concepts Events and Probability Three Helpful Concepts in Understanding Probability: Experiment Sample Space Event Experiment.
PROBABILITY AND BAYES THEOREM 1. 2 POPULATION SAMPLE PROBABILITY STATISTICAL INFERENCE.
Chapter 2: Probability. Section 2.1: Basic Ideas Definition: An experiment is a process that results in an outcome that cannot be predicted in advance.
1 Probability- Basic Concepts and Approaches Dr. Jerrell T. Stracener, SAE Fellow Leadership in Engineering EMIS 7370/5370 STAT 5340 : PROBABILITY AND.
Copyright © 2010 Pearson Education, Inc. Chapter 14 From Randomness to Probability.
Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Chapter 4 Probability.
PROBABILITY 1. Basic Terminology 2 Probability 3  Probability is the numerical measure of the likelihood that an event will occur  The probability.
Anifuddin Azis UNCERTAINTY. 2 Introduction The world is not a well-defined place. There is uncertainty in the facts we know: What’s the temperature? Imprecise.
AP Statistics From Randomness to Probability Chapter 14.
Probability and Probability Distributions. Probability Concepts Probability: –We now assume the population parameters are known and calculate the chances.
REASONING UNDER UNCERTAINTY: CERTAINTY THEORY
Virtual University of Pakistan
Chapter 3: Probability Topics
Probability Axioms and Formulas
What is Probability? Quantification of uncertainty.
Basic Probabilistic Reasoning
Probability Probability underlies statistical inference - the drawing of conclusions from a sample of data. If samples are drawn at random, their characteristics.
Honors Statistics From Randomness to Probability
Virtual University of Pakistan
28th September 2005 Dr Bogdan L. Vrusias
Mathematical Foundations of BME Reza Shadmehr
A random experiment gives rise to possible outcomes, but any particular outcome is uncertain – “random”. For example, tossing a coin… we know H or T will.
Presentation transcript:

Uncertainty  Most intelligent systems have some degree of uncertainty associated with them.  The types of uncertainty that can occur in KBS may be caused by problems with the data. 1. Data might be missing or unavailable. 2. Data might be present but unreliable or ambiguous due to measurement errors, multiple conflicting measurements etc. 3. The representation of the data may be imprecise or inconsistent. 4. Data may just be user's best guess. 5. Data may be based on defaults and the defaults may have exceptions.

Alternatively, the uncertainty may be caused by the represented knowledge, since:
- it might represent the expert's best guesses, based on plausible or statistical observations;
- it might not be appropriate in all situations.

Given these numerous sources of error, most KBSs require the incorporation of some form of uncertainty management. While implementing an uncertainty scheme, we must be concerned with three issues:
1. How to represent uncertain data.
2. How to combine two or more pieces of uncertain data.
3. How to draw inferences using uncertain data.

Several methods have been proposed for handling uncertain information.

Probabilities (the oldest approach, with a strong mathematical basis)

The probability that a particular event will occur is equal to the number of ways the event can occur divided by the total number of all possible outcomes.

Example: The probability of throwing two successive heads with a fair coin is 0.25.
- The four possible outcomes are: H-H, H-T, T-H and T-T.
- Since there is only one way of getting H-H, the probability = 1/4 = 0.25.

Probability is a way of turning opinion or expectation into numbers. It is a number between 0 and 1 that reflects the likelihood of an event.
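As a small illustration (not part of the original slides), this counting definition can be coded directly in Prolog; outcome/1, two_tosses/1 and prob_two_heads/1 are made-up helper names:

% Enumerate the sample space for two tosses of a fair coin and
% compute P(two heads) by counting equally likely outcomes.
outcome(h).
outcome(t).

two_tosses([C1, C2]) :- outcome(C1), outcome(C2).

prob_two_heads(P) :-
    findall(O, two_tosses(O), All),                       % [[h,h],[h,t],[t,h],[t,t]]
    findall(O, (two_tosses(O), O == [h, h]), Favourable), % [[h,h]]
    length(All, N),
    length(Favourable, M),
    P is M / N.

% ?- prob_two_heads(P).   gives P = 0.25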

An impossible event has a probability of 0, while a certain event has a probability of 1. We write P(A) for the probability of an event A.

The set of all possible outcomes of a random experiment is called the sample space, S. The probabilities of all the outcomes S = {A1, A2, ..., An} must sum up to certainty, i.e.
P(A1) + P(A2) + ... + P(An) = 1

Event: every non-empty subset A (of the sample space S) is called an event.
- The null set ∅ is an impossible event.
- S itself is a sure event.

Since events are sets, it is clear that all set operations can be performed on events.

1. A  B is an event "both A and B". 2. A  B is an event "either A or B or both" 3. A' is an event "not A" 4. A - B is an event "A but not B 5. Events A and B are mutually exclusive, if A  B=  Axioms of Probability : Let S be a sample space, A and B are events. 1. P(A)  0 2. P(S) = 1 3. P(A ’ ) = 1 - P(A) 4. P(A  B ) = P(A) + P(B), if events A and B are utually exclusive 5. P(A  B ) = P(A) + P(B) – P(A  B) Proof: A  B = A  (A’  B) Therefore, P(A  B) = P(A  (A’  B)) = P(A) + P(A’  B) = P(A) + P(A’  B) + P(A  B) - P(A  B) = P(A) + P(B)- P(A  B)

In general, for mutually exclusive events A1, ..., An in S:
P(A1 ∪ A2 ∪ ... ∪ An) = P(A1) + P(A2) + ... + P(An)

The joint probability of the occurrence of two independent events A and B, written P(A and B), is defined by
P(A and B) = P(A) * P(B)

Example: We toss two fair coins separately. Let
P(A) = 0.5, the probability of getting a head on the first coin
P(B) = 0.5, the probability of getting a head on the second coin
The probability of getting heads on both coins is the joint probability:
P(A and B) = P(A) * P(B) = 0.5 * 0.5 = 0.25

The probability of getting a head on one or both of the coins, i.e. the union of the probabilities P(A) and P(B), is expressed as
P(A or B) = P(A) + P(B) − P(A) * P(B) = 0.5 + 0.5 − 0.25 = 0.75
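A quick Prolog rendering of these two formulas (the helper names joint_prob/3 and union_prob/3 are illustrative, not from the slides):

% P(A and B) and P(A or B) for independent events A and B.
joint_prob(PA, PB, P) :- P is PA * PB.
union_prob(PA, PB, P) :- P is PA + PB - PA * PB.

% ?- joint_prob(0.5, 0.5, P).   P = 0.25
% ?- union_prob(0.5, 0.5, P).   P = 0.75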

Conditional Probability: this relates the probability of one event to the occurrence of another, i.e. the probability of the occurrence of an event E given that an event F is known to have occurred.

The probability of an event E given the occurrence of an event F is denoted by P(E | F) and defined as follows:
P(E | F) = (number of events favourable to F which are also favourable to E) / (number of events favourable to F)
         = P(E and F) / P(F)

Example: What is the probability that a person chosen at random is male, given that the person is 80 years old? Suppose the following are given:
- The probability that a person chosen at random is male is about 0.50.
- The probability that a person chosen at random is 80 years old is 0.005.
- The probability that a person chosen at random is both male and 80 years old is 0.002.
Then the probability that an 80-year-old person chosen at random is male is calculated as follows:
P(X is male | age of X is 80) = P(X is male and age of X is 80) / P(age of X is 80)
                              = 0.002 / 0.005 = 0.4
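A one-line Prolog helper for this calculation (cond_prob/3 is an assumed name, shown as a sketch):

% P(E | F) = P(E and F) / P(F), defined when P(F) > 0.
cond_prob(PEandF, PF, P) :- PF > 0, P is PEandF / PF.

% ?- cond_prob(0.002, 0.005, P).   P = 0.4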

Result: If the events E and F defined on the sample space S of a random experiment are independent, then P(E | F) = P(E).
Proof:
P(E | F) = P(E and F) / P(F)
         = P(E) * P(F) / P(F)
         = P(E)
Similarly, P(F | E) = P(F).

Bayesian Probabilities

This approach relies on the concept that one should incorporate the prior probability of an event into the interpretation of a situation. Bayes' theorem provides a mathematical model for this type of reasoning, where prior beliefs are combined with evidence to get estimates of uncertainty. Bayes' theorem relates the conditional probabilities of events: it allows us to express the probability P(A | B) in terms of the probabilities P(B | A), P(A) and P(B).
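A quick numeric check of this result (values assumed for illustration): with independent events where P(E) = P(F) = 0.5, we have P(E and F) = 0.25, and the conditional probability collapses back to the prior:

% E and F are independent exactly when P(E and F) =:= P(E) * P(F).
independent(PE, PF, PEandF) :- PEandF =:= PE * PF.

% ?- independent(0.5, 0.5, 0.25).   true
% ?- cond_prob(0.25, 0.5, P).       P = 0.5 = P(E), as the result states.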

This is often important, since if P(B | A), P(A) and P(B) are given, then one can compute P(A | B) as follows:
P(A | B) = P(B | A) * P(A) / P(B)
Proof:
P(A | B) = P(A and B) / P(B)
P(B | A) = P(B and A) / P(A), so P(A and B) = P(B | A) * P(A)
Therefore, P(A | B) = P(B | A) * P(A) / P(B)
Here B represents the evidence and A represents the hypothesis.

Some expert systems use Bayesian theory to derive further concepts.

Example: an if-then rule in a rule-based system,
"if X is true then Y can be concluded with probability P"
e.g. IF the patient has a cold THEN the patient will sneeze (0.75)
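Bayes' theorem translates directly into a small Prolog helper (bayes/4 is an assumed name; a sketch, not a library predicate):

% P(A | B) = P(B | A) * P(A) / P(B)
bayes(PBgivenA, PA, PB, PAgivenB) :-
    PB > 0,
    PAgivenB is PBgivenA * PA / PB.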

But if we reason abductively and observe that the patient sneezes, while knowing nothing about whether the patient has a cold, what can we conclude? To make this concrete, consider the following example.

Example: Find the probability that Bob has a cold (the hypothesis H) given that he sneezes (the evidence E), i.e. calculate P(H | E). Suppose that we know in general:
P(H) = P(Bob has a cold) = 0.2
P(E | H) = P(Bob was observed sneezing | Bob has a cold) = 0.75
P(E | ~H) = P(Bob was observed sneezing | Bob does not have a cold) = 0.2
Now
P(H | E) = P(Bob has a cold | Bob was observed sneezing)
         = P(E | H) * P(H) / P(E)

We can compute P(E) as follows:
P(E) = P(E | H) * P(H) + P(E | ~H) * P(~H)
     = (0.75)(0.2) + (0.2)(0.8) = 0.31
Hence
P(H | E) = (0.75 * 0.2) / 0.31 = 0.484
We can conclude that Bob's probability of having a cold, given that he sneezes, is about 0.5.

We can also determine his probability of having a cold if he was not sneezing:
P(H | ~E) = P(~E | H) * P(H) / P(~E)
          = (1 − 0.75) * 0.2 / (1 − 0.31)
          = 0.05 / 0.69
          = 0.072
Hence Bob's probability of having a cold if he was not sneezing is 0.072.
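The whole calculation, including the total-probability step proved on the next slide, fits in a few lines of Prolog (a sketch; the predicate name is made up):

% P(H | E) for the cold/sneezing example.
cold_given_sneeze(PHgivenE) :-
    PH        = 0.2,                             % P(H)
    PEgivenH  = 0.75,                            % P(E | H)
    PEgivenNH = 0.2,                             % P(E | ~H)
    PE is PEgivenH * PH + PEgivenNH * (1 - PH),  % P(E) = 0.31
    PHgivenE is PEgivenH * PH / PE.              % 0.15 / 0.31

% ?- cold_given_sneeze(P).   P = 0.4838...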

Result: Show that P(E) = P(E | H) * P(H) + P(E | ~H) * P(~H).
Proof:
P(E) = P(E and H) + P(E and ~H)
We know that
P(E and H) = P(E | H) * P(H)
P(E and ~H) = P(E | ~H) * P(~H)
Therefore, P(E) = P(E | H) * P(H) + P(E | ~H) * P(~H).

Advantages and Disadvantages of the Bayesian Method

Advantages:
- Bayesian methods have a sound theoretical foundation in probability theory, and thus are currently the most mature of all uncertainty-reasoning methods.
- They have well-defined semantics for decision making.

Disadvantages:
- They require a significant amount of probability data to construct a KB. For example, a diagnostic system with 50 detectable conclusions (R) and 300 relevant, observable characteristics (S) requires a minimum of R*S + R = 50*300 + 50 = 15,050 probability values, assuming that all of the conclusions are mutually exclusive.

- If the conditional probabilities are based on statistical data, the sample sizes must be large enough for the probabilities obtained to be accurate.
- If they are based on human experts, the question arises of whether the values are consistent and comprehensive.
- Reducing the associations between hypothesis and evidence to numbers also eliminates the knowledge embedded within them: the associations that would enable the system to explain its reasoning to a user are lost, as is the ability to browse through the hierarchy of evidence to hypotheses.

Probabilities in Rules and Facts of a Production System

We normally assume that facts are always completely true, but facts might also be only probably true. A probability can be added as the last argument of a predicate representing a fact. For example, the fact "the battery in a randomly picked car is dead 4% of the time" is expressed in Prolog as
battery_dead(0.04).
This fact indicates that the battery is dead with probability 0.04. Similarly, probabilities can also be added to rules.

Examples: Consider the following probabilistic rules and their corresponding Prolog representation.

"30% of the time when the car does not start, the battery is dead":
battery_dead(0.3) :- ignition_not_start(1.0).
Here 30% is the rule probability. If the right-hand side of the rule is certain, we can even write the rule as:
battery_dead(0.3) :- ignition_not_start.

"the battery is dead with the same probability that the voltmeter reading is outside the normal range":
battery_dead(P) :- voltmeter_measurement_abnormal(P).
If we want to ignore very weak evidence for the conclusion, to avoid unnecessary computation, we can rewrite this rule as:
battery_dead(P) :- voltmeter_measurement_abnormal(P), P > 0.1.

Cumulative Probabilities

When we want to reason about whether the battery is dead, we should gather all the relevant rules and facts. Combining the probabilities from facts and successful rules into a cumulative probability of the battery being dead is an important issue. Under the Bayesian approach, based on formal probability theory, there are two issues:

- First, the probability that a rule succeeds depends on the probabilities of the subgoals on the right-hand side of the rule. The cumulative probability of the conclusion is calculated by and-combination.

The probabilities of the subgoals on the right-hand side of the rule are multiplied, assuming all the events are independent of each other, using the formula
Prob(A and B and C and ...) = Prob(A) * Prob(B) * Prob(C) * ...

- Second, rules with the same conclusion can be uncertain for different reasons. When we have more than one rule with the same predicate name, say p, each with a different probability, then to get the cumulative likelihood of p we use or-combination to obtain the overall probability of p. The following formula gives the 'or' probability when the events are mutually independent:
Prob(A or B or C or ...) = 1 − [(1 − Prob(A)) * (1 − Prob(B)) * (1 − Prob(C)) * ...]

Examples:
1. "Half of the time when a car has an electrical problem, the battery is dead":
battery_dead(P) :- electrical_prob(P1), P is P1 * 0.5.
Here 0.5 is the rule probability.

2. "95% of the time when a car has an electrical problem and the battery is old, the battery is dead":
battery_dead(P) :- electrical_prob(P1), battery_old(P2), P is P1 * P2 * 0.95.

3. Find the probability of the battery being dead if the probability of an electrical problem is 0.6 and that of the battery being old is 0.9, using the rule defined in example 2.
Solution: Given P1 = 0.6 and P2 = 0.9,
P = P1 * P2 * 0.95 = 0.6 * 0.9 * 0.95 = 0.513
Therefore, the probability of the battery being dead is 0.513 if the probability of an electrical problem is 0.6 and that of the battery being old is 0.9.
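The solution to example 3 can be checked by loading the rule together with two assumed facts (the values come from the example):

% Facts assumed for illustration:
electrical_prob(0.6).
battery_old(0.9).

battery_dead(P) :- electrical_prob(P1), battery_old(P2), P is P1 * P2 * 0.95.

% ?- battery_dead(P).   P = 0.513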

Computing Cumulative Probabilities using Prolog

We have to assume that the subgoals and rules are independent of each other.

And-combination: collect all the probabilities in a list and compute their product to get the and-combination effect.
and_combination([P], P).
and_combination([H | T], P) :- and_combination(T, P1), P is P1 * H.

Examples:
1. "When a car has an electrical problem and the battery is old, the battery is dead":
battery_dead(P) :- electrical_prob(P1), battery_old(P2), and_combination([P1, P2], P).

2. "95% of the time when a car has electrical problem and battery is old then the battery is dead". Here the rule probability (0.95) is also include in and combination. battery_dead(P) :-electrical_prob(P1), battery_old(P2), and_combination([P1, P2, 0.95], P).  The rule probability can be thought of hidden and is “and' combined along with associated probabilities in the rule. Example:Suppose we have the following program containing rules and facts find(P) :- first(P1), second(P2), third, and_combination([P1, P2, 0.7], P). first(0.5). second(0.75). third. ?- find(P). Ans: P =

Or-combination: collect the probabilities from all rules whose head has the same predicate name into a list, and compute the or-combination probability using the formula
Prob(A or B or C or ...) = 1 − [(1 − Prob(A)) * (1 − Prob(B)) * (1 − Prob(C)) * ...]

or_combination([P], P).
or_combination([H | T], P) :- or_combination(T, P1), P is 1 - ((1 - H) * (1 - P1)).

Examples:
1. Consider the following rules:
get(0.6) :- first.
get(0.8) :- second, third.
get(0.5) :- first, fourth.
We can define another predicate for computing the overall probability (likelihood) of get using or_combination.

overall_get(P) :- findall(P1, get(P1), L), or_combination(L, P).
?- overall_get(P).
Ans: P = 0.96 (assuming the facts first, second, third and fourth all hold, so that L = [0.6, 0.8, 0.5] and P = 1 − 0.4 * 0.2 * 0.5 = 0.96)

2. Consider rules with and-combination as follows:
get(0.6) :- first.
get(0.8) :- second.
get(P) :- third(P1), fourth(P2), and_combination([P1, P2, 0.8], P).
first.
second.
third(0.5).
fourth(0.6).
overall_get(P) :- findall(P1, get(P1), L), or_combination(L, P).
?- overall_get(P).
Ans: L = [0.6, 0.8, 0.24]
     P = 1 − 0.4 * 0.2 * 0.76 = 0.9392

Handling Probabilities with Negated Subgoals

Negated subgoals in rules can cause problems for probabilities. Consider the following rule:
g(P) :- a(P1), b(P2), not(c(P3)).
Here a, b and c are all uncertain. If the rule itself has a certainty factor of 0.8, the cumulative probability P cannot be calculated simply by using and_combination([P1, P2, P3, 0.8], P), because if there is any evidence for c the rule will always fail (and not(c(P3)) never binds P3). The way to handle this is to note that the probability of something being false is 1 minus the probability of it being true. Therefore, the rule is rewritten to look up c's probability directly and invert it:
g(P) :- a(P1), b(P2), c(P3), inverse(P3, P4), and_combination([P1, P2, P4, 0.8], P).
inverse(P, Q) :- Q is 1 - P.
So we should not use the probabilities specified in negated subgoals directly, but use the inverse probabilities when calculating cumulative probabilities.
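One caveat: the rewritten rule fails if c has no recorded probability at all. A guarded variant (a sketch with illustrative fact values, using and_combination/2 from earlier) treats an absent c as certainly false, i.e. P4 = 1:

% Facts assumed for illustration:
a(0.9).
b(0.8).
c(0.3).

g(P) :- a(P1), b(P2),
        ( c(P3) -> P4 is 1 - P3 ; P4 = 1 ),   % inverse probability of c
        and_combination([P1, P2, P4, 0.8], P).

% ?- g(P).   P = 0.9 * 0.8 * 0.7 * 0.8 = 0.4032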

Production System for Diagnosis of Malfunctioning Equipment using Probability

Let us develop a production system for diagnosing a malfunctioning television using probabilities. We have to identify the situations in which the television is not working properly. Broadly, we imagine that one of two things may be improperly adjusted:
- the controls (knobs and switches)
- the receiver (antenna or cable)
Let us estimate the probabilities of these two things.

Knob adjustment: the knobs might be wrongly adjusted on a television for the following reasons (by no means exhaustive):
1. The television is old and requires frequent adjustment.
2. Unusual usage (i.e., anything strange has happened lately that required adjustment of the knobs).
3. Children use the set and play with the knobs.

Let us write the following rules.

1. Rule for frequent adjustment: "If a set requires frequent readjustment, then it is quite reasonable that the set is maladjusted today." Assume that this rule is 50% sure. In Prolog it is coded as:
maladjusted(0.5) :- askif(freq_adjustment_needed).
Here the askif predicate prints a question for the user and checks whether the response is positive or negative.

2. Rule for unusual usage of the system: "It is 40% reasonable that the set is maladjusted today because of unusual usage":
maladjusted(0.4) :- askif(recent_unusual_usage).
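The slides assume askif is already defined. A minimal sketch of the one- and two-argument forms might look like the following (the I/O format, and the idea of letting the user type a probability for the two-argument form, are assumptions, not from the original):

% askif/1: ask a yes/no question; succeed on a positive reply.
askif(Q) :-
    format("~w? (y/n): ", [Q]),
    read(Reply),
    Reply == y.

% askif/2: ask the user to estimate a probability for the question.
askif(Q, P) :-
    format("~w? (probability 0-1): ", [Q]),
    read(P),
    number(P).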

3. Rule for children's usage: "If children are present and use the set and they play with the knobs with some probability, then it is 90% sure that the set is maladjusted today":
maladjusted(P) :- askif(children_present, P1), askif(children_twiddle_knobs, P2), and_combination([P1, P2, 0.9], P).
This rule says that when both conditions hold with some probabilities (children are present and children twiddle the knobs), it is 90% sure that the set is maladjusted today.

Finally, to get the cumulative probability that the set is maladjusted, we need to combine the probabilities of all the rules using the or-combination explained earlier.

The corresponding rule is:
recently_adjusted(P) :- findall(P1, maladjusted(P1), L), or_combination(L, P).

The rule for the diagnosis, which we assume to be 80% sure, might be:
diagnosis('Knobs on the set required adjustment', P) :- recently_adjusted(P1), askif(operation(abnormal)), and_combination([P1, 0.8], P).

Alternatively, if there is some uncertainty about whether the system behaviour is normal, we could include a probability as another argument of the operation predicate:
diagnosis('Knobs on the set required adjustment', P) :- recently_adjusted(P1), askif(operation(abnormal, P2)), and_combination([P1, P2, 0.8], P).

We must note that such rules should be carefully designed in consultation with a domain expert.

Let us consider other kinds of television diagnosis, such as the antenna or cable connection:
source_prob(0.8) :- askif(television_old).
source_prob(0.95) :- askif(antenna_old).
source_prob(0.98) :- askif(cable_old).
source_prob(0.3) :- askif(recent_furniture_rearranging).
overall_source_prob(P) :- findall(X, source_prob(X), L), or_combination(L, P).

Following are two rules for such a diagnosis:
diagnosis('Antenna connection faulty', P) :- askif(source_type_antenna), askif(operation(abnormal)), overall_source_prob(P).

diagnosis('Cable connection faulty', P) :- askif(source_type_cable), askif(operation(abnormal)), overall_source_prob(P).

To run such a system, pose the goal:
?- diagnosis(D, P).
Each answer to this goal binds D to a string representing a diagnosis and P to the corresponding probability.
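To collect every applicable diagnosis with its probability in one query, a wrapper like the following could be used (an illustrative sketch, not part of the original system; note that backtracking through diagnosis/2 will re-ask the askif questions):

% Gather all diagnosis/probability pairs via backtracking.
all_diagnoses(Pairs) :- findall(D-P, diagnosis(D, P), Pairs).

% ?- all_diagnoses(L).
% L = ['Knobs on the set required adjustment'-0.45, ...]   (values depend on the user's answers)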