DEALING WITH UNCERTAINTY (1) WEEK 5 CHAPTER 3
Introduction
The world is not a well-defined place.
There is uncertainty in the facts we know:
– What’s the temperature? Imprecise measures
– Is X a good president? Imprecise definitions
– Where are the potholes? Imprecise knowledge
There is uncertainty in our inferences:
– If I have red scars and an itchy rash and was gardening all weekend, I probably have poison ivy
People make successful decisions all the time anyhow.
Sources of Uncertainty
Uncertain data
– missing data, unreliable, ambiguous, imprecise representation, inconsistent, subjective, derived from defaults, noisy…
Uncertain knowledge
– Multiple causes lead to multiple effects
– Incomplete knowledge of causality in the domain
– Probabilistic/stochastic effects
Uncertain knowledge representation
– restricted model of the real system
– limited expressiveness of the representation mechanism
Inference process
– Derived result is formally correct, but wrong in the real world
– New conclusions are not well-founded (e.g., inductive reasoning)
– Incomplete, default reasoning methods
Reasoning Under Uncertainty
So how do we reason under uncertainty and with inexact knowledge?
– heuristics
  ways to mimic the heuristic knowledge-processing methods used by experts (to limit the search for a solution)
– empirical associations
  experiential reasoning based on limited observations
  verifiable or provable by means of observation or experiment
  guided by practical experience rather than theory, as in medicine
– probabilities
  objective (frequency counting)
  subjective (human experience)
Decision making with uncertainty
Rational behavior:
– For each possible action, identify the possible outcomes
– Compute the probability of each outcome
– Compute the utility of each outcome
– Compute the probability-weighted (expected) utility over possible outcomes for each action
– Select the action with the highest expected utility (principle of Maximum Expected Utility)
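The steps above can be sketched in a few lines of Python. The decision problem, its probabilities, and its utilities are made-up illustrative numbers, not values from the lecture.

```python
# Hypothetical decision problem: whether to carry an umbrella.
# Each action maps to (probability, utility) pairs over its possible outcomes;
# the numbers are invented for illustration only.
ACTIONS = {
    "take_umbrella": [(0.3, 8.0), (0.7, 6.0)],      # rain / no rain
    "leave_umbrella": [(0.3, -10.0), (0.7, 10.0)],  # rain / no rain
}


def expected_utility(outcomes):
    """Probability-weighted sum of utilities for one action."""
    return sum(p * u for p, u in outcomes)


def best_action(actions):
    """Pick the action with the highest expected utility (MEU principle)."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


print(best_action(ACTIONS))
```

With these numbers, taking the umbrella has the higher expected utility even though leaving it at home has the single best outcome.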
Some Relevant Factors
expressiveness
– can concepts used by humans be represented adequately?
– can the confidence of experts in their decisions be expressed?
comprehensibility
– representation of uncertainty
– utilization in reasoning methods
correctness
– probabilities
– relevance ranking
– long inference chains
computational complexity
– feasibility of calculations for practical purposes
reproducibility
– do the observations deliver the same results when repeated?
Basic Probability
Probability theory enables us to make rational decisions.
Which mode of transportation is safer:
– Car or plane?
– What is the probability of an accident?
Basic Probability Theory
Probability as Relative Frequency
An event has a probability. Consider a long sequence of experiments. If we count the number of times a particular event occurs in that sequence and compare it to the total number of experiments, we can compute a ratio. This ratio is one way of estimating the probability of the event.
P(E) = (# of times E occurred) / (total # of trials)
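A minimal sketch of the ratio above, assuming a simulated fair die stands in for the sequence of experiments. The helper names `relative_frequency` and `roll_die` are illustrative, and the seed is fixed only to make the run repeatable.

```python
import random


def relative_frequency(event, trial, n_trials, seed=0):
    """Estimate P(event) as (# of times event occurred) / (total # of trials)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_trials) if event(trial(rng)))
    return hits / n_trials


def roll_die(rng):
    """One experiment: roll a simulated fair six-sided die."""
    return rng.randint(1, 6)


# Estimate the probability of rolling a six from 100,000 simulated rolls.
estimate = relative_frequency(lambda face: face == 6, roll_die, 100_000)
print(estimate)
```

The printed estimate should land close to 1/6, though not exactly on it.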
Theoretical Probability
Principle of Indifference: alternatives are always to be judged equiprobable if we have no reason to expect or prefer one over the other.
Each outcome in the sample space is assigned equal probability.
Example: throw a die
– P({1}) = P({2}) = ... = P({6}) = 1/6
Law of Large Numbers
As the number of experiments increases, the relative frequency of an event more closely approximates the theoretical probability of the event
– if the theoretical assumptions hold.
Buffon’s Needle for Computing π
– Draw parallel lines 1 inch apart on a plane
– Throw a 1-inch needle on the plane
– P(needle crossing a line) = 2/π
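Buffon’s needle can be simulated as a rough check on the 2/π figure. This is a sketch under the usual modeling assumptions: the needle’s center distance to the nearest line and its angle are both uniformly distributed. The code draws the angle using `math.pi`, so it illustrates the law of large numbers rather than giving an independent way to compute π.

```python
import math
import random


def buffon_crossing_frequency(n_throws, seed=0):
    """Relative frequency of a 1-inch needle crossing lines spaced 1 inch apart."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(n_throws):
        # Distance from the needle's center to the nearest line, in [0, 0.5].
        distance = rng.uniform(0.0, 0.5)
        # Acute angle between the needle and the lines, in [0, pi/2].
        theta = rng.uniform(0.0, math.pi / 2)
        # The needle crosses when its half-length projection reaches the line.
        if distance <= 0.5 * math.sin(theta):
            crossings += 1
    return crossings / n_throws


frequency = buffon_crossing_frequency(200_000)
print(frequency, 2 / math.pi)  # observed frequency vs. theoretical 2/pi
print(2 / frequency)           # the implied estimate of pi
```

Increasing `n_throws` should tighten the agreement between the two printed values on the first line.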
Large Number Reveals Untruth in Assumptions
Results of 1,000,000 throws of a die
[Table with columns "Number" and "Fraction"; values not recovered]
Why?
Axioms of Probability Theory
Suppose P(.) is a probability function. Then:
1. for any event E, 0 ≤ P(E) ≤ 1. How?
2. P(S) = 1, where S is the sample space.
3. for any two mutually exclusive events E1 and E2, P(E1 ∪ E2) = P(E1) + P(E2).
Any function that satisfies the above three axioms is a probability function.
Joint Probability
Let A and B be two events. The joint probability of both A and B being true is denoted by P(A, B).
Example:
P(spade) is the probability of the top card being a spade.
P(king) is the probability of the top card being a king.
P(spade, king) is the probability of the top card being both a spade and a king, i.e., the king of spades.
Is P(king, spade) = P(spade, king)?
Properties of Probability
1. P(¬E) = 1 – P(E)
2. If E1 and E2 are logically equivalent, then P(E1) = P(E2).
– E1: Not all philosophers are more than six feet tall.
– E2: Some philosopher is not more than six feet tall.
Then P(E1) = P(E2).
3. P(E1, E2) ≤ P(E1).
Conditional Probability
The probability of an event may change once we know that another event has occurred. The probability of A given B is denoted by P(A|B).
Example
– P(W=space): the probability that a randomly selected word from an English text is ‘space’
– P(W=space | W’=outer): the probability of ‘space’ if the previous word is ‘outer’
Example
A: the top card of a deck of poker cards is the king of spades
P(A) = 1/52
However, if we know
B: the top card is a king
then the probability of A given that B is true is
P(A|B) = 1/4.
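Both numbers in this example can be checked by enumerating a standard 52-card deck. The sketch below treats conditioning on B as restricting the sample space to the cards where B holds; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = list(product(RANKS, SUITS))  # 52 equally likely top cards


def is_king(card):
    return card[0] == "K"


def is_king_of_spades(card):
    return card == ("K", "spades")


def prob(event, given=lambda card: True):
    """P(event | given), counting only the cards for which `given` holds."""
    space = [card for card in DECK if given(card)]
    return Fraction(sum(1 for card in space if event(card)), len(space))


print(prob(is_king_of_spades))                 # P(A)   -> 1/52
print(prob(is_king_of_spades, given=is_king))  # P(A|B) -> 1/4
```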
How to Compute P(A|B)?
Business Students
Of 100 students completing a course, 20 were business majors. Ten students received an A in the course, and three of these were business majors. Suppose A is the event that a randomly selected student got an A in the course, and B is the event that a randomly selected student is a business major.
What is the probability of A?
What is the probability of A after knowing B is true?
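One way to work the two questions, using the ratio definition P(A|B) = P(A, B) / P(B) that appears later in these notes:

```latex
\begin{align*}
P(A)        &= \frac{10}{100} = 0.10 \\
P(B)        &= \frac{20}{100} = 0.20 \\
P(A, B)     &= \frac{3}{100}  = 0.03 \\
P(A \mid B) &= \frac{P(A, B)}{P(B)} = \frac{0.03}{0.20} = 0.15
\end{align*}
```

Equivalently, 3 of the 20 business majors received an A, so knowing B raises the probability of A from 0.10 to 0.15.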
Probabilistic Reasoning
Evidence – What we know about a situation.
Hypothesis – What we want to conclude.
Compute – P(Hypothesis | Evidence)
Credit Card Authorization
E is the data about the applicant's age, job, education, income, credit history, etc.
H is the hypothesis that the credit card will provide a positive return.
The decision of whether to issue the credit card to the applicant is based on the probability P(H|E).
Medical Diagnosis
E is a set of symptoms, such as coughing, sneezing, headache, ...
H is a disorder, e.g., common cold, SARS, flu.
The diagnosis problem is to find an H (disorder) such that P(H|E) is maximum.
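A toy sketch of "find the H that maximizes P(H|E)", assuming we have counts of past cases for each (disorder, symptom pattern) pair. The counts and the symptom labels are invented for illustration and carry no medical meaning. P(H|E) is computed as P(H, E) / P(E) from the counts.

```python
from fractions import Fraction

# Hypothetical case counts: (disorder, symptom pattern) -> number of past cases.
CASES = {
    ("common cold", "cough+sneeze"): 60,
    ("flu", "cough+sneeze"): 25,
    ("SARS", "cough+sneeze"): 1,
    ("common cold", "cough+fever"): 10,
    ("flu", "cough+fever"): 45,
    ("SARS", "cough+fever"): 4,
}


def posterior(evidence):
    """Return {H: P(H | E)} for the given evidence, via P(H, E) / P(E)."""
    matching = {h: n for (h, e), n in CASES.items() if e == evidence}
    total = sum(matching.values())  # proportional to P(E)
    return {h: Fraction(n, total) for h, n in matching.items()}


def diagnose(evidence):
    """Pick the disorder H with the largest P(H | E)."""
    post = posterior(evidence)
    return max(post, key=post.get)


print(diagnose("cough+fever"), posterior("cough+fever"))
```

An evidence pattern that never occurs in `CASES` has P(E) = 0, and this sketch makes no attempt to handle that case.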
Basics of Probability Theory
mathematical approach for processing uncertain information
– sample space set X = {x1, x2, …, xn}
  collection of all possible events
  can be discrete or continuous
– probability number P(xi): likelihood of an event xi to occur
  non-negative value in [0,1]
  total probability of the sample space is 1
  for mutually exclusive events, the probability for at least one of them is the sum of their individual probabilities
– experimental probability
  based on the frequency of events
– subjective probability
  based on expert assessment
Compound Probabilities
describes independent events
– do not affect each other in any way
joint probability of two independent events A and B
P(A ∩ B) = P(A) * P(B)
union probability of two independent events A and B
P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = P(A) + P(B) - P(A) * P(B)
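Both formulas can be checked by enumeration on a small example. The sketch assumes two fair dice, with A = "first die shows 6" and B = "second die shows 6" as the independent events; these events are chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
SPACE = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event as (favorable outcomes) / (all outcomes)."""
    return Fraction(sum(1 for w in SPACE if event(w)), len(SPACE))


def a(w):
    return w[0] == 6  # A: first die shows 6


def b(w):
    return w[1] == 6  # B: second die shows 6


p_a, p_b = prob(a), prob(b)
p_and = prob(lambda w: a(w) and b(w))  # P(A ∩ B)
p_or = prob(lambda w: a(w) or b(w))    # P(A ∪ B)

print(p_and, p_a * p_b)        # both 1/36
print(p_or, p_a + p_b - p_and)  # both 11/36
```

The first equality in the union formula holds for any two events; only the final substitution of P(A) * P(B) for P(A ∩ B) relies on independence.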
Probability theory
Random variables: Alarm, Burglary, Earthquake
– Domain: Boolean (like these), discrete, continuous
Atomic event: complete specification of state
– Alarm=True ∧ Burglary=True ∧ Earthquake=False (alarm ∧ burglary ∧ ¬earthquake)
Prior probability: degree of belief without any other evidence
– P(Burglary) = .1
Joint probability: matrix of combined probabilities of a set of variables
– P(Alarm, Burglary) = [2×2 table over alarm/¬alarm and burglary/¬burglary; only the entries .1 and .8 are recoverable]
Independence
When two sets of propositions do not affect each other’s probabilities, we call them independent, and can easily compute their joint and conditional probability:
– Independent(A, B) if P(A ∧ B) = P(A) P(B), P(A | B) = P(A)
For example, {moon-phase, light-level} might be independent of {burglary, alarm, earthquake}
– Then again, it might not: burglars might be more likely to burglarize houses when there’s a new moon (and hence little light)
– But if we know the light level, the moon phase doesn’t affect whether we are burglarized
– Once we’re burglarized, light level doesn’t affect whether the alarm goes off
We need a more complex notion of independence, and methods for reasoning about these kinds of relationships
Conditional independence
Absolute independence:
– A and B are independent if P(A ∧ B) = P(A) P(B); equivalently, P(A) = P(A | B) and P(B) = P(B | A)
A and B are conditionally independent given C if
– P(A ∧ B | C) = P(A | C) P(B | C)
This lets us decompose the joint distribution:
– P(A ∧ B ∧ C) = P(A | C) P(B | C) P(C)
Moon-Phase and Burglary are conditionally independent given Light-Level
Conditional independence is weaker than absolute independence, but still useful in decomposing the full joint probability distribution
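The decomposition can be illustrated with a small joint distribution built as P(A | C) P(B | C) P(C), where A = new moon, B = burglary, and C = light level. All the probabilities below are invented for illustration. The sketch then checks that A and B are conditionally independent given C, yet not absolutely independent.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical numbers, chosen only for illustration.
P_C = {"dark": F(1, 2), "bright": F(1, 2)}           # P(C): light level
P_A_GIVEN_C = {"dark": F(3, 4), "bright": F(1, 4)}   # P(new moon | light level)
P_B_GIVEN_C = {"dark": F(1, 5), "bright": F(1, 20)}  # P(burglary | light level)


def bern(p, value):
    """Probability that a Boolean variable with P(True) = p takes `value`."""
    return p if value else 1 - p


# Full joint via the decomposition P(A ∧ B ∧ C) = P(A | C) P(B | C) P(C).
JOINT = {
    (a, b, c): bern(P_A_GIVEN_C[c], a) * bern(P_B_GIVEN_C[c], b) * P_C[c]
    for a, b, c in product([True, False], [True, False], P_C)
}


def prob(pred):
    """Sum the joint over all atomic events (a, b, c) satisfying `pred`."""
    return sum(p for world, p in JOINT.items() if pred(*world))


def conditionally_independent_given_c():
    """Check P(A=va ∧ B=vb | C) = P(A=va | C) P(B=vb | C) for every value combination."""
    for c in P_C:
        p_c = prob(lambda a, b, cc: cc == c)
        for va, vb in product([True, False], repeat=2):
            p_ab = prob(lambda a, b, cc: a == va and b == vb and cc == c) / p_c
            p_a = prob(lambda a, b, cc: a == va and cc == c) / p_c
            p_b = prob(lambda a, b, cc: b == vb and cc == c) / p_c
            if p_ab != p_a * p_b:
                return False
    return True


p_a = prob(lambda a, b, c: a)
p_b = prob(lambda a, b, c: b)
p_ab = prob(lambda a, b, c: a and b)

print(conditionally_independent_given_c())  # True
print(p_ab, p_a * p_b)                      # 13/160 vs 1/16: not independent
```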
Conditional Probabilities
describes dependent events
– affect each other in some way
conditional probability of event A given that event B has already occurred
P(A|B) = P(A ∩ B) / P(B)
Q & A