
1 Uncertainty in Environments
COMP3710 Artificial Intelligence, Computing Science, Thompson Rivers University

2 Course Outline
Part I – Introduction to Artificial Intelligence
Part II – Classical Artificial Intelligence
Part III – Machine Learning
  Introduction to Machine Learning
  Neural Networks
  Probabilistic Reasoning and Bayesian Belief Networks
  Artificial Life: Learning through Emergent Behavior
Part IV – Advanced Topics

3 Learning Outcomes
Relate a proposition to probability, i.e., to random variables.
Give the sample space for a set of random variables.
Use Kolmogorov's three axioms, e.g., P(A ∨ B) = P(A) + P(B) − P(A ∧ B).
Give the joint probability distribution for a set of random variables.
Compute probabilities given a joint probability distribution.
Give the joint probability distribution for a set of random variables that have conditional independence relations.
Compute a diagnostic probability from causal probabilities.

4 Outline
How to deal with uncertainty in environments
Probability
Syntax and semantics
Inference
Independence and Bayes' rule

5 1. Uncertainty in Environments
An example of a decision-making problem: let action A_t = leave for the airport t minutes before the flight. Will A_t get me there on time? [Q] How to decide?
Problems: application environments are nondeterministic and unpredictable, with many hidden factors:
partial observability (road state, other drivers' plans, etc.)
noisy sensors (traffic reports)
uncertainty in action outcomes (flat tire, etc.)
immense complexity of modeling and predicting traffic
Hence a purely logical approach either risks falsehood – "A_25 will get me there on time" – or leads to conclusions that are too weak for decision making: "A_25 will get me there on time if there's no accident on the bridge, and it doesn't rain, and my tires remain intact, etc." (A_1440 might reasonably be said to get me there on time, but I'd have to stay overnight in the airport.)

6 Methods for Handling Uncertainty
[Q] How to handle uncertainty?
[Q] Default logic? Default assumptions: my car does not have a flat tire; ...; A_25 works unless contradicted by evidence. But which assumptions are reasonable?
[Q] Can we use fuzzy logic?
Probability – uncertainty/certainty: a model of degree of belief given the available evidence or knowledge about the environment, e.g., "A_25 will get me there on time with probability 0.04."
Fuzzy logic – ambiguity/clearness: degree of truth of a value (observed data/facts), e.g., my age.

7 2. Probability
Probability is a way of expressing knowledge or belief that an event will occur or has occurred.
Probabilistic assertions summarize the effects of hidden factors:
laziness: failure to enumerate exceptions, qualifications, etc.
ignorance: lack of relevant facts, initial conditions, etc.
Subjective or Bayesian probability: probabilities relate propositions to the agent's own state of knowledge (or conditions), e.g., P(A_25 | no reported accidents) = 0.06.
Probabilities of propositions change with new evidence, e.g., P(A_25 | no reported accidents, 5 a.m.) = 0.15.
Hence probability theory can be a good tool.

8 Making Decisions under Uncertainty
Suppose I believe the following:
P(A_25 gets me there on time | …) = 0.04
P(A_90 gets me there on time | …) = 0.70
P(A_120 gets me there on time | …) = 0.95
P(A_1440 gets me there on time | …) = …
[Q] Which action to choose? It depends on my preferences for missing the flight vs. time spent waiting, etc.
Utility theory (utility: the quality of being useful) is used to represent and infer preferences.
Decision theory = probability theory + utility theory

9 3. Syntax and Semantics
Sample space, event, probability, probability distribution
Random variable and proposition
Prior probability (aka unconditional probability)
Posterior probability (aka conditional probability)

10 Sample Space and Probability
Definition. Begin with a set Ω – the sample space – of atomic events of the same type. E.g., the 6 possible rolls of a die -> Ω = {???}. ω in Ω is a sample point / possible world / atomic event.
Definition. A probability space or probability distribution or probability model is a sample space with an assignment P(ω) for every ω in Ω s.t.
0 ≤ P(ω) ≤ 1
Σ_ω P(ω) = 1
E.g., P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6. [Q] Sample space? [Q] Probability distribution?
Definition. An event A can be considered as a subset of Ω, with P(A) = Σ_{ω in A} P(ω).
E.g., P(die roll < 4) = P(1) + P(2) + P(3) = 1/6 + 1/6 + 1/6 = 1/2. (See the sketch below.)

Surface:     1    2    3    4    5    6
Probability: 1/6  1/6  1/6  1/6  1/6  1/6
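A minimal Python sketch of these definitions, representing the die's distribution as a dictionary and computing an event's probability by summing over its atomic events (the names P_die and prob_event are illustrative, not from the slides):

```python
# Sample space of a fair die: each atomic event has probability 1/6.
P_die = {w: 1/6 for w in range(1, 7)}

def prob_event(P, event):
    """P(A) = sum of P(w) over the atomic events w in A."""
    return sum(P[w] for w in event)

# Event "die roll < 4" = {1, 2, 3}
A = {w for w in P_die if w < 4}
print(prob_event(P_die, A))                   # 0.4999... ~ 1/2
print(abs(sum(P_die.values()) - 1) < 1e-12)   # distribution sums to 1: True
```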

11 3. Axioms of Probability
Kolmogorov's axioms: for any propositions (events) A, B
0 ≤ P(A) ≤ 1
P(true) = 1 and P(false) = 0
P(A ∨ B) = P(A) + P(B) − P(A ∧ B)
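As a sanity check, the third axiom (inclusion-exclusion) can be verified numerically on the fair-die distribution from the previous slide; this is a sketch, with event names chosen for illustration:

```python
# Fair die and two events: A = roll < 4, B = roll is even.
P_die = {w: 1/6 for w in range(1, 7)}
prob = lambda event: sum(P_die[w] for w in event)

A = {1, 2, 3}
B = {2, 4, 6}
# Kolmogorov's third axiom: P(A or B) = P(A) + P(B) - P(A and B).
lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)
print(abs(lhs - rhs) < 1e-12)  # True
```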

12 Random Variable and Proposition
Definition. The basic element is a random variable, representing a sample space: a function from sample points to some range, e.g., the reals or Booleans. E.g., Odd: N -> {true, false}, with Odd(1) = true.
Boolean random variables. E.g., Cavity (do I have a cavity?). [Q] What is the sample space?
Discrete random variables (finite or infinite). E.g., Weather is one of <sunny, rainy, cloudy, snow>. [Q] What is the sample space?
Continuous random variables (bounded or unbounded). E.g., RoomTemperature < 22.0. [Q] What is the sample space?

13 3. Proposition (how to relate to probability?)
Think of a proposition as the event where the proposition is true. E.g., "I have a toothache"; "the weather is sunny and there is no cavity." [Q] How to relate propositions to probability, i.e., to random variables?
Similar to propositional logic: possible worlds are defined by assignments of values to random variables. A sample point is an assignment of values to a set of random variables, and a sample space is the Cartesian product of the ranges of the random variables. [Q] What is the sample space for Cavity and Weather?
An elementary proposition is constructed by assigning a value to a random variable, e.g., Weather = sunny; Cavity = false (can be abbreviated as ¬cavity).
Complex propositions are formed from elementary propositions and the standard logical connectives, e.g., Weather = sunny ∧ Cavity = false.
A proposition corresponds to a disjunction of the atomic events in which it holds. (See the sketch below.)
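A small sketch of this view in Python: the sample space for Weather and Cavity is the Cartesian product of their ranges, and a proposition is just the subset of sample points where it is true. The variable names are illustrative:

```python
from itertools import product

weather_vals = ["sunny", "rainy", "cloudy", "snow"]
cavity_vals = [True, False]

# Sample space = Cartesian product of the variables' ranges (8 sample points).
omega = list(product(weather_vals, cavity_vals))

# A proposition is the set of sample points (atomic events) where it is true.
sunny_and_no_cavity = [(w, c) for (w, c) in omega if w == "sunny" and not c]
print(len(omega))             # 8
print(sunny_and_no_cavity)    # [('sunny', False)]
```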

14 Prior and Posterior Probability
Prior (aka unconditional) probability
Posterior (aka conditional) probability

15 Prior Probability
Prior or unconditional probabilities of propositions, e.g., P(Cavity = true) = 0.1 and P(Weather = sunny) = 0.72, correspond to belief prior to the arrival of any (new) evidence.
A probability distribution gives values for all possible assignments:
P(Weather) = <0.72, 0.1, 0.08, 0.1> (normalized, i.e., sums to 1)
A joint probability distribution for a set of random variables gives the probability of every atomic event on those random variables. P(Weather, Cavity) = a 4 × 2 matrix of values:

Weather =       sunny  rainy  cloudy  snow
Cavity = true     ...    ...     ...   ...
Cavity = false    ...    ...     ...   ...

[Q] What is the sum of all the values above?
Every question about a domain can be answered with the joint probability distribution. [Q] Any problem? The size of the table.

16 P(Weather, Cavity) = a 4 × 2 matrix of values:

Weather =       sunny  rainy  cloudy  snow
Cavity = true     ...    ...     ...   ...
Cavity = false    ...    ...     ...   ...

[Q] What is the probability of cloudy?
[Q] What is the probability of snow and no-cavity?
[Q] What is the probability of sunny or cavity?
P(W = sunny ∨ C = true) = P(W = sunny) + P(C = true) − P(W = sunny ∧ C = true) = ( ) + ( ) − ???

17 Prior Probability – Continuous Variables
In general, express the probability distribution as a parameterized function of value, called a probability density function (pdf). E.g., P(X = x) = U[18, 26](x) = 0.125, the uniform density between 18 and 26.
Here P is a density; it integrates to 1.
[Figure: a uniform density of height 0.125 on the interval from 18 to 26.]
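A short sketch of this pdf in Python, with a crude Riemann-sum check that the density integrates to 1 (the function name and grid parameters are illustrative):

```python
def uniform_pdf(x, a=18.0, b=26.0):
    """Density of U[a, b]: 1/(b - a) inside the interval, 0 outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

print(uniform_pdf(20))  # 0.125

# Riemann-sum check that the density integrates to 1.
n, lo, hi = 100_000, 0.0, 50.0
dx = (hi - lo) / n
total = sum(uniform_pdf(lo + i * dx) * dx for i in range(n))
print(abs(total - 1.0) < 1e-3)  # True
```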

18 Example of the Normal Distribution, Also Called the Gaussian Distribution
[Figure: normal density curves; events on the x-axis, probabilities as areas under the curve. This image is from Wikipedia.org.]

19 3. Interesting Facts
P(A) = 1 − P(¬A)
P(A) = P(A ∧ B) + P(A ∧ ¬B) for any B

20 Conditional Probability
Posterior or conditional probabilities, e.g., P(cavity | toothache) = 0.6, i.e., given that toothache is all I know (not "if toothache, then ..."), there is a 60% chance of cavity.
Another example: P(toothache | cavity) = 0.8.
New evidence/observations/facts may be irrelevant, allowing simplification; e.g., when Weather is irrelevant to Cavity, P(cavity | toothache, sunny) = P(cavity | toothache) = 0.6. This kind of inference, sanctioned by domain knowledge, is crucial.
Very interesting examples:
[Q] A stiff neck is one symptom of meningitis; what is the probability of having meningitis given a stiff neck?
[Q] A car won't start. What is the probability of a bad starter? What is the probability of a bad battery?

21 Definition of Conditional Probability
P(a | b) = P(a ∧ b) / P(b) if P(b) > 0
The product rule gives an alternative formulation:
P(a ∧ b) = P(a) P(b | a) = P(b) P(a | b)
The chain rule is derived by successive application of the product rule:
P(a ∧ b ∧ c) = ??? = P(a ∧ b) P(c | a ∧ b) = P(a) P(b | a) P(c | a ∧ b)
P(X1, …, Xn) = P(X1, …, Xn−1) P(Xn | X1, …, Xn−1)
= P(X1, …, Xn−2) P(Xn−1 | X1, …, Xn−2) P(Xn | X1, …, Xn−1)
= …
= P(X1) P(X2 | X1) P(X3 | X1, X2) P(X4 | X1, X2, X3) …
= Π_{i=1..n} P(Xi | X1, …, Xi−1)
This will be used in Inference later! (See the sketch below.)
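A quick numeric illustration of the product and chain rules, using the dental numbers that appear in the slides below (P(cavity) = 0.2, P(toothache | cavity) = 0.6, P(catch | cavity) = 0.9); a sketch, not part of the original deck:

```python
p_c = 0.2            # P(cavity)
p_t_given_c = 0.6    # P(toothache | cavity)
p_k_given_tc = 0.9   # P(catch | toothache ∧ cavity); equals P(catch | cavity)
                     # here, by a conditional independence shown later.

# Product rule: P(t ∧ c) = P(c) P(t | c)
p_tc = p_c * p_t_given_c           # 0.12
# Chain rule: P(t ∧ k ∧ c) = P(c) P(t | c) P(k | t ∧ c)
p_tkc = p_tc * p_k_given_tc        # 0.108
print(p_tc, p_tkc)
```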

22 3. Interesting Fact
P(A | B) = 1 − P(¬A | B)

23 4. Inference by Enumeration
Start with the joint probability distribution over Toothache (I have a toothache), Cavity (meaning?), and Catch (the dentist's probe catches in my tooth):

                toothache            ¬toothache
             catch    ¬catch      catch    ¬catch
cavity       0.108    0.012       0.072    0.008
¬cavity      0.016    0.064       0.144    0.576

[Q] What are the random variables here?
[Q] What is the sum of the probabilities of all the atomic events?
[Q] How can you obtain the above probability distribution?
For any proposition φ, the probability is the sum over the atomic events where it is true:
P(φ) = Σ_{ω: ω ⊨ φ} P(ω)

24 For Any Proposition φ, Sum the Atomic Events Where It Is True
Start with the joint probability distribution over Toothache, Cavity, Catch (the table above).
P(φ) = Σ_{ω: ω ⊨ φ} P(ω)
[Q] P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2
[Q] P(cavity ∨ toothache) = ???
[Q] P(cavity ∧ toothache) = ???
(See the sketch below.)
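A short Python sketch of inference by enumeration over this joint table; dictionary keys are (toothache, catch, cavity) triples, and the helper name prob is illustrative:

```python
# Joint distribution P(Toothache, Catch, Cavity); keys are (t, k, c) triples.
joint = {
    (True,  True,  True):  0.108, (True,  False, True):  0.012,
    (False, True,  True):  0.072, (False, False, True):  0.008,
    (True,  True,  False): 0.016, (True,  False, False): 0.064,
    (False, True,  False): 0.144, (False, False, False): 0.576,
}

def prob(phi):
    """P(phi) = sum over the atomic events where proposition phi is true."""
    return sum(p for w, p in joint.items() if phi(*w))

print(prob(lambda t, k, c: t))        # P(toothache) = 0.2
print(prob(lambda t, k, c: c or t))   # P(cavity or toothache) = 0.28
print(prob(lambda t, k, c: c and t))  # P(cavity and toothache) = 0.12
```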

25 Can Also Compute Conditional Probabilities: Very Important!
Start with the joint probability distribution over Toothache, Cavity, Catch (the table above).
P(¬cavity | toothache) = P(¬cavity ∧ toothache) / P(toothache) = (0.016 + 0.064) / 0.2 = 0.4
[Q] P(cavity | toothache) = ???
[Q] P(cavity | catch) = ???
[Q] P(cavity | ¬toothache) = ???

26 The Denominator Can Be Viewed as a Normalization Constant α
P(Cavity | toothache) = P(Cavity, toothache) / P(toothache)
= α P(Cavity, toothache)
= α [P(Cavity, toothache, catch) + P(Cavity, toothache, ¬catch)]
= α [<0.108, 0.016> + <0.012, 0.064>]
= α <0.12, 0.08>
= <0.6, 0.4>
[Q] What is α?
General idea: compute the distribution on the query variable by fixing the evidence variables and summing over the hidden variables.
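The same normalization step as a Python sketch: fix the evidence (toothache), sum out the hidden variable (Catch), then normalize. The joint table is the one from slide 23:

```python
# Joint P(Toothache, Catch, Cavity); keys are (t, k, c) triples.
joint = {(True, True, True): 0.108, (True, False, True): 0.012,
         (False, True, True): 0.072, (False, False, True): 0.008,
         (True, True, False): 0.016, (True, False, False): 0.064,
         (False, True, False): 0.144, (False, False, False): 0.576}

# P(Cavity | toothache) = alpha * P(Cavity, toothache).
p = {c: sum(q for (t, k, cc), q in joint.items() if t and cc == c)
     for c in (True, False)}                 # {True: 0.12, False: 0.08}
alpha = 1.0 / sum(p.values())                # 1 / P(toothache) = 5.0
print({c: alpha * q for c, q in p.items()})  # {True: 0.6, False: 0.4}
```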

27 Obvious Problems
For n random variables:
Worst-case time complexity O(d^n), where d is the largest arity.
Space complexity O(d^n) to store the joint distribution.
How do we find the numbers for the O(d^n) entries?
In other words, too much. [Q] How to reduce?

28 5. Independence & Bayes' Rule
Conditional independence
Bayes' rule

29 5. Independence
A and B are independent iff P(A | B) = P(A), or P(B | A) = P(B), or P(A, B) = P(A) P(B).
E.g., P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) × P(Weather).
In the above example, 32 (2 × 2 × 2 × 4) entries are reduced to 12 (2 × 2 × 2 + 4). (See the sketch below.)
Absolute independence is powerful but rare. Dentistry is a large field with hundreds of variables, none of which are independent. [Q] What do we have to do?
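A sketch of how absolute independence factors the 32-entry joint into 8 + 4 = 12 numbers; P(Weather) is the distribution from slide 15 and the dental joint is the table from slide 23:

```python
from itertools import product

# With Weather independent of the dental variables, the 32-entry joint
# factors into an 8-entry table times a 4-entry table.
p_weather = {"sunny": 0.72, "rainy": 0.1, "cloudy": 0.08, "snow": 0.1}
p_dental = {(True, True, True): 0.108, (True, False, True): 0.012,
            (False, True, True): 0.072, (False, False, True): 0.008,
            (True, True, False): 0.016, (True, False, False): 0.064,
            (False, True, False): 0.144, (False, False, False): 0.576}

def joint(t, k, c, w):
    """P(t, k, c, w) = P(t, k, c) * P(w) under absolute independence."""
    return p_dental[(t, k, c)] * p_weather[w]

total = sum(joint(t, k, c, w)
            for (t, k, c), w in product(p_dental, p_weather))
print(abs(total - 1.0) < 1e-9)  # the factored joint still sums to 1: True
```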

30 Conditional Independence
If I have a cavity, the probability that the probe catches in it doesn't depend on whether I have a toothache:
(1) P(catch | toothache ∧ cavity) = P(catch | cavity)
The same independence holds if I haven't got a cavity:
(2) P(catch | toothache ∧ ¬cavity) = P(catch | ¬cavity)
Catch is conditionally independent of Toothache given Cavity:
P(Catch | Toothache, Cavity) = P(Catch | Cavity)
When Catch is conditionally independent of Toothache given Cavity, equivalent statements are:
P(Toothache | Catch, Cavity) = P(Toothache | Cavity)
P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)
(similarly, P(A, B) = P(A) P(B) when A and B are independent)

31 Write Out the Full Joint Distribution Using the Chain Rule
P(Toothache, Catch, Cavity)
= P(Toothache | Catch, Cavity) P(Catch, Cavity)   by the product rule
= P(Toothache | Catch, Cavity) P(Catch | Cavity) P(Cavity)   by ???
= P(Toothache | Cavity) P(Catch | Cavity) P(Cavity)   by ???
2 × 2 × 2 = 8 entries -> how many numbers do we need now???
[Diagram: Cavity -> Toothache, Cavity -> Catch]
[Q] P(toothache) from the tables below???
= P(t ∧ cavity) + P(t ∧ ¬cavity) = P(t | c) P(c) + P(t | ¬c) P(¬c) = ???
[Q] P(¬t | cavity) = ??? = 1 − P(t | c) = ???

P(toothache | cavity) = 0.6    P(toothache | ¬cavity) = 0.1
P(catch | cavity) = 0.9        P(catch | ¬cavity) = 0.2
P(cavity) = 0.2

32 Write Out the Full Joint Distribution Using the Chain Rule
P(Toothache, Catch, Cavity)
= P(Toothache | Catch, Cavity) P(Catch, Cavity)
= P(Toothache | Catch, Cavity) P(Catch | Cavity) P(Cavity)
= P(Toothache | Cavity) P(Catch | Cavity) P(Cavity)
In most cases, the use of conditional independence reduces the size of the representation of the joint distribution from exponential in n to linear in n. Conditional independence is our most basic and robust form of knowledge about uncertain environments.
[Diagram: Cavity -> Toothache, Cavity -> Catch]
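A sketch of this factorization in Python: the full 8-entry joint is rebuilt from the 5 numbers in the conditional probability tables on slide 31:

```python
from itertools import product

# Conditional probability tables (numbers from slide 31).
p_cavity = 0.2
p_toothache_given = {True: 0.6, False: 0.1}   # P(toothache | Cavity)
p_catch_given = {True: 0.9, False: 0.2}       # P(catch | Cavity)

def joint(t, k, c):
    """P(t, k, c) = P(t | c) * P(k | c) * P(c) by conditional independence."""
    pc = p_cavity if c else 1 - p_cavity
    pt = p_toothache_given[c] if t else 1 - p_toothache_given[c]
    pk = p_catch_given[c] if k else 1 - p_catch_given[c]
    return pt * pk * pc

print(joint(True, True, True))   # 0.6 * 0.9 * 0.2 = 0.108
total = sum(joint(t, k, c) for t, k, c in product([True, False], repeat=3))
print(abs(total - 1.0) < 1e-12)  # True
```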

33 How to Show Conditional Independence
5. Interesting fact: P(A, B | C) = P(A | C) P(B | C) when A and B are conditionally independent of each other given C. (Actually, this is a definition.)
P(toothache ∧ catch | cavity) = 0.108 / 0.2 = 0.54
P(toothache | cavity) = (0.108 + 0.012) / 0.2 = 0.12 / 0.2 = 0.6
P(catch | cavity) = (0.108 + 0.072) / 0.2 = 0.18 / 0.2 = 0.9
P(toothache | cavity) P(catch | cavity) = 0.6 × 0.9 = 0.54 = P(toothache ∧ catch | cavity)
Therefore Toothache and Catch are conditionally independent given Cavity. This is how to show conditional independence.
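The same check, done numerically in Python over the joint table from slide 23 (helper names are illustrative):

```python
# Check that Toothache and Catch are conditionally independent given Cavity.
joint = {(True, True, True): 0.108, (True, False, True): 0.012,
         (False, True, True): 0.072, (False, False, True): 0.008,
         (True, True, False): 0.016, (True, False, False): 0.064,
         (False, True, False): 0.144, (False, False, False): 0.576}

def P(pred):
    return sum(p for w, p in joint.items() if pred(*w))

p_c = P(lambda t, k, c: c)                         # P(cavity) = 0.2
lhs = P(lambda t, k, c: t and k and c) / p_c       # P(t ∧ k | c) = 0.54
rhs = (P(lambda t, k, c: t and c) / p_c) * \
      (P(lambda t, k, c: k and c) / p_c)           # 0.6 * 0.9 = 0.54
print(abs(lhs - rhs) < 1e-9)  # True
```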

34 Bayes' Rule
Product rule: P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)
=> Bayes' rule, also called Bayes' theorem: P(a | b) = P(b | a) P(a) / P(b)
or in distribution form: P(Y | X) = P(X | Y) P(Y) / P(X) = α P(X | Y) P(Y)
Useful for assessing a diagnostic probability from causal probabilities:
P(Cause | Effect) = P(Effect | Cause) P(Cause) / P(Effect)
[Q] A stiff neck is one symptom of meningitis; what is the probability of having meningitis given a stiff neck?
[Q] A car won't start. What is the probability of a bad starter? What is the probability of a bad battery?

35 Useful for Assessing a Diagnostic Probability from Causal Probabilities
P(Cause | Effect) = P(Effect | Cause) P(Cause) / P(Effect)
[Q] When there is a cavity, there is usually a toothache. P(cavity | toothache)?
[Q] E.g., let M be meningitis and S be stiff neck; a stiff neck is one symptom of meningitis. What is the probability of having meningitis given a stiff neck?
P(s) = 0.1; P(m) = …; P(s | m) = 0.8
P(m | s) = ???
[Q] How do we find P(s | m), P(m), and P(s)?
Note: the posterior probability of meningitis is still very small! (See the sketch below.)
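A one-line Bayes' rule computation for this example. The slide leaves P(m) blank, so the value 0.0001 below is an assumed illustration (the figure commonly used with this textbook example), not from the slide:

```python
def bayes(p_effect_given_cause, p_cause, p_effect):
    """P(Cause | Effect) = P(Effect | Cause) * P(Cause) / P(Effect)."""
    return p_effect_given_cause * p_cause / p_effect

# P(s) = 0.1 and P(s | m) = 0.8 are from the slide; P(m) = 0.0001 is assumed.
print(bayes(0.8, 0.0001, 0.1))  # 0.0008 -- still very small
```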

36 Useful for Assessing a Diagnostic Probability from Causal Probabilities
P(Cause | Effect) = P(Effect | Cause) P(Cause) / P(Effect)
[Q] E.g., let B be 'bad battery' and N be 'no starting on a cold morning'; N is one symptom of B. What is the probability of a bad battery when a car won't start on a cold morning?
P(n) = 0.1; P(b) = 0.05; P(n | b) = 0.8
P(b | n) = ???

37 5. Bayes' Rule and Conditional Independence
P(Cavity | toothache ∧ catch)
= α P(toothache ∧ catch | Cavity) P(Cavity)
= α P(toothache | Cavity) P(catch | Cavity) P(Cavity)
This is an example of a naïve Bayes model:
P(Cause, Effect1, …, Effectn) = P(Cause) Πi P(Effecti | Cause),
where the effects are conditionally independent of each other given the cause. We will use this later again! (See the sketch below.)
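A naïve Bayes posterior computed in Python from the conditional probability tables on slide 31; a sketch with illustrative variable names:

```python
# CPTs from slide 31: P(cavity) = 0.2, P(toothache | Cavity), P(catch | Cavity).
p_cavity = {True: 0.2, False: 0.8}
p_toothache = {True: 0.6, False: 0.1}   # P(toothache = true | Cavity)
p_catch = {True: 0.9, False: 0.2}       # P(catch = true | Cavity)

# Unnormalized: P(Cavity) * product of P(effect | Cavity) over the effects.
unnorm = {c: p_cavity[c] * p_toothache[c] * p_catch[c] for c in (True, False)}
alpha = 1.0 / sum(unnorm.values())
posterior = {c: alpha * q for c, q in unnorm.items()}
print(posterior)  # P(Cavity | toothache ∧ catch) = {True: 0.871, False: 0.129}
```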

38 6. Summary
Probability is a rigorous formalism for uncertain knowledge.
The joint probability distribution for discrete random variables specifies the probability of every atomic event.
Queries can be answered by summing over atomic events.
For nontrivial domains, we must find a way to reduce the size of the joint. Independence and conditional independence provide the tools.

