Artificial Intelligence: Uncertainty & Probability (Chapter 13, AIMA)

“When an agent knows enough facts about its environment, the logical approach enables it to derive plans that are guaranteed to work.” “Unfortunately, agents almost never have access to the whole truth about the environment.” — From AIMA, chapter 13

[Airport example: slide from S. Russell]

Why does FOL fail?
Laziness: we can’t be bothered to list all possible requirements.
Ignorance: we have no complete (or tested) theoretical model of the domain.
Practical reasons: even if we knew everything and could be persuaded to list everything, the rules would be totally impractical to use (we can’t possibly test everything).

Instead, use decision theory
Assign a probability to a proposition based on the percepts (information). A proposition is either true or false; the probability expresses our degree of belief in it being true or false.
Evidence = information that the agent receives. Probabilities can (and will) change as more evidence is acquired.
Prior/unconditional probability ⇔ no evidence at all. Posterior/conditional probability ⇔ after evidence is obtained.
No plan may be guaranteed to achieve the goal. To make choices, the agent must have preferences between the different possible outcomes of the various plans. Utility represents the value of the outcomes, and utility theory is used to reason with the resulting preferences. An agent is rational if and only if it chooses the action that yields the highest expected utility, averaged over all possible outcomes.
Decision theory = probability theory + utility theory. (Inspired by Sun-Hwa Hahn)
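To make the principle concrete, here is a minimal sketch of expected-utility maximization for the airport example; all probabilities, utilities, and action names below are invented for illustration, not taken from the text:

```python
# Minimal sketch of "choose the action with the highest expected utility".
# All numbers are hypothetical.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

# Hypothetical airport example: leave t minutes before the flight.
actions = {
    "leave_60":   [(0.70, 100), (0.30, -500)],       # 70% make flight, 30% miss it
    "leave_90":   [(0.95, 90), (0.05, -500)],        # safer, a little less time at home
    "leave_1440": [(0.9999, -200), (0.0001, -500)],  # a day early: near-certain, but miserable
}

for a, outs in actions.items():
    print(a, expected_utility(outs))

best = max(actions, key=lambda a: expected_utility(actions[a]))
print("Rational choice:", best)   # leave_90 under these numbers
```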

Basic probability notation
X: random variable.
Boolean: P(X = true), P(X = false)
Discrete: P(X = a), P(X = b), ...
Continuous: probability density p(x); probabilities are obtained by integration, P(a < X < b) = ∫ p(x) dx

Boolean variable X ∈ {True, False}. Examples: P(Cavity), P(W3,1), P(Survive), P(CatchFlight), P(Cancer), P(LectureToday), ...

Discrete variable X ∈ {a1, a2, a3, a4, ...}. Examples: P(PlayingCard), P(Weather), P(Color), P(Age), P(LectureQual), ...

Continuous variable X, with probability density p(x). Examples: p(Temperature), p(WindSpeed), p(OxygenLevel), p(LengthLecture), ...

Propositions
Elementary: Cavity = True, W3,1 = True, Cancer = False, Card = A♠, Card = 4♥, Weather = Sunny, Age = 40, (20°C < Temperature < 21°C), (2 hrs < LengthLecture < 3 hrs)
Complex (elementary + connectives): ¬Cancer ∧ Cavity, Card1 = A♠ ∧ Card2 = A♣, Sunny ∧ (30°C < Temperature) ∧ ¬Beer
Every proposition is either true or false; we apply a degree of belief to whether it is true or not. An extension of propositional calculus...

Atomic event
A complete specification of the state of the world. Atomic events are mutually exclusive and exhaustive.
The world {Cavity, Toothache} has four atomic events:
cavity ∧ toothache, cavity ∧ ¬toothache, ¬cavity ∧ toothache, ¬cavity ∧ ¬toothache

Prior, posterior, joint probabilities
Prior probability P(X=a) = P(a): our belief in X=a being true before any information is collected.
Posterior probability P(X=a | Y=b) = P(a|b): our belief that X=a is true when we know that Y=b is true (and this is all we know).
Joint probability P(X=a, Y=b) = P(a,b) = P(a ∧ b): our belief that (X=a ∧ Y=b) is true.
Product rule: P(a ∧ b) = P(a,b) = P(a|b) P(b)
Note: boldface P denotes a whole distribution, plain P a single probability. [Image borrowed from Lazaric & Sceffer]

Conditional probability (Venn diagram)
[Venn diagram: two overlapping regions with probabilities P(A) and P(B); the overlap is P(A,B) = P(A ∧ B)]
P(A|B) = P(A,B) / P(B)

Conditional probability examples 1.You draw two cards randomly from a deck of 52 playing cards. What is the conditional probability that the second card is an ace if the first card is an ace? 2.In a course I’m giving with oral examination the examination statistics over the period have been: 23 have passed the oral examination in their first attempt, 25 have passed it their second attempt, 7 have passed it in their third attempt, and 8 have not passed it at all (after having failed the oral exam at least three times). What is the conditional probability (risk) for failing the course if the student fails the first two attempts at the oral exam? 3.(2005) 84% of the Swedish households have computers at home, 81% of the Swedish households have both a computer and internet. What is the probability that a Swedish household has internet if we know that it has a computer? Work these out...

Conditional probability examples 1.You draw two cards randomly from a deck of 52 playing cards. What is the conditional probability that the second card is an ace if the first card is an ace? 2.In a course I’m giving with oral examination the examination statistics over the period have been: 23 have passed the oral examination in their first attempt, 25 have passed it their second attempt, 7 have passed it in their third attempt, and 8 have not passed it at all (after having failed the oral exam at least three times). What is the conditional probability (risk) for failing the course if the student fails the first two attempts at the oral exam? 3.(2005) 84% of the Swedish households have computers at home, 81% of the Swedish households have both a computer and internet. What is the probability that a Swedish household has internet if we know that it has a computer? What’s the probability that both are aces?

Conditional probability examples 1.You draw two cards randomly from a deck of 52 playing cards. What is the conditional probability that the second card is an ace if the first card is an ace? 2.In a course I’m giving with oral examination the examination statistics over the period have been: 23 have passed the oral examination in their first attempt, 25 have passed it their second attempt, 7 have passed it in their third attempt, and 8 have not passed it at all (after having failed the oral exam at least three times). What is the conditional probability (risk) for failing the course if the student fails the first two attempts at the oral exam? 3.(2005) 84% of the Swedish households have computers at home, 81% of the Swedish households have both a computer and internet. What is the probability that a Swedish household has internet if we know that it has a computer?

Conditional probability examples 1.You draw two cards randomly from a deck of 52 playing cards. What is the conditional probability that the second card is an ace if the first card is an ace? 2.In a course I’m giving with oral examination the examination statistics over the period have been: 23 have passed the oral examination in their first attempt, 25 have passed it their second attempt, 7 have passed it in their third attempt, and 8 have not passed it at all (after having failed the oral exam at least three times). What is the conditional probability (risk) for failing the course if the student fails the first two attempts at the oral exam? 3.(2005) 84% of the Swedish households have computers at home, 81% of the Swedish households have both a computer and internet. What is the probability that a Swedish household has internet if we know that it has a computer?

Conditional probability examples 1.You draw two cards randomly from a deck of 52 playing cards. What is the conditional probability that the second card is an ace if the first card is an ace? 2.In a course I’m giving with oral examination the examination statistics over the period have been: 23 have passed the oral examination in their first attempt, 25 have passed it their second attempt, 7 have passed it in their third attempt, and 8 have not passed it at all (after having failed the oral exam at least three times). What is the conditional probability (risk) for failing the course if the student fails the first two attempts at the oral exam? 3.(2005) 84% of the Swedish households have computers at home, 81% of the Swedish households have both a computer and internet. What is the probability that a Swedish household has internet if we know that it has a computer? What’s the prior probability for not passing the course? =13%
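A quick way to check the three exercises is to work them out directly from the definition P(a|b) = P(a,b)/P(b); the short script below does exactly that, using exact fraction arithmetic:

```python
from fractions import Fraction

# 1. Second card is an ace, given the first was an ace:
p1 = Fraction(3, 51)                       # 3 aces left among 51 cards
# ...and the probability that both cards are aces:
p_both = Fraction(4, 52) * Fraction(3, 51)

# 2. Exam statistics: 23 passed on attempt 1, 25 on attempt 2,
#    7 on attempt 3, 8 never passed (63 students in total).
#    Failing the first two attempts leaves the 7 + 8 = 15 students
#    who needed a third attempt or never passed; 8 of them failed.
p_fail = Fraction(8, 7 + 8)
p_fail_prior = Fraction(8, 23 + 25 + 7 + 8)

# 3. P(internet | computer) = P(internet, computer) / P(computer)
p_internet = Fraction(81, 84)

print(p1, float(p1))                  # 1/17  ≈ 0.059
print(p_both, float(p_both))          # 1/221 ≈ 0.0045
print(p_fail, float(p_fail))          # 8/15  ≈ 0.533
print(float(p_fail_prior))            # ≈ 0.127, i.e. the 13% prior
print(p_internet, float(p_internet))  # 27/28 ≈ 0.964
```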

Inference
Inference means computing P(State of the world | Observed evidence), i.e. P(Y | e).

Inference w/ full joint distribution
The full joint probability distribution is the probability distribution over all random variables used to describe the world. Dentist example, world {Toothache, Cavity, Catch} (standard AIMA numbers):

              toothache           ¬toothache
            catch   ¬catch      catch   ¬catch
cavity      0.108   0.012       0.072   0.008
¬cavity     0.016   0.064       0.144   0.576

Marginal probability (marginalization): sum out the other variables.
P(cavity) = 0.108 + 0.012 + 0.072 + 0.008 = 0.2
P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2

The general inference procedure
Inference w/ full joint distribution:
P(Y | e) = α P(Y, e) = α Σ_h P(Y, e, h)
where α is a normalization constant and the sum runs over all values h of the hidden (unobserved) variables. This can always be computed if the full joint probability distribution is available. Cumbersome... erhm... totally impractical: the table (and the summation) is O(2^n) for n Boolean variables.
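As a sketch of what this enumeration looks like in practice, assuming the dentist table above (standard AIMA chapter 13 numbers):

```python
# Full joint distribution over (Cavity, Toothache, Catch),
# using the standard AIMA chapter 13 dentist numbers.
joint = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def query(var_index, evidence):
    """P(var | evidence): evidence maps variable index -> value.
    Sums out all hidden variables, then normalizes (the alpha above)."""
    dist = {True: 0.0, False: 0.0}
    for world, p in joint.items():
        if all(world[i] == v for i, v in evidence.items()):
            dist[world[var_index]] += p
    alpha = 1.0 / (dist[True] + dist[False])
    return {v: alpha * p for v, p in dist.items()}

# P(Cavity | toothache): variable 0, given variable 1 (Toothache) is True
print(query(0, {1: True}))   # {True: 0.6, False: 0.4}
```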

Independence
Independence between variables can dramatically reduce the amount of computation: X and Y are independent iff P(X, Y) = P(X) P(Y), equivalently P(X | Y) = P(X). We don’t need to mix independent variables in our computations.

Independence for dentist example
Oral health is independent of the weather: P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) P(Weather). [Image borrowed from Lazaric & Sceffer]

Bayes’ theorem
The product rule can be factored in two ways, P(a ∧ b) = P(a|b) P(b) = P(b|a) P(a), which gives
P(a|b) = P(b|a) P(a) / P(b)

Bayes theorem example
Joe is a randomly chosen member of a large population in which 3% are heroin users. Joe tests positive for heroin in a drug test that correctly identifies users 95% of the time and correctly identifies nonusers 90% of the time. Is Joe a heroin addict?
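Working the example out with Bayes’ theorem (sensitivity 0.95, false-positive rate 0.10, prior 0.03, all as stated above) gives a posterior of only about 23%, so a single positive test is surprisingly weak evidence:

```python
# Bayes' theorem for the drug-test example.
p_user = 0.03               # prior: 3% of the population are users
p_pos_given_user = 0.95     # test identifies users 95% of the time
p_pos_given_nonuser = 0.10  # nonusers are (wrongly) flagged 10% of the time

p_pos = p_pos_given_user * p_user + p_pos_given_nonuser * (1 - p_user)
p_user_given_pos = p_pos_given_user * p_user / p_pos

print(round(p_user_given_pos, 3))   # ≈ 0.227: Joe is probably NOT a user
```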

Bayes theorem: The Monty Hall game show
In a TV game show, a contestant selects one of three doors; behind one of the doors there is a prize, and behind the other two there are no prizes. After the contestant selects a door, the game-show host opens one of the remaining doors and reveals that there is no prize behind it. The host then asks the contestant whether he/she wants to SWITCH to the other unopened door, or STICK to the original choice. What should the contestant do? (© Let’s Make a Deal, A Joint Venture)

The Monty Hall game show, resolved
The host is actually asking the contestant whether he/she wants to SWITCH the choice to both of the other doors, or STICK to the original choice. Phrased this way, it is obvious what the optimal thing to do is: switching wins with probability 2/3.
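The 2/3 advantage of switching can also be confirmed empirically; a minimal simulation sketch (the door labels and trial count are arbitrary choices):

```python
import random

def monty_hall(switch, trials=100_000):
    """Simulate the game; return the fraction of wins."""
    wins = 0
    for _ in range(trials):
        prize = random.randrange(3)
        choice = random.randrange(3)
        # Host opens a door that is neither the choice nor the prize.
        opened = next(d for d in range(3) if d != choice and d != prize)
        if switch:
            # Switch to the remaining unopened door.
            choice = next(d for d in range(3) if d != choice and d != opened)
        wins += (choice == prize)
    return wins / trials

print("stick: ", monty_hall(switch=False))   # ≈ 1/3
print("switch:", monty_hall(switch=True))    # ≈ 2/3
```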

Conditional independence
We say that X and Y are conditionally independent given Z if P(X, Y | Z) = P(X | Z) P(Y | Z), or equivalently P(X | Y, Z) = P(X | Z). [Graph: Z is a common parent of X and Y]

Naive Bayes: Combining evidence
Assume full conditional independence of the effects given the cause and express the full joint probability distribution as
P(Cause, Effect_1, ..., Effect_n) = P(Cause) Π_i P(Effect_i | Cause)

Naive Bayes: Dentist example
The full joint table over {Toothache, Cavity, Catch} has 2^3 − 1 = 7 independent numbers [O(2^n)]. The naive Bayes factorization
P(Cavity, Toothache, Catch) = P(Cavity) P(Toothache | Cavity) P(Catch | Cavity)
needs only:
P(cavity) = 0.200 (1 independent number)
P(toothache | cavity) = 0.6, P(toothache | ¬cavity) = 0.1 (2 independent numbers)
P(catch | cavity) = 0.9, P(catch | ¬cavity) = 0.2 (2 independent numbers)
i.e. 1 + 2 + 2 = 5 numbers in total [O(n)].
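With those five numbers the naive Bayes model answers the same queries as the full table; a small sketch (the conditional probabilities are the ones derived above; in this particular example the naive Bayes answer coincides with the full-joint answer because the conditional independence holds exactly):

```python
# Naive Bayes with the five numbers above.
p_cavity = 0.2
p_tooth = {True: 0.6, False: 0.1}   # P(toothache | cavity), P(toothache | ¬cavity)
p_catch = {True: 0.9, False: 0.2}   # P(catch | cavity),     P(catch | ¬cavity)

# P(Cavity | toothache, catch) via the naive Bayes factorization:
score = {c: (p_cavity if c else 1 - p_cavity) * p_tooth[c] * p_catch[c]
         for c in (True, False)}
alpha = 1.0 / sum(score.values())
print({c: alpha * s for c, s in score.items()})   # {True: ≈0.871, False: ≈0.129}
```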

Naive Bayes application: Learning to classify text
Use a dictionary with words (not too frequent and not too infrequent), e.g. w1 = airplane, w2 = algorithm, ...
Estimate the conditional probabilities P(wi | interesting) and P(wi | uninteresting).
Compute P(text | interesting) and P(text | uninteresting) using naive Bayes (and assuming that word position in the text is unimportant):
P(text | class) = Π_i P(wi | class)
where the wi are the words occurring in the text.

Naive Bayes application: Learning to classify text
Then compute the probability that the text is interesting (or uninteresting) using Bayes’ theorem:
P(interesting | text) = P(text | interesting) P(interesting) / P(text)
P(text) is just a normalization factor; it is not necessary to compute it, since we are only interested in whether P(interesting | text) > P(uninteresting | text).
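A minimal sketch of such a classifier; the tiny training set and word lists are invented, and Laplace smoothing is added (an assumption beyond the slides, but standard practice) so unseen words don’t zero out the product. A real implementation would also work in log-probabilities to avoid underflow:

```python
from collections import Counter

# Tiny invented training set: (words in text, label).
training = [
    (["airplane", "algorithm", "flight"], "interesting"),
    (["algorithm", "proof", "theorem"],   "interesting"),
    (["lottery", "winner", "prize"],      "uninteresting"),
    (["prize", "flight", "free"],         "uninteresting"),
]

labels = {"interesting", "uninteresting"}
prior = {c: sum(1 for _, y in training if y == c) / len(training) for c in labels}
counts = {c: Counter(w for ws, y in training if y == c for w in ws) for c in labels}
totals = {c: sum(counts[c].values()) for c in labels}
vocab = {w for ws, _ in training for w in ws}

def p_word(w, c):
    # Laplace-smoothed estimate of P(w | class).
    return (counts[c][w] + 1) / (totals[c] + len(vocab))

def classify(words):
    # P(class | text) ∝ P(class) * Π_i P(w_i | class); P(text) cancels.
    score = {c: prior[c] for c in labels}
    for w in words:
        for c in labels:
            score[c] *= p_word(w, c)
    return max(score, key=score.get)

print(classify(["algorithm", "flight"]))   # "interesting" under this toy data
```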

Conclusions
Uncertainty arises because of laziness and ignorance: this is unavoidable in the real world.
Probability expresses the agent’s degree of belief in the different outcomes.
Probability helps the agent act in a rational manner: rational = maximizing one’s own expected utility.
To get efficient algorithms for dealing with probability, we need conditional independences among the variables.