CPS 570: Artificial Intelligence Introduction to probability Instructor: Vincent Conitzer.

Slides:



Advertisements
Similar presentations
Probability, Part III.
Advertisements

Section 5.1 and 5.2 Probability
SI485i : NLP Day 2 Probability Review. Introduction to Probability Experiment (trial) Repeatable procedure with well-defined possible outcomes Outcome.
Business Statistics: A Decision-Making Approach, 6e © 2005 Prentice-Hall, Inc. Chap 4-1 Business Statistics: A Decision-Making Approach 6 th Edition Chapter.
Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section.
Section 7.4 (partially). Section Summary Expected Value Linearity of Expectations Independent Random Variables.
Section 16.1: Basic Principles of Probability
Lec 18 Nov 12 Probability – definitions and simulation.
1 Discrete Math CS 280 Prof. Bart Selman Module Probability --- Part a) Introduction.
Probability and combinatorics Notes from Virtual Laboratories in Probability and Statistics.
Finite probability space set  (sample space) function P:  R + (probability distribution)  P(x) = 1 x 
Information Theory and Security
CEEN-2131 Business Statistics: A Decision-Making Approach CEEN-2130/31/32 Using Probability and Probability Distributions.
Probability Review Thursday Sep 13.
Probability and Statistics Review Thursday Sep 11.
Expected Value (Mean), Variance, Independence Transformations of Random Variables Last Time:
Probability and Probability Distributions
CPS 270: Artificial Intelligence Bayesian networks Instructor: Vincent Conitzer.
Lectures prepared by: Elchanan Mossel Yelena Shvets Introduction to probability Stat 134 FAll 2005 Berkeley Follows Jim Pitman’s book: Probability Section.
Chapter 4 Probability See.
Sets, Combinatorics, Probability, and Number Theory Mathematical Structures for Computer Science Chapter 3 Copyright © 2006 W.H. Freeman & Co.MSCS SlidesProbability.
Sets, Combinatorics, Probability, and Number Theory Mathematical Structures for Computer Science Chapter 3 Copyright © 2006 W.H. Freeman & Co.MSCS SlidesProbability.
1 9/8/2015 MATH 224 – Discrete Mathematics Basic finite probability is given by the formula, where |E| is the number of events and |S| is the total number.
Independence and Dependence 1 Krishna.V.Palem Kenneth and Audrey Kennedy Professor of Computing Department of Computer Science, Rice University.
C4, L1, S1 Probabilities and Proportions. C4, L1, S2 I am offered two lotto cards: –Card 1: has numbers –Card 2: has numbers Which card should I take.
1 9/23/2015 MATH 224 – Discrete Mathematics Basic finite probability is given by the formula, where |E| is the number of events and |S| is the total number.
Great Theoretical Ideas In Computer Science Steven Rudich, Anupam GuptaCS Spring 2004 Lecture 22April 1, 2004Carnegie Mellon University
Sample space The set of all possible outcomes of a chance experiment –Roll a dieS={1,2,3,4,5,6} –Pick a cardS={A-K for ♠, ♥, ♣ & ♦} We want to know the.
The probability that it rains is 70% The probability that it does NOT rain is 30% Instinct tells us that for any event E, the probability that E happens.
Great Theoretical Ideas in Computer Science.
Ch.3 PROBABILITY Prepared by: M.S Nurzaman, S.E, MIDEc. ( deden )‏
Chap 4-1 A Course In Business Statistics, 4th © 2006 Prentice-Hall, Inc. A Course In Business Statistics 4 th Edition Chapter 4 Using Probability and Probability.
Lesson 6 – 2b Probability Models Part II. Knowledge Objectives Explain what is meant by random phenomenon. Explain what it means to say that the idea.
Copyright © 2010 Pearson Education, Inc. Chapter 6 Probability.
COMPSCI 102 Introduction to Discrete Mathematics.
Review Homework pages Example: Counting the number of heads in 10 coin tosses. 2.2/
From Randomness to Probability Chapter 14. Dealing with Random Phenomena A random phenomenon is a situation in which we know what outcomes could happen,
Probability Basic Concepts Start with the Monty Hall puzzle
Natural Language Processing Giuseppe Attardi Introduction to Probability IP notice: some slides from: Dan Jurafsky, Jim Martin, Sandiway Fong, Dan Klein.
Probability Rules In the following sections, we will transition from looking at the probability of one event to the probability of multiple events (compound.
V7 Foundations of Probability Theory „Probability“ : degree of confidence that an event of an uncertain nature will occur. „Events“ : we will assume that.
CPS 170: Artificial Intelligence Markov processes and Hidden Markov Models (HMMs) Instructor: Vincent Conitzer.
Lecture 6 Dustin Lueker.  Standardized measure of variation ◦ Idea  A standard deviation of 10 may indicate great variability or small variability,
Stat 31, Section 1, Last Time Big Rules of Probability –The not rule –The or rule –The and rule P{A & B} = P{A|B}P{B} = P{B|A}P{A} Bayes Rule (turn around.
Great Theoretical Ideas in Computer Science for Some.
Great Theoretical Ideas In Computer Science John LaffertyCS Fall 2006 Lecture 10 Sept. 28, 2006Carnegie Mellon University Classics.
9/14/1999 JHU CS /Jan Hajic 1 Introduction to Natural Language Processing Probability AI-Lab
Microsoft produces a New operating system on a disk. There is 0
Chap 4-1 Chapter 4 Using Probability and Probability Distributions.
AP Statistics From Randomness to Probability Chapter 14.
Probability and statistics - overview Introduction Basics of probability theory Events, probability, different types of probability Random variable, probability.
Great Theoretical Ideas in Computer Science.
Conditional Probability 423/what-is-your-favorite-data-analysis-cartoon 1.
1 COMP2121 Discrete Mathematics Principle of Inclusion and Exclusion Probability Hubert Chan (Chapters 7.4, 7.5, 6) [O1 Abstract Concepts] [O3 Basic Analysis.
Section 5.1 and 5.2 Probability
Chapter 4 Using Probability and Probability Distributions
Natural Language Processing
Probability Lirong Xia Spring Probability Lirong Xia Spring 2017.
Where are we in CS 440? Now leaving: sequential, deterministic reasoning Entering: probabilistic reasoning and machine learning.
Natural Language Processing
Introduction to Probability
Great Theoretical Ideas In Computer Science
Probabilities and Proportions
Probability, Part I.
CPS 570: Artificial Intelligence Introduction to probability
CPS 570: Artificial Intelligence Bayesian networks
Artificial Intelligence Introduction to probability
Instructor: Vincent Conitzer
Probability Lirong Xia.
Sets, Combinatorics, Probability, and Number Theory
Presentation transcript:

CPS 570: Artificial Intelligence Introduction to probability Instructor: Vincent Conitzer

Uncertainty So far in course, everything deterministic If I walk with my umbrella, I will not get wet But: there is some chance my umbrella will break! Intelligent systems must take possibility of failure into account… –May want to have backup umbrella in city that is often windy and rainy … but should not be excessively conservative –Two umbrellas not worthwhile for city that is usually not windy Need quantitative notion of uncertainty

Probability Example: roll two dice Random variables: –X = value of die 1 –Y = value of die 2 Outcome is represented by an ordered pair of values (x, y) –E.g., (6, 1): X=6, Y=1 –Atomic event or sample point tells us the complete state of the world, i.e., values of all random variables Exactly one atomic event will happen; each atomic event has a ≥0 probability; sum to 1 –E.g., P(X=1 and Y=6) = 1/36 1/36 An event is a proposition about the state (=subset of states) –X+Y = 7 Probability of event = sum of probabilities of atomic events where event is true X Y

Cards and combinatorics Draw a hand of 5 cards from a standard deck with 4*13 = 52 cards (4 suits, 13 ranks each) Each of the (52 choose 5) hands has same probability 1/(52 choose 5) Probability of event = number of hands in that event / (52 choose 5) What is the probability that… –no two cards have the same rank? –you have a flush (all cards the same suit?) –you have a straight (5 cards in order of rank, e.g., 8, 9, 10, J, Q)? –you have a straight flush? –you have a full house (three cards have the same rank and the two other cards have the same rank)?

Facts about probabilities of events If events A and B are disjoint, then –P(A or B) = P(A) + P(B) More generally: –P(A or B) = P(A) + P(B) - P(A and B) If events A 1, …, A n are disjoint and exhaustive (one of them must happen) then P(A 1 ) + … + P(A n ) = 1 –Special case: for any random variable, ∑ x P(X=x) = 1 Marginalization: P(X=x) = ∑ y P(X=x and Y=y)

Conditional probability We might know something about the world – e.g., “X+Y=6 or X+Y=7” – given this (and only this), what is the probability of Y=5? Part of the sample space is eliminated; probabilities are renormalized to sum to 1 1/ X Y 1/ X Y P(Y=5 | (X+Y=6) or (X+Y=7)) = 2/11

Facts about conditional probability P(A | B) = P(A and B) / P(B) A B A and B Sample space P(A | B)P(B) = P(A and B) = P(B | A)P(A) P(A | B) = P(B | A)P(A)/P(B) –Bayes’ rule

Conditional probability and cards Given that your first two cards are Queens, what is the probability that you will get at least three Queens? Given that you have at least two Queens (not necessarily the first two), what is the probability that you have at least three Queens? Given that you have at least two Queens, what is the probability that you have three Kings?

How can we scale this? In principle, we now have a complete approach for reasoning under uncertainty: –Specify probability for every atomic event, –Can compute probabilities of events simply by summing probabilities of atomic events, –Conditional probabilities are specified in terms of probabilities of events: P(A | B) = P(A and B) / P(B) If we have n variables that can each take k values, how many atomic events are there?

Independence Some variables have nothing to do with each other Dice: if X=6, it tells us nothing about Y P(Y=y | X=x) = P(Y=y) So: P(X=x and Y=y) = P(Y=y | X=x)P(X=x) = P(Y=y)P(X=x) –Usually just write P(X, Y) = P(X)P(Y) –Only need to specify 6+6=12 values instead of 6*6=36 values –Independence among 3 variables: P(X,Y,Z)=P(X)P(Y)P(Z), etc. Are the events “you get a flush” and “you get a straight” independent?

An example without cards or dice What is the probability of –Rain in Beaufort? Rain in Durham? –Rain in Beaufort, given rain in Durham? –Rain in Durham, given rain in Beaufort? Rain in Beaufort and rain in Durham are correlated Rain in Durham Rain in Beaufort Sun in Durham Sun in Beaufort (disclaimer: no idea if these numbers are realistic)

A possibly rigged casino With probability ½, the casino is rigged and has dice that come up 6 only 1/12 of the time, and 1 3/12 of the time 1/961/144 1/288 1/481/72 1/144 1/481/72 1/144 1/481/72 1/144 1/481/72 1/144 1/321/48 1/ X Y 1/ X Y Z=0 (fair casino) Z=1 (rigged casino) What is P(Y=6)? What is P(Y=6|X=1)? Are they independent?

Conditional independence Intuition: –the only reason that X tells us something about Y, –is that X tells us something about Z, –and Z tells us something about Y If we already know Z, then X tells us nothing about Y P(Y | Z, X) = P(Y | Z) or P(X, Y | Z) = P(X | Z)P(Y | Z) “X and Y are conditionally independent given Z”

Medical diagnosis X: does patient have flu? Y: does patient have headache? Z: does patient have fever? P(Y,Z|X) = P(Y|X)P(Z|X) P(X=1) =.2 P(Y=1 | X=1) =.5, P(Y=1 | X=0) =.2 P(Z=1 | X=1) =.4, P(Z=1 | X=0) =.1 What is P(X=1|Y=1,Z=0)? (disclaimer: no idea if these numbers are realistic)

Conditioning can also introduce dependence X: is it raining? –P(X=1) =.3 Y: are the sprinklers on? –P(Y=1) =.4 –X and Y are independent Z: is the grass wet? –P(Z=1 | X=0, Y=0) =.1 –P(Z=1 | X=0, Y=1) =.8 –P(Z=1 | X=1, Y=0) =.7 –P(Z=1 | X=1, Y=1) = Raining Not raining Sprinklers No sprinklers Wet Raining Not raining Sprinklers No sprinklers Not wet Conditional on Z=1, X and Y are not independent If you know Z=1, rain seems likely; then if you also find out Y=1, this “explains away” the wetness and rain seems less likely

Context-specific independence Recall P(X, Y | Z) = P(X | Z)P(Y | Z) really means: for all x, y, z, P(X=x, Y=y | Z=z) = P(X=x | Z=z)P(Y=y | Z=z) But it may not be true for all z P(Wet, RainingInLondon | CurrentLocation=New York) = P(Wet | CurrentLocation=New York)P(RainingInLondon | CurrentLocation=New York) But not P(Wet, RainingInLondon | CurrentLocation=London) = P(Wet | CurrentLocation=London)P(RainingInLondon | CurrentLocation=London)

Pairwise independence does not imply complete independence X is a coin flip, Y is a coin flip, Z = X xor Y Clearly P(X,Y,Z) ≠ P(X)P(Y)P(Z) But P(Z|X) = P(Z) Tempting to say X and Y are “really” independent, X and Z are “not really” independent But: X is a coin flip, Z is a coin flip, Y = X xor Z gives the exact same distribution

Monty Hall problem Game show participants can choose one of three doors One door has a car, two have a goat –Assumption: car is preferred to goat Participant chooses door, but not opened yet At least one of the other doors contains a goat; the (knowing) host will open one such door (flips coin to decide if both have goats) Participant is asked whether she wants to switch doors (to the other closed door) – should she? image taken from

Expected value If Z takes numerical values, then the expected value of Z is E(Z) = ∑ z P(Z=z)*z –Weighted average (weighted by probability) Suppose Z is sum of two dice E(Z) = ( 1/36)*2 + ( 2/36)*3 + ( 3/36)*4 + ( 4/36)*5 + ( 5/36)*6 + ( 6/36)*7 + ( 5/36)*8 + ( 4/36)*9 + ( 3/36)*10 + ( 2/36)*11 + ( 1/36)*12 = 7 Simpler way: E(X+Y)=E(X)+E(Y) (always!) –Linearity of expectation E(X) = E(Y) = 3.5

Linearity of expectation… If a is used to represent an atomic state, then E(X) = ∑ x P(X=x)*x = ∑ x (∑ a:X(a)=x P(a))*x =∑ a P(a)*X(a) E(X+Y) = ∑ a P(a)*(X(a)+Y(a)) = ∑ a P(a)*X(a) + ∑ a P(a)*Y(a) = E(X)+E(Y)

What is probability, anyway? Different philosophical positions: Frequentism: numbers only come from repeated experiments –As we flip a coin lots of times, we see experimentally that it comes out heads ½ the time –Problem: for most events in the world, there is no history of exactly that event happening Probability that the Democratic candidate wins the next election? Objectivism: probabilities are a real part of the universe –Maybe true at level of quantum mechanics –Most of us agree that the result of a coin flip is (usually) determined by initial conditions + classical mechanics Subjectivism: probabilities merely reflect agents’ beliefs