15-251 Great Theoretical Ideas in Computer Science.

Great Theoretical Ideas in Computer Science

Setting Expectations for Computer Scientists

Lecture 12 (September 30, 2010) Probability Theory II E[X+Y] = E[X] + E[Y]

A few things from last time…

Sample Space: S is the sample space; each outcome t has a weight or probability, e.g. in the picture D(t) = p(t) = 0.2 is the probability of t.

Events: Any set E ⊆ S is called an event. Pr_D[E] = Σ_{t ∈ E} p(t). (In the picture, Pr_D[E] = 0.4.)

Binomials and Gaussians…

Given a fair coin (p = ½): Pr(see k heads in n flips) = (n choose k) p^k (1−p)^(n−k) = (n choose k) 2^(−n). As n → ∞, the plot of Pr(k heads) for p = ½ tends to the “bell curve”, i.e. the Gaussian/normal distribution.
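A small illustration, not from the original slides: this Python sketch evaluates the formula above for n = 20 and prints a crude text plot, showing the probabilities piling up in a bell shape around n/2 (the function name is mine).

from math import comb

def binomial_pmf(n, k, p=0.5):
    # Pr(see k heads in n flips of a bias-p coin) = (n choose k) p^k (1-p)^(n-k)
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 20
for k in range(n + 1):
    # bar length proportional to Pr(k heads); for p = 1/2 this is (n choose k) / 2^n
    print(f"{k:2d} {'#' * round(200 * binomial_pmf(n, k))}")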

Infinite sample spaces…

The “Geometric” Distribution: a bias-p coin is tossed until the first time that a head turns up. Sample space S = {H, TH, TTH, TTTH, …} (shorthand: S = {1, 2, 3, 4, …}). Pr(k) = (1−p)^(k−1) p. Sanity check: Σ_{k≥1} Pr(k) = Σ_{k≥1} (1−p)^(k−1) p = p · (1 + (1−p) + (1−p)^2 + …) = p · 1/(1−(1−p)) = 1.
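A quick numerical sanity check of that sum (my own sketch, not part of the lecture); the bias 0.3 is arbitrary:

p = 0.3  # any bias with 0 < p <= 1
# sum the geometric weights Pr(k) = (1-p)^(k-1) * p over k = 1, 2, ..., 10000
total = sum((1 - p)**(k - 1) * p for k in range(1, 10001))
print(total)  # very close to 1 (the tail beyond k = 10000 is negligible)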

The Geometric Distribution: same experiment, sample space S = {1, 2, 3, 4, …}. Consider the event E = “first head at an even-numbered flip” = {2, 4, 6, …} (“= {TH, TTTH, TTTTTH, …}”). Pr(E) = Σ_{x ∈ E} Pr(x) = Σ_{k even} (1−p)^(k−1) p = p(1−p) · (1 + (1−p)^2 + (1−p)^4 + …) = p(1−p) · 1/(1−(1−p)^2) = (1−p)/(2−p).
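And a matching check (mine) that the even-flip event really has probability (1−p)/(2−p):

p = 0.3
closed_form = (1 - p) / (2 - p)
# add up Pr(k) over the even k only: k = 2, 4, 6, ...
summed = sum((1 - p)**(k - 1) * p for k in range(2, 10001, 2))
print(closed_form, summed)  # the two values agree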

Expectations E[ ]

“Expectation = (weighted) average.” What is the average height of a 251 student? Sample space S = all 251 students; (implicit) probability distribution = uniform. Average = (sum of heights in S)/|S| = (Σ_{t in S} Height(t))/|S| = Σ_{t in S} Pr(t) · Height(t).

Expectation: Given a probability distribution D = (S, Pr) and any function f: S → reals, E[f] = Σ_{t in S} f(t) Pr(t). A function from the sample space to the reals is called a “random variable”.

Random Variable: Let S be the sample space in a probability distribution. A Random Variable is a function from S to the reals. Examples: X = value of the white die in a two-dice roll: X(3,4) = 3, X(1,6) = 1. Y = sum of the values of the two dice: Y(3,4) = 7, Y(1,6) = 7.

Notational Conventions Use letters like A, B, C for events Use letters like X, Y, f, g for R.V.’s R.V. = random variable

Two Coins Tossed: X: {TT, TH, HT, HH} → {0, 1, 2} counts the number of heads. Each outcome in S has probability ¼; the induced weights on the values 0, 1, 2 are ¼, ½, ¼ — a distribution on the reals!

Two Views of Random Variables: think of a R.V. as a function from S to the reals R (the input to the function is random), or think of the induced distribution on R (the randomness is “pushed” to the values of the function). Given a distribution, a random variable transforms it into a distribution on the reals. You should be comfortable with both views.

Two dice: I throw a white die and a black die. Sample space S = { (1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), (2,3), (2,4), (2,5), (2,6), (3,1), (3,2), (3,3), (3,4), (3,5), (3,6), (4,1), (4,2), (4,3), (4,4), (4,5), (4,6), (5,1), (5,2), (5,3), (5,4), (5,5), (5,6), (6,1), (6,2), (6,3), (6,4), (6,5), (6,6) }. X = sum of both dice: a function with X(1,1) = 2, X(1,2) = 3, …, X(6,6) = 12.

It’s a Floor Wax And a Dessert Topping It’s a function on the sample space S It’s a variable with a probability distribution on its values You should be comfortable with both views

From Random Variables to Events: For any random variable X and value a, we can define the event A that X = a. Pr(A) = Pr(X = a) = Pr({t ∈ S | X(t) = a}). Note that each event in the induced distribution corresponds to some event in the original one.

Two Coins Tossed: X: {TT, TH, HT, HH} → {0, 1, 2} counts the # of heads. Pr(X = a) = Pr({t ∈ S | X(t) = a}); for example, Pr(X = 1) = Pr({t ∈ S | X(t) = 1}) = Pr({TH, HT}) = ½. The distribution on X puts weights ¼, ½, ¼ on the values 0, 1, 2.
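A tiny sketch of mine that pushes the uniform weights on {TT, TH, HT, HH} through X to recover this induced distribution:

from collections import Counter

S = ["TT", "TH", "HT", "HH"]      # sample space; each outcome has probability 1/4
X = lambda t: t.count("H")        # random variable: number of heads

induced = Counter()
for t in S:
    induced[X(t)] += 1 / len(S)   # push each outcome's weight onto the value X(t)
print(dict(induced))              # {0: 0.25, 1: 0.5, 2: 0.25}
print(sum(1 / len(S) for t in S if X(t) == 1))   # Pr(X = 1) = 0.5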

Two dice: I throw a white die and a black die. Sample space S = { (1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), (2,3), (2,4), (2,5), (2,6), (3,1), (3,2), (3,3), (3,4), (3,5), (3,6), (4,1), (4,2), (4,3), (4,4), (4,5), (4,6), (5,1), (5,2), (5,3), (5,4), (5,5), (5,6), (6,1), (6,2), (6,3), (6,4), (6,5), (6,6) }. X = sum of both dice. Pr(X = 7) = 6/36 = 1/6, since exactly six outcomes — (1,6), (2,5), (3,4), (4,3), (5,2), (6,1) — sum to 7.
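A direct enumeration (my own sketch) over the 36 equally likely outcomes confirms this:

from fractions import Fraction

S = [(w, b) for w in range(1, 7) for b in range(1, 7)]   # all 36 (white, black) outcomes
X = lambda t: t[0] + t[1]                                # X = sum of both dice
print(Fraction(sum(1 for t in S if X(t) == 7), len(S)))  # 1/6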

Definition: Expectation. X is a function on the sample space S, and X has a probability distribution on its values. The expectation, or expected value, of a random variable X is written E[X], and is E[X] = Σ_{t ∈ S} Pr(t) X(t) = Σ_k k · Pr[X = k] (assuming X takes values in the naturals).

Quick check: Σ_{t ∈ S} Pr(t) X(t) = Σ_k k · Pr[X = k].

A Quick Calculation… What if I flip a coin 3 times? What is the expected number of heads? E[X] = (1/8)×0 + (3/8)×1 + (3/8)×2 + (1/8)×3 = 1.5 But Pr[ X = 1.5 ] = 0 Moral: don’t always expect the expected. Pr[ X = E[X] ] may be 0 !
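A sketch of mine that redoes this 3-flip calculation both ways — summing over outcomes and summing over values — as a check of the identity above:

from itertools import product

S = list(product("HT", repeat=3))        # the 8 equally likely outcomes
X = lambda t: t.count("H")               # number of heads
pr = 1 / len(S)

over_outcomes = sum(pr * X(t) for t in S)                                   # sum_t Pr(t) X(t)
over_values = sum(k * sum(pr for t in S if X(t) == k) for k in range(4))    # sum_k k Pr[X = k]
print(over_outcomes, over_values)        # both equal 1.5, even though Pr[X = 1.5] = 0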

Type Checking: in P(B), B must be an event; in E(X), X must be a R.V. You cannot write P(R.V.) or E(event).

Operations on R.V.s: you can sum them, take differences, or do most other math operations… E.g., (X + Y)(t) = X(t) + Y(t), (X·Y)(t) = X(t) · Y(t), (X^Y)(t) = X(t)^Y(t).

Random variables and expectations allow us to give elegant solutions to problems that seem really really messy…

If I randomly put 100 letters into 100 addressed envelopes, on average how many letters will end up in their correct envelopes? On average, in class of size m, how many pairs of people will have the same birthday? Pretty messy with direct counting…

The new tool is called “Linearity of Expectation”

Linearity of Expectation If Z = X+Y, then E[Z] = E[X] + E[Y] Even if X and Y are not independent

Proof: E[Z] = Σ_{t ∈ S} Pr[t] Z(t) = Σ_{t ∈ S} Pr[t] (X(t) + Y(t)) = Σ_{t ∈ S} Pr[t] X(t) + Σ_{t ∈ S} Pr[t] Y(t) = E[X] + E[Y].

By Induction: E[X_1 + X_2 + … + X_n] = E[X_1] + E[X_2] + … + E[X_n]. The expectation of the sum = the sum of the expectations.

Expectation of a Sum of r.v.s = Sum of their Expectations even when r.v.s are not independent! Expectation of a Product of r.v.s = Product of their Expectations only when r.v.s are independent!
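A small exact check of both statements (my own sketch) on the two-dice sample space, using the dependent pair X = white die and Y = sum of both dice:

from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))                   # (white, black): 36 equally likely outcomes
E = lambda f: sum(Fraction(1, len(S)) * f(t) for t in S)   # exact expectation of f: S -> values

X = lambda t: t[0]            # value of the white die
Y = lambda t: t[0] + t[1]     # sum of both dice; clearly dependent on X

print(E(lambda t: X(t) + Y(t)), E(X) + E(Y))   # both 21/2: linearity survives dependence
print(E(lambda t: X(t) * Y(t)), E(X) * E(Y))   # 329/12 vs 49/2: the product rule needs independence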

Independence for r.v.s Two random variables X and Y are independent if for every a,b, the events X=a and Y=b are independent How about the case of X=1st die, Y=2nd die? X = left arm, Y=right arm?

Let’s test our Linearity of Expectation chops…

If I randomly put 100 letters into 100 addressed envelopes, on average how many letters will end up in their correct envelopes? Hmm… Σ_k k · Pr(exactly k letters end up in correct envelopes) = Σ_k k · (…aargh!!…)

Use Linearity of Expectation: Let A_i be the event that the i-th letter ends up in its correct envelope, and let X_i be the “indicator” R.V. for A_i: X_i = 1 if A_i occurs, 0 otherwise. Let Z = X_1 + … + X_100; we are asking for E[Z]. E[X_i] = Pr(A_i) = 1/100, so E[Z] = 1.

So, in expectation, 1 letter will be in its correct envelope. Pretty neat: it doesn’t depend on how many letters! Question: were the X_i independent? No! E.g., think of n = 2: if the first letter lands in its correct envelope, the second one must too.
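A quick simulation (my sketch): the average number of letters in their correct envelopes hovers around 1 whether n is 2, 10, or 100:

import random

def correct_envelopes(n):
    # randomly place n letters into n addressed envelopes; count letters in their own envelope
    placement = list(range(n))
    random.shuffle(placement)
    return sum(1 for envelope, letter in enumerate(placement) if letter == envelope)

trials = 20000
for n in (2, 10, 100):
    avg = sum(correct_envelopes(n) for _ in range(trials)) / trials
    print(n, avg)   # each average is close to 1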

Use Linearity of Expectation General approach: View thing you care about as expected value of some RV Write this RV as sum of simpler RVs Solve for their expectations and add them up!

Example #2: We flip n coins of bias p. What is the expected number of heads? We could do this by summing Σ_k k · Pr(X = k) = Σ_k k · (n choose k) p^k (1−p)^(n−k) = np. But now we know a better way!

Let X = number of heads when n independent coins of bias p are flipped. Break X into n simpler RVs: X_i = 1 if the i-th coin is heads, 0 if it is tails. By Linearity of Expectation, E[X] = E[Σ_i X_i] = Σ_i E[X_i] = Σ_i p = np.

On average, in a class of size m, how many pairs of people will have the same birthday? Σ_k k · Pr(exactly k collisions) = Σ_k k · (…aargh!!!!…)

Use linearity of expectation Suppose we have m people each with a uniformly chosen birthday from 1 to 366 X = number of pairs of people with the same birthday E[X] = ?

X = number of pairs of people with the same birthday. E[X] = ? Use m(m−1)/2 indicator variables, one for each pair of people: X_jk = 1 if person j and person k have the same birthday, else 0. E[X_jk] = (1/366) · 1 + (1 − 1/366) · 0 = 1/366.

X = number of pairs of people with the same birthday; X_jk = 1 if person j and person k have the same birthday, else 0, with E[X_jk] = 1/366. Then E[X] = E[Σ_{1 ≤ j < k ≤ m} X_jk] = Σ_{1 ≤ j < k ≤ m} E[X_jk] = m(m−1)/2 × 1/366.

E.g., setting m = 23, we get E[# pairs with same birthday ] = 0.691… How does this compare to Pr[ at least one pair has same birthday] ?
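For comparison, a sketch of mine that computes both quantities for m = 23, using the standard product formula for the probability that all birthdays are distinct (366 equally likely days):

m = 23
expected_pairs = m * (m - 1) / 2 / 366              # E[# pairs with the same birthday]

pr_all_distinct = 1.0
for i in range(m):
    pr_all_distinct *= (366 - i) / 366              # person i avoids the previous i birthdays
pr_some_pair = 1 - pr_all_distinct                  # Pr[at least one pair shares a birthday]

print(expected_pairs, pr_some_pair)                 # roughly 0.69 versus roughly 0.51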

You pick a number n ∈ [1..6]. You roll 3 dice. If any of them match n, you win $1; else you pay me $1. Want to play? Let X_i = 1 if the i-th die matches, 0 otherwise. Expected number of dice that match: E[X_1 + X_2 + X_3] = 1/6 + 1/6 + 1/6 = ½. BUT this is not the same as Pr(at least one die matches): Pr(at least one die matches) = 1 − Pr(none match) = 1 − (5/6)^3 = 91/216 ≈ 0.42.
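A brute-force check of both numbers (my own sketch) over all 216 equally likely rolls:

from fractions import Fraction
from itertools import product

n = 6                                            # the number you picked (any of 1..6 works)
rolls = list(product(range(1, 7), repeat=3))     # all 216 equally likely outcomes

expected_matches = Fraction(sum(r.count(n) for r in rolls), len(rolls))
pr_at_least_one = Fraction(sum(1 for r in rolls if n in r), len(rolls))
print(expected_matches)   # 1/2
print(pr_at_least_one)    # 91/216, about 0.42 -- so the game is a losing bet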

Back to Lin. of Exp. Using linearity of expectations in unexpected places…

A Puzzle: 10% of the surface of a sphere is colored green, and the rest is colored blue. Show that no matter how the colors are arranged, it is possible to inscribe a cube in the sphere so that all of its vertices are blue.

Solution: Pick a random cube. (Note: any particular vertex is uniformly distributed over the surface of the sphere.) Let X_i = 1 if the i-th vertex is blue, 0 otherwise, and let X = X_1 + X_2 + … + X_8. Then E[X_i] = P(X_i = 1) = 9/10, so E[X] = 8 × 9/10 = 7.2 > 7. So there must be some cube where X = 8, i.e., all eight vertices are blue.
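To see the 7.2 show up numerically, here is a Monte Carlo sketch of mine for one specific coloring — a green polar cap covering 10% of the sphere — averaging the number of blue vertices of a uniformly random inscribed cube. The QR-based random orthogonal map is an implementation choice of mine, not part of the lecture:

from itertools import product
import numpy as np

rng = np.random.default_rng(0)
corners = np.array(list(product((-1.0, 1.0), repeat=3))) / np.sqrt(3)   # 8 cube vertices on the unit sphere

def random_orthogonal():
    # Haar-distributed orthogonal 3x3 matrix via QR of a Gaussian matrix (sign-corrected columns);
    # reflections also carry the cube's vertex set to an inscribed cube, so this is fine here
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    return q * np.sign(np.diag(r))

def blue_vertex_count():
    # green cap = points with z > 0.8; its area fraction is (1 - 0.8)/2 = 10%
    rotated = corners @ random_orthogonal().T
    return int(np.sum(rotated[:, 2] <= 0.8))

trials = 20000
print(sum(blue_vertex_count() for _ in range(trials)) / trials)   # close to 8 * 9/10 = 7.2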

The general principle we used here: show that the expected value of some random variable is “high”; hence, there must be an outcome in the sample space where the random variable takes on a “high” value. (Not everyone can be below the average.) This is called “the probabilistic method”.

Some supplementary material

Linearity of expectation: it holds even when the random variables are not independent.

Linearity of Expectation. E.g., 2 fair flips: X = heads on 1st coin, Y = heads on 2nd coin, Z = X + Y = total # heads. The values (X, Y, Z) per outcome: HH → (1,1,2), HT → (1,0,1), TH → (0,1,1), TT → (0,0,0). What is E[X]? E[Y]? E[Z]?

Linearity of Expectation. E.g., 2 fair flips: X = at least one coin is heads, Y = both coins are heads, Z = X + Y. The values (X, Y, Z) per outcome: HH → (1,1,2), HT → (1,0,1), TH → (1,0,1), TT → (0,0,0). Are X and Y independent? What is E[X]? E[Y]? E[Z]?

From random variables to events, and vice versa.

From Random Variables to Events: For any random variable X and value a, we can define the event A that X = a. Pr(A) = Pr(X = a) = Pr({t ∈ S | X(t) = a}).

From Events to Random Variables: For any event A, we can define the “indicator random variable” for A: X_A(t) = 1 if t ∈ A, 0 if t ∉ A. E[X_A] = 1 × Pr(X_A = 1) = Pr(A).

How about multiplying expectations?

Multiplication of Expectations: A coin is tossed twice. X_i = 1 if the i-th toss is heads and 0 otherwise. E[X_i] = 1/2, and E[X_1 X_2] = E[X_1] E[X_2] = 1/4. In general, E[XY] = E[X]E[Y] if X and Y are independent.

Multiplication of Expectations: Consider a single toss of a coin. X = 1 if heads turns up and 0 otherwise; set Y = 1 − X. X and Y are not independent, and E[XY] ≠ E[X]E[Y]: here E[X] = E[Y] = 1/2, but E[XY] = 0 since XY = 0 always.
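Both cases in one small sketch of mine: the two-toss indicators are independent and the product rule holds, while X and Y = 1 − X from a single toss are dependent and it fails:

from itertools import product

# Independent case: two fair tosses; X1, X2 indicate heads on the 1st and 2nd toss
S2 = list(product((0, 1), repeat=2))
E2 = lambda f: sum(f(t) for t in S2) / len(S2)
print(E2(lambda t: t[0] * t[1]), E2(lambda t: t[0]) * E2(lambda t: t[1]))   # 0.25 and 0.25

# Dependent case: one fair toss; X = heads indicator, Y = 1 - X
S1 = [0, 1]
E1 = lambda f: sum(f(x) for x in S1) / len(S1)
print(E1(lambda x: x * (1 - x)), E1(lambda x: x) * E1(lambda x: 1 - x))     # 0.0 versus 0.25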