Published by Aubrey Mervin Jacobs. Modified over 9 years ago.
1
ENGG 2040C: Probability Models and Applications Andrej Bogdanov Spring 2013 8. Limit theorems
2
Many times we do not need to calculate probabilities exactly. Sometimes it is enough to know that a probability is very small (or very large), e.g. P(earthquake tomorrow) = ? This is often a lot easier.
3
What do you think? I toss a coin 1000 times. The probability that I get 14 consecutive heads is: (A) < 10%  (B) ≈ 50%  (C) > 90%
4
Consecutive heads
Let N be the number of occurrences of 14 consecutive heads in 1000 coin flips. Then N = I₁ + … + I₉₈₇, where Iᵢ is an indicator r.v. for the event “14 consecutive heads starting at position i”.
E[Iᵢ] = P(Iᵢ = 1) = 1/2^14
E[N] = 987 ⋅ 1/2^14 = 987/16384 ≈ 0.0602
5
Markov’s inequality
For every non-negative random variable X and every value a: P(X ≥ a) ≤ E[X]/a.
Since E[N] ≈ 0.0602, we get P(N ≥ 1) ≤ E[N]/1 ≈ 6%.
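Markov’s bound can be checked by simulation. The sketch below (a rough illustration, not part of the original slides; trial counts and the run-counting helper are my own choices) estimates both E[N] and P(N ≥ 1) for runs of 14 heads in 1000 flips:

```python
import random

random.seed(0)

def count_runs(flips, k=14):
    """Count positions where a run of k consecutive heads starts or continues:
    this equals the number of i with flips[i:i+k] all heads."""
    n = run = 0
    for f in flips:
        run = run + 1 if f else 0
        if run >= k:
            n += 1
    return n

trials = 5000
total = 0.0   # running sum of N, to estimate E[N]
hits = 0      # trials with at least one run of 14 heads
for _ in range(trials):
    flips = [random.random() < 0.5 for _ in range(1000)]
    n = count_runs(flips)
    total += n
    hits += (n >= 1)

est_mean = total / trials   # theory: E[N] ≈ 0.0602
est_prob = hits / trials    # Markov: P(N >= 1) <= E[N] ≈ 6%
print(est_mean, est_prob)
```

The simulated P(N ≥ 1) comes out well under the 6% Markov bound, which is expected: runs of heads cluster, so Markov’s inequality is loose here.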
6
Proof of Markov’s inequality
For every non-negative random variable X and every value a: P(X ≥ a) ≤ E[X]/a.
E[X] = E[X | X ≥ a] P(X ≥ a) + E[X | X < a] P(X < a). The first conditional expectation is ≥ a and the second term is ≥ 0, so E[X] ≥ a P(X ≥ a) + 0.
7
Hats
1000 people throw their hats in the air. What is the probability that at least 100 people get their own hat back?
Solution: N = I₁ + … + I₁₀₀₀, where Iᵢ is the indicator for the event that person i gets their hat back. Then E[Iᵢ] = P(Iᵢ = 1) = 1/n, so E[N] = n ⋅ 1/n = 1, and P(N ≥ 100) ≤ E[N]/100 = 1%.
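A quick simulation (my own sketch; a random shuffle models the hats landing uniformly at random) confirms that E[N] = 1 and that 100 or more matches essentially never happen:

```python
import random

random.seed(1)

n, trials = 1000, 2000
total = 0   # running sum of the number of matches, to estimate E[N]
hits = 0    # trials in which at least 100 people get their own hat
for _ in range(trials):
    hats = list(range(n))
    random.shuffle(hats)            # hats[i] = hat that person i receives
    matches = sum(hats[i] == i for i in range(n))
    total += matches
    hits += (matches >= 100)

avg_matches = total / trials        # theory: E[N] = 1
prob_100 = hits / trials            # Markov bound: at most 1%
print(avg_matches, prob_100)
```

In fact the number of matches is approximately Poisson(1), so the true P(N ≥ 100) is astronomically smaller than the 1% Markov bound.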
8
Patterns
A coin is tossed 1000 times. Give an upper bound on the probability that the pattern HH occurs: (a) at least 500 times; (b) at most 100 times.
9
Patterns
Let N be the number of occurrences of HH. Last time we calculated E[N] = 999/4 = 249.75.
(a) P(N ≥ 500) ≤ E[N]/500 = 249.75/500 ≈ 49.95%, so 500+ HHs occur with probability ≤ 49.95%.
(b) Since N ≤ 999, the random variable 999 – N is non-negative, and N ≤ 100 exactly when 999 – N ≥ 899. By Markov’s inequality, P(N ≤ 100) = P(999 – N ≥ 899) ≤ E[999 – N]/899 = (999 – 249.75)/899 ≈ 83.34%.
10
Chebyshev’s inequality
For every random variable X and every t: P(|X – μ| ≥ tσ) ≤ 1/t², where μ = E[X] and σ = √Var[X].
11
Patterns
E[N] = 999/4 = 249.75 and Var[N] = (5 ⋅ 999 – 7)/16 = 311.75, so μ = 249.75 and σ ≈ 17.66.
(a) P(N ≥ 500) ≤ P(|N – μ| ≥ 14.17σ) ≤ 1/14.17² ≈ 0.50%
(b) P(N ≤ 100) ≤ P(|N – μ| ≥ 8.47σ) ≤ 1/8.47² ≈ 1.39%
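These moments can be sanity-checked empirically. The sketch below (my own, not from the slides; sample sizes are arbitrary) draws many sequences of 1000 flips, counts HH occurrences, and compares the sample mean and standard deviation with μ = 249.75 and σ ≈ 17.66:

```python
import random
from statistics import mean, pstdev

random.seed(2)

def count_hh(flips):
    """Number of positions i where flips[i] and flips[i+1] are both heads."""
    return sum(flips[i] and flips[i + 1] for i in range(len(flips) - 1))

samples = [count_hh([random.random() < 0.5 for _ in range(1000)])
           for _ in range(2000)]
mu_hat, sigma_hat = mean(samples), pstdev(samples)
print(mu_hat, sigma_hat, max(samples))   # theory: mu = 249.75, sigma ≈ 17.66
```

Note that no simulated count comes anywhere near 500: the event N ≥ 500 is about 14 standard deviations out, so even Chebyshev’s 0.50% bound is very loose.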
12
Proof of Chebyshev’s inequality
For every random variable X and every t: P(|X – μ| ≥ tσ) ≤ 1/t², where μ = E[X], σ = √Var[X].
P(|X – μ| ≥ tσ) = P((X – μ)² ≥ t²σ²) ≤ E[(X – μ)²]/(t²σ²) = σ²/(t²σ²) = 1/t², applying Markov’s inequality to the non-negative random variable (X – μ)².
13
An illustration
[Diagram: Markov’s inequality bounds the upper tail, P(X ≥ a) ≤ μ/a, for a non-negative X; Chebyshev’s inequality bounds both tails, P(|X – μ| ≥ tσ) ≤ 1/t², i.e. the probability of falling outside the interval [μ – tσ, μ + tσ].]
14
Polling
[Illustration: a population of voters, from which the pollster draws a sample.]
15
Xᵢ = 1 if voter i prefers blue, 0 otherwise. X₁, …, Xₙ are independent Bernoulli(p), where p is the fraction of blue voters. X = X₁ + … + Xₙ, and X/n is the pollster’s estimate of p.
16
Polling
How accurate is the pollster’s estimate X/n? Let μ = E[Xᵢ], σ = √Var[Xᵢ], and X = X₁ + … + Xₙ. Then
E[X] = E[X₁] + … + E[Xₙ] = μn
Var[X] = Var[X₁] + … + Var[Xₙ] = σ²n
17
Polling
Since E[X] = μn and Var[X] = σ²n, Chebyshev’s inequality gives P(|X – μn| ≥ tσ√n) ≤ 1/t², i.e. P(|X/n – μ| ≥ tσ/√n) ≤ 1/t². Here tσ/√n is the sampling error and 1/t² is the confidence error.
18
The weak law of large numbers
X₁, …, Xₙ are independent with the same p.m.f. (p.d.f.); μ = E[Xᵢ], σ = √Var[Xᵢ], X = X₁ + … + Xₙ.
For every ε, δ > 0 and n ≥ σ²/(δε²): P(|X/n – μ| ≥ ε) ≤ δ
19
Polling
Say we want confidence error δ = 10% and sampling error ε = 5%. How many people should we poll?
By the weak law, P(|X/n – μ| ≥ ε) ≤ δ whenever n ≥ σ²/(δε²) = σ²/(0.1 ⋅ 0.05²) = 4000σ². For Bernoulli(p) samples, σ² = p(1 – p) ≤ 1/4, so n ≥ 4000 ⋅ 1/4 = 1000.
This suggests we should poll about 1000 people.
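The guarantee can be tested by simulation. In this sketch (my own; the true fraction p = 0.55 is a hypothetical value, not from the slides) we poll n = 1000 voters many times and count how often the estimate is off by 5% or more; the weak law promises this happens at most 10% of the time:

```python
import random

random.seed(3)

p, n, trials = 0.55, 1000, 2000    # p = 0.55 is a hypothetical true fraction
bad = 0                            # polls whose estimate is off by >= 5%
for _ in range(trials):
    x = sum(random.random() < p for _ in range(n))
    bad += (abs(x / n - p) >= 0.05)

fail_rate = bad / trials
print(fail_rate)                   # weak law guarantee: at most 10%
```

The observed failure rate is far below 10%, because Chebyshev’s inequality (and hence the weak law) is conservative; the central limit theorem, coming up, gives a much sharper estimate.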
20
A polling experiment
[Plot: the running average (X₁ + … + Xₙ)/n against n, for X₁, …, Xₙ independent Bernoulli(1/2).]
21
A more precise estimate
X₁, …, Xₙ are independent with the same p.m.f. (p.d.f.). Let’s assume n is large.
Weak law of large numbers: X₁ + … + Xₙ ≈ μn with high probability.
Chebyshev: P(|X – μn| ≥ tσ√n) ≤ 1/t².
This suggests X₁ + … + Xₙ ≈ μn + Tσ√n for some random variable T.
22
Some experiments
X = X₁ + … + Xₙ, Xᵢ independent Bernoulli(1/2). [Histograms of X for n = 6 and n = 40.]
23
Some experiments
X = X₁ + … + Xₙ, Xᵢ independent Poisson(1). [Histograms of X for n = 3 and n = 20.]
24
Some experiments
X = X₁ + … + Xₙ, Xᵢ independent Uniform(0, 1). [Histograms of X for n = 2 and n = 10.]
25
The normal random variable
The p.d.f. of a normal random variable is f(t) = (2π)^(–1/2) e^(–t²/2).
26
The central limit theorem
X₁, …, Xₙ are independent with the same p.m.f. (p.d.f.); μ = E[Xᵢ], σ = √Var[Xᵢ], X = X₁ + … + Xₙ.
For every t (positive or negative): lim_{n → ∞} P(X ≥ μn + tσ√n) = P(T ≥ t), where T is a normal random variable.
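The theorem can be illustrated numerically. This sketch (my own; the choices n = 30, Uniform(0,1) summands, and t = 1 are arbitrary) compares the empirical frequency of X ≥ μn + tσ√n with the normal tail P(T ≥ t), computed from the complementary error function:

```python
import math
import random

random.seed(4)

def normal_tail(t):
    """P(T >= t) for a standard normal T, via the complementary error function."""
    return 0.5 * math.erfc(t / math.sqrt(2))

n, trials, t = 30, 20000, 1.0
mu, sigma = 0.5, math.sqrt(1 / 12)              # Uniform(0,1): mean and sd
threshold = mu * n + t * sigma * math.sqrt(n)   # mu*n + t*sigma*sqrt(n)
hits = sum(sum(random.random() for _ in range(n)) >= threshold
           for _ in range(trials))
empirical = hits / trials
print(empirical, normal_tail(t))                # both should be near 0.159
```

Already at n = 30 the agreement is close, consistent with the histograms in the experiments above, which look bell-shaped even for small n.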
27
Polling again
Say we want confidence error δ = 10% and sampling error ε = 5%. How many people should we poll?
Probability model: X = X₁ + … + Xₙ, Xᵢ independent Bernoulli(p), where p is the fraction that will vote blue; E[Xᵢ] = p and σ = √Var[Xᵢ] = √(p(1 – p)) ≤ ½.
28
Polling again
Choose t so that tσ√n = 5% ⋅ n, i.e. t = 5%√n/σ. Then
lim_{n → ∞} P(X ≥ pn + tσ√n) = P(T ≥ t)
lim_{n → ∞} P(X ≤ pn – tσ√n) = P(T ≤ –t)
so lim_{n → ∞} P(X/n is not within 5% of p) = P(T ≥ t) + P(T ≤ –t) = 2 P(T ≤ –t).
29
The c.d.f. of a normal random variable
[Plot of F(t). By the symmetry of the p.d.f., P(T ≤ –t) = P(T ≥ t).]
30
Polling again
confidence error = 2 P(T ≤ –t) = 2 P(T ≤ –5%√n/σ) ≤ 2 P(T ≤ –√n/10), using σ ≤ ½.
We want a confidence error of ≤ 10%, so we need to choose n so that P(T ≤ –√n/10) ≤ 5%.
31
Polling again
From the normal c.d.f. (e.g. http://stattrek.com/online-calculator/normal.aspx), P(T ≤ –√n/10) ≤ 5% requires –√n/10 ≤ –1.645, i.e. √n ≥ 16.45, so n ≈ 16.45² ≈ 271.
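So the CLT suggests about 271 people suffice, versus the 1000 the weak law demanded. A quick check (my own sketch; p = 1/2 is used because it is the worst case for σ) polls n = 271 voters repeatedly and measures how often the estimate misses by 5% or more:

```python
import random

random.seed(5)

p, n, trials = 0.5, 271, 20000     # p = 1/2 maximizes sigma = sqrt(p(1-p))
bad = sum(abs(sum(random.random() < p for _ in range(n)) / n - p) >= 0.05
          for _ in range(trials))
fail_rate = bad / trials
print(fail_rate)                   # CLT calculation predicts about 10%
```

The observed failure rate comes out close to, and in fact slightly below, the 10% target, partly because X is an integer and the normal approximation ignores that discreteness.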
32
Party
Ten guests arrive independently at a party between 8pm and 9pm. Give an estimate of the probability that the average arrival time of a guest is past 8:40pm.
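One way to attack this with the CLT, assuming each arrival time is Uniform(0, 60) minutes past 8pm (the uniform model is my assumption; the slide only says the guests arrive independently): then μ = 30 and σ = 60/√12 ≈ 17.32, and the average exceeding 40 minutes corresponds to the standardized value t = (40 – μ)√n/σ ≈ 1.83, giving P(T ≥ t) ≈ 3.4%. The sketch below checks this against direct simulation:

```python
import math
import random

random.seed(6)

n, trials = 10, 100000
mu, sigma = 30.0, 60.0 / math.sqrt(12)     # Uniform(0, 60) minutes past 8pm
t = (40.0 - mu) * math.sqrt(n) / sigma     # standardized 8:40pm threshold
late = sum(sum(random.uniform(0, 60) for _ in range(n)) / n >= 40.0
           for _ in range(trials))
clt_estimate = 0.5 * math.erfc(t / math.sqrt(2))   # P(T >= t)
empirical = late / trials
print(empirical, clt_estimate)             # both near 3.4%
```

Even at n = 10 the CLT estimate and the simulated probability agree closely, since the uniform distribution is symmetric and short-tailed.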