2 Lectures prepared by: Elchanan Mossel and Elena Shvets. Introduction to Probability, Stat 134, Fall 2005, Berkeley. Follows Jim Pitman's book: Probability, Section 3.3.

3 Histogram 1: X = 2·Bin(300, 1/2) − 300, E[X] = 0

4 Histogram 2: Y = 2·Bin(30, 1/2) − 30, E[Y] = 0

5 Histogram 3: Z = 4·Bin(10, 1/4) − 10, E[Z] = 0

6 Histogram 4: W = 0, E[W] = 0

7 A natural question: Is there a good parameter that allows us to distinguish between these distributions? Is there a way to measure the spread?
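
To make the question concrete, here is a minimal NumPy sketch (not part of the original slides) that simulates the four variables from the histogram slides and compares their empirical spread; the sample size is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 100_000  # arbitrary simulation size

# The four variables from the histogram slides; all four have mean 0.
X = 2 * rng.binomial(300, 0.5, n_samples) - 300
Y = 2 * rng.binomial(30, 0.5, n_samples) - 30
Z = 4 * rng.binomial(10, 0.25, n_samples) - 10
W = np.zeros(n_samples)

for name, v in [("X", X), ("Y", Y), ("Z", Z), ("W", W)]:
    print(f"{name}: mean ≈ {v.mean():7.3f}, SD ≈ {v.std():7.3f}")
```

The means all come out near 0, while the empirical standard deviations differ, which is exactly the spread the next slide quantifies.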

8 Variance and Standard Deviation. The variance of X, denoted by Var(X), is the mean squared deviation of X from its expected value μ = E(X): Var(X) = E[(X − μ)²]. The standard deviation of X, denoted by SD(X), is the square root of the variance of X: SD(X) = √Var(X).

9 Computational Formula for Variance. Claim: Var(X) = E[X²] − E[X]². Proof: E[(X − μ)²] = E[X² − 2μX + μ²] = E[X²] − 2μE[X] + μ² = E[X²] − 2μ² + μ² = E[X²] − E[X]².
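
A quick numerical illustration of the formula (my own example, not from the slides), computing the variance of a fair die roll both from the definition and from the computational formula:

```python
# Fair six-sided die: E[X] = 3.5 and E[X^2] = 91/6, so Var(X) = 91/6 - 3.5^2 = 35/12.
faces = range(1, 7)
p = 1 / 6
mu = sum(x * p for x in faces)                   # E[X]
ex2 = sum(x * x * p for x in faces)              # E[X^2]
var_def = sum((x - mu) ** 2 * p for x in faces)  # E[(X - mu)^2]
var_formula = ex2 - mu ** 2                      # E[X^2] - E[X]^2
print(var_def, var_formula)                      # both equal 35/12 ≈ 2.9167
```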

10 Variance and SD. For a general distribution, Chebyshev's inequality states that every random variable X is expected to be close to E(X), give or take a few SD(X). Chebyshev's Inequality: For every random variable X and for all k > 0: P(|X − E(X)| ≥ k·SD(X)) ≤ 1/k².

11 Properties of Variance and SD
1. Claim: Var(X) ≥ 0. Pf: Var(X) = Σₓ (x − μ)² P(X = x) ≥ 0.
2. Claim: Var(X) = 0 iff P[X = μ] = 1.

12 Chebyshev's Inequality: P(|X − E(X)| ≥ k·SD(X)) ≤ 1/k².
Proof: Let μ = E(X) and σ = SD(X). Observe that |X − μ| ≥ kσ if and only if |X − μ|² ≥ k²σ². The RV |X − μ|² is non-negative, so we can use Markov's inequality:
P(|X − μ|² ≥ k²σ²) ≤ E[|X − μ|²] / (k²σ²) = σ² / (k²σ²) = 1/k².
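
A small simulation sketch (my addition; the exponential distribution and sample size are arbitrary choices) checking that the empirical tail probabilities never exceed the Chebyshev bound:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=200_000)  # arbitrary choice of distribution
mu, sigma = x.mean(), x.std()

for k in (1, 2, 3, 4):
    observed = np.mean(np.abs(x - mu) >= k * sigma)  # empirical P(|X - mu| >= k*SD)
    print(f"k={k}: observed {observed:.4f} <= Chebyshev bound {1 / k**2:.4f}")
```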

13 Variance of Indicators. Suppose I_A is the indicator of an event A with probability p. Observe that I_A² = I_A (both equal 1 on A and 0 on Aᶜ). Therefore E(I_A²) = E(I_A) = P(A) = p, so: Var(I_A) = E(I_A²) − E(I_A)² = p − p² = p(1 − p).
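
A tiny check of the p(1 − p) formula (my addition; the event probability p = 0.3 is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
p = 0.3                             # hypothetical event probability
ind = rng.random(100_000) < p       # samples of the indicator I_A
print(ind.var(), p * (1 - p))       # both close to 0.21
```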

14 Variance of a Sum of Independent Random Variables
Claim: if X₁, X₂, …, Xₙ are independent then: Var(X₁ + X₂ + … + Xₙ) = Var(X₁) + Var(X₂) + … + Var(Xₙ).
Pf: It suffices to prove the claim for 2 random variables.
E[(X + Y − E(X + Y))²] = E[(X − E(X) + Y − E(Y))²]
= E[(X − E(X))²] + 2E[(X − E(X))(Y − E(Y))] + E[(Y − E(Y))²]
= Var(X) + 2E[X − E(X)]·E[Y − E(Y)] + Var(Y)   (mult. rule, by independence)
= Var(X) + 0 + Var(Y).
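
A simulation sketch of the claim (my addition; the two distributions are arbitrary choices, one discrete and one continuous):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
x = rng.integers(1, 7, n)           # die rolls
y = rng.exponential(2.0, n)         # generated independently of x
print(np.var(x + y), np.var(x) + np.var(y))   # approximately equal
```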

15 Variance and Mean under Scaling and Shifts
Claim: SD(aX + b) = |a|·SD(X).
Proof: Var[aX + b] = E[(aX + b − aμ − b)²] = E[a²(X − μ)²] = a²σ², so SD(aX + b) = |a|σ.
Corollary: If a random variable X has E(X) = μ and SD(X) = σ > 0, then X* = (X − μ)/σ has E(X*) = 0 and SD(X*) = 1.
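
A short sketch of both the claim and the corollary (my addition; the scale a, shift b, and binomial distribution are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.binomial(30, 0.5, 100_000)
a, b = -3.0, 7.0                               # arbitrary scale and shift
print(np.std(a * x + b), abs(a) * np.std(x))   # approximately equal

x_star = (x - x.mean()) / x.std()              # standardized variable X*
print(x_star.mean(), x_star.std())             # approximately 0 and 1
```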

16 Square Root Law. Let X₁, X₂, …, Xₙ be independent random variables with the same distribution as X, let Sₙ = X₁ + … + Xₙ be their sum and X̄ₙ = Sₙ/n their average. Then:
E(Sₙ) = n·E(X), SD(Sₙ) = √n·SD(X);
E(X̄ₙ) = E(X), SD(X̄ₙ) = SD(X)/√n.
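
A simulation sketch of the √n growth of SD(Sₙ) for die rolls (my addition; the values of n and the number of repetitions are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
sigma = np.std(np.arange(1, 7))       # SD of a single die roll, about 1.708

for n in (1, 4, 16, 64):
    sums = rng.integers(1, 7, size=(50_000, n)).sum(axis=1)   # 50,000 samples of S_n
    print(f"n={n:2d}: SD(S_n) ≈ {sums.std():6.3f}, sqrt(n)*SD(X) = {np.sqrt(n) * sigma:6.3f}")
```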

17 Weak Law of Large Numbers. Thm: Let X₁, X₂, … be a sequence of independent random variables with the same distribution. Let μ denote the common expected value, μ = E(Xᵢ). Then for every ε > 0: P(|Sₙ/n − μ| < ε) → 1 as n → ∞.

18 Weak Law of Large Numbers. Proof: Let μ = E(Xᵢ) and σ = SD(Xᵢ). Then from the square root law we have: E(Sₙ/n) = μ and SD(Sₙ/n) = σ/√n. Now Chebyshev's inequality gives us: P(|Sₙ/n − μ| ≥ ε) ≤ SD(Sₙ/n)²/ε² = σ²/(nε²). For a fixed ε the right-hand side tends to 0 as n tends to ∞.
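
A sketch of the law in action for die rolls (my addition; the checkpoints for n are arbitrary), showing the running averages Sₙ/n settling near μ = 3.5:

```python
import numpy as np

rng = np.random.default_rng(6)
rolls = rng.integers(1, 7, 100_000)                            # X_1, X_2, ...
running_avg = np.cumsum(rolls) / np.arange(1, rolls.size + 1)  # S_n / n for each n

for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n={n:6d}: S_n/n = {running_avg[n - 1]:.4f}   (mu = 3.5)")
```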

19 The Normal Approximation. Let Sₙ = X₁ + … + Xₙ be the sum of independent random variables with the same distribution. Then for large n, the distribution of Sₙ is approximately normal with mean E(Sₙ) = nμ and SD(Sₙ) = σ√n, where μ = E(Xᵢ) and σ = SD(Xᵢ). In other words: P(a ≤ (Sₙ − nμ)/(σ√n) ≤ b) ≈ Φ(b) − Φ(a), where Φ is the standard normal CDF.
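
A sketch comparing simulated probabilities for sums of 100 die rolls against the normal approximation (my addition; n, the cutoffs, and the continuity correction are my own choices, not from the slide):

```python
import numpy as np
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(z / sqrt(2)))

rng = np.random.default_rng(7)
n = 100                                  # dice per sum (arbitrary)
mu, sigma = 3.5, np.std(np.arange(1, 7))
sums = rng.integers(1, 7, size=(100_000, n)).sum(axis=1)

for x in (330, 350, 370):                # cutoffs near the mean n*mu = 350
    simulated = np.mean(sums <= x)
    # +0.5 is a continuity correction (my refinement, not stated on the slide)
    approx = Phi((x + 0.5 - n * mu) / (sigma * sqrt(n)))
    print(f"P(S_n <= {x}): simulated {simulated:.4f}, normal approx {approx:.4f}")
```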

20 Sums of iid Random Variables. Suppose Xᵢ represents the number obtained on the i-th roll of a die. Then Xᵢ has a uniform distribution on the set {1, 2, 3, 4, 5, 6}.

21 Distribution of X₁

22 Sum of Two Dice. We can obtain the distribution of S₂ = X₁ + X₂ by the convolution formula: P(S₂ = k) = Σᵢ P(X₁ = i) P(X₂ = k − i | X₁ = i), where the sum runs over i = 1, …, k − 1. By independence this equals Σᵢ P(X₁ = i) P(X₂ = k − i).
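
A sketch of the convolution computed exactly with NumPy (my addition, not from the slides):

```python
import numpy as np

p_die = np.full(6, 1 / 6)              # P(X = 1), ..., P(X = 6)
p_s2 = np.convolve(p_die, p_die)       # distribution of S_2, supported on {2, ..., 12}

for k, prob in enumerate(p_s2, start=2):
    print(f"P(S_2 = {k}) = {prob:.4f}")   # peaks at 6/36 ≈ 0.1667 for k = 7
```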

23 Distribution of S₂

24 Sum of Four Dice. We can obtain the distribution of S₄ = X₁ + X₂ + X₃ + X₄ = S₂ + S′₂ again by the convolution formula: P(S₄ = k) = Σᵢ P(S₂ = i) P(S′₂ = k − i | S₂ = i) = Σᵢ P(S₂ = i) P(S′₂ = k − i), by independence of S₂ and S′₂.
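
Repeating the same convolution step doubles the number of dice each time, which is how the distributions of S₄, S₈, S₁₆, and S₃₂ shown on the following slides can be computed. A sketch of that doubling loop (my addition, not from the slides):

```python
import numpy as np

dist = np.full(6, 1 / 6)      # distribution of X_1 on {1, ..., 6}
n = 1
while n < 32:
    dist = np.convolve(dist, dist)   # S_{2n} = S_n + S'_n for independent copies
    n *= 2
    # dist[k] is P(S_n = n + k); the support of S_n is {n, ..., 6n}
    print(f"n={n:2d}: most likely value S_n = {n + dist.argmax()}, "
          f"probability {dist.max():.4f}")
```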

25 Distribution of S₄

26 Distribution of S₈

27 Distribution of S₁₆

28 Distribution of S₃₂

29 Distribution of X₁

30 Distribution of S₂

31 Distribution of S₄

32 Distribution of S₈

33 Distribution of S₁₆

34 Distribution of S₃₂

35 Distribution of X₁

36 Distribution of S₂

37 Distribution of S₄

38 Distribution of S₈

39 Distribution of S₁₆

40 Distribution of S₃₂

