Mathematics for Computer Science MIT 6.042J/18.062J

Presentation transcript:

1 Mathematics for Computer Science MIT 6.042J/18.062J
Deviation from the Mean

2 Don’t expect the Expectation!
Toss 101 fair coins. How many heads do we “expect” to see?
E[#Heads] = 50.5
Should we expect to see the mean? Clearly not: the number of heads is always an integer. This is a reminder that the expected value is an average over many experiments, not a prediction for a specific experiment. And that is a problem, because the real power comes from the ability to predict the outcome of a specific event (Gore vs. Bush today, not an average over many elections), and we want the mean to give us that kind of information.

3 Exactly the Mean?
Pr{exactly 50.5 Heads} = 0
Pr{exactly 50 Heads} ≈ 1/13
Pr{50 ± 1 Heads} ≈ 1/7

4 Very near the mean?
Toss 1001 fair coins. E[#Heads] = 500.5
Pr{#H = 500} = smaller (< 1/39)
Pr{#H = 500 ± 1} = still small (< 1/19)

5 Very near the mean?
Toss 1001 fair coins.
Pr{#H = 500 ± 1%} = Pr{#H = 500 ± 10} = not bad (very close to even)
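The coin-toss figures on these slides can be checked exactly with the binomial distribution; here is a minimal Python sketch (not part of the original slides) using exact integer arithmetic:

```python
from math import comb

def pr_heads(n, lo, hi):
    """Exact Pr{lo <= #Heads <= hi} for n fair coin tosses."""
    return sum(comb(n, k) for k in range(lo, hi + 1)) / 2 ** n

# 101 coins: hitting a single value near the mean is unlikely
print(pr_heads(101, 50, 50))     # ~ 0.078, roughly 1/13

# 1001 coins: a single value is even less likely, but +/- 1% is decent
print(pr_heads(1001, 500, 500))  # ~ 0.025, roughly 1/39
print(pr_heads(1001, 490, 510))  # ~ 0.49, "not bad, very close to even"
```

The point the slides make shows up directly: as n grows, the probability of any single outcome shrinks, while a small window around the mean keeps substantial probability.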

6 Giving Meaning to the “Mean”
µ ::= E[R]
Pr{R = µ ± 10}? Pr{R = µ ± 2%}?
On average, how much does R deviate from the expected value? E[|R − µ|]?
With full information about the PDF we can answer these questions exactly, but we don’t always have it: sometimes we are trying to predict from real data, and other times it is very hard to formulate the PDF because the events are not independent (incomplete information).

7 Two Dice with µ = 3.5
Fair Die: Pr{D1 = 3.5 ± 1} = 1/3
Loaded Die (throws either 1 or 6): Pr{D2 = 3.5 ± 1} = 0 !!

8 Giving Meaning to the “Mean”
We need more information about the distribution of R than just its mean, µ.

9 Two Distributions, Same Mean
[Figure: two PDFs with the same mean but different spreads; vertical axis Pr{R = x}, horizontal axis x]

10 Deviation from the Mean
Today consider: the Markov bound, the Chebyshev bound, and the binomial distribution, each using more information about the distribution.

11 Average IQ ::= 100
EXERCISE: What fraction of people can have an IQ ≥ 200?

12 IQ Higher than 200
At most 1/2 the people have IQ ≥ 200. Otherwise, average IQ > (1/2) × 200 = 100.

13 Markov Bound
If R is nonnegative, then Pr{R ≥ x} ≤ E[R]/x for x > 0.
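A minimal numeric check of Markov’s bound on an example PDF (the distribution here is invented for illustration, not from the slides):

```python
from fractions import Fraction

# A nonnegative random variable given by its PDF (hypothetical example).
pdf = {0: Fraction(1, 2), 1: Fraction(1, 4), 4: Fraction(1, 8), 8: Fraction(1, 8)}

ER = sum(v * p for v, p in pdf.items())  # E[R] = 7/4

for x in (1, 2, 4, 8):
    tail = sum(p for v, p in pdf.items() if v >= x)  # Pr{R >= x}
    assert tail <= ER / x  # Markov: the tail never exceeds E[R]/x
    print(f"x={x}: Pr{{R>=x}}={tail} <= E[R]/x={ER / x}")
```

Exact rationals make the comparison airtight; the bound is typically loose, which is the point of slide 17 below.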

14 Markov Bound (Alternate Form)
Pr{R ≥ c·E[R]} ≤ 1/c

15 IQ again
At most 2/3 of the population have IQ ≥ 150 (by Markov: 100/150 = 2/3).

16 Formal Proof of Markov Bound
Let I_x be the indicator variable for the event [R ≥ x]. Then x·I_x ≤ R (since R is nonnegative), so
x·Pr{R ≥ x} = x·E[I_x] = E[x·I_x] ≤ E[R].

17 Markov Bound
Markov’s bound is weak and obvious, but useful anyway.

18 Lower bounds on IQ
Pr{IQ ≤ 50}? = Pr{(250 − IQ) ≥ 200}.
E[250 − IQ] = 150, so Pr{IQ ≤ 50} ≤ 150/200 (by Markov) = 3/4.

19 Deviation from the Mean
What is the probability that R deviates by ≥ x from its expected value?
Pr{|R − µ| ≥ x} = Pr{R ≥ µ + x} + Pr{R ≤ µ − x}

20 Using More Info about the PDF
Pr{|R − µ| ≥ x} = Pr{(R − µ)² ≥ x²} ≤ E[(R − µ)²] / x² (by Markov)

21 Chebyshev Bound
Pr{|R − µ| ≥ x} ≤ Var[R] / x², where the numerator E[(R − µ)²] is the variance.

22 Variance and Standard Deviation
Var[R] ::= E[(R − µ)²]
σ ::= √Var[R]
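The two dice from slide 7 make these definitions concrete: both have mean 3.5, but their variances differ sharply. A short Python sketch (not from the slides) computing both from their PDFs:

```python
# Fair die vs. the loaded die that throws only 1 or 6; both have mean 3.5.
fair   = {v: 1 / 6 for v in (1, 2, 3, 4, 5, 6)}
loaded = {1: 0.5, 6: 0.5}

def mean(pdf):
    return sum(v * p for v, p in pdf.items())

def var(pdf):
    mu = mean(pdf)
    return sum((v - mu) ** 2 * p for v, p in pdf.items())

print(mean(fair), var(fair), var(fair) ** 0.5)        # 3.5, ~2.92, sigma ~ 1.71
print(mean(loaded), var(loaded), var(loaded) ** 0.5)  # 3.5, 6.25, sigma = 2.5
```

The identical means and very different standard deviations are exactly the “two distributions, same mean” picture from slide 9.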

23 Chebyshev Bound (Alternate Form)
Pr{|R − µ| ≥ cσ} ≤ 1/c²

24 Probably close to cσ
The probability that you are within µ ± 2σ is ≥ 3/4; within µ ± 3σ, ≥ 8/9.
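A quick numeric check of the alternate-form Chebyshev bound on a small skewed PDF (invented for illustration; any distribution would do):

```python
# Check Pr{|R - mu| >= c*sigma} <= 1/c^2 on a skewed example PDF.
pdf = {0: 0.6, 1: 0.2, 5: 0.15, 20: 0.05}
mu = sum(v * p for v, p in pdf.items())
variance = sum((v - mu) ** 2 * p for v, p in pdf.items())
sigma = variance ** 0.5

for c in (1.5, 2, 3):
    tail = sum(p for v, p in pdf.items() if abs(v - mu) >= c * sigma)
    assert tail <= 1 / c ** 2  # Chebyshev holds for every c
    print(f"c={c}: Pr = {tail:.3f} <= {1 / c ** 2:.3f}")
```

As with Markov, the bound is valid but loose; its value is that it needs nothing beyond µ and σ.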

25 Variance Formula
E[(R − µ)²] = E[R² − 2µR + µ²]
= E[R²] − 2µE[R] + µ²
= E[R²] − 2µ² + µ²
= E[R²] − µ² = E[R²] − E[R]²

26 Space Station Mir
Suppose the main computer has a probability p of failing every year.
Mean Time to Failure: when do we expect it to fail? E[T] = 1/p. Var[T] = ?
The mean is not enough to keep you from worrying: what is the probability that it fails immediately, or within the first minute? Does it occasionally run forever and occasionally fail immediately? What is the expected deviation from the mean, i.e., what should we expect when Mir launches?

27 Calculating Variance
We know Pr{T = k} = (1 − p)^(k−1) p. How do we compute Var[T]?
Var[T] = E[T²] − (E[T])²
Define Y ::= T²:
T = 1, 2, 3, …, k, …
Y = 1, 4, 9, …, k², …
In general, given a PDF, how do we calculate the variance? It is similar to calculating the expectation.

28 Calculating Variance
Pr{Y = k²} = Pr{T = k}: they are the same event, so E[Y] is computed by summing k²·Pr{T = k} over the range of Y.

29 Mean Time to Failure
E[T] = 1/p, Var[T] = σ² = (1 − p)/p²
p = 1/6: E[T] = 6, σ ≈ 6 (Dice)
p = 1/10: E[T] = 10, σ ≈ 10 (Mir 1)
p = 1/1000: E[T] = 1000, σ ≈ 1000 (Mir 2)
Chebyshev tells us that the probability that Mir 1 lasts more than 30 years is less than 25%.
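For the geometric distribution Pr{T = k} = (1 − p)^(k−1) p, the formulas E[T] = 1/p and Var[T] = (1 − p)/p² can be checked numerically by truncating the series (a sketch using Mir 1’s p = 1/10; the truncation point is arbitrary):

```python
p = 0.1   # Mir 1: expect failure around year 10
N = 2000  # truncation; the tail beyond this is astronomically small for p = 0.1

probs = [(1 - p) ** (k - 1) * p for k in range(1, N + 1)]
ET  = sum(k * q for k, q in zip(range(1, N + 1), probs))       # E[T]
ET2 = sum(k * k * q for k, q in zip(range(1, N + 1), probs))   # E[T^2] via Y = T^2
variance = ET2 - ET ** 2

print(ET)        # ~ 1/p = 10
print(variance)  # ~ (1-p)/p^2 = 90, so sigma ~ 9.5, roughly 10
```

Note σ ≈ 1/p for small p, which is why the slide rounds σ to E[T] in each row.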

30 Birthday Pairs
M ::= # pairs with matching b’days among n people in a year with y days?
E[M] = (n choose 2)·(1/y) = n(n − 1)/(2y) (by linearity of expectation)
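For concreteness, evaluating this linearity-of-expectation formula (assuming y = 365) for the 107-person class on the following slides:

```python
from math import comb

def expected_pairs(n, y=365):
    # Each of the C(n, 2) pairs matches with probability 1/y;
    # linearity of expectation sums these; independence is not needed.
    return comb(n, 2) / y

print(expected_pairs(107))  # ~ 15.5, roughly the 16 pairs the class saw
```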

31 16 Birthday Pairs
May 1, May 14, May 19, Aug 3, Aug 18, Oct 26, Nov 13, Dec 26, Jan 12, Jan 16, Jan 25, Feb 4, Feb 11, Feb 19, Mar 22, Apr 9

32 Birthday Pairs
Anderson, Brian C; Sherry, Brennan; Bau, Benjamin D; Elliott, Grant A; Bisker, Solomon M; Tse, Chester G; Bissonnette, Sara D; Wang, Jessie; Cheng, Marjorie; Goldin, Donald M; Chevalier, Kevin R; Palakodety, Ravi K

33 Birthday Pairs
Hollingsworth, Pamela; Rbeiz, Michel A; Hoover, Thomas M; Menchu, Miguel M; Iyo, Shannon J; Polonski, Marek; Jones, Harvey C; Lin, Hanyin H; Kim, MinJi; Liang, Alvin Y; Kloster, David E; Radez, Rob A; Luger-Guillaume, R.L.; Val, Charles C.; Li, Xue; Permar, Justin D

34 Birthday Pairs
Liang, Alvin Y; Kim, MinJi; Lin, Hanyin H; Jones, Harvey C; Menchu, Miguel; Hoover, Thomas; Meng, Nathan F; Ngo, Tri M; Netolicka, Karolina; Sanchez, Rodrigo; Palakodety, Ravi K; Chevalier, Kevin R; Permar, Justin D; Li, Xue

35 Birthday Pairs
Polonski, Marek; Iyo, Shannon J; Radez, Rob A; Kloster, David E; Rbeiz, Michel A; Hollingsworth, Pamela; Sanchez, Rodrigo; Netolicka, Karolina; Sherry, Brennan; Anderson, Brian C; Tse, Chester G; Bisker, Solomon M; Val, Charles C; Luger-Guillaume, R. L.; Wang, Jessie; Bissonnette, Sara D

36 Birthday Month Pairs
[Table: number of matching birthday pairs per month; the layout was lost in transcription]

37 Birthday Pairs
E[P] ≈ 16 for this class (107 students) and for the planet. What is Pr{P = 23}? Calculating Pr{P = k} is hard. However, we can still calculate the variance!

38 Variance of Sums
Var[X + Y] = Var[X] + Var[Y] if X, Y are independent.
More generally, Var[X1 + X2 + ···] = Var[X1] + Var[X2] + ··· if the Xi are pairwise independent.
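A tiny check that pairwise independence suffices, using the classic example of variables that are pairwise but not mutually independent (two fair bits and their XOR; an illustration, not from the slides):

```python
from itertools import product

# X, Y fair bits, Z = X xor Y: any two are independent, but all three are not
# (Z is determined by X and Y). Each of the four (x, y) outcomes has prob 1/4.
outcomes = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]

def var(values):
    """Variance over equally likely outcomes."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

sums = [x + y + z for x, y, z in outcomes]
print(var(sums))  # 0.75 = Var[X] + Var[Y] + Var[Z] = 3 * 1/4
```

Variance still adds even though the three variables are jointly dependent, which is exactly what the birthday-pair argument below relies on.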

39 Pairwise Independence
U ::= Albert and Radhika have the same b’day
V ::= Albert and Tina have the same b’day
U and V are independent. But with W ::= Radhika and Tina have the same b’day, (U ∧ V) and W are NOT independent.

40 Birthday Pairs
Xi = 1 if pair i has the same birthday. The Xi are pairwise independent.
E[Xi] = p where p ::= 1/Y
Var[Xi] = p(1 − p) ≈ 1/Y
Number of pairs M ≈ N²/2
E[P] = E[X1 + X2 + … + XM] ≈ N²/(2Y)
Var[P] = Var[X1 + X2 + … + XM] ≈ N²/(2Y)

41 Birthday Predictions
For 107 students, E[P] ≈ 16, Var[P] ≈ 16 (std dev ≈ 4.0).
Chebyshev: more than 75% of the time we expect to see 16 ± 8 pairs, that is, between 8 and 24 pairs.
What we saw: 16 pairs.
What if N is larger (a larger class)?
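The Chebyshev prediction can be sanity-checked by simulation (a sketch; the class size 107 and y = 365 come from the slides, while the seed and trial count are arbitrary):

```python
import random
from collections import Counter

random.seed(0)
n, y, trials = 107, 365, 2000

counts = []
for _ in range(trials):
    days = Counter(random.randrange(y) for _ in range(n))
    # number of matching pairs = sum over days of C(k, 2)
    counts.append(sum(k * (k - 1) // 2 for k in days.values()))

mean = sum(counts) / trials
variance = sum((c - mean) ** 2 for c in counts) / trials
print(mean, variance)  # both ~ C(107, 2)/365 ~ 15.5
```

Both the sample mean and the sample variance should land near 15.5, matching E[P] ≈ Var[P] ≈ N²/(2Y).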

42 In Class Problems
Additivity of Variance?
Binomial Distribution: YES
Random Hat Check: NO

