1 Econ 240A Power Four
2 Last Time Probability
3 The Big Picture
The Classical Statistical Trail: Descriptive Statistics, Inferential Statistics, Probability, Discrete Random Variables, Discrete Probability Distributions; Moments, Binomial, Application: Rates & Proportions
8 Working Problems
9 Problem 6.61 A survey of middle-aged men reveals that 28% of them are balding at the crown of their head. Moreover, it is known that such men have an 18% probability of suffering a heart attack in the next ten years. Men who are not balding in this way have an 11% probability of a heart attack. Find the probability that a middle-aged man will suffer a heart attack in the next ten years.
10 Middle-aged (MA) men: Bald, P(Bald and MA) = 0.28; Not Bald, P(Not Bald and MA) = 0.72
11 Middle-aged (MA) men, with HA = heart attack: Bald, P(Bald and MA) = 0.28, P(HA/Bald and MA) = 0.18; Not Bald, P(Not Bald and MA) = 0.72, P(HA/Not Bald and MA) = 0.11
12 Probability of a heart attack in the next ten years: P(HA) = P(HA and Bald and MA) + P(HA and Not Bald and MA). P(HA) = P(HA/Bald and MA)*P(Bald and MA) + P(HA/Not Bald and MA)*P(Not Bald and MA). P(HA) = 0.18*0.28 + 0.11*0.72 = 0.0504 + 0.0792 = 0.1296
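The arithmetic for Problem 6.61 can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative assumptions, and the 0.72 for men who are not balding is taken as 1 - 0.28.

```python
# Law of total probability for Problem 6.61.
# Variable names are illustrative; the numbers come from the problem statement.
p_bald = 0.28                  # share of middle-aged men balding at the crown
p_not_bald = 1 - p_bald        # 0.72, the complement
p_ha_given_bald = 0.18         # P(HA/Bald and MA)
p_ha_given_not_bald = 0.11     # P(HA/Not Bald and MA)

# Weight each conditional probability by the probability of its branch.
p_ha = p_ha_given_bald * p_bald + p_ha_given_not_bald * p_not_bald
print(round(p_ha, 4))  # 0.1296
```

So roughly 13% of middle-aged men would be expected to suffer a heart attack in the next ten years.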
13 This time
14 Random Variables. There is a natural transition from our discussion of probability and Bernoulli trials last time to random variables. Define k to be the random variable: the number of heads in 1 flip, 2 flips, or n flips of a coin. We can find the probability that k=0 or k=n by brute force using probability trees. We can find the histogram for k, its central tendency, and its dispersion.
15 Outline: Random Variables & Bernoulli Trials. Example: one flip of a coin –expected value of the number of heads –variance in the number of heads. Example: two flips of a coin. A fair coin: frequency distribution of the number of heads –one flip –two flips
16 Outline (Cont.): Three flips of a fair coin, the number of combinations of the number of heads. The binomial distribution. Frequency distributions for the binomial. The expected value of a discrete random variable. The variance of a discrete random variable
17 Concept: Bernoulli Trial –two outcomes, e.g. success or failure –successive independent trials –probability of success is the same in each trial. Example: flipping a coin multiple times
18 Flipping a Coin Once. Heads, k=1, Prob. = p; Tails, k=0, Prob. = 1-p. The random variable k is the number of heads. It is variable because k can equal one or zero. It is random because the value of k depends on probabilities of occurrence, p and 1-p.
19 Flipping a Coin Once. The expected value of the number of heads is the value of k weighted by the probability that value of k occurs –$E(k) = 1 \cdot p + 0 \cdot (1-p) = p$. The variance of k is the value of k minus its expected value, squared, weighted by the probability that value of k occurs –$\mathrm{VAR}(k) = (1-p)^2 p + (0-p)^2 (1-p)$, so $\mathrm{VAR}(k) = (1-p)\,p\,[(1-p)+p] = (1-p)\,p$
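The two definitions for a single flip translate directly into code. The sketch below uses a hypothetical helper, `bernoulli_moments`, that weights each value of k by its probability; it is an illustration, not part of the lecture.

```python
def bernoulli_moments(p):
    """Return (E(k), VAR(k)) for the number of heads in one flip."""
    outcomes = {1: p, 0: 1 - p}   # value of k -> probability of that value
    mean = sum(k * prob for k, prob in outcomes.items())
    var = sum((k - mean) ** 2 * prob for k, prob in outcomes.items())
    return mean, var

mean, var = bernoulli_moments(0.5)   # a fair coin
print(mean, var)  # 0.5 0.25
```

For any p the results should agree with p and p(1-p).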
20 Flipping a coin twice: 4 elementary outcomes. On each flip, heads has Prob = p and tails has Prob = 1-p. Outcomes: h, h; k=2. h, t; k=1. t, h; k=1. t, t; k=0.
21 Flipping a Coin Twice. Expected number of heads –$E(k) = 2 p^2 + 1 \cdot p(1-p) + 1 \cdot (1-p)p + 0 \cdot (1-p)^2$, so $E(k) = 2p^2 + p - p^2 + p - p^2 = 2p$ –so we might expect the expected value of k in n independent flips is n*p. Variance in k –$\mathrm{VAR}(k) = (2-2p)^2 p^2 + 2(1-2p)^2 p(1-p) + (0-2p)^2 (1-p)^2$
22 Continuing with the variance in k –$\mathrm{VAR}(k) = (2-2p)^2 p^2 + 2(1-2p)^2 p(1-p) + (0-2p)^2 (1-p)^2$ –$\mathrm{VAR}(k) = 4(1-p)^2 p^2 + 2(1 - 4p + 4p^2)\,p(1-p) + 4p^2 (1-p)^2$ –adding the first and last terms, $8p^2(1-p)^2 + 2(1 - 4p + 4p^2)\,p(1-p)$ –and expanding this last term, $2p(1-p) - 8p^2(1-p) + 8p^3(1-p)$ –$\mathrm{VAR}(k) = 8p^2(1-p)^2 + 2p(1-p) - 8p^2(1-p)(1-p)$ –so $\mathrm{VAR}(k) = 2p(1-p)$, or twice VAR(k) for 1 flip
23 So we might expect the variance in n flips to be np(1-p)
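The conjectures n*p and np(1-p) can be checked by brute force, in the spirit of the probability trees. The sketch below enumerates every elementary outcome of n flips; the function name `heads_moments` and the value p = 0.3 are assumptions made for illustration.

```python
from itertools import product

def heads_moments(n, p):
    """Brute-force E(k) and VAR(k) over all 2**n elementary outcomes."""
    outcomes = []
    for flips in product((1, 0), repeat=n):      # 1 = heads, 0 = tails
        k = sum(flips)                           # number of heads on this branch
        prob = p ** k * (1 - p) ** (n - k)       # independent flips multiply
        outcomes.append((k, prob))
    mean = sum(k * prob for k, prob in outcomes)
    var = sum((k - mean) ** 2 * prob for k, prob in outcomes)
    return mean, var

p = 0.3
for n in (1, 2, 3):
    mean, var = heads_moments(n, p)
    # Compare the enumerated values with n*p and n*p*(1-p).
    print(n, round(mean, 6), round(var, 6), round(n * p, 6), round(n * p * (1 - p), 6))
```

Enumeration grows as 2**n, so this is only practical for small n; it is a check on the pattern, not a proof.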
24 Frequency Distribution for the Number of Heads A fair coin
25 [Figure: One Flip of the Coin. Probability vs. # of heads; 0 heads and 1 head each with probability 1/2]
[Figure: Two Flips of a Fair Coin. Probability (axis ticks at 1/4 and 1/2) vs. # of heads]
27 Three Flips of a Fair Coin. It is not so hard to see what the value of the number of heads, k, might be for three flips of a coin: zero, one, two, or three. But one head can occur three ways, as can two heads. Hence we need to consider the number of ways k can occur, i.e. the combinations of branching probabilities where order does not count.
[Figure: probability tree for three flips of a coin; 8 elementary outcomes, each branch labeled with 3, 2, 1, or 0 heads]
29 Three Flips of a Coin. There is only one way of getting three heads or of getting zero heads. But there are three ways of getting two heads or getting one head. One way of calculating the number of combinations is $C_n(k) = n!/[k!\,(n-k)!]$. Another way of calculating the number of combinations is Pascal's triangle.
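Both ways of counting combinations can be sketched in Python. The helper names `combinations` and `pascal_row` are hypothetical; `math.comb` from the standard library is used only as a cross-check.

```python
from math import comb, factorial

def combinations(n, k):
    """C_n(k) = n! / (k! * (n-k)!), computed from factorials."""
    return factorial(n) // (factorial(k) * factorial(n - k))

def pascal_row(n):
    """Row n of Pascal's triangle, built by adding adjacent entries."""
    row = [1]
    for _ in range(n):
        row = [1] + [a + b for a, b in zip(row, row[1:])] + [1]
    return row

print([combinations(3, k) for k in range(4)])  # [1, 3, 3, 1]
print(pascal_row(3))                           # [1, 3, 3, 1]
print([comb(3, k) for k in range(4)])          # [1, 3, 3, 1]
```

For three flips this reproduces the counts on the slide: one way each for zero or three heads, three ways each for one or two heads.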
[Figure: Three Flips of a Coin. Probability (axis ticks in eighths, up to 3/8) vs. # of heads (0 to 3)]
32 The Probability of Getting k Heads. The probability of getting k heads (along a given branch) in n trials is $p^k (1-p)^{n-k}$. The number of branches with k heads in n trials is given by $C_n(k)$. So the probability of k heads in n trials is $\mathrm{Prob}(k) = C_n(k)\, p^k (1-p)^{n-k}$. This is the discrete binomial distribution, where k can only take on the discrete values 0, 1, …, n.
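A minimal sketch of the binomial formula, assuming Python 3.8 or later for `math.comb`. The function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Prob(k) = C_n(k) * p**k * (1-p)**(n-k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Three flips of a fair coin: probabilities of 0, 1, 2, 3 heads.
probs = [binomial_pmf(k, 3, 0.5) for k in range(4)]
print(probs)  # [0.125, 0.375, 0.375, 0.125]
```

These are the 1/8, 3/8, 3/8, 1/8 heights of the three-flip frequency distribution, and the probabilities over k = 0, …, n sum to one.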
Expected Value of a discrete random variable: $E(x) = \sum_i x_i\, p(x_i)$. The expected value of a discrete random variable is the weighted average of the observations, where the weight is the frequency of that observation.
34 Expected Value of the sum of random variables E(x + y) = E(x) + E(y)
Expected Number of Heads After Two Flips. Flip One: $k_i^{I}$ heads. Flip Two: $k_j^{II}$ heads. Because of independence, $p(k_i^{I} \text{ and } k_j^{II}) = p(k_i^{I})\, p(k_j^{II})$. Expected number of heads after two flips: $E(k_i^{I} + k_j^{II}) = \sum_i \sum_j (k_i^{I} + k_j^{II})\, p(k_i^{I})\, p(k_j^{II})$
Cont. $E(k_i^{I} + k_j^{II}) = \sum_i \sum_j k_i^{I}\, p(k_i^{I})\, p(k_j^{II}) + \sum_i \sum_j k_j^{II}\, p(k_j^{II})\, p(k_i^{I})$. Since the probabilities for each flip sum to one, $E(k_i^{I} + k_j^{II}) = E(k_i^{I}) + E(k_j^{II}) = p \cdot 1 + p \cdot 1 = 2p$. So the mean after n flips is n*p.
Variance of a discrete random variable: $\mathrm{VAR}(x_i) = \sum_i [x_i - E(x)]^2\, p(x_i)$. The variance of a discrete random variable is the weighted sum of each observation minus its expected value, squared, where the weight is the frequency of that observation.
Cont. $\mathrm{VAR}(x_i) = \sum_i x_i^2\, p(x_i) - 2E(x) \sum_i x_i\, p(x_i) + [E(x)]^2 \sum_i p(x_i) = E(x^2) - [E(x)]^2$. So the variance equals the second moment minus the first moment squared.
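The "second moment minus the first moment squared" shortcut can be checked against the direct definition. The sketch below uses a fair die as an assumed example (it is not worked on the slides) and exact fractions to avoid rounding; the helper name `moments` is hypothetical.

```python
from fractions import Fraction

def moments(dist):
    """dist maps each value x_i to its probability p(x_i)."""
    first = sum(x * p for x, p in dist.items())            # E(x)
    second = sum(x ** 2 * p for x, p in dist.items())      # E(x^2)
    var_direct = sum((x - first) ** 2 * p for x, p in dist.items())
    return first, second, var_direct

die = {x: Fraction(1, 6) for x in range(1, 7)}   # fair die, illustrative only
first, second, var_direct = moments(die)
print(first, second - first ** 2, var_direct)  # 7/2 35/12 35/12
```

Both routes give the same variance, as the algebra says they must.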
The variance of the sum of discrete random variables: $\mathrm{VAR}[x_i + y_j] = E[x_i + y_j - E(x_i + y_j)]^2$; $\mathrm{VAR}[x_i + y_j] = E[(x_i - E x_i) + (y_j - E y_j)]^2$; $\mathrm{VAR}[x_i + y_j] = E[(x_i - E x_i)^2 + 2(x_i - E x_i)(y_j - E y_j) + (y_j - E y_j)^2]$; $\mathrm{VAR}[x_i + y_j] = \mathrm{VAR}[x_i] + 2\,\mathrm{COV}[x_i * y_j] + \mathrm{VAR}[y_j]$
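The variance of the sum if x and y are independent: $\mathrm{COV}[x_i * y_j] = E(x_i - E x_i)(y_j - E y_j)$; $\mathrm{COV}[x_i * y_j] = \sum_i \sum_j (x_i - E x_i)(y_j - E y_j)\, p[x(i) \text{ and } y(j)]$; with independence the joint probability factors, so $\mathrm{COV}[x_i * y_j] = \sum_i (x_i - E x_i)\, p[x(i)] \cdot \sum_j (y_j - E y_j)\, p[y(j)]$; each sum of deviations from the mean is zero, so $\mathrm{COV}[x_i * y_j] = 0$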
41 Variance of the number of heads after two flips. Since the variance of the number of heads on the first flip is p*(1-p), and likewise for the second flip, and the flips are independent so the covariance is zero, the variance in the number of heads after two flips is the sum, 2p(1-p), and the variance after n flips is np(1-p).
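A small numerical check of the independence argument, written as a sketch: it builds the joint distribution of two flips as the product of the marginals, then computes the covariance and the variance of the sum directly. The value p = 0.3 and all variable names are assumptions for illustration.

```python
from itertools import product

p = 0.3                       # probability of heads; illustrative value only
flip = {1: p, 0: 1 - p}       # heads on one flip -> probability

# Joint distribution of two independent flips: p(x and y) = p(x) * p(y).
joint = {
    (x, y): px * py
    for (x, px), (y, py) in product(flip.items(), repeat=2)
}

mean = sum(k * q for k, q in flip.items())                     # E(k) = p
var_one = sum((k - mean) ** 2 * q for k, q in flip.items())    # p(1-p)
cov = sum((x - mean) * (y - mean) * q for (x, y), q in joint.items())
var_sum = sum((x + y - 2 * mean) ** 2 * q for (x, y), q in joint.items())

# cov should be zero (up to floating-point rounding) and
# var_sum should match 2 * var_one = 2p(1-p).
print(cov, var_sum, 2 * var_one)
```

If the two flips were not independent, the joint probabilities would not factor and the covariance term would generally remain.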
42 Application Rates and Proportions
Field Poll. The estimated proportion, from the sample, that will vote for Giuliani is $\hat{p} = k/n$, where $\hat{p}$ is 0.35 or 35%; k is the number of “successes”, the number of likely voters sampled who are for Giuliani, approximately 122; n is the size of the sample, 348.
Field Poll. What is the expected proportion of voters Nov. 7 who will vote for Giuliani? $E(\hat{p}) = E(k)/n = np/n = p$, where from the binomial distribution, E(k) = np. So if the sample is representative of voters and their preferences, 35% should vote for Giuliani next February.
Field Poll. How much dispersion is in this estimate, i.e. as reported by the Field Poll, what is the sampling error? The sampling error is calculated as twice the standard deviation, or square root of the variance, in $\hat{p}$: $\mathrm{VAR}(\hat{p}) = \mathrm{VAR}(k)/n^2 = np(1-p)/n^2 = p(1-p)/n$, and using 0.35 as an estimate of p, $\mathrm{VAR}(\hat{p}) = 0.35 \cdot 0.65/348 \approx 0.00065$, so the standard deviation is about 0.026.
48 Field Poll. So the sampling error should be 2*0.026 or 5.2%. The Field Poll reports a 95% confidence interval, or about two standard errors, i.e. 2*2.6% = 5.2%.
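The sampling-error arithmetic can be reproduced as a short sketch. It takes p = 0.35 and n = 348 from the slides; the variable names are illustrative assumptions.

```python
from math import sqrt

n = 348          # sample size, from the slides
p_hat = 0.35     # estimated proportion, used as the estimate of p

variance = p_hat * (1 - p_hat) / n     # VAR(p_hat) = p(1-p)/n
std_dev = sqrt(variance)               # standard deviation of p_hat
sampling_error = 2 * std_dev           # about two standard deviations

print(round(variance, 5), round(std_dev, 3), round(sampling_error, 3))
```

Without intermediate rounding the sampling error comes out near 5.1%; rounding the standard deviation to 0.026 before doubling gives the 5.2% quoted on the slide.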
49 Field Poll. Is it possible that Giuliani might get 50% of the vote or more? Not likely, since the probability of Giuliani receiving more than 40% of the vote is only 2.5%. Based on a normal approximation to the binomial, the true proportion voting for Giuliani should fall between 29.5% and 40.5% with probability of about 95%, unless sentiments change.
52 Lab Two. The Binomial Distribution, Numbers & Plots –Coin flips: one, two, … ten –Die throws: one, ten, twenty. The Normal Approximation to the Binomial