Discrete probability distributions Chapter 6 - Sullivan Prof. Felix Apfaltrer fapfaltrer@bmcc.cuny.edu Office:N518 Phone: 212-220 8000 x 74 21 Office hours: Tue, Thu 1:30-3 pm
Random variables and distributions
MAT - 150 APFALTRER

A random variable is a variable (typically represented by x) that has a single numerical value, determined by chance, for each outcome of a procedure.
A probability distribution is a graph, table, or formula that gives the probability for each value of the random variable.

Example (gender of children): A study consists of randomly selecting 14 newborn babies and counting the number of girls in the sample. If we assume that having a boy or a girl is equally likely, and let x = number of girls among the 14 babies, then x is a random variable because its value depends on chance. The possible values are x = 0, 1, 2, 3, …, 11, 12, 13, 14. A probability distribution is shown to the left.
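The distribution for this example can be rebuilt by counting outcomes. The following is a minimal sketch in Python, assuming (as the slide does) that boys and girls are equally likely, so that each of the 2^14 possible birth sequences has the same probability. The function name `girls_distribution` is illustrative only.

```python
from math import comb

def girls_distribution(n=14):
    """Return {x: P(x)} for x girls among n births, boys and girls equally likely."""
    total = 2 ** n  # number of equally likely boy/girl sequences
    # comb(n, x) counts the sequences that contain exactly x girls
    return {x: comb(n, x) / total for x in range(n + 1)}

dist = girls_distribution()
for x, p in dist.items():
    print(f"{x:2d}  {p:.3f}")
```

Each printed row pairs a value of x with its probability rounded to three decimals, which is the form a probability table for this example would take.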
Discrete and Continuous Random Variables (r.v.’s)

A discrete random variable has either a finite or a countable number of values. Countable means there may be infinitely many values, but you can still “count” them (there are gaps between them).
A continuous random variable has infinitely many values with no gaps between them (like intervals of real numbers).

Examples of discrete random variables:
  Number of eggs a hen lays per day. Discrete: a hen cannot lay 2.3 eggs in one day. Random: the number is not known for sure in advance.
  Number of people attending the Columbus Day Parade. Discrete: we are counting people. Random: we do not know in advance exactly how many will attend (though we might have an estimate).
  The sum of the faces when we roll two dice.
  The points in a hand of blackjack.
  The average number of eggs per hen per day on a farm with 10 hens.

Examples of continuous random variables:
  Amount of milk a cow produces in a day. Continuous: she might yield 1.345 gallons, or 1.34512 (no gaps in measurement).
  The humidity on a given day. Continuous: the percentage of humidity can be 75.34%.
  The daily closing value of the Dow Jones Industrial Average index.
  The daily ocean temperature at a marine laboratory investigating whales.
Probability histogram

Very similar to a relative frequency histogram, but probability is shown instead of percent (relative frequency).
The values 0, 1, 2, …, 13, 14 are at the centers of the rectangles, so each base = 1 and area = height × base = height. The area of each rectangle therefore equals the probability of its value.
Requirements of Probability Distributions

1. ∑P(x) = 1, where x assumes all possible values.
2. 0 ≤ P(x) ≤ 1 for every individual value of x.

Discussion:
  x takes all possible values, so the sum covers every outcome in the sample space.
  For the table ‘girls’, the sum is 0.999, which differs from 1 only because of rounding.
  Every P(x) is between 0 and 1 because each one is a probability.

Example: Does the table represent a probability distribution?
  All values are between 0 and 1. Good!
  ∑P(x) = 0.2 + 0.3 + 0.4 + 0.5 = 1.4. Oops! The sum is not 1.
  Therefore, it is not a probability distribution.

Example: Does the function P(x) = x/9, for x = 2, 3, and 4, represent a probability distribution?
  P(2) = 2/9, P(3) = 3/9, P(4) = 4/9, so every value is between 0 and 1.
  ∑P(x) = 2/9 + 3/9 + 4/9 = (2+3+4)/9 = 9/9 = 1. The sum is 1.
  Therefore, the function does represent a probability distribution.
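Both requirements can be checked mechanically. The sketch below is one possible way to do it in Python, not a prescribed method. It uses exact fractions so that the comparison with 1 is not affected by floating-point rounding. The helper name `is_probability_distribution` is an assumption made for illustration. Only the probabilities of the table example are used, since those are all the slide lists.

```python
from fractions import Fraction

def is_probability_distribution(probs):
    """Check both requirements: every P(x) lies in [0, 1] and the P(x) sum to 1."""
    probs = list(probs)
    return all(0 <= p <= 1 for p in probs) and sum(probs) == 1

# Table example: probabilities 0.2, 0.3, 0.4, 0.5 written as exact fractions.
table = [Fraction(2, 10), Fraction(3, 10), Fraction(4, 10), Fraction(5, 10)]

# Formula example: P(x) = x/9 for x = 2, 3, 4.
formula = [Fraction(x, 9) for x in (2, 3, 4)]

print(is_probability_distribution(table))    # False: the sum is 7/5 = 1.4
print(is_probability_distribution(formula))  # True: the sum is 9/9 = 1
```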
Mean, Variance and Standard Deviation for Distributions

μ = ∑ x•P(x)   mean
σ² = ∑ (x − μ)²•P(x)   variance
σ² = ∑ [ x²•P(x) ] − μ²   variance (alternative formula)
σ = √( ∑ [ x²•P(x) ] − μ² )   standard deviation
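As a worked illustration of these formulas, the sketch below applies them to the earlier example of the number of girls among 14 births, assuming boys and girls are equally likely. The function name `mean_var_sd` is hypothetical. It computes the variance with both formulas so they can be compared.

```python
from math import comb, sqrt

def mean_var_sd(dist):
    """Return (mean, variance, variance by the alternative formula, standard deviation)."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    var_alt = sum(x ** 2 * p for x, p in dist.items()) - mu ** 2
    return mu, var, var_alt, sqrt(var)

# P(x) for x girls among 14 births, boys and girls equally likely.
dist = {x: comb(14, x) / 2 ** 14 for x in range(15)}

mu, var, var_alt, sd = mean_var_sd(dist)
print(round(mu, 1), round(var, 2), round(sd, 1))  # 7.0 3.5 1.9
```

The two variance formulas give the same value here, and the rounded results are μ = 7 and σ = 1.9.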
Mean, Variance for Distributions (round-off and unusual values)

Round-off rule: round results to one more decimal place than the data.

Range rule for usual values:
  Minimum usual value = μ − 2σ
  Maximum usual value = μ + 2σ

Example: In the previous calculation, μ = 7 and σ = 1.9.
  Minimum usual value: μ − 2σ = 7 − 2(1.9) = 3.2
  Maximum usual value: μ + 2σ = 7 + 2(1.9) = 10.8
  For the group of 14 babies, the usual values for the number of girls fall between 3.2 and 10.8.

Rare event rule: If, under a given assumption, the probability of an observed event is extremely low, we conclude that the assumption is most likely incorrect.

With probabilities:
  x successes among n trials is unusually high if P(x or more) < 0.05.
  x successes among n trials is unusually low if P(x or fewer) < 0.05.

Example (Gender Selection): Getting 13 or more girls.
  P(13 or more girls) = P(13) + P(14) = 0.001 + 0.000 = 0.001 < 0.05, so 13 girls is unusually high.
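Both criteria can be sketched in a few lines of Python. The code assumes the 14-births example with boys and girls equally likely, and uses the rounded values μ = 7 and σ = 1.9 quoted on the slide. The helper names `unusually_high` and `unusually_low` are illustrative.

```python
from math import comb

# Distribution of the number of girls among 14 births (boys and girls equally likely).
n = 14
dist = {x: comb(n, x) / 2 ** n for x in range(n + 1)}

# Range rule, using the rounded values quoted on the slide.
mu, sigma = 7, 1.9
low, high = mu - 2 * sigma, mu + 2 * sigma
print(round(low, 1), round(high, 1))  # 3.2 10.8

def unusually_high(x, dist, cutoff=0.05):
    """True if P(x or more) is below the cutoff."""
    return sum(p for k, p in dist.items() if k >= x) < cutoff

def unusually_low(x, dist, cutoff=0.05):
    """True if P(x or fewer) is below the cutoff."""
    return sum(p for k, p in dist.items() if k <= x) < cutoff

print(unusually_high(13, dist))  # True: P(13 or more) is about 0.001
print(unusually_low(1, dist))    # True: P(1 or fewer) is about 0.001
```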
Expected Value

The mean of a discrete random variable is also called its expected value. It is denoted by E or μX and represents the average value of the outcomes.
  μX = E = E[X] = ∑ { x•P(x) }

Example (NJ Pick 3 game): Bet $0.50 and select a 3-digit number between 000 and 999. If your number is drawn, you collect $275, so your net gain is $274.50. Suppose that you bet $0.50 on the number 007. What is your expected gain or loss?

A: Each of the 1000 numbers is equally likely.
  P(win) = 1/1000 = 0.001
  P(loss) = 999/1000 = 0.999
  E[X] = ∑ x•P(x) = 274.50 • 0.001 (win) + (−0.50) • 0.999 (loss) = 0.2745 − 0.4995 = −0.225
  On average you will lose 22.5 cents every time you play.
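The same calculation can be written as a short Python sketch. The function name `expected_value` and the list-of-pairs layout are choices made for this illustration, not part of the slides.

```python
def expected_value(outcomes):
    """Sum of x * P(x) over (x, P(x)) pairs."""
    return sum(x * p for x, p in outcomes)

# Pick 3: net gain of $274.50 with probability 1/1000, net loss of $0.50 otherwise.
pick3 = [(274.50, 1 / 1000), (-0.50, 999 / 1000)]

ev = expected_value(pick3)
print(round(ev, 4))  # -0.225
```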
Bernoulli Distribution

The Bernoulli probability distribution results from a procedure such that:
  there is one trial, like one flip of a coin
  there are only two outcomes (heads/tails, 0/1, red/white, success/failure)

Examples:
  Tossing one coin (or bean). 1 trial; outcomes: heads or tails.
    P(X = heads) = 0.5, P(X = tails) = 0.5
  Birth of one child. Outcomes: boy or girl.
    P(girl) = 0.513 = p (success probability)
    P(boy) = 0.487 = q = 1 − p (failure probability)
    X = “number of girls” in one birth: 0 or 1
    μ = 0•P(0) + 1•P(1) = 0•q + 1•p = p
    σ² = 0²•P(0) + 1²•P(1) − p² = p − p² = p(1 − p) = pq
  Tossing one die: win if it’s a 6, lose on 1-5. Outcomes: win or lose. Suppose you pay $1 to play and get $3 back if a ‘6’ comes up.
    P(win) = 1/6 = p, P(lose) = 5/6 = q
    X = “number of wins” in one toss: 0 or 1
    μ = 0•P(0) + 1•P(1) = p = 1/6, σ² = pq = 5/36
    Expectation: you pay $1 on every play (probability 6/6) and get $3 back with probability 1/6, so E[gain] = 3•1/6 + (−1)•6/6 = −3/6.
    On average you will lose 50 cents per play.
  Weather tomorrow. 1 trial (day); outcomes: rain or shine.
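The die example can be checked with exact fractions. This is a minimal sketch, assuming a fair die so that p = 1/6. The function name `bernoulli_mean_var` is hypothetical.

```python
from fractions import Fraction

def bernoulli_mean_var(p):
    """Mean and variance of X in {0, 1} with P(1) = p and P(0) = q = 1 - p."""
    q = 1 - p
    dist = {0: q, 1: p}
    mu = sum(x * px for x, px in dist.items())
    var = sum(x ** 2 * px for x, px in dist.items()) - mu ** 2
    return mu, var

p = Fraction(1, 6)  # probability of rolling a 6 with a fair die
mu, var = bernoulli_mean_var(p)
print(mu, var)  # 1/6 5/36

# Game: the $1 stake is paid on every play; $3 comes back with probability 1/6.
expected_gain = 3 * p + (-1) * 1
print(expected_gain)  # -1/2
```

The mean equals p and the variance equals pq, and the expected gain is a loss of 50 cents per play.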
Binomial Distributions

A procedure has a binomial probability distribution if:
  each trial has all outcomes classified into 2 categories
  the procedure has a fixed number of trials
  the trials are independent
  the probabilities remain constant for each trial

Notation for binomial probability distributions:
  S, F   the 2 categories: success and failure
  P(S) = p   probability of success (which category counts as “success” is arbitrary; it can be good or not)
  P(F) = q = 1 − p   probability of failure
  n   fixed number of trials
  X, x   X denotes the random variable; x denotes a specific number of successes in n trials
  P(x) = P(X = x)   probability of getting exactly x successes among n trials
  P(X ≤ x)   probability of getting x or fewer successes among n trials
  B(n, p)   binomial distribution with n trials and probability of success p

Note: a B(n, p) random variable is the sum of n independent Bernoulli random variables, each with probability of success p:
  X = Y1 + Y2 + … + Yn
  μX = μY1 + μY2 + … + μYn = p + p + … + p = np
  σ²X = σ²Y1 + σ²Y2 + … + σ²Yn = pq + pq + … + pq = npq
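The note about sums of Bernoulli variables can be illustrated by brute force. The sketch below lists every sequence of n outcomes, multiplies the trial probabilities (which is where independence is used), and compares the resulting mean and variance with np and npq. The values n = 4 and p = 1/5 are chosen only for illustration, and `sum_of_bernoullis` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

def sum_of_bernoullis(n, p):
    """Distribution of X = Y1 + ... + Yn for independent Bernoulli(p) trials."""
    q = 1 - p
    dist = {x: Fraction(0) for x in range(n + 1)}
    for outcome in product((0, 1), repeat=n):  # every success/failure sequence
        successes = sum(outcome)
        # Independence: the probability of a sequence is the product over its trials.
        dist[successes] += p ** successes * q ** (n - successes)
    return dist

n, p = 4, Fraction(1, 5)  # illustrative values
dist = sum_of_bernoullis(n, p)

mu = sum(x * px for x, px in dist.items())
var = sum(x ** 2 * px for x, px in dist.items()) - mu ** 2
print(mu, n * p)              # 4/5 4/5
print(var, n * p * (1 - p))   # 16/25 16/25
```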
Binomial Distributions: Examples

Remember: Poll and test samples are usually drawn without replacement, so the trials are dependent. If the sample is small enough (< 5% of the population), it is safe to treat the trials as independent (even though, strictly speaking, they are not).

Multiple-choice answers (4 questions answered at random; options a, b, c, d, e): find P(3 answers correct).

Binomially distributed?
  Number of trials fixed: n = 4.
  Trials independent (answers do not depend on previous ones).
  2 outcomes: right or wrong.
  One option is correct, so p = 1/5 = 0.2 and q = 0.8.
  YES!

Use the binomial formula.
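The formula itself does not appear in this transcript. The sketch below assumes the standard binomial probability formula, P(x) = n! / ((n − x)! x!) • p^x • q^(n − x), and applies it to the multiple-choice example. The function name `binomial_probability` is illustrative.

```python
from math import comb

def binomial_probability(x, n, p):
    """P(exactly x successes in n independent trials with success probability p)."""
    q = 1 - p
    # comb(n, x) equals n! / ((n - x)! x!)
    return comb(n, x) * p ** x * q ** (n - x)

# 4 questions, 5 options each, answered at random: p = 0.2
p3 = binomial_probability(3, 4, 0.2)
print(round(p3, 4))  # 0.0256
```

By hand, the same value comes from 4 • (0.2)³ • (0.8) = 4 • 0.008 • 0.8 = 0.0256.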
Binomial Distributions: Examples Continued

Use Table A-1: hence, P(3) = 0.0256.

Question: What is the probability that at least 3 answers are correct?
  ‘at least 3 answers correct’ = {X ≥ 3} = {X = 3 or X = 4}
  P(X ≥ 3) = P(X = 3) + P(X = 4) = 0.0256 + 0.0016 = 0.0272

Mean, variance and expectation:
  μX = np = 4(0.2) = 0.8
  σ²X = npq = 4(0.2)(0.8) = 0.64, so σX = 0.8

Suppose that you pay someone $1000 if the person answering at random gets 3 or more answers correct, and that you receive $100 otherwise. What is your expected loss/gain?
  E[gain] = −1000(0.0272) + 100(1 − 0.0272) = −27.20 + 97.28 = 70.08
  On average you gain $70.08.
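The cumulative probability and the expected gain from the bet can be recomputed with a short Python sketch. It reads the bet the way the slide's calculation does: the $1000 payment happens with probability P(X ≥ 3) and the $100 receipt happens otherwise. The function name `binomial_probability` is illustrative, and the standard binomial formula is assumed in place of the table lookup.

```python
from math import comb

def binomial_probability(x, n, p):
    """P(exactly x successes in n independent trials with success probability p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 4, 0.2

# 'At least 3 correct' means X = 3 or X = 4.
p_at_least_3 = sum(binomial_probability(x, n, p) for x in (3, 4))
print(round(p_at_least_3, 4))  # 0.0272

# Pay $1000 with probability P(X >= 3); receive $100 otherwise.
expected_gain = -1000 * p_at_least_3 + 100 * (1 - p_at_least_3)
print(round(expected_gain, 2))  # 70.08
```

The two terms are −27.20 and 97.28, which add up to an expected gain of $70.08.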
Homework

Sullivan, Chapter 6 review exercises, p. 315 (softcover): #1-5, 7, 8, 13, 15