ENGG 2040C: Probability Models and Applications Andrej Bogdanov Spring 2014 4. Random variables part two.


ENGG 2040C: Probability Models and Applications. Andrej Bogdanov, Spring 2014.
4. Random variables part two

Review
A discrete random variable X assigns a discrete value to every outcome in the sample space.
Probability mass function of X: p(x) = P(X = x)
Expected value of X: E[X] = ∑_x x p(x).
Example: if N is the number of heads in two fair coin flips, then E[N] = 0·(1/4) + 1·(1/2) + 2·(1/4) = 1.

One die (example from last time)
F = face value of a fair 6-sided die
E[F] = 1·(1/6) + 2·(1/6) + 3·(1/6) + 4·(1/6) + 5·(1/6) + 6·(1/6) = 3.5

Two dice
S = sum of the face values of two fair 6-sided dice
Solution 1: We calculate the p.m.f. of S:
s:      2     3     4     5     6     7     8     9     10    11    12
p_S(s): 1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
E[S] = 2·(1/36) + 3·(2/36) + … + 12·(1/36) = 7
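The p.m.f. and expectation above can be checked by enumerating the 36 equally likely outcomes; a minimal sketch in Python, using exact fractions:

```python
from itertools import product
from fractions import Fraction

# p.m.f. of S = sum of two fair dice, by enumerating all 36 outcomes
pmf = {}
for d1, d2 in product(range(1, 7), repeat=2):
    s = d1 + d2
    pmf[s] = pmf.get(s, Fraction(0)) + Fraction(1, 36)

# E[S] = sum over s of s * p(s)
expected_S = sum(s * p for s, p in pmf.items())
print(pmf[7])        # 1/6
print(expected_S)    # 7
```

Running this confirms p_S(7) = 6/36 = 1/6 and E[S] = 7.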

Two dice again
S = sum of face values of two fair 6-sided dice
F₁ = outcome of the first die
F₂ = outcome of the second die
S = F₁ + F₂

Sum of random variables
Let X, Y be two random variables on the same sample space.
X assigns value X(ω) to outcome ω; Y assigns value Y(ω) to outcome ω.
X + Y is the random variable that assigns the value X(ω) + Y(ω) to outcome ω.

Sum of random variables
[Table: for each outcome of the two dice, the values of F₁, F₂, and S = F₁ + F₂]

Linearity of expectation
For every two random variables X and Y:
E[X + Y] = E[X] + E[Y]

Two dice again
S = sum of face values of two fair 6-sided dice, S = F₁ + F₂
Solution 2: E[S] = E[F₁] + E[F₂] = 3.5 + 3.5 = 7

Balls
We draw 3 balls without replacement from an urn containing nine balls: four labeled −1, two labeled 0, and three labeled +1.
What is the expected sum of the values on the 3 balls?

Balls
S = B₁ + B₂ + B₃, where Bᵢ is the value of the i-th ball.
E[S] = E[B₁] + E[B₂] + E[B₃]
p.m.f. of B₁:
x:     −1    0    1
p(x):  4/9  2/9  3/9
E[B₁] = −1·(4/9) + 0·(2/9) + 1·(3/9) = −1/9, and the same for B₂, B₃.
E[S] = 3·(−1/9) = −1/3.
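The −1/3 answer can also be verified by brute force, since all C(9, 3) = 84 draws of 3 balls without replacement are equally likely. A small sketch (the urn composition is the one read off the p.m.f. above: four −1s, two 0s, three +1s):

```python
from itertools import combinations
from fractions import Fraction

# the urn: four balls labeled -1, two labeled 0, three labeled +1
urn = [-1] * 4 + [0] * 2 + [1] * 3

# all equally likely 3-ball draws without replacement (by ball index)
draws = list(combinations(range(9), 3))
total = sum(urn[i] + urn[j] + urn[k] for i, j, k in draws)
expected_sum = Fraction(total, len(draws))
print(expected_sum)   # -1/3
```

Note how linearity of expectation gave the same answer without any enumeration.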

Three dice
N = number of dice that show a given face value (say ⚅). Find E[N].
Let Iₖ = 1 if the face value of the k-th die equals ⚅, and 0 if not. Then N = I₁ + I₂ + I₃.
Solution:
E[N] = E[I₁] + E[I₂] + E[I₃]
E[I₁] = 1·(1/6) + 0·(5/6) = 1/6, and likewise E[I₂] = E[I₃] = 1/6.
E[N] = 3·(1/6) = 1/2.

Problem for you to solve
Five balls are chosen from an urn with 8 blue balls and 10 red balls. What is the expected number of blue balls that are chosen,
(a) if the balls are chosen without replacement?
(b) if the balls are chosen with replacement?

The indicator (Bernoulli) random variable
Perform a trial that succeeds with probability p and fails with probability 1 − p.
x:     0       1
p(x):  1 − p   p
[Bar charts of the p.m.f. for p = 0.5 and p = 0.4]
If X is Bernoulli(p), then E[X] = p.

The binomial random variable
Binomial(n, p): Perform n independent trials, each of which succeeds with probability p. X = number of successes.
Examples:
Toss n coins. "Number of heads" is Binomial(n, ½).
Toss n dice. "Number of ⚅s" is Binomial(n, 1/6).

A less obvious example
Toss n coins. Let C be the number of consecutive changes (HT or TH).
Examples:
ω = HTHHHHT: C(ω) = 3
ω = THHHHHT: C(ω) = 2
ω = HHHHHHH: C(ω) = 0
Then C is Binomial(n − 1, ½).
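Why Binomial(n − 1, ½) holds: each of the n − 1 adjacent pairs of flips changes with probability ½, independently. A quick exhaustive check for n = 7 (comparing the count of sequences with k changes against 2ⁿ · C(n−1, k) / 2ⁿ⁻¹ = 2·C(n−1, k)):

```python
from itertools import product
from math import comb
from collections import Counter

n = 7
counts = Counter()
for seq in product("HT", repeat=n):
    # C = number of adjacent positions where the coin changes (HT or TH)
    c = sum(seq[i] != seq[i + 1] for i in range(n - 1))
    counts[c] += 1

# counts[k] / 2**n should equal the Binomial(n-1, 1/2) p.m.f. C(n-1, k) / 2**(n-1)
for k in range(n):
    assert counts[k] == 2 * comb(n - 1, k)
print(dict(counts))
```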

A non-example Draw a 10 card hand from a 52-card deck. Let N = number of aces among the drawn cards Is N a Binomial(10, 1/13) random variable? No! Trial outcomes are not independent.

Properties of binomial random variables
If X is Binomial(n, p), its p.m.f. is
p(k) = P(X = k) = C(n, k) p^k (1 − p)^(n−k)
We can write X = I₁ + … + Iₙ, where Iᵢ is an indicator random variable for the success of the i-th trial. Then
E[X] = E[I₁] + … + E[Iₙ] = p + … + p = np.
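As a sketch, the p.m.f. formula and the identity E[X] = np can be checked directly with exact arithmetic (the helper name `binomial_pmf` is ours, not from the course code):

```python
from math import comb
from fractions import Fraction

def binomial_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, Fraction(1, 2)
pmf = [binomial_pmf(n, p, k) for k in range(n + 1)]

# the p.m.f. sums to 1, and the mean is np
mean = sum(k * q for k, q in enumerate(pmf))
print(sum(pmf), mean)   # 1 5
```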

Probability mass function
[Plots of the p.m.f.s of Binomial(10, 0.5), Binomial(50, 0.5), Binomial(10, 0.3), and Binomial(50, 0.3)]

Functions of random variables
If X is a random variable, then Y = f(X) is a random variable with p.m.f.
p_Y(y) = ∑_{x: f(x) = y} p_X(x).
p.m.f. of X:         x: 0, 1, 2     p(x): 1/3, 1/3, 1/3
p.m.f. of X − 1:     y: −1, 0, 1    p(y): 1/3, 1/3, 1/3
p.m.f. of (X − 1)²:  y: 0, 1        p(y): 1/3, 2/3
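The rule "sum p_X(x) over all x that map to y" translates directly into code; here is a minimal sketch (the helper `pmf_of_function` is a name we chose) reproducing the (X − 1)² example above:

```python
from fractions import Fraction

def pmf_of_function(pmf_x, f):
    """p.m.f. of Y = f(X): p_Y(y) = sum of p_X(x) over x with f(x) = y."""
    pmf_y = {}
    for x, px in pmf_x.items():
        y = f(x)
        pmf_y[y] = pmf_y.get(y, Fraction(0)) + px
    return pmf_y

# X uniform on {0, 1, 2}, as on the slide
pmf_x = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}
pmf_y = pmf_of_function(pmf_x, lambda x: (x - 1)**2)
print(pmf_y)   # (X-1)^2 takes value 0 with prob. 1/3 and value 1 with prob. 2/3
```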

Investments
You have two investment choices:
A: put $25 in one stock
B: put $½ in each of 50 unrelated stocks
Which do you prefer?

Investments
Probability model: Each stock doubles in value with probability ½ and loses all value with probability ½. Different stocks perform independently.

Investments
A: put $25 in one stock → N_A = amount from choice A, distributed as 50 × Bernoulli(½)
B: put $½ in each of 50 stocks → N_B = amount from choice B, distributed as Binomial(50, ½)
E[N_A] = 50·(½) = 25 and E[N_B] = 50·(½) = 25: both choices have the same expected value.

Variance and standard deviation
Let μ = E[X] be the expected value of X.
The variance of X is the quantity Var[X] = E[(X − μ)²].
The standard deviation of X is σ = √Var[X].
It measures how far X typically is from μ.

Calculating variance
μ = E[N_A] = 25
p.m.f. of N_A:          x: 0, 50    p(x): ½, ½
p.m.f. of (N_A − μ)²:   y: 25²      q(y): 1
Var[N_A] = E[(N_A − μ)²] = 25² = 625
σ = std. dev. of N_A = 25
[Plot of the p.m.f. with μ − σ, μ, μ + σ marked]

Another formula for variance
Var[X] = E[(X − μ)²]
= E[X² − 2μX + μ²]
= E[X²] + E[−2μX] + E[μ²]      (linearity of expectation)
= E[X²] − 2μ E[X] + μ²          (for constant c, E[cX] = cE[X] and E[c] = c)
= E[X²] − 2μ² + μ²
= E[X²] − μ²
So Var[X] = E[X²] − E[X]².
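Both formulas can be checked on the investments example (N_A is $0 or $50 with probability ½ each); a short sketch with exact fractions:

```python
from fractions import Fraction

def expectation(pmf):
    """E[X] = sum over x of x * p(x)."""
    return sum(x * p for x, p in pmf.items())

# N_A from the investments example: $0 or $50, each with probability 1/2
pmf_NA = {0: Fraction(1, 2), 50: Fraction(1, 2)}
mu = expectation(pmf_NA)                                        # 25

var_direct = sum((x - mu)**2 * p for x, p in pmf_NA.items())    # E[(X - mu)^2]
var_alt = sum(x**2 * p for x, p in pmf_NA.items()) - mu**2      # E[X^2] - mu^2
print(mu, var_direct, var_alt)   # both variance formulas give 625
```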

Variance of a binomial random variable
Suppose X is Binomial(n, p). Then X = I₁ + … + Iₙ, where Iᵢ = 1 if trial i succeeds and 0 if it fails.
μ = E[X] = np, so Var[X] = E[X²] − μ² = E[X²] − (np)².
E[X²] = E[(I₁ + … + Iₙ)²]
= E[I₁² + … + Iₙ² + I₁I₂ + I₁I₃ + … + IₙIₙ₋₁]
= E[I₁²] + … + E[Iₙ²] + E[I₁I₂] + … + E[IₙIₙ₋₁]
E[Iᵢ²] = E[Iᵢ] = p, so the n square terms contribute np.
E[IᵢIⱼ] = P(Iᵢ = 1 and Iⱼ = 1) = P(Iᵢ = 1) P(Iⱼ = 1) = p² by independence, so the n(n − 1) cross terms contribute n(n − 1)p².
Therefore Var[X] = np + n(n − 1)p² − (np)².

Variance of a binomial random variable
Suppose X is Binomial(n, p), so μ = E[X] = np.
Var[X] = np + n(n − 1)p² − (np)² = np − np² = np(1 − p)
The standard deviation of X is σ = √(np(1 − p)).

Investments
A: put $25 in one stock → N_A is 50 × Bernoulli(½), with σ = 25
B: put $½ in each of 50 stocks → N_B is Binomial(50, ½), with σ = √(50 · ½ · ½) = 3.536…
[Plots of the two p.m.f.s with μ − σ, μ, μ + σ marked: same mean, very different spread]

Average household size
In 2011 the average household in Hong Kong had 2.9 people. Take a random person. What is the average number of people in his/her household?
A: < 2.9    B: 2.9    C: > 2.9

Average household size
Example: two households of sizes 3 and 3. The average household size is 3, and the average size of a random person's household is also 3.
Now two households of sizes 1 and 5. The average household size is still 3, but the average size of a random person's household is (1·1 + 5·5)/6 = 4⅓.

Average household size
What is the average household size?
household size:    1     2     3     4     5     6 or more
% of households:   16.6  25.6  24.4  21.4  8.7   3.5
From Hong Kong Annual Digest of Statistics, 2012
Probability model: the sample space is the set of households of Hong Kong, with equally likely outcomes. X = number of people in the household.
E[X] ≈ 1×.166 + 2×.256 + 3×.244 + 4×.214 + 5×.087 + 6×.035 = 2.91

Average household size
Take a random person. What is the average number of people in his/her household?
Probability model: the sample space is the set of people of Hong Kong, with equally likely outcomes. Y = number of people in that person's household.
Let's find the p.m.f. p_Y(y) = P(Y = y).

Average household size
household size:    1     2     3     4     5     6 or more
% of households:   16.6  25.6  24.4  21.4  8.7   3.5
Let H = number of households in Hong Kong and N = number of people in Hong Kong.
N ≈ 1×.166H + 2×.256H + 3×.244H + 4×.214H + 5×.087H + 6×.035H
p_Y(1) = P(Y = 1) = 1×.166H/N
p_Y(2) = P(Y = 2) = 2×.256H/N
In general, p_Y(y) = P(Y = y) = y×p_X(y)·H/N = y×p_X(y)/E[X].

Average household size
X = number of people in a random household
Y = number of people in the household of a random person
p_Y(y) = y p_X(y) / E[X]
E[Y] = ∑_y y p_Y(y) = ∑_y y² p_X(y) / E[X]
E[Y] ≈ (1²×.166 + 2²×.256 + 3²×.244 + 4²×.214 + 5²×.087 + 6²×.035) / 2.91 ≈ 3.521
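The whole calculation fits in a few lines; this sketch uses the slide's percentages (treating "6 or more" as exactly 6, as the slide does):

```python
# household-size distribution from the slide ("6" stands for "6 or more")
pmf_X = {1: .166, 2: .256, 3: .244, 4: .214, 5: .087, 6: .035}

EX = sum(x * p for x, p in pmf_X.items())       # average household size, ~2.91

# size-biased distribution: p_Y(y) = y * p_X(y) / E[X]
pmf_Y = {y: y * p / EX for y, p in pmf_X.items()}
EY = sum(y * p for y, p in pmf_Y.items())       # = E[X^2] / E[X], ~3.52

print(round(EX, 2), round(EY, 2))
```

The size-biased reweighting y·p_X(y)/E[X] is exactly why the random person's household is larger on average: bigger households contain more people to be sampled.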

Preview
X = number of people in a random household
Y = number of people in the household of a random person
E[Y] = E[X²] / E[X]
Because Var[X] ≥ 0, E[X²] ≥ (E[X])², so E[Y] ≥ E[X].
The two are equal only if all households have the same size.

This little Möbius strip of a phenomenon is called the “generalized friendship paradox,” and at first glance it makes no sense. Everyone’s friends can’t be richer and more popular — that would just escalate until everyone’s a socialite billionaire. The whole thing turns on averages, though. Most people have small numbers of friends and, apparently, moderate levels of wealth and happiness. A few people have buckets of friends and money and are (as a result?) wildly happy. When you take the two groups together, the really obnoxiously lucky people skew the numbers for the rest of us. Here’s how MIT’s Technology Review explains the math: The paradox arises because numbers of friends people have are distributed in a way that follows a power law rather than an ordinary linear relationship. So most people have a few friends while a small number of people have lots of friends. It’s this second small group that causes the paradox. People with lots of friends are more likely to number among your friends in the first place. And when they do, they significantly raise the average number of friends that your friends have. That’s the reason that, on average, your friends have more friends than you do. And this rule doesn’t just apply to friendship — other studies have shown that your Twitter followers have more followers than you, and your sexual partners have more partners than you’ve had. This latest study, by Young-Ho Eom at the University of Toulouse and Hang-Hyun Jo at Aalto University in Finland, centered on citations and coauthors in scientific journals. Essentially, the “generalized friendship paradox” applies to all interpersonal networks, regardless of whether they’re set in real life or online.
So while it’s tempting to blame social media for what the New York Times last month called “the agony of Instagram” — that peculiar mix of jealousy and insecurity that accompanies any glimpse into other people’s glamorously Hudson-ed lives — the evidence suggests that Instagram actually has little to do with it. Whenever we interact with other people, we glimpse lives far more glamorous than our own. That’s not exactly a comforting thought, but it should assuage your FOMO next time you scroll through your Facebook feed.

[Diagram of a small social network: Alice, Bob, Mark, Zoe, Eve, Sam, Jessica]
X = number of friends
Y = number of friends of a friend
In your homework you will show that E[Y] ≥ E[X] in any social network.

Apples
About 10% of the apples on your farm are rotten. You sell 10 apples. How many are rotten?
Probability model: the number N of rotten apples you sold is Binomial(n = 10, p = 1/10).
E[N] = np = 1

Apples
You improve productivity; now only 5% of apples rot. You can now sell 20 apples and only one will be rotten on average. N is now Binomial(20, 1/20).

[Plots of the p.m.f.s of Binomial(10, 1/10) and Binomial(20, 1/20): the two look nearly identical]

The Poisson random variable
A Poisson(λ) random variable has p.m.f.
p(k) = e^(−λ) λ^k / k!,   k = 0, 1, 2, 3, …
Poisson random variables do not occur “naturally” in the sample spaces we have seen. They approximate Binomial(n, p) random variables when λ = np is fixed and n is large (so p is small):
p_Poisson(λ)(k) = lim_{n → ∞} p_Binomial(n, λ/n)(k)
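This limit is easy to see numerically: fix λ = np and watch the gap between the two p.m.f.s shrink as n grows. A sketch (helper names `binom_pmf` and `poisson_pmf` are ours):

```python
from math import comb, exp, factorial

def binom_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(lam, k):
    """P(X = k) for X ~ Poisson(lam): e^(-lam) lam^k / k!."""
    return exp(-lam) * lam**k / factorial(k)

# with lam = np fixed, Binomial(n, lam/n) approaches Poisson(lam) as n grows
lam, k = 2.8, 3
errors = [abs(binom_pmf(n, lam / n, k) - poisson_pmf(lam, k))
          for n in (10, 100, 10000)]
print(errors)   # approximation error shrinks as n increases
```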

Raindrops
Rain is falling on your head at an average speed of 2.8 drops/second. Divide the second evenly into n intervals of length 1/n, and let Eᵢ be the event “a raindrop hits during interval i.”
Assuming E₁, …, Eₙ are independent, the number of drops in the second, N, is a Binomial(n, p) random variable.
Since E[N] = 2.8 and E[N] = np, p must equal 2.8/n.

Raindrops
The number of drops N is Binomial(n, 2.8/n). As n gets larger, the number of drops within the second “approaches” a Poisson(2.8) random variable.

Expectation and variance of the Poisson
If X is Binomial(n, p), then E[X] = np and Var[X] = np(1 − p).
When p = λ/n, we get E[X] = λ and Var[X] = λ(1 − λ/n).
As n → ∞, E[X] → λ and Var[X] → λ. This suggests:
When X is Poisson(λ), E[X] = λ and Var[X] = λ.

Problem for you to solve Rain falls on you at an average rate of 3 drops/sec. You walk for 30 sec from MTR to bus stop. When 100 drops hit you, your hair gets wet. What is the probability your hair got wet?

Problem for you to solve
Solution: On average, 90 drops fall in 30 seconds, so we model the number of drops N you receive as a Poisson(90) random variable.
Using the online Poisson calculator or the poissonpmf(n, L) function in 14L07.py, we get
P(N > 100) = 1 − ∑_{i=0}^{100} P(N = i) ≈ 13.49%.
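The same number can be reproduced in a few lines of plain Python (this re-implements the Poisson p.m.f. rather than using the course's 14L07.py helper, which we don't have here); it should agree with the ≈ 13.49% figure above:

```python
from math import exp, factorial

def poisson_pmf(lam, k):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

lam = 3 * 30   # 3 drops/sec for 30 sec -> 90 expected drops
# P(N > 100) = 1 - P(N <= 100)
p_wet = 1 - sum(poisson_pmf(lam, i) for i in range(101))
print(round(p_wet, 4))
```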