Random variables and probability distributions


Random variables are a key concept for statistical inference
We know that in order to use results from a sample to infer about a larger population, we must avoid bias and choose the sample at random. If our sample is random, then the sample mean, the sample proportion, and other statistics that we calculate from the sample are called random variables. Understanding the behavior of such variables is an essential step on the way to performing statistical inference.

Random variables
A random variable X associates a numerical value with each elementary outcome of an experiment.
Example: Let X be the number of boys born in a family of 2 children. List the possible values of X. For instance, X=1 means that there is one boy in the family.

2 children in the family    X – number of boys
BB                          2
BG                          1
GB                          1
GG                          0

Probability distribution of discrete variables
In order to understand the pattern of behavior of a random variable, we need to know its possible values and how likely they are to occur. The probability distribution of a discrete random variable X is a list of the distinct numerical values of X, along with their associated probabilities.

X              x1    x2    x3    ……    xk
Probability    p1    p2    p3    ……    pk

0≤pi≤1 for i=1,…,k
p1+p2+…+pk=1

Example
If X represents the number of boys in a family of 2 children, find the probability distribution of X.
The probability of BB is:
p(first is a boy and second is a boy) = p(first is a boy)×p(second is a boy) = 0.5×0.5 = 0.25, since the 2 events are independent.
In the same way, P(BG) = P(GB) = P(GG) = 0.25.

Family with 2 children    X – number of boys    Probability
BB                        2                     0.25
BG                        1                     0.25
GB                        1                     0.25
GG                        0                     0.25

Example - continued
Now we can specify the probability of each value of X:

Value of X    Probability
0             0.25
1             0.25+0.25=0.5
2             0.25
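The two-step construction above (list the equally likely outcomes, then add the probabilities of the outcomes that share a value of X) can be sketched in Python. This is only an illustration and is not part of the original slides; the variable names are made up, and it assumes, as the slides do, that each birth is a boy or a girl with probability 0.5 independently of the other.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption from the slides: each child is B or G with probability 1/2,
# independently of the other child.
outcomes = list(product("BG", repeat=2))   # (B,B), (B,G), (G,B), (G,G)
p_outcome = Fraction(1, 2) ** 2            # each outcome has probability 1/4

# X = number of boys; add the probabilities of outcomes with the same X.
dist = Counter()
for outcome in outcomes:
    dist[outcome.count("B")] += p_outcome

for x in sorted(dist):
    print(x, float(dist[x]))
```

The printed pairs should match the table: 0.25 for X=0, 0.5 for X=1, and 0.25 for X=2.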

Notations for random variables
We denote random variables by uppercase letters (e.g., X, V, Z).
We denote the values that the random variable takes by lowercase letters (e.g., xi, vi, zi).
The probability that a random variable X takes a certain value xi is denoted by p(X=xi) or, in short, p(xi).
For example:

Value of X    P(xi)
0             0.25
1             0.5
2             0.25

P(X=0)=0.25, P(X=1)=0.5, P(X=2)=0.25
Remember that the sum of the probabilities must equal 1!

Example
A probability distribution is given in the accompanying table with the additional information that the even values of X are equally likely. Determine the missing entries of the table.

Value of X    P(x)
1             0.2
2             ?
3             0.2
4             ?
5             0.3
6             ?

Answer: The sum of the probabilities in the table is 0.2+0.2+0.3=0.7. The remaining 0.3 probability is equally divided between the values 2, 4, and 6, so P(2)=P(4)=P(6)=0.1.

Example
Here is the probability distribution p(x) for the grade X of a randomly chosen student from a certain class. X=0 represents an F, X=4 represents an A.

X        p(x)
0 (F)    0.08
1 (D)    0.06
2 (C)    0.24
3 (B)    0.36
4 (A)    0.26

1. What is the probability of getting at least a B?
2. What is the probability of getting higher than a D?
3. What is the probability of getting less than a C?
4. What is the probability of getting no better than a B?
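One way to work the four grade questions is to add up p(x) over the values of X that belong to each event. The sketch below is an illustration only (the helper name `prob` is hypothetical, not from the slides); it stores the table as exact fractions so the sums are not affected by floating-point rounding.

```python
from fractions import Fraction

# Grade distribution from the table (0 = F, 1 = D, 2 = C, 3 = B, 4 = A).
p = {0: Fraction(8, 100), 1: Fraction(6, 100), 2: Fraction(24, 100),
     3: Fraction(36, 100), 4: Fraction(26, 100)}

def prob(event):
    """Add p(x) over all values x for which event(x) is true."""
    return sum(px for x, px in p.items() if event(x))

at_least_b = prob(lambda x: x >= 3)        # question 1
higher_than_d = prob(lambda x: x > 1)      # question 2
less_than_c = prob(lambda x: x < 2)        # question 3
no_better_than_b = prob(lambda x: x <= 3)  # question 4

for answer in (at_least_b, higher_than_d, less_than_c, no_better_than_b):
    print(float(answer))
```

Each question only changes the condition on x; the probabilities themselves always come from the same table.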

Probability histogram
A probability histogram serves as a display of a probability distribution.
Example: Number of boys in a family of 2 children

X    P(x)
0    0.25
1    0.5
2    0.25

Example - rolling two dice
X = sum of two dice, X = 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

Outcome of 2 dice                        X     Probability
(1,1)                                    2     1/36
(1,2) (2,1)                              3     2/36
(1,3) (2,2) (3,1)                        4     3/36
(1,4) (2,3) (3,2) (4,1)                  5     4/36
(1,5) (2,4) (3,3) (4,2) (5,1)            6     5/36
(1,6) (2,5) (3,4) (4,3) (5,2) (6,1)      7     6/36
(2,6) (3,5) (4,4) (5,3) (6,2)            8     5/36
(3,6) (4,5) (5,4) (6,3)                  9     4/36
(4,6) (5,5) (6,4)                        10    3/36
(5,6) (6,5)                              11    2/36
(6,6)                                    12    1/36
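The two-dice table can be reproduced by counting how many of the 36 equally likely ordered outcomes give each sum. A minimal Python sketch, assuming two fair six-sided dice (the names `counts` and `dice_dist` are illustrative only):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count the ordered outcomes (a, b) of two fair dice that give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Each of the 36 outcomes is equally likely, so P(X = x) = count / 36.
dice_dist = {x: Fraction(n, 36) for x, n in sorted(counts.items())}

for x, n in sorted(counts.items()):
    print(f"X={x}: {n}/36")
```

Note that `Fraction` stores reduced values (2/36 is kept as 1/18), which is why the loop prints the raw counts over 36 instead.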

Probability histogram

Example
Consider an unbalanced die with the following probabilities:

Value of X    Probability
1             1/6
2             1/6
3             1/6
4             1/12
5             1/12
6             2/6

Roll the die and let X denote the outcome.
p(X≤2)=p(X=1)+p(X=2)=1/6+1/6=1/3
p(X<3)=p(X=1)+p(X=2)=1/6+1/6=1/3
p(2≤X≤4)=p(X=2)+p(X=3)+p(X=4)=1/6+1/6+1/12=5/12
p(X>1)=1-p(X=1)=1-1/6=5/6
p(X≤6)=sum of all probabilities=1

Mean and standard deviation of a random variable

Mean of a random variable
The mean of a list of numbers is their average.
The mean of a random variable X is a weighted average of the possible values of X.
Example – household size in the U.S.

X              1       2       3       4       5       6       7
Probability    .251    .321    .171    .154    .067    .022    .014

Mean of X = (1)(.251)+(2)(.321)+(3)(.171)+(4)(.154)+(5)(.067)+(6)(.022)+(7)(.014) = 2.587

Mean of a random variable

X              x1    x2    x3    ……    xk
Probability    p1    p2    p3    ……    pk

$\mu_X = E(X) = x_1p_1 + x_2p_2 + \dots + x_kp_k = \sum_{i=1}^{k} x_i p_i$
Mean = expected value

Example
Roll a die once and let X denote the outcome:

Value of X     1      2      3      4      5      6
Probability    1/6    1/6    1/6    1/6    1/6    1/6

Calculate the mean outcome of the die:
E(X)=μX=(1)(1/6)+(2)(1/6)+(3)(1/6)+(4)(1/6)+(5)(1/6)+(6)(1/6)=3.5
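The weighted-average definition of the mean translates directly into a one-line sum. The sketch below is illustrative (the function name `mean` and the dictionary layout are assumptions, not from the slides) and checks it against the household-size and fair-die examples.

```python
def mean(dist):
    """Expected value: the sum of x * p(x) over all values x."""
    return sum(x * p for x, p in dist.items())

# Household size distribution from the slides.
household = {1: .251, 2: .321, 3: .171, 4: .154, 5: .067, 6: .022, 7: .014}

# A fair die: every face has probability 1/6.
die = {x: 1 / 6 for x in range(1, 7)}

print(round(mean(household), 3))  # the slides report 2.587
print(round(mean(die), 3))        # the slides report 3.5
```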

Standard deviation of a random variable
We saw that we can describe a probability distribution by a probability histogram. For the dice example we had:
[Figure: probability histogram of the sum of two dice]
In addition to the mean, we need a measure of the spread:
$\sigma_X^2$ - variance of a random variable X
$\sigma_X$ - standard deviation of a random variable X

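Standard deviation of a random variable
The variance is the weighted average of the squared deviations of the random variable X from its mean μX.

X              x1    x2    x3    ……    xk
Probability    p1    p2    p3    ……    pk

$\sigma_X^2 = (x_1-\mu_X)^2p_1 + (x_2-\mu_X)^2p_2 + \dots + (x_k-\mu_X)^2p_k = \sum_{i=1}^{k} (x_i-\mu_X)^2 p_i$
The standard deviation is the square root of the variance: $\sigma_X = \sqrt{\sigma_X^2}$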

In the 2 dice example:
μX = 2(1/36)+3(2/36)+4(3/36)+5(4/36)+6(5/36)+7(6/36)+8(5/36)+9(4/36)+10(3/36)+11(2/36)+12(1/36) = 7
$\sigma_X^2 = Var(X) = (2-7)^2(1/36)+(3-7)^2(2/36)+(4-7)^2(3/36)+(5-7)^2(4/36)+(6-7)^2(5/36)+(7-7)^2(6/36)+(8-7)^2(5/36)+(9-7)^2(4/36)+(10-7)^2(3/36)+(11-7)^2(2/36)+(12-7)^2(1/36) = 5.8333$
$\sigma_X = SD(X) = \sqrt{5.8333} = 2.415$
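As a check on the two-dice calculation, the mean, variance, and standard deviation can be computed straight from the 36 equally likely outcomes. This is a sketch under the fair-dice assumption; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

# All 36 equally likely sums of two fair dice.
sums = [a + b for a, b in product(range(1, 7), repeat=2)]
p = Fraction(1, 36)

mu = sum(s * p for s in sums)               # mean: weighted average of the sums
var = sum((s - mu) ** 2 * p for s in sums)  # variance: weighted squared deviations
sd = sqrt(var)                              # standard deviation

print(mu, var, round(sd, 3))
```

Working with `Fraction` keeps the mean and variance exact (7 and 35/6); only the final square root is a floating-point value.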

Question
At a news stand, the daily number X of requests for a certain out-of-town newspaper has the following probability distribution:

Number         0     1     2     3
Probability    .1    .1    .5    .3

(i) What kind of random variable is X? (discrete / continuous)
(ii) What is the probability of fewer than 2 requests?
p(X<2)=p(X=1)+p(X=0)=0.1+0.1=0.2
(iii) What is the expected number of requests?
E(X)=μX=0(.1)+1(.1)+2(.5)+3(.3)=2
(iv) What is the standard deviation of the number of requests?
$SD(X) = \sqrt{(0-2)^2(.1) + (1-2)^2(.1) + (2-2)^2(.5) + (3-2)^2(.3)} = \sqrt{.8} = .894$
(v) What is the shape of the distribution of requests? (left skewed / symmetric / right skewed)
[Figure: probability histogram of X, with the values 0 to 3 on the horizontal axis and probabilities .1 to .5 on the vertical axis]

The Binomial Distribution
[Figure: probability histogram of p(x) against x, distribution for p=0.5]

The Binomial distribution
Example: A certain medicine may have some side effects with probability 0.3. The medicine is given to 5 people. Find the probability that exactly 2 of them will suffer from the side effects (the other 3 will not have these side effects!).
Suppose persons 1 and 2 have side effects and persons 3, 4, and 5 do not:
0.3 × 0.3 × 0.7 × 0.7 × 0.7 = 0.03087
But this is only one possibility of choosing the 2 that have side effects.

Other possibilities are:
[Diagram: other choices of the 2 people with side effects among persons 1 to 5]
Each possibility has a probability of $0.3^2\times0.7^3=0.03087$.
There are 10 such possibilities: 10×0.03087=0.3087

You do not need to count the possibilities
For a total of n trials (5 trials in the medicine example), if
x is the number of "successes" (e.g., side effects) and
n-x is the number of "failures" (e.g., no side effects),
then the number of different ways of getting x successes is:
$\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$
In the medicine example:
$\binom{5}{2} = \frac{5!}{2!\,3!} = 10$

The Binomial distribution
A random variable X that counts the number of successes in n independent trials is called a binomial variable.
Notations:
n – number of independent trials
Success – each trial results in a "success" or a "failure"
p – probability of success in each trial
X~B(n,p)
$P(X=x) = \binom{n}{x} p^x (1-p)^{n-x}$, x=0,1,2,…,n

The Binomial distribution
In the medicine example:
n=5, Success=side effects, p=0.3
$P(X=2) = \binom{5}{2}\,0.3^2\,(1-0.3)^{5-2} = 10\times0.3^2\times0.7^3 = .3087$
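The binomial formula is short enough to code directly. The sketch below is an illustration rather than anything from the slides: `binom_pmf` is a hypothetical helper name, and `math.comb` (available in Python 3.8 and later) supplies the number of ways to choose x successes out of n trials.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p): comb(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Medicine example: n = 5 people, p = 0.3, exactly 2 with side effects.
print(comb(5, 2))                     # number of ways to choose the 2 people
print(round(binom_pmf(2, 5, 0.3), 4)) # the slides report .3087
```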

Example
An ad agency seeks comments on a commercial that aired during TV coverage of the Super Bowl. The proportion of all U.S. adults watching was 0.4. The agency takes a random sample of 6 adults.
What is the probability that exactly 4 were watching?
n=6, "success"=watching, p=0.4
X is the number of people watching the game in a sample of 6 people, X~B(6,0.4)
$p(X=4) = \binom{6}{4}(.4)^4(.6)^2 = \frac{6\cdot5\cdot4\cdot3\cdot2\cdot1}{(4\cdot3\cdot2\cdot1)(2\cdot1)}\,(.009216) = .138$
What is the probability that at least 4 were watching?
$p(X\ge4) = p(X=4)+p(X=5)+p(X=6) = \binom{6}{4}(.4)^4(.6)^2 + \binom{6}{5}(.4)^5(.6)^1 + \binom{6}{6}(.4)^6(.6)^0 = .1382+.0369+.0041 = .1792$

Using the Binomial table
X~B(6,0.4), p(X=4)=?
Look up the entry for n=6, p=0.4, k=4: p(X=4)=0.1382
p(X≥4)=p(X=4)+p(X=5)+p(X=6)=.1382+.0369+.0041=.1792 (the entries for n=6, p=0.4 and k=4, 5, 6)

Using the Binomial table

             P(X=0)    P(X=2)    P(X=4)
X~B(5,0.5)   .0313     .3125     .1562
X~B(5,0.1)   .5905     .0729     .0004

How to use the table when p>0.5
X~B(5,0.6), n=5, p=.6, p(X=2)=?
The table does not give us p>0.5. Instead of p=0.6 we take p=0.4 and look in the table for the probability of the corresponding number of failures:
p(X=5-2)=p(X=3)=.2304 (n=5, p=.4)
Note that for n=5, p=.6 we have $p(X=2)=\binom{5}{2}(.6)^2(.4)^3$, and for n=5, p=.4 we have $p(X=3)=\binom{5}{3}(.4)^3(.6)^2$. The two are equal.

How to use the table when p>0.5
X~B(8,0.7), $P(X=5)=\binom{8}{5}(.7)^5(.3)^3$
n=8, p=0.7: there is no p=.7 in the table.
Instead, we take p=1-0.7=0.3.
Then, instead of looking for p(X=5), we look for p(X=3) with n=8, p=.3:
P(X=3)=.2541
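The table trick for p>0.5 rests on the identity that x successes with success probability p is the same event as n-x failures with failure probability 1-p. A quick numerical check, as a sketch only (`binom_pmf` is a hypothetical helper, not a function from the slides):

```python
from math import comb, isclose

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# 5 successes out of 8 with p = 0.7 ...
direct = binom_pmf(5, 8, 0.7)
# ... is the same event as 3 failures out of 8, each with probability 0.3.
via_table = binom_pmf(8 - 5, 8, 1 - 0.7)

print(round(direct, 4), round(via_table, 4))
print(isclose(direct, via_table))
```

Both values should round to the table entry .2541 quoted above.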

Binomial probabilities in Minitab
In the session window:

Cumulative Distribution Function
Binomial with n = 6 and p = 0.400000

x       P( X <= x )
0.00    0.0467
1.00    0.2333
2.00    0.5443
3.00    0.8208
4.00    0.9590
5.00    0.9959
6.00    1.0000

p(X≤4)=0.959
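For readers without Minitab, the same cumulative table can be approximated in plain Python by summing the binomial probabilities up to each x. This is a sketch, not the Minitab procedure; `binom_cdf` is a hypothetical helper name.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p): sum of the binomial probabilities for 0..k."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k + 1))

n, p = 6, 0.4
for k in range(n + 1):
    print(k, f"{binom_cdf(k, n, p):.4f}")

# "At least 4" is the complement of "at most 3".
print(round(1 - binom_cdf(3, n, p), 4))
```

The last line uses the complement rule, which gives the same .1792 found earlier by adding p(X=4), p(X=5), and p(X=6).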

Question
X = number of diamonds in 3 random cards that are drawn, without replacement, from a deck of 52 cards.
- Why isn't X a binomial variable?
The probability of drawing a diamond is not the same in the 3 trials!
- How can we redefine X as a binomial variable?
X = number of diamonds in 3 random cards that are drawn, with replacement, from a deck of 52 cards.

Binomial mean and SD
X~B(6,0.4)

x       P( X = x )
0.00    0.0467
1.00    0.1866
2.00    0.3110
3.00    0.2765
4.00    0.1382
5.00    0.0369
6.00    0.0041

μX=(0)(.0467)+(1)(.1866)+(2)(.3110)+(3)(.2765)+(4)(.1382)+(5)(.0369)+(6)(.0041)=2.4
Note that μX=2.4=(6)(0.4)=np

Binomial mean and SD
X~B(6,0.4), with the same probability table as above.
$\sigma_X = \sqrt{(0-2.4)^2(.0467) + (1-2.4)^2(.1866) + (2-2.4)^2(.3110) + (3-2.4)^2(.2765) + (4-2.4)^2(.1382) + (5-2.4)^2(.0369) + (6-2.4)^2(.0041)} = \sqrt{1.44} = 1.2$
Note that $\sigma_X = \sqrt{1.44} = \sqrt{(6)(0.4)(0.6)} = \sqrt{np(1-p)}$

Binomial mean and SD
If X~B(n,p):
Mean of X: $\mu_X = np$
Variance of X: $\sigma_X^2 = np(1-p)$
Standard deviation of X: $\sigma_X = \sqrt{np(1-p)}$
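The shortcut formulas can be compared with the long-hand definitions of the mean and variance. The following sketch does this for B(6, 0.4); it is an illustration only, and the variable names are not taken from the slides.

```python
from math import comb, sqrt

n, p = 6, 0.4

# Full probability table of X ~ B(n, p).
pmf = {x: comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)}

# Long-hand definitions: weighted averages over the table.
mu = sum(x * px for x, px in pmf.items())
var = sum((x - mu) ** 2 * px for x, px in pmf.items())

# Shortcut formulas for the binomial distribution.
print(round(mu, 4), round(n * p, 4))
print(round(var, 4), round(n * p * (1 - p), 4))
print(round(sqrt(var), 4))
```

Each printed pair should agree, which is the point of the shortcut: there is no need to build the whole table to get the mean and SD of a binomial variable.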

Example – blood type
The probability of having blood type O is 0.5. 4 people are chosen at random.
1. Use the Binomial formula to find the probability that 2 of them have blood type O.
Answer: X = number of people with blood type O in a sample of 4 people, n=4, p=0.5
$p(X=2) = \binom{4}{2}(0.5)^2(0.5)^2 = 0.375$

Example – blood type
2. Use the Binomial table to find the probability that at least 1 of them has blood type O.
Answer: p(X≥1)=p(X=1)+p(X=2)+p(X=3)+p(X=4)=1-p(X=0)=1-0.0625=0.9375

Example – blood type
3. Draw the probability histogram of the number of people with blood type O in the sample of 4 people.
Answer:

x       P( X = x )
0.00    0.0625
1.00    0.2500
2.00    0.3750
3.00    0.2500
4.00    0.0625

Example – blood type
4. What is the mean number of people that have blood type O out of the 4 people sampled?
Answer: μX=E(X)=np=(4)(0.5)=2

Example – blood type
5. What is the standard deviation of the number of people that have blood type O out of the 4 people sampled?
Answer: $\sigma_X = \sqrt{np(1-p)} = \sqrt{(4)(0.5)(0.5)} = 1$

Example - internet
According to government data, 25% of employed women have never been married. 10 employed women are selected at random.
(a) What is the probability that exactly 2 have never been married?
n=10, p=.25, p(X=2)=.2816
(b) What is the probability that 2 or fewer have never been married?
n=10, p=.25, p(X≤2)=.0563+.1877+.2816=.5256
(c) What is the probability that at least 8 have never been married?
n=10, p=.25, p(X≥8)=.0004+.0000+.0000=.0004
Minitab:..\binomial in class.MPJ
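The three answers for the employed-women example can be checked without a table. This is a sketch with hypothetical names (`binom_pmf`, `p_exactly_2`, and so on); it assumes X~B(10, 0.25) as stated in the example.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.25

p_exactly_2 = binom_pmf(2, n, p)                               # part (a)
p_at_most_2 = sum(binom_pmf(x, n, p) for x in range(0, 3))     # part (b)
p_at_least_8 = sum(binom_pmf(x, n, p) for x in range(8, 11))   # part (c)

print(round(p_exactly_2, 4))
print(round(p_at_most_2, 4))
print(round(p_at_least_8, 4))
```

Part (c) is dominated by p(X=8), which is about .0004; the terms for 9 and 10 round to .0000 at four decimals.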

The normal approximation to the Binomial variable
B(n,p) is approximated by $N(\mu=np,\ \sigma^2=np(1-p))$

M&M example
In a large bowl of M&M's, the proportion of blues is 1/6 (or .17).
X – the number of blue M&M's in a sample of size 6, X~B(6, 1/6)
Draw the probability histogram of X and compute its mean and SD.

x    p(x)
0    $1(1/6)^0(5/6)^6=.33$
1    $6(1/6)^1(5/6)^5=.40$
2    $15(1/6)^2(5/6)^4=.20$
3    $20(1/6)^3(5/6)^3=.05$
4    $15(1/6)^4(5/6)^2=.01$
5    $6(1/6)^5(5/6)^1=.00$
6    $1(1/6)^6(5/6)^0=.00$

[Figure: probability histogram of X, with x from 0 to 6 on the horizontal axis and p(x) from .1 to .5 on the vertical axis]
Shape: Skewed right
Mean: 6(1/6)=1
SD: $\sqrt{6(1/6)(5/6)}=.9$

Suppose we take a sample of size 30 from the M&M bowl
X~B(30, 1/6)
Describe the center, spread, and shape of the distribution of X.

x        P( X = x )
0.00     0.0037
1.00     0.0230
2.00     0.0682
.
30.00    0.000

(The tabulated values are consistent with p entered as .17 rather than exactly 1/6.)
Minitab: ..\binomial in class.MPJ

X~B(30, 1/6)
Shape: Smoother, more bell shaped
Mean: 30(1/6)=5
SD: $\sqrt{30(1/6)(5/6)}\approx2$

Suppose we take a sample of size 90 from the M&M bowl
X~B(90, 1/6)
Describe the center, spread, and shape of the distribution of X.

x        P( X = x )
0.00     0.0000
1.00     0.0000
2.00     0.0000
3.00     0.0001
4.00     0.0002
.
90.00    0.000

X~B(90, 1/6)
Shape: Even smoother, more bell shaped – very close to a normal curve
Mean: 90(1/6)=15
SD: $\sqrt{90(1/6)(5/6)}\approx3.5$

The binomial variable X~B(90, 1/6) behaves approximately like a normal variable with mean 15 and SD 3.5

The Normal approximation to the Binomial distribution
As the sample size n gets large, the distribution of the binomial random variable X is well approximated by the normal distribution with the same mean and SD as the binomial variable:
If X~B(n,p), then as n increases, X is approximately $N(\mu=np,\ \sigma=\sqrt{np(1-p)})$

The Normal approximation to the Binomial distribution
How large should n be? This depends on the value of p.
If p is close to 0.5, then the normal approximation applies for small values of n.
If p is far from 0.5, larger values of n are needed.
The following rule of thumb helps us decide when to use the normal approximation:
np≥15 and n(1-p)≥15


Example
Check if the rule of thumb is satisfied for:
1. B(n=6, p=1/6): no, since np=6(1/6)=1<15
2. B(n=90, p=1/6): yes, since np=90(1/6)=15 and n(1-p)=90(5/6)=75>15
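The rule of thumb is easy to wrap in a small function. In this sketch the function name `normal_approx_ok` and the `threshold` parameter are hypothetical; the threshold of 15 is the one used in these slides (other textbooks use different cutoffs). `Fraction` is used for p=1/6 so that the borderline case np=15 is compared exactly.

```python
from fractions import Fraction

def normal_approx_ok(n, p, threshold=15):
    """Rule of thumb from the slides: np >= 15 and n(1 - p) >= 15."""
    return n * p >= threshold and n * (1 - p) >= threshold

print(normal_approx_ok(6, Fraction(1, 6)))    # np = 1, too small
print(normal_approx_ok(90, Fraction(1, 6)))   # np = 15, n(1-p) = 75
print(normal_approx_ok(200, Fraction(2, 5)))  # np = 80, n(1-p) = 120
```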

Example of normal approximation to binomial
You operate a restaurant. You read that a sample survey by the National Restaurant Association shows that 40% of adults are committed to eating nutritious food when eating away from home. To help plan your menu, you decide to conduct a sample survey in your own area. You will use random digit dialing to contact an SRS of 200 households by telephone.
If the national results hold in your area, it is reasonable to use the B(200,0.4) distribution to describe the count X of respondents who seek nutritious food when eating out.
(a) What is the mean number of nutrition-conscious people in your sample if p=.4 is true?
Mean of B(200,.4)=200(.4)=80
$SD=\sqrt{200(.4)(.6)}=\sqrt{48}=6.93$
(b) Use the normal approximation to compute the probability that X lies between 75 and 85.

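Is the normal approximation appropriate?
np=200(.4)=80>15 and n(1-p)=200(.6)=120>15, so X is approximately $N(80,\ 6.93^2)$.
$p(75\le X\le85) \approx P\left(\frac{75-80}{6.93}\le Z\le\frac{85-80}{6.93}\right) = P(-0.72\le Z\le0.72) \approx 0.53$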
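To see how the approximation compares with the exact binomial answer for the restaurant survey, both can be computed with the standard library. This sketch is not from the slides: `normal_cdf` is a hypothetical helper built on `math.erf`, and no continuity correction is applied, which is an assumption about the intended method.

```python
from math import comb, erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(Y <= x) for Y ~ N(mu, sigma^2), using the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

n, p = 200, 0.4
mu = n * p                      # 80
sigma = sqrt(n * p * (1 - p))   # about 6.93

# Normal approximation of P(75 <= X <= 85), without continuity correction.
approx = normal_cdf(85, mu, sigma) - normal_cdf(75, mu, sigma)

# Exact binomial probability: add P(X = x) for x = 75, ..., 85.
exact = sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(75, 86))

print(round(approx, 2))
print(round(exact, 2))
```

The exact value comes out somewhat larger than the approximation because the binomial sum includes the full probability at the endpoints 75 and 85, which the uncorrected normal interval only partly covers.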

The normal approximation applies also for proportions
Next we will show that if X~B(n,p), then as n increases the sample proportion $\hat{p} = X/n$ is approximately $N\left(\mu=p,\ \sigma^2=\frac{p(1-p)}{n}\right)$

The normal approximation applies also for proportions
A poll that surveyed 500 people found that 45% of them support military action in Iraq.
X – the number of people that support military action in Iraq in a sample of 500 people, X~B(500,0.45)
Is the normal approximation appropriate in this case?
Since np=500(.45)=225 and n(1-p)=500(.55)=275 are both at least 15, we can use the normal approximation:
μX=np=225
$\sigma_X=\sqrt{np(1-p)}=\sqrt{500(.45)(.55)}=\sqrt{123.75}=11.12$
X is approximately $N(225,\ 11.12^2)$

Let's define a new variable: $\hat{p} = X/n$ - the proportion of people that support military action in Iraq.
Instead of looking at the number of people who support the attack, we can look at the proportion of people who support the attack. Denote this proportion by $\hat{p}$.
How does $\hat{p}$ behave?
If X is normal, any linear transformation of X is also normal. Since $\hat{p} = X/n$ is a linear transformation of X, $\hat{p}$ is approximately normal (note that X is approximately normal).
Calculating the mean and SD of $\hat{p}$ requires some definitions.

Mean and SD of a linear transformation of a random variable
If X is a random variable and a and b are fixed numbers:
$\mu_{a+bX} = E(a+bX) = a + b\mu_X$
$\sigma^2_{a+bX} = Var(a+bX) = b^2\sigma^2_X$
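The two rules for a+bX can be verified on any small distribution by transforming the values and recomputing the mean and variance from scratch. The sketch below uses the number-of-boys distribution from earlier; the constants a=3 and b=2 and all function names are hypothetical choices made only for illustration.

```python
from fractions import Fraction

def mean(dist):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Weighted average of the squared deviations from the mean."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# X = number of boys in a family of 2 children.
x_dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Illustrative constants; Y = a + bX has the same probabilities on shifted values.
a, b = 3, 2
y_dist = {a + b * x: p for x, p in x_dist.items()}

print(mean(y_dist), a + b * mean(x_dist))            # both sides of the mean rule
print(variance(y_dist), b ** 2 * variance(x_dist))   # both sides of the variance rule
```

The constant a shifts the mean but drops out of the variance, while b scales the mean by b and the variance by b squared.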

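Mean and variance of $\hat{p} = X/n$
Apply the linear transformation rules with a=0 and b=1/n, using μX=np and $\sigma_X^2=np(1-p)$:
$\mu_{\hat{p}} = \frac{1}{n}\mu_X = \frac{np}{n} = p$
$\sigma^2_{\hat{p}} = \frac{1}{n^2}\sigma_X^2 = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n}$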

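Summary - Normal approximation to the Binomial distribution
If X~B(n,p), then as n increases (rule of thumb: np≥15 and n(1-p)≥15):
X is approximately $N(\mu=np,\ \sigma^2=np(1-p))$
$\hat{p} = X/n$ is approximately $N\left(\mu=p,\ \sigma^2=\frac{p(1-p)}{n}\right)$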

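Back to the example
With n=200 and p=0.4:
$\mu_{\hat{p}} = p = 0.4$
$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.4(0.6)}{200}} \approx 0.0346$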
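The SD of the sample proportion can be obtained either from the formula for the proportion or by dividing the SD of the count by n. A short sketch, assuming the restaurant example's n=200 and p=0.4 (the variable names are illustrative):

```python
from math import isclose, sqrt

n, p = 200, 0.4

sd_count = sqrt(n * p * (1 - p))     # SD of the count X, about 6.93
sd_prop = sqrt(p * (1 - p) / n)      # SD of the proportion X/n

print(round(sd_count, 2))
print(round(sd_prop, 4))
print(isclose(sd_prop, sd_count / n))  # dividing X by n divides its SD by n
```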

Example of normal approximation to binomial
You operate a restaurant. You read that a sample survey by the National Restaurant Association shows that 40% of adults are committed to eating nutritious food when eating away from home. To help plan your menu, you decide to conduct a sample survey in your own area. You will use random digit dialing to contact an SRS of 200 households by telephone.
If the national results hold in your area, it is reasonable to use the B(200,0.4) distribution to describe the count X of respondents who seek nutritious food when eating out.
(a) What is the mean number of nutrition-conscious people in your sample if p=.4 is true?
(b) Let $\hat{p} = X/n$ denote the proportion of people in the sample that seek nutritious food when eating out. Use the normal approximation to compute the probability that $\hat{p}$ is larger than 0.7.

X is approximately $N(80,\ 6.93^2)$