Binomial and Related Distributions. Student: 黃柏舜. Student ID: 102581010. Instructor: 蔡章仁.


In this section of the website, we explore the binomial distribution and, in particular, how to do hypothesis testing using the binomial distribution. We also explain the relationship between the binomial and normal distributions, as well as some related distributions, namely the proportion, negative binomial, geometric, hypergeometric, beta, multinomial and Poisson distributions.

Binomial Distribution

Definition 1: Suppose an experiment has the following characteristics:
• the experiment consists of n independent trials, each with two mutually exclusive outcomes (success and failure)
• for each trial the probability of success is p (and so the probability of failure is 1 – p)

Each such trial is called a Bernoulli trial. Let x be the discrete random variable whose value is the number of successes in n trials. Then the probability distribution function for x is called the binomial distribution, B(n, p), and is defined as follows:

$$f(x) = C(n, x)\, p^x (1 - p)^{n - x}, \qquad x = 0, 1, \ldots, n$$

where $C(n, x) = \frac{n!}{x!\,(n - x)!}$ and n! = n(n – 1)(n – 2) ⋯ 3∙2∙1 as described in Combinatorial Functions. C(n, x) can be calculated by using the Excel function COMBIN(n,x).
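The definition can be checked numerically. The following is a minimal sketch in Python using only the standard library; the function name `binom_pmf` and the choice n = 20, p = 0.25 are illustrative assumptions, not part of the original slides. It evaluates C(n, x) p^x (1 – p)^(n – x) for every x and confirms that the probabilities sum to 1.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ B(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Illustrative parameters (assumed for this sketch).
n, p = 20, 0.25
probs = [binom_pmf(x, n, p) for x in range(n + 1)]

total = sum(probs)                               # should be 1
mean = sum(x * q for x, q in enumerate(probs))   # should equal n * p

print(f"sum of probabilities = {total:.6f}")
print(f"mean = {mean:.6f}, n*p = {n * p:.6f}")
```

The mean computed from the probabilities agrees with np, which is the standard expression for the mean of a binomial variable.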

Observation: Figure 1 shows a graph of the probability density function for B(20, .25). That the graph looks a lot like the normal distribution is not a coincidence, as we will see shortly.

Figure 1 – Binomial distribution

Property 1: If x has distribution B(n, p), then (a) the mean of x is np and (b) the variance of x is np(1 – p).

Excel Function: Excel provides the following functions regarding the binomial distribution:

BINOMDIST(x, n, p, cum), where n = the number of trials, p = the probability of success for each trial and cum takes the values TRUE or FALSE.

BINOMDIST(x, n, p, FALSE) = probability density function f(x) value at x for the binomial distribution B(n, p), i.e. the probability that there are x successes in n trials where the probability of success on any trial is p.

BINOMDIST(x, n, p, TRUE) = cumulative probability distribution F(x) value at x for the binomial distribution B(n, p), i.e. the probability that there are at most x successes in n trials where the probability of success on any trial is p.
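For readers working outside Excel, the behaviour described above can be imitated in a few lines. This is a sketch only: the Python function `binomdist` is a hypothetical stand-in written for illustration and is not Excel's implementation.

```python
from math import comb


def binomdist(x: int, n: int, p: float, cum: bool) -> float:
    """Imitate Excel's BINOMDIST(x, n, p, cum).

    cum=False -> P(X = x); cum=True -> P(X <= x), for X ~ B(n, p).
    """
    def pmf(k: int) -> float:
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    if cum:
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)


# Ten throws of a fair die, counting one particular face.
exactly_four = binomdist(4, 10, 1 / 6, False)
at_most_four = binomdist(4, 10, 1 / 6, True)

print(f"P(X = 4)  = {exactly_four:.6f}")
print(f"P(X <= 4) = {at_most_four:.6f}")
```

With these arguments the two values come out at roughly 0.0543 and 0.9845.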

Example 1: What is the probability that if you throw a die 10 times it will come up six 4 times?

We can model this problem using the binomial distribution B(10, 1/6) as follows:

$$f(4) = C(10, 4) \left(\frac{1}{6}\right)^{4} \left(\frac{5}{6}\right)^{6} = 0.054266$$

Alternatively, the problem can be solved using the Excel function: BINOMDIST(4, 10, 1/6, FALSE) = 0.054266

Hypothesis Testing for Binomial Distribution

Example 1: Suppose you have a die and you suspect that it is biased towards the number 3. You run an experiment in which you throw the die 10 times and the number 3 comes up 4 times. Determine whether the die is biased.

The population random variable x = the number of times 3 occurs in 10 trials has a binomial distribution B(10, π), where π is the population parameter corresponding to the probability of success on any trial. We define the following null and alternative hypotheses:

H0: π ≤ 1/6, i.e. the die is not biased towards the number 3
H1: π > 1/6

Setting α = 0.05, we compare the probability of a result at least as extreme as the one observed with α:

P(x ≥ 4) = 1 – BINOMDIST(3, 10, 1/6, TRUE) = 1 – 0.930272 = 0.069728 > 0.05 = α

And so we cannot reject the null hypothesis at the 95% level of confidence. The cumulative value P(x ≤ 4) = BINOMDIST(4, 10, 1/6, TRUE) = 0.984538 does exceed 0.95 = 1 – α, but it includes the observed outcome x = 4 itself and so overstates the evidence against the null hypothesis.
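A one-sided exact test compares the upper-tail probability P(x ≥ 4) under the null hypothesis with α. The sketch below computes that tail probability for the die example; the helper name `binom_cdf` is an assumption made for illustration.

```python
from math import comb


def binom_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x) for X ~ B(n, p); returns 0 for x < 0."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


n, p0, observed, alpha = 10, 1 / 6, 4, 0.05

# Probability of a result at least as extreme as the one observed,
# assuming the null hypothesis value pi = 1/6.
p_value = 1 - binom_cdf(observed - 1, n, p0)
reject_h0 = p_value < alpha

print(f"P(X >= {observed}) = {p_value:.6f}")
print(f"reject H0 at alpha = {alpha}: {reject_h0}")
```

The upper-tail probability is about 0.0697, which is above α = 0.05.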

Example 2: We suspect that a coin is biased towards heads. When we toss the coin 9 times, how many heads need to come up before we are confident that the coin is biased towards heads?

We use the following null hypothesis:

H0: π ≤ .5
H1: π > .5

Using a confidence level of 95% (i.e. α = .05), we calculate CRITBINOM(n, p, 1–α) = CRITBINOM(9, .5, .95) = 7, the smallest x for which the cumulative probability reaches .95. Note that BINOMDIST(6, 9, .5, TRUE) = .9102 < .95, while BINOMDIST(7, 9, .5, TRUE) = .9805 ≥ .95.

And so 7 is the critical value: P(x > 7) = 1 – .9805 = .0195 ≤ .05, whereas P(x ≥ 7) = 1 – .9102 = .0898 > .05. If more than 7 heads come up (i.e. 8 or 9), then we are 95% confident that the coin is biased towards heads, and so can reject the null hypothesis.
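CRITBINOM returns the smallest x whose cumulative probability reaches the requested level. A minimal sketch of that search, assuming a hypothetical Python function `critbinom` written for illustration, also shows the upper-tail probabilities at and just above the critical value for the coin example.

```python
from math import comb


def critbinom(n: int, p: float, level: float) -> int:
    """Smallest x with P(X <= x) >= level for X ~ B(n, p)."""
    cumulative = 0.0
    for x in range(n + 1):
        cumulative += comb(n, x) * p**x * (1 - p) ** (n - x)
        if cumulative >= level:
            return x
    return n  # guards against rounding when level is very close to 1


n = 9
crit = critbinom(n, 0.5, 0.95)

# For a fair coin every sequence has probability 1 / 2^n, so tail
# probabilities are exact ratios of counts.
tail_at_crit = sum(comb(n, k) for k in range(crit, n + 1)) / 2**n
tail_above_crit = sum(comb(n, k) for k in range(crit + 1, n + 1)) / 2**n

print(f"critical value      = {crit}")
print(f"P(X >= {crit})  = {tail_at_crit:.4f}")
print(f"P(X >= {crit + 1})  = {tail_above_crit:.4f}")
```

Comparing the two tail probabilities with α = .05 shows which head counts fall in the rejection region.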

Relationship between Binomial and Normal Distributions

Theorem 1: If x is a random variable with distribution B(n, p), then for sufficiently large n, the distribution of the variable

$$z = \frac{x - \mu}{\sigma}$$

is approximately the standard normal distribution N(0, 1), where

$$\mu = np, \qquad \sigma^2 = np(1 - p)$$

Corollary 1: Provided n is large enough, N(μ, σ) is a good approximation for B(n, p), where μ = np and σ² = np(1 – p).

Observation: The normal distribution is a good approximation for the binomial distribution when np ≥ 5 and n(1 – p) ≥ 5. Another way to look at this is that the normal distribution is a good approximation for the binomial distribution when n > 10 and .4 < p < .6, or when n > 30 and .1 < p < .9.

Example 1: What is the normal distribution approximation for the binomial distribution where n = 20 and p = .25 (i.e. the binomial distribution displayed in Figure 1 of Binomial Distribution)?

As in Corollary 1, define the following parameters:

$$\mu = np = 20(.25) = 5, \qquad \sigma^2 = np(1 - p) = 20(.25)(.75) = 3.75, \qquad \sigma = \sqrt{3.75} = 1.94$$

Since np = 5 ≥ 5 and n(1 – p) = 15 ≥ 5, based on Corollary 1 we can conclude that B(20, .25) is approximately N(5, 1.94). We now show the graph of both pdfs to see visually how close these distributions are:

Figure 1 – Binomial vs. normal distribution
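Since the figure itself is not reproduced in this transcript, the comparison can be tabulated instead. The sketch below, using Python's standard `statistics.NormalDist`, lists the binomial probabilities of B(20, .25) next to the normal density with μ = np and σ = sqrt(np(1 – p)); the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.25
mu = n * p                       # 5
sigma = sqrt(n * p * (1 - p))    # sqrt(3.75), about 1.94
normal = NormalDist(mu, sigma)

rows = []
for x in range(n + 1):
    binom = comb(n, x) * p**x * (1 - p) ** (n - x)
    rows.append((x, binom, normal.pdf(x)))

# Largest pointwise difference between the two curves.
max_gap = max(abs(b - g) for _, b, g in rows)

print(" x  binomial   normal")
for x, b, g in rows[:12]:
    print(f"{x:2d}  {b:.4f}    {g:.4f}")
print(f"largest gap = {max_gap:.4f}")
```

The two columns stay within a couple of hundredths of each other at every x, which is the closeness the figure is meant to show.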

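Proportion Distribution

Definition 1: If x is a random variable with binomial distribution B(n, p), then the random variable y = x/n is said to have the proportion distribution.

Property 1: Where y has a proportion distribution as defined above,

$$\mu_y = p, \qquad \sigma_y^2 = \frac{p(1 - p)}{n}$$

Proof: By Property 1b of Expectation and Property 1a of Binomial Distribution,

$$\mu_y = E[y] = E\left[\frac{x}{n}\right] = \frac{E[x]}{n} = \frac{np}{n} = p$$

By Property 3b of Expectation,

$$\sigma_y^2 = \mathrm{Var}\left(\frac{x}{n}\right) = \frac{\mathrm{Var}(x)}{n^2} = \frac{np(1 - p)}{n^2} = \frac{p(1 - p)}{n}$$

Theorem 1: Provided n is large enough, generally when np ≥ 5 and n(1–p) ≥ 5, N(μ_y, σ_y) is a good approximation for the proportion distribution for y with

$$\mu_y = p, \qquad \sigma_y = \sqrt{\frac{p(1 - p)}{n}}$$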

Hypothesis Testing – one sample

From the theorem, we know that when sufficiently large samples of size n are taken, the distribution of sample proportions is approximately normal, distributed around the true population proportion π, with standard deviation (i.e. the standard error)

$$\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$$

We can use this fact to do hypothesis testing as was done for the normal distribution. In addition, when a two-tailed test is performed, a confidence interval can be calculated:

$$p \pm z_{crit} \sqrt{\frac{p(1 - p)}{n}}$$

where we use the sample proportion p as an estimate for the population proportion π when calculating the standard error. This introduces additional error, which is acceptable for large values of n.

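Example 1: A company believes that 50% of their customers are women. A sample of 600 customers is chosen and 325 of them are women. Is this significantly different from their belief?

The null hypothesis is H0: π = .5, tested with a two-tailed test at α = .05.

Method 1: Using the binomial distribution B(600, .5), we reject the null hypothesis since the probability of observing 325 or more women, P(x ≥ 325) = 1 – BINOMDIST(324, 600, .5, TRUE), is below α/2 = .025.

Method 2: By Theorem 1 we can also use the normal distribution. The observed mean is 325/600 = .5417 and, under the null hypothesis, the standard error is $\sqrt{.5(.5)/600} = .0204$. Since NORMDIST(.5417, .5, .0204, TRUE) = .979 > .975 = 1 – α/2, we reach the same conclusion, namely to reject the null hypothesis.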
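Both methods can be reproduced with the Python standard library. This is a sketch under the assumption of a two-tailed test at α = .05 with null value π = .5; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, successes, pi0, alpha = 600, 325, 0.5, 0.05

# Method 1: exact binomial upper tail P(X >= 325). With p = 0.5 each
# outcome k has probability C(n, k) / 2^n, so integer arithmetic is exact
# until the final division.
upper_tail = sum(comb(n, k) for k in range(successes, n + 1)) / 2**n

# Method 2: normal approximation to the proportion distribution.
p_hat = successes / n
std_err = sqrt(pi0 * (1 - pi0) / n)
normal_cdf = NormalDist(pi0, std_err).cdf(p_hat)

print(f"exact upper tail   = {upper_tail:.4f} (compare with {alpha / 2})")
print(f"observed proportion = {p_hat:.4f}, standard error = {std_err:.4f}")
print(f"normal cdf          = {normal_cdf:.4f} (compare with {1 - alpha / 2})")
```

Both comparisons point the same way: the exact tail falls below α/2 and the normal cumulative value exceeds 1 – α/2.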

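Example 2: A survey of 1,100 voters showed that 53% are in favor of the new tax reform. Can we conclude that the majority of voters (from the population) are in favor?

We test the null hypothesis H0: π ≤ .5 with a one-tailed test at α = .05. Under the null hypothesis the standard error is $\sqrt{.5(.5)/1100} = .015076$, and NORMDIST(.53, .5, .015076, TRUE) = .9767 > .95, and so we can reject the null hypothesis and conclude with 95% confidence that a majority of the population is in favor of the tax reform.

We determine the 95% confidence interval as follows: z_crit = NORMSINV(1 – α/2) = NORMSINV(0.975) = 1.96. And so the 95% confidence interval is

$$.53 \pm 1.96 \sqrt{\frac{.53(.47)}{1100}} = .53 \pm .0295 = (.501, .559)$$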

And so we conclude with 95% confidence that between 50.1% and 55.9% of the population will be in favor. If, however, we are looking for a 99% confidence interval, then z_crit = NORMSINV(1 – α/2) = NORMSINV(0.995) = 2.58. And so the 99% confidence interval is

$$.53 \pm 2.58 \sqrt{\frac{.53(.47)}{1100}} = .53 \pm .0388 = (.491, .569)$$

This means that with 99% confidence, between 49.1% and 56.9% of the population will be in favor.
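The survey calculations can be reproduced as follows. This is a minimal sketch using `statistics.NormalDist` from the Python standard library in place of NORMDIST and NORMSINV; the function name `proportion_ci` is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat: float, n: int, confidence: float) -> tuple[float, float]:
    """Normal-approximation confidence interval for a proportion."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z_crit * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


n, p_hat, pi0 = 1100, 0.53, 0.5

# One-tailed test: standard error uses the null value pi0.
std_err0 = sqrt(pi0 * (1 - pi0) / n)
cdf_value = NormalDist(pi0, std_err0).cdf(p_hat)

# Confidence intervals: standard error uses the sample proportion.
ci95 = proportion_ci(p_hat, n, 0.95)
ci99 = proportion_ci(p_hat, n, 0.99)

print(f"cdf at observed proportion = {cdf_value:.4f}")
print(f"95% CI = ({ci95[0]:.3f}, {ci95[1]:.3f})")
print(f"99% CI = ({ci99[0]:.3f}, {ci99[1]:.3f})")
```

Note that the 99% interval extends below .5 while the 95% interval does not, so the conclusion about a majority depends on the confidence level chosen.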

Hypothesis Testing – two samples

Theorem 2: Let x1 be a random variable with a proportion distribution with mean π1 and number of trials n1, and let x2 be a random variable with a proportion distribution with mean π2 and number of trials n2. When the numbers of trials n1 and n2 are sufficiently large, usually when n_i π_i ≥ 5 and n_i (1–π_i) ≥ 5, the difference between the sample proportions x1 – x2 will be approximately normal with mean π1 – π2 and standard deviation

$$\sqrt{\frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_2(1 - \pi_2)}{n_2}}$$

Proof: Based on Theorem 1 of Proportion Distribution, x_i has approximately the distribution

$$N\left(\pi_i,\ \sqrt{\frac{\pi_i(1 - \pi_i)}{n_i}}\right)$$

Since x1 and x2 are independently distributed, by the linear transformation property of the normal distribution, x1 – x2 has approximately the distribution

$$N\left(\pi_1 - \pi_2,\ \sqrt{\frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_2(1 - \pi_2)}{n_2}}\right)$$

Example 4: A company which manufactures long-lasting light bulbs sells halogen and compact fluorescent bulbs. In an experiment, 100 halogen and 100 fluorescent bulbs were run continuously for 250 days. After 250 days, half of the halogen bulbs were still working, while 60% of the fluorescent bulbs were still operating. Is there a significant difference between the two types of bulbs?

Let x1 = the percentage of halogen bulbs that are functional after 250 days and x2 = the percentage of fluorescent bulbs that are functional after 250 days. The presumption is that each of these has a proportion distribution. We now test the following null hypothesis:

H0: π1 = π2

Assuming the null hypothesis is true, by Theorem 2, x1 – x2 will be approximately normal with mean π1 – π2 = 0 and standard deviation

$$\sqrt{\frac{\pi(1 - \pi)}{n} + \frac{\pi(1 - \pi)}{n}} = \sqrt{\frac{2\pi(1 - \pi)}{n}}$$

where the common value of the mean is denoted π and both samples are of size n. Since the value for π is unknown, we estimate its value from the combined sample, namely 50 + 60 = 110 successes out of 200, i.e. π ≈ 0.55. Thus, the mean of x1 – x2 is 0 (based on the null hypothesis) and, by Theorem 2, the standard deviation is approximately $\sqrt{2(.55)(.45)/100} = .0704$. The observed difference between the two proportions is .60 – .50 = .10, and so we have (two-tail test):

NORMDIST(.1, 0, .0704, TRUE) = .922 < .975 = 1 – α/2

Thus, we cannot reject the null hypothesis, and conclude that the experiment does not show a significant difference between the two types of bulbs. We reach the same conclusion using the critical value: NORMINV(.975, 0, .0704) = .138 > .1 = observed difference.
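The two-sample calculation can be checked with a short script. This sketch applies the standard deviation from Theorem 2 with both population proportions replaced by the pooled estimate; `statistics.NormalDist` stands in for NORMDIST and NORMINV, and all variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n1 = n2 = 100
p1, p2 = 0.50, 0.60      # halogen, fluorescent survival proportions
alpha = 0.05

# Pooled estimate of the common proportion under H0: pi1 = pi2.
pooled = (p1 * n1 + p2 * n2) / (n1 + n2)

# Standard deviation of the difference in sample proportions under H0.
std_err = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

diff = p2 - p1
null_dist = NormalDist(0, std_err)
cdf_value = null_dist.cdf(diff)
critical = null_dist.inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - cdf_value)

print(f"pooled proportion = {pooled:.2f}")
print(f"standard error    = {std_err:.4f}")
print(f"cdf at observed   = {cdf_value:.4f}")
print(f"critical value    = {critical:.4f}")
print(f"two-tailed p      = {p_value:.4f}")
```

The factor (1/n1 + 1/n2) is what distinguishes the standard deviation of a difference of two proportions from that of a single proportion; omitting one of the two terms understates the standard error by a factor of sqrt(2) when the samples are the same size.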

Thank you for listening!