STATISTICS AND PROBABILITY

Raoul LePage, Professor
www.stt.msu.edu/~lepage (click on STT315_F06)
Week 9-25-06 and some preparation for exam 2.

Suggested exercises (solutions given in text): 3-33, 3-41, 3-42 (except b, c, h, m, n), 3-43, 3-49, 3-57 (except c, d), 3-59, 3-61, 3-63, 3-65. Textbook exercises are not comprehensive.

PROBABILITY MODELS HAVING BROAD APPLICATION
NORMAL DISTRIBUTION
BERNOULLI TRIALS
BINOMIAL DISTRIBUTION
POISSON DISTRIBUTION

NORMAL DISTRIBUTION: WHERE ARE THE MEAN AND STANDARD DEVIATION IN THIS PICTURE? Note the point of inflexion. Note the balance point.

IQ DISTRIBUTION: ~NORMAL, MEAN 100, STANDARD DEVIATION 15. [Figure: normal curve with the point of inflexion marked SD = 15 from MEAN = 100.]

DISTRIBUTION OF THE NUMBER OF HEADS IN 100 COIN TOSSES: APPROXIMATELY NORMAL, MEAN 50, STD DEVIATION 5.

DISTRIBUTION OF THE NUMBER OF ACCIDENTS IN ONE MONTH IF WE AVERAGE 39.7 PER MONTH: APPROXIMATELY NORMAL, MEAN 39.7, STD DEVIATION 6.3.

NORMAL DISTRIBUTIONS ARE ALIKE IN SD UNITS FROM THE MEAN
~68% WITHIN 1 SD OF MEAN
~95% WITHIN 2 SD OF MEAN
Illustrated for the standard normal: mean = 0, SD = 1.
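
The 68% and 95% figures can be checked numerically. The sketch below is only an illustration: the helper name `prob_within` is made up here, and it relies on the standard identity P(-k < Z < k) = erf(k / root(2)) for a standard normal Z.

```python
import math

def prob_within(k: float) -> float:
    """P(-k < Z < k) for a standard normal Z, via the error function."""
    return math.erf(k / math.sqrt(2))

within_1_sd = prob_within(1)  # area within 1 SD of the mean
within_2_sd = prob_within(2)  # area within 2 SD of the mean

print(f"within 1 SD: {within_1_sd:.4f}")
print(f"within 2 SD: {within_2_sd:.4f}")
```

Because every normal distribution is alike in SD units, the same two areas apply to IQ scores, coin-toss counts, or accident counts once distances are measured in SDs.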

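IQ DISTRIBUTION: ~NORMAL, MEAN 100, STANDARD DEVIATION 15. About 68/2 = 34% of IQs lie between 85 and 100 (within 1 SD below the mean). About 95/2 = 47.5% lie between 100 and 130 (within 2 SD above the mean).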

STANDARD SCORES CONVERT TO MEAN 0, SD 1
Z = (IQ - 100) / 15, so IQ = 100 corresponds to Z = 0 and each 15 IQ points correspond to 1 unit of Z (standard normal).
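
As a small illustration of standard scores, the sketch below converts a few IQ values to Z. The function name `standard_score` is a hypothetical helper, not something from the slides.

```python
def standard_score(x: float, mean: float, sd: float) -> float:
    """Distance of x from the mean, measured in SD units."""
    return (x - mean) / sd

iq_mean, iq_sd = 100, 15

z_85 = standard_score(85, iq_mean, iq_sd)    # 1 SD below the mean
z_100 = standard_score(100, iq_mean, iq_sd)  # at the mean
z_130 = standard_score(130, iq_mean, iq_sd)  # 2 SD above the mean

print(z_85, z_100, z_130)
```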

Z-TABLE CUT AND PASTE
P(Z > 0) = P(Z < 0) = 0.5
Upper tail: P(Z > z) = 0.5 - P(0 < Z < z); with table entry 0.4961 this is 0.5 - 0.4961 = 0.0039.
P(Z < 1.92) = 0.5 + P(0 < Z < 1.92) = 0.5 + 0.4726 = 0.9726
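
The table entries can be reproduced without a printed table. This is a sketch under assumptions: `phi` and `table_entry` are names invented here, and z = 2.66 is an assumed value, picked only because its table entry rounds to 0.4961.

```python
import math

def phi(z: float) -> float:
    """Standard normal cumulative probability P(Z < z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def table_entry(z: float) -> float:
    """Half-table entry P(0 < Z < z) for z >= 0."""
    return phi(z) - 0.5

# Cut and paste: lower-tail probability from the half-area 0.5 plus a table entry.
p_below_192 = 0.5 + table_entry(1.92)

# Upper tail: half-area 0.5 minus a table entry (z = 2.66 is an assumption).
p_above_266 = 0.5 - table_entry(2.66)

print(f"P(Z < 1.92) = {p_below_192:.4f}")
print(f"P(Z > 2.66) = {p_above_266:.4f}")
```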

BERNOULLI DISTRIBUTION
x    p(x)
1    p     (1 denotes “success”)
0    q     (0 denotes “failure”)
     __
     1     (total)
0 < p < 1, q = 1 - p

Notation: BERNOULLI RANDOM VARIABLE X
P(success) = P(X = 1) = p
P(failure) = P(X = 0) = q
e.g. X = “sample voter is Democrat.” Population has 48% Dem, so p = 0.48, q = 0.52, and P(X = 1) = 0.48.

INDEPENDENT BERNOULLI-p TRIALS
"S" denotes success, "F" denotes failure.
P(S1 S2 F3 F4 F5 F6 S7) = p^3 q^4, or just write P(SSFFFFS) = p^3 q^4.
“The answer only depends upon how many of each, not their order.”
e.g. 48% Dem, 5 sampled with replacement: P(Dem Rep Dem Dem Rep) = 0.48^3 0.52^2.
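
A short sketch of the "order does not matter" point: the probability of any specific string depends only on how many successes and failures it contains. The function `sequence_probability` and the single-letter coding ("D" for Dem, "R" for Rep) are illustrative choices, not part of the slides.

```python
def sequence_probability(outcomes: str, p: float, success: str = "D") -> float:
    """Probability of one specific string of independent Bernoulli-p outcomes."""
    q = 1 - p
    successes = outcomes.count(success)
    failures = len(outcomes) - successes
    return p**successes * q**failures

prob_mixed = sequence_probability("DRDDR", 0.48)    # Dem Rep Dem Dem Rep
prob_grouped = sequence_probability("DDDRR", 0.48)  # same counts, different order

print(f"{prob_mixed:.6f} {prob_grouped:.6f}")
```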

BINOMIAL DISTRIBUTION FOR THE TOTAL NUMBER OF SUCCESSES IN INDEPENDENT p-BERNOULLI TRIALS.
e.g. P(exactly 2 Dems out of sample of 4) = P(DDRR) + P(DRDR) + P(DRRD) + P(RDDR) + P(RDRD) + P(RRDD) = 6 (0.48^2) (0.52^2) ~ 0.374.
There are 6 ways to arrange 2D 2R.

BINOMIAL DISTRIBUTION FOR THE TOTAL NUMBER OF SUCCESSES IN INDEPENDENT p-BERNOULLI TRIALS.
e.g. P(exactly 3 Dems out of sample of 5) = P(DDDRR) + P(DDRDR) + P(DDRRD) + P(DRDDR) + P(DRDRD) + P(DRRDD) + P(RDDDR) + P(RDDRD) + P(RDRDD) + P(RRDDD) = 10 (0.48^3) (0.52^2) ~ 0.299.
There are 10 ways to arrange 3D 2R. Same as the number of ways to select 3 from 5.
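
The arrangement counts and the two probabilities above can be reproduced by brute-force enumeration. This is only a sketch; `arrangements` and `prob_exactly` are hypothetical helper names.

```python
from itertools import product

def arrangements(n: int, k: int) -> list[str]:
    """All strings of n letters D/R that contain exactly k letters D."""
    return ["".join(s) for s in product("DR", repeat=n) if s.count("D") == k]

def prob_exactly(n: int, k: int, p: float) -> float:
    """Add up the shared probability p^k q^(n-k) once per arrangement."""
    q = 1 - p
    return sum(p**k * q**(n - k) for _ in arrangements(n, k))

ways_2_of_4 = len(arrangements(4, 2))
ways_3_of_5 = len(arrangements(5, 3))
p_2_of_4 = prob_exactly(4, 2, 0.48)
p_3_of_5 = prob_exactly(5, 3, 0.48)

print(ways_2_of_4, round(p_2_of_4, 3))
print(ways_3_of_5, round(p_3_of_5, 3))
```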

COUNTING ARRANGEMENTS
There are 5! ways to arrange 5 things in a line.
Do it thus (1:1 with arrangements): select 3 of the 5 to go first in line, arrange those 3 at the head of the line, then arrange the remaining 2 after.
5! = (ways to select 3 from 5) 3! 2!
So the number of ways must be 5! / (3! 2!) = 10.
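
The counting argument can be checked with Python's standard library (`math.comb` needs Python 3.8 or later). A minimal sketch:

```python
import math

# Solve 5! = (ways to select 3 from 5) * 3! * 2! for the number of ways.
ways_from_factorials = math.factorial(5) // (math.factorial(3) * math.factorial(2))

# The library's own binomial coefficient, for comparison.
ways_from_comb = math.comb(5, 3)

print(ways_from_factorials, ways_from_comb)
```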

BINOMIAL FORMULA
Let random variable X denote the number of “S” in n independent Bernoulli p-trials. By definition, X has a Binomial Distribution, and for each of x = 0, 1, 2, …, n
P(X = x) = (n! / (x! (n-x)!)) p^x q^(n-x)
e.g. P(44 Dems in sample of 100 voters) = (100! / (44! 56!)) 0.48^44 0.52^(100-44) = 0.05812.
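
A direct transcription of the binomial formula, offered as a sketch; the function name `binomial_pmf` is an illustrative choice.

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for the number of successes in n independent Bernoulli-p trials."""
    q = 1 - p
    return math.comb(n, x) * p**x * q**(n - x)

# Voter example: 44 Dems in a sample of 100 when 48% of the population is Dem.
p_44_of_100 = binomial_pmf(44, 100, 0.48)

# The probabilities over x = 0, 1, ..., n should add up to 1.
total = sum(binomial_pmf(x, 100, 0.48) for x in range(101))

print(f"P(X = 44) = {p_44_of_100:.5f}, total = {total:.6f}")
```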

Caveats: Binomial
The Binomial Coefficient n! / (x! (n-x)!) is the count of how many arrangements there are of a string of x letters “S” and n-x letters “F.”
p^x q^(n-x) is the shared probability of each string of x letters “S” and n-x letters “F.”
(Define 0! = 1 and p^0 = q^0 = 1, and the formula goes through for every one of x = 0 through n.)
The symbol read “n choose x” is short for the arrangement count = Binomial Coefficient.

Normal Approx of Binomial
Poisson and its Normal Approx
Aspects of Random Sampling

Normal Approx of Binomial: n = 10, p = 0.4, mean = n p = 4, sd = root(n p q) ~ 1.55

Normal Approx of Binomial: n = 30, p = 0.4, mean = n p = 12, sd = root(n p q) ~ 2.683

Normal Approx of Binomial: n = 100, p = 0.4, mean = n p = 40, sd = root(n p q) ~ 4.89898
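
To see how the normal curve tracks the binomial for these three cases, the sketch below computes the mean and sd and compares the exact binomial probability at the mean with the height of the matching normal curve there. The helper names and the choice to compare at the mean are assumptions made for illustration.

```python
import math

def binomial_pmf(x: int, n: int, p: float) -> float:
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def normal_density(x: float, mean: float, sd: float) -> float:
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

p = 0.4
summary = {}
for n in (10, 30, 100):
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    x = round(mean)  # compare at the mean, which is a whole number here
    summary[n] = (mean, sd, binomial_pmf(x, n, p), normal_density(x, mean, sd))
    print(f"n={n}: mean={mean:.0f} sd={sd:.5f} "
          f"binomial={summary[n][2]:.4f} normal={summary[n][3]:.4f}")
```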

Poisson Distribution Governing Counts of Rare Events
p(x) = e^(-mean) mean^x / x! for x = 0, 1, 2, … ad infinitum

Poisson
e.g. X = number of times ace of spades turns up in 104 tries.
X ~ Poisson with mean 2.
p(x) = e^(-mean) mean^x / x!
e.g. p(3) = e^(-2) 2^3 / 3! ~ 0.18

Poisson
e.g. X = number of raisins in MY cookie. Batter has 400 raisins and makes 144 cookies.
E X = 400/144 ~ 2.78 per cookie
p(x) = e^(-mean) mean^x / x!
e.g. p(2) = e^(-2.78) 2.78^2 / 2! ~ 0.24 (around 24% of cookies have 2 raisins)
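
Both Poisson examples can be reproduced from the formula. A minimal sketch, with `poisson_pmf` as an illustrative name:

```python
import math

def poisson_pmf(x: int, mean: float) -> float:
    """p(x) = e^(-mean) mean^x / x! for x = 0, 1, 2, ..."""
    return math.exp(-mean) * mean**x / math.factorial(x)

# Ace of spades in 104 tries: mean = 104 / 52 = 2.
p_three_aces = poisson_pmf(3, 2)

# Raisins per cookie: mean rounded to 2.78 as on the slide.
p_two_raisins = poisson_pmf(2, 2.78)

print(f"p(3) = {p_three_aces:.4f}, p(2) = {p_two_raisins:.4f}")
```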

THE FIRST BEST THING ABOUT THE POISSON IS THAT THE MEAN ALONE TELLS US THE ENTIRE DISTRIBUTION! Note: Poisson sd = root(mean).

E X = 400/144 ~ 2.78 raisins per cookie; sd = root(mean) = 1.67 (for Poisson).

THE SECOND BEST THING ABOUT THE POISSON IS THAT FOR A MEAN AS SMALL AS 3 THE NORMAL APPROXIMATION WORKS WELL. [Figure: Poisson distribution with mean 2.78 and sd = root(mean) = 1.67, a relation special to Poisson.]

WE AVERAGE 127.8 ACCIDENTS PER MONTH. E X = 127.8 accidents. If Poisson, then sd = root(mean) = root(127.8) = 11.3049 (special to Poisson), and the distribution is approximately normal with mean 127.8 and sd 11.3.
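
One way to probe the normal approximation for the accident counts is to add up exact Poisson probabilities within 2 sd of the mean and compare with the normal figure of roughly 95%. This is a sketch, not the slides' method: the log-space formula using `math.lgamma` is a choice made here to avoid overflow for large counts.

```python
import math

def poisson_pmf(x: int, mean: float) -> float:
    """Poisson probability computed in log space to stay within float range."""
    return math.exp(-mean + x * math.log(mean) - math.lgamma(x + 1))

mean = 127.8
sd = math.sqrt(mean)  # special to Poisson: sd = root(mean)

low, high = mean - 2 * sd, mean + 2 * sd
coverage = sum(poisson_pmf(x, mean)
               for x in range(math.ceil(low), math.floor(high) + 1))

print(f"sd = {sd:.4f}")
print(f"P({low:.1f} < X < {high:.1f}) = {coverage:.3f}")
```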

Aspects of Random Sampling

THE GREAT TRICK OF STATISTICS
The overwhelming majority of samples of n from a population of N can stand in for the population.
Population of N = 5: ATT, Sysco, Pepsico, GM, Dow.
Sample of n = 2: ATT, Pepsico.

GREAT TRICK: SOME CAVEATS
Sample size n must be “large.”
For only a few characteristics at a time, such as profit, sales, dividend.
SPECTACULAR FAILURES MAY OCCUR!
Population of N = 5: ATT 12, Sysco 21, Pepsi 42, GM 8, Dow 9. Sample of n = 2.

HOW ARE WE SAMPLING?
With replacement vs without replacement.
Population of N = 5: ATT 12, Sysco 21, Pepsi 42, GM 8, Dow 9. Sample of n = 2. With replacement, the same company (here Pepsi 42) can be drawn more than once.

GREAT TRICK: SOME CAVEATS
This sample is obviously “not representative.”
Population of N = 5: ATT 12, Sysco 21, Pepsi 42, GM 8, Dow 9.
Sample of n = 2: Sysco 21, Pepsi 42.
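
With a population this small, every without-replacement sample of 2 can be listed and compared with the population mean. The sketch below is illustrative only: the slides do not say what the numbers measure, so they are treated as generic values, and "least representative" is taken here to mean "sample mean farthest from the population mean."

```python
from itertools import combinations

population = {"ATT": 12, "Sysco": 21, "Pepsi": 42, "GM": 8, "Dow": 9}
pop_mean = sum(population.values()) / len(population)

# Every sample of n = 2 drawn without replacement, with its sample mean.
samples = list(combinations(population, 2))
sample_means = {pair: (population[pair[0]] + population[pair[1]]) / 2
                for pair in samples}

# The sample whose mean lies farthest from the population mean.
worst = max(sample_means, key=lambda pair: abs(sample_means[pair] - pop_mean))

print(f"population mean = {pop_mean}")
print(f"{len(samples)} samples; farthest: {worst}, mean = {sample_means[worst]}")
```

With n this small many samples miss badly, which is the first caveat: the trick needs a "large" n.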

DOES IT MAKE A DIFFERENCE? Are sampling with and without replacement the SAME?
Rule of thumb: with and without replacement are about the same if root[(N-n)/(N-1)] ~ 1, for a population of N and a sample of n.
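
The rule of thumb is easy to evaluate. In the sketch below the function name `replacement_factor` and the two (N, n) pairs are chosen purely for illustration.

```python
import math

def replacement_factor(N: int, n: int) -> float:
    """root[(N - n) / (N - 1)]; close to 1 means replacement hardly matters."""
    return math.sqrt((N - n) / (N - 1))

tiny_population = replacement_factor(5, 2)              # noticeably below 1
huge_population = replacement_factor(500_000_000, 600)  # essentially 1

print(f"N=5, n=2: {tiny_population:.4f}")
print(f"N=500 million, n=600: {huge_population:.7f}")
```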

CORRECTION TO PAGE 25 OF TEXT
They would have you believe the population is {8, 9, 12, 42} and the sample is {42}. A SET is a collection of distinct entities.
WE SAMPLE COMPANIES; NUMBERS COME WITH THEM.
Population: ATT 12, IBM 42, AAA 9, Pepsi 42, GM 8, Dow 9. Sample: Pepsi 42.

THE ROLE OF RANDOM SAMPLING IF THE OVERWHELMING MAJORITY OF SAMPLES ARE “GOOD SAMPLES” THEN WE CAN OBTAIN A “GOOD” SAMPLE BY RANDOM SELECTION.

HOW TO SAMPLE RANDOMLY? SELECTING A LETTER AT RANDOM
Digits are made to correspond to letters: a = 00-02, b = 03-05, …, z = 75-77.
Random digits then give random letters.
1559 9068 … (Table 14, pg. 809)
15 59 90 68 etc. (split into pairs)
f t * w etc. (take chosen letters; * marks a pair with no letter)
For samples without replacement just pass over any duplicates.
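
The digit-to-letter scheme can be sketched in a few lines. This assumes the elided part of the mapping continues the same pattern of three consecutive two-digit codes per letter (which fits a = 00-02 and z = 75-77), and that pairs above 77 are skipped. The function name `digits_to_letters` is hypothetical.

```python
import string

def digits_to_letters(digits: str, with_replacement: bool = True) -> list[str]:
    """Turn a string of random digits into letters, two digits at a time."""
    letters = []
    for i in range(0, len(digits) - 1, 2):
        pair = int(digits[i:i + 2])
        if pair > 77:
            continue  # no letter assigned to 78-99 (the * on the slide)
        letter = string.ascii_lowercase[pair // 3]  # assumed: 3 codes per letter
        if not with_replacement and letter in letters:
            continue  # without replacement: pass over duplicates
        letters.append(letter)
    return letters

print(digits_to_letters("15599068"))
print(digits_to_letters("15151559", with_replacement=False))
```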

The Great Trick is far more powerful than we have seen. A typical sample closely estimates such things as a population mean or the shape of a population density. But it goes beyond this to reveal how much variation there is among sample means and sample densities. A typical sample not only estimates population quantities. It estimates the sample-to-sample variations of its own estimates.

EXAMPLE: ESTIMATING A MEAN
The average account balance is $421.34 for a random with-replacement sample of 50 accounts. We estimate from this sample that the average balance is $421.34 for all accounts. From this sample we also estimate and display a “margin of error”:
$421.34 +/- $65.22 = xBAR +/- 1.96 s / root(n), where s denotes "sample standard deviation."

SAMPLE STANDARD DEVIATION NOTE: Sample standard deviation s may be calculated in several equivalent ways, some sensitive to rounding errors, even for n = 2.
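
The rounding-error remark can be illustrated with two algebraically equivalent formulas for s. This is a sketch with made-up data: the two large values below are hypothetical, chosen so that the shortcut formula (sum of squares minus n times the squared mean) loses the answer in floating point even for n = 2.

```python
import math

def sd_deviations(xs: list[float]) -> float:
    """s from squared deviations about the mean (numerically safer)."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))

def sd_shortcut(xs: list[float]) -> float:
    """s from sum(x^2) - n * mean^2 (equivalent on paper, fragile in floats)."""
    n = len(xs)
    mean = sum(xs) / n
    numerator = sum(x * x for x in xs) - n * mean * mean
    return math.sqrt(max(numerator, 0.0) / (n - 1))  # guard against tiny negatives

small = [12.2, 15.3, 16.2, 12.8]    # modest values: both formulas agree
large = [100000000.0, 100000001.0]  # hypothetical n = 2 case; true s = root(0.5)

print(sd_deviations(small), sd_shortcut(small))
print(sd_deviations(large), sd_shortcut(large))
```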

EXAMPLE: MARGIN OF ERROR CALCULATION
The following margin of error calculation for n = 4 is only an illustration. A sample of four would not be regarded as large enough.
Profits per sale = {12.2, 15.3, 16.2, 12.8}.
Mean = 14.125, s = 1.92765, root(4) = 2.
Margin of error = +/- 1.96 (1.92765 / 2).
Report: 14.125 +/- 1.8891.
A precise interpretation of margin of error will be given later in the course, including the role of 1.96. The interval 14.125 +/- 1.8891 is called a “95% confidence interval for the population mean.”
We used: (12.2-14.125)^2 + (15.3-14.125)^2 + (16.2-14.125)^2 + (12.8-14.125)^2 = 11.1475, so s = root(11.1475 / (4-1)).
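
The arithmetic of this illustration can be retraced with the standard library. A minimal sketch; the variable names are illustrative.

```python
import math
import statistics

profits = [12.2, 15.3, 16.2, 12.8]

mean = statistics.mean(profits)
sum_sq_dev = sum((x - mean) ** 2 for x in profits)  # the 11.1475 on the slide
s = statistics.stdev(profits)                       # divides by n - 1
margin = 1.96 * s / math.sqrt(len(profits))

print(f"mean = {mean:.3f}, s = {s:.5f}")
print(f"report: {mean:.3f} +/- {margin:.4f}")
```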

EXAMPLE: ESTIMATING A PERCENTAGE
A random with-replacement sample of 50 stores participated in a test marketing. In 39 of these 50 stores (i.e. 78%) the new package design outsold the old package design. We estimate from this sample that 78% of all stores will sell more of new vs old. We also estimate a “margin of error” of +/- 11.5%.
Figured in the Binomial setup: 1.96 root(pHAT qHAT) / root(n) = 1.96 root(0.78 x 0.22) / root(50) = 0.114823.
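
The same margin of error can be computed directly. A sketch, with `proportion_margin` as a hypothetical helper name:

```python
import math

def proportion_margin(successes: int, n: int, z: float = 1.96) -> float:
    """z * root(pHAT * qHAT) / root(n) for a sample proportion."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    return z * math.sqrt(p_hat * q_hat) / math.sqrt(n)

p_hat = 39 / 50
margin = proportion_margin(39, 50)

print(f"estimate: {p_hat:.0%} +/- {margin:.1%}")
```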

SAMPLING ONLY 600 FROM 500 MILLION?
A sample of only n = 600 from a population of N = 500 million.
Sample mean = 32.84; POP mean = 32.02.
At FINE resolution the sample and population densities are very close. [Figure: FINE resolution densities, population of N = 500,000 with a sample of n = 600.]