Lecture 15: Statistics and Their Distributions, Central Limit Theorem


Lecture 15: Statistics and Their Distributions, Central Limit Theorem Devore, Ch. 5.3-5.5

Topics
The Concept of a “Statistic”
Independent, Identically Distributed (iid) Samples
Deriving Sampling Distribution of Statistic: By Probability Rules; By Simulation
Application – Tolerances
Distribution of the Sample Mean / Total
Central Limit Theorem
Distribution of a Linear Combination

I. Concept of a “Statistic”. Consider taking two samples of size n = 3 from the same population distribution. A: 30.7, 29.4, 31.1 → mean 30.40. B: 28.8, 30.0, 31.1 → mean 29.97. Which group has the larger mean? Propositions: Because the individual values xi drawn from a population distribution are uncertain before sampling, each observation is a random variable (r.v.) Xi. It follows that any statistic calculated from a sample also varies from sample to sample.

Example: Minitab. Suppose X ~ Weibull(shape = 2, scale = 5), so E(X) = 4.4311 and V(X) = 5.365. Using Minitab, generate samples of size 10 and observe how the sample mean and variance differ from sample to sample. Results shown are from Devore, p. 226-227.
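The same experiment can be sketched without Minitab. The snippet below is a minimal illustration using Python's standard library, not the procedure used in the lecture. The function names, the seed, and the choice of six replications are assumptions made for the example.

```python
import math
import random
import statistics


def weibull_moments(shape, scale):
    """Theoretical mean and variance of a Weibull(shape, scale) rv."""
    g1 = math.gamma(1 + 1 / shape)
    g2 = math.gamma(1 + 2 / shape)
    return scale * g1, scale**2 * (g2 - g1**2)


def sample_stats(shape, scale, n, k, seed=0):
    """Draw k samples of size n; return a list of (mean, variance) pairs."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    results = []
    for _ in range(k):
        # random.weibullvariate takes (scale, shape), in that order
        xs = [rng.weibullvariate(scale, shape) for _ in range(n)]
        results.append((statistics.mean(xs), statistics.variance(xs)))
    return results


if __name__ == "__main__":
    mu, var = weibull_moments(2, 5)
    print(f"E(X) = {mu:.4f}, V(X) = {var:.3f}")
    for xbar, s2 in sample_stats(2, 5, n=10, k=6):
        print(f"x-bar = {xbar:.3f}, s^2 = {s2:.3f}")
```

Each printed row is one sample of size 10. The sample means and variances scatter around the population values, which is the point of the example.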

Point Estimates / Sampling Distributions. Statistic: an rv whose value can be calculated from a sample of data. An uppercase letter denotes the statistic as an rv, and a lowercase letter denotes its calculated or observed value (S → s). Point Estimate: the value of a sample statistic computed from a particular sample. The probability distribution of a statistic is known as its sampling distribution.

II. iid Random Samples. The sampling distribution depends on several items: the population distribution (and its parameters); the sample size n; the method of sampling (with or without replacement). The rv's X1, X2, ..., Xn form a random sample of size n if: (1) the Xi's are independent rv's (independent); (2) every Xi has the same probability distribution (identically distributed). If both conditions are satisfied, we say the Xi's are iid. Sampling with replacement, or from an infinite population → iid. Sampling without replacement → iid may be assumed only if the sample size n is much smaller than the population size N (rule: n/N <= 0.05).

III. Deriving the Sampling Distribution of a Statistic. By probability rules: used for simple cases with a few Xi's, or for cases where the derivation has already been done. By simulation (more common!): typically used when derivation via probability rules is complicated, or when the underlying distribution of interest is unknown (assumed). We use; we don't derive!

Deriving Via Simple Probability. Example: Suppose you sell two brands of DVD players, A for $150 and B for $200. Sales records indicate that A accounts for 60% of sales and B for 40%. Let X1 = revenue from selling A; X2 = revenue from B. Suppose you take samples of size n = 2. List the possible outcomes, p(x1, x2), the sample mean, and the sample variance.

DVD Example: Sampling Distribution. Compute E(X-bar) and V(X-bar). How do the expected value and variance of X-bar relate to the mean and variance of the original distribution?

DVD Example, n = 3. Now, what is the relationship between the mean and variance of the original distribution of X and those of X-bar?
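The DVD tables can be rebuilt by brute-force enumeration. The sketch below assumes that each sampled sale is an independent draw from the revenue distribution ($150 with probability 0.6, $200 with probability 0.4). The names `PMF`, `sampling_dist_of_mean`, and `mean_var` are illustrative, not from the slides.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Revenue per sale: $150 (brand A, 60%) or $200 (brand B, 40%)
PMF = {150: Fraction(3, 5), 200: Fraction(2, 5)}


def sampling_dist_of_mean(pmf, n):
    """Exact sampling distribution of X-bar for an iid sample of size n."""
    dist = defaultdict(Fraction)
    for outcome in product(pmf, repeat=n):  # every ordered sample (x1, ..., xn)
        prob = Fraction(1)
        for x in outcome:
            prob *= pmf[x]  # independence: joint pmf is the product
        dist[Fraction(sum(outcome), n)] += prob
    return dict(dist)


def mean_var(dist):
    """Mean and variance of a discrete distribution given as {value: prob}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var


if __name__ == "__main__":
    print("population:", mean_var(PMF))
    for n in (2, 3):
        print(f"n = {n}:", mean_var(sampling_dist_of_mean(PMF, n)))
```

Exact fractions avoid rounding noise, so the pattern is easy to read off: the mean of X-bar stays at the population mean, while its variance is the population variance divided by n.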

Deriving Sampling Distributions for Continuous Variables Similar to discrete distributions, we can also derive the sampling distributions of continuous variables.

Example: Two Exponentials. Exponential distribution: $f(x; \lambda) = \lambda e^{-\lambda x}$ for $x \ge 0$, with $E(X) = 1/\lambda$ and $V(X) = 1/\lambda^2$. Suppose you have two independent rv's, each following an exponential distribution, and you are interested in the sum of the two rv's (n = 2). It can be shown that:
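The formula from this slide did not survive in the transcript. A standard result for two iid exponentials with a common rate λ is that the total T = X1 + X2 has density λ² t e^(-λt) for t >= 0, with E(T) = 2/λ and V(T) = 2/λ². Treating that as the intended result is an assumption. The sketch below checks it by simulation, with λ = 0.5 chosen arbitrarily.

```python
import math
import random
import statistics


def simulate_total(lam, k, seed=0):
    """Simulate k totals T = X1 + X2 with X1, X2 iid exponential(rate=lam)."""
    rng = random.Random(seed)
    return [rng.expovariate(lam) + rng.expovariate(lam) for _ in range(k)]


def total_cdf(t, lam):
    """CDF implied by the density lam^2 * t * exp(-lam * t), t >= 0."""
    return 1.0 - math.exp(-lam * t) * (1.0 + lam * t)


if __name__ == "__main__":
    lam = 0.5  # hypothetical rate, chosen only for illustration
    totals = simulate_total(lam, k=20_000)
    print("simulated mean:", statistics.mean(totals), "theory:", 2 / lam)
    print("simulated var: ", statistics.variance(totals), "theory:", 2 / lam**2)
    t = 4.0
    empirical = sum(x <= t for x in totals) / len(totals)
    print("P(T <= 4): simulated", empirical, "theory", total_cdf(t, lam))
```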

Practical Applications For many well-known distributions, the sampling distributions of their primary statistics (mean, variance) have already been determined. In those cases where the sampling distribution is unknown or complicated, a very useful alternative is simulation.

Simulation Experiment. To perform a simulation, you need: a statistic of interest (e.g., X-bar, S, median, ...); a population distribution (e.g., normal, uniform, ...); a sample size n (e.g., n = 10, n = 100); a number of replications k (e.g., k = 500).

Simulation #1: Range vs. S. Conduct an experiment to determine the relationship between the range and S for n = 2, n = 5, and n = 100. Assume $X \sim N(0, 1^2)$.
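One way to run this experiment is sketched below with the standard library. Summarizing the relationship by the average ratio Range / S over k = 500 replications is a choice made for this illustration. The slide does not say which summary to use.

```python
import random
import statistics


def range_vs_s(n, k=500, seed=0):
    """Average of Range / S over k standard-normal samples of size n."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(k):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_range = max(xs) - min(xs)
        ratios.append(sample_range / statistics.stdev(xs))
    return statistics.mean(ratios)


if __name__ == "__main__":
    for n in (2, 5, 100):
        print(f"n = {n:3d}: mean(Range / S) = {range_vs_s(n):.3f}")
```

For n = 2 the ratio is always sqrt(2), because the range and S are both multiples of |x1 - x2|. The ratio grows with n, so the range cannot simply stand in for S without a constant that depends on n.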

Simulation #2: Jointly Distributed Variables with Tolerance Stack-Up. Develop tolerances (mean ± 4σ) for the volume of an engine cylinder whose bore ~ N(81 mm, (0.25 mm)^2) and stroke ~ N(83.5 mm, (0.20 mm)^2). What is the volume equation? 25.4 mm = 1 in, 1 L = 10^6 mm^3
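A possible simulation is sketched below. It assumes the cylinder volume is (π/4) · bore² · stroke, that bore and stroke are independent, and that the second parameter of each normal is a variance, giving standard deviations of 0.25 mm and 0.20 mm. None of these is confirmed by the transcript. The replication count and function names are illustrative.

```python
import math
import random
import statistics

MM3_PER_LITER = 1e6


def cylinder_volume_mm3(bore, stroke):
    """Volume of a cylinder with the given bore (diameter) and stroke, in mm^3."""
    return math.pi / 4 * bore**2 * stroke


def simulate_volume(k=20_000, seed=0):
    """Return (mean, sd, (lower, upper)) of the simulated volume in mm^3."""
    rng = random.Random(seed)
    vols = [
        # assumed independent draws of bore and stroke
        cylinder_volume_mm3(rng.gauss(81.0, 0.25), rng.gauss(83.5, 0.20))
        for _ in range(k)
    ]
    mu = statistics.mean(vols)
    sd = statistics.stdev(vols)
    return mu, sd, (mu - 4 * sd, mu + 4 * sd)


if __name__ == "__main__":
    mu, sd, (lo, hi) = simulate_volume()
    print(f"mean = {mu / MM3_PER_LITER:.4f} L, sd = {sd / MM3_PER_LITER:.5f} L")
    print(f"mean +/- 4 sd: {lo / MM3_PER_LITER:.4f} L to {hi / MM3_PER_LITER:.4f} L")
```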

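IV. Distribution of the Sample Mean / Total. Proposition: Let X1, X2, ..., Xn be a random sample from a distribution with mean $\mu$ and standard deviation $\sigma$. Then $E(\bar{X}) = \mu$, $V(\bar{X}) = \sigma^2/n$, and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$. Let the total be $T_o = X_1 + X_2 + \dots + X_n$. Then $E(T_o) = n\mu$, $V(T_o) = n\sigma^2$, and $\sigma_{T_o} = \sqrt{n}\,\sigma$. Note the difference between averaging and summing rv's: averaging divides the variance by n, while summing multiplies it by n.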

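Sample Problem: Using the Avg or Sum of rv's. Let Y = number of parking tickets issued on any given weekday. Suppose Y has a Poisson distribution with $\lambda = 50$. Assuming you may approximate with a normal distribution:
a) What are the mean and variance of the average number of tickets per 5-day week?
b) What are the mean and variance of the total number of tickets per 5-day week?
c) What is the probability that the average number of tickets per 5-day week is less than 48?
d) What is the probability that the total number of tickets per 5-day week is between 225 and 275?
Solutions:
a) The daily mean is 50 and the daily variance is 50, so $E(\bar{X}) = 50$, $V(\bar{X}) = 50/5 = 10$, and $\sigma_{\bar{X}} = 3.162$.
b) $E(T_o) = 5(50) = 250$, $V(T_o) = 5(50) = 250$, and $\sigma_{T_o} = \sqrt{250} = 15.811$.
c) The average is below 48 exactly when the total is at most 239, so with the continuity correction test the total at 239.5 (the average at 47.9): $z = (239.5 - 250)/15.811 = -0.66$, and $P(\bar{X} < 48) \approx \Phi(-0.66) = 0.2546$.
d) With the continuity correction, $z_U = (275.5 - 250)/15.81 = 1.61$ and $z_L = (224.5 - 250)/15.81 = -1.61$, so $P(225 \le T_o \le 275) \approx \Phi(1.61) - \Phi(-1.61) = 0.8926$.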
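The arithmetic can be checked with `statistics.NormalDist` from the standard library. This sketch assumes the five daily counts are independent and works on the total scale: a weekly average below 48 means a weekly total of at most 239, so the continuity-corrected cutoff is 239.5. The function name and the returned keys are illustrative.

```python
import math
from statistics import NormalDist


def weekly_ticket_summary(lam=50, days=5):
    """Normal-approximation answers for the average and total of daily counts."""
    mean_avg, var_avg = lam, lam / days          # average: variance divided by n
    mean_tot, var_tot = days * lam, days * lam   # total: variance multiplied by n
    total = NormalDist(mean_tot, math.sqrt(var_tot))
    return {
        "mean_avg": mean_avg,
        "var_avg": var_avg,
        "mean_tot": mean_tot,
        "var_tot": var_tot,
        # average < 48  <=>  total <= 239; continuity correction at 239.5
        "p_avg_below_48": total.cdf(48 * days - 0.5),
        # 225 <= total <= 275 with continuity correction at 224.5 and 275.5
        "p_tot_225_275": total.cdf(275.5) - total.cdf(224.5),
    }


if __name__ == "__main__":
    for name, value in weekly_ticket_summary().items():
        print(f"{name}: {value:.4f}")
```

The probabilities computed here use unrounded z values, so they can differ in the third decimal place from table lookups that round z to two decimals.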

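V. Central Limit Theorem (CLT). Let X1, X2, ..., Xn be a random sample from a distribution with mean $\mu$ and variance $\sigma^2$. If n is sufficiently large, then $\bar{X}$ is approximately $N(\mu, \sigma^2/n)$ and $T_o$ is approximately $N(n\mu, n\sigma^2)$. Rule of thumb: n > 30, but a much smaller n can suffice!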

Understanding the CLT. Using Minitab, let us generate 100 groups of service times (4 observations per group) from an exponential distribution with mean = 20 min. Describe what happens to the distribution. [Figures: histogram of the 400 individual times; histogram of the 100 group averages]
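A rough stand-in for the Minitab demonstration is sketched below. It uses sample skewness as a numeric substitute for eyeballing the two histograms. That substitution, the function names, and the seed are assumptions of this illustration.

```python
import random
import statistics


def group_averages(mean=20.0, group_size=4, groups=100, seed=0):
    """Return (all individual times, group averages) for exponential service times."""
    rng = random.Random(seed)
    data = [
        [rng.expovariate(1.0 / mean) for _ in range(group_size)]  # rate = 1/mean
        for _ in range(groups)
    ]
    flat = [t for group in data for t in group]
    avgs = [statistics.mean(group) for group in data]
    return flat, avgs


def skewness(xs):
    """Sample skewness: roughly 0 for a symmetric, bell-shaped distribution."""
    m = statistics.mean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


if __name__ == "__main__":
    flat, avgs = group_averages()
    print(f"individual times: sd = {statistics.stdev(flat):.2f}, skew = {skewness(flat):.2f}")
    print(f"group averages:   sd = {statistics.stdev(avgs):.2f}, skew = {skewness(avgs):.2f}")
```

Raising `group_size` shrinks both the spread and the skewness of the averages, which is the same effect the sample-size comparison explores.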

Increasing sample size What is happening to the distribution of the sample averages? (Note: underlying distribution - exponential)

Average Multiple Distributions. Suppose you have samples from 3 different distributions (e.g., exponential, Weibull, and uniform). Minitab results from exponential ($\lambda$ = 20), Weibull (shape = 2, scale = 12), and uniform (20, 80). [Figures: histogram of all 300 observations; histogram of the sample averages]

Summarizing the CLT. Regardless of the underlying distribution (provided its variance is finite), averaging produces a distribution that is more bell-shaped than the original. Usefulness of the CLT: if n is sufficiently large and we wish to compute a probability for the sample mean, we may approximate its distribution with a normal. The CLT provides analytical robustness! How robust depends on n and on the underlying distribution: the more closely the underlying distribution resembles a normal (bell shape), the smaller the n that is needed.

Other Applications: Bernoulli Trials (Binomial Distribution). Let a sample of size n consist of Bernoulli trials Xi (each trial equals 0 for failure, 1 for success). When n (the number of trials) is large enough that both np > 10 and nq > 10, the number of successes (mean np) and the sample proportion (mean p) are approximately normally distributed. Consider the following example: given 10K Bernoulli trials grouped into samples of size 100, what will be the distribution of the group results?
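The grouping experiment can be sketched as follows. The slide does not give the success probability, so the values p = 0.5 and p = 0.02 used here are hypothetical, chosen to contrast a case that meets the np > 10 rule with one that does not.

```python
import random
import statistics


def group_proportions(p, trials=10_000, group_size=100, seed=0):
    """Simulate Bernoulli(p) trials and return the proportion of successes per group."""
    rng = random.Random(seed)
    xs = [1 if rng.random() < p else 0 for _ in range(trials)]
    return [
        sum(xs[i:i + group_size]) / group_size
        for i in range(0, trials, group_size)
    ]


if __name__ == "__main__":
    for p in (0.5, 0.02):  # hypothetical success probabilities
        props = group_proportions(p)
        print(
            f"p = {p}: np = {100 * p:.0f}, mean = {statistics.mean(props):.4f}, "
            f"sd = {statistics.stdev(props):.4f}, min = {min(props)}, max = {max(props)}"
        )
```

With p = 0.5 the 100 group proportions spread roughly symmetrically around 0.5. With p = 0.02 they pile up against zero, so a group size of 100 is too small for the normal approximation.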

Bernoulli Trial Example What does this experiment show about the importance of sample size, particularly for binary attributes?

Rules of Thumb with CLT. How large a sample size do you need to invoke the CLT?
Uniform: n >= 4
Symmetric triangular: n >= 3
Normal: n >= 1
Unimodal with extreme skew (e.g., exponential): n >= 30
Discrete: apply the normal approximation rules
Binomial: np >= 10 for p < 0.5 (or, np >= 10 and nq >= 10)
Poisson: $\lambda$ >= 15

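VI. Distribution of a Linear Combination (Independent Xi's). Let X1, X2, ..., Xn be a collection of random variables and a1, a2, ..., an be constants. The linear combination is $Y = a_1X_1 + a_2X_2 + \dots + a_nX_n$, with $E(Y) = a_1E(X_1) + a_2E(X_2) + \dots + a_nE(X_n)$. If X1, X2, ..., Xn are independent: $V(Y) = a_1^2V(X_1) + a_2^2V(X_2) + \dots + a_n^2V(X_n)$.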

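Differences Between Variables. If Y = X1 - X2 (that is, a1 = 1 and a2 = -1) with X1 and X2 independent: $E(X_1 - X_2) = E(X_1) - E(X_2)$ and $V(X_1 - X_2) = a_1^2V(X_1) + a_2^2V(X_2) = V(X_1) + V(X_2)$. Regardless of whether independent Xi are added or subtracted, the variances are additive!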

Linear Combination: Tolerance. Suppose you need to slide tube A into tube B. What is the distribution of the assembly clearance if tube A is $N(24.8, 0.05^2)$ and tube B is $N(25, 0.05^2)$? Assume the tube measurements are independent. $E(B) - E(A) = 0.2$; $\sigma_{clear} = \sqrt{0.05^2 + 0.05^2} = 0.07$.

Weighted Linear Combination: Tolerance. Suppose you are welding two pieces of metal together: a thick piece and a thin piece. Let $X_{thin}$ be the position of the thin piece and $X_{thick}$ the position of the thick piece. From experience, you find the final position follows $Y_{assembly} = 0.2X_{thin} + 0.8X_{thick}$. What is the variance of the assembly if the standard deviation of the thin piece is 0.4 mm and the standard deviation of the thick piece is 0.15 mm? (Assume the measurements of the two pieces are independent.) $V(Y_{asm}) = 0.2^2(0.4^2) + 0.8^2(0.15^2) = 0.0208$; $\sigma_{asm} = 0.144$.
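Both tolerance examples reduce to the same two formulas for independent rv's, so a small helper covers them. This is a sketch. The function name is illustrative, and the means passed for the weld example are placeholders because the slide gives only standard deviations.

```python
import math


def linear_combination(coeffs, means, sds):
    """Mean, variance, and sd of Y = sum(a_i * X_i) for independent X_i."""
    mean = sum(a * m for a, m in zip(coeffs, means))
    var = sum((a * s) ** 2 for a, s in zip(coeffs, sds))  # a_i^2 * V(X_i)
    return mean, var, math.sqrt(var)


if __name__ == "__main__":
    # Clearance = B - A, with B ~ N(25, 0.05^2) and A ~ N(24.8, 0.05^2)
    mean, var, sd = linear_combination([1, -1], [25.0, 24.8], [0.05, 0.05])
    print(f"clearance: mean = {mean:.3f}, sd = {sd:.4f}")

    # Weld: Y = 0.2 * X_thin + 0.8 * X_thick; the means (0.0) are placeholders
    _, var, sd = linear_combination([0.2, 0.8], [0.0, 0.0], [0.4, 0.15])
    print(f"assembly: variance = {var:.4f}, sd = {sd:.4f}")
```

Because each coefficient is squared, the minus sign in the clearance example still adds variance, and the thin piece contributes less to the weld variance than its larger standard deviation alone would suggest.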