Paul Cornwell, March 31, 2011


• Let X_1, …, X_n be independent, identically distributed random variables with mean μ and finite, positive variance σ². Averages of these variables will be approximately normally distributed with mean μ and standard deviation σ/√n when n is large.
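The statement above can be checked with a small simulation. This is a minimal sketch, not taken from the slides: the exponential distribution with rate 1, the sample size of 30, the number of replications, and the helper name `sample_means` are all assumptions chosen for illustration.

```python
import random
import statistics

def sample_means(draw, n, reps, seed=0):
    """Return `reps` averages, each taken over `n` independent draws."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]

# Assumed example: Exponential(1), a skewed distribution with mean 1 and sd 1.
n = 30
means = sample_means(lambda rng: rng.expovariate(1.0), n=n, reps=20000)

observed_mean = statistics.fmean(means)
observed_sd = statistics.stdev(means)
predicted_sd = 1.0 / n ** 0.5  # sigma / sqrt(n) with sigma = 1

print(f"mean of averages: {observed_mean:.3f} (CLT predicts 1.000)")
print(f"sd of averages:   {observed_sd:.3f} (CLT predicts {predicted_sd:.3f})")
```

The mean and standard deviation of the averages should land close to μ and σ/√n. How close the whole shape is to normal is a separate question.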

• How large a sample size is required for the Central Limit Theorem (CLT) approximation to be good?
• What is a ‘good’ approximation?

• Permits analysis of random variables even when the underlying distribution is unknown
• Estimating parameters
• Hypothesis testing
• Polling

• Performing a hypothesis test to determine whether a set of data came from a normal distribution
• Considerations
  ◦ Power: probability that a test will reject the null hypothesis when it is false
  ◦ Ease of use

• Problems
  ◦ No test is desirable in every situation (no universally most powerful test)
  ◦ Some cannot test the composite hypothesis of normality (i.e., a normal distribution with unspecified mean and variance, not only the standard normal)
  ◦ The reliability of tests is sensitive to sample size; with enough data, the null hypothesis will almost always be rejected

• Symmetric
• Unimodal
• Bell-shaped
• Continuous

• Skewness: measures the asymmetry of a distribution.
  ◦ Defined as the third standardized moment
  ◦ Skewness of the normal distribution is 0

• Kurtosis: measures peakedness or heaviness of the tails.
  ◦ Defined as the fourth standardized moment
  ◦ Kurtosis of the normal distribution is 3
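Both definitions are instances of the same formula, the k-th standardized moment E[((X − μ)/σ)^k], with k = 3 for skewness and k = 4 for kurtosis. A minimal sketch that estimates them from simulated data follows; the helper name `standardized_moment`, the sample sizes, and the choice of a normal and an exponential sample are assumptions for illustration.

```python
import random
import statistics

def standardized_moment(data, k):
    """k-th standardized moment: the mean of ((x - mean) / sd) ** k."""
    mu = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return statistics.fmean(((x - mu) / sd) ** k for x in data)

rng = random.Random(1)
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(100000)]
expo_sample = [rng.expovariate(1.0) for _ in range(100000)]

skew_normal = standardized_moment(normal_sample, 3)  # theory: 0
kurt_normal = standardized_moment(normal_sample, 4)  # theory: 3
skew_expo = standardized_moment(expo_sample, 3)      # theory: 2
kurt_expo = standardized_moment(expo_sample, 4)      # theory: 9

print(f"normal:      skewness {skew_normal:.2f}, kurtosis {kurt_normal:.2f}")
print(f"exponential: skewness {skew_expo:.2f}, kurtosis {kurt_expo:.2f}")
```

Subtracting 3 from the kurtosis gives the excess kurtosis, which is 0 for the normal distribution. Theoretical values such as −1.2 for the uniform and 6 for the exponential are on that excess scale.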

• Cumulative distribution function:
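Assuming this slide refers to the binomial distribution, which the parameters n and p in the statistics that follow suggest, the standard form of its cumulative distribution function is:

```latex
F(k; n, p) = P(X \le k) = \sum_{i=0}^{\lfloor k \rfloor} \binom{n}{i} \, p^{i} (1-p)^{n-i}
```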

parameters   Kurtosis   Skewness   % outside 1.96*sd   K-S distance   Mean   Std Dev
n = 20, p =   (.25) .3325 (1.5)
n = 25, p =
n = 30, p =
n = 50, p =
n = 100, p =
*from R

• Cumulative distribution function:
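Assuming this slide refers to the continuous uniform distribution on (a, b), which the (a,b) parameters in the statistics that follow suggest, the standard form of its cumulative distribution function is:

```latex
F(x; a, b) =
\begin{cases}
0, & x < a \\
\dfrac{x - a}{b - a}, & a \le x \le b \\
1, & x > b
\end{cases}
```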

parameters   Kurtosis   Skewness   % outside 1.96*sd   K-S distance   Mean   Std Dev
n = 5, (a,b) = (0,1)   (-1.2) .004 (0) (.129)
n = 5, (a,b) = (0,50)   (6.455)
n = 5, (a,b) = (0,.1)   (.0129)
n = 3, (a,b) = (0,50)   (8.333)
*from R

• Cumulative distribution function:
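Assuming this slide refers to the exponential distribution with rate λ, which the λ parameter in the statistics that follow suggests, the standard form of its cumulative distribution function is:

```latex
F(x; \lambda) =
\begin{cases}
1 - e^{-\lambda x}, & x \ge 0 \\
0, & x < 0
\end{cases}
```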

parameters   Kurtosis   Skewness   % outside 1.96*sd   K-S distance   Mean   Std Dev
n = 5, λ =   (6) .904 (2) (.4472)
n =   (.316)
n =   (.2581)
*from R

• Find n values for more distributions
• Refine criteria for quality of approximation
• Explore distributions with no mean
• Classify distributions in order to have more general guidelines for minimum sample size

Paul Cornwell May 2,

• Central Limit Theorem: averages of i.i.d. variables become approximately normally distributed as sample size increases
• Rate of convergence depends on the underlying distribution
• What sample size is needed to produce a good approximation from the CLT?

• Real-life applications of the Central Limit Theorem
• What does kurtosis tell us about a distribution?
• What is the rationale for requiring np ≥ 5?
• What about distributions with no mean?

• Probability for total distance covered in a random walk tends towards normal
• Hypothesis testing
• Confidence intervals (polling)
• Signal processing, noise cancellation

• Measures the “peakedness” of a distribution
• Higher peaks mean fatter tails

• Traditional assumption for normality with binomial is np > 5 or 10
• Skewness of binomial distribution increases as p moves away from .5
• Larger n is required for convergence for skewed distributions
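The link between p and the required n can be made concrete with the closed-form skewness of a Binomial(n, p) variable, (1 − 2p)/√(np(1 − p)). The sketch below is illustrative only: the helper names are hypothetical, and the 0.25 cutoff is borrowed from the skewness criterion used later in the talk rather than from this slide.

```python
import math

def binomial_skewness(n, p):
    """Skewness of Binomial(n, p): (1 - 2p) / sqrt(n * p * (1 - p))."""
    return (1 - 2 * p) / math.sqrt(n * p * (1 - p))

def min_n_for_skewness(p, threshold, n_max=100000):
    """Smallest n with |skewness| below `threshold`, or None if none is found."""
    for n in range(1, n_max + 1):
        if abs(binomial_skewness(n, p)) < threshold:
            return n
    return None

for p in (0.5, 0.3, 0.1):
    n_needed = min_n_for_skewness(p, threshold=0.25)
    print(f"p = {p}: n = {n_needed}, np = {n_needed * p:.1f}")
```

Under this criterion, p = .1 needs n = 114, so np is about 11.4, which is above the np > 5 or 10 rule of thumb.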

• Has no moments (including mean, variance)
• Distribution of averages looks like the original distribution
• CLT does not apply
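The Cauchy distribution is the textbook case of a distribution with no mean, and the sketch below assumes it is the one meant here. It compares the interquartile range of single draws with that of averages of 50 draws. The helper names, seed, and sample sizes are assumptions for illustration.

```python
import math
import random
import statistics

def cauchy(rng):
    """Standard Cauchy draw via the inverse CDF: tan(pi * (U - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))

def iqr_of_means(n, reps, seed):
    """Interquartile range of `reps` averages of `n` Cauchy draws each."""
    rng = random.Random(seed)
    means = [statistics.fmean(cauchy(rng) for _ in range(n)) for _ in range(reps)]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1

iqr_single = iqr_of_means(n=1, reps=10000, seed=2)
iqr_averaged = iqr_of_means(n=50, reps=10000, seed=2)

print(f"IQR of single draws:   {iqr_single:.2f}")
print(f"IQR of averages of 50: {iqr_averaged:.2f}")
```

For a distribution with finite variance, the spread of the averages would shrink by a factor of √50. Here both interquartile ranges stay near 2, the value for a standard Cauchy, so averaging does not concentrate the distribution.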

• α = β = 1/3
• Distribution is symmetric and bimodal
• Averages converge to normal quickly

• Heavier-tailed, bell-shaped curve
• Approaches normal distribution as degrees of freedom increase
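Assuming this slide refers to Student's t distribution with ν degrees of freedom, the approach to normality can be read off its standard moment formulas:

```latex
\operatorname{Var}(T_\nu) = \frac{\nu}{\nu - 2} \quad (\nu > 2),
\qquad
\text{excess kurtosis}(T_\nu) = \frac{6}{\nu - 4} \quad (\nu > 4)
```

As ν grows, the variance tends to 1 and the excess kurtosis to 0, the values for the standard normal. For ν = 4.1 the excess kurtosis is 6/0.1 = 60, and for ν = 2.5 the fourth moment is not finite, so a kurtosis criterion cannot be evaluated there.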

• 4 statistics: K-S distance, tail probabilities, skewness and kurtosis
• Different thresholds for “adequate” and “superior” approximations
• Both are fairly conservative
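One way the four statistics could be computed from simulated averages is sketched below. This is an assumption-laden illustration rather than the procedure behind the slides: the function name is hypothetical, the averages are standardized with their own sample mean and standard deviation, and the uniform example with n = 3 is chosen only for demonstration.

```python
import random
import statistics

def approximation_stats(means):
    """Skewness, excess kurtosis, two-sided tail probability, and K-S distance
    of standardized sample means relative to the standard normal."""
    m = statistics.fmean(means)
    s = statistics.pstdev(means)
    z = sorted((x - m) / s for x in means)
    n = len(z)
    phi = statistics.NormalDist().cdf
    return {
        "skewness": statistics.fmean(v ** 3 for v in z),
        "excess_kurtosis": statistics.fmean(v ** 4 for v in z) - 3.0,
        # Share of averages more than 1.96 sd from the mean (normal value: about .05).
        "tail_prob": sum(abs(v) > 1.96 for v in z) / n,
        # Largest gap between the empirical CDF and the standard normal CDF.
        "ks_distance": max(
            max((i + 1) / n - phi(v), phi(v) - i / n) for i, v in enumerate(z)
        ),
    }

rng = random.Random(3)
means = [statistics.fmean(rng.random() for _ in range(3)) for _ in range(20000)]
stats = approximation_stats(means)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

For averages of n uniform variables the theoretical excess kurtosis is −1.2/n, so n = 3 gives −0.4. The simulated value should fall near that, while the skewness stays near 0 by symmetry.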

Distribution               ∣Kurtosis∣ < .5   ∣Skewness∣ < .25   Tail Prob. .04 < x < .06   K-S Distance < .05   max
Uniform                    3                 1                  2                          2                    3
Beta (α=β=1/3)             4                 1                  3                          3                    4
Exponential
Binomial (p=.1)
Binomial (p=.5)
Student’s t with 2.5 df    NA 1320
Student’s t with 4.1 df

Distribution               ∣Kurtosis∣ < .3   ∣Skewness∣ < .15   Tail Prob. .04 < x < .06   K-S Distance < .02   max
Uniform                    4                 1                  2                          2                    4
Beta (α=β=1/3)             6                 1                  3                          4                    6
Exponential
Binomial (p=.1)
Binomial (p=.5)
Student’s t with 2.5 df    NA
Student’s t with 4.1 df

• Skewness is difficult to shake
• Tail probabilities are fairly accurate for small sample sizes
• Traditional recommendation is small for many common distributions