Daniela Stan Raicu School of CTI, DePaul University


1 CSC 323
Quarter: Spring 02/03
Daniela Stan Raicu, School of CTI, DePaul University
11/28/2018 Daniela Stan - CSC323

2 Outline
Chapter 5: Sampling Distributions
Population and sample
Sampling distribution of a sample mean
Central limit theorem
Examples

3 Introduction
This chapter begins a bridge from the study of probability to the study of statistical inference by introducing the sampling distribution.
Quality of sample data: the quality of any statistical analysis depends on the quality of the sample data. If the sample is not representative, analyzing the data and drawing conclusions will be unproductive at best.
Random sampling: every unit in the population has an equal chance of being chosen.
[Diagram: a sample drawn from a population]

4 Some definitions
Parameter: a number describing a population.
Statistic: a number describing a sample.
1. A random sample should represent the population well, so sample statistics from a random sample should provide reasonable estimates of population parameters.

| Sample statistic | Population parameter |
| Sample mean $\bar{x}$ | $\mu$ |
| Sample proportion $\hat{p}$ | $p$ |
| Sample variance $s^2$ | $\sigma^2$ |

5 Some definitions (cont.)
2. All sample statistics have some error in estimating population parameters.
3. If repeated samples are taken from a population and the same statistic (e.g. the mean) is calculated from each sample, the statistics will vary; that is, they will have a distribution.
4. A larger sample provides more information than a smaller sample, so a statistic from a large sample should have less error than a statistic from a small sample.
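Points 3 and 4 can be checked with a small simulation. The sketch below is illustrative only: the population (10,000 uniform values), the sample sizes, and the helper name `sample_means` are assumptions, not part of the original slides.

```python
import random
import statistics

def sample_means(population, n, repeats, rng):
    """Draw `repeats` random samples of size n (with replacement) and return their means."""
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(repeats)]

rng = random.Random(0)  # fixed seed so the run is reproducible
# Hypothetical population: 10,000 values drawn uniformly from [0, 1).
population = [rng.random() for _ in range(10_000)]
mu = statistics.fmean(population)

small = sample_means(population, n=10, repeats=2_000, rng=rng)
large = sample_means(population, n=100, repeats=2_000, rng=rng)

# The sample means vary from sample to sample (point 3),
# and they vary less when each sample is larger (point 4).
spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)
print(f"population mean: {mu:.4f}")
print(f"spread of sample means, n=10:  {spread_small:.4f}")
print(f"spread of sample means, n=100: {spread_large:.4f}")
```

The spread for n = 100 should come out at roughly one third of the spread for n = 10, in line with the $1/\sqrt{n}$ behaviour of the standard deviation of the sample mean.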

6 Describing the Sample Mean
Suppose we want to estimate the population mean $\mu$, which is usually the first quantity an analyst wants to know.
Since the value of the sample mean depends on the particular sample we draw, the sample mean is a variable with a huge number of possible values.
The sample mean is a random variable because the samples are drawn randomly.
The best way to summarize this vast amount of information is to describe it with a probability distribution.

7 The Distribution of the Sample Mean
Problem:
Population: {A, B, C, D, E, F}
Population mean: $\mu = 0.1483$
Population variance: $\sigma^2 =$ [value missing]

8 The Distribution of the Sample Mean
Assumptions: $\mu = 0.1483$, $\sigma^2 =$ [value missing]
What is the central value of the variable $\bar{x}$?
What is its variability?
Is there a familiar pattern in the variability?

9 What is the central value of the sample mean?
For large samples, the distribution of $\bar{x}$ should be symmetrical: $\bar{x}$ should be larger than $\mu$ about 50% of the time and smaller than $\mu$ about 50% of the time.
It can be shown theoretically that the mean of the sample means equals the population mean: $E(\bar{x}) = \mu$. This follows from the linearity of expectation and holds for any sample size.
In our example, $E(\bar{x}) = \mu = 0.1483$.
$\bar{x}$ is an unbiased estimator of $\mu$.
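The unbiasedness claim can be verified exactly on a tiny population by listing every possible sample. The sketch below uses made-up values for the six units A to F (the actual values from the slide are not available) and a sample size of 2, both chosen purely for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical values for the six units A..F; not the values from the original slide.
population = {"A": 2, "B": 4, "C": 6, "D": 8, "E": 10, "F": 12}

values = [Fraction(v) for v in population.values()]
mu = sum(values) / len(values)  # population mean

# Every possible sample of size 2; each is equally likely under simple random sampling.
sample_means = [sum(s) / 2 for s in combinations(values, 2)]
mean_of_means = sum(sample_means) / len(sample_means)

print(f"number of samples: {len(sample_means)}")
print(f"population mean:   {mu}")
print(f"mean of the sample means: {mean_of_means}")
```

Exact fractions are used so that the equality between the mean of the sample means and the population mean is not blurred by floating-point rounding.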

10 What is the variance of the sample mean?
An estimator's variance reveals a great deal about the quality of the estimator.
The variance of the sample mean is $\sigma_{\bar{x}}^2 = \sigma^2 / n$,
where $\sigma^2$ is the variance of the population and $n$ is the sample size.
Increasing the sample size $n$ decreases the variance $\sigma_{\bar{x}}^2$, which gives a more precise estimator.
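A short derivation of this variance formula, assuming the $n$ observations $x_1, \dots, x_n$ are drawn independently from a population with variance $\sigma^2$ (the independence assumption is stated here for the sketch; the slides do not spell it out):

```latex
\operatorname{Var}(\bar{x})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right)
  = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(x_i)
  = \frac{n\,\sigma^2}{n^2}
  = \frac{\sigma^2}{n}
```

The second equality is where independence is used: the variance of a sum of independent variables is the sum of their variances.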

11 Accuracy of the Estimator
As in many problems, there is a trade-off between accuracy and cost.
What do we get for our money if we invest in obtaining a larger sample size, for example n = 100 or n = 200?
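One way to answer the question is to compare the standard deviation of the sample mean (the standard error, $\sigma/\sqrt{n}$) at the two sample sizes. The sketch below assumes a population standard deviation of 1.0 purely for illustration; the function name `standard_error` is likewise an assumption.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 1.0  # hypothetical population standard deviation
se_100 = standard_error(sigma, 100)
se_200 = standard_error(sigma, 200)
ratio = se_200 / se_100  # equals 1 / sqrt(2), whatever the value of sigma

print(f"n=100: {se_100:.4f}, n=200: {se_200:.4f}, ratio: {ratio:.3f}")
```

Doubling the sample size multiplies the standard error by $1/\sqrt{2} \approx 0.707$, so twice the data buys only about a 29% reduction in the error of the estimate.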

12 Is there a familiar pattern in the data?
As the sample size becomes larger, the distribution of the sample mean becomes closer to a normal distribution, regardless of the distribution of the population from which the sample is drawn.
The central limit theorem summarizes the distribution of the sample mean.

13 The Central Limit Theorem
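For reference, a standard textbook statement of the theorem (not necessarily the wording of the original slide), with $\mu$ and $\sigma^2$ denoting the population mean and variance and $\bar{x}$ the mean of $n$ independent observations:

```latex
\frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \;\xrightarrow{\;d\;}\; N(0, 1)
\quad \text{as } n \to \infty,
\qquad \text{so for large } n:\quad
\bar{x} \;\text{is approximately}\; N\!\left(\mu,\ \frac{\sigma^2}{n}\right)
```

The statement assumes the population variance $\sigma^2$ is finite.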

14 Importance of the central limit theorem
The most important feature of the central limit theorem is that it can be applied to any population, as long as the sample size n is large enough.
How large is large? A common rule of thumb is $n \geq 30$.
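The effect of the sample size can be seen in a simulation. The sketch below draws from a strongly right-skewed population (exponential with mean 1, chosen here only as an example) and measures how skewed the distribution of the sample mean is for n = 1 and n = 30. The helper names and the number of repetitions are assumptions for illustration.

```python
import random
import statistics

def skewness(xs):
    """Skewness: mean cubed deviation divided by the cubed standard deviation."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return statistics.fmean([(x - m) ** 3 for x in xs]) / sd ** 3

def means_of_samples(n, repeats, rng):
    """Means of `repeats` samples of size n from an exponential population with mean 1."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repeats)
    ]

rng = random.Random(0)  # fixed seed so the run is reproducible
skew_1 = skewness(means_of_samples(n=1, repeats=5_000, rng=rng))
skew_30 = skewness(means_of_samples(n=30, repeats=5_000, rng=rng))

# A normal distribution has skewness 0; values closer to 0 mean "more normal-looking".
print(f"skewness of sample means, n=1:  {skew_1:.2f}")
print(f"skewness of sample means, n=30: {skew_30:.2f}")
```

With n = 30 the skewness is much closer to 0 but not exactly 0, which is why $n \geq 30$ is a rule of thumb rather than a guarantee; very skewed populations may need larger samples.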

15 Importance of the central limit theorem
Examples:

16 Is $\bar{x}$ normally distributed?
Is the population normal?
Yes: Is [condition missing]?
    Yes: $\bar{x}$ is normal
    No: $\bar{x}$ has a Student's t distribution
No: Is [condition missing]?
    Yes: $\bar{x}$ is considered to be normal
    No: $\bar{x}$ may or may not be considered normal (we need more information)

