 The situation in a statistical problem is that there is a population of interest, and a quantity or aspect of that population that is of interest. This.

Slides:



Advertisements
Similar presentations
Chapter 6 Sampling and Sampling Distributions
Advertisements

Estimation in Sampling
 These 100 seniors make up one possible sample. All seniors in Howard County make up the population.  The sample mean ( ) is and the sample standard.
AP Statistics: Section 9.1 Sampling Distributions
Sampling: Final and Initial Sample Size Determination
Chapter 10: Estimating with Confidence
Sampling Distributions and Sample Proportions
Sampling Distributions
Sampling Distributions (§ )
Chapter 7 Sampling and Sampling Distributions
Part III: Inference Topic 6 Sampling and Sampling Distributions
Chapter 10: Estimating with Confidence
Chapter 12 Section 1 Inference for Linear Regression.
Copyright © 2012 Pearson Education. All rights reserved Copyright © 2012 Pearson Education. All rights reserved. Chapter 10 Sampling Distributions.
A P STATISTICS LESSON 9 – 1 ( DAY 1 ) SAMPLING DISTRIBUTIONS.
Chapter 7 Sampling Distributions
CHAPTER 11: Sampling Distributions
Chapter 5 Sampling Distributions
Sociology 5811: Lecture 7: Samples, Populations, The Sampling Distribution Copyright © 2005 by Evan Schofer Do not copy or distribute without permission.
From Sample to Population Often we want to understand the attitudes, beliefs, opinions or behaviour of some population, but only have data on a sample.
POSC 202A: Lecture 9 Lecture: statistical significance.
F OUNDATIONS OF S TATISTICAL I NFERENCE. D EFINITIONS Statistical inference is the process of reaching conclusions about characteristics of an entire.
STA Lecture 161 STA 291 Lecture 16 Normal distributions: ( mean and SD ) use table or web page. The sampling distribution of and are both (approximately)
Sampling Distributions
5.5 Distributions for Counts  Binomial Distributions for Sample Counts  Finding Binomial Probabilities  Binomial Mean and Standard Deviation  Binomial.
ESTIMATION. STATISTICAL INFERENCE It is the procedure where inference about a population is made on the basis of the results obtained from a sample drawn.
+ The Practice of Statistics, 4 th edition – For AP* STARNES, YATES, MOORE Chapter 8: Estimating with Confidence Section 8.1 Confidence Intervals: The.
Estimation: Sampling Distribution
PARAMETRIC STATISTICAL INFERENCE
AP Statistics Section 9.3A Sample Means. In section 9.2, we found that the sampling distribution of is approximately Normal with _____ and ___________.
Standard Error and Confidence Intervals Martin Bland Professor of Health Statistics University of York
Sampling Distributions Chapter 7. The Concept of a Sampling Distribution Repeated samples of the same size are selected from the same population. Repeated.
Chapter 7: Data for Decisions Lesson Plan Sampling Bad Sampling Methods Simple Random Samples Cautions About Sample Surveys Experiments Thinking About.
Warm-up 7.1 Sampling Distributions. Ch. 7 beginning of Unit 4 - Inference Unit 1: Data Analysis Unit 2: Experimental Design Unit 3: Probability Unit 4:
1 Chapter 10: Introduction to Inference. 2 Inference Inference is the statistical process by which we use information collected from a sample to infer.
CHAPTER 11: Sampling Distributions ESSENTIAL STATISTICS Second Edition David S. Moore, William I. Notz, and Michael A. Fligner Lecture Presentation.
Chapter 7 Sampling Distributions Statistics for Business (Env) 1.
CHAPTER 15: Sampling Distributions
+ The Practice of Statistics, 4 th edition – For AP* STARNES, YATES, MOORE Unit 5: Estimating with Confidence Section 10.1 Confidence Intervals: The Basics.
Section 10.1 Confidence Intervals
AP Statistics: Section 9.1 Sampling Distributions.
6.1 Inference for a Single Proportion  Statistical confidence  Confidence intervals  How confidence intervals behave.
STA Lecture 171 STA 291 Lecture 17 Chap. 10 Estimation – Estimating the Population Proportion p –We are not predicting the next outcome (which is.
CONFIDENCE STATEMENT MARGIN OF ERROR CONFIDENCE INTERVAL 1.
Sampling distributions rule of thumb…. Some important points about sample distributions… If we obtain a sample that meets the rules of thumb, then…
Chapter 10: Confidence Intervals
Chapter 12 Confidence Intervals and Hypothesis Tests for Means © 2010 Pearson Education 1.
The Practice of Statistics Chapter 9: 9.1 Sampling Distributions Copyright © 2008 by W. H. Freeman & Company Daniel S. Yates.
1 Chapter 9: Sampling Distributions. 2 Activity 9A, pp
SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS. PPSS The situation in a statistical problem is that there is a population of interest, and a quantity or.
Chapter 13 Sampling distributions
Chapter 7 Data for Decisions. Population vs Sample A Population in a statistical study is the entire group of individuals about which we want information.
Sampling Distributions: Suppose I randomly select 100 seniors in Anne Arundel County and record each one’s GPA
SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS. SAMPLING AND SAMPLING VARIATION Sample Knowledge of students No. of red blood cells in a person Length of.
SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS. SAMPLING AND SAMPLING VARIATION Sample Knowledge of students No. of red blood cells in a person Length of.
Chapter 7 Introduction to Sampling Distributions Business Statistics: QMIS 220, by Dr. M. Zainal.
Learning Objectives After this section, you should be able to: The Practice of Statistics, 5 th Edition1 DESCRIBE the shape, center, and spread of the.
Parameter versus statistic  Sample: the part of the population we actually examine and for which we do have data.  A statistic is a number summarizing.
9.1: Sampling Distributions. Parameter vs. Statistic Parameter: a number that describes the population A parameter is an actual number, but we don’t know.
The inference and accuracy We learned how to estimate the probability that the percentage of some subjects in the sample would be in a given interval by.
The accuracy of averages We learned how to make inference from the sample to the population: Counting the percentages. Here we begin to learn how to make.
Statistics for Business and Economics Module 1:Probability Theory and Statistical Inference Spring 2010 Lecture 4: Estimating parameters with confidence.
10.1 Estimating with Confidence Chapter 10 Introduction to Inference.
Chapter 8: Estimating with Confidence
+ The Practice of Statistics, 4 th edition – For AP* STARNES, YATES, MOORE Chapter 8: Estimating with Confidence Section 8.1 Confidence Intervals: The.
Chapter 9 Sampling Distributions 9.1 Sampling Distributions.
Chapter 6 Sampling and Sampling Distributions
Class Six Turn In: Chapter 15: 30, 32, 38, 44, 48, 50 Chapter 17: 28, 38, 44 For Class Seven: Chapter 18: 32, 34, 36 Chapter 19: 26, 34, 44 Quiz 3 Read.
Distribution of the Sample Means
Sampling Distributions (§ )
Presentation transcript:

 The situation in a statistical problem is that there is a population of interest, and a quantity or aspect of that population that is of interest. This quantity is called a parameter. The value of this parameter is unknown.  To learn about this parameter we take a sample from the population and compute an estimate of the parameter called a statistic.

 It is usually not practical to study an entire population  in a random sample each member of the population has an equal chance of being chosen  a representative sample might have the same proportion of men and women as does the population.

 Population  All Saudis  All inpatients in KKUH  All depressed people  Sample  A subset of Saudis  A subset of inpatients  The depressed people in Riyadh.

Statistics and Parameters  a parameter is a characteristic of a population  e.g., the average heart rate of all Saudis.  a statistic is a characteristic of a sample  e.g., the average heart rate of a sample of Saudis.  We use statistics of samples to estimate parameters of populations.  “mew”  “sigma”  “rho”

 Inference – extension of results obtained from an experiment (sample) to the general population  use of sample data to draw conclusions about entire population  Parameter – number that describes a population  Value is not usually known  We are unable to examine population  Statistic – number computed from sample data  Estimate unknown parameters  Computed to estimate unknown parameters Mean, standard deviation, variability, etc..

What would happen if we took many samples of 10 subjects from the population? Steps: 1. Take a large number of samples of size 10 from the population 2. Calculate the sample mean for each sample 3. Make a histogram of the mean values 4. Examine the distribution displayed in the histogram for shape, center, and spread, as well as outliers and other deviations

How can experimental results be trusted? If is rarely exactly right and varies from sample to sample, how it will be a reasonable estimate of the population mean μ? How can experimental results be trusted? If is rarely exactly right and varies from sample to sample, how it will be a reasonable estimate of the population mean μ? How can we describe the behavior of the statistics from different samples? How can we describe the behavior of the statistics from different samples? E.g. the mean value E.g. the mean value

Very rarely do sample values coincide with the population value (parameter). The discrepancy between the sample value and the parameter is known as sampling error, when this discrepancy is the result of random sampling. Fortunately, these errors behave systematically and have a characteristic distribution.

StudentGPA Susan2.1 Karen2.6 Bill2.3 Calvin1.2 Rose3.0 David2.4

Susan 2.1 Karen 2.6 Bill 2.3 Rose 3.0 David 2.4 Calvin 1.2

ONE SAMPLE = Susan + Karen +Bill / 3 = / 3 = 7.0 / 3 = 2.3 Standard Deviation: ( ) 2 =.2 2 =.04 ( ) 2 =.3 2 =.09 ( ) 2 = 0 2 = 0 s 2 =.13/3 and s = =.21 So this one sample of 3 has a mean of 2.3 and a sd of.21

 A SECOND SAMPLE = Susan + Karen + Calvin = = 1.97 SD =.58  20 th SAMPLE = Karen + Rose + David = = 2.67 SD =.25

 Assume the true mean of the population is known, in this simple case of 6 people and can be calculated as 13.6/6 =  =2.27  The mean of the sampling distribution (i.e., the mean of all 20 samples) is 2.30.

 Sample mean is a random variable.  If the sample was randomly drawn, then any differences between the obtained sample mean and the true population mean is due to sampling error.  Any difference between and μ is due to the fact that different people show up in different samples  If is not equal to μ, the difference is due to sampling error.  “Sampling error” is normal, it is to-be-expected variability of samples

 A distribution made up of every conceivable sample drawn from a population.  A sampling distribution is almost always a hypothetical distribution because typically you do not have and cannot calculate every conceivable sample mean.  The mean of the sampling distribution is an unbiased estimator of the population mean with a computable standard deviation.

If we keep taking larger and larger samples, the statistic is guaranteed to get closer and closer to the parameter value.

Draw 500 different SRSs. What happens to the shape of the sampling distribution as the size of the sample increases?

 As the sample size increases the mean of the sampling distribution comes to more closely approximate the true population mean, here known to be  = 3.5  AND-this critical-the standard error-that is the standard deviation of the sampling distribution gets systematically narrower.

 Probabilistically, as the sample size gets bigger the sampling distribution better approximates a normal distribution.  The mean of the sampling distribution will more closely estimate the population parameter as the sample size increases.  The standard error (SE) gets narrower and narrower as the sample size increases. Thus, we will be able to make more precise estimates of the whereabouts of the unknown population mean.

We are unlikely to ever see a sampling distribution because it is often impossible to draw every conceivable sample from a population and we never know the actual mean of the sampling distribution or the actual standard deviation of the sampling distribution. But, here is the good news: We can estimate the whereabouts of the population mean from the sample mean and use the sample’s standard deviation to calculate the standard error. The formula for computing the standard error changes, depending on the statistic you are using, but essentially you divide the sample’s standard deviation by the square root of the sample size.

 Don’t get confuse with the terms of STANDARD DEVEIATION and STANDARD ERROR

 Standard deviation: measures the variation of a variable in the sample.  Technically,

 Standard error of mean is calculated by:

 The standard deviation (s) describes variability between individuals in a sample.  The standard error describes variation of a sample statistic.  The standard deviation describes how individuals differ.  The standard error of the mean describes how the sample means differ(the precision with which we can make inference about the true mean).

 Standard error of the mean (sem):  Comments:  n = sample size  even for large s, if n is large, we can get good precision for sem  always smaller than standard deviation (s)

 Example: If the SD., of FEV1 values = 0.12, the SE of mean for a sample of size n=40 is 0.12/√40 = For a sample size n=400, the SE of mean is 0.12/√400 = SE reduces as the sample size increases.

P-Hat Definition

Sample proportions The proportion of an “event of interest” can be more informative. In statistical sampling the sample proportion of an event of interest is used to estimate the proportion p of an event of interest in a population. For any SRS of size n, the sample proportion of an event is:  In an SRS of 50 students in an undergrad class, 10 are O +ve blood group: = (10)/(50) = 0.2 (proportion of O +ve blood group in sample)  The 30 subjects in an SRS are asked to taste an unmarked brand of coffee and rate it “would buy” or “would not buy.” Eighteen subjects rated the coffee “would buy.” = (18)/(30) = 0.6 (proportion of “would buy”)

 How does p-hat behave? To study the behavior, imagine taking many random samples of size n, and computing a p-hat for each of the samples.  Then we plot this set of p-hats with a histogram.

Sampling distribution of the sample proportion The sampling distribution of is never exactly normal. But as the sample size increases, the sampling distribution of becomes approximately normal. The normal approximation is most accurate for any fixed n when p is close to 0.5, and least accurate when p is near 0 or near 1.

Sampling variability Each time we take a random sample from a population, we are likely to get a different set of individuals and calculate a different statistic. This is called sampling variability. If we take a lot of random samples of the same size from a given population, the variation from sample to sample—the sampling distribution—will follow a predictable pattern.

 Example: Proportion  Suppose a large department store chain is considering opening a new store in a town of 15,000 people.  Further, suppose that 11,541 of the people in the town are willing to utilize the store, but this is unknown to the department store chain managers.  Before making the decision to open the new store, a market survey is conducted.  200 people are randomly selected and interviewed. Of the 200 interviewed, 162 say they would utilize the new store.

 Example: Proportion  What is the population proportion p? 11,541/15,000 = 0.77  What is the sample proportion p ^ ? 162/200 = 0.81  What is the approximate sampling distribution (of the sample proportion)? What does this mean?

 Example: Proportions  What does this mean? Population: 15,000 people, p = 0.77 Suppose we take many, many samples (of size 200): 200 p ^ = 0.74 Then we find the sample proportion for each sample. p ^ = 0.78 p ^ = 0.82 p ^ = 0.73 p ^ = 0.76 and so forth…

 Example: Proportion The sample we took fell here. 0.81

 Example: Proportion  The managers didn’t know the true proportion so they took a sample.  As we have seen, the samples vary.  However, because we know how the sampling distribution behaves, we can get a good idea of how close we are to the true proportion.  This is why we have looked so much at the normal distribution.  Mathematically, the normal distribution is the sampling distribution of the sample proportion, and, as we have seen, the sampling distribution of the sample mean as well.

Properties of p-hat When sample sizes are fairly large, the shape of the p-hat distribution will be normal. When sample sizes are fairly large, the shape of the p-hat distribution will be normal. The mean of the distribution is the value of the population parameter p. The mean of the distribution is the value of the population parameter p. The standard deviation of this distribution is the square root of p(1-p)/n. The standard deviation of this distribution is the square root of p(1-p)/n.

 Example: If the proportion showing recovery within 10 minutes of an anesthetic block in a sample of 60 patients is 0.30 (30%), this could be 0.25 in the second sample, and 0.40 in the third sample. In a sample of 60 patients, if p =0.30, the estimated SE of p is √0.30(1.0.30)/60 =

Two Steps in Statistical Inference Process 1.Calculation of “confidence intervals” from the sample mean and sample standard deviation within which we can place the unknown population mean with some degree of probabilistic confidence 2.Compute “test of statistical significance” (Risk Statements) which is designed to assess the probabilistic chance that the true but unknown population mean lies within the confidence interval that you just computed from the sample mean.

 Example: A random sample of 24 students mean weight is 78 kgs. And standard deviation is 16.8 kgs. Find the standard error of sample mean. Answer: Here n = 24, Sd., = 16.8 S.E. of sample mean = sd.,/√n = 16.8/√24 = 3.4

Example: A random sample of 800 medical students were asked, “whether you do daily exercise” ?. Only 128 responded as “YES”. Find the S.E., of the proportion of medical students who do daily exercise. Answer: Here n=800, p = sample of proportion of students responded as “YES”, i.e., p = 128/800 = 0.16 or 16% S.E. of p = √pq/n =√0.16 x 0.84/800 = 0.13