Chapter 12: AP Statistics

Sample Surveys

General Idea of Sampling Theory
Sampling allows us to go beyond a batch of data we are given. It allows us to do more with the data than display, describe, and summarize. Sampling allows us to model a population and see how it behaves. Like all models, a sample represents the population but contains some error.

Review of Other Models
Normal Model: univariate data, symmetric
Regression Model: bivariate data, linear
Remember, all models have some natural error when we estimate or predict about the population beyond the given data.

Three Important Ideas Needed for Good Sampling
1. Examine a part of the whole
2. Randomization
3. Correct sample size

#1: Examine Part of the Whole
This is the main idea behind sampling. It is usually impractical to gather information from the entire population. To gain an accurate view of the population, we can instead select a sample from the population of interest, as long as the sample is representative of the population.

#2: Randomization
This might be the most important concept in sampling. It protects you against factors that you know are in the data, as well as unknown factors that might make your sample unrepresentative. IT PROTECTS YOU FROM THE INFLUENCES OF ALL THE FEATURES OF THE POPULATION. It will make sure the sample, on average, looks like the population.

#2: Randomization (cont.)
Not only does randomization protect us from these problems and their bad effects (bias), but it also allows us to draw inferences about the population. If we sample correctly, then we can use the sample to make predictions or draw conclusions about the population.

#3: Sample Size
Sample too few, and your results will contain a large amount of error. Sample too many, and the survey becomes costly. What matters is the size of the sample (n), not the size of the population from which the sample came.*
* Except in extreme cases where the population is small; then the sample should be no more than 10% of the population.

#3: Sample Size (cont.)
How big should the sample be?
Enough to be representative.
Usually several hundred people (discussed in depth in Chapter 19).
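
The claim that the sample size matters and the population size does not can be checked with a small simulation. The sketch below is only an illustration: the two populations, the 60% "yes" rate, and the function name are all assumptions made up for the demo. It takes repeated simple random samples of 400 from a population of 10,000 and from a population of 1,000,000, and compares how much the poll results vary.

```python
import random
import statistics

def spread_of_sample_proportions(population_size, n, true_p=0.6, polls=500, seed=1):
    """Standard deviation of the sample proportion across repeated SRS polls."""
    rng = random.Random(seed)
    yes_count = int(population_size * true_p)
    # Hypothetical population for the demo: 1 = "yes", 0 = "no".
    population = [1] * yes_count + [0] * (population_size - yes_count)
    # Each poll is a simple random sample of n people, drawn without replacement.
    proportions = [sum(rng.sample(population, n)) / n for _ in range(polls)]
    return statistics.stdev(proportions)

small_pop = spread_of_sample_proportions(population_size=10_000, n=400)
large_pop = spread_of_sample_proportions(population_size=1_000_000, n=400)
print(f"N = 10,000:    spread of poll results = {small_pop:.4f}")
print(f"N = 1,000,000: spread of poll results = {large_pop:.4f}")
```

Under these assumptions the two spreads should come out close to each other, even though one population is 100 times larger than the other.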

Sampling Error
Since we are dealing with a sample and trying to make a prediction about the population, there will always be “error”. IT IS EXPECTED AND IS NOT A PROBLEM. This expected error is called “Sampling Error” or “Variability”. In a poll it is reported as the “Margin of Error”. It can be reduced (if we feel it is too big) by increasing the sample size.
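
To see sampling error shrink as the sample grows, the following sketch simulates many polls at three sample sizes and measures how much the results vary. It assumes a very large population in which exactly half would answer "yes", so each respondent can be simulated independently; the sample sizes and the function name are illustrative choices, not part of the chapter.

```python
import random
import statistics

def poll_spread(n, true_p=0.5, polls=1000, seed=7):
    """Standard deviation of the sample proportion over many simulated polls."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(polls):
        # Assumption: the population is large enough that respondents are independent.
        yes = sum(1 for _ in range(n) if rng.random() < true_p)
        proportions.append(yes / n)
    return statistics.stdev(proportions)

spreads = {n: poll_spread(n) for n in (100, 400, 1600)}
for n, spread in spreads.items():
    print(f"n = {n:4d}: spread of poll results = {spread:.4f}")
```

In this setup, each fourfold increase in sample size should cut the spread roughly in half, which is why reducing sampling error gets expensive quickly.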

Bias—Big Problem with Sampling
Bias is created by poor sampling methods, when we do not follow the three ideas explained earlier. It usually arises when a certain segment of the population is overrepresented or underrepresented in the sample. It creates results that are systematically different from what they ought to be. It distorts our picture of the population, and therefore we cannot draw any conclusions about the population. It is BAD.

Sampling Error vs. Bias
Sampling Error
Acceptable
Always present; expected
Reduced by increasing sample size
Repeated polls would give results that differ from one another but scatter around the truth
Bias
Bad
Only present when bad sampling methods are used
Reduced by examining sampling methods (is the sample randomized?)
Repeated polls would miss the truth about the population in the same direction (overestimation or underestimation)
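
The difference between sampling error and bias can be illustrated by running many polls two ways. In this sketch the population, its two groups, and their "yes" rates are all hypothetical. The fair polls draw a simple random sample from everyone; the biased polls can only reach group A, a form of undercoverage.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: 6,000 people in group A (about 70% say yes)
# and 4,000 people in group B (about 30% say yes). 1 = yes, 0 = no.
group_a = [1 if rng.random() < 0.7 else 0 for _ in range(6000)]
group_b = [1 if rng.random() < 0.3 else 0 for _ in range(4000)]
population = group_a + group_b
true_p = sum(population) / len(population)

def repeated_polls(frame, n=200, polls=300):
    """Run many polls, each a simple random sample of n from the given frame."""
    return [sum(rng.sample(frame, n)) / n for _ in range(polls)]

fair = repeated_polls(population)  # sampling frame covers everyone
biased = repeated_polls(group_a)   # sampling frame leaves out group B

print(f"true proportion: {true_p:.3f}")
print(f"fair polls:   mean {statistics.mean(fair):.3f}, spread {statistics.stdev(fair):.3f}")
print(f"biased polls: mean {statistics.mean(biased):.3f}, spread {statistics.stdev(biased):.3f}")
```

Both sets of polls vary from one poll to the next, which is sampling error. Only the biased polls should be centered away from the true proportion, and taking a larger sample from the same flawed frame would not fix that.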

Types of Good Sampling Methods
Non-Biased Methods:
Simple Random Sample (SRS)
Stratified Sample
Cluster Sample
Multistage Sample
Systematic Sample
All incorporate selecting subjects at random, which gives each subject, or group of subjects, an equal chance of being selected.

Simple Random Sample (SRS)
The gold standard method. An SRS is composed by selecting individuals by chance, so that every possible sample of the same size is equally likely. Individuals are chosen impersonally, which guards against bias. We need to define our sampling frame: the list from which the sample is drawn. Once we have the sampling frame, we use random numbers to obtain the sample.
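
A minimal sketch of drawing an SRS from a sampling frame with Python's standard `random` module is shown here. The roster of 500 student IDs and the sample size of 25 are hypothetical; `random.Random.sample` draws without replacement.

```python
import random

# Hypothetical sampling frame: a roster of 500 student IDs.
sampling_frame = [f"student_{i:03d}" for i in range(1, 501)]

rng = random.Random(2024)  # fixed seed only so the example is repeatable

# Draw 25 distinct individuals; every set of 25 is equally likely to be chosen.
srs = rng.sample(sampling_frame, k=25)

print(len(srs), "students selected, for example:", srs[:3])
```

Anyone missing from the roster can never be selected, which is why the choice of sampling frame matters as much as the random draw itself.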

Stratified Sample
Divide the population into homogeneous groups called strata (subjects within a stratum are similar in some way, but the strata differ from each other). We then use an SRS within each stratum to produce our sample. Examples of strata would be gender, race, income level, etc. Useful if you think you will get different results from different groups or if different groups occur in different proportions in the population.
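
One way to carry out a stratified sample is sketched below. The frame of students and their grade levels is hypothetical, and the sketch assumes proportional allocation (each stratum contributes in proportion to its size), which is one common choice rather than something the chapter prescribes.

```python
import random

rng = random.Random(3)

# Hypothetical frame of (person, stratum) pairs; grade level is the stratifying variable.
frame = (
    [(f"fresh_{i}", "freshman") for i in range(400)]
    + [(f"soph_{i}", "sophomore") for i in range(300)]
    + [(f"junior_{i}", "junior") for i in range(200)]
    + [(f"senior_{i}", "senior") for i in range(100)]
)

def stratified_sample(frame, total_n):
    """Take an SRS inside each stratum, sized in proportion to the stratum."""
    strata = {}
    for person, stratum in frame:
        strata.setdefault(stratum, []).append(person)
    sample = {}
    for stratum, members in strata.items():
        # Assumed allocation rule: proportional to the stratum's share of the frame.
        k = round(total_n * len(members) / len(frame))
        sample[stratum] = rng.sample(members, k)
    return sample

sample = stratified_sample(frame, total_n=50)
for stratum, chosen in sample.items():
    print(stratum, len(chosen))
```

With these numbers the strata contribute 20, 15, 10, and 5 students, so every grade level is guaranteed to appear in the sample in its population proportion.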

Cluster Sample
Divide the population into heterogeneous groups called clusters. Subjects within each cluster are different from one another, so each cluster is a smaller version of the population and can yield a sample that is representative of the population. Once we have the clusters, we randomly choose some of them and perform a census within each chosen cluster (or take an SRS within each if the design is multistage). Since each cluster is a good representative of the population, we will get an unbiased sample.
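
The two stages of a cluster sample can be sketched as follows. The 40 homerooms of 25 students are a hypothetical population, and the numbers of clusters and students chosen are arbitrary.

```python
import random

rng = random.Random(11)

# Hypothetical population: 40 homerooms (clusters) of 25 students each.
clusters = {
    f"homeroom_{c:02d}": [f"hr{c:02d}_student_{s:02d}" for s in range(25)]
    for c in range(40)
}

# Randomly choose whole clusters, not individual students.
chosen_clusters = rng.sample(sorted(clusters), k=4)

# Cluster sample: take a census of every chosen cluster.
cluster_sample = [student for name in chosen_clusters for student in clusters[name]]

# Multistage variant: take an SRS of 10 inside each chosen cluster instead.
multistage_sample = [
    student
    for name in chosen_clusters
    for student in rng.sample(clusters[name], k=10)
]

print(chosen_clusters)
print(len(cluster_sample), len(multistage_sample))
```

Here the cluster sample contains 100 students and the multistage sample contains 40. The practical appeal is that only four homerooms need to be visited.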

Strata or Cluster?
Strata: Divide the population into groups of similar individuals so that each stratum differs from the others in terms of some attribute.
Cluster: Divide the population into groups of different individuals that are representative of the population. Each cluster has the same makeup of subjects and is a smaller version of the entire population.

Systematic Sample
Start with a list of “subjects”.
Begin at a randomly selected subject on the list and then choose every nth person on the list to be in the sample. Need to make sure that the list is not ordered in any meaningful way that could result in bias.
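
A systematic sample can be sketched in a few lines. The roster of 200 subjects is hypothetical, and the sketch assumes the list length divides evenly by the desired sample size. To avoid confusion with the sample size, the code calls the step between selected subjects the interval `k`.

```python
import random

def systematic_sample(subjects, sample_size, rng):
    """Pick a random start in the first interval, then take every k-th subject."""
    k = len(subjects) // sample_size  # sampling interval
    start = rng.randrange(k)          # random starting position, 0 through k - 1
    return subjects[start::k][:sample_size]

rng = random.Random(5)
roster = [f"subject_{i:03d}" for i in range(200)]  # hypothetical list of subjects
sample = systematic_sample(roster, sample_size=20, rng=rng)
print(sample[:3])
```

The only random step is the starting point. If the roster followed a repeating pattern with the same period as the interval, every selected subject would land at the same point in the pattern, which is the ordering problem the slide warns about.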

Types of Biased (Bad) Sampling Methods
Voluntary Response Sample
Large groups of individuals are invited to respond. Only those who respond are counted. Examples: Internet surveys, mail surveys, etc. Not representative and therefore biased.
Convenience Sample
Whoever is available is sampled. Example: standing outside a supermarket. Not representative and therefore biased.
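
A small simulation suggests why voluntary response samples mislead. Everything in this sketch is assumed for illustration: a population in which 30% oppose a policy, and response rates of 40% for opponents versus 5% for everyone else, on the idea that people with strong feelings are more likely to reply.

```python
import random

rng = random.Random(8)

# Hypothetical population of 20,000: 1 = opposes the policy (30%), 0 = does not (70%).
population = [1] * 6000 + [0] * 14000
true_p = sum(population) / len(population)

def responds(opinion):
    """Assumed behavior: opponents reply 40% of the time, everyone else 5%."""
    return rng.random() < (0.40 if opinion == 1 else 0.05)

# Voluntary response: everyone is invited, only those who choose to reply are counted.
responses = [opinion for opinion in population if responds(opinion)]
voluntary_p = sum(responses) / len(responses)

# For comparison, an SRS with the same number of people.
srs = rng.sample(population, len(responses))
srs_p = sum(srs) / len(srs)

print(f"true proportion opposed: {true_p:.2f}")
print(f"voluntary response:      {voluntary_p:.2f} (n = {len(responses)})")
print(f"SRS of the same size:    {srs_p:.2f}")
```

Under these assumptions the voluntary response result lands far above the true 30%, while the SRS of the same size stays close to it. A large number of responses does not rescue a sample that selected itself.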

Forms of Bias
Undercoverage/Selection Bias
Nonresponse Bias
Measurement/Response Bias

Forms of Bias
Undercoverage/Selection Bias: Introduced during the selection process, when certain individuals are given greater (than intended) probabilities of being selected or are excluded from the selection process. Failing to include all individuals in the selection process is often called undercoverage. Voluntary Response Samples and Convenience Samples suffer from this form of bias.

Forms of Bias
Nonresponse Bias: Introduced when individuals (mostly people) refuse to be measured or refuse to answer questions. Telephone surveys suffer from this. This type of bias is almost unavoidable, so minimizing its effects is important. Always try to “capture” a certain number of subjects, and call others if some refuse.

Forms of Bias
Measurement/Response Bias: Introduced when the measurement process tends to give results that differ systematically from the true values in the population. A common source of measurement bias is wording bias: the way in which a question is worded can often have an effect on the responses. Response bias refers to anything that might affect the measured results, such as what the interviewer is wearing or eating.
