Review Law of averages, expected value and standard error, normal approximation, surveys and sampling.



Basic concepts

Example According to genetic theory, there is very close to an even chance that both children in a two-child family will be of the same sex. Which is more likely? (i) 15 couples have two children each. In 10 or more of the families, both children are of the same sex. (ii) 30 couples have two children each. In 20 or more of the families, both children are of the same sex. Answer: (i). With more families, the percentage of same-sex families is likely to be closer to the expected 50%, so reaching 2/3 or more (10/15 = 20/30 = 2/3) is less likely with 30 families than with 15.
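The answer can be checked with an exact binomial calculation. The sketch below assumes the chance of a same-sex family is exactly 50% and that families are independent; the function name `prob_at_least` is made up for this illustration.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Exact binomial chance of k or more successes in n independent trials."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# Option (i): 10 or more same-sex families out of 15
p_15 = prob_at_least(10, 15)
# Option (ii): 20 or more same-sex families out of 30
p_30 = prob_at_least(20, 30)

print(f"(i)  10 or more of 15: {p_15:.3f}")
print(f"(ii) 20 or more of 30: {p_30:.3f}")
```

Option (i) comes out at roughly 15% and option (ii) at roughly 5%, which agrees with the answer above.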

Example A die will be thrown some number of times, and the object is to guess the total number of spots. There is a one-dollar penalty for each spot that the guess is off. For instance, if you guess 200 and the total is 215, you lose $15. Which do you prefer: 50 throws, or 100? Answer: 50. The best guess is the expected value of the total. With more throws, the chance error in the total is likely to be larger, so you are likely to lose more money with 100 throws.
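The numbers behind this answer can be sketched as follows, treating the die as a box with tickets 1 through 6. The helper name `ev_and_se_of_sum` is an assumption for this illustration.

```python
from math import sqrt

# Box model for a fair die: one ticket per face
faces = [1, 2, 3, 4, 5, 6]
avg = sum(faces) / len(faces)                                # average of the box: 3.5
sd = sqrt(sum((x - avg) ** 2 for x in faces) / len(faces))   # SD of the box, about 1.71

def ev_and_se_of_sum(n):
    """Expected value and standard error for the sum of n draws from the box."""
    return n * avg, sqrt(n) * sd

for n in (50, 100):
    ev, se = ev_and_se_of_sum(n)
    print(f"{n} throws: best guess {ev:.0f}, likely to be off by about {se:.1f}")
```

The SE is about 12 spots for 50 throws and about 17 spots for 100 throws, so the likely penalty is smaller with 50 throws.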

Remark The SE for the number and the SE for the percentage behave quite differently: the SE for the number goes up like the square root of the number of draws, while the SE for the percentage goes down like the square root of the number of draws.
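A small sketch of this remark for a 0-1 box, assuming draws with replacement. The fraction of 1's (50%) and the sample sizes are illustrative choices, and the function names are made up for this example.

```python
from math import sqrt

def se_number(n, p):
    """SE for the number of 1's in n draws from a 0-1 box with a fraction p of 1's."""
    return sqrt(n) * sqrt(p * (1 - p))

def se_percent(n, p):
    """SE for the percentage of 1's: the SE for the number, relative to n, times 100."""
    return se_number(n, p) / n * 100

for n in (100, 400, 1600, 6400):
    print(f"n = {n:5d}: SE for number = {se_number(n, 0.5):5.1f}, "
          f"SE for percentage = {se_percent(n, 0.5):.3f}%")
```

Each time the number of draws is multiplied by 4, the SE for the number doubles and the SE for the percentage is cut in half.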

Basic concepts What is a probability histogram? A graph that represents chance, not data. What is the relationship between the empirical histogram for the observed data and the ideal probability histogram? If the chance process is repeated many times, the empirical histogram converges to the probability histogram. Usually the process is the sum of draws. What if the process is the product of draws? The convergence still applies.

Basic concepts What is the central limit theorem? When drawing at random with replacement from a box, the probability histogram for the sum follows the normal curve, provided that the histogram is put into standard units and the number of draws is reasonably large. What if the process is the product of draws? Then the convergence to the normal curve does not apply.
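The central limit theorem for sums can be illustrated with a short simulation. This is only a sketch: the box (a fair die), the 100 draws, the 5,000 repetitions and the random seed are all assumptions chosen for the example.

```python
import random
from math import sqrt

random.seed(0)      # fixed seed so the run is repeatable

BOX = [1, 2, 3, 4, 5, 6]
N_DRAWS = 100       # draws per sum
N_REPEATS = 5000    # number of times the chance process is repeated

avg = sum(BOX) / len(BOX)
sd = sqrt(sum((x - avg) ** 2 for x in BOX) / len(BOX))
ev = N_DRAWS * avg          # expected value of the sum
se = sqrt(N_DRAWS) * sd     # standard error of the sum

def standardized_sum():
    """Draw N_DRAWS times with replacement and convert the sum to standard units."""
    total = sum(random.choice(BOX) for _ in range(N_DRAWS))
    return (total - ev) / se

z_values = [standardized_sum() for _ in range(N_REPEATS)]
within_1 = sum(abs(z) <= 1 for z in z_values) / N_REPEATS
within_2 = sum(abs(z) <= 2 for z in z_values) / N_REPEATS

print(f"within 1 standard unit:  {within_1:.1%}  (normal curve: about 68%)")
print(f"within 2 standard units: {within_2:.1%}  (normal curve: about 95%)")
```

The simulated fractions should land near the normal-curve areas of 68% and 95%, apart from chance variation and a small effect from the sum taking only whole-number values.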

Basic concepts What is a population? What is a sample? In a survey, a population is the group of subjects that we want to study. A sample is part of the population, and it represents some properties of the population. We study samples when it is impractical to study the whole population. What is a parameter? A parameter is a numerical fact about a population. Usually a parameter cannot be determined exactly, but can only be estimated.

Basic concepts What is a statistic? A statistic is an estimate of the parameter, and it can be computed from a sample. A statistic is what we know. The parameter is what we want to know. What are the two main biases we studied in class? Selection bias and non-response bias.

Basic concepts How do we determine if there is selection bias in a survey? Warning signs include discretion on the part of the interviewers, discretion on the part of the investigator or survey designer, and a selection process that does not use a chance method, so that individuals do not all have the same chance of being chosen. How do we determine if there is non-response bias in a survey? The lifestyle of the non-respondents can be very different from that of the respondents. We may also calculate the non-response rate (thresholds: personal interviews 65%, mailed questionnaires 25%).

Basic concepts What is the best method to draw a sample in a survey? Probability methods. What is the simplest probability method? Simple random sampling. Is it practical? No. The list of names is too long, it is not easy to send out interviewers to find the selected individuals, and so on. What other probability methods have we studied? Multistage cluster sampling and random digit dialing (RDD) for telephone surveys.

Basic concepts According to the equation statistic = parameter + chance error, what is the sample percentage and what is the population percentage in a sampling process? Do they have to be equal? The population percentage is the parameter, or the expected value. The sample percentage is the statistic, or the estimate, and it is usually off by a chance error whose likely size is measured by the SE for the percentage. According to the square root law, what determines the accuracy of the sampling process? When the sample is only a small part of the population, it is mainly the sample size that determines the accuracy. The population size has almost no influence on it.
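The claim about population size can be checked numerically. The sketch below uses the standard correction factor for drawing without replacement, sqrt((N - n) / (N - 1)); the 20% population percentage and the population and sample sizes are illustrative assumptions, and `se_percent` is a name made up for this example.

```python
from math import sqrt

def se_percent(n, p, population=None):
    """SE for a sample percentage from a 0-1 box with a fraction p of 1's.

    If a population size is given, apply the correction factor for
    drawing without replacement.
    """
    se = sqrt(p * (1 - p) / n) * 100
    if population is not None:
        se *= sqrt((population - n) / (population - 1))
    return se

# Same sample size, very different population sizes
print(f"n = 900,  N = 50,000:     {se_percent(900, 0.2, 50_000):.3f}%")
print(f"n = 900,  N = 50,000,000: {se_percent(900, 0.2, 50_000_000):.3f}%")

# Same population, four times the sample size
print(f"n = 3600, N = 50,000:     {se_percent(3600, 0.2, 50_000):.3f}%")
```

Multiplying the population by 1,000 barely changes the SE, while multiplying the sample size by 4 cuts it roughly in half.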

Calculation and formula

Models and Examples

Normal approximation

Models and Examples Sampling process: A group of 50,000 tax forms has an average gross income of $37,000, with an SD of $20,000. About 20% of the forms have a gross income over $50,000. A group of 900 forms is chosen at random for audit. Q1: Estimate the probability that between 19% and 21% of the forms chosen for audit have gross incomes over $50,000. Q2: Estimate the probability that the total gross income of the audited forms is over $33,000,000. (This question can also be restated in terms of the average: the average gross income of the audited forms is over $33,000,000/900.)

Solutions
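One way to work both questions with the normal approximation is sketched below. It assumes the draws can be treated as if made with replacement, since 900 forms is a small part of 50,000, so the correction factor is ignored. The helper `normal_cdf` stands in for the normal table.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n = 900  # number of forms chosen for audit

# Q1: 0-1 box with 20% 1's (forms with gross income over $50,000)
p = 0.20
se_pct = sqrt(p * (1 - p) / n) * 100       # SE for the percentage, about 1.33%
z = (21 - 20) / se_pct                     # 21% in standard units, about 0.75
prob_q1 = normal_cdf(z) - normal_cdf(-z)   # area between 19% and 21%

# Q2: box of gross incomes, average $37,000 and SD $20,000
avg, sd = 37_000, 20_000
ev_total = n * avg                         # expected total: $33,300,000
se_total = sqrt(n) * sd                    # SE for the total: $600,000
z_total = (33_000_000 - ev_total) / se_total   # -0.5 in standard units
prob_q2 = 1 - normal_cdf(z_total)          # area to the right of -0.5

print(f"Q1: about {prob_q1:.0%}")
print(f"Q2: about {prob_q2:.0%}")
```

Under these assumptions, Q1 comes out at about 55% and Q2 at about 69%.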

Example Suppose in a calculus test, a group of 10,000 students has an average score of 70, with an SD of 10. A group of 400 students is chosen for sampling. Q: Assuming the scores follow the normal distribution, can we estimate the probability that between 14% and 18% of the students chosen for sampling have scores above 80? Answer: Yes, we can!

Solution In the box model, we have 10,000 tickets, average = 70, SD = 10, 400 draws. This problem is about determining whether or not each student has a score above 80, so it is a counting process. We need a new box model: a 0-1 box. There are 10,000 tickets, but we don't know the percentages of 1's and 0's. We first have to use the normal curve to estimate them, because we have assumed that the data follow the normal curve. In the original box, with average = 70 and SD = 10, the score 80 is converted to 1 in standard units. From the normal table, the area to the right of 1 is about 16%. That is, in the population of 10,000 students, about 16% of the students have scores above 80. So in the new 0-1 box, 16% of the tickets are 1's and the rest are 0's.

Solution
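Continuing from the 0-1 box with 16% 1's, the rest of the calculation can be sketched as follows. It assumes the draws are treated as if made with replacement (400 is a small part of 10,000) and rounds the population percentage to 16% as in the text; `normal_cdf` stands in for the normal table.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

avg, sd, n = 70, 10, 400

# Step 1: composition of the 0-1 box, from the normal curve
z_80 = (80 - avg) / sd                 # score 80 in standard units: 1
frac_above = 1 - normal_cdf(z_80)      # area to the right of 1
p = round(frac_above, 2)               # rounded to 16%, as in the text

# Step 2: SE for the sample percentage
se_pct = sqrt(p * (1 - p) / n) * 100   # about 1.8%

# Step 3: normal approximation for the sample percentage
z_low = (14 - 100 * p) / se_pct
z_high = (18 - 100 * p) / se_pct
prob = normal_cdf(z_high) - normal_cdf(z_low)

print(f"percentage of 1's in the box: {p:.0%}")
print(f"SE for the percentage: {se_pct:.2f}%")
print(f"chance of 14% to 18%: about {prob:.0%}")
```

With these assumptions the chance comes out at roughly 72%.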

Good Luck!