STAT262: Lecture 5 (Ratio estimation)


Introduction. Today we introduce new estimation methods: ratio estimators. Ratio estimators involve two characteristics: Y, a characteristic we are interested in, and X, a characteristic that is related to Y. The sampling method is the same: Simple Random Sampling (SRS).

Why consider ratio estimation? In many situations we want to estimate the ratio of two population characteristics, e.g., the average yield of corn per acre. A ratio can assist the estimation of a correlated quantity. N may be unknown, in which case a ratio can be used to estimate it. Ratio estimators can increase the precision of estimated means or totals.

Examples. E.g. 1: The average yield of corn per acre. E.g. 2: The number of hummingbirds in a national forest. Sample a few regions and record the number of birds ($y_i$) and the area ($x_i$) for each region. Calculate the sample ratio $\hat{B} = \bar{y}/\bar{x}$. The total area of the national forest is $t_x$. The estimate of $t_y$ is $\hat{t}_{yr} = \hat{B}\, t_x$.
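
The hummingbird calculation can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `ratio_estimate_total`, the bird counts, the region areas, and the total area `t_x = 600.0` are all made-up assumptions, not data from the lecture. The sample ratio is taken as the sample total of y divided by the sample total of x, which equals the ratio of the sample means.

```python
def ratio_estimate_total(y, x, t_x):
    """Return (B_hat, t_yr_hat) from an SRS of paired measurements.

    y   : sampled values of the characteristic of interest (e.g. bird counts)
    x   : sampled values of the auxiliary variable (e.g. region areas)
    t_x : known population total of the auxiliary variable
    """
    if len(y) != len(x) or len(y) == 0:
        raise ValueError("y and x must be non-empty and of equal length")
    b_hat = sum(y) / sum(x)      # sample ratio, identical to ybar / xbar
    return b_hat, b_hat * t_x    # ratio estimate of the total t_y


# Hypothetical data: four sampled regions (counts and areas are invented).
birds = [12, 30, 9, 21]
areas = [2.0, 5.0, 1.5, 3.5]
b_hat, t_hat = ratio_estimate_total(birds, areas, t_x=600.0)
print(b_hat, t_hat)
```

With these invented numbers the sample holds 72 birds on 12 units of area, so the estimated density is 6 birds per unit area and the estimated total is 6 times the known total area.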

Examples. E.g. 3: Laplace wanted to know the number of persons living in France in 1802. There was no census in that year. Two candidate estimators: the SRS estimator $N\bar{y}$ and the ratio estimator $\hat{B}\, t_x$. Which was Laplace's choice? Data from the sample of 30 counties: 2,037,615 persons and 71,866 registered births. For France: N (known), $t_y$ (unknown), $t_x$ (known).

Examples. Laplace reasoned that the ratio estimator would be more accurate. Large counties have more registered births, so the number of registered births and the number of persons are positively correlated. Thus, using the information in x is likely to improve our estimate of y.
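
As a rough illustration of Laplace's reasoning, the sample ratio of persons to registered births can be computed from the two sample figures quoted in the example. The national number of registered births is not given in these slides, so the value passed to the hypothetical helper `ratio_population_estimate` below is a placeholder assumption used only to show the arithmetic.

```python
# Sample figures quoted in the slides (30 sampled counties).
persons_sample = 2_037_615
births_sample = 71_866

# Sample ratio: persons per registered birth.
b_hat = persons_sample / births_sample


def ratio_population_estimate(t_x):
    """Ratio estimate of the population total, given total registered births t_x."""
    return b_hat * t_x


# Placeholder value for t_x: an assumption, not a figure from the lecture.
t_x_placeholder = 1_000_000
print(round(b_hat, 3), round(ratio_population_estimate(t_x_placeholder)))
```

The ratio itself (about 28.35 persons per registered birth) depends only on the sample; the population estimate scales linearly with whatever value of the total is supplied.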

Examples. E.g. 4: McDonald's Corp. We want the average annual sales for this year. One can use information from last year. Details will be discussed later.

Ratio estimators in SRS. Sampling method: SRS. Two quantities $(x_i, y_i)$ are measured on each sampled unit, where $x_i$ is an auxiliary variable.

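Population quantities. Size: $N$. Totals: $t_y = \sum_{i=1}^{N} y_i$ and $t_x = \sum_{i=1}^{N} x_i$. Means: $\bar{y}_U = t_y/N$ and $\bar{x}_U = t_x/N$. Ratio: $B = t_y/t_x = \bar{y}_U/\bar{x}_U$. Variances and covariance: $S_y^2 = \frac{1}{N-1}\sum_{i=1}^{N}(y_i-\bar{y}_U)^2$, $S_x^2 = \frac{1}{N-1}\sum_{i=1}^{N}(x_i-\bar{x}_U)^2$, $S_{xy} = \frac{1}{N-1}\sum_{i=1}^{N}(x_i-\bar{x}_U)(y_i-\bar{y}_U)$. Correlation coefficient: $R = S_{xy}/(S_x S_y)$.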

Example of population quantities

Contents. Ratio estimator in SRS; bias (the exact expression); bias (an approximate formula); MSE and variance; examples; efficiency.

Bias – the exact expression Ratio estimators are usually biased

Bias – the exact expression

Bias – the exact expression Exact, but not easy to use with data.

Bias – an approximation

Bias – an approximation

The bias is usually small if

Variance and Mean Squared Error The bias is usually small, thus can be ignored

A hypothetical example Population. N=8

A hypothetical example

A hypothetical example Mean estimate = 39.85036 Bias = -0.003178 Bias approx: Mean estimate = 40
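
The kind of calculation behind this example can be reproduced by brute force: for a small population, list every possible SRS, compute the ratio estimate on each, and average. The sketch below does this for an invented population of N = 8 units and samples of size n = 4; the x and y values are assumptions and are not the data used on the slide, so the numbers it prints will differ from those quoted above.

```python
from itertools import combinations


def ratio_estimator_bias(x, y, n):
    """Enumerate all SRS of size n and return (B, mean of B_hat, bias, number of samples)."""
    N = len(x)
    b_true = sum(y) / sum(x)                      # population ratio B
    estimates = []
    for s in combinations(range(N), n):           # every equally likely sample
        estimates.append(sum(y[i] for i in s) / sum(x[i] for i in s))
    mean_est = sum(estimates) / len(estimates)    # exact expectation of B_hat
    return b_true, mean_est, mean_est - b_true, len(estimates)


# Hypothetical population of N = 8 units (values are invented).
x_pop = [10, 12, 15, 18, 20, 22, 25, 30]
y_pop = [21, 23, 31, 35, 41, 45, 49, 61]
b_true, mean_est, bias, n_samples = ratio_estimator_bias(x_pop, y_pop, n=4)
print(n_samples, b_true, mean_est, bias)
```

When y is exactly proportional to x, every sample gives the same ratio and the bias vanishes, which matches the idea that the ratio estimator works best when the data follow a line through the origin.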

Estimate variance

Estimate variance
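
The formula on this slide did not survive in the transcript. The sketch below therefore assumes the commonly used large-sample variance estimate for the sample ratio under SRS, $\hat{V}(\hat{B}) = (1 - n/N)\, s_e^2 / (n\,\bar{x}^2)$ with $s_e^2 = \frac{1}{n-1}\sum (y_i - \hat{B}x_i)^2$; whether the lecture uses $\bar{x}$ or the population mean $\bar{x}_U$ in the denominator is not known, so the hypothetical function `ratio_se` accepts either. The data are invented.

```python
import math


def ratio_se(y, x, N, xbar_U=None):
    """Return (B_hat, SE(B_hat)) using the assumed large-sample formula under SRS."""
    n = len(y)
    b_hat = sum(y) / sum(x)
    xbar = sum(x) / n
    denom_mean = xbar if xbar_U is None else xbar_U   # use x-bar_U when it is known
    # Residual variance around the fitted line through the origin.
    s2_e = sum((yi - b_hat * xi) ** 2 for xi, yi in zip(x, y)) / (n - 1)
    var_b = (1 - n / N) * s2_e / (n * denom_mean ** 2)
    return b_hat, math.sqrt(var_b)


# Hypothetical SRS of n = 4 units from a population of N = 8.
x_s = [10, 12, 15, 18]
y_s = [21, 23, 31, 35]
b_hat_s, se_b = ratio_se(y_s, x_s, N=8)
print(b_hat_s, se_b)
```

Multiplying the standard error by the known total of x (or by the population mean of x) gives the corresponding standard error for the ratio estimate of the total (or mean) of y.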

Example 1

Example 1

Example 2

Example 2

Example 2

Efficiency of ratio estimation

STAT262: Lecture 6 (Regression estimation)

Regression estimation. Ratio estimation works well if the data are well fit by a straight line through the origin. Often, data are scattered around a straight line that does not go through the origin.

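Regression estimation. The regression estimator of the population mean is $\hat{\bar{y}}_{reg} = \bar{y} + \hat{B}_1(\bar{x}_U - \bar{x})$, where $\hat{B}_1$ is the least-squares slope of y on x in the sample.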
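
A minimal sketch of the regression estimator, assuming the usual form in which the sample mean of y is adjusted by the least-squares slope times the gap between the known population mean of x and the sample mean of x. The function name `regression_estimate_mean` and the data are illustrative assumptions.

```python
def regression_estimate_mean(y, x, xbar_U):
    """Regression estimate of the population mean of y, given the known mean of x."""
    n = len(y)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1_hat = sxy / sxx                       # least-squares slope
    return ybar + b1_hat * (xbar_U - xbar)   # adjust ybar toward the known x mean


# Hypothetical sample lying exactly on the line y = 3 + 2x.
x_reg = [1, 2, 3, 4]
y_reg = [5, 7, 9, 11]
ybar_reg = regression_estimate_mean(y_reg, x_reg, xbar_U=10.0)
print(ybar_reg)
```

Because the invented points lie exactly on y = 3 + 2x, the estimate is the value of that line at x = 10. A ratio estimator would miss here because the line does not pass through the origin.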

Bias. For a large SRS, the bias is usually small.

Variance and MSE. The bias is small.

Variance

Standard error

The McDonald Example

The McDonald Example

Relative Efficiencies

Relative Efficiencies

Relative Efficiencies

Relative Efficiencies

Summary. We introduced two new estimators. Ratio estimator: $\hat{\bar{y}}_r = \hat{B}\,\bar{x}_U$. Regression estimator: $\hat{\bar{y}}_{reg} = \bar{y} + \hat{B}_1(\bar{x}_U - \bar{x})$. Both exploit the association between x and y. The regression estimator is the most efficient (asymptotically). The ratio estimator is more efficient than the SRS estimator if the correlation R is large.

Estimation in Domains: A motivating example. We are often interested in separate estimates for subpopulations (also called domains). E.g., after taking an SRS of 1000 persons, we want to estimate the average salary for men and the average salary for women.

Estimation in Domains: A motivating example

Estimation in Domains: A motivating example. The calculation in the previous slide treats the domain sample size $n_d$ as a constant. But it is not: it is random, and we should take this randomness into consideration. The formulas we derived for ratio estimators can be used.

Estimations in Domains

Estimations in Domains

Estimations in Domains If the sample is large
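
The large-sample formula on this slide is missing from the transcript. The sketch below assumes the common approximation $SE(\bar{y}_d) \approx \sqrt{(1 - n/N)\, s_{yd}^2 / n_d}$, where $n_d$ is the number of sampled units falling in the domain and $s_{yd}^2$ is their sample variance. The salaries, the domain flags, and the population size are invented for illustration.

```python
import math
from statistics import mean, variance


def domain_mean(values, in_domain, N):
    """Return (domain mean, approximate SE, n_d) from an SRS of size len(values)."""
    n = len(values)
    y_d = [v for v, flag in zip(values, in_domain) if flag]   # sampled units in the domain
    n_d = len(y_d)                                            # random, not fixed in advance
    est = mean(y_d)
    se = math.sqrt((1 - n / N) * variance(y_d) / n_d)         # assumed large-sample form
    return est, se, n_d


# Hypothetical SRS of 6 persons from a population of 600 (salaries in thousands).
salaries = [50, 60, 70, 40, 80, 55]
is_man = [True, False, True, False, True, False]
est_men, se_men, n_men = domain_mean(salaries, is_man, N=600)
print(est_men, se_men, n_men)
```

A sample this small is used only to keep the arithmetic visible; the approximation is intended for large samples.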

A new sampling method: stratified sampling. In stratified sampling, we conduct an SRS in each stratum. Outline: definition and motivation; statistical inference (theory of stratified sampling); advantages of stratified sampling; sample size calculation.

Stratified sampling: definition and motivation. A motivating example: the average number of words in the saved messages of people in this room. What is stratified sampling? Stratify: make layers. Strata: subpopulations. Strata do not overlap. Each sampling unit belongs to exactly one stratum. The strata together constitute the whole population.

Why do we use stratified sampling? To be protected from obtaining a really bad sample. Example: the population size is N=500 (250 women and 250 men) and we take an SRS of size n=50. It is possible to obtain a sample with no men or only a few men: Pr(less than or equal to 15 men in an SRS)=0.003 and Pr(less than or equal to 20 men in an SRS)=0.10. In stratified sampling, we can sample 25 men and 25 women.
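
The two probabilities quoted here can be checked directly, since the number of men in an SRS drawn without replacement follows a hypergeometric distribution. The sketch below uses only the Python standard library; the function name `prob_at_most` is an illustrative choice.

```python
from math import comb


def prob_at_most(k, N, M, n):
    """P(at most k of the n sampled units come from a group of size M), SRS from N units."""
    favourable = sum(comb(M, j) * comb(N - M, n - j) for j in range(0, k + 1))
    return favourable / comb(N, n)


# Population of 500 with 250 men; SRS of size 50.
p15 = prob_at_most(15, N=500, M=250, n=50)
p20 = prob_at_most(20, N=500, M=250, n=50)
print(p15, p20)
```

The printed values can be compared with the rounded figures on the slide.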

Why do we use stratified sampling? Stratified sampling allows us to compare subgroups. It can be convenient, reduce cost, and be easy to carry out. It is often more precise; see the following example.

Total number of farm acres (3078 counties). SRS of 300 counties from the Census of Agriculture. Estimate: , standard error: . Stratified sampling: about 10% of the counties in each stratum (region).

Total number of farm acres (3078 counties) Estimate: Standard error:

Theory of stratified sampling

Notation for Stratification: Population

Notation for Stratification: Sample

Stratified sampling: estimation

Statistical Properties: Bias and Variance

Variance Estimates for stratified samples

Confidence intervals for stratified samples. Some books use a t distribution with n-H degrees of freedom.
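
The estimation formulas on the preceding slides are not preserved in the transcript. The sketch below assumes the standard stratified estimators: the total is estimated by $\hat{t}_{str} = \sum_h N_h \bar{y}_h$, its variance by $\sum_h (1 - n_h/N_h)\, N_h^2 s_h^2 / n_h$, and an approximate 95% confidence interval uses the normal critical value 1.96 (a t critical value with n-H degrees of freedom could be substituted). The strata sizes and sample values are invented.

```python
import math
from statistics import mean, variance


def stratified_total(strata, z=1.96):
    """strata: list of (N_h, sample values). Returns (t_hat, SE, confidence interval)."""
    t_hat = 0.0
    var_hat = 0.0
    for N_h, sample in strata:
        n_h = len(sample)
        t_hat += N_h * mean(sample)                                    # stratum total estimate
        var_hat += (1 - n_h / N_h) * N_h ** 2 * variance(sample) / n_h # stratum variance term
    se = math.sqrt(var_hat)
    return t_hat, se, (t_hat - z * se, t_hat + z * se)


# Hypothetical design with H = 2 strata.
design = [(100, [1, 2, 3]), (50, [10, 10, 10])]
t_str, se_str, ci_str = stratified_total(design)
print(t_str, se_str, ci_str)
```

The second stratum contributes nothing to the variance because its sampled values are identical, which illustrates why homogeneous strata make stratified estimates precise.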

Sampling probabilities and weights. Consider a population with 1600 men and 400 women, where the stratified sample design specifies sampling 200 men and 200 women. Each man in the sample has weight 8 and each woman has weight 2. Each woman in the sample represents herself and 1 other woman not selected. Each man represents himself and 7 other men not in the sample.

Sampling probabilities and weights. The sampling probability for the jth unit in the hth stratum is $\pi_{hj} = n_h/N_h$. Sampling weight: $w_{hj} = 1/\pi_{hj} = N_h/n_h$. The sum of the sampling weights over the sample is N.
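
The men and women example can be verified numerically: each weight is the stratum size divided by the stratum sample size, and the weights summed over all sampled units recover the population size. The dictionary layout and the function name `sampling_weights` are illustrative assumptions.

```python
def sampling_weights(pop_sizes, sample_sizes):
    """Weight per sampled unit in each stratum: N_h / n_h."""
    return {h: pop_sizes[h] / sample_sizes[h] for h in pop_sizes}


pop = {"men": 1600, "women": 400}    # stratum sizes N_h from the example
samp = {"men": 200, "women": 200}    # stratum sample sizes n_h from the example

weights = sampling_weights(pop, samp)
# Every sampled unit in stratum h carries weight N_h / n_h, so the total is sum of n_h * w_h.
total_weight = sum(samp[h] * weights[h] for h in samp)
print(weights, total_weight)
```

The total weight equals 1600 + 400 = 2000, the population size.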

Sampling probabilities and weights

Sampling probabilities and weights example

Sampling probabilities and weights in proportional allocation. In proportional allocation, the number of sampled units in each stratum is proportional to the size of the stratum, i.e., $n_h/N_h = n/N$ for every stratum h. Every unit in the sample has the same weight, $N/n$, and represents the same number of units in the population. The sample is called self-weighting.

Sampling probabilities and weights in proportional allocation. The sampling probability for all units is about 10%. All the weights are the same: 10.
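
A small sketch of proportional allocation, assuming made-up stratum sizes (the strata of the agriculture example are not listed in this transcript). The hypothetical function `proportional_allocation` rounds each stratum sample size to the nearest integer, so in general the rounded sizes may not add up exactly to n and the weights are then only approximately equal.

```python
def proportional_allocation(pop_sizes, n):
    """Allocate a total sample of size n across strata in proportion to N_h."""
    N = sum(pop_sizes.values())
    return {h: round(n * N_h / N) for h, N_h in pop_sizes.items()}


# Hypothetical strata sizes; a 10% sample overall.
pop_sizes = {"A": 400, "B": 1000, "C": 600}
alloc = proportional_allocation(pop_sizes, n=200)
alloc_weights = {h: pop_sizes[h] / alloc[h] for h in pop_sizes}
print(alloc, alloc_weights)
```

With these invented sizes every stratum is sampled at exactly 10%, so every weight is 10 and the sample is self-weighting.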