1
STAT262: Lecture 5 (Ratio estimation)
2
Introduction Today we introduce new estimation methods
Ratio estimators. Ratio estimators involve two characteristics. Y: a characteristic we are interested in. X: a characteristic that is related to Y. The sampling method is the same: simple random sampling (SRS).
3
Why consider ratio estimation?
In many situations we want to estimate the ratio of two population characteristics, e.g., the average yield of corn per acre. A ratio can assist the estimation of a correlated quantity. When N is unknown, a ratio can be used to estimate it. Ratio estimators can increase the precision of estimated means or totals.
4
Examples E.g. 1: The average yield of corn per acre.
E.g. 2: The number of hummingbirds in a national forest. Sample a few regions and record the number ($y_i$) and area ($x_i$) for each region. Calculate the sample ratio $\hat{B} = \bar{y}/\bar{x}$. The total area of the national forest is $t_x$. The estimate of $t_y$ is $\hat{t}_{yr} = \hat{B}\, t_x$.
5
Examples E.g. 3: Laplace wanted to know the number of persons living in France in a year with no census. There were two candidate estimators. Which was Laplace's choice? Sample (30 counties): 2,037,615 persons and 71,866 registered births. France: N is known, the total number of persons $t_y$ is unknown, and the total number of registered births $t_x$ is known.
6
Examples Laplace reasoned that the ratio estimator is more accurate.
Large counties have more registered births. Number of registered births and number of persons are positively correlated. Thus, using information in x is likely to improve our estimate of y.
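Laplace's reasoning can be sketched numerically. The sample figures below are the ones quoted on the slide; the value of `t_x_hypothetical` is purely an assumption for illustration, since the slide only says that the number of registered births in France is known. The function name `ratio_estimate_total` is likewise hypothetical.

```python
# Sample figures quoted on the slide (30 sampled counties).
sample_persons = 2_037_615   # sum of y_i over the sample
sample_births = 71_866       # sum of x_i over the sample

# Sample ratio: persons per registered birth.
b_hat = sample_persons / sample_births


def ratio_estimate_total(b_hat: float, t_x: float) -> float:
    """Ratio estimator of a population total: t_y_hat = B_hat * t_x."""
    return b_hat * t_x


# HYPOTHETICAL value: the slide states t_x is known but does not give it.
t_x_hypothetical = 1_000_000
t_y_hat = ratio_estimate_total(b_hat, t_x_hypothetical)

print(f"B_hat = {b_hat:.2f} persons per registered birth")
print(f"t_y_hat = {t_y_hat:,.0f} persons (for the hypothetical t_x)")
```

The other candidate, N times the sample mean of persons per county, ignores the birth counts entirely, which is why it is expected to be less accurate when x and y are strongly correlated.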
7
Examples E.g. 4: McDonald's Corp. The average annual sales for this year.
One can use information from last year. Details will be discussed later.
8
Ratio estimators in SRS
Sampling method: SRS. Two quantities ($x_i$, $y_i$) are measured on each sampled unit, where $x_i$ is an auxiliary variable.
9
Population quantities
Size: N Totals: Means: Ratio: Variances and covariance: Correlation coefficient:
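In the usual survey-sampling notation (assumed here; the symbols follow common textbook usage rather than anything shown in the transcript), these population quantities can be written as:

```latex
\begin{align*}
t_y &= \sum_{i=1}^{N} y_i, &
t_x &= \sum_{i=1}^{N} x_i, \\
\bar{y}_U &= \frac{t_y}{N}, &
\bar{x}_U &= \frac{t_x}{N}, \\
B &= \frac{t_y}{t_x} = \frac{\bar{y}_U}{\bar{x}_U}, \\
S_y^2 &= \frac{1}{N-1}\sum_{i=1}^{N} (y_i - \bar{y}_U)^2, &
S_x^2 &= \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x}_U)^2, \\
S_{xy} &= \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x}_U)(y_i - \bar{y}_U), &
R &= \frac{S_{xy}}{S_x S_y}.
\end{align*}
```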
10
Example of population quantities
11
Contents: Ratio estimator in SRS; Bias – the exact expression; Bias – an approximate formula; MSE and variance; Examples; Efficiency
12
Bias – the exact expression
Ratio estimators are usually biased
13
Bias – the exact expression
14
Bias – the exact expression
Exact, but not easy to use with data.
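One way to arrive at an exact expression is sketched below. It assumes the notation $\hat{B} = \bar{y}/\bar{x}$ for the sample ratio and $\hat{\bar{y}}_r = \hat{B}\,\bar{x}_U$ for the ratio estimator of the mean, and uses only the fact that the sample means are unbiased under SRS.

```latex
\begin{align*}
\bar{y}_U = E[\bar{y}] = E[\hat{B}\,\bar{x}]
  &= \operatorname{Cov}(\hat{B}, \bar{x}) + E[\hat{B}]\,E[\bar{x}]
   = \operatorname{Cov}(\hat{B}, \bar{x}) + E[\hat{B}]\,\bar{x}_U \\
\Longrightarrow\quad
\operatorname{Bias}(\hat{\bar{y}}_r)
  &= E[\hat{B}]\,\bar{x}_U - \bar{y}_U
   = -\operatorname{Cov}(\hat{B}, \bar{x}), \\
\operatorname{Bias}(\hat{B})
  &= E[\hat{B}] - B
   = -\frac{\operatorname{Cov}(\hat{B}, \bar{x})}{\bar{x}_U}.
\end{align*}
```

The covariance involves the sampling distribution of $\hat{B}$, which is why the expression is exact but hard to evaluate from a single sample.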
15
Bias – an approximation
16
Bias – an approximation
17
The bias is usually small if
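A commonly quoted first-order approximation to the bias, given here under the assumption that the notation matches the standard textbook one ($B$, $R$, $S_x$, $S_y$ as population ratio, correlation, and standard deviations), makes the conditions easy to read off:

```latex
\begin{equation*}
\operatorname{Bias}(\hat{\bar{y}}_r)
  \approx \left(1 - \frac{n}{N}\right) \frac{1}{n\,\bar{x}_U}
          \left( B S_x^2 - R S_x S_y \right)
\end{equation*}
```

Under this approximation the bias shrinks when the sample size $n$ is large, when the sampling fraction $n/N$ is close to 1, when $\bar{x}_U$ is large, when $S_x$ is small, or when $B S_x \approx R S_y$, which is the case when the points lie close to a straight line through the origin.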
18
Variance and Mean Squared Error
The bias is usually small and can therefore be ignored, so the MSE is approximately equal to the variance.
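For reference, the standard large-sample approximation (an assumption about notation, with $B$, $R$, $S_x$, $S_y$ denoting the population ratio, correlation, and standard deviations) is:

```latex
\begin{equation*}
\operatorname{MSE}(\hat{\bar{y}}_r) \approx V(\hat{\bar{y}}_r)
  \approx \left(1 - \frac{n}{N}\right) \frac{1}{n}
          \left( S_y^2 - 2 B R S_x S_y + B^2 S_x^2 \right)
\end{equation*}
```

The term in parentheses is the population variance of the residuals $y_i - B x_i$, so the ratio estimator is precise when those residuals are small.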
19
A hypothetical example
Population. N=8
20
A hypothetical example
21
A hypothetical example
Mean estimate = Bias = Bias approx: Mean estimate = 40
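With a population this small, the bias can be computed exactly by enumerating every possible SRS. The sketch below does that for a HYPOTHETICAL population of N = 8 units with sample size n = 4; neither the data values nor the sample size are taken from the slide. It also checks numerically that the exact bias equals minus the covariance between the sample ratio and the sample mean of x.

```python
from itertools import combinations
from statistics import mean

# HYPOTHETICAL population of N = 8 units (not the slide's data).
x = [4, 5, 5, 6, 8, 7, 7, 5]
y = [1, 2, 4, 4, 7, 7, 7, 8]
N, n = len(x), 4  # n = 4 is also an assumption

x_bar_U = mean(x)
y_bar_U = mean(y)

b_hats, x_bars = [], []
for idx in combinations(range(N), n):
    xs = [x[i] for i in idx]
    ys = [y[i] for i in idx]
    b_hats.append(sum(ys) / sum(xs))  # sample ratio B_hat
    x_bars.append(mean(xs))           # sample mean of x

# Under SRS every sample is equally likely, so expectations are plain averages.
ratio_estimates = [b * x_bar_U for b in b_hats]
exact_bias = mean(ratio_estimates) - y_bar_U

# Covariance of B_hat and x_bar over the sampling distribution.
e_b, e_x = mean(b_hats), mean(x_bars)
cov_b_x = mean((b - e_b) * (xb - e_x) for b, xb in zip(b_hats, x_bars))

print(f"number of samples: {len(b_hats)}")
print(f"exact bias:        {exact_bias:.6f}")
print(f"-Cov(B_hat, xbar): {-cov_b_x:.6f}")
```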
22
Estimate variance
23
Estimate variance
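A minimal sketch of how the variance might be estimated from a single sample, assuming the common residual-based form: the finite population correction times $(\bar{x}_U/\bar{x})^2 s_e^2 / n$, where $s_e^2$ is the sample variance of the residuals $e_i = y_i - \hat{B} x_i$. The function name and the data are hypothetical.

```python
import math


def ratio_mean_and_se(x, y, x_bar_U, N):
    """Ratio estimate of the population mean of y and its estimated SE under SRS.

    x, y    : sampled values (same length n)
    x_bar_U : known population mean of x
    N       : population size
    """
    n = len(x)
    x_bar = sum(x) / n
    b_hat = sum(y) / sum(x)
    # Residuals about the line through the origin with slope B_hat.
    residuals = [yi - b_hat * xi for xi, yi in zip(x, y)]
    s2_e = sum(e * e for e in residuals) / (n - 1)
    fpc = 1 - n / N  # finite population correction
    var_hat = fpc * (x_bar_U / x_bar) ** 2 * s2_e / n
    return b_hat * x_bar_U, math.sqrt(var_hat)


# HYPOTHETICAL sample, for illustration only.
x_sample = [12.0, 15.0, 9.0, 20.0, 14.0]
y_sample = [25.0, 31.0, 17.0, 42.0, 27.0]
estimate, se = ratio_mean_and_se(x_sample, y_sample, x_bar_U=13.5, N=200)
print(f"ratio estimate of the mean: {estimate:.3f}, SE: {se:.3f}")
```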
24
Example 1
25
Example 1
26
Example 2
27
Example 2
28
Example 2
29
Efficiency of ratio estimation
30
STAT262: Lecture 6 (Regression estimation)
31
Regression estimation
Ratio estimation works well if the data are well fit by a straight line through the origin. Often, data are scattered around a straight line that does not go through the origin.
32
Regression estimation
The regression estimator of the population mean is
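In standard notation (assumed here, with $\hat{B}_0$ and $\hat{B}_1$ the ordinary least squares intercept and slope fitted to the sample), the regression estimator can be written as:

```latex
\begin{align*}
\hat{\bar{y}}_{\mathrm{reg}}
  &= \hat{B}_0 + \hat{B}_1 \bar{x}_U
   = \bar{y} + \hat{B}_1 (\bar{x}_U - \bar{x}), \\
\hat{B}_1
  &= \frac{\sum_{i \in \mathcal{S}} (x_i - \bar{x})(y_i - \bar{y})}
          {\sum_{i \in \mathcal{S}} (x_i - \bar{x})^2},
\qquad
\hat{B}_0 = \bar{y} - \hat{B}_1 \bar{x}.
\end{align*}
```

The second form shows the estimator as the sample mean adjusted by how far the sample mean of x falls from its known population mean.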
33
Bias For a large SRS, the bias is usually small.
34
Variance and MSE Bias is small
35
Variance
36
Standard error
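A minimal sketch of the regression estimator and one common form of its standard error, $\sqrt{(1 - n/N)\, s_e^2 / n}$, where $s_e^2$ is computed from the least squares residuals. The divisor $n - 2$ used for $s_e^2$ is one convention among several and is an assumption here, as are the function name and the data.

```python
import math


def regression_mean_and_se(x, y, x_bar_U, N):
    """Regression estimate of the population mean of y and its estimated SE under SRS."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx             # least squares slope
    b0 = y_bar - b1 * x_bar    # least squares intercept
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Assumption: residual variance with divisor n - 2.
    s2_e = sum(e * e for e in residuals) / (n - 2)
    se = math.sqrt((1 - n / N) * s2_e / n)
    return b0 + b1 * x_bar_U, se


# HYPOTHETICAL sample, for illustration only.
x_sample = [10.0, 12.0, 15.0, 18.0, 20.0, 23.0]
y_sample = [31.0, 34.0, 41.0, 45.0, 52.0, 55.0]
estimate, se = regression_mean_and_se(x_sample, y_sample, x_bar_U=17.0, N=150)
print(f"regression estimate of the mean: {estimate:.3f}, SE: {se:.3f}")
```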
37
The McDonald Example
38
The McDonald Example
39
Relative Efficiencies
40
Relative Efficiencies
41
Relative Efficiencies
42
Relative Efficiencies
43
Summary We introduced two new estimators:
The ratio estimator and the regression estimator. Both exploit the association between x and y. The regression estimator is the most efficient (asymptotically). The ratio estimator is more efficient than the SRS estimator if the correlation R between x and y is large.
44
Estimation in Domains: A motivating example
We are often interested in separate estimates for subpopulations (also called domains). E.g., after taking an SRS of 1000 persons, we want to estimate the average salary for men and the average salary for women.
45
Estimation in Domains: A motivating example
46
Estimation in Domains: A motivating example
The calculation on the previous slide treats the domain sample size as a constant, but it is not: it varies from sample to sample. We should take this randomness into consideration. The formulas we derived for ratio estimators can be used.
47
Estimations in Domains
48
Estimations in Domains
49
Estimations in Domains
If the sample is large
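The link to ratio estimation can be sketched as follows. Define an indicator $x_i = 1$ if unit $i$ is in the domain and $u_i = y_i x_i$; the domain mean is then the ratio $\bar{u}/\bar{x}$, and the residual-based standard error for a ratio applies. The function name and the salary figures are hypothetical.

```python
import math


def domain_mean_and_se(y, in_domain, N):
    """Domain mean from an SRS, treated as a ratio estimate.

    y         : sampled values for all n units
    in_domain : booleans, True if the unit belongs to the domain
    N         : population size
    """
    n = len(y)
    x = [1 if d else 0 for d in in_domain]    # domain indicator
    u = [yi * xi for yi, xi in zip(y, x)]     # y inside the domain, 0 outside
    n_d = sum(x)                              # random: changes from sample to sample
    y_bar_d = sum(u) / n_d                    # ratio u_bar / x_bar
    x_bar = n_d / n
    residuals = [ui - y_bar_d * xi for ui, xi in zip(u, x)]
    s2_e = sum(e * e for e in residuals) / (n - 1)
    se = math.sqrt((1 - n / N) * s2_e / n) / x_bar
    return y_bar_d, se


# HYPOTHETICAL salaries (in thousands) and domain membership.
salaries = [52.0, 61.0, 48.0, 75.0, 58.0, 66.0, 43.0, 70.0]
is_woman = [True, False, True, False, True, False, True, False]
mean_w, se_w = domain_mean_and_se(salaries, is_woman, N=10_000)
print(f"estimated mean salary for women: {mean_w:.2f}, SE: {se_w:.2f}")
```

When the sample is large, this standard error is close to the one obtained by treating the domain units as an SRS of size $n_d$ from the domain.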
50
A new sampling method: stratified sampling
In stratified sampling, we conduct an SRS in each stratum. Outline: definition and motivation; statistical inference (theory of stratified sampling); advantages of stratified sampling; sample size calculation.
51
Stratified sampling: definition and motivation
A motivating example: the average number of words in saved messages of people in this room. What is stratified sampling? Stratify: make layers. Strata: subpopulations. Strata do not overlap. Each sampling unit belongs to exactly one stratum. Strata constitute the whole population.
52
Why do we use stratified sampling?
Be protected from obtaining a really bad sample. Example: the population size is N=500 (250 women and 250 men), and we take an SRS of size n=50. It is possible to obtain a sample with no men or only a few men: Pr(15 or fewer men in an SRS) = 0.003 and Pr(20 or fewer men in an SRS) = 0.10. In stratified sampling, we can sample 25 men and 25 women.
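The two probabilities can be checked with the hypergeometric distribution, since the number of men in an SRS drawn without replacement from 250 men and 250 women is hypergeometric. The sketch below uses only the standard library; the function name is hypothetical.

```python
from math import comb


def prob_at_most(k_max, n_men=250, n_women=250, n=50):
    """P(number of men in an SRS of size n is <= k_max), hypergeometric."""
    total = comb(n_men + n_women, n)
    favourable = sum(
        comb(n_men, k) * comb(n_women, n - k) for k in range(k_max + 1)
    )
    return favourable / total


p15 = prob_at_most(15)
p20 = prob_at_most(20)
print(f"Pr(at most 15 men) = {p15:.4f}")
print(f"Pr(at most 20 men) = {p20:.4f}")
```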
53
Why do we use stratified sampling?
Stratified sampling allows us to compare subgroups. It can be convenient, reduce cost, and be easier to sample. It can be more precise; see the following example.
54
Total number of farm acres (3078 counties)
An SRS of 300 counties from the Census of Agriculture. Estimate: , standard error: . Stratified sampling: sample about 10% of the counties in each stratum (region).
55
Total number of farm acres (3078 counties)
Estimate: Standard error:
56
Theory of stratified sampling
57
Notation for Stratification: Population
58
Notation for Stratification: Sample
59
Stratified sampling: estimation
60
Statistical Properties: Bias and Variance
61
Variance Estimates for stratified samples
62
Confidence intervals for stratified samples
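A minimal sketch of the stratified estimator of a total, its estimated standard error, and a normal-approximation confidence interval. It assumes the standard forms $\hat{t}_{str} = \sum_h N_h \bar{y}_h$ and $\hat{V}(\hat{t}_{str}) = \sum_h (1 - n_h/N_h)\, N_h^2\, s_h^2 / n_h$; the function name, stratum labels, and data are hypothetical.

```python
import math
from statistics import mean, variance


def stratified_total(samples, stratum_sizes):
    """Stratified estimate of the population total and its estimated SE.

    samples       : dict mapping stratum label -> list of sampled y values
    stratum_sizes : dict mapping stratum label -> population size N_h
    """
    t_hat = 0.0
    var_hat = 0.0
    for h, ys in samples.items():
        N_h, n_h = stratum_sizes[h], len(ys)
        t_hat += N_h * mean(ys)
        # variance() is the sample variance with divisor n_h - 1.
        var_hat += (1 - n_h / N_h) * N_h ** 2 * variance(ys) / n_h
    return t_hat, math.sqrt(var_hat)


# HYPOTHETICAL strata and data, for illustration only.
sizes = {"north": 120, "south": 300}
data = {"north": [14.0, 18.0, 11.0, 16.0], "south": [30.0, 27.0, 35.0, 29.0, 33.0]}
t_hat, se = stratified_total(data, sizes)
z = 1.96  # normal quantile for an approximate 95% interval
print(f"estimated total: {t_hat:.1f}, SE: {se:.1f}")
print(f"approximate 95% CI: ({t_hat - z * se:.1f}, {t_hat + z * se:.1f})")
```

Replacing `z` with a t quantile on n - H degrees of freedom gives the alternative interval mentioned on the slide.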
Some books use a t distribution with n-H degrees of freedom.
63
Sampling probabilities and weights
Consider a population with 1600 men and 400 women, where the stratified sample design specifies sampling 200 men and 200 women. Each man in the sample has weight 8 and each woman has weight 2. Each woman in the sample represents herself and 1 other woman not selected. Each man represents himself and 7 other men not in the sample.
64
Sampling probabilities and weights
The sampling probability for the jth unit in the hth stratum is $\pi_{hj} = n_h/N_h$. Sampling weight: $w_{hj} = 1/\pi_{hj} = N_h/n_h$. The sum of the sampling weights over the sample is N.
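The example with 1600 men and 400 women can be worked through in a few lines, assuming that the sampling probability in a stratum is the stratum sample size divided by the stratum population size and that the weight is its reciprocal. The dictionary layout is hypothetical.

```python
# Stratum population sizes N_h and sample sizes n_h from the example.
strata = {
    "men": {"N_h": 1600, "n_h": 200},
    "women": {"N_h": 400, "n_h": 200},
}

# Sampling probability n_h / N_h and weight N_h / n_h for each stratum.
probs = {h: s["n_h"] / s["N_h"] for h, s in strata.items()}
weights = {h: s["N_h"] / s["n_h"] for h, s in strata.items()}

# Each of the n_h sampled units in stratum h carries the stratum weight.
weight_sum = sum(weights[h] * s["n_h"] for h, s in strata.items())

for h in strata:
    print(f"{h}: sampling probability {probs[h]:.3f}, weight {weights[h]:.0f}")
print(f"sum of weights over the sample: {weight_sum:.0f}")
```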
65
Sampling probabilities and weights
66
Sampling probabilities and weights
example
67
Sampling probabilities and weights in proportional allocation
In proportional allocation, the number of sampled units in each stratum is proportional to the size of the stratum, i.e., $n_h/N_h = n/N$ for every stratum. Every unit in the sample has the same weight and represents the same number of units in the population. The sample is called self-weighting.
68
Sampling probabilities and weights in proportional allocation
The sampling probability for all units is about 10%. All the weights are the same: 10.
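Proportional allocation with a 10% sampling fraction can be sketched as below. The stratum sizes are HYPOTHETICAL and chosen so that the allocation comes out in whole numbers; with real stratum sizes, rounding means the allocated sizes may not add up exactly to n and the weights are only approximately equal.

```python
def proportional_allocation(stratum_sizes, n):
    """Allocate a total sample size n proportionally to the stratum sizes N_h."""
    N = sum(stratum_sizes.values())
    return {h: round(n * N_h / N) for h, N_h in stratum_sizes.items()}


# HYPOTHETICAL stratum sizes, for illustration only.
sizes = {"A": 1000, "B": 600, "C": 400}
allocation = proportional_allocation(sizes, n=200)
weights = {h: sizes[h] / allocation[h] for h in sizes}

for h in sizes:
    print(f"stratum {h}: N_h = {sizes[h]}, n_h = {allocation[h]}, weight = {weights[h]:.1f}")
```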