Published by Danielle Desroches. Modified over 5 years ago.
1
Use of Auxiliary Information
12-13 October 2015 (morning)
Jorge M. Mendes
CONTRACTOR IS ACTING UNDER A FRAMEWORK CONTRACT CONCLUDED WITH THE EUROPEAN COMMISSION
2
Use of Auxiliary Information Outline
1. Introduction
2. Why and when to use auxiliary information
3. Use of auxiliary information at the design stage
3.1 Qualitative information
3.2 Quantitative information
4. Use of auxiliary information at the estimation stage
4.1 Post-stratification
4.2 Ratio estimation
4.3 Regression estimation
4.4 Other calibration methods
3
1. Introduction
4
Conceptual frame Several methods can be used in order to know more about a certain population, such as: surveys, censuses, experiments, observations, etc. A survey is a planned process of collecting and extrapolating information obtained from a sample. We are inferring about a parameter or a set of parameters using just a fraction of the population (sample).
5
Conceptual frame
A survey is essentially composed of two tasks: to observe a fraction of a population (sample); to extrapolate the results to the whole population.
[Diagram: a sample is extracted from the population; the sample results (sample statistics), together with auxiliary information, give estimates of the unknown population parameters.]
6
Some basic terms used in sampling theory Sampling frames
It is a list, map or other registry where the sampling units are registered. Ideally, the list should be exhaustive and without duplications. It is the list of all elements in the study population that have a chance of being selected into the survey sample. Ex: File of statistical units (companies) in a National Statistical Office.
7
Some basic terms used in sampling theory
Analysis (statistical) units: elements or groups of elements of the study population holding the information we want to know about.
Sampling units: elements of the study population, or other groups of elements, that are selected for the sample (they may not exactly match the analysis units).
8
Introduction
[Diagram: target population and sampling frame]
9
Some basic terms used in sampling theory Sample
Subset of elements of a study population. In a survey, a sample is selected from which we collect data that is extrapolated to the whole population. s={1, ..., n}
10
Introduction
[Diagram: target population, sampling frame and sample]
11
Some basic terms used in sampling theory
Sampling: the selection of the units from a population that will constitute the sample.
Sampling design: the procedure describing the way sampling units are to be selected.
12
Some basic terms used in sampling theory Representative sample
This expression denotes a sample that allows inference to a population with an accuracy considered adequate to the study objectives.
13
Survey It is a systematic method for gathering information from entities for the purpose of constructing quantitative descriptors of the attributes of the larger population. However… sometimes surveys attempt to measure everyone in a population and sometimes just a subset of the population.
14
Sample survey A survey in which we collect data on a subset of the study population. It is used to infer about a parameter or set of parameters of a certain population using information obtained from a limited subset of population elements (sample). It amounts to obtaining an approximate image of a population, using only a subset of it.
15
Introduction Auxiliary information:
It is information available about the target population beyond what is observed in the sample. Auxiliary information recorded for the population elements can be used: - To design the sampling scheme; - To assist the estimation of the study variable(s). The deeper the knowledge about a population, the better the sample survey will be…
16
Introduction Auxiliary information can be classified into two categories: 1 – Knowledge of population totals, or means, of characteristics that are observed on the elements of the sample but not on all elements of the population; Example: The consumption expenditure distribution of the population of a European country may be treated as known on the basis of a recent census, but the consumption expenditure of the people in a sample of households is not known until the households are contacted. The consumption expenditure of nonsampled people is unknown.
17
Introduction Auxiliary information can be classified into two categories: 2 – Knowledge of characteristics for every element in the population. Example: The geographic location of the households on an address list. Remark: Auxiliary information can be available for every sampling unit or only at the aggregated population level.
19
Introduction Where can auxiliary information be obtained?
- Sampling frame;
- Censuses;
- Administrative records;
- Previous surveys;
- …
Examples?
20
Introduction So, we assume that one (X) or a set (X1, X2, …, Xp) of auxiliary variables is available: either for each of the N population elements, so that the values x1, x2, …, xN are at our disposal; or at the population level, so that some population parameters of the auxiliary variables (such as totals or means) are known. Remark: Some of the auxiliary variables might be categorical and some quantitative.
21
2. Why and When use auxiliary information
22
When and Why Why do we use auxiliary information?
We use auxiliary information in order to:
- Attain a manageable and efficient sampling scheme. For example, using auxiliary information to carry out stratification can lead to small within-stratum variation.
- Improve the efficiency of estimators. For example, calibration methods can produce estimates that are close to the corresponding population values, decreasing the design variances of the estimators.
23
When and Why When do we use auxiliary information? A priori…
When we have auxiliary information at our disposal, we seek to use it a priori, that is, to design the sampling scheme. This is the case, for example, of stratified sampling and cluster sampling. A posteriori… Calibration methods use auxiliary information a posteriori, that is, at the estimation stage, to improve the quality of the estimates.
24
3. Use of auxiliary information at the design stage
25
3.1. Qualitative information
3.1.1 Stratified random sampling 3.1.2 Cluster sampling
26
Stratified random sampling Introduction
The main aspects of a stratified sampling are: To divide the population into non-overlapping subpopulations (strata); To select random samples independently from each stratum; To estimate population parameters, using the estimates for each subpopulation (stratum).
27
Stratified random sampling Introduction
The population U is partitioned into H subpopulations (strata) U1, U2, …, Uh, …, UH, from which independent samples s1, s2, …, sh, …, sH are drawn.
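The three steps of stratified sampling (partition, independent samples, combination of stratum estimates) can be sketched as follows. This is a minimal illustration, assuming the usual stratified estimator of the mean (the sum of the stratum sample means weighted by N_h / N); the function name `stratified_mean` and the data are hypothetical.

```python
from statistics import mean


def stratified_mean(stratum_sizes, stratum_samples):
    """Estimate the population mean as the sum over strata of (N_h / N) * ybar_h.

    stratum_sizes:   dict mapping stratum label -> population size N_h
    stratum_samples: dict mapping stratum label -> list of sampled y values
    """
    total_size = sum(stratum_sizes.values())
    estimate = 0.0
    for label, sample in stratum_samples.items():
        weight = stratum_sizes[label] / total_size  # W_h = N_h / N
        estimate += weight * mean(sample)           # W_h * ybar_h
    return estimate


# Hypothetical example: two strata sampled independently.
sizes = {"U1": 60, "U2": 40}
samples = {"U1": [10.0, 12.0, 14.0], "U2": [20.0, 22.0]}
print(stratified_mean(sizes, samples))
```

Each stratum contributes in proportion to its known population size, which is where the auxiliary information enters.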
28
Stratified random sampling Introduction
29
Stratified random sampling Correlation ratio
Let X be a nominal variable with H categories defining a partition of U into H strata. The square of the correlation ratio, $\eta^2$, between X and Y is defined by:
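The defining formula is not preserved on this slide. Under the usual definition, the squared correlation ratio is the between-strata sum of squares of Y divided by the total sum of squares of Y. The sketch below assumes that definition; the function name and the data are hypothetical.

```python
def correlation_ratio_squared(groups):
    """Squared correlation ratio of Y on a nominal variable X.

    groups: dict mapping each category of X (stratum) -> list of Y values.
    Returns between-strata sum of squares / total sum of squares.
    """
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = sum(all_values) / len(all_values)
    between = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
        for ys in groups.values()
    )
    total = sum((y - grand_mean) ** 2 for y in all_values)
    return between / total


# Hypothetical population split into two strata by X.
population = {"a": [1, 2, 3], "b": [7, 8, 9]}
print(correlation_ratio_squared(population))
```

A value close to 1 means that most of the variability of Y lies between strata, which is the situation in which stratifying on X pays off.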
30
Stratified random sampling Use of auxiliary information
To obtain an efficient stratified sample, it is very important to choose carefully the characteristics used for subdividing the population into strata. For example, one could use NUTSII, NUTSIII, age groups, income groups, activity sector, …
31
Stratified random sampling Use of auxiliary information
What kind of variables should be used for stratification purposes? Qualitative variables or discretised quantitative variables:
- whose distribution in the population is known;
- that are associated with the research variables (this allows for grouping observations into homogeneous strata);
- that define groups of interest for the study (subpopulations we want to make inference about).
32
Stratified random sampling Use of auxiliary information
Which stratification variables should be used? Demographic or socio-economic variables such as habitat type, gender, age groups, income groups etc., are often used to stratify target populations consisting of individuals. Economic variables like economic sector, sales amounts, number of employees, type of corporation, etc. are used to stratify target populations consisting of companies. Geographical variables such as country, NUTSII, NUTSIII, region, county, etc., are used to stratify both populations of individuals and companies.
33
Stratified random sampling Use of auxiliary information
So, to carry out stratification, appropriate auxiliary information is required in the sampling frame: Geographical variables (country, NUTSII, …); Demographic variables (age group, marital status, …); Socioeconomic variables (income group, activity sector, …). Remark: Information for the stratification can sometimes be inherent in the population. Example: Strata may be clearly identified if a country is divided into regional administrative areas that are non-overlapping.
34
3.1. Qualitative information
3.1.1 Stratified random sampling 3.1.2 Cluster sampling
35
Cluster sampling Introduction
It is uneconomic to visit a sample of elements drawn from a large geographical area. Cluster sampling reduces the cost of data collection: sample enumeration areas and households within them; sample schools and children within them. Cluster sampling is also useful when the sampling frame lists clusters but not units: frame of addresses but not individuals; frame of firms but not firm workers.
36
Cluster sampling Introduction
What are the reasons for using cluster sampling? Absence or deficiencies of the sampling frame; kind of data collected; cost considerations; feasibility considerations.
37
Cluster sampling Introduction
Suppose the population U is partitioned into M clusters denoted by Ug, of size Ng, g=1, …, M. yig: value of the ith unit in cluster g (i=1, …, Ng). Size of U: $N = \sum_{g=1}^{M} N_g$
38
Cluster sampling Introduction
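[Diagram: the population U is partitioned into clusters U1, U2, …, Ug, …, UM, from which a sample of m clusters s1, s2, …, sg, …, sm is drawn.]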
39
Cluster sampling Introduction
40
Cluster sampling Sampling with equal probability
Examples: - Lot control - Medical studies - Passenger surveys - Ecological studies - Market research - Socio-economic studies - Polling surveys
41
Cluster sampling Sampling with equal probability
Special case: clusters of equal size
42
Cluster sampling Sampling with equal probability
Design Effect Special case: clusters of equal size
43
Cluster sampling Sampling with equal probability
Favorable conditions for cluster sampling:
• heterogeneous clusters ( small).
• clusters with small (and similar) sizes.
• select as many clusters as possible.
• g should have low dispersion.
• previous stratification of clusters.
44
Cluster sampling Use of auxiliary information
The practical aspects of sampling and data collection are the main motivation for the use of auxiliary information in cluster sampling. What kind of auxiliary information is needed? Auxiliary information related to the grouping of the population elements into clusters; The properties of the clusters (if stratification is possible): - Previous stratification of clusters; - Stratification of secondary units within clusters.
45
3.2. Quantitative information
3.2.1 Stratified random sampling 3.2.2 Unequal probability sampling 3.2.3 Cluster sampling
46
Stratified random sampling Introduction
To fully benefit from the gains in efficiency of stratified sampling, it is important not only to select the stratification variables carefully but also to allocate the total sample to the strata appropriately. So, if auxiliary information is available at the population or strata level (means, totals, variances), we should use it to compute the best allocation to the strata.
47
Stratified random sampling Introduction
Alternative allocations under a stratified random sampling: Proportional allocation; Optimum allocation; X-Optimum allocation; Allocation proportional to the Y-total; Allocation proportional to the X-total; Optimum allocation under fixed costs. Make use of auxiliary information
48
Stratified random sampling Proportional allocation
49
Stratified random sampling Optimal or Neyman allocation
If the within-stratum variances of the study variable, $S_h^2$, are known, we can define the following objective:
50
Stratified random sampling Optimal or Neyman allocation
Remark:
51
Stratified random sampling Optimal or Neyman allocation
The most common approaches are:
- Using information obtained in previous surveys on the same or similar matters to approximate the standard deviations in each subpopulation.
- Using a known variable correlated with the variable of interest to approximate the standard deviations in each subpopulation; the larger the correlation between the two variables, the closer we get to the optimum sample allocation.
- Performing a low-cost preliminary survey to obtain initial estimates of the unknown parameters.
52
Stratified random sampling Optimal or Neyman allocation
Under the Neyman allocation, the variance of the mean estimator is smaller than in proportional allocation, except if the dispersion (variance) is the same across all strata.
53
Stratified random sampling Optimum allocation under fixed costs
If the within-stratum variances of the study variable, $S_h^2$, are known, and the total cost (or budget) of the survey, C0, is fixed, we can define the following objective:
54
Stratified random sampling Optimum allocation under fixed costs
55
Stratified random sampling X-Optimal allocation
Suppose that X is an auxiliary variable, highly correlated with Y, and that the within-stratum variances of X, $S_{xh}^2$, are known. The X-optimum allocation is obtained as: Remark: If the correlation between X and Y is perfect, then this allocation is in fact optimal.
56
Stratified random sampling Allocation proportional to the Y-total
Suppose that the population and strata totals of Y are known, and that the variable of interest is always positive. The allocation proportional to the Y-total is obtained as: Remark: This allocation is the Neyman allocation if the coefficient of variation of Y is constant in all strata.
57
Stratified random sampling Allocation proportional to the X-total
Suppose that: X is an auxiliary variable, highly correlated with Y; the population and strata totals of X are known; the variable of interest is always positive. The allocation proportional to the X-total is obtained as: Remark: If X is highly correlated with Y and if the CV of X is about the same in all strata, then this allocation should not be far from optimal.
58
Stratified random sampling Example
Consider a population of N=800 elements, for which the following data are known:

Strata   U1      U2      U3      U4
Nh       100     120     200     380
Sh       9,6     6,2     3,1     1,8
h        30      20      10      5
Sxh      15,7    11,0    7,6     4,2
xh       88      58      33      16
Ch       1,44    2,25    4,00    6,25
59
Stratified random sampling Example
a) Proportional allocation:
60
Stratified random sampling Example
b) Neyman allocation:
61
Stratified random sampling Example
c) Optimum allocation under fixed costs:
62
Stratified random sampling Example
d) X-optimum allocation: N=800 ; n=40 h=1, 2, 3, 4
63
Stratified random sampling Example
e) Allocation proportional to the Y-total: N=800 ; n=40 h=1, 2, 3, 4
64
Stratified random sampling Example
f) Allocation proportional to the X-total: N=800 ; n=40 h=1, 2, 3, 4
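Allocations a), b) and d) of this example can be recomputed from the data table. The sketch below is only an illustration, assuming the standard formulas n_h = n N_h / N (proportional), n_h = n N_h S_h / Σ N_k S_k (Neyman) and the same expression with S_xh in place of S_h (X-optimum), with n=40 as stated in d). The decimal commas of the table are read as decimal points, the helper name `allocate` is hypothetical, and the results are left unrounded.

```python
# Strata data from the example table (U1..U4).
N_h = [100, 120, 200, 380]       # stratum sizes
S_h = [9.6, 6.2, 3.1, 1.8]       # within-stratum std. dev. of Y
Sx_h = [15.7, 11.0, 7.6, 4.2]    # within-stratum std. dev. of X
n = 40                           # total sample size


def allocate(n, measures):
    """Split n across strata proportionally to the given positive measures."""
    total = sum(measures)
    return [n * m / total for m in measures]


proportional = allocate(n, N_h)
neyman = allocate(n, [N * S for N, S in zip(N_h, S_h)])
x_optimum = allocate(n, [N * S for N, S in zip(N_h, Sx_h)])

for name, alloc in [("proportional", proportional),
                    ("Neyman", neyman),
                    ("X-optimum", x_optimum)]:
    print(name, [round(a, 2) for a in alloc])
```

In practice the allocations are rounded to integers that still add up to n; the rounding rule is a choice that this sketch does not make.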
65
3.2. Quantitative information
3.2.1 Stratified random sampling 3.2.2 Unequal probability sampling 3.2.3 Cluster sampling
66
Unequal probability sampling Introduction
Equal selection probabilities lead to simple estimators but are not typical of survey sampling… Most designs used in practice are unequal probability designs, because they are usually more efficient. The main objective is to use auxiliary information at the design stage of a survey to create a sampling design that increases the precision of the Horvitz-Thompson estimator.
67
Unequal probability sampling The Horvitz-Thompson estimator
The Horvitz-Thompson estimator is a general estimator for a population total, which can be used for any probability sampling plan. Let $\pi_i$ be the probability that the ith element of the population will be included in the sample, and $\pi_{ij}$ the probability that both the ith and jth elements of the population will be included in the sample.
68
Unequal probability sampling The Horvitz-Thompson estimator
The Horvitz-Thompson estimator (or $\pi$-estimator) of the population total is given by: If $\pi_i > 0$ ($i \in U$) then: Remark: i=1, …, N
69
Unequal probability sampling The Horvitz-Thompson estimator
The variance of the Horvitz-Thompson estimator is given by: where Under an unequal probability sampling with n fixed:
70
Unequal probability sampling The Horvitz-Thompson estimator
The Horvitz-Thompson estimator of the population mean is given by: This estimator is unbiased and its (estimated) variance is obtained as:
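The formulas on these slides are not preserved. As a minimal sketch, assuming the standard form of the estimator (each sampled value divided by its inclusion probability, then summed, and divided by N for the mean), the point estimates can be computed as follows; the function names and the data are hypothetical.

```python
def horvitz_thompson_total(y, pi):
    """Estimate the population total as the sum of y_i / pi_i over the sample."""
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    if any(p <= 0 or p > 1 for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(yi / p for yi, p in zip(y, pi))


def horvitz_thompson_mean(y, pi, population_size):
    """Estimate the population mean as the estimated total divided by N."""
    return horvitz_thompson_total(y, pi) / population_size


# Hypothetical sample of three units with known inclusion probabilities.
y_sample = [10.0, 20.0, 30.0]
pi_sample = [0.5, 0.25, 0.5]
print(horvitz_thompson_total(y_sample, pi_sample))
print(horvitz_thompson_mean(y_sample, pi_sample, population_size=8))
```

Units with a small inclusion probability receive a large weight 1/π_i, which is why the choice of the π_i at the design stage drives the variance of the estimator.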
71
Unequal probability sampling Probability Proportional-to-Size sampling
For the well-known HT estimator with a fixed sample size n, there are many possible choices of the $\pi_i$ that are proportional to a positive size measure, yi or xi. The sampling schemes used to implement this kind of choice are called probability proportional-to-size (PPS) designs. These designs can be with or without replacement, but we are going to show the without-replacement case.
72
Unequal probability sampling Probability Proportional-to-Size sampling
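2) Probability proportional to X If Y is approximately proportional to X, we can define inclusion probabilities such that $\pi_i \propto x_i$: $\pi_i = n x_i / \sum_{j \in U} x_j$. Remark: Since $\pi_i \le 1$ must be satisfied, we must assume that $n x_i \le \sum_{j \in U} x_j$ for all $i \in U$.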
73
Unequal probability sampling Probability Proportional-to-Size sampling
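Remark: If the requirement $\pi_i \le 1$ is not satisfied, we could set $\pi_i = 1$ for all i such that $n x_i > \sum_{j \in U} x_j$ and let $\pi_i$ be proportional to X for the remaining elements i. That is: $\pi_i = (n - n_A)\, x_i / \sum_{j \in U-A} x_j$ for $i \in U-A$, where A is the set of elements such that $n x_i > \sum_{j \in U} x_j$ and $n_A$ is the number of elements in A.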
74
Unequal probability sampling Probability Proportional-to-Size sampling
Comments: The HT estimator will have a small variance if we use selection probabilities proportional to known auxiliary variables, when X is more or less proportional to Y.
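The construction of inclusion probabilities proportional to X, including the treatment of elements that would get a probability above 1, can be sketched as below. This is an illustration under the assumption that all x_i are positive; it repeats the capping step until no probability exceeds 1, which is a common extension of the single step described in the remark. The function name is hypothetical.

```python
def pps_inclusion_probabilities(x, n):
    """Inclusion probabilities proportional to x, summing to n, capped at 1.

    Elements whose proportional probability would exceed 1 are taken with
    certainty (pi_i = 1); the remaining sample size is then shared among
    the other elements proportionally to x. The step is repeated until
    every probability is at most 1.
    """
    if n > len(x):
        raise ValueError("sample size cannot exceed population size")
    if any(xi <= 0 for xi in x):
        raise ValueError("size measures must be positive")
    certain = set()
    while True:
        remaining_n = n - len(certain)
        remaining_total = sum(xi for i, xi in enumerate(x) if i not in certain)
        newly_certain = {
            i for i, xi in enumerate(x)
            if i not in certain and remaining_n * xi > remaining_total
        }
        if not newly_certain:
            break
        certain |= newly_certain
    return [
        1.0 if i in certain else remaining_n * xi / remaining_total
        for i, xi in enumerate(x)
    ]


# Hypothetical size measures: the first element is much larger than the rest.
sizes = [100, 10, 10, 10, 10]
print(pps_inclusion_probabilities(sizes, n=2))
```

In this hypothetical case the large element is selected with certainty and the four small ones share the remaining sample size equally.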
75
3.2. Quantitative information
3.2.1 Stratified random sampling 3.2.2 Unequal probability sampling 3.2.3 Cluster sampling
76
Cluster sampling Introduction
Auxiliary information in cluster sampling can be used not only for the practical aspects of sampling and data collection, but also to take advantage of different selection probabilities. The clusters can be selected with: equal probabilities; probabilities proportional to size; probabilities proportional to auxiliary variables. Make use of auxiliary information
77
Cluster sampling Equal probability
If the g clusters are selected without replacement:
78
Cluster sampling Probability Proportional-to-Size
If the sizes of the population clusters are known (even roughly) as auxiliary information, we can use this information to select clusters with probability proportional to size.
79
Cluster sampling Probability proportional to the total of auxiliary variables
If there is an auxiliary variable strictly proportional to the target variable, we can use this information to select clusters proportional to the total of that auxiliary variable.
80
Cluster sampling Example
Suppose that a state government wishes to estimate the average monthly wage of workers in a specific private sector. There are N=50 workers in this private sector, working for M=10 independent companies. A cluster sample of m=4 companies was drawn and all workers of these companies were observed. Assume that the amounts of income taxes are known for that group of workers. The data on the number of workers in each company, the total amount of wages and the total amount of income taxes are presented in the following table.
81
Cluster sampling Example
Company   Number of workers   Total amount of wages   Total amount of income taxes
1         5                   6.500                   3.450
2         6                   8.750                   4.520
3         4                   3.730                   1.980
4         8                   9.455                   4.570
5         …                   3.500                   1.750
6         …                   10.200                  5.530
7         …                   4.290                   2.025
8         …                   8.500                   4.010
9         …                   4.300                   2.190
10        …                   11.700                  6.570
82
Cluster sampling Example
What advantages could be achieved if one uses auxiliary information to select clusters, rather than drawing a sample without using it? What are the probabilities of selection using… Equal probabilities; Probabilities proportional to size; Probabilities proportional to auxiliary variables.
83
Cluster sampling Example
Probabilities of selection

Company   Equal probabilities   Probab. proportional to size   Probab. proportional to auxiliary var.
1         0,40                  0,40                           0,38
2         0,40                  0,48                           0,49
3         0,40                  0,32                           0,22
4         0,40                  0,64                           0,50
5         0,40                  0,16                           0,19
6         0,40                  …                              0,60
7         0,40                  …                              0,22
8         0,40                  …                              0,44
9         0,40                  …                              0,24
10        0,40                  …                              0,72
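The equal-probability column and the column based on the auxiliary variable can be recomputed from the example data. The sketch below assumes that the ten income-tax totals of the data table belong to companies 1 to 10 in order, that the dots are thousands separators, and that the inclusion probability of company g is m times its share of the total income taxes; variable names are hypothetical.

```python
# Total income taxes per company, read from the example table
# (assumed order: companies 1 to 10; dots read as thousands separators).
income_taxes = [3450, 4520, 1980, 4570, 1750, 5530, 2025, 4010, 2190, 6570]

m = 4                    # number of clusters (companies) in the sample
M = len(income_taxes)    # number of clusters in the population

# Equal probabilities: every company has the same inclusion probability m / M.
equal_probability = m / M

# Probabilities proportional to the auxiliary total of each company.
tax_total = sum(income_taxes)
aux_probabilities = [m * t / tax_total for t in income_taxes]

print(equal_probability)
print([round(p, 2) for p in aux_probabilities])
```

Both sets of probabilities add up to m=4, the number of companies selected.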
84
4. Use of auxiliary information at the estimation stage
85
Introduction In the techniques discussed so far, auxiliary information of the population elements was used in the sampling phase to attain an efficient sampling design. From this point forward, auxiliary information will be used to obtain better estimates of the parameters of interest, relative to the estimates calculated with estimators based on the sampling design used.
86
Introduction There are some principles common to all calibration methods: We know the population mean or population total of one or more quantitative auxiliary variables, or the distribution of some qualitative auxiliary variables; The sampling design did not take that information into account; We realize a posteriori that the sample does not reproduce the known statistics for the auxiliary variables.
87
Introduction That information is used in the estimation process, i.e., we calibrate the estimators by adjusting the estimation weights; The calibrated estimation weights (extrapolation coefficients) should be as close as possible to the original weights. Among all calibration (or model-assisted) techniques, the best known are: post-stratification estimation, ratio estimation, regression estimation, multi-criteria calibration.
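The idea of adjusting the weights so that the sample reproduces a known total, while staying close to the original weights, can be sketched for a single quantitative auxiliary variable. The example assumes linear calibration (chi-square distance between new and original weights), which is one possible choice among several; the function name and the data are hypothetical.

```python
def linear_calibration_weights(design_weights, x, known_total_x):
    """Calibrate weights on one auxiliary variable with the linear method.

    The new weights are w_i = d_i * (1 + lam * x_i), where lam is chosen so
    that sum(w_i * x_i) equals the known population total of X.
    """
    weighted_total = sum(d * xi for d, xi in zip(design_weights, x))
    weighted_squares = sum(d * xi * xi for d, xi in zip(design_weights, x))
    lam = (known_total_x - weighted_total) / weighted_squares
    return [d * (1 + lam * xi) for d, xi in zip(design_weights, x)]


# Hypothetical sample: three units with design weight 10 each.
d = [10.0, 10.0, 10.0]
x = [1.0, 2.0, 3.0]
w = linear_calibration_weights(d, x, known_total_x=66.0)

print(w)
print(sum(wi * xi for wi, xi in zip(w, x)))  # reproduces the known total of X
```

The calibrated weights are then applied to the study variable Y. With the linear method some weights can become negative when the sample is far from the known total, which is one reason other distance functions are used in practice.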
88
4.1. Post-stratification estimation
4.1.1 Simple random sampling 4.1.2 Stratified random sampling 4.1.3 Generic sampling design
89
Post-stratification estimation
This method uses auxiliary information about the sizes of the subpopulations (post-strata) to stratify the sample. After a sample has been drawn, it is divided into mutually exclusive and exhaustive categories (G post-strata), for which the population totals Ng (g=1, …, G) are known. Based on this division and on the information about the category sizes, the estimation process follows the same procedure as in a priori stratified sampling.
90
Post-stratification estimation
When categories are used to construct an estimator, they are called post-strata. For individuals the post-strata could be age/gender groups or other socio-economic groups. For business the post-strata could be size classes, economic activity classes, etc.
91
Post-stratification estimation
However, some conditions must be met in order to use post-stratification: Some auxiliary information should be available (at least the class sizes, but preferably the totals of one or more auxiliary variables); Auxiliary variables should be correlated with the survey variables; The statistics on the auxiliary variables should be reliable.
92
Post-stratification estimation Simple random sampling
If the original sample is a SRS, the sample observed in each post-stratum is a SRS of elements in that post-stratum. Let us explain this case: Population U Sample with size n, with equal probabilities is the mean of a variable Y:
93
Post-stratification estimation Simple random sampling
94
Post-stratification estimation Simple random sampling
The Post-stratified estimator of the population mean is given by:
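The formula itself is not preserved on this slide. As a minimal sketch, assuming the usual form of the estimator (the post-stratum sample means weighted by the known shares N_g / N), the computation looks as follows; the function name and the data are hypothetical.

```python
def post_stratified_mean(sample, poststratum_sizes):
    """Post-stratified estimate of the population mean.

    sample:            list of (y, g) pairs, g being the post-stratum of the unit
    poststratum_sizes: dict mapping post-stratum g -> known population size N_g
    """
    sums, counts = {}, {}
    for y, g in sample:
        sums[g] = sums.get(g, 0.0) + y
        counts[g] = counts.get(g, 0) + 1
    missing = set(poststratum_sizes) - set(counts)
    if missing:
        raise ValueError(f"no sampled units in post-strata: {sorted(missing)}")
    population_size = sum(poststratum_sizes.values())
    return sum(
        size / population_size * sums[g] / counts[g]
        for g, size in poststratum_sizes.items()
    )


# Hypothetical simple random sample classified after selection.
srs = [(10.0, 1), (12.0, 1), (20.0, 2)]
sizes = {1: 60, 2: 40}
print(post_stratified_mean(srs, sizes))
```

The estimator is undefined when a post-stratum has no sampled units, which is why the sketch raises an error in that case; in practice such post-strata are usually collapsed with neighbouring ones.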
95
Post-stratification estimation Simple random sampling: properties of $\hat{\mu}_{y,post}$
The post-stratified estimator is an unbiased estimator of $\mu_y$: and its variance is given by:
96
Post-stratification estimation Comparison with other sampling designs
97
Post-stratification estimation Comparison with other sampling designs
98
Post-stratification estimation Comparison with other sampling designs
99
Post-stratification estimation Comparison with other sampling designs
100
Post-stratification estimation Simple random sampling: comments
101
Post-stratification estimation Simple random sampling: comments
102
Post-stratification estimation Simple random sampling: comments
103
Post-stratification estimation Simple random sampling: example
104
Post-stratification estimation Simple random sampling: example
105
Post-stratification estimation Simple random sampling: example
5) Assume that the variance of the interest variable in each class of employees, as well as the global variance are known:
106
Post-stratification estimation Simple random sampling: example
6) The data collected under a Simple Random Sampling is presented in the following table: Company Financial debts ( yi) Class of number of employees 1 15.000 2 10.000 3 12.000 4 14.000 5 6 22.000 7 8 17.000
107
Post-stratification estimation Simple random sampling: example
A) If the total of financial debts is estimated using the basic estimator under simple random sampling (disregarding the auxiliary information), we have:
108
Post-stratification estimation Simple random sampling: example
The variance of this estimator is: The absolute and relative precision of the estimate (confidence level = 95%) will be:
109
Post-stratification estimation Simple random sampling: example
B) If the total of financial debts is estimated using the post-stratified estimator, we first have to estimate the number of companies in each class:
110
Post-stratification estimation Simple random sampling: example
111
Post-stratification estimation Simple random sampling: example
From the previous estimates the post-stratified estimate of the total is:
112
Post-stratification estimation Simple random sampling: example
Remark:
113
Post-stratification estimation Illustration Use of Auxiliary Information Workshop Post-stratification section
114
4.1. Post-stratification estimation
4.1.1 Simple random sampling 4.1.2 Stratified random sampling 4.1.3 Generic sampling design
115
Post-stratification estimation Stratified random sampling
116
Post-stratification estimation Stratified random sampling
117
Post-stratification estimation Stratified random sampling
118
Post-stratification estimation Stratified random sampling
And for a Proportional Stratified Sampling: where The Post-stratified estimator of the population mean is given by:
119
Post-stratification estimation Stratified random sampling: properties of $\hat{\mu}_{y,post}$
The post-stratified estimator is an asymptotically unbiased estimator of $\mu_y$:
120
Post-stratification estimation Stratified random sampling: example
Consider the previous example… Suppose we want to estimate the total of bank debts of a population composed of N=100 companies, but assume that the population of companies is stratified by region (1-rural; 2-urban); Assume that there are 30 companies in stratum 1 and 70 companies in stratum 2;
121
Post-stratification estimation Stratified random sampling: example
Suppose that proportional stratified sampling was carried out to select n=10 companies (without taking into account the number of employees), for which the total of financial debts was observed, as well as the number of employees; Sample weights:
122
Post-stratification estimation Stratified random sampling: example
Suppose that the joint distribution of the number of companies by region (strata) and class of number of employees is given by:

Region     Class 1 (up to 99 employees)   Class 2 (100 or more employees)
1-Rural    20                             10
2-Urban    40                             30

But the stratification did not take into account the class of number of employees!
123
Post-stratification estimation Stratified random sampling: example
7. Furthermore, assume that the following parameters are known (only for illustration purposes):

h   g   Nhg   μhg
1   1   20    9.464
1   2   10    13.044
2   1   40    13.378
2   2   30    15.996

h   Nh
1   30
2   70

g   Ng   μg
1   60   12.073
2   40   15.258
124
Post-stratification estimation Stratified random sampling: example
8) The data collected under a Proportional Stratified Sampling is presented in the following table: Company Financial debts ( yi) Strata Class of number of employees 1 10.000 2 12.000 3 14.000 4 15.000 5 6 14.500 7 8 22.000 9 17.000 10 19.000
125
Post-stratification estimation Stratified random sampling: example
A) If the total of financial debts is estimated using the basic estimator under stratified random sampling (disregarding the auxiliary information), we have:
126
Post-stratification estimation Stratified random sampling: example
B) If the total of financial debts is estimated using the post-stratified estimator, we first have to estimate the number of companies in each class: a) Up to 99 employees:
127
Post-stratification estimation Stratified random sampling: example
b) 100 or more employees: But there are 60 companies in the first class and 40 companies in the second class!
128
Post-stratification estimation Stratified random sampling: example
From the previous estimates the post-stratified estimate of the total is:
129
Post-stratification estimation Stratified random sampling: example
Remark:
130
4.1. Post-stratification estimation
4.1.1 Simple random sampling 4.1.2 Stratified random sampling 4.1.3 Generic sampling design
131
Post-stratification estimation Generic sampling design
Post-stratified estimator: Implicit model:
132
4.2. Ratio estimation 4.2.1 Simple random sampling
4.2.2 Stratified random sampling 4.2.3 Generic sampling design
133
Ratio estimation
134
Ratio estimation
135
Ratio estimation Simple random sampling
For the simple random sampling: is a ratio estimated by:
136
Ratio estimation Simple random sampling
137
Ratio estimation Simple random sampling
138
Ratio estimation Simple random sampling
Remark: The implicit model underlying the ratio estimator is that the relationship between Y and X is a straight line through the origin.
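The formulas of the preceding slides are not preserved. As an illustration of the remark, the sketch below assumes the usual ratio estimator of the mean under simple random sampling: the sample ratio of the y total to the x total, multiplied by the known population mean of X. The function name and the data are hypothetical.

```python
def ratio_estimate_mean(y, x, population_mean_x):
    """Ratio estimate of the population mean of Y under simple random sampling.

    The ratio R is estimated by sum(y) / sum(x) in the sample and then applied
    to the known population mean of the auxiliary variable X.
    """
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    estimated_ratio = sum(y) / sum(x)
    return estimated_ratio * population_mean_x


# Hypothetical sample in which y is exactly twice x (a line through the origin).
x_sample = [6.0, 10.0, 14.0]
y_sample = [12.0, 20.0, 28.0]
print(ratio_estimate_mean(y_sample, x_sample, population_mean_x=11.0))
```

When the points lie on a straight line through the origin, as in this hypothetical sample, the estimate equals the slope times the known mean of X, whatever sample is drawn.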
139
Post-stratification estimation Illustration Use of Auxiliary Information Workshop Ratio estimation section
140
4.2. Ratio estimation 4.2.1 Simple random sampling
4.2.2 Stratified random sampling 4.2.3 Generic sampling design
141
Ratio estimation Stratified random sampling
142
Ratio estimation Stratified random sampling
143
Ratio estimation Stratified random sampling: comments
144
Ratio estimation Stratified random sampling: example
Consider a population of N=800 companies from different regions, from which a sample of n=40 companies was drawn; The population is stratified by region, Uh, h=1, ..., 4. Stratum U1 U2 U3 U4 Nh 100 120 200 380 xh 10,2 7,8 5,7 3,8 10,4 9,3 7,3 5,2 9,8 7,5 5,4 3,5 20,3 16,4 9,9 5,1 19,6 10,8 5,8 3,6 17,7 11,5 6,0 3,3
145
Ratio estimation Stratified random sampling: example
Goal: Estimate the average annual sales at a survey time, y. Auxiliary information: Total sales at sampling frame constitution
146
Ratio estimation Stratified random sampling: example
a) If the mean estimator is used (under a STP), the estimate of the average annual sales at survey time is:
147
Ratio estimation Stratified random sampling: example
b) Alternatively the average annual sales can be estimated using a combined ratio estimator such as:
148
Ratio estimation Stratified random sampling: example
c) But, since the total sales at sampling frame constitution is known for every stratum it is possible to use the separate ratio estimator to estimate the average annual sales:
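The difference between the combined and the separate ratio estimators can be sketched as follows. The code assumes the usual definitions: the combined estimator forms one ratio from the stratified estimates of the y and x means, whereas the separate estimator forms one ratio per stratum and needs the mean of X in every stratum. Function names and data are hypothetical and do not reproduce the figures of the example.

```python
from statistics import mean


def combined_ratio_mean(strata):
    """Combined ratio estimate of the mean: (ybar_st / xbar_st) * mu_x."""
    N = sum(s["N"] for s in strata)
    ybar_st = sum(s["N"] / N * mean(s["y"]) for s in strata)
    xbar_st = sum(s["N"] / N * mean(s["x"]) for s in strata)
    mu_x = sum(s["N"] / N * s["mu_x"] for s in strata)  # overall mean of X
    return ybar_st / xbar_st * mu_x


def separate_ratio_mean(strata):
    """Separate ratio estimate of the mean: sum of W_h * (ybar_h / xbar_h) * mu_xh."""
    N = sum(s["N"] for s in strata)
    return sum(
        s["N"] / N * mean(s["y"]) / mean(s["x"]) * s["mu_x"] for s in strata
    )


# Hypothetical strata: size N, known stratum mean of X, and sampled (y, x) values.
strata = [
    {"N": 100, "mu_x": 10.0, "y": [22.0, 18.0], "x": [7.0, 9.0]},
    {"N": 300, "mu_x": 4.0, "y": [13.0, 11.0], "x": [4.5, 3.5]},
]
print(combined_ratio_mean(strata))
print(separate_ratio_mean(strata))
```

The separate estimator lets the ratio differ between strata, at the price of estimating one ratio per stratum from a smaller sample.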
149
4.2. Ratio estimation 4.2.1 Simple random sampling
4.2.2 Stratified random sampling 4.2.3 Generic sampling design
150
Ratio estimation Generic sampling design
First method: Implicit model:
151
Ratio estimation Generic sampling design
Second method: Implicit model:
152
4.3. Regression estimation
4.3.1 Simple random sampling 4.3.2 Stratified random sampling 4.3.3 Generic sampling design
153
Regression estimation
154
Regression estimation Simple random sampling
The regression estimator for the mean is given by:
155
Regression estimation Simple random sampling
and the estimator of the regression coefficients is given by:
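The two formulas are not preserved on these slides. A minimal sketch, assuming the usual simple linear regression estimator (the sample mean of y corrected by the estimated slope times the gap between the known mean of X and the sample mean of x, the slope being the least-squares coefficient), is given below; the function name and the data are hypothetical.

```python
def regression_estimate_mean(y, x, population_mean_x):
    """Regression estimate of the mean of Y: ybar + b * (mu_x - xbar).

    b is the least-squares slope s_xy / s_x^2 computed from the sample.
    """
    n = len(y)
    if n != len(x) or n < 2:
        raise ValueError("y and x must have the same length (at least 2)")
    ybar = sum(y) / n
    xbar = sum(x) / n
    s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)
    s_xx = sum((xi - xbar) ** 2 for xi in x) / (n - 1)
    slope = s_xy / s_xx
    return ybar + slope * (population_mean_x - xbar)


# Hypothetical sample following y = 2x + 1, with known population mean of X.
x_sample = [1.0, 2.0, 3.0, 4.0]
y_sample = [3.0, 5.0, 7.0, 9.0]
print(regression_estimate_mean(y_sample, x_sample, population_mean_x=3.0))
```

Unlike the ratio estimator, this estimator does not force the fitted line through the origin, so it also works when Y has a non-zero intercept with respect to X.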
156
Post-stratification estimation Illustration Use of Auxiliary Information Workshop Regression estimation section
157
4.3. Regression estimation
4.3.1 Simple random sampling 4.3.2 Stratified random sampling 4.3.3 Generic sampling design
158
Regression estimation Stratified random sampling
a) First method (combined regression estimator of the mean) where and and the estimator of the regression coefficients is:
159
Regression estimation Stratified random sampling
b) Second method (separate regression estimator of the mean) where and and the estimator of the regression coefficients is:
160
Regression estimation Stratified random sampling: example
Consider a population of N=800 companies from different regions, from which a sample of n=40 companies was drawn; The population is stratified by region, Uh, h=1, ..., 4. Stratum U1 U2 U3 U4 Nh 100 120 200 380 xh 10,2 7,8 5,7 3,8 10,4 9,3 7,3 5,2 9,8 7,5 5,4 3,5
161
Regression estimation Stratified random sampling: example
a) If the combined regression estimator is used, the estimate of the average annual sales at survey time is:
162
Regression estimation Stratified random sampling: example
b) If the separate regression estimator is used, the estimate of the average annual sales at survey time is:
163
4.3. Regression estimation
4.3.1 Simple random sampling 4.3.2 Stratified random sampling 4.3.3 Generic sampling design
164
Regression estimation Generic sampling design
The combined regression estimator for the mean is given by:
165
4.4. Other calibration methods
166
Other calibration methods
167
Other calibration methods
168
Other calibration methods
169
Other calibration methods
170
Other calibration methods
171
Other calibration methods
172
Other calibration methods
173
Other calibration methods