ICES 2007 Labour Cost Index and Sample Allocation Outi Ahti-Miettinen and Seppo Laaksonen Statistics Finland (+ University of Helsinki) Labour cost index.


1 ICES 2007 Labour Cost Index and Sample Allocation. Outi Ahti-Miettinen and Seppo Laaksonen, Statistics Finland (+ University of Helsinki)
Content:
- Labour cost index and its measurement requirements
- Sample allocation: Power, Neyman-Tschuprow
- Empirical results and the principles followed in the first-year sample selection

2 ICES 2007 Labour Cost Index (1) The Labour Cost Index (LCI):
- is going to be a new quarterly index that will be published for the first time in 2009 in Finland. The first base year will be 2007, and hence the data collection needs to cover this year;
- follows the principles of an EU regulation, but this does not give detailed instructions for many basic methodologies such as sampling and the questionnaire; the regulation does, however, state the level at which the index should be published, that is, the NACE Rev.1 classification established by the European Council. In Finland, the LCI is based on 23 industries of the private sector, in response to wishes of future users of the index.

3 ICES 2007 Labour Cost Index (2) The structure of the index formula as it is in the regulation (formula and symbols not reproduced in this transcript). Its terms are:
- labour costs per worked and paid hour per worker in industry i in quarter t of year j
- labour costs per worked and paid hour in industry i in year k
- worked and paid hours of all workers in industry i in year k
- labour costs of all workers in industry i in year k
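
The formula itself did not survive in the transcript. As a hedged reconstruction, assuming the slide reproduces the basic Laspeyres form used in the EU labour cost index regulation, the four terms listed on this slide would combine as below. The symbol names (w for hourly labour costs, h for hours, W for total labour costs) are an assumption, since the original symbols are not preserved.

```latex
\[
\mathrm{LCI}_{t(j)}
  = \frac{\sum_i w_i^{t(j)}\, h_i^{k}}{\sum_i w_i^{k}\, h_i^{k}}
  = \frac{\sum_i \dfrac{w_i^{t(j)}}{w_i^{k}}\, W_i^{k}}{\sum_i W_i^{k}},
\qquad W_i^{k} = w_i^{k}\, h_i^{k}
\]
% w_i^{t(j)} : labour costs per hour in industry i, quarter t of year j
% w_i^{k}    : labour costs per hour in industry i, base year k
% h_i^{k}    : hours of all workers in industry i, year k
% W_i^{k}    : labour costs of all workers in industry i, year k
```

Read this way, each industry's hourly cost ratio is weighted by that industry's share of base-year labour costs.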

4 ICES 2007 Labour Cost Index (3) The formula can be further developed as follows (formula not reproduced in this transcript). Here the sampling weight follows from the number of respondents m_ih, calculated by stratum ih, in which h = size band. Symbol c_l refers to the labour costs of sample enterprise l.
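
As an illustration of how such weights enter the totals, here is a minimal Python sketch. It assumes the weight of stratum ih is the frame count divided by the number of respondents m_ih, which is the usual choice under simple random sampling within strata; the slide's own weight formula is not preserved, and the function names and the frame-count input are hypothetical.

```python
from collections import defaultdict


def stratum_weights(frame_counts, respondent_counts):
    """Return an assumed weight N_ih / m_ih per stratum (industry, size_band)."""
    weights = {}
    for stratum, n_frame in frame_counts.items():
        m = respondent_counts.get(stratum, 0)
        if m == 0:
            raise ValueError(f"no respondents in stratum {stratum}")
        weights[stratum] = n_frame / m
    return weights


def weighted_industry_totals(sample, weights):
    """Sum weight * c_l over sample enterprises, grouped by industry.

    sample is a list of (industry, size_band, labour_cost) tuples.
    """
    totals = defaultdict(float)
    for industry, size_band, cost in sample:
        totals[industry] += weights[(industry, size_band)] * cost
    return dict(totals)
```

The same weighting would apply to hours as to labour costs, so the industry-level ratio of the two weighted totals gives an hourly cost.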

5 ICES 2007 Labour Cost Index (4) The index formula needs to be used at industry level, but the data are collected using a sample. There is a limited amount of money available, and consequently the sampling design requires careful planning. This paper focuses on sample allocation, in which we try to take into account the requirements of the LC Index. But how to do it, since the index is mainly longitudinal in nature? Do we have any possibility to include this feature in the sampling design? Our answer is: not completely, but if we are able to estimate the key cross-sectional parameters well, we believe we will also succeed in the index calculations. What are these key parameters? Next page!

6 ICES 2007 Labour Cost Index (5) Key parameters in LCI, all at industry level:
- Totals of worked and paid hours
  - Each quarter
  - Each calendar year = some type of a sum of quarters
  Note that many enterprises find it difficult to provide information on worked and paid hours, but this is easier for paid hours, and hence we ask for only one of these for monthly paid workers. This leads to a demanding imputation task, which is not considered in this paper.
- Totals of labour costs (wages etc.)
  - Each quarter
  - Each calendar year = some type of a sum of quarters
- Changes from one quarter to the next, keeping the same enterprises in the sample with a reasonable probability.

7 ICES 2007 Labour Cost Index (6) The index formula already includes size band h of each index industry i. This is usual in enterprise surveys, since the formula also implicitly indicates that larger enterprises have a higher weight in the index, both cross-sectionally and longitudinally. Hence, when we allocate a gross sample size to strata, these strata are defined by both industries i and size bands h. Our target in the rest of the paper is to find the best possible realistic allocation strategy for the given gross sample size, which was approximately 2000 enterprises. This should be done so that the industry-level estimates are unbiased and reasonably accurate while, naturally, the whole index should be of high quality too.

8 ICES 2007 Sample Allocation (1) Naturally, there are many possible allocation strategies when the sampling design follows simple random sampling (srs) within strata. We thus exclude pps, as has often been done in business surveys, because of problems in sampling frames: no business register is fully up to date, and hence many changes in businesses are found during survey field work. In particular, if the size variable used for pps is not correct, the estimates will suffer. In the LCI such changes are more likely than usual because of the four-quarter follow-up period. These problems are also present in srs, but they are easier to handle.

9 ICES 2007 Sample Allocation (2) We exclude the simplest allocations, such as proportional or absolute, and instead try to take into account enterprise size and the variation of the target variable within strata. In addition, some type of minimum allocation is required, since some strata are known to contain few frame units, and hence a take-all sample may be needed. After the key allocation we revised the gross sample sizes using anticipated response rates in each stratum. Here, the experience of the structural earnings statistics (SES) has been exploited, although this new LCI survey is not completely similar to SES. Our response-rate adjustment is rather rough. We do not present its results in this paper.
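
The adjustment described on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the rule (inflate by the anticipated response rate, apply a floor, cap at the stratum's frame size) and the default minimum of 5 are hypothetical, since the slide does not give the exact procedure.

```python
import math


def adjust_allocation(net_sizes, frame_counts, response_rates, minimum=5):
    """Turn net stratum sample sizes into gross sizes.

    Steps (all assumed, not taken from the slides):
    1. divide by the anticipated response rate and round up,
    2. enforce a minimum allocation,
    3. cap at the frame count, which yields a take-all stratum.
    """
    adjusted = {}
    for stratum, n_net in net_sizes.items():
        gross = math.ceil(n_net / response_rates[stratum])
        gross = max(gross, minimum)
        adjusted[stratum] = min(gross, frame_counts[stratum])
    return adjusted
```

A stratum whose adjusted size reaches its frame count is sampled completely, which is the take-all case mentioned above.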

10 ICES 2007 Sample Allocation (3) The literature gives three alternatives for sample allocation in this case (with our interpretation):
- Neyman-Tschuprow allocation
- Power allocation (Bankier 1988, et al.)
- Lavallée-Hidiroglou algorithm (1988)
We looked at the last approach but could not follow it completely. Their strategy, like ours, proceeds in two phases. So we tested mainly the first two alternatives, and our suggestion for practice is based on a two-phase allocation so that
- first an allocation for industries is made, and
- secondly the first-phase allocation is continued to size bands.
Note that we also tested a one-phase allocation, but it did not work as well as the two-phase approach.
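
A two-phase allocation of this kind might be sketched as below. The slides do not state which rule is used within industries, so the sketch simply splits proportionally to caller-supplied scores at both phases; the function name and the score inputs are hypothetical.

```python
def two_phase_allocation(n_total, industry_scores, band_scores):
    """Split n_total over industries, then over size bands within each.

    industry_scores: {industry: score}, e.g. a power-allocation score.
    band_scores: {industry: {size_band: score}}, e.g. N_ih * S_ih.
    Returns fractional sizes keyed by (industry, size_band); rounding
    and minimum allocations are left to a later step.
    """
    total_score = sum(industry_scores.values())
    result = {}
    for industry, score in industry_scores.items():
        n_industry = n_total * score / total_score  # phase 1
        band_total = sum(band_scores[industry].values())
        for band, band_score in band_scores[industry].items():
            result[(industry, band)] = n_industry * band_score / band_total  # phase 2
    return result
```

Keeping the two phases separate means the industry-level sample sizes, which drive the accuracy of the published industry indices, are fixed before size bands are considered.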

11 ICES 2007 Data set in pilot study (1) We could not get any data set similar to the one that will eventually be available, but we had the opportunity to create something sufficiently similar for evaluating the behaviour of this phenomenon, for deciding which sample allocation could work in practice, and finally for drawing a real sample for the first year. The sample for the ongoing year cannot be much different, but it will be decided later which revisions are needed. Our test data set is constructed from the business register and the structural earnings statistics, which was not an easy job, but we do not go into details. Its variables are annual, not quarterly. We could not construct a good longitudinal data set for analysing change effects because of too many incorrect values. So our tests are cross-sectional.

12 ICES 2007 Data set in pilot study (2) For testing the sample, we constructed a sampling frame for the year 2004 from the register and previous surveys. The precision of the point estimates was evaluated by the coefficient of variation (cv), defined as the ratio of the standard error to the estimate. A total of K (=1000) independent samples were drawn from the frame population. Measure of efficiency: (formula not reproduced in this transcript)
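
The evaluation described here can be illustrated with a small simulation. This is a sketch, not the authors' procedure: it assumes simple random sampling without replacement within strata, an expansion estimator of the total, and a cv computed as the standard deviation of the K replicate estimates divided by their mean. The function name and data layout are hypothetical.

```python
import random
import statistics


def simulate_cv(frame, allocation, replications=1000, seed=0):
    """Estimate the cv of a stratified total by repeated sampling.

    frame: {stratum: [y values of all frame units]}
    allocation: {stratum: n_h}, each n_h assumed to be at least 1.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        total = 0.0
        for stratum, values in frame.items():
            n_h = min(allocation[stratum], len(values))
            drawn = rng.sample(values, n_h)  # srs without replacement
            total += len(values) / n_h * sum(drawn)  # expansion estimator
        estimates.append(total)
    return statistics.pstdev(estimates) / statistics.mean(estimates)
```

When every stratum is take-all, each replicate reproduces the frame total and the cv is zero.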

13 ICES 2007 Power and Neyman-Tschuprow allocation (1) for the reference year 2007 of the LCI in Finland, concerning the first phase. Bankier 1988 (formula not reproduced in this transcript), where:
- X = size variable, which is 'number of employees' in our example
- Y = target variable, 'earnings' in our example
- S_h = standard deviation of Y
- q = power, varying from 0 to 1
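
Since the formula image is missing, the following Python sketch uses the form of power allocation commonly attributed to Bankier (1988), in which the sample size of group h is proportional to S_h * X_h^q / (mean of Y in h). Treat that form, the function name, and the argument layout as assumptions rather than as the slide's exact formula.

```python
def power_allocation(n_total, x, s, y_mean, q):
    """Allocate n_total proportionally to s[h] * x[h]**q / y_mean[h].

    x: size measure per group (e.g. number of employees)
    s: standard deviation of the target variable per group
    y_mean: mean of the target variable per group
    q: power between 0 and 1
    Returns fractional sample sizes; rounding is left to the caller.
    """
    scores = {h: s[h] * x[h] ** q / y_mean[h] for h in x}
    denom = sum(scores.values())
    return {h: n_total * scores[h] / denom for h in x}
```

With q = 1 and x set to the group totals of Y, the score reduces to (number of units) * S_h, which is the Neyman-Tschuprow rule; with q = 0 the allocation depends only on the within-group coefficients of variation.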

14 ICES 2007 Power and Neyman-Tschuprow allocation (2) We tested q-values from 0 to 1, and also q = 1 with X_h = Y_h, which leads to Neyman-Tschuprow allocation. On the following four pages we present our basic results from the first phase of sampling. The X-axis shows the different powers q, with Neyman-Tschuprow (N-T) at the end. The Y-axis shows the accuracy as coefficients of variation (cv). Different types of industries are shown on different pages.

15 ICES 2007 Power and Neyman-Tschuprow allocation (3) These figures are the strangest, often because the industry is small. In some cases cv = 0, since all enterprises have to be drawn.

16 ICES 2007 Power and Neyman-Tschuprow allocation (4) In these cases the cv's decrease as q increases, and N-T gives the lowest values. These are the large industry classes, where N-T gives large sample sizes.

17 ICES 2007 Power and Neyman-Tschuprow allocation (5) In these cases the results are largely opposite to the previous ones, including rather high values for N-T.

18 ICES 2007 Power and Neyman-Tschuprow allocation (6) In these cases the results are mainly similar to the previous ones except that cv’s are increasing more and N-T results are close to those with large q.

19 ICES 2007 Conclusion The allocation results are not uniform over industries, and hence it is not automatically clear which allocation should be used in practice. Perhaps the best solution would be to apply slightly different allocations for different groups of industries. This can be problematic in practice, and hence it was decided to use, for the first phase (industries), the allocation with q = 0.5, which is also called square-root allocation. Everything will be checked again after the first-year field work period (=2007), but the sampling design cannot be changed much.

