The Italian Labour Force Survey consistency framework
Antonio R. Discenza, Silvia Loriga, Alessandro Martini
ISTAT – Italian National Statistical Institute, Labour Force Survey Division
9th Workshop on LFS Methodology, Rome, May 15-16
Overview of the Italian LFS
- It is designed as a quarterly survey: all information is obtained by interview, with no use of the “wave approach”.
- The sample is allocated over space and time so as to produce direct monthly estimates of the main figures.
Dissemination of results
- Monthly figures at national level (both SA and NSA series).
- Quarterly figures down to the 21 NUTS_2 “regions” (both SA and NSA series), and micro-data.
- Yearly figures down to the 110 NUTS_3 “provinces” and 13 larger municipalities, as “direct estimates”, and micro-data.
- Yearly figures of employment and unemployment for the 686 Local Labour Market Areas, as small-area estimates.
- Yearly figures from the household perspective.
- Quarter-on-quarter flow estimates and longitudinal micro-data.
- Year-on-year flow estimates and longitudinal micro-data.
IT-LFS assures full consistency between figures and micro-data using calibration estimators and other benchmarking techniques.
External information on reference population
For the consistency framework of IT-LFS and its timeliness, a fundamental role is played by the auxiliary information that the Demographic Division updates on a monthly basis for weighting purposes: the resident population in each municipality by sex, age and citizenship (nationals/non-nationals).
- A monthly population is used for monthly estimates.
- A weighted average of the three monthly populations is used for quarterly estimates, each month being weighted by its number of weeks (4 or 5) out of the 13 weeks of the quarter.
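The week-based weighting can be illustrated with a minimal Python sketch. The function name and the population figures are hypothetical (they are not Istat data); only the 4/13, 4/13, 5/13 weighting comes from the slides.

```python
def quarterly_population(monthly_pop, weeks_per_month):
    """Weighted average of three monthly populations.

    Each month is weighted by its number of reference weeks (4 or 5)
    divided by the total number of weeks in the quarter (normally 13).
    """
    if len(monthly_pop) != 3 or len(weeks_per_month) != 3:
        raise ValueError("expected exactly three months")
    total_weeks = sum(weeks_per_month)
    weighted_sum = sum(p * w for p, w in zip(monthly_pop, weeks_per_month))
    return weighted_sum / total_weeks


# Hypothetical monthly populations (thousands), for illustration only.
monthly = [60_000.0, 60_013.0, 60_026.0]
weeks = [4, 4, 5]
q_pop = quarterly_population(monthly, weeks)
print(q_pop)
```

The same weights reappear later in the deck, where the final monthly figures are forced to average to the quarterly ones.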
The quarterly weighting procedure
A generalized calibration estimator has been adopted in order to improve the accuracy of the estimates.
Final weights are obtained in three steps:
- the base weights are obtained for all selected households as the inverse of the probability of inclusion in the sample;
- the base weights are adjusted by a correction factor for total non-response, worked out as the reciprocal of the response ratio for sub-groups of households;
- the final weights are obtained by applying a calibration estimator, which assures that the sample reproduces the same structure as the population with regard to the several constraints.
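The three steps can be sketched in Python with NumPy. This is a toy illustration under stated assumptions, not the Istat implementation: the function names and data are hypothetical, the response ratio is assumed to be weighted, and the calibration step uses the linear (chi-square distance) calibration estimator as one member of the generalized calibration family.

```python
import numpy as np


def base_weights(inclusion_prob):
    """Step 1: inverse of the inclusion probability."""
    return 1.0 / np.asarray(inclusion_prob, dtype=float)


def nonresponse_adjust(weights, responded, group):
    """Step 2: divide respondents' weights by the weighted response
    ratio of their sub-group; non-respondents get weight zero."""
    weights = np.asarray(weights, dtype=float)
    responded = np.asarray(responded, dtype=bool)
    group = np.asarray(group)
    adjusted = np.zeros_like(weights)
    for g in np.unique(group):
        in_g = group == g
        ratio = weights[in_g & responded].sum() / weights[in_g].sum()
        adjusted[in_g & responded] = weights[in_g & responded] / ratio
    return adjusted


def linear_calibration(weights, X, totals):
    """Step 3: linear calibration so that X' w equals the known totals."""
    X = np.asarray(X, dtype=float)
    gap = np.asarray(totals, dtype=float) - X.T @ weights
    lam = np.linalg.solve(X.T @ (weights[:, None] * X), gap)
    return weights * (1.0 + X @ lam)


# Hypothetical toy sample of six units (not Istat data).
pi = np.array([0.01, 0.01, 0.02, 0.02, 0.01, 0.02])
responded = np.array([True, True, False, True, True, True])
group = np.array([0, 0, 0, 1, 1, 1])
# Calibration variables: indicator columns for males and females.
X = np.array([[1, 0], [0, 1], [1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
totals = np.array([180.0, 270.0])  # assumed known population counts

d = base_weights(pi)
adj = nonresponse_adjust(d, responded, group)
Xr = X[responded]
final = linear_calibration(adj[responded], Xr, totals)
print(Xr.T @ final)
```

In production the constraint matrix would hold all the constraints listed on the next slide rather than two sex totals, and bounded distance functions are often preferred to avoid negative weights.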
Calibration to the reference population
Calibration is obtained using constraints from several external sources:
- population by sex and fourteen 5-year age groups (0-14, 15-19, …, 70-74, 75 and more years) at NUTS_2 level;
- non-national population (males, females, EU, non-EU) at NUTS_2 level;
- population by sex and five age groups (0-14, 15-29, 30-49, 50-64, 65 and more years) at NUTS_3 level;
- population by sex and five age groups (0-14, 15-29, 30-49, 50-64, 65 and more years) for 13 large municipalities (> inhabitants);
- number of households at NUTS_2 level for each rotation group;
- population by sex at NUTS_2 level in each of the three months of the quarter (representing 4/13, 4/13, 5/13 of the whole quarter).
Monthly constraints and monthly weights
The weighting procedure provides fully consistent monthly and quarterly weights. Monthly estimates could be obtained directly from the monthly sample and its monthly weights.
Problems:
- these estimates are only available at the end of the quarter, when all the interviews have been completed and the quarterly weights have been computed;
- the time series showed very high variability.
Monthly direct estimates were never published.
MONTHLY ESTIMATES
Provisional and final monthly estimates
For a few years Istat studied how to improve the timeliness and quality of the monthly estimates. A Regression Composite Estimator was found to suit the purpose:
- it is a design-based estimator, based purely on LFS data;
- it exploits the longitudinal dimension of the sample to produce more robust estimates.
After evaluating the results and tuning the model over a long period, monthly estimates were finally disseminated.
The framework: monthly estimates are disseminated as Provisional (timely) and as Final at a later stage.
[Figure: timeline of months (Jan to Dec of year Y1, Jan to Jun of year Y2) grouped into quarters Q1_Y1 to Q2_Y2]
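The idea behind a regression composite estimator can be sketched as follows. This is a simplified variant from the survey-sampling literature, given only as an illustration: the slides do not say which specification Istat uses, and all names and figures below are hypothetical. Units interviewed in the previous month carry their previous value as an auxiliary variable, the other units carry the previous composite mean, and the weights are calibrated so that this auxiliary variable reproduces the previous month's composite total.

```python
import numpy as np


def linear_calibration(weights, X, totals):
    """Linear calibration: adjust weights so that X' w equals totals."""
    gap = np.asarray(totals, dtype=float) - X.T @ weights
    lam = np.linalg.solve(X.T @ (weights[:, None] * X), gap)
    return weights * (1.0 + X @ lam)


def regression_composite(y_now, y_prev, matched, weights,
                         pop_total, prev_composite_total):
    """Return (composite estimate of the current total, calibrated weights)."""
    y_now = np.asarray(y_now, dtype=float)
    weights = np.asarray(weights, dtype=float)
    matched = np.asarray(matched, dtype=bool)
    prev_mean = prev_composite_total / pop_total
    # Auxiliary variable: last month's value for units in the overlap
    # sample, previous composite mean for units without a match.
    x = np.where(matched, np.asarray(y_prev, dtype=float), prev_mean)
    X = np.column_stack([np.ones_like(x), x])
    w_cal = linear_calibration(weights, X, [pop_total, prev_composite_total])
    return float(w_cal @ y_now), w_cal


# Hypothetical toy data: 1 = employed, 0 = not employed.
y_now = np.array([1, 1, 0, 1, 0, 0], dtype=float)
y_prev = np.array([1, 1, 0, 0, 0, 0], dtype=float)  # ignored when unmatched
matched = np.array([True, True, True, True, False, False])
weights = np.full(6, 100.0)
estimate, w_cal = regression_composite(
    y_now, y_prev, matched, weights,
    pop_total=600.0, prev_composite_total=330.0)
print(estimate)
```

Because the calibrated weights borrow strength from the month-to-month overlap, the resulting series is typically smoother than the direct monthly estimates.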
Provisional monthly estimates
- Data are first disseminated on a provisional basis, about 30 days after each reference month, computed over a partial sample (the fieldwork is not yet complete).
- The press release on monthly unemployment is issued on the same day as Eurostat's and focuses on seasonally adjusted (SA) data.
- Simultaneously, monthly data (both SA and not SA) are made available on the Istat data warehouse (I.Stat).
- The production process starts about 22 days after the end of the reference month.
The production of monthly estimates: provisional monthly data production timetable
Estimation procedure in two steps
Step 1: a calibration to apply the regression composite estimator.
Step 2: the seasonal adjustment of the estimates:
- first, a univariate seasonal adjustment;
- then, a time series reconciliation procedure in two steps to ensure consistency between different aggregates and the total population, and between monthly and quarterly SA series.
The reconciliation procedure is based on a dual system of constraints:
- contemporaneous constraints (monthly population by sex and age groups);
- inter-temporal constraints (quarterly SA figures of employment, unemployment and inactivity; quarterly population by sex and age groups).
The benchmarking approach is based on the “movement preservation principle”, in order to maintain the temporal profile of the original series.
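A minimal sketch of benchmarking under the movement preservation principle is given below. It uses the additive first-difference Denton method, which is a classical choice but an assumption here: the slides name the principle, not the specific algorithm. The series, the benchmarks and the function name are hypothetical. The adjusted monthly series satisfies the quarterly constraints while keeping its month-to-month changes as close as possible to those of the preliminary series.

```python
import numpy as np


def denton_additive(prelim, agg_matrix, benchmarks):
    """Minimise the sum of squared first differences of (x - prelim)
    subject to agg_matrix @ x == benchmarks, via the KKT linear system."""
    p = np.asarray(prelim, dtype=float)
    A = np.asarray(agg_matrix, dtype=float)
    b = np.asarray(benchmarks, dtype=float)
    n, k = p.size, A.shape[0]
    D = np.diff(np.eye(n), axis=0)      # (n-1) x n first-difference operator
    Q = D.T @ D
    kkt = np.block([[Q, A.T], [A, np.zeros((k, k))]])
    rhs = np.concatenate([Q @ p, b])
    return np.linalg.solve(kkt, rhs)[:n]


# Hypothetical preliminary monthly SA series covering two quarters.
prelim = np.array([100.0, 101.0, 102.0, 103.0, 104.0, 105.0])
# Each quarterly figure is the week-weighted average of its three months.
w = np.array([4.0, 4.0, 5.0]) / 13.0
A = np.zeros((2, 6))
A[0, :3] = w
A[1, 3:] = w
quarterly = np.array([101.5, 104.5])    # hypothetical quarterly SA benchmarks

adjusted = denton_additive(prelim, A, quarterly)
print(A @ adjusted)
```

Contemporaneous constraints (for example, aggregates summing to the monthly population) would enter as additional rows of the constraint system in a multivariate version of the same problem.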
SA Reconciliation Procedure
Employment figures with three different estimators
Another representation: irregular vs. seasonal
Source: Q2010 Conference, “Assessing quality by means of temporal disaggregation”, Riccardo Gatto, Silvia Loriga, Andrea Spizzichino and Alessio Guandalini.
Framework for dissemination of monthly estimates
Final monthly data are produced for each of the three months when the corresponding quarterly data become available, that is, about 60 days after the reference quarter.
An additional step is added to the estimation procedure: a specific calibration step that assures that monthly data are consistent with the quarterly ones (for the main aggregates, the weighted average of the three monthly figures, with weights equal to 4/13 or 5/13, equals the corresponding quarterly figure).
The constraints relate to both:
- single months (total population by sex and age groups at different levels of geographical detail);
- the quarterly estimates of the main aggregates: employed, unemployed and inactive, by gender and three age groups (15-24, 25-64, 65+).
CROSS SECTIONAL AND LONGITUDINAL ESTIMATES
GROSS LABOUR MARKET FLOWS Quarterly and Yearly net changes are the final result of a high number of gross flows of different nature and different size
Definition of a reference longitudinal population
- The longitudinal micro-data files constitute a “by-product” of the survey itself.
- LFS is not a “real” panel survey: the longitudinal sample has no information on persons who move out of the selected households, or on households that move out of the municipality.
- Longitudinal estimates can refer only to a specific longitudinal reference population.
Longitudinal weights should:
- reflect the longitudinal population;
- account for panel attrition (usually not at random);
- ensure consistency with the other quarterly estimates.
Reference longitudinal population and weights
The longitudinal population in the IT-LFS is defined as the population resident in the same municipality for the entire 3- or 12-month period, thus excluding:
- deaths;
- those who have moved to other Italian municipalities (change of residence);
- migrants to other countries.
It is fully consistent with the quarterly reference populations: given the general population equation, the longitudinal population is the initial population net of these three outflows.
A multi-step calibration procedure is used to compute the longitudinal weights, which produce results that are also consistent with the quarterly cross-sectional populations and figures.
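The accounting behind this definition can be written as a small worked example. The equation on the original slide is not preserved in this text, so the sketch below is an assumption based on the standard demographic balance and on the flow labels used elsewhere in the deck; function names and figures are hypothetical.

```python
def longitudinal_population(pop_start, deaths, moved_out, emigrated):
    """People resident in the same municipality for the whole period:
    the initial population minus the three outflows."""
    return pop_start - deaths - moved_out - emigrated


def end_population(pop_start, deaths, moved_out, emigrated, entrants):
    """Population balance: the final population is the longitudinal
    population plus everyone who entered during the period (arrivals
    in the municipality and people reaching the age threshold)."""
    return longitudinal_population(
        pop_start, deaths, moved_out, emigrated) + entrants


# Hypothetical figures (thousands), for illustration only.
start = 51_000
longit = longitudinal_population(start, deaths=140, moved_out=310, emigrated=20)
end = end_population(start, deaths=140, moved_out=310, emigrated=20, entrants=560)
print(longit, end)
```

Seen from either end of the period, the longitudinal population is the common part of the two cross-sectional populations, which is what allows the longitudinal weights to be tied to both.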
This approach allows us to produce several kinds of longitudinal estimates of gross flows and transition rates, assuring consistency of a large number of stock/flow results, by sex and age groups, and at NUTS2 and NUTS3 level.
It is straightforward to calculate:
- quarterly flows: from one quarter to the subsequent one (3 months, quarter-on-quarter overlap);
- yearly flows: from one quarter to the same quarter of the subsequent year (12 months, quarter-on-quarter overlap);
- average yearly flows: the average of the 4 yearly flows referring to the 4 quarters of the calendar year (12 months, year-on-year overlap), obtained by appending the yearly longitudinal datasets and dividing their weights by four.
Average yearly flows:
- are consistent with the yearly cross-sectional estimates (annual averages) for the 2 consecutive years;
- allow more detailed analysis at regional level and for subgroups;
- provide longitudinal micro-data and transition matrices.
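The construction of average yearly flows (append the four yearly longitudinal datasets, divide the weights by four) can be sketched in plain Python. The record layout, state codes and weights are hypothetical and serve only to illustrate the pooling step.

```python
from collections import defaultdict

STATES = ("E", "U", "I")  # employed, unemployed, inactive


def flow_matrix(records):
    """Sum longitudinal weights by (initial status, final status)."""
    flows = defaultdict(float)
    for start, end, weight in records:
        flows[(start, end)] += weight
    return {(s, e): flows[(s, e)] for s in STATES for e in STATES}


def average_yearly_flows(yearly_datasets):
    """Append the yearly datasets and divide each weight by their number
    (four when one dataset per quarter of the calendar year is used)."""
    n = len(yearly_datasets)
    pooled = [(s, e, w / n) for data in yearly_datasets for (s, e, w) in data]
    return flow_matrix(pooled)


def transition_rates(flows):
    """Divide each flow by the total of its initial status."""
    rates = {}
    for s in STATES:
        row_total = sum(flows[(s, e)] for e in STATES)
        for e in STATES:
            rates[(s, e)] = flows[(s, e)] / row_total if row_total else 0.0
    return rates


# Hypothetical weighted records (thousands), one list per starting quarter.
q1 = [("E", "E", 90.0), ("E", "U", 10.0), ("U", "E", 20.0)]
q2 = [("E", "E", 94.0), ("E", "U", 6.0), ("U", "E", 16.0)]
q3 = [("E", "E", 92.0), ("E", "U", 8.0), ("U", "E", 24.0)]
q4 = [("E", "E", 88.0), ("E", "U", 12.0), ("U", "E", 20.0)]

avg = average_yearly_flows([q1, q2, q3, q4])
rates = transition_rates(avg)
print(avg[("E", "E")], rates[("E", "U")])
```

Pooling the micro-data rather than averaging four published tables keeps a single weighted file from which any sub-group transition matrix can be derived.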
GESIS – Mannheim, 5 – 6 march 2009 Complete Matrix with net and gross flows. Quarter – Quarter (Thousands)
Net change in employment: +384
Net change due to migratory flows
Net change due to demographic flows: -35
Net change due to longitudinal population: +96
Transition Matrix for longitudinal population. Quarter – Quarter (Thousands)
Net change: +96
Figure labels: persistence in employment, leaving employment, entering employment
Points of discussion about consistency between stock and flows
- Is it worth having this consistency?
- This methodological approach requires data on the longitudinal population of good quality and detail, which is the case for Istat.
- It would be interesting to study the possibility of using it in other countries, or at European level.
- What could be the limitations or the advantages of this method in countries with a different survey design (sampling of dwellings, area samples, etc.)?
A brief exercise on WAVE APPROACH
A brief exercise on Wave Approach
- IT-LFS has never used the wave approach: all the variables are collected, in all quarters, on the whole sample.
- We can therefore simulate a wave approach on past data and compare the results with the annual averages already disseminated.
- We assumed that some of the structural variables were observed only in the first waves of the 4 quarters.
This exercise was conducted to evaluate the impact of introducing the wave approach on estimation procedures, in terms of coherence/consistency between yearly estimates (from the sub-sample) and annual averages (from the full sample).
Rotational pattern, full and sub-samples
[Figure labels: SUB-SAMPLE, STRUCTURAL VARIABLES]
- The sub-sample has the same theoretical sample size as a quarterly sample.
- We reweighted the sub-sample, benchmarking it to the averages of the 4 quarters (from the full samples), to obtain consistency with the annual averages.
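Benchmarking the sub-sample to full-sample annual averages can be illustrated with a small raking (iterative proportional fitting) sketch. Raking is used here only as a simple stand-in for the calibration estimators mentioned in the slides; the data, the margins and the function name are hypothetical.

```python
import numpy as np


def rake(weights, categories, margins, n_iter=500, tol=1e-9):
    """Scale weights in turn so that each categorical margin matches its
    target total; stop once every margin is within tol before scaling."""
    w = np.asarray(weights, dtype=float).copy()
    categories = [np.asarray(c) for c in categories]
    for _ in range(n_iter):
        max_gap = 0.0
        for cats, targets in zip(categories, margins):
            for level, target in targets.items():
                mask = cats == level
                current = w[mask].sum()
                max_gap = max(max_gap, abs(current - target))
                w[mask] *= target / current
        if max_gap < tol:
            break
    return w


# Hypothetical full-sample quarterly estimates by labour status (thousands).
full_sample_quarters = {
    "E": [328.0, 332.0, 334.0, 326.0],
    "U": [62.0, 58.0, 59.0, 61.0],
    "I": [210.0, 210.0, 207.0, 213.0],
}
# Annual averages of the four quarters act as benchmarks for the sub-sample.
status_margin = {k: sum(v) / 4 for k, v in full_sample_quarters.items()}
sex_margin = {"M": 310.0, "F": 290.0}   # assumed annual population by sex

# Hypothetical sub-sample of six persons with their starting weights.
sex = ["M", "M", "M", "F", "F", "F"]
status = ["E", "U", "I", "E", "E", "I"]
start_weights = [100.0] * 6

sub_w = rake(start_weights, [sex, status], [sex_margin, status_margin])
employed_total = sub_w[np.array(status) == "E"].sum()
print(employed_total)
```

After reweighting, the sub-sample annual totals by labour status coincide with the full-sample annual averages, which is the kind of consistency the regulation on the next slide asks for.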
Commission Regulation (EC) No 377/2008
The regulation sets “conditions for the use of a sub-sample for the collection of data on structural variables”. It states that:
- “Consistency between annual sub-sample totals and full-sample annual averages shall be ensured for employment, unemployment and inactive population by sex and for the following age groups: 15 to 24, 25 to 34, 35 to 44, 45 to 54, 55 +”
- “The sample used to collect information on ad hoc modules shall also provide information on structural variables”.
Conditions for weighting the sub-sample
Considering that:
- the sub-sample has to be used for the current ad hoc modules and the future Supplementary Annual Modules (we want to be able to analyse regional differences);
- it is important to take into account also the differences between the theoretical and the actual sample in terms of distribution over time and space (to compensate for possibly different total non-response in different quarters and regions);
- the higher the total non-response and the bias in the different waves or quarters, the higher the risk of inconsistencies between the two kinds of annual averages.
Weighting the sub-sample:
- Some yearly variables in the sub-sample could be strongly correlated with those collected quarterly, not only with ILO status.
- If the sub-sample is biased with respect to those quarterly variables, then the estimate of the yearly variable could be biased.
- For example, “income”, “second job” and “looking for another job” are probably correlated with STAPRO, FTPT, TEMP, NACE, ISCO.
Under these conditions, is the minimum set of requirements in Regulation 377/2008 sufficient to achieve coherent results and to produce unbiased yearly estimates?
Different sets of constraints (SoC)
The conditions in the regulation do not seem sufficient to us.
Several sets of final weights have been obtained:
- using calibration estimators;
- starting from the quarterly weights;
- with several different sets of constraints (SoC), built from:
  - the annual distribution of the reference population by sex, age, region and citizenship (similar to the quarterly weights);
  - the annual averages of several main variables correlated with the structural variables.
For each SoC, all constraints are defined simultaneously at NUTS 2 level.
SoC_1: Only the minimum set of constraints in Regulation 377/2008, but at NUTS2 level.
SoC_2: The same constraints on the populations as in the regular quarterly weights; not those in Regulation 377/2008 regarding labour status.
SoC_3: The same constraints on the populations as in the quarterly weights; plus WSTATOR by sex and broad age groups (the same traditionally used at NUTS3 level).
SoC_6: The same constraints on the populations as in the quarterly weights; plus WSTATOR by sex and broad age groups (the same traditionally used at NUTS3 level); plus STAPRO (employee/self-employed), FTPT, TEMP, NACE (3 groups), ISCO (3 groups).
SoC_7: The same constraints on the populations as in the quarterly weights; plus labour status by sex and age groups (the same traditionally used at NUTS3 level); plus STAPRO (employee/self-employed), FTPT, TEMP, NACE (3 groups), ISCO (3 groups); plus population 15 and over, by sex and labour status, by quarter.
Different sets of constraints (SoC)
Table 1 – INCDECIL: Annual averages obtained from the full sample and the sub-sample using different sets of constraints. Year (Percentages)
For INCDECIL, the sub-sample gives higher relative frequencies than the full sample for lower monthly pay, especially for the first decile. The opposite happens for higher monthly pay. The differences become larger in SoC_7, where constraints are also put on the characteristics of employment.
Different sets of constraints (SoC)
Table 2 – MAINSTAT: Annual averages obtained from the full sample and the sub-sample using different sets of constraints. Year (Absolute values, Percentages)
For MAINSTAT (see Table 2), the sub-sample gives a lower number of employed (by about 100 thousand) and a higher number of unemployed (by 100 thousand) than the full sample. The greatest difference occurs with SoC_2, where no constraints are put on labour status. There is not much difference between the other SoCs.
Different sets of constraints (SoC)
Table 3 – EXIST2J-STAPRO2J-NACE2D2J-HWACTUA2: Annual averages obtained from the full sample and the sub-sample using different sets of constraints. Year (Absolute values, Percentages, averages)
Table 3 shows the results for some of the variables related to the SECOND JOB.
- The sub-sample gives a much higher number of employed with a second job (+30%) and a much higher incidence (from 1.4% to 1.9%).
- As a consequence, the total number of hours worked is higher (by about 20%), with a much smaller number of hours worked per employee (from 23.5 to 18.6).
- The estimates are higher for both employees and self-employed, and in all the main NACE sectors.
- However, the sub-sample tends to reduce the incidence of employees and of the employed in the service sector, and to increase the incidence of the self-employed and of the employed in agriculture and industry.
Points of discussion about the wave approach
- Panel attrition undoubtedly exists, so quarterly estimates could be biased. Their annual averages could therefore also be biased, although they have higher precision.
- On the other hand, it also seems reasonable that estimates from the sub-sample should be “in principle” less biased than those from the full sample, but with lower precision.
An important question arises: is it methodologically correct to benchmark the sub-sample estimates to the full-sample ones if we suspect that the latter are more biased than the former?
Points of discussion about the wave approach
- Are we sure that the benefits of a reduction in respondent burden are so high that they compensate for, or exceed, the much bigger effort needed for the continuous management of questionnaires and micro-data and for the implementation of a more complex methodology?
- Time series for the structural variables could have breaks when the wave approach is introduced. How should this be managed?
- What would be the dissemination strategy, given the new limitations due to the consistency problem?
- What kind of yearly indicators can be produced: levels or percentage distributions?
THANK YOU FOR YOUR ATTENTION!
... and VERY MUCH INDEED for your PATIENCE, TOLERANCE, TENACITY, mental alertness, physical resistance and great capacity to remain calm...
European Conference on Quality in Official Statistics – Q May Helsinki
Complete Matrix with net and gross flows. Quarter – Quarter (Thousands)
[Table, cell values not preserved: labour status at 2007Q1 by labour status at 2008Q1 (Employed, Unemployed, Inactive, Total) for the longitudinal population, with additional entries for deaths, people leaving the municipalities, people entering the municipalities and children reaching the reference age]
Net change in cross-sectional employment: +324
Net change due to migratory flows
Net change due to demographic flows: -49
Net change due to longitudinal population flows
Transition Matrix for longitudinal population. Quarter – Quarter (Thousands)
[Table, cell values not preserved: labour status at 2007Q1 by labour status at 2008Q1 (Employed, Unemployed, Inactive, Total) for the longitudinal population]
Net change: +105
Figure labels: persistence in employment, leaving employment, entering employment