1 Statistical registers by restricted neighbor imputation – An application to the Norwegian Agriculture Survey. Nina Hagesæther and Li-Chun Zhang, Statistics Norway
2 Outline: Objective; Method; Empirical results; Future work
3 Statistical register: Objective. [Diagram labels: Units; Variables; Known; Unknown]
4 Statistical register: Objective. [Diagram labels: Units; Register variables; Known; Unknown; Good quality; Poor quality]
5 Statistical register: Objective. [Diagram labels: Units in sample; Units outside sample; Target variables; Register variables; Known; Unknown]
6 Objective: Triple-goal criterion (Zhang and Nordbotten, 2008): 1. Efficient estimates 2. Correct covariance structure 3. Non-stochastic
7 The RENI Method: REstricted Neighbor Imputation. Restrictions: totals are already estimated. Donors = respondents. Receivers = population – respondents. Nearest neighbor (NN) = unit in the same imputation class that satisfies
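The donor/receiver setup on this slide can be illustrated with a minimal Python sketch of nearest-neighbor matching within imputation classes. The slide's own nearest-neighbor condition is not recoverable from the text, so the Euclidean distance on register variables used here is an assumption, and the function name `nearest_donor` and its arguments are hypothetical.

```python
import numpy as np

def nearest_donor(x_receivers, x_donors, class_receivers, class_donors):
    """Return, for each receiver, the index of the closest donor in its class.

    Assumption (not from the slides): closeness is Euclidean distance
    on the register variables. A receiver whose class has no donor gets -1.
    """
    x_receivers = np.asarray(x_receivers, dtype=float)
    x_donors = np.asarray(x_donors, dtype=float)
    class_receivers = np.asarray(class_receivers)
    class_donors = np.asarray(class_donors)

    chosen = np.full(len(x_receivers), -1, dtype=int)
    for i, (x, c) in enumerate(zip(x_receivers, class_receivers)):
        # Only donors in the same imputation class are candidates.
        candidates = np.flatnonzero(class_donors == c)
        if candidates.size == 0:
            continue
        dist = np.linalg.norm(x_donors[candidates] - x, axis=1)
        chosen[i] = candidates[np.argmin(dist)]
    return chosen
```

A receiver would then take over the target-variable values of the donor at the returned index.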
8 Algorithm. Fine-tune phase (FT): – Donor among the k nearest neighbors – Choose the donor that best satisfies the restrictions – An iterative process. Jump-start phase (JS): – NN imputation for a given proportion of totals – Speeds up the process – Proportion can be reduced or JS omitted
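One possible reading of the two phases, for a single imputation class, is sketched below in Python. Several details are assumptions rather than the authors' method: `delta` is taken as the sum of absolute relative deviations from the totals, `target` is the total the receivers should add up to (assumed nonzero), the jump-start freezes receivers at their nearest neighbor until a share `js_share` of the totals is reached, and the fine-tune phase revisits the remaining receivers until no donor changes. All names are hypothetical.

```python
import numpy as np

def delta(imputed_total, target):
    """Assumed discrepancy: sum of absolute relative deviations (target nonzero)."""
    return float(np.sum(np.abs(imputed_total - target) / np.abs(target)))

def reni_sketch(x_don, y_don, x_rec, target, k=3, js_share=0.5, max_passes=10):
    x_don = np.asarray(x_don, dtype=float)    # register variables, donors
    y_don = np.asarray(y_don, dtype=float)    # target variables, donors
    x_rec = np.asarray(x_rec, dtype=float)    # register variables, receivers
    target = np.asarray(target, dtype=float)  # totals the receivers should reach

    # k nearest donors of every receiver, closest first (Euclidean: assumption).
    dist = np.linalg.norm(x_rec[:, None, :] - x_don[None, :, :], axis=2)
    knn = np.argsort(dist, axis=1, kind="stable")[:, :k]
    assign = knn[:, 0].copy()                 # start from plain NN imputation
    n_rec = len(x_rec)

    # Jump-start: keep receivers at their nearest neighbor until the frozen
    # totals would exceed js_share of the target in some variable.
    frozen_total = np.zeros_like(target)
    first_ft = 0
    while first_ft < n_rec and np.all(
        frozen_total + y_don[assign[first_ft]] <= js_share * target
    ):
        frozen_total += y_don[assign[first_ft]]
        first_ft += 1

    # Fine-tune: for each remaining receiver pick, among its k nearest donors,
    # the one giving the smallest delta; repeat until a pass changes nothing.
    running = y_don[assign].sum(axis=0)
    for _ in range(max_passes):
        changed = False
        for i in range(first_ft, n_rec):
            base = running - y_don[assign[i]]
            scores = [delta(base + y_don[d], target) for d in knn[i]]
            best = knn[i, int(np.argmin(scores))]
            if best != assign[i]:
                assign[i] = best
                changed = True
            running = base + y_don[best]
        if not changed:
            break
    return assign, running
```

Setting `js_share=0` omits the jump-start, so every receiver is fine-tuned. That costs more delta evaluations, which is consistent with the slide's remark that JS speeds up the process.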
9 Agriculture Survey: units in the population, in the sample. 84 target variables. Publish: class of farmlands in decares (6), farming activity (FA, 12), county. Important topics: leasing, investment, maintenance
10 Empirical results – Number of neighbors. FA-2: 2660 receivers, 727 donors; FA-4: 9984 receivers, 3266 donors; FA-10: 384 receivers, 243 donors; FA-11: 586 receivers, 340 donors
11 Empirical results (FA-4) – Restriction. Donors: 3000, Receivers: Alt 1: Equal weights for all 84 restrictions when calculating delta; Alt 2: 12 chosen restrictions given 10 times higher weights; Alt 3: Alternative 2 plus 9 sets of sub-population restrictions. [Figure labels: (84), (12x10), (12)]
12 Empirical results (FA-10) – Restriction. Donors: 240, Receivers: 380. Alt 1: Equal weights for all 84 restrictions when calculating delta; Alt 2: 12 chosen restrictions given 10 times higher weights; Alt 3: Alternative 2 plus 9 sets of sub-population restrictions. [Figure labels: (84), (12x10), (12)]
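The weighting alternatives can be made concrete with a small Python sketch. The slides do not give the formula for delta, so the weighted mean of absolute relative deviations used here is an assumption, as are the function name `weighted_delta` and the positions of the 12 chosen restrictions. Alternative 3 would add further sub-population restrictions to the same calculation and is not shown.

```python
import numpy as np

def weighted_delta(imputed_totals, target_totals, weights):
    """Assumed form of delta: weighted mean of absolute relative deviations."""
    imputed_totals = np.asarray(imputed_totals, dtype=float)
    target_totals = np.asarray(target_totals, dtype=float)  # assumed nonzero
    weights = np.asarray(weights, dtype=float)
    rel_dev = np.abs(imputed_totals - target_totals) / np.abs(target_totals)
    return float(np.sum(weights * rel_dev) / np.sum(weights))

n_restrictions = 84

# Alt 1: equal weight for all 84 restrictions.
w_alt1 = np.ones(n_restrictions)

# Alt 2: 12 chosen restrictions get 10 times the weight of the others.
# Which 12 are chosen is not stated in the slides; the first 12 are a placeholder.
chosen = np.arange(12)
w_alt2 = np.ones(n_restrictions)
w_alt2[chosen] = 10.0
```

Under Alt 2, a deviation in one of the chosen restrictions moves delta ten times as much as the same relative deviation in any other restriction, so the donor search favors matching those totals.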
13 Empirical results – CV RENI / Weighting
14 Empirical results – Correlations (Farming activity 10)
15 Future work. Restriction: – How to choose restrictions – How to calculate delta. Adjust for partial non-response: – Donor and receiver do not match on the observed values of the receiver – Partially missing target variables – Treat unit non-response on the target variables as partial non-response on the combined auxiliary and target variables
16 Thank you for your attention! Statistics Norway in Oslo Statistics Norway in Kongsvinger
17 Empirical results – Computation time (Farming activity 4; Farming activity 10)
18 Empirical results – Computation time
19 Empirical results – Two-way classification (Farming activity 10)