1 Confidentiality and statistics on grids
A proposal on common rules for handling confidentiality to the Board of Confidentiality at Statistics Norway
The European Forum for Geostatistics workshop in The Hague, Netherlands, 5th – 7th of October 2009: “bridge the gap” between theory and practice in GeoStatistics
Session 1: Small area statistics
Vilni Verner Holst Bloch, MSc. landscape ecology and natural resources, Statistics Norway, Otervegen 23 N Kongsvinger, Tel : ++47 / Fax : ++47 /
Bjørn Thorsdalen, Population Statistics, Statistics Norway, Otervegen 23 N Kongsvinger, Tel : ++47 / Fax : ++47 /
2 Overview of the presentation
Background
System of grids for national statistics
Examples of confidentiality issues
Different confidentiality rules
Examples of the use of today's confidentiality rules
Guidelines for grid statistics
Further work
3 Background
A) Requests from users (insurance companies, scientists, companies with localisation or marketing issues, general public, education purposes)
B) Internal drive within Statistics Norway (coming GIS and censuses)
C) Partnership in National INSPIRE Forum (obligations)
D) New possibilities (better presentation of spatial statistics, spatial analysis etc.)
The more users need and the more we produce, the more crucial common rules for confidentiality become.
4 [Diagram. Labels: Norwegian Mapping and Cadastre Authority; Firewall; Geodatabase; Statistical base registers; Statistics; Statistics Norway; ArcGIS coverages, shape files etc.; WMS; ssb.no; Statbank; External WMS/WFS providers; Local copy; “Wall of confidentiality”]
5 Statistical grids for Norway
Grid name       Cell size    Number of cells
SSB100m (1)     0.01 km2
SSB125m (1)     0.01 km2
SSB250m (1)
SSB500m (1)     0.25 km2
SSB1km          1 km2
SSB5km          25 km2
SSB10km         100 km2
SSB25km (2)     625 km2      500 cells
SSB50km (2)                  150 cells
SSB100km (2)                 40 cells
SSB250km (2)                 10 cells
SSB500km (2)                 4 cells
(1) Because of limitations in many software packages, and for practical use, these grids are recommended as grids with county coverage.
(2) These grids might also cover sea territories. Number of cells refers to coverage of the Norwegian mainland. One has, however, to be aware of deviations in grid cell areas for regions remote from the Norwegian mainland and Svalbard.
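As a rough illustration of how a square grid of a given cell size relates to cell area and to point locations, here is a minimal Python sketch. It assumes coordinates in a projected, metre-based system; the function names are hypothetical, and this is a generic square-grid calculation, not Statistics Norway's own cell identifier scheme.

```python
import math
from typing import Tuple


def cell_area_km2(side_m: float) -> float:
    """Area in km2 of a square grid cell with the given side length in metres."""
    return (side_m / 1000.0) ** 2


def cell_origin(x: float, y: float, side_m: int) -> Tuple[int, int]:
    """Lower-left corner (in metres) of the grid cell containing point (x, y).

    Assumes the grid is aligned to the coordinate origin, so that finer
    grids nest inside coarser ones whose side is a multiple of theirs.
    """
    return (math.floor(x / side_m) * side_m,
            math.floor(y / side_m) * side_m)


if __name__ == "__main__":
    # Made-up point, for illustration only.
    x, y = 262345.7, 6651234.2
    for side in (250, 1000, 5000):
        print(side, cell_area_km2(side), cell_origin(x, y, side))
```

Because every cell is snapped to a multiple of its side length, a 250 m cell always lies inside exactly one 1 km cell, which is what makes aggregation between such grid levels straightforward.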
6 Confidentiality examples: Number of farms. 1x1 km. [Map. Legend: 1 – 3 farms; 4 or more farms]
7 Confidentiality examples: Building stock. 100x100 m. [Map. Legend: 1 – 3 buildings; 4 or more buildings]
8 Confidentiality examples: Night time population. 1x1 km. Year 2000 over [Map. Legend: 1 – 9 persons 2000; 10 or more 2000; 1 – 9 persons 2008 (new settlements); 10 or more 2008 (new settlements)]
9 Number of enterprises. 1x1 km. [Map. Legend: 1 – 3 enterprises; 4 or more enterprises]
10 Confidentiality examples: Leisure homes. 1x1 km. [Map. Legend: 1 – 3 leisure homes; 4 or more leisure homes]
11 Confidentiality examples: Night time population. 1x1 km. Year 2008 over [Map. Legend: 1 – 9 persons 2000 (abandoned cell); 10 or more persons 2000 (abandoned cell)]
12 Confidentiality examples: Number of grid cells and inhabitants by grid cell sizes. Per cent.
[Table. Column groups: Share of grid cells with less than N persons; Share of persons in grid cells with less than N persons. Columns in each group: N > 101, N > 51, N > 31, N > 11, N > 4]
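The two shares named in this table can be computed from per-cell person counts roughly as follows. This is a minimal sketch: the function name and the sample counts are made up for illustration, and it assumes that only populated cells are counted in the denominator.

```python
from typing import Iterable, Tuple


def shares_below(cell_counts: Iterable[int], n: int) -> Tuple[float, float]:
    """Return (share of cells, share of persons), in per cent,
    for populated grid cells holding fewer than n persons."""
    populated = [c for c in cell_counts if c > 0]
    if not populated:
        raise ValueError("no populated grid cells")
    small = [c for c in populated if c < n]
    share_cells = 100.0 * len(small) / len(populated)
    share_persons = 100.0 * sum(small) / sum(populated)
    return share_cells, share_persons


if __name__ == "__main__":
    # Hypothetical person counts per grid cell (not real data).
    counts = [1, 2, 3, 5, 9, 12, 40, 128]
    for n in (4, 11):
        print(n, shares_below(counts, n))
```

With skewed counts like these, a large share of cells can fall below a threshold while holding only a small share of the persons, which is the contrast the two column groups are set up to show.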
13 Confidentiality examples: Frequency of agricultural enterprises by 1x1 km grid cells. 2008. [Chart. Axes: Grid cells by number of agricultural enterprises; Number of agricultural enterprises]
14 Recommendation given to the Board of Confidentiality at Statistics Norway
Previous and existing rules for handling confidentiality on grids are not adequate.
Confidentiality rules should be handled at the lowest reasonable geographical level.
Official statistics should not be given at all geographical levels/grid sizes.
One should have a set of limits/threshold values dependent on the sensitivity of the topics for statistics or the quality of the sources for statistics.
15 Recommendation given to the Board of Confidentiality at Statistics Norway
The following has been recommended to the Board:
1. Total figures (persons, enterprises, buildings, dwellings) and non-sensitive variables (age, sex, building type, NACE code) do not need to be anonymised.
2. Statistics on sensitive variables can be given if total figures exceed threshold values. The threshold value is to be set by the responsible department for each statistic, depending on quality, sensitivity and detail. Threshold values are fixed to total figures of 10, 30 or 50. No further anonymisation is done.
3. Grid sizes of 125m x 125m and 500m x 500m shall not be used for official statistics.
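The three recommended rules could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not Statistics Norway's implementation: the function and constant names are hypothetical, "exceed" is read as strictly greater than the threshold, and a withheld breakdown is represented as None.

```python
from typing import Dict, Optional

ALLOWED_THRESHOLDS = (10, 30, 50)   # fixed threshold values (rule 2)
EXCLUDED_GRID_SIZES_M = (125, 500)  # grids not used for official statistics (rule 3)


def release_cell(total: int,
                 sensitive: Dict[str, int],
                 threshold: int,
                 grid_size_m: int) -> Dict[str, Optional[object]]:
    """Apply the recommended rules to one grid cell.

    total       -- total figure for the cell (always released, rule 1)
    sensitive   -- breakdown by a sensitive variable (hypothetical keys)
    threshold   -- value chosen by the responsible department: 10, 30 or 50
    grid_size_m -- side length of the grid cell in metres
    """
    if grid_size_m in EXCLUDED_GRID_SIZES_M:
        raise ValueError("grid size not to be used for official statistics")
    if threshold not in ALLOWED_THRESHOLDS:
        raise ValueError("threshold must be 10, 30 or 50")
    # Assumption: "exceed" means strictly greater than the threshold.
    breakdown = dict(sensitive) if total > threshold else None
    return {"total": total, "sensitive": breakdown}


if __name__ == "__main__":
    # Made-up cells, for illustration only.
    print(release_cell(8, {"category_a": 3, "category_b": 5}, 10, 1000))
    print(release_cell(42, {"category_a": 12, "category_b": 30}, 30, 250))
```

Note that the total is returned even when the sensitive breakdown is withheld, in line with rule 1, and that no rounding or perturbation is applied once the threshold is passed, in line with "no further anonymisation is done".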
16 Further work
Work within Geostat to make guidelines for handling confidentiality issues
Adoption of European rules for handling confidentiality in grid statistics?
Thank you for your attention