Published by Kevin McDonald. Modified over 8 years ago.
1
Researchers’ Usage of Microdata
The example of Statistics Finland
Basic presentation
Consultation Mission on Promoting the activity and Creating a positive image of the Ukrainian State Statistical bodies
Kiev, Ukraine, 9-12 December 2014
Petteri Baer, Marketing Manager, Statistics Finland
Courtesy of Ms Satu Nurmi & Ms Marianne Johnson, Statistics Finland
2
Contents
Background and challenges
The importance of adhering to the UN Fundamental Principles
Confidentiality of individual data in focus
Research Services for microdata at Statistics Finland
Development of the unit since 2010
Datasets, registers and linking microdata
Access to microdata
Rules
Remote access system
Future challenges
Building a National Remote Access System
3
Background (1)
A really wide range of aggregate data is available on Statistics Finland’s web site, including the “StatFin” databases
Easy option to order tailor-made, customized tables
As these are not financed by the State budget, they belong to Statistics Finland’s chargeable services
For many research questions microdata is needed
In Finland the Statistics Act states that confidential data, collected for statistical purposes, can be released for scientific research and statistical surveys concerning social conditions
4
Background (2)
The Statistics Act states that microdata can be released only in a form which does not disclose the individual person, enterprise or basic unit, i.e. in an unidentifiable form
With a few exceptions listed in the Act:
  Some variables of the Business Register data
  Data on the age, gender, education, profession and socio-economic status
  The cause of death
Enterprises must be protected against direct identification, and persons (individuals) also against indirect identification
The legislation brings us a challenge: protection of confidentiality vs. providing researchers with the relevant data cost-efficiently
5
Background (3)
Until 2009, protected data on enterprises could only be accessed on specific premises of Statistics Finland, at the Research Laboratory. Problematic data protection!
Since the beginning of 2010 it has finally been possible to use data via a remote access system
Anonymized samples of person data can now also be released to researchers outside Statistics Finland
Anonymizing is a must; de-identifying is not sufficient!
At Statistics Finland a Committee for Statistical Ethics has a strong say in the most complicated microdata exchange questions
  It provides guidance to the directors, who issue the needed permits for usage of anonymized data
6
Guidelines in the UN Fundamental Principles (1)
The standards for national systems of official statistics are given as a set of principles, agreed worldwide as the United Nations Fundamental Principles of Official Statistics
First adopted in the UNECE region by the UN Economic Commission for Europe in 1992
Adopted by the UN Economic and Social Council in 1994
Already in use as the basic principles for more than 20 years; recently endorsed by the UN General Assembly, fully and unchanged, in 2013
Practically all National Statistical Agencies in all countries of the world have also endorsed these principles
Ukraine did so already in the year 199X (well, you know better than me)
7
Guidelines in the UN Fundamental Principles (2)
Paragraph 6 of the “Fundamental Principles”:
“6. Confidentiality. Individual data collected by statistical agencies for statistical compilation, whether they refer to natural or legal persons, are to be strictly confidential and used exclusively for statistical purposes.”
8
Guidelines in the UN Fundamental Principles (3)
A good practical guide, with a more extensive explanation and interpretation of what this implies, is given in the UNECE material “How Should a Modern National System of Official Statistics Look? - The relationship between international principles on systems of official statistics and national statistical legislation”
It is available in English and in Russian at
http://www.unece.org/fileadmin/DAM/stats/documents/applyprinciples.e.pdf
http://www.unece.org/fileadmin/DAM/stats/documents/applyprinciples.r.pdf
9
What does Paragraph 6 of the “Fundamental Principles” mean in practice? (1)
Statistical confidentiality is aimed at protecting the privacy of individual units, both physical persons and legal units, about which data are collected and processed. It has two components:
  Producers of official statistics use data about protected individual units only for statistical purposes (official statistics in the first place)
  Producers of official statistics do not disclose, either directly or indirectly, characteristics about protected units to any third party in such a way that any user might derive additional information (information not known to the user before) about a protected unit
10
What does Paragraph 6 of the “Fundamental Principles” mean in practice? (2)
Borderline case: use of addresses from statistical registers for statistical surveys outside official statistics (research purposes) or for commercial or marketing purposes by private actors, if foreseen in the statistical legislation
For instance, in Finland the Statistics Act lists a number of Business Register items which the law considers to be available information in a developed society and which can thus be disclosed at the individual level
11
What does Paragraph 6 of the “Fundamental Principles” mean in practice? (3)
Possible exceptions for disclosure of microdata without identifiers to a third party (if the law explicitly permits, and following strict protocols):
  To another producer of official statistics within the same country for its tasks
  To university or private researchers (incl. outside the country) against signature of a contract, strictly protecting confidentiality
  To a statistical department of an international or supranational organisation, provided that there are clear rules in these organisations for protecting confidentiality, especially against non-statistical use
  As public use files (risks of indirect disclosure eliminated)
12
What does Paragraph 6 of the “Fundamental Principles” mean in practice? (4)
Possible exceptions for disclosure of microdata without identifiers to a third party (cont.)
Good practice for granting researchers access to microdata is a secure part of the NSA, organised either physically on the premises of the NSA or by permitting remote access to a special server, with strict control over what is visible, downloaded or taken away in other form
Access to microdata that are the result of matching different sources (other than statistical registers) has to be treated with greater restraint, which may lead to a complete exclusion from the above access options
13
What does Paragraph 6 of the “Fundamental Principles” mean in practice? (5)
If producers other than the NSO carry out statistical surveys, they should have the right to receive lists of addresses from the relevant statistical registers from the NSO, according to the approved sample design
All staff and other persons involved in handling confidential data have to sign confidentiality commitments on appointment
The Statistical Law should provide for sufficient penalties for breaches of confidentiality
14
Additional material on Statistical Ethics
Guidelines on Professional Ethics
Published by Statistics Finland and available to anybody in English on the web site http://tilastokeskus.fi/org/periaatteet/eettinenopas_en.pdf
15
At Statistics Finland… Research Services for microdata (1)
Research Services for microdata handle user licence requests for unit level data based on register and survey data of Statistics Finland
Research Services has been operating since June 2010
Our services are located in the Standards and Methods division
The director of the division is Mr Timo Koskimäki and the head of unit is Mr Jussi Heino
The unit has a staff of 15 persons:
  8 working on providing researchers with microdata and on developing data and metadata
  3 working on microsimulation
  3 working on customer-ordered research and statistics
Still growing?
16
Research Services for microdata (2)
The main tasks of the Research Services are
  To produce both ready-made data and tailored data sets for researchers
  To participate in development work and research projects
  To develop and maintain a microsimulation model for income transfers and taxation
The Research Services are financed partly as a chargeable service, i.e. by the customers, and partly from the budget of Statistics Finland
Costs for data and technical guidance are included in the service prices
The prices for the microdata services can be found on the web site http://tilastokeskus.fi/tup/hinnat/tutkimuspalvelut_en.html
17
For some researchers or assignments additional services may be needed
Interview and Survey services
  The price for a data collection made by Statistics Finland's interview and survey services comprises the costs of collecting and designing the content of the data, the costs of the fieldwork stage of the data collection, and the costs of processing and editing the data
  The most important factors affecting the costs of the different survey implementation modes are always discussed in advance, after which preliminary cost estimates can be given for them. The final cost estimate is made after establishing the implementation alternative that suits the customer's needs and the survey details.
Methodological services
  An hourly charge of between EUR 80 and 140 is applied, depending on the nature of the assignment. A monthly charge may also be applied in large projects.
18
Challenges for the Research Services
Increased demand for comprehensive micro-level databases for research purposes in recent years
Complaints about long delivery times, mainly from researchers
Also some feedback about too small samples and too strict data protection
All of our development work aims at satisfying the customers
Customers need better information about sources and variables, the possibilities and restrictions for getting data, prices and delivery times
Customers prefer to have the services provided from one service point, as opposed to having multiple contact persons in different statistical units of Statistics Finland
19
Development since 2010
New organisation: the staff providing microdata services was centralised into one unit
Improved clarity of pricing, rules and guidance
Web site presentation of the service was updated
New software to follow up the process of research projects
Strongly increased number of ready-made datasets
The Research Services unit now programs the SAS code itself; earlier this was done by the IT section
Development of the Microsimulation Model
Increased usage of remote access
The renewal of the Statistics Act in 2013
Work on establishing a National Remote Access System
  Finnish Microdata Access Services is a joint project
20
Datasets: more details will come up in the Advanced Presentation!
Rich linkable administrative register databases and survey data are goldmines for e.g. empirical economic and health analysis
Long time series based on register data
Most of the enterprise-level annual microdata sets are ready-made and easily available for research purposes
Tailor-made datasets take longer to produce and they also cost more due to substantial amounts of manual work. Tailor-made sets are normally based on register data related to persons, households and housing, or on interview data related to living conditions etc.
All data sets sent to outside researchers are mutually linkable by encrypted unit identifiers (personal identification number, enterprise number, establishment number) and can also be linked to researchers’ own data or data sets from other organisations
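The idea of linking data sets through encrypted unit identifiers can be illustrated with a minimal Python sketch. The slides do not say which encryption method Statistics Finland uses; the keyed hash (HMAC-SHA256) below is a swapped-in, commonly used technique, and the key, the function name `pseudonymize`, the enterprise numbers and the figures are all made up for illustration.

```python
import hashlib
import hmac

# Hypothetical secret key; in a real setting it would stay with the data provider.
SECRET_KEY = b"example-key-not-for-real-use"


def pseudonymize(unit_id: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed hash (hex string) that stands in for the unit identifier."""
    return hmac.new(key, unit_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Two made-up data sets that share the same (invented) enterprise numbers.
turnover = {"1234567-8": 250_000, "7654321-0": 90_000}
staff = {"1234567-8": 12, "7654321-0": 3}

# The provider replaces the identifiers before the data sets are handed over.
turnover_p = {pseudonymize(k): v for k, v in turnover.items()}
staff_p = {pseudonymize(k): v for k, v in staff.items()}

# The same identifier always maps to the same pseudonym, so the sets still link.
linked = {
    k: (turnover_p[k], staff_p[k])
    for k in turnover_p.keys() & staff_p.keys()
}
```

Because the same key is used for every data set, the records can be joined without the original identifiers ever appearing in the delivered data.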
21
Rules related to access to microdata
Application and research plan => decision on usage => license to use microdata
Pledge of secrecy
Agreement with the research project
A Committee for Statistical Ethics handles the most complicated microdata access questions and provides guidance to the directors, who issue the permits
Data limitations in enterprise data:
  Encryption of unit identifiers
  Removal of sensitive information
  Can only be used online or on-site
For data on individuals:
  Encryption of direct identifiers if used online or on-site
  Anonymization if released to researchers
22
The Remote access system
The aim has been to create a remote access system to microdata in order to increase:
  + Regional equality
  + User-friendliness
  + Data protection
  + Efficient use of microdata
The main practices come from Sweden, Denmark and the Netherlands
Safe environment for authorized users only
All microdata remains at Statistics Finland; the results of the data analyses are always checked
Individual and enterprise level data are protected by data disclosure rules
23
The Remote access system: short description
Researchers use data on Statistics Finland’s server from their own workplace via a secured Internet connection
Research organisations are responsible for their users
On the server researchers can use a Windows desktop, where they have access to the permitted data and metadata
Statistical programs used: STATA, SPSS, R, SAS
Secure Internet connection with an SMS passcode; servers are disconnected from the production network; all log files are saved
Efficient use currently allows for 16-32 simultaneous users
The researcher cannot copy or transfer any data out of or into the system
Output checking always takes place
In 2012: 14 institutes, 104 researchers and 42 projects served
In 2014: 23 institutes, over 150 researchers and 88 projects served
24
Rules related to the Remote access service
In addition to user license, pledge of secrecy and agreement with the research project:
  Agreement on remote access with the research organization (annex of agreement: data security practices of organization)
  Contact person responsible for communication and user training
  Research organizations are responsible for their users
To prevent the identification of individual enterprises and individuals, output is manually checked (sometimes even in two phases): before leaving Statistics Finland (always) and before the publication of the results (sometimes)
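Output checking at Statistics Finland is described above as manual. Purely as an illustration of the kind of rule a checker might apply, the Python sketch below flags frequency-table cells with very few units. The threshold of 3, the function name `check_frequency_table` and the sample records are assumptions made for this example, not rules taken from the presentation.

```python
from collections import Counter

# Hypothetical minimum number of units per cell; the real rules are not given here.
MIN_CELL_COUNT = 3


def check_frequency_table(records, key, min_count=MIN_CELL_COUNT):
    """Count records per category and separate safe cells from risky ones."""
    counts = Counter(record[key] for record in records)
    safe = {cell: n for cell, n in counts.items() if n >= min_count}
    flagged = sorted(cell for cell, n in counts.items() if n < min_count)
    return safe, flagged


# Made-up research output: enterprises counted by region.
records = [{"region": "A"}] * 5 + [{"region": "B"}] * 2

safe, flagged = check_frequency_table(records, "region")
# Cells listed in `flagged` would be passed to a human checker before release.
```

A screen like this can only support the manual check; it does not replace judgment about indirect identification.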
25
Future challenges
Development of the ready-made databases (updating existing data and adding new data)
Development of metadata and documentation (publication on the web)
Continued centralisation of research services into the Research Services unit
On-line use as the principal mode of use for microdata
Development of public use files for education and general use
Improved communication with the research community
Co-operation with other register agencies in Finland
International co-operation
26
Time for a coffee break?
Thank you for your attention
petteri.baer@stat.fi
www.stat.fi