Published by Regina Haynes. Modified over 9 years ago.
1
Creating a National Remote Access System for Register-based Research
Marianne Johnson, Statistics Finland
Statistical Data Confidentiality Work Session, Oct 2015
2
Finnish administrative registers
- Several comprehensive national registers contain unit-level data on individuals, families, housing and enterprises
- Compiled and maintained for administrative or statistical purposes, e.g.
  – Population Register Centre (VRK): Population information system
  – Social Insurance Institution (KELA): registers on social benefits received
  – National Institute for Health and Welfare (THL): Medical Birth Register, Care Registers for Social Welfare and Health Care (HILMO), Finnish Cancer Register
  – Ministry of Labour (TEM): register of job seekers
  – Statistics Finland (Tilastokeskus)
21.9.2015 Statistics Finland / Researcher Services
3
Secondary usage of administrative registers
- Production of official statistics in Finland is to a large extent based on registers
  – The population and housing census has been based entirely on register sources since 1990
  – Handbook: Use of Registers and Administrative Data Sources for Statistical Purposes – Best Practices of Statistics Finland
- Register-based research
  – 20 % of doctoral theses in medicine in Finland include data from national registers
5
Prerequisites for register-based research
- Common personal identification number (PIN) in all registers
  – first used in 1964 (two different systems were in use between 1964 and 1970)
  – a digital population register since 1971
  – all Finns have a PIN
- Data from different registers can be linked by PIN, e.g. for research purposes
- Legislation that allows the use of confidential personal data for scientific research
- Trust in register keepers and researchers
- Comprehensive, well-documented registers
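The linking step can be pictured with a minimal Python sketch. It assumes each register is a mapping keyed by PIN; the attribute values and the third PIN are made up for illustration, and the first two PINs are the dummy examples used later in this presentation.

```python
# Toy registers keyed by PIN. All values are illustrative, not real data.
register_a = {
    "123456-111A": {"sex": "woman"},
    "234567-222C": {"sex": "man"},
}
register_b = {
    "123456-111A": {"age": 15},
    "345678-333E": {"age": 30},  # hypothetical PIN present in one register only
}


def link_by_pin(left, right):
    """Inner join: keep only the PINs that appear in both registers."""
    common = left.keys() & right.keys()
    return {pin: {**left[pin], **right[pin]} for pin in common}


linked = link_by_pin(register_a, register_b)
print(linked)
```

Only the person found in both registers survives the join, which is why a PIN shared by all registers is listed as a prerequisite.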
6
Legislative basis for research use of data from Statistics Finland
- Statistics Act (280/2004)
- In 2013 the Statistics Act was amended to better facilitate the use of data gathered at Statistics Finland for research purposes.
- New objective of the Act
  – To extend the use of the data collected for statistical purposes in scientific studies and statistical surveys on social conditions.
- Possibility for researchers to gain access to confidential data from which only the direct identifiers have been removed.
  – Before 2013 statistical authorities could not grant access to confidential data from which the statistical unit could be indirectly identified.
  – Gain access = see and analyse data through a remote access system
7
Remote access system (FIONA)
- In use at Statistics Finland since 2009; development project 2014-2015
- Model taken from Sweden, Denmark and the Netherlands
- Researchers use data on Statistics Finland's server from their own workplace via a secured Internet connection; the data remains at Statistics Finland (SF)
- Researchers use a Windows remote desktop and have access to the data they have obtained permission for, as well as to metadata
- Researchers have access to a wide range of statistical programs: STATA, SPSS, R, SAS, Python Anaconda, …
- Each research project has its own dedicated folders and storage space in the system
- Technical maintenance of the FIONA system was transferred to CSC - IT Center for Science in 2015
- The number of users and data sets in the remote access system is growing steadily; currently about 150 active users
8
Confidentiality
- Research data sets are stored on Statistics Finland's / CSC's servers
- Only mouse, keyboard and graphic signals are transferred
- Access to the system only from pre-approved IP addresses
- A one-time SMS password is sent each time the researcher logs in to FIONA
- All data transfers to and from FIONA are handled by personnel at the Researcher Services of SF
  – Outputs are checked so that direct or indirect identification is not possible, and files are saved for possible future reference
- Access to data is terminated when the permit for the project expires
- The FIONA environment is separated from the production network
- The system will be audited in fall 2015 after being transferred to CSC
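The slides do not say how outputs are checked. One common style of check is a minimum cell count on frequency tables, sketched below in Python. The threshold value and the function name `check_frequency_table` are assumptions made for illustration, not the rule actually applied by Researcher Services.

```python
from collections import Counter

MIN_CELL_COUNT = 3  # hypothetical threshold, not a rule stated in the slides


def check_frequency_table(records, keys, min_count=MIN_CELL_COUNT):
    """Count records per cell and separate releasable cells from flagged ones."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    releasable = {cell: n for cell, n in counts.items() if n >= min_count}
    flagged = {cell: n for cell, n in counts.items() if n < min_count}
    return releasable, flagged


# Illustrative output data: one cell contains a single person.
records = (
    [{"region": "A", "sex": "woman"}] * 5
    + [{"region": "A", "sex": "man"}] * 4
    + [{"region": "B", "sex": "woman"}] * 1
)
releasable, flagged = check_frequency_table(records, ["region", "sex"])
print("releasable:", releasable)
print("flagged for review:", flagged)
```

A count-based rule like this only addresses small cells; indirect identification through combinations of tables would still need human review, which is consistent with the transfers being handled by personnel.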
9
A typical process in applying for sensitive research data
- A researcher applies for a licence to access data for a research project
- The application must include a research plan and a pledge of secrecy
- The Ethics Committee is consulted in cases involving large datasets with confidential data
- If the data can be released, the licence is granted (possibly with modifications)
- A contract is signed specifying the dataset, the fee and the date of delivery
- The data is compiled, edited and uploaded to the remote access system
- The researcher uses a remote connection to analyse the data and sends the results to Researcher Services
- The results are checked to make sure that no units (persons, companies) can be identified
- The results are sent to the researcher and can be used in publications
10
Present process for obtaining register data for research
[Diagram. Actors: Researcher, Authority, Statistics Finland, Authority, Data Protection Authority; Internet]
Labels in the diagram:
- Searching for data sets and applying for permits from several different authorities, with varying practices
- Handling permit applications
- Control and specification
- Compiling data sets
- Delivering data using varying practices
- Possible corrections and re-sending
- Researcher responsible for data security and disposal of data sets
11
FMAS remote access system
[Diagram: services between the researcher and Organizations A, B, C, D and E]
- Public services: data catalogue; helpdesk for research and tuition
- Services that require registration: centralized digital permit application service
- Services that require a permit: remote desktop for analysing data (programs and tools); separated server space for data and metadata; output service for results; input service for researcher's data
- Interface service for data and metadata, pseudonymization
- Administration services for user rights
- Commonly agreed metadata standards; data warehouse; archive of multiple user files
12
Linking data from different sources
- Present method
  – Register keepers send the data requested by the researcher to Statistics Finland over a secure connection, by registered mail, by courier, etc.
  – The data includes the Finnish PIN or BIN (or a pseudocode created by the register keeper, with the key sent separately)
  – Statistics Finland creates a project-specific pseudocode, replaces the PIN (BIN) with it in the research data sets and uploads the data to the remote access system
- Aim
  – Pseudocodes should be used in all data deliveries
  – Register keepers should be able to upload their data directly to the remote access system using a standard pseudonymization method
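The present method, where one party holds the key and replaces the PIN in every data set, can be sketched in Python as follows. This is a minimal illustration assuming a random lookup table per project; the function names and the row layout are hypothetical, and the slides do not describe how Statistics Finland actually generates its pseudocodes.

```python
import secrets


def build_project_key(pins):
    """Map each distinct PIN to a fresh random pseudocode for one project."""
    key = {}
    for pin in set(pins):
        code = secrets.token_hex(8)
        while code in key.values():  # collisions are extremely unlikely
            code = secrets.token_hex(8)
        key[pin] = code
    return key


def pseudonymize(rows, key):
    """Return copies of the rows with the 'pin' field replaced by 'pseudocode'."""
    result = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != "pin"}
        new_row["pseudocode"] = key[row["pin"]]
        result.append(new_row)
    return result


# Two deliveries for the same project, using dummy PINs.
sex_rows = [{"pin": "123456-111A", "sex": "woman"}, {"pin": "234567-222C", "sex": "man"}]
age_rows = [{"pin": "123456-111A", "age": 15}, {"pin": "234567-222C", "age": 44}]

project_key = build_project_key(r["pin"] for r in sex_rows + age_rows)
sex_out = pseudonymize(sex_rows, project_key)
age_out = pseudonymize(age_rows, project_key)
print(sex_out)
print(age_out)
```

Because the same key is applied to every delivery, the data sets stay linkable inside the project while the PIN itself never reaches the remote access system. The key table is the sensitive artefact and has to be stored apart from the research data.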
13
Pseudonymization, project-specific
[Diagram: data from Statistics Finland and another register keeper is de-identified and combined in FIONA. Labels: Common 9843, Project 211, De-identification]
- Statistics Finland: 123456-111A, woman; 234567-222C, man -> nvaoepanwzl, woman; bleokldawgs, man
- Other register keeper: 123456-111A, age 15; 234567-222C, age 44 -> nvaoepanwzl, age 15; bleokldawgs, age 44
- FIONA, Project 211: nvaoepanwzl, age 15; bleokldawgs, age 44; nvaoepanwzl, woman; bleokldawgs, man
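For two register keepers to arrive at the same pseudocode independently, the code has to be derived deterministically from the PIN and a shared project secret. The Python sketch below uses HMAC-SHA256 for that purpose. The choice of HMAC, the key values and the function name `pseudocode` are assumptions for illustration; the slides do not name the method in use.

```python
import hashlib
import hmac


def pseudocode(pin: str, project_key: bytes, length: int = 12) -> str:
    """Derive a pseudocode from a PIN with a keyed hash (HMAC-SHA256).

    The digest is truncated for readability; a production system would
    have to weigh the collision risk of truncation.
    """
    digest = hmac.new(project_key, pin.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


# Hypothetical secrets, one per research project.
key_211 = b"hypothetical-secret-for-project-211"
key_305 = b"hypothetical-secret-for-project-305"

pin = "123456-111A"  # dummy PIN from the slide

code_keeper_a = pseudocode(pin, key_211)  # computed at the first register keeper
code_keeper_b = pseudocode(pin, key_211)  # computed at the other register keeper
code_other_project = pseudocode(pin, key_305)

print(code_keeper_a == code_keeper_b)       # same project: codes match
print(code_keeper_a == code_other_project)  # different project: codes differ
```

With a scheme of this kind the same person carries one pseudocode within a project but different pseudocodes across projects, so data sets from separate projects cannot be joined on the code.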
14
To be developed…
- We see a problem with the fixed pseudocodes of the 'ready-made' data files
- Solution 1: Create a project-specific pseudocode also for projects that use the 'ready-made' files
  – Problem: a copy of the 'ready-made' data sets has to be made for each project -> much more disc space is needed
- Solution 2: Send the seed code that has been used for the 'ready-made' files to the other register keepers
  – Problem: the PIN/BIN-to-pseudocode key used by Statistics Finland will be widely known
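The problem with Solution 2 can be made concrete with a short Python sketch. It assumes that pseudocodes are derived deterministically from the PIN and the seed, here with a keyed hash; that derivation, the seed value and the function names are assumptions, since the slides do not say how the seed is used.

```python
import hashlib
import hmac


def seeded_pseudocode(pin, seed):
    """Hypothetical derivation: keyed hash of the PIN under the shared seed."""
    return hmac.new(seed, pin.encode("utf-8"), hashlib.sha256).hexdigest()[:12]


def rebuild_key(pseudocodes, candidate_pins, seed):
    """Anyone holding the seed can recompute the PIN -> pseudocode key."""
    lookup = {seeded_pseudocode(pin, seed): pin for pin in candidate_pins}
    return {code: lookup.get(code) for code in pseudocodes}


seed = b"hypothetical-shared-seed"
known_pins = ["123456-111A", "234567-222C", "345678-333E"]  # dummy PINs

# Pseudocodes as they would appear in a 'ready-made' file.
file_codes = [seeded_pseudocode(p, seed) for p in known_pins[:2]]

recovered = rebuild_key(file_codes, known_pins, seed)
print(recovered)
```

Every organization that receives the seed can run the same reconstruction against its own list of PINs, which is the sense in which the key would become widely known.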