Published by Frank Crawford. Modified over 6 years ago.
1
Researcher Credentialing: A Proposed System for Improving Access to Restricted Data
Margaret Levenstein, with Linda Detterman, Peter Granda, Jared Lyle, John Marcotte, Amy Pienta, and Jukka Savolainen
RDA 2016, 16 September 2016
2
New sources, types, quantities of data
Geo-spatial; video; administrative data; transactions; clicks; sensors
Bottom line: re-identification is easier than ever
Use of all these new kinds of data is going to require restrictions on access
More and more traditional data now carry disclosure risk, and other types of data are emerging with disclosure risks of their own (e.g., from the quantity of data or from new methodologies), which increases the challenges (and opportunities) of dissemination…
3
Opportunities and Challenges
New types of data
New research opportunities
New confidentiality (and proprietary) challenges
New methods for protecting data
Barriers to data should be low
Access to confidential data is not “free” from a societal point of view, so it shouldn’t be from the researcher’s point of view either
Ideas: the proposed researcher credentialing system; the interaction of safe data, safe people, and safe places; the increased risk associated with data linkages and new types of data (e.g., geospatial); and other approaches, such as differential privacy and a privacy budget, for managing access to, and release of analyses of, confidential data. We could also talk about technological improvements for the VDE, such as two-factor authentication, that we are already making and that show how technology can increase protection while lowering the barriers to using confidential data. Remote job submission (differential privacy). Researcher credentialing (how do you enforce agreements?).
4
Sharing confidential data
Safe data: Modify the data to reduce the risk of re-identification
Safe places: Physical isolation and secure technologies
Safe people: Training and data use agreements; bonds
More and more data are coming to us with disclosure risks. The data are harder to evaluate, and they are much harder to anonymize. Naturally, the ideal system is for the data producer to reduce or remove disclosure risk upon data collection. But we also understand that many data producers do not identify or understand disclosure issues before submitting to an archive. We see all parties who touch data (producers, archives, and users) as responsible for the responsible handling of confidential data. Here, we focus on the archive’s role in processing and sharing confidential data. Our system for archiving and sharing confidential data involves a three-pronged approach: safe data, safe places, and safe people…
5
Interaction of safe data, places, people
Tradeoff between information content and confidentiality protection
Can always, and only, assure 100% confidentiality protection by preventing access
Make data safer through transformation, aggregation, suppression, noise infusion
“Less” safe data can be made safer through safe places and safe people
Technological change, broadly speaking, can move out the frontier of these tradeoffs
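As one illustration of making data safer through aggregation and suppression, the sketch below tabulates records by a category and withholds any cell whose count is too small to publish. This is a minimal sketch, not the presenters' method: the function name `suppressed_counts` and the default threshold of 5 are assumptions chosen for the example, and real disclosure review applies additional rules.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Mapping, Optional


def suppressed_counts(
    records: Iterable[Mapping[str, Hashable]],
    key: str,
    min_cell_size: int = 5,  # hypothetical threshold, not taken from the slides
) -> Dict[Hashable, Optional[int]]:
    """Aggregate records by `key` and suppress small cells.

    Cells with fewer than `min_cell_size` records are reported as None
    rather than as an exact count.
    """
    counts = Counter(record[key] for record in records)
    return {
        value: (count if count >= min_cell_size else None)
        for value, count in counts.items()
    }


if __name__ == "__main__":
    sample = [{"county": "A"}] * 7 + [{"county": "B"}] * 2
    print(suppressed_counts(sample, "county"))
```

The tradeoff named on the slide is visible here: raising `min_cell_size` protects more respondents but removes more information from the released table.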
6
Privacy Budget
Every time you make data, or analysis of data, available, it increases the risk of disclosure
Goal: maximize the social value of data without exceeding the ‘privacy budget’ that captures the acceptable level of risk
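The budget idea can be sketched with differential privacy, which the speaker notes mention alongside the privacy budget. The slides do not specify a mechanism, so everything below is an assumption for illustration: the class `PrivacyBudget`, the function `noisy_count`, basic sequential composition (epsilons simply add up), and the Laplace mechanism for a counting query with sensitivity 1.

```python
import random


class PrivacyBudget:
    """Tracks cumulative epsilon under basic sequential composition."""

    def __init__(self, total_epsilon: float) -> None:
        if total_epsilon <= 0:
            raise ValueError("total_epsilon must be positive")
        self.total_epsilon = total_epsilon
        self.spent = 0.0

    @property
    def remaining(self) -> float:
        return self.total_epsilon - self.spent

    def charge(self, epsilon: float) -> None:
        """Record a release costing `epsilon`, or refuse if it would overspend."""
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if self.spent + epsilon > self.total_epsilon + 1e-12:
            raise PermissionError("privacy budget exhausted")
        self.spent += epsilon


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count: int, epsilon: float,
                budget: PrivacyBudget, rng: random.Random) -> float:
    """Release a count with Laplace noise, charging the budget first."""
    budget.charge(epsilon)
    # A counting query has sensitivity 1, so the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    budget = PrivacyBudget(total_epsilon=1.0)
    rng = random.Random(0)
    print(noisy_count(100, 0.4, budget, rng))
    print(noisy_count(100, 0.4, budget, rng))
    print("remaining:", budget.remaining)
```

With a total budget of 1.0, two releases at epsilon 0.4 succeed and a third is refused with `PermissionError`. Smaller epsilons stretch the budget across more releases at the price of noisier answers.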
7
Researcher credentialing
Goal: establish a community-recognized system of researcher qualities: training; institutional affiliation and position; prior experience with restricted data; background checks
Reduce cost to researcher of obtaining access
Reduce cost to data providers of vetting and training researchers
8
Researcher credentialing
Build on prior efforts: UK Data / Administrative Data Research Service; DwB’s Circle of Trust
Effectively, the US Census Bureau is establishing Special Sworn Status, with Title 13 and Title 26 training, as the credential for accessing restricted data in the US Federal Statistical System
Advantage: (eventually) will be able to link data sets from different agencies
Disadvantage: very high barrier
9
Credentialing levels
Platinum credentials: trusted, trained researchers; Special Sworn Status with FBI background check and multiple annual trainings
Tin credentials: undergraduate doing summer independent study from home
10
Credentialing levels
Goal: researcher criteria matrix
Associated with established IDs
Maintained and vetted by software that controls access, or points to methods for applying for access, given the researcher’s credentials
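A criteria matrix of this kind could be sketched as follows. This is only an illustration under stated assumptions: the slides name the Platinum and Tin levels and list training, affiliation, prior experience, and background checks as criteria, but the intermediate `GOLD` level, the rules mapping criteria to levels, the `Researcher` fields, and the function names are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class Credential(IntEnum):
    TIN = 1
    GOLD = 2       # hypothetical intermediate level, not named in the slides
    PLATINUM = 3


@dataclass(frozen=True)
class Researcher:
    researcher_id: str  # an established identifier (which ID system is an assumption)
    trained: bool
    institutional_affiliation: bool
    prior_restricted_data_experience: bool
    background_check: bool


def credential_level(r: Researcher) -> Credential:
    """Map a researcher's criteria to a level (illustrative rules only)."""
    if (r.trained and r.institutional_affiliation
            and r.prior_restricted_data_experience and r.background_check):
        return Credential.PLATINUM
    if r.trained and r.institutional_affiliation:
        return Credential.GOLD
    return Credential.TIN


def access_decision(r: Researcher, required: Credential) -> str:
    """Grant access, or point to what the researcher must apply for."""
    level = credential_level(r)
    if level >= required:
        return "grant"
    return f"apply: {required.name} credential required, researcher holds {level.name}"


if __name__ == "__main__":
    student = Researcher("id-0002", False, False, False, False)
    print(access_decision(student, Credential.PLATINUM))
    print(access_decision(student, Credential.TIN))
```

Ordering the levels as integers lets one data set declare a single required level and keeps the decision to one comparison; a production system would more likely evaluate each criterion separately against each data set's requirements.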
11
Researcher reputations
Credentialing allows researchers to establish a reputation for responsible data stewardship
Credentialed researchers offer their reputations as a bond
The existence of a credential and a reputation gives researchers something they can lose
This makes it possible for them to access data that would be out of reach without a credible bond
12
Credentialing can increase data sharing
Assurance to data providers
Reduce cost of providing confidentiality protection
Existence of community norms creates a mechanism for data producers to share data. Rather than “I can’t share the data,” instead “I’m willing to share data with people with such and such a credential.”
Data management plans can protect confidentiality by relying on established community norms
Feasible to access data from multiple custodians, even when data are kept separate
13
Key Points
Community norms for credentialing will facilitate data access and data sharing
Creates mechanism for durable researcher reputation
Importance of multiple levels of credentialing
Confidentiality protection requires examination of the interaction of: data sensitivity and re-identification risk; researcher credentials
14
Questions and Suggestions?