Biolink NL: A national infrastructure for linkage of biobanks to medical and socioeconomic registries. Adelaide Ariel, SHIP Conference, 28th-30th August 2013.



2 The Dutch Biolink Project (Biolink NL)
Main goals:
- To improve the efficiency and quality of linkage of biobanks to medical and socioeconomic registries, in conformity with statutory and consent obligations to participants
- To set up a national infrastructure to enable these linkages
The Biolink Project is a collaboration of Dutch universities, University Medical Centers, Statistics Netherlands, and health care institutions.

3 Linking Challenges in the Biolink NL
- Unique identifier is lacking
  - Linking would be performed on personal identifiers
- Privacy concerns
  - Surname might not be allowed for use
  - Personal identifiers have to be encrypted
- Both availability and quality of the personal identifiers may vary across registries
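The slides do not say how the identifiers are encrypted. As a minimal sketch of one common approach, the example below applies a keyed hash (HMAC-SHA256) to normalised identifiers so that two parties can compare records exactly without exchanging plaintext. The field names, the key, and the sample rows are all hypothetical.

```python
import hashlib
import hmac


def normalise(value: str) -> str:
    """Lower-case and drop whitespace so trivially different spellings hash alike."""
    return "".join(value.lower().split())


def pseudonymise(record: dict, fields: list, key: bytes) -> str:
    """Return a keyed SHA-256 digest of the chosen identifier fields."""
    message = "|".join(normalise(str(record[f])) for f in fields)
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key and records, for illustration only.
SECRET_KEY = b"shared-secret-held-by-a-trusted-party"
FIELDS = ["date_of_birth", "sex", "postal_code"]  # surname omitted: it may not be allowed

biobank_row = {"date_of_birth": "1975-03-14", "sex": "F", "postal_code": "1234 AB"}
registry_row = {"date_of_birth": "1975-03-14", "sex": "f", "postal_code": "1234ab"}
moved_row = {"date_of_birth": "1975-03-14", "sex": "F", "postal_code": "5678 CD"}

digest_biobank = pseudonymise(biobank_row, FIELDS, SECRET_KEY)
digest_registry = pseudonymise(registry_row, FIELDS, SECRET_KEY)
digest_moved = pseudonymise(moved_row, FIELDS, SECRET_KEY)

same_person = digest_biobank == digest_registry   # formatting differences are normalised away
still_matches = digest_biobank == digest_moved    # a real change in an identifier breaks the match
```

A hash of this kind only supports exact comparison: any genuine difference in an identifier, such as a changed postal code, produces an unrelated digest. That is one reason the quality of the identifiers matters so much once they are encrypted.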

4 Linking Approaches in the Biolink NL
- Personal identifiers as linking variables: surname, date of birth, sex, postal code
- Take into consideration: surname might not be allowed for use
- Research questions:
  - Which personal identifier is a 'must'?
  - In which situation would a deterministic or a probabilistic method perform best?

5 Project Approach
Phases: Development, Evaluation, Testing
- Conduct a literature survey on record linkage methodology and applications
- Develop a prototype for the linkage strategy by using simulated data
- Test the linkage strategy on real data
- Evaluate the linking results by means of:
  - another identifier (encrypted Dutch-ID)
  - a content variable (content validation)

6 Current Presentation
Phases: Development, Evaluation, Testing
Develop a prototype for the linkage strategy by using simulated data. Real data were used as blueprints for simulated data.
Overview:
- Our motivations
- Factors considered in the simulation
- Findings
- Prototype for the linkage strategy

7 Using Simulated Data
Our motivations:
- We want to experiment with different approaches, without violating privacy concerns. The simulated data sets are modelled after the real data sets.
- We want to include "what-if" scenarios:
  - What if not all identifiers are available for linking?
  - What if the amount of shared records is small?
  - What if the error rate is high?

8 Factors Considered for the Simulation
The linkages in the Biolink NL deal with registries of varying size and population covered.
[Figure: Pathology Data, Cancer Registry, General Population Registry, Female Cohort, Children Cohort]

9 Factors Considered for the Simulation
The amount of shared records (overlap) may vary.
[Figure: Cancer Registry and General Population Registry (large overlap); Cancer Registry and Female Cohort (small overlap)]

10 Factors Considered for the Simulation
Personal identifiers are not 100% accurate or consistent, for instance due to:
- Typing errors
- Changing address
- Using different surnames (married vs maiden name)
We vary the error rate up to 30%.
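The simulation factors above (file size, overlap, and error rate) can be sketched as a small generator. This is not the project's simulation code, which was modelled on real data sets. It is a minimal illustration that assumes overlap is expressed as a fraction of the smaller file and that an error is either a one-letter surname typo or a changed postal code. All function and field names are hypothetical.

```python
import random
import string


def random_postal_code(rng):
    """Dutch-style postal code: four digits followed by two capital letters."""
    return f"{rng.randint(1000, 9999)}{''.join(rng.choices(string.ascii_uppercase, k=2))}"


def make_person(rng):
    """Generate one synthetic person with the four identifiers named in the slides."""
    return {
        "surname": "".join(rng.choices(string.ascii_lowercase, k=rng.randint(4, 9))),
        "date_of_birth": f"{rng.randint(1930, 2010)}-{rng.randint(1, 12):02d}-{rng.randint(1, 28):02d}",
        "sex": rng.choice("MF"),
        "postal_code": random_postal_code(rng),
    }


def corrupt(person, rng):
    """Return a copy with one identifier changed: a surname typo or a new address."""
    noisy = dict(person)
    if rng.random() < 0.5:
        name = noisy["surname"]
        pos = rng.randrange(len(name))
        letter = rng.choice([c for c in string.ascii_lowercase if c != name[pos]])
        noisy["surname"] = name[:pos] + letter + name[pos + 1:]
    else:
        new_code = random_postal_code(rng)
        while new_code == person["postal_code"]:
            new_code = random_postal_code(rng)
        noisy["postal_code"] = new_code
    return noisy


def simulate(size_a, size_b, overlap, error_rate, seed=0):
    """Build two files sharing round(overlap * smaller size) people; shared rows in file B may be corrupted."""
    rng = random.Random(seed)
    n_shared = round(overlap * min(size_a, size_b))
    shared = [make_person(rng) for _ in range(n_shared)]
    file_a = shared + [make_person(rng) for _ in range(size_a - n_shared)]
    file_b = [corrupt(p, rng) if rng.random() < error_rate else dict(p) for p in shared]
    file_b += [make_person(rng) for _ in range(size_b - n_shared)]
    return file_a, file_b, n_shared


# Example scenario: 60% overlap and the 30% error ceiling mentioned in the slides.
file_a, file_b, n_shared = simulate(size_a=1000, size_b=500, overlap=0.6, error_rate=0.3, seed=1)
n_corrupted = sum(a != b for a, b in zip(file_a[:n_shared], file_b[:n_shared]))
```

Because the shared people sit at the start of both files, the true match status is known, so any linkage method run on `file_a` and `file_b` can be scored against it.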

11 Linking Methods
Preferably practical and applicable for encrypted identifiers.
- Deterministic linkage method
  - Partial matching
- Probabilistic linkage method
  - Simple probabilistic
  - Jaro-Winkler
  - Bigram
Implemented in SAS 9.2 and RecordLinkage (R package).
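To illustrate how the string comparators named above differ from exact matching, here is a minimal Python sketch of Jaro-Winkler and a bigram similarity. It is not the SAS or RecordLinkage code used in the project. The bigram score is assumed to be the Dice coefficient over character bigrams, which the slides do not specify, and both comparators work on plaintext strings, so how they were combined with encrypted identifiers is not shown here.

```python
from collections import Counter


def jaro(s1, s2):
    """Jaro similarity: shared characters within a sliding window, penalised for transpositions."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1 + matches / len2 + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro score boosted for a common prefix of up to four characters."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1.0 - score)


def bigram_similarity(s1, s2):
    """Dice coefficient over character bigrams (an assumed definition)."""
    b1 = Counter(s1[i:i + 2] for i in range(len(s1) - 1))
    b2 = Counter(s2[i:i + 2] for i in range(len(s2) - 1))
    total = sum(b1.values()) + sum(b2.values())
    if total == 0:
        return 1.0 if s1 == s2 else 0.0
    return 2 * sum((b1 & b2).values()) / total


# Hypothetical surname pair with a small spelling difference.
a, b = "jansen", "janssen"
exact_match = a == b                      # deterministic comparison rejects the pair
jw_score = jaro_winkler(a, b)             # close to 1 for near-identical names
bigram_score = bigram_similarity(a, b)    # 5 shared bigrams out of 5 + 6
```

A deterministic rule treats the two spellings as different people, whereas either comparator gives a high score that a probabilistic method can weigh against agreement on the other identifiers.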

12 Simulation Findings (1) The identifier date of birth should be included.

13 Simulation Findings (2) Used together, the deterministic and probabilistic methods can help detect the possible overlap size.

14 Simulation Findings (3)
The deterministic method appears to be more suitable for:
- Small overlap size (< 60%)
The probabilistic method appears to perform best when the following conditions are met:
- Large overlap size (more than 60%)
- All identifiers are taken as linkage variables

15 Linking Strategy
[Decision flowchart. Questions asked: Less than 20,000 records? Include surname? Possible overlap size < 50%? Each path (Yes/No) ends in either the deterministic or the probabilistic method.]

16 Next Steps
The following linkages will be chosen for testing and evaluation:
- A Dutch female cohort linked to the Dutch Cancer Registry
- Dutch twin-children cohort linked to the Health Insurance Database
- Dutch children cohort linked to the Dutch National Pharmacy Database