Cancer in Young People in Canada: Broadening the research and surveillance potential by linking to other administrative databases. J. Kaur1, M. Frechette1,

Slides:



Advertisements
Similar presentations
Statistics Canada Statistique Canada mai 2005 / 1.
Advertisements

The ONS Longitudinal Study. © London School of Hygiene and Tropical Medicine The Office for National Statistics Longitudinal Study (LS) o What is it o.
2014 Survey on Living with Chronic Diseases in Canada (SLCDC): Mood & Anxiety Disorders National Mental Health and Addictions Information Collaborative.
A Perspective on Canadian Initiatives in Health Care Quality HL7 Clinical Quality Work Group June 26,
Presented to the Child Welfare Council Data Linkages Committee 3/6/2013 CHILDREN’S DATA NETWORK : THE PROPOSED DEVELOPMENT OF A STATEWIDE INTEGRATED DATA.
Institute of Cancer Research - Institut du cancer ICR’s Activities in Cancer Imaging.
Becoming Canadian Citizens: Intent, process and outcome Kelly Tran, Tina Chui: Statistics Canada Stan Kustec, Martha Justus: Citizenship and Immigration.
Record matching for census purposes in the Netherlands Eric Schulte Nordholt Senior researcher and project leader of the Census Statistics Netherlands.
Chuck Humphrey Data Library Co-ordinator University of Alberta May 16, Capitalising on Metadata Tool development plans IASSIST 2007.
ESDS resources for managing data Jack Kneeshaw Economic and Social Data Service University of Essex, 27 January 2009.
Components of HIV/AIDS Case Surveillance: Case Report Forms and Sources.
Locating EPIC/PHSI in a Sea of Acronyms
The Census of Canada and Immigration & Ethno-cultural Data Chuck Humphrey University of Alberta February 10, 2006.
The Research Data Centre Program Microdata Access Division Heather Hobson April 23, 2009.
Data Integration Efforts and Challenges Scott M. Bilder, Ph.D. Institute for Health, Health Care Policy, and Aging Research Rutgers, The State University.
Information network on rare cancers Work Package 4 Information on epidemiology of rare cancers Riccardo Capocaccia and Sandra Mallone Cancer Epidemiology.
CIHC is a 2-year initiative funded by Health Canada Interprofessional Education and Collaborative Practice Request for a Special CIHR Competition.
Chapter 7: Indexes, Registers, and Health Data Collection
Gateway to the Future: Improving the National Vital Statistics System St. Louis, MO June 6 th – June 10 th, 2010 Abortion Surveillance in the United States:
Opportunities: Administrative Data, Data Linkage, New Initiatives, Expanding access Health, Justice, Special Surveys Branch October 2015.
Using administrative data to produce official social statistics New Zealand’s experience.
Overview of National Center for Health Statistics (NCHS) Data Systems Mary Burgess
Health data in Ontario Susan Bondy, U. of Toronto Dalla Lana School of Public Health Presented at: Health Over the Life Course, Pre-conference Workshop.
The Cancer Registry of Norway Jan F Nygård Head of the IT-department.
Linking with Birth Certificate Data to Improve Patient Follow-up in Central Cancer Registries Daixin Yin, Janet Bates, Mark Allen, Lilia O’Conner California.
Marc Hamel and Julie Trépanier May 21, 2014 Canadian Statistical Demographic Database: A research project.
1 Overview of presentation 1.Context 2.Objectives 3.Methods 4.What has been achieved 5.What has to be done NCSI-CYP – Risk Stratification Investigation.
The Research Data Centre Program October 14, 2015 DLI- EAC Donna Dosman.
Institute for Clinical Evaluative Sciences HEALTH ANALYTICS FOR INFORMED DECISION MAKING: HEALTH SYSTEM USE SUMMIT Concurrent Track #2, February 11, 2016.
Working More Productively: Tools for Information-Rich Environments Ruth-Ann Soodeen, Leslie L. Roos, Ruth Bond, Charles Burchill & Karen Roberts Manitoba.
THE CREATION OF A CANADIAN LIMITED USE DATA FILE June 2009 North American Association of Central Cancer Registries San Diego, CA Les Mery Program Director,
Malawi National Cancer Registry update SJD Chasimpha Malawi Cancer Symposium Lilongwe August, 2016.
A FRUIT AND VEGETABLE PRESCRIPTION PROGRAM
The Difference Filing Taxes Can Make
Rory M Marks, John Z Ayanian, Brahmajee K Nallamothu
CPHA – Collaborator Session Measuring Health Inequality in Canada
Discharge Abstract Database (DAD) Product Review
Vital Statistics Institute for Implementation Science In population health at cuNY CUNY Graduate School of Public Health and Health Policy June 2016 Disclaimer:
DAD Research Analytic Files
Navigating Your Way Through the EFT, Nesstar and Beyond 20/20 (WDS)
Challenges in data linkage: error and bias
CHS Community Health Survey
Public Health Data Sharing: A New Opportunity for UA Faculty & Staff
The Research Data Centre Program
NYSDOH AIDS Institute Quality of Care Program eHIVQUAL
Research Data Centre DLI Workshop (December, 2001)
Improving Data, Improving Outcomes Conference Aug , 2016
Illustrating HIV/AIDS in the United States
Patient Medical Records
What is Administrative Data?
Agenda Welcome and Introductions Purpose of Investment
Presentation 2b 2018 Census Products & Services Engagement.
Semiannual Report, March 2015
Civil Registration Process: Place, Time, Cost, Late Registration
Part F-I The Economic Theory of Crime and Punishment
Brisbane Accord Group SESSION 15. APPLICATION AND UTILIZATION OF CIVIL REGISTRATION AND VITAL STATISTICS INFORMATION Civil Registration Process: Place,
SCHS and Health Statistics
Capacity building on the use of Geospatial Data and Technologies
Cindy Murray NP Princess Margaret Cancer Centre
Capitalising on Metadata
Andrew Jenkins and Rosalind Levačić
Megan Eguchi, MPh Sana karam, md, phd
NAACCR /IACR Combined Annual Conference
Berrien County FACT SHEET
3.7.1 Definition The percentage of Ontario men and women of screen-eligible age (ages 50-74), who have completed at least one FOBT in the prior two years.
Costs of Operating Population-Based Cancer Registries: Results from Four Sub-Saharan African Countries Florence Tangka, PhD Senior Health Economist, Division.
Investigating the effects of socioeconomic factors on cancer treatment patterns and outcomes in Canada using individual-level linked national data Presenters:
Adriene Harding, Christine Laporte
State Consumer Health Information and Policy Advisory Council Meeting
REACHnet: Research Action for Health Network
Presentation transcript:

Cancer in Young People in Canada: Broadening the research and surveillance potential by linking to other administrative databases. J. Kaur1, M. Frechette1, L. Xie 1, J. Onysko1 S. Bryan2, Y. Decady2, R. Barber3, L. Sung4 1 Public Health Agency of Canada; 2 Statistics Canada; 3 C17 Council; 4 Division of Haematology/Oncology, The Hospital for Sick Children, Toronto

Outline Overview of the Cancer in Young People in Canada (CYP-C) database Record linkage at Statistics Canada Linkage plans: CYP-C to other administrative data Access to linked data for researchers Research potential of the linked data

What is CYP-C? An enhanced cancer registry and research database focused on children aged <15 years AND diagnosed or treated in a Canadian pediatric cancer centre CYP-C database captures: Demographics (sex, date of birth, postal code, ethnicity) Diagnostic details (cancer type and stage, risk category) Treatment plans / details (chemo, surgery, transplant, trial participation) Dates (time intervals between events) Short/medium-term outcomes (5-year follow-up: relapse, complications, and survival) Longer-term health outcomes through data linkage Longer follow-up - survival, late effects, causes of death Canadian Vital Statistics: death database CIHI hospitalization data

CYP-C – How it works Data are collected from the 17 pediatric oncology centres across Canada C17 Ontario data provided by POGONIS are integrated into CYP-C. Data abstraction is obtained through chart review from each pediatric oncology centre by a Clinical Research Associate (CRA) at each centre. Detailed information on eligible cases is collected from diagnosis through to five years post diagnosis. Contains over 16,595 cases (2001 - 2015)

Obtaining CYP-C data Application process: Go online: www.c17.ca Complete a Research Proposal Template Data Elements Checklist Chemotherapy Checklist Data requests can be for aggregate or patient level data Submit your application Reviews by CYP-C Management Committee & PHAC Privacy (6-8 weeks) CYP-C Research Champion Webinars Website lists the data elements and chemotherapeutic agents used

Record Linkage at Statistics Canada The Social Data Linkage Environment (SDLE) at Statistics Canada facilitates the creation of linked administrative and survey data files for social analysis. The SDLE is designed to bring person-level data together in one virtual linkage environment. These data may come from either administrative sources or surveys (large circle). Inside this large circle are seven interconnected circles which display the range of subjects possible in the SDLE. In addition, the colors show the approximate proportion of administrative (blue) and survey (green) data available at Statistics Canada. These circles point towards the image of a person in the middle, which represents the person-oriented nature of SDLE linkages and the ability to link individuals across multiple sources. Finally, data sources exit the environment on the right side as linked outputs. Data files are linked together through linkage keys that are stored in the Derived Record Depository (DRD) and the Key Registry. The DRD is essentially a registry of Canadians containing only basic personal identifiers. The DRD is built using core administrative data files (vital statistics, tax and immigration files) and each individual is assigned a unique SDLE identifier. We have done this because we do not have a registry of the population as you might find in provinces through health card files or other administrative files. Every file linked in the SDLE is linked to the DRD. The Key Registry stores only the SDLE identifiers and associated record identifiers from all survey and administrative data files linked to the DRD.  Once a study requiring data linkage has been approved and the linkage has been conducted, the Key Registry identifiers are used to find the individual records in the source data files (analysis variables without personal identifiers). Selected records and variables from these sources can then be integrated into a linked analysis file. Identifiers are dropped and an analytical file is created Access to the deidentified data occurs in a secure centre

Planned linkages Data files to be linked to CYP-C Cancer incidence information from the Canadian Cancer Registry (CCR) Death information (CVSD) Hospitalization data (DAD, NACRS, OMHRS) Birth information (CVSB, CVSBD) Income-personal and family (T1FF)

Overview of the Linkage Process Personal identifiers from all datasets are linked into the SDLE Each person level record is assigned a unique ID Unique IDs are matched between datasets of interest to create keys that identify a person across multiple data sources Keys are used to create analytical datasets Cohort file (CYP-C) analytical information is merged to analytical information from other datasets, no personal identifiers E.g., CYP-C key is merged to CYP-C analytical variables and then merged with other administrative datasets such as the Canadian Cancer Registry and death information Deterministic record linkage Matching exactly on a unique key composed of one or more variables The unique key should be highly discriminatory for any person Probabilistic record linkage The likelihood that records being linked refer to the same person is estimated using the generalized record linkage system (G-LINK) A set of linkage variables selected based on their collective discriminating power and their availability on both the input data file and the DRD A conservative linkage method is applied so as to attain the highest quality linkage by minimizing false positive links at the expense of generating false negative links Overview of the linkage process Source index files are linked to the DRD Key registry is populated with the paired SDLE IDs and source index file record IDs Linked keys files are extracted from the Key Registry Assign randomly generated unique IDs to the records in the linked keys files Linked datasets are produced by merging linked keys files to the source data files Validation of linked datasets Bias assessment of linked and not-linked records (e.g., distributions by key variables such as sex, geography, vital status, etc.) Linkage rates Access through the secure the RDC network and having all output vetted prior to release outside of the RDCs: The Research Data Centres RDCs are part of an initiative by Statistics Canada, the Social Sciences and Humanities Research Council (SSHRC) and university consortia to help strengthen Canada's social research capacity and to support the policy research community. The program would like to acknowledge the generous support of the Canada Foundation for Innovation (CFI) and the Canadian Institutes of Health Research (CIHR). RDCs provide researchers with access, in a secure university setting, to microdata from population and household surveys, administrative data holdings and linked data. The centres are staffed by Statistics Canada employees. They are operated under the provisions of the Statistics Act in accordance with all the confidentiality rules and are accessible only to researchers with approved projects who have been sworn in under the Statistics Act as 'deemed employees.' RDCs are located throughout the country, so researchers do not need to travel to Ottawa to access Statistics Canada microdata.

Validation of linked dataset Validation of the linkage of each dataset into the SDLE Bias assessment of linked and not-linked records Linkage rates by various factors Validation of the linked analytical file Fitness for use—do the data make sense in the world? Comparison of like concepts across all linked datasets (e.g., birthdate, sex, date of death) Validation report with data limitations Preparation of user documentation User guide Vetting rules Data access through the Research Data Centres network. Deterministic record linkage Matching exactly on a unique key composed of one or more variables The unique key should be highly discriminatory for any person Probabilistic record linkage The likelihood that records being linked refer to the same person is estimated using the generalized record linkage system (G-LINK) A set of linkage variables selected based on their collective discriminating power and their availability on both the input data file and the DRD A conservative linkage method is applied so as to attain the highest quality linkage by minimizing false positive links at the expense of generating false negative links Overview of the linkage process Source index files are linked to the DRD Key registry is populated with the paired SDLE IDs and source index file record IDs Linked keys files are extracted from the Key Registry Assign randomly generated unique IDs to the records in the linked keys files Linked datasets are produced by merging linked keys files to the source data files Validation of linked datasets Bias assessment of linked and not-linked records (e.g., distributions by key variables such as sex, geography, vital status, etc.) Linkage rates Access through the secure the RDC network and having all output vetted prior to release outside of the RDCs: The Research Data Centres RDCs are part of an initiative by Statistics Canada, the Social Sciences and Humanities Research Council (SSHRC) and university consortia to help strengthen Canada's social research capacity and to support the policy research community. The program would like to acknowledge the generous support of the Canada Foundation for Innovation (CFI) and the Canadian Institutes of Health Research (CIHR). RDCs provide researchers with access, in a secure university setting, to microdata from population and household surveys, administrative data holdings and linked data. The centres are staffed by Statistics Canada employees. They are operated under the provisions of the Statistics Act in accordance with all the confidentiality rules and are accessible only to researchers with approved projects who have been sworn in under the Statistics Act as 'deemed employees.' RDCs are located throughout the country, so researchers do not need to travel to Ottawa to access Statistics Canada microdata.

Access to linked data 30 Research Data Centres across Canada Researchers at different RDCs can collaborate on joint projects Centre Branch / Antenne Yellowknife Prince George St. John’s Edmonton Burnaby Saskatoon Vancouver Calgary Moncton Victoria Winnipeg Québec Lethbridge Montréal Fredericton Halifax North Bay Ottawa Sherbrooke Guelph Kingston Waterloo Toronto Hamilton Windsor London

How to access the data Research Data Centres www.statcan.gc.ca/eng/rdc/process On average, researchers can be granted access to data already in the RDC within 2 months. No fees for student research projects. Generally no fees for academic researchers from participating institutions.

Linked data: analytical plans Phase 1. Assessing completeness and death clearance in CYP-C CYP-C – CCR case completeness and concordance CYP-C – CVS:D death completeness and concordance Phase 2. 5,10,15-year survival of childhood cancer patients 5,10,15-year risk of recurrence and subsequent malignancy 5,10,15-year survivor hospitalization rates overall and associated with late effects: Preventable chronic conditions Mental disorders Other Hospitalization for other conditions Pregnancy/birth Infection Emergency visits Economic impact of childhood cancer diagnoses: on patients and families

Thank you and thanks to the: CYP-C Partners Group CYP-C Management Committee CYP-C Steering Committee PHAC & Statcan teams Contact: jay.Onysko@canada.ca / shirley.bryan@canada.ca