CHEPA & Health Policy PhD Program

The Research Data Centre (RDC) Program
www.statcan.gc.ca
Telling Canada’s story in numbers
Mustafa Ornek, rdc2@mcmaster.ca
Vivek Jadon, vivek@mcmaster.ca
CHEPA & Health Policy PhD Program
March 15, 2016

What is an RDC?
An RDC is a secure Statistics Canada (StatsCan) research laboratory, physically located on a university campus, where the research community across Canada can analyze statistical microdata extensively without compromising the confidentiality of the data.

Master File versus PUMFs
Compared to PUMFs, a master file:
  contains the full sample of respondents, not a subset of them
  includes additional categories in variables, giving more detailed information
  allows research at lower levels of geography
  provides discrete values for certain variables, such as age or body weight, instead of categories
  may offer other concepts that are not available in PUMFs
Moreover, master files contain derived variables and bootstrap weights.
RDC access is useful when a PUMF does not exist or does not provide an adequate level of detail for quality research, or when longitudinal data linkage is required.
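To illustrate why bootstrap weights matter, here is a minimal sketch of the usual replicate-weight calculation: the estimate is recomputed once per set of bootstrap weights, and the spread of those replicate estimates around the full-sample estimate gives a standard error. The data, the variable names, and the use of only three replicates are assumptions made for illustration. The documentation for a given survey may specify a different number of replicates or an adjustment factor.

```python
from math import sqrt


def weighted_mean(values, weights):
    """Survey-weighted mean of a variable."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def bootstrap_se(values, main_weights, replicate_weights):
    """Return (estimate, standard error) using bootstrap replicate weights.

    The variance is the average squared deviation of the replicate
    estimates from the full-sample estimate.
    """
    full = weighted_mean(values, main_weights)
    reps = [weighted_mean(values, w) for w in replicate_weights]
    variance = sum((r - full) ** 2 for r in reps) / len(reps)
    return full, sqrt(variance)


# Toy data (hypothetical): four respondents, one main weight each,
# and three sets of bootstrap weights.
bmi = [22.0, 27.5, 31.0, 24.5]
main_w = [100.0, 150.0, 120.0, 130.0]
rep_w = [
    [110.0, 140.0, 125.0, 125.0],
    [90.0, 160.0, 115.0, 135.0],
    [105.0, 150.0, 110.0, 135.0],
]

estimate, se = bootstrap_se(bmi, main_w, rep_w)
print(f"weighted mean = {estimate:.2f}, bootstrap SE = {se:.3f}")
```

The same pattern applies to any statistic: swap `weighted_mean` for the estimator of interest and keep the replicate loop unchanged.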

Continuum of data access
Statistical Products, Custom Requests, Microdata
www.statcan.gc.ca; Daily releases; CANSIM tables; Analytic articles; Aggregate tables; Data tabulations; PUMFs (DLI); Real Time Remote Access; Remote job submissions; Research Data Centres; Federal RDC
@ McMaster University: Mills Library, 1st floor, for public use microdata files (PUMFs)
@ Statistics Canada RDC: Mills Library, 2nd floor, for confidential microdata master files

Microdata Access for Researchers: Public Use Microdata Files (PUMF)
Each Public Use Microdata File is based on a corresponding master data file. The modifications performed by Statistics Canada before the PUMF is released ensure that the risk of breaching confidentiality has been removed. Since the results of any analysis performed do not have to be scrutinized before they are released, the file is considered “Public”.
Modifications made to master files to convert them to PUMFs may include:
  collapsing the categories of a variable (e.g., age groups instead of individual years of age)
  collapsing several variables into one variable (e.g., multiple language questions collapsed into one language variable for analysis)
  suppressing variables (although the variable is part of the master file, it will not show up in the public file)
  removing outliers (removing cases that are extremes; often used with income)
Because these techniques anonymise the files, combining variables will not allow the user to identify a respondent.
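The kinds of modification listed above can be sketched on a toy file. This is only an illustration of the ideas (collapsing age into groups, suppressing a variable, removing extreme cases). It is not Statistics Canada's actual procedure, and the records, field names, age bands and income cut-off are all hypothetical.

```python
def age_group(age):
    """Collapse a single year of age into a 10-year band, e.g. 37 -> '30-39'."""
    if age >= 80:
        return "80+"
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def make_public_record(record, suppressed, income_cap):
    """Return a public-use version of one master record, or None if dropped."""
    # Removing outliers: drop cases with extreme income.
    if record["income"] > income_cap:
        return None
    # Suppressing variables: leave out fields that must not be released.
    public = {
        key: value
        for key, value in record.items()
        if key not in suppressed and key != "age"
    }
    # Collapsing: release an age group instead of the exact age.
    public["age_group"] = age_group(record["age"])
    return public


# Hypothetical master file with three respondents.
master = [
    {"id": 1, "age": 37, "postal_code": "X1X 1X1", "income": 52000},
    {"id": 2, "age": 84, "postal_code": "Y2Y 2Y2", "income": 31000},
    {"id": 3, "age": 45, "postal_code": "Z3Z 3Z3", "income": 2500000},
]

public_file = []
for rec in master:
    pub = make_public_record(rec, suppressed={"postal_code"}, income_cap=500000)
    if pub is not None:
        public_file.append(pub)

for row in public_file:
    print(row)
```

In this toy run the third respondent is removed as an income outlier, and the remaining records carry an age group in place of the exact age and no postal code.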

Microdata Access for Researchers: Public Use Microdata Files (PUMF)
Benefits:
  Free
  Very few conditions to access and use the data
  No approval process to access the data
Limitations:
  Content is limited (screened and grouped for confidentiality)
  Not all surveys have a PUMF
  Almost all PUMFs are cross-sectional, i.e., represent data collected at one point in time

Microdata Access for Researchers
Programs related to accessing PUMFs:
  Data Liberation Initiative (DLI)
Microdata Access Division programs related to accessing microdata:
  Real Time Remote Access (RTRA)
  Research Data Centres (RDC)
  Federal Research Data Centre (FRDC)
  Canadian Centre for Data Development and Economic Research (CDER)

Data Liberation Initiative (DLI)
DLI provides access to Statistics Canada standard products, databases, and 350 public-use microdata and geographic information files.
DLI metadata are coded for searchability.
DLI members have support through a very active listserv.
There are currently 77 subscribing institutions; McMaster University Library is part of the Statistics Canada DLI program.

DLI PUMF Collection
The DLI offers access to all public data products, such as:
  the public use files for over 200 surveys, including the Canadian Community Health Survey, the National Population Health Survey, the General Social Survey, the Labour Force Survey and the Census of Population
  databases such as the Small Area Business and Labour Database and Inter-Corporate Ownership
  an enhanced line of Census products
  aggregated data on subjects such as Justice and Education
  all standard geographic files and databases

Health-related DLI PUMFs
  National Population Health Survey (NPHS)
  Canadian Community Health Survey (CCHS)
  Canada Health Survey (CHS)
  General Social Survey (GSS)
  Joint Canada/US Survey of Health (JCUSH)
  National Longitudinal Survey of Children and Youth (NLSCY)
  Participation and Activity Limitation Survey (PALS)
  Discharge Abstract Database (DAD), conducted by CIHI

Microdata Access for Researchers: Synthetic Files
These microdata do not contain actual “real” cases; they contain pseudo-cases that give aggregate results close to those from the “real” cases.
These files are prepared so that analysis runs for the master file can be created without any possibility of disclosing or identifying cases.
Results from synthetic files are NOT to be reported; they are strictly for preparing analyses of master files.
Synthetic files are usually associated with longitudinal files, e.g. NLSCY and CCHS.

Real Time Remote Access (RTRA)
RTRA is an online remote access facility that allows users to run SAS programs, more or less in real time, against microdata held in a central and secure location.
Researchers using the RTRA system do not gain direct access to the microdata and cannot view its content. Instead, they submit SAS programs to extract results in the form of frequency tables.
Because RTRA researchers cannot view the microdata, they no longer need to become deemed employees of Statistics Canada, which makes for rapid access to microdata files.
Using a secure username and password, researchers have around-the-clock access to survey results from any computer with internet access.
Confidentiality protection of the microdata is automated in the RTRA system, eliminating the need for manual intervention and allowing rapid access to results.
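The idea of returning only frequency tables, with confidentiality checks applied automatically, can be sketched as follows. RTRA itself accepts SAS programs only, so this Python sketch is purely conceptual. The minimum cell count, the rounding to whole numbers, and all record and field names are assumptions chosen for illustration, not the rules the RTRA system actually applies.

```python
from collections import defaultdict


def weighted_frequency(records, variable, weight, min_cell_count=5):
    """Build a weighted frequency table, suppressing small cells.

    A category backed by fewer than `min_cell_count` respondents is
    returned as None (suppressed); otherwise the weighted total is
    rounded to a whole number. Both rules are illustrative assumptions.
    """
    counts = defaultdict(int)
    totals = defaultdict(float)
    for rec in records:
        counts[rec[variable]] += 1
        totals[rec[variable]] += rec[weight]

    table = {}
    for category, n in counts.items():
        table[category] = None if n < min_cell_count else round(totals[category])
    return table


# Hypothetical microdata: six non-smokers and two smokers, each with a survey weight.
records = (
    [{"smoker": "no", "wt": 210.4} for _ in range(6)]
    + [{"smoker": "yes", "wt": 198.7} for _ in range(2)]
)

table = weighted_frequency(records, "smoker", "wt")
print(table)
```

The user sees only the table, never the individual records, which is the property that lets the checks run without manual review.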

Real Time Remote Access (RTRA)
Benefits:
  Access from any computer with internet access, using a secure username and password; no travel to RDCs
  Few conditions on access and use of data
  Full master files available; confidentiality automated
  Deemed employee status not required
Limitations:
  Tabular frequencies only; SAS only
  Only certain statistics available
  Not all data sets available
  Costs
http://www.statcan.gc.ca/rdc-cdr/rtra-adtr/rtra-adtr-eng.htm

Access DLI Data: Odesi Data Portal
Odesi is a web-based data extraction system delivered through Scholars Portal. It:
  provides access to diverse, quality numeric data sets, including the microdata survey collection from DLI, demographic data from Statistics Canada and polling data from Gallup
  facilitates the exploration (searching, data manipulation, creation of summary statistics, graphing and export) of multiple, sophisticated data sets
Access is open to all OCUL institutions and is controlled by IP address.
http://search1.odesi.ca/

Research Data Management (RDM)
RDM is the active organization and maintenance of data throughout its lifecycle, from collection through interpretation and dissemination to the archiving of valuable results.
It applies best practices to ensure data security, accessibility, usability, and integrity throughout the project and after its completion.
Lifecycle stages: Research Design; Data Collection & Creation; Processing & Analyses; Storage & Preservation; Publication, Sharing & Reuse

Research Data Management (RDM)
Why RDM: the rewards of RDM practices are manifold.
Who benefits: Granting Agencies; Researchers; Universities; Research Output; The Public
RDM brings: Efficiency; Impact; Public Good; Transparency; Compliance; Reuse

Research Data Management (RDM): Scholars Portal Dataverse
A multidisciplinary data repository for researchers at Ontario’s universities, available as a service through Scholars Portal, an initiative of the Ontario Council of University Libraries (OCUL).
With SP Dataverse, researchers can share, preserve, cite, explore and analyze research data while controlling data access and distribution.
Supports data DOI registration through DataCite Canada.
http://dataverse.scholarsportal.info/dvn/

Portage DMP Assistant A web-based, bilingual data management planning tool. Available to all researchers across Canada. A guide for best practices in data stewardship. Exportable data management plans. https://assistant.portagenetwork.ca/

Research Data Management (RDM) http://library.mcmaster.ca/rdm

Contact Information
Location: Mills Memorial Library, Room L104/C
Telephone: 905 525-9140 Ext. 23848
E-mail: vivek@mcmaster.ca
Hours of Service: 9:30 am to 1:00 pm and 2:00 pm to 5:00 pm

RDC Microdata Resources / Links
Google “McMaster RDC” for:
  Summary of data sets classified by type and status
  Resources for selected surveys
  Monthly newsletter: goings-on in the McMaster RDC (new data sets, library hours / closures, featured survey) and contact information

A Brief List of Datasets
  Census
  National Household Survey (NHS)
  Canadian Community Health Survey (CCHS), with special cycles on Mental Health, Nutrition Intake and Healthy Aging
  Canadian Health Measures Survey (CHMS)
  National Population Health Survey (NPHS)
  Vital Statistics Database: Births and Deaths
  Canadian Cancer Registry (CCR)
  Canadian Survey on Disability 2012 (CSD)
  Canadian Tobacco Use Monitoring Survey: 1999-2012 (CTUMS)
  Survey on Living with Chronic Diseases in Canada: 2009, 2011, 2014 (SLCDC)
  Survey on Living with Neurological Conditions in Canada 2011 (SLNCC)
  Childhood National Immunization Coverage Survey (CNICS)
  Also: administrative (linked) datasets
For the full list, please visit CRDCN’s website.

Access to Centre
Application process:
  Academic, government funded or public sector access
  Link for “How to Apply”: http://www.statcan.gc.ca/eng/rdc/process
  Security clearance and orientation
McMaster RDC offers:
  Two analysts on-site for orientation and support
  12 workstations with various statistical software
  Conference room

McMaster RDC Information
Location: Mills Memorial Library, Rm 217, McMaster University, 1280 Main Street West, Hamilton ON, L8S 4L6
Note: We will be relocating to the new Wilson Building in the near future.
Academic Director: Byron Spencer (spencer@mcmaster.ca)
Analysts:
  Peter Kitchen (rdc@mcmaster.ca), 905-525-9140 ext. 27968
  Mustafa Ornek (rdc2@mcmaster.ca), 905-525-9140 ext. 27967
Statistical Assistant: Anna Kata (rdc3@mcmaster.ca), 905-525-9140 ext. 27968
Statistics Canada at McMaster RDC website: http://socserv.mcmaster.ca/rdc/

Thank you
See you in the RDC … SOON!
Telling Canada’s story in numbers