NORTHWEST CENSUS RESEARCH DATA CENTER (NWCRDC) Mark Ellis Director, Northwest Census Research Data Center (NWCRDC) Director, Center for Studies in Demography.

Presentation transcript:

NORTHWEST CENSUS RESEARCH DATA CENTER (NWCRDC)
Mark Ellis
Director, Northwest Census Research Data Center (NWCRDC)
Director, Center for Studies in Demography and Ecology (CSDE)
Professor of Geography, University of Washington, Seattle

What is the NWCRDC?
- Part of a network of Census Research Data Centers
- NWCRDC is the most recent to open (June 2012):
  - Atlanta
  - Boston
  - California, Berkeley
  - California, Los Angeles
  - California, Stanford
  - Census Headquarters
  - Chicago
  - Michigan
  - Minnesota
  - New York, Baruch
  - New York, Cornell
  - Northwest
  - Texas (Coming Soon)
  - Triangle, Duke
  - Triangle, RTI International

What are RDCs?
- RDCs provide secure access to restricted-use microdata from a range of federal agencies: Census Bureau, IRS, National Center for Health Statistics, etc.
- Qualified researchers with approved projects can conduct research in RDCs that benefits Census Bureau programs
- RDCs operate as joint partnerships between a host institution (university or research organization) and the Census Bureau
- RDC network is managed by the Census Bureau's Center for Economic Studies:
- On-site RDC administrator is a Census Bureau employee paid by the host institution
- NWCRDC administrator: Mike Babb

What is available in restricted access or non-public data?
- Access to full population samples (e.g. full ACS or census 1-in-6 long form data)
- Access to microdata not released in any public version
- No top coding (e.g. income)
- Much finer geographies (e.g. microdata by census tract)
- Ability to link observations via non-public link keys to create new datasets
- Can link external data sources to restricted data through geography, address matching, etc.
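To make the "no top coding" point concrete, here is a minimal sketch of top coding in its simplest form: a public-use file reports any value above a ceiling as the ceiling itself, whereas the restricted file keeps the true value. The cap of 150,000 and the function name are hypothetical, chosen only for illustration; they are not taken from any Census Bureau product.

```python
def top_code(value, cap):
    """Return the value as a top-coded public file might report it (capped at `cap`)."""
    return min(value, cap)


# Hypothetical cap, for illustration only.
HYPOTHETICAL_CAP = 150_000

# True incomes, as they would appear in restricted-use microdata.
incomes = [42_000, 150_000, 980_000]

# The same incomes after top coding: the highest earner is no longer distinguishable.
public_view = [top_code(x, HYPOTHETICAL_CAP) for x in incomes]
```

With top coding, analyses of the upper tail of the income distribution lose information; the restricted data avoid that loss.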

Sample data sets
- Demographic, e.g. American Community Survey, Decennial Census, SIPP, NLS, etc.
- Economic, e.g. Census Survey of Manufactures, Census of Services, Commodity Flow Survey, etc.
- Linked Demographic and Economic, e.g. LEHD
- Public Health, e.g. National Health Interview Survey, National Health and Nutrition Examination Survey, National Longitudinal Mortality Study, Medical Expenditure Panel Survey, etc.
- Go here for more information on data:

Technical details: how do RDCs provide access?
- Thin clients: encrypted VPN to secure servers at the bureau
- Linux OS, with SAS, Stata, R, etc.
- No downloads possible, nothing stored locally
- No printing unless RDC admin is present, and printouts can never be removed
- No laptops or cameras in the lab
- To release output, researchers submit completed analyses for disclosure review to ensure confidentiality is maintained
- 24/7 access with keycode for door entry and security system
- Researchers can work when they want without administrator present
- Researchers can only use datasets requested in their proposal
- Can only use requested data for the purposes outlined in the proposal, for the specified length of the project

Procedures for accessing an RDC
- Contact RDC administrator, discuss idea
- Submit preliminary proposal
  - Outline idea
  - Specify datasets needed, show clear need for restricted-use data
  - Outline benefits to the bureau (more on this in upcoming slide)
  - Follow preliminary proposal template
  - Preliminary proposal will be fine-tuned based on local RDC input, then passed on to Census for evaluation
- Full proposal development
  - single-spaced page proposal outlining problem, science, need for data
  - Predominant purpose statement describing how project will benefit bureau
  - Submitted through CES website
- Submit proposal, apply for Special Sworn Status (security clearance)
- Aim is for proposals to be reviewed in 90 days; those using FTI (federal tax information) require IRS review, which takes (sometimes much) longer

Useful links
- CES proposal writing webpage with templates and guidelines for writing preliminary proposal, full proposal and benefits/predominant purpose statement:
- Sample proposals and benefits statements are available here:
- Always consult with your RDC administrator beforehand

Benefits
- Title 13 US Code requires that any access to confidential data benefit the bureau's data collection programs
- These 13 possible benefits (need to pick one or more) are the predominant purpose of RDC research (first five):
  1. Evaluating concepts and practices underlying Census Bureau statistical data collection and dissemination practices, including consideration of continued relevance and appropriateness of past Census Bureau procedures to changing economic and social circumstances.
  2. Analyzing demographic and social or economic processes that affect Census Bureau programs, especially those that evaluate or hold promise of improving the quality of products issued by the Census Bureau.
  3. Developing means of increasing the utility of Census Bureau data for analyzing public programs, public policy, and/or demographic, economic, or social conditions.
  4. Conducting or facilitating Census Bureau census and survey data collection, processing or dissemination, including through activities such as administrative support, information technology support, program oversight, or auditing under appropriate legal authority.
  5. Understanding and/or improving the quality of data produced through a Title 13, Chapter 5 survey, census or estimate.
  etc.

Health care data: some differences in procedure
- Projects using public health data
- National Center for Health Statistics (NCHS) data
  - Go here for details on application:
  - No benefits statement
  - Contact and work with NCHS staff to ensure successful proposal
- Agency for Healthcare Research and Quality (AHRQ) data
  - Go here for details on application
  - No benefits statement
  - Contact and work with AHRQ staff to ensure successful proposal

Getting research out: disclosure process
- Researcher writes a report describing the research outputs requested, listing variables and models and how they were estimated or constructed
- Some key issues:
  - Cell sizes for categorical variables in models
  - Tabular output can be a problem (cell size, amount of data, etc.) and is discouraged, but it can be requested
  - Models and tables based on small groups in small areas may lead to numbers below the disclosure threshold; that output will be blacked out
  - Model estimates based on cell counts below threshold will be reported with significance and sign only
  - Complementary disclosure issues
    - Prior release
    - Public data
- It may take time to get your results out, especially if the risks are high
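The cell-size issue can be sketched in a few lines. This is only an illustration of the idea that counts below a disclosure threshold are blacked out; the threshold of 10, the tract and group labels, and the function name are all hypothetical, and the actual thresholds and review rules are set by the Census Bureau, not by this sketch.

```python
def suppress_small_cells(counts, threshold):
    """Return a copy of `counts` with every cell below `threshold` replaced by None."""
    return {cell: (n if n >= threshold else None) for cell, n in counts.items()}


# Hypothetical threshold, for illustration only.
HYPOTHETICAL_THRESHOLD = 10

# Hypothetical tabulation: (area, group) -> number of observations.
tract_counts = {
    ("tract A", "group 1"): 240,
    ("tract A", "group 2"): 4,   # small group in a small area
    ("tract B", "group 1"): 57,
}

released = suppress_small_cells(tract_counts, HYPOTHETICAL_THRESHOLD)
```

This sketch handles only primary suppression. It does not address complementary disclosure, where a blacked-out cell could be recovered by subtraction from row or column totals, from prior releases, or from public data.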

NWCRDC access and fee policies
- UW researchers without grants get access without fees
- UW researchers with grants pay $20,000 a year for a seat (assumes about 40% time, roughly 2 days a week on average). Exception: those with NSF grants pay before Sept 2014
- UW grad students get access without fees; must apply with adviser/faculty member
- OFM users get access for one year from opening
- All other users pay fees: typically $20,000 a year per seat, can be prorated, but minimum fee is $10,000
- More information at NWCRDC website:
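As a rough worked example of the external-user fee, the sketch below assumes proration is linear in the fraction of a year used. The slide says only that the fee "can be prorated", so the linear rule is an assumption, and the function name is hypothetical.

```python
FULL_SEAT_FEE = 20_000  # typical annual fee per seat, from the slide
MINIMUM_FEE = 10_000    # minimum fee, from the slide


def external_seat_fee(fraction_of_year):
    """Estimate the fee for one seat, assuming linear proration with a floor."""
    if not 0 < fraction_of_year <= 1:
        raise ValueError("fraction_of_year must be in (0, 1]")
    return max(MINIMUM_FEE, FULL_SEAT_FEE * fraction_of_year)
```

Under this assumption a nine-month project would cost $15,000, while any project of six months or less would cost the $10,000 minimum.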

These people and institutions made the NWCRDC possible
- UW College of Arts and Sciences
  - Associate Dean of Research, Steve Majeski
  - Associate Dean of Social Science, Judy Howard
  - Dean Ana-Mari Cauce
- School of Social Work
  - Associate Dean for Research, David Takeuchi
  - Dean, Edwina Uehara
- Central Administration
  - Vice-Provost for Research, Dave Eaton
- State of Washington Office of Financial Management, Marc Baldwin
- National Science Foundation
THANK YOU!

My own experience as an RDC user as an illustration…
- Two RDC projects, both requiring extra sample size and census tract information from census long form data
  - Segregation at work and home for immigrants and US-born racial and ethnic groups
  - Urban geography and segregation of mixed-race couples
- Both conducted at UCLA RDC, with some revisions at Berkeley RDC
- Both with parallel submissions to NSF/Russell Sage Foundation for support and to RDC for access
- For both, RDC proposal was a slightly modified version of NSF/RSF proposal with an additional 2-3 page benefits statement
- NSF/RSF review and funding decision faster than RDC approval
- My impression is that our RDC reviews were problematic (slower) in both cases because we wanted lots of tabular and mapping output in addition to model estimates

My own experience as an illustration…
- We used the uncertainty about mapping output (which is really table output) in our benefits statement.
- We explored ways to release mapping output that minimized risk of disclosure: a benefit under criterion 6 (Leading to new or improved methodology to collect, measure, or tabulate a Title 13, Chapter 5 survey, census or estimate).
- Other benefits too:
  - criterion 5 (understanding and/or improving data…): how do new census race categories affect segregation measurement
  - criterion 11 (Preparing estimates of population and characteristics of population…): how to count mixed-race households under new census race categories
- Some maps…
- Sample disclosure request memos