Connecting Researchers with Data: Discovery, Documentation, Access and Security Cornell Institute for Social and Economic Research (CISER); German Institute.

Presentation transcript:

Connecting Researchers with Data: Discovery, Documentation, Access and Security Cornell Institute for Social and Economic Research (CISER); German Institute for Employment Research (IAB); and the Cornell Labor Dynamics Institute (LDI)

Researchers Searching for Data


Connecting Researchers with Data
CED2AR; CRADC; FDZ, IAB; DDI

Discovering Data Files: Do Data Exist?
First barrier to research: a research question is formulated, but are there existing data files to address the question?
Data files are discovered by searching descriptions of studies, files, and variables
The first barrier is discovery

Restricted Data: What's in that Dataset?
Second barrier to research with restricted data:
  Researchers need to examine the metadata of a dataset in order to prepare a research plan for the data application
  Data providers want research plans to specify why restricted data files are necessary and how the variables will be analyzed
  Enclaves and libraries want to help researchers search and discover, in greater detail than abstracts allow, what relevant data files exist
The second barrier of restricted data research is discovering what's in the file!

Searching Restricted Labor Market Data
Labor market datasets for longitudinal employment data:
  National Longitudinal Surveys (NLSY)
  Longitudinal Employer-Household Dynamics (LEHD)
  Sample of Integrated Labor Market Biographies (SIAB)
That was the study level, but how do we compare the variables?
Searching across variables of restricted data is painful; researchers rely on:
  Codebooks
  Abstracts
  Having seen the dataset during a prior project
The majority of these resources are not machine readable
There needs to be an accessible discovery tool

CED2AR: Comprehensive Extensible Data Documentation and Access Repository
Developed by the Cornell Node of the NSF Census Research Network (NCRN)
Enables search and browsing across codebooks
Designed to improve the discoverability of both public and restricted data
Lightweight, DDI-driven web application
Based upon leading metadata standards
Ingests data from a variety of sources
Study levels include, as applicable:
  Abstract
  Variables
  Values
  Citation
  Terms of Use
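Searching across codebooks can be illustrated with a minimal Python sketch. It is not CED2AR's implementation: the two codebooks, their study names, and the `search_variables` helper are hypothetical, and the XML is a simplified, namespace-free fragment modeled on the DDI Codebook pattern of a `var` element with a `name` attribute and a `labl` child.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified codebook fragments (no XML namespace), modeled on
# the DDI Codebook pattern <var name="..."><labl>...</labl></var>.
CODEBOOKS = {
    "study_a": """<codeBook><dataDscr>
        <var name="wage"><labl>Daily wage in euros</labl></var>
        <var name="educ"><labl>Highest educational attainment</labl></var>
    </dataDscr></codeBook>""",
    "study_b": """<codeBook><dataDscr>
        <var name="earn"><labl>Quarterly earnings (wage and salary)</labl></var>
        <var name="naics"><labl>Industry code</labl></var>
    </dataDscr></codeBook>""",
}


def search_variables(codebooks, term):
    """Return (study, variable name, label) for each variable whose
    name or label contains the search term, ignoring case."""
    term = term.lower()
    hits = []
    for study, xml_text in codebooks.items():
        root = ET.fromstring(xml_text)
        for var in root.iter("var"):
            name = var.get("name", "")
            label = (var.findtext("labl") or "").strip()
            if term in name.lower() or term in label.lower():
                hits.append((study, name, label))
    return hits


for hit in search_variables(CODEBOOKS, "wage"):
    print(hit)
```

Because the metadata is machine readable, one query reaches variables in both studies, including a variable whose name differs but whose label mentions the term.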

Utilizing CED2AR to Locate SIAB Variables
The addition of IAB SUF SIAB metadata to CED2AR will enable researchers to search across multiple labor market data series*
CISER is currently collaborating with IAB to develop the DDI necessary to add the SIAB metadata to CED2AR
  Promote use of the SIAB data through discoverable documentation
  Make research plans easier to create
*https://www2.ncrn.cornell.edu/ced2ar-web/about

Restricted Data Access
The big barrier to restricted data access is balancing researcher access with data security
It's ideal when data is housed within a controlled environment
It's also ideal when researchers have the most flexible, secure access to the data

IAB SUF Access and Data Security
IAB Highly Restricted Data
  Originally only accessible at Nuremberg, Germany
  Implemented a Research Data Center in Research Data Center (RDC-in-RDC) approach
  Allowed for a controlled remote access environment from the established Nuremberg research data center to comparably secure research data centers using Citrix thin-client technology
IAB SUF Data
  Designed specifically by IAB for off-site access
  Originally sent to authorized researchers on CD, later by transmission, to be destroyed at the end of the project
  These methods provided access, but posed a security risk because they relied on individual accountability and left copies in system back-ups
Needed a new, more secure off-site access mode: CRADC

Cornell Restricted Access Data Center (CRADC)
Cornell Institute for Social and Economic Research (CISER)'s remote access virtual data enclave
Controlled remote access environment accessible worldwide
As a secure computing environment with remote access, CRADC exists to:
  House and protect restricted research data
  Help PIs comply with requirements of Data Providers
  Provide a computing platform as flexible as data use agreements permit
Any approved researcher can securely access restricted project files
  Restricted data is available to the researcher on the CRADC server
  Restricted data is held on a drive with only read and execute permissions
  A project work space is provided on a separate drive, with access to analysis applications
  No access is given to email, internet, clipboard, printer sharing, or disk mapping, and analysis can only be saved within the project work space
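The two-drive permission model described above can be sketched as a small policy check in Python. This is an illustration only, not how CRADC enforces access: the drive letters, folder names, file names, and the `is_permitted` helper are all hypothetical, and a real enclave would rely on operating-system permissions rather than application code.

```python
from pathlib import PureWindowsPath

# Hypothetical locations: the slides only say that restricted data and the
# project work space sit on separate drives.
DATA_ROOT = PureWindowsPath("D:/restricted")
WORK_ROOT = PureWindowsPath("P:/project")

# Restricted data: read and execute only. Work space: results may be saved.
ALLOWED = {
    DATA_ROOT: {"read", "execute"},
    WORK_ROOT: {"read", "write", "execute"},
}


def is_permitted(path, action):
    """Return True if `action` is allowed on `path` under the sketch policy."""
    p = PureWindowsPath(path)
    if ".." in p.parts:
        # Refuse parent-directory references instead of trying to resolve them.
        return False
    for root, actions in ALLOWED.items():
        if p == root or root in p.parents:
            return action in actions
    # Anything outside the two known roots is denied.
    return False


print(is_permitted("D:/restricted/example.dta", "read"))   # data can be read
print(is_permitted("D:/restricted/example.dta", "write"))  # but not modified
print(is_permitted("P:/project/results.log", "write"))     # output stays in the work space
```

Denying by default mirrors the slide's rule that analysis can only be saved within the project work space.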

Institute for Employment Research (IAB) Scientific Use Files (SUF)
Contain confidential administrative microdata on labor markets:
  Employment
  Unemployment benefit receipts
  Participation in labor market programs and registered job search
  A large number of socio-economic characteristics
Similar in origin to the highly restrictive files, but factually anonymized, enabling wider access
  The highly restrictive files can only be accessed through a thin client connection directly to the IAB servers in Nuremberg, Germany
SUF files are accessed through CISER's remote access virtual data enclave (CRADC), via secure computing accounts using Remote Desktop Connection or Terminal Service Client
  SUF access is available worldwide to any researcher approved by IAB
  No affiliation with Cornell is required

CRADC: SIAB Data
Access the IAB SUF data worldwide
CRADC currently supports SIAB researchers in:
  Canada
  Germany
  Spain
  United Kingdom
  United States
CRADC provides the controlled environment for each researcher and project based on IAB's terms
IAB no longer needs to send restricted data CDs or transmissions
Researchers are permitted SFTP access from their project work space to their campus office via static IP address
CRADC can confirm that all project data and analysis files are securely destroyed at the end of any IAB data use agreement
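Restricting SFTP transfers to a registered static IP address can be sketched in Python as follows. This is one reading of the slide, not CRADC's actual configuration: the project identifiers, the `REGISTERED` table, and the `sftp_destination_allowed` helper are hypothetical, and the addresses come from ranges reserved for documentation.

```python
import ipaddress

# Hypothetical registrations: project id -> the researcher's static campus IP.
# 192.0.2.0/24 and 198.51.100.0/24 are documentation-only ranges (RFC 5737).
REGISTERED = {
    "proj-001": "192.0.2.10",
    "proj-002": "198.51.100.7",
}


def sftp_destination_allowed(project_id, destination_ip):
    """Allow a transfer only to the static IP registered for the project."""
    registered = REGISTERED.get(project_id)
    if registered is None:
        return False  # unknown project: deny
    try:
        return ipaddress.ip_address(destination_ip) == ipaddress.ip_address(registered)
    except ValueError:
        return False  # malformed address: deny


print(sftp_destination_allowed("proj-001", "192.0.2.10"))
print(sftp_destination_allowed("proj-001", "198.51.100.7"))
```

Comparing parsed addresses rather than raw strings avoids mismatches caused by formatting, and every failure path denies the transfer.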

Discovering SIAB through CED2AR, Accessing on CRADC
Labor market datasets for longitudinal employment data
Discover the study and variables using CED2AR
  Search and browse across codebooks
  Compare public and restricted dataset versions
Select SIAB
Create a SIAB research plan using CED2AR variables and values
Review the terms of use to confirm that the planned analysis will be accepted (the data security plan is already covered)
IAB approves the longitudinal employment data project
Gain access to the SIAB data through CRADC
Complete SIAB analysis, notify IAB of project completion
CRADC will close the restricted project
Browse or search CED2AR for new project variables

Our Team
IAB: Dana Muller, David Schiller, Joerg Heining
CISER: Ben Perry, Michelle Edwards, Stephanie Jacobs, Warren Brown

Promoting Data through Discoverability
Data Security
Researcher Access

Thank you for your time and attention!
Warren Brown, Stephanie Jacobs
cradc@cornell.edu
ciser.cornell.edu