Enabling direct data access to social science research data


Presentation transcript:

Enabling direct data access to social science research data within the GESIS Data Catalogue DBK
Wolfgang Zenk-Möltgen
DI4R Digital Infrastructures for Research 2016, 28-30 September, Krakow, Poland

The Data Catalogue DBK
Social science research data: surveys and survey programmes, time series, and comparative studies, covering a broad range of topics.
The DBK at the GESIS Data Archive contains study descriptions of primary data from survey research and historical time series. It has more than 5100 studies, e.g. ALLBUS, ISSP, Eurobarometer, Politbarometer, DeutschlandTrend, European Values Study, German Longitudinal Election Study, Comparative Study of Electoral Systems, Youth Studies, and many more. It uses a generic metadata schema for the description of social science data and has been developed since the 1960s, together with international archives and the DDI Alliance. It provides metadata management for publishing the metadata on different retrieval and distribution platforms. It includes a version history for the datasets, including errata documentation and persistent identifiers (DOIs by da|ra).
https://dbk.gesis.org/dbksearch/

Metadata access possibilities
Metadata for discovery is available through the web UI and through OAI-PMH, which provides Dublin Core, DDI-C/L, and DCAT. An API is therefore available for metadata via OAI-PMH, but there is no data API.
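
As an illustration of metadata access via OAI-PMH, here is a minimal Python sketch. It assumes a placeholder base URL (not the actual DBK endpoint, which the slides do not give) and a made-up sample response. `verb=ListRecords` and `metadataPrefix=oai_dc` are standard OAI-PMH request parameters; the function names are hypothetical.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of Dublin Core elements inside an oai_dc record.
DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Made-up response for illustration only; not real DBK output.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example Study</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def build_list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for the given endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_titles(oai_xml):
    """Return the text of every dc:title element in an OAI-PMH response."""
    root = ET.fromstring(oai_xml)
    return [el.text for el in root.iter(f"{DC_NS}title")]


# The base URL is a placeholder (assumption), not the DBK endpoint.
url = build_list_records_url("https://example.org/oai")
titles = extract_titles(SAMPLE_RESPONSE)
```

A request of this kind returns study descriptions only. The data files themselves are not reachable this way, which is the gap a data API would fill.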

Current data access possibilities
Users download or order data. There are needs for more diverse data formats, interdisciplinary analysis methods, linking data from different sources, and data which are rapidly changing.
An internal project currently converts data automatically, based on ASCII/CSV data plus definitions. The data files are rectangular (columns = variables, rows = cases) and range from 100 kB to 221 MB, with a mean of 12 MB (43.4 GB in total). In datorium, a single Twitter dataset has 7.8 GB.
But this still requires a download by the users.

The Data Tank
Goal: to provide an API for the users. Established standards should be used, the services should be RESTful, and no data duplication should be needed.
The Data Tank (http://thedatatank.com/, by Open Knowledge Belgium) is an open source data API. It accepts diverse input formats and delivers diverse output formats via the API: XML, HTML, JSON, or CSV. It supports semantic technologies and DCAT-AP metadata, and runs on an Apache, MySQL, PHP stack.

Installation and integration
Installation issues due to Windows (WAMP): different ports had to be used, since IIS and MySQL were already running on the server; a PHP extension (ext-pcntl) is not supported under Windows; the dBase extensions were not working; the caching module (memcached) was not available, so a file cache was used instead.
Data issues due to the CSV format: the tab delimiter is not supported, and the double-quote text qualifier is not supported.
The design with "Collection" and "URI" for datasets has to be defined.
Setup: a WAMP server using data from DBK. Steps: take existing CSV datasets, define Collection and URI, map some metadata.
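
One possible workaround for the two CSV issues is to pre-process files before they are published through the API. The sketch below is an assumption about how that could be done, not the conversion used in the project. It reads a tab-delimited file with double-quote text qualifiers and writes plain comma-delimited lines without qualifiers. Replacing embedded commas with `;` and embedded double quotes with `'` is an illustrative choice, and the function name is hypothetical.

```python
import csv
import io


def normalize_delimited(src, dst, delimiter="\t", comma_substitute=";"):
    """Convert quoted, tab-delimited text to unquoted, comma-delimited text.

    Assumption: the target cannot parse text qualifiers, so characters that
    would require quoting are replaced inside the values.
    Returns the number of rows written (including the header row).
    """
    reader = csv.reader(src, delimiter=delimiter, quotechar='"')
    rows = 0
    for row in reader:
        cleaned = []
        for value in row:
            value = value.replace(",", comma_substitute)  # keep the column count stable
            value = value.replace('"', "'")               # no text qualifiers in the output
            value = value.replace("\r", " ").replace("\n", " ")  # one case per line
            cleaned.append(value)
        dst.write(",".join(cleaned) + "\n")
        rows += 1
    return rows


# Usage with in-memory files; real files should be opened with newline="".
source = io.StringIO('id\tlabel\n1\t"Strongly agree, mostly"\n')
target = io.StringIO()
written = normalize_delimited(source, target)
```

The substitution changes the stored values, so it would have to be documented alongside the dataset. An alternative design would reject files that contain the delimiter inside a value.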

Challenges and problems
Research workflow: research question, find data, analyze data.
Access restrictions by dataset: social science data raise privacy concerns even after anonymization.
Versioning: the archive creates updates and versions for the dataset, which is an issue for the API structure.
Usage stats needed: the archive needs statistics about usage.
Performance: the data API must be reliable and quick.
Selecting variables (paging by columns): a rectangular dataset is never used completely, so variable selection seems necessary.
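
Variable selection and paging by columns on a rectangular CSV file could look roughly like the following Python sketch. It is an illustration under assumptions, not the behaviour of The Data Tank or of the DBK: the function name, the parameters `variables`, `col_offset`, and `col_limit`, and the sample data are all hypothetical.

```python
import csv
import io


def select_variables(src, variables=None, col_offset=0, col_limit=None):
    """Yield the header and the cases of a CSV file, restricted to some columns.

    Either name the wanted variables explicitly, or page through the columns
    with col_offset and col_limit. Rows are streamed one at a time, so the
    whole file never has to be held in memory.
    """
    reader = csv.reader(src)
    header = next(reader)
    if variables is not None:
        missing = [v for v in variables if v not in header]
        if missing:
            raise KeyError(f"unknown variables: {missing}")
        indices = [header.index(v) for v in variables]
    else:
        end = None if col_limit is None else col_offset + col_limit
        indices = list(range(len(header)))[col_offset:end]
    yield [header[i] for i in indices]
    for row in reader:
        yield [row[i] for i in indices]


# Made-up sample data: columns are variables, rows are cases.
SAMPLE = "id,age,sex,vote\n1,34,f,A\n2,51,m,B\n"

by_name = list(select_variables(io.StringIO(SAMPLE), variables=["id", "vote"]))
second_page = list(select_variables(io.StringIO(SAMPLE), col_offset=1, col_limit=2))
```

Selecting by name fits a researcher who already knows which variables are needed. Paging by column offset fits a generic client that browses a wide dataset in slices.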

Future
After solving all the issues, researchers may have direct access to data via API, organized by a "personal space" at DBK. In this personal space, the user has personal data access possibilities and personal statistics. It could also be used by users for uploading resulting datasets, to enable preservation, citation, and sharing.
Thank you very much!
Contact: Wolfgang.Zenk-Moeltgen@gesis.org