Research Data Management towards Data Integration


Research Data Management towards Data Integration
Roman Gerlach, Birgitta König-Ries, Javad Chamanara, David Blaa, Sven Thiel, Martin Hohmuth, Nafiseh Navabpour
Friedrich-Schiller-University, Jena (Germany)
Endowed Chair for Distributed Information Systems (Research Data Management Helpdesk)

Intro
BEXIS 2 is:
- a data management platform (i.e. software)
- designed for large research projects with central data management (incl. a data manager)
- focused on active data (i.e. during the project lifetime)
- focused on tabular data, but not limited to it
- focused on data integration and re-use
- generic, scalable, modular, free and open source
roman.gerlach@uni-jena.de

BExIS++ Project (DFG)
[Diagram labels: BEXIS 2, Software Sustainability, Development, Outreach, Support, Training]

BExIS Community
[Diagram labels: BEXIS 2, BExIS++, BExIS, AquaDiva, iDiv, TerraSensE, GFBio, UFZ Halle, Biodiversity Exploratories, Kilimanjaro, GRK 1086, Jena Experiment, EFForTS, MPI-BGC Research Database, BEFmate, GRK 1666]

What do we do to facilitate data integration and re-use?

No Data in Black Boxes

Let's take a look inside!
[Image: Carl Zeiss Jena Biotar 2.0/58mm f. Exakta (http://www.klassik-cameras.de/Biotar.html)]

Heterogeneity

Heterogeneity
For example: 18,200 different variables in 856 datasets
Download of templates → mapped into ~80 Data Attributes

Example: Tabular data headers
[Figure: a dataset with the Data Structure "Soil Sampling". Variables: Rec. Time, Air Temp., Soil Temp., Humidity. Data Attributes: Timestamp, Temperature, Ratio. Attribute annotations shown: Data Type: DateTime, Unit: None; Unit: Time; Unit: Celsius, Data Type: Float. A sample data table with the columns Rec. Time, Air Temp., Soil Temp. and Hu. accompanies the diagram.]
- Sharing Data Attributes among variables
- Sharing units and data types among Data Attributes
- Good for automatic data conversion, cross-dataset search, and data integration
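The idea on this slide can be sketched in a few lines of Python. This is only an illustration, not BEXIS 2 code: the class names, the pairing of units and data types with attributes, the type and unit given for Ratio, and the second structure "Weather Station" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAttribute:
    """A reusable description of what a column means."""
    name: str
    data_type: str
    unit: str


@dataclass(frozen=True)
class Variable:
    """A column header in one data structure, bound to a shared attribute."""
    label: str
    attribute: DataAttribute


# Attribute details are assumptions pieced together from the slide labels.
TIMESTAMP = DataAttribute("Timestamp", "DateTime", "None")
TEMPERATURE = DataAttribute("Temperature", "Float", "Celsius")
RATIO = DataAttribute("Ratio", "Float", "None")

STRUCTURES = {
    "Soil Sampling": [
        Variable("Rec. Time", TIMESTAMP),
        Variable("Air Temp.", TEMPERATURE),
        Variable("Soil Temp.", TEMPERATURE),
        Variable("Humidity", RATIO),
    ],
    # Hypothetical second structure, added only to show cross-dataset search.
    "Weather Station": [
        Variable("Time", TIMESTAMP),
        Variable("T_air", TEMPERATURE),
    ],
}


def find_variables(structures, attribute_name):
    """Return (structure, variable label) pairs that share one attribute."""
    return [
        (structure_name, variable.label)
        for structure_name, variables in structures.items()
        for variable in variables
        if variable.attribute.name == attribute_name
    ]


temperature_columns = find_variables(STRUCTURES, "Temperature")
print(temperature_columns)
```

Because differently named headers point to the same attribute, a search for "Temperature" finds columns in both structures, and the shared unit and data type tell a converter how to treat each of them.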

Data structure creation
Providing support at dataset design time

Data Package
[Diagram] Red classes come from other packages

Views
- Subset of a dataset obtained by selection or projection
- Purpose: further processing, sharing or sampling; security / digital rights management
- Spanning view: a view across multiple datasets using the same Data Structure
- Only data structure? How about same attributes? Does not apply!
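A minimal sketch of selection, projection and a spanning view, assuming datasets are held as lists of dictionaries. The function names, column names and values are made up for illustration and are not taken from BEXIS 2.

```python
def select(rows, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """Projection: keep only the named columns of every row."""
    return [{column: row[column] for column in columns} for row in rows]


def spanning_view(datasets, predicate, columns):
    """View across several datasets that share the same data structure."""
    structures = {tuple(rows[0].keys()) for rows in datasets if rows}
    if len(structures) > 1:
        raise ValueError("spanning views need datasets with the same data structure")
    combined = [row for rows in datasets for row in rows]
    return project(select(combined, predicate), columns)


# Illustrative data only.
plot_a = [
    {"rec_time": 1, "air_temp": 22, "soil_temp": 18},
    {"rec_time": 2, "air_temp": 23, "soil_temp": 17},
]
plot_b = [
    {"rec_time": 1, "air_temp": 19, "soil_temp": 15},
]

warm = spanning_view(
    [plot_a, plot_b],
    predicate=lambda row: row["air_temp"] > 20,
    columns=["rec_time", "air_temp"],
)
print(warm)
```

The structure check mirrors the restriction on the slide: the view is only built when every dataset has the same columns.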

Metadata level

Metadata level
- Import/export of multiple schemas/standards
- Mapping between different schemas
- User-friendly tools to create metadata: re-use (e.g. enter once, copy, import), guidance (e.g. terminologies, autocomplete), custom structure (standard compliant)
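As a rough illustration of schema mapping and "enter once, copy" re-use, the sketch below translates a metadata record through a crosswalk table. The field names on both sides are hypothetical and do not belong to any real metadata standard or to BEXIS 2.

```python
# Hypothetical crosswalk: field names in a project-specific schema (left)
# mapped to field names in a target schema (right). Neither is a real standard.
CROSSWALK = {
    "title": "dataset_title",
    "owner": "creator",
    "description": "abstract",
}


def map_metadata(record, crosswalk):
    """Translate a metadata record; report fields the crosswalk does not cover."""
    mapped = {
        crosswalk[key]: value
        for key, value in record.items()
        if key in crosswalk
    }
    unmapped = sorted(key for key in record if key not in crosswalk)
    return mapped, unmapped


def copy_metadata(template, **overrides):
    """'Enter once, copy': start a new record from an existing one."""
    return {**template, **overrides}


first = {"title": "Soil sampling 2015", "owner": "Project A", "plot_id": "P1"}
second = copy_metadata(first, title="Soil sampling 2016")
mapped, unmapped = map_metadata(second, CROSSWALK)
print(mapped, unmapped)
```

Returning the unmapped fields separately makes gaps in the crosswalk visible instead of silently dropping information.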

System level
Interaction with external systems:
- Persistent Identifier Providers
- Authentication Providers (e.g. LDAP)
- Annotation Providers (GFBio terminology services)
- Geographic Information Systems

Web API Data Access
Sample REST API calls: Data
- http://www.name.com/api/data/6
- /api/data/6?header=id,name
- /api/data/6?filter=(Grade>50 AND Grade <90)
- /api/data/6?header=id,name&filter=(Grade>50)
Sample REST API calls: Metadata
- http://www.name.com/api/metadata/6
- http://www.name.com/api/metadata/6?ConvertTo=EML
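The sample calls can be assembled programmatically. The sketch below only builds the URLs and sends no request; `www.name.com` is the placeholder host from the slide, the helper names are invented here, and percent-encoding the filter expression is an assumption about what a server would accept rather than documented API behaviour.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://www.name.com"  # placeholder host used on the slide


def data_url(dataset_id, header=None, filter_expr=None):
    """Build a data URL with optional column list and row filter."""
    params = {}
    if header:
        params["header"] = ",".join(header)
    if filter_expr:
        params["filter"] = filter_expr
    url = f"{BASE_URL}/api/data/{dataset_id}"
    if params:
        # Keep commas and parentheses readable; encode spaces, '<' and '>'.
        url += "?" + urlencode(params, quote_via=quote, safe=",()")
    return url


def metadata_url(dataset_id, convert_to=None):
    """Build a metadata URL, optionally asking for a converted format."""
    url = f"{BASE_URL}/api/metadata/{dataset_id}"
    if convert_to:
        url += "?" + urlencode({"ConvertTo": convert_to})
    return url


print(data_url(6))
print(data_url(6, header=["id", "name"], filter_expr="(Grade>50)"))
print(metadata_url(6, convert_to="EML"))
```

In the URLs, `header` plays the role of a projection (which columns to return) and `filter` the role of a selection (which rows to return).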

Conclusion
- Facilitating data integration is one of the big challenges in data life cycle management
- Data integration starts with data design
- The system should provide support (e.g. data structure design)

Further Reading
- A conceptual model for data management in the field of ecology, Javad Chamanara, Birgitta König-Ries, Journal of Ecological Informatics, volume 24, November 2014, pages 261–272, doi:10.1016/j.ecoinf.2013.12.003
- An Extensible Conceptual Model for Tabular Scientific Datasets, Javad Chamanara, Michael Owonibi, Alsayed Algergawy, Roman Gerlach, The International Symposium on Challenges for Designing and Using Datasets (DATASETS 2015), June 21 - 26, 2015, Brussels, Belgium, http://www.thinkmind.org/index.php?view=article&articleid=immm_2015_5_20_98008
- BEXIS 2 Tech Talk Series: https://youtu.be/ANGAVoZHTII
- Conceptual Model: http://fusion.cs.uni-jena.de/bppCM/index.htm

Thanks! Questions? Contact: roman.gerlach@uni-jena.de http://bexis2.uni-jena.de/