Published by Dulcie Garrett. Modified over 8 years ago.
1
An Open Data Platform in the framework of the EGI-LifeWatch Competence Centre
Fernando Aguilar (aguilarf@ifca.unican.es)
Jesús Marco (marco@ifca.unican.es)
Ana Yaiza Rodríguez (yaiza.rodriguez@aeonium.eu)
Instituto de Física de Cantabria (IFCA), Santander, Spain
2
Introduction
The data life cycle has several steps that are not always well managed, which can lead to irreproducibility.
The steps are not isolated from one another.
LifeWatch, in collaboration with EGI, proposes a framework to manage the whole cycle.
3
Introduction
Data life cycle stages (diagram): Describe & Collect, Curate, Integrate, Analyze, Preserve, Publish, Plan
4
What is LifeWatch?
LifeWatch is an e-science and technology infrastructure for biodiversity and ecosystem research that supports the scientific community and other users.
It is putting in place the infrastructure and information systems needed to provide an analytical platform for modeling and simulating both existing and new biodiversity data, in order to improve knowledge of biodiversity functioning and management.
Examples of relevant case studies:
– Invasive species
– Evolution of wetlands
– Evaluating the ecological quality of habitats
EGI-LifeWatch Competence Centre: establishes a direct collaboration between EGI.eu and the ESFRI LifeWatch to address specific needs (resources, workflows, data streaming, citizen science).
5
Context
Framework: internationalization of EBD-CSIC. Updating of information and knowledge systems.
Doñana National Park: an area of marshes, shallow streams, sand dunes and delta, with unique biodiversity (migratory birds and endangered species).
Integrating resources in the dedicated pilot project:
– New servers (around 1000 cores, including large RAM, GPU, etc.)
– New storage (1 + 1 PB)
– Dark fiber network connection
FEDER + Ministry funds. Industry collaboration. LifeWatch support.
6
Example of Pilot Implementation
Components (diagram): Cloud IaaS (integrated in EGI.eu FedCloud), Distributed Control, Collaborative Environments, Data Acquisition, Data Portal, Storage Solution
Roles (diagram): Final User (researcher/manager), VO Manager, Manage/dev
7
LifeWatch OSF
8
Data Management Planning
Data life cycle management: DMPs are mandatory (H2020, NSF, etc.).
A DMP must cover: data handling (during and after the project), what data will be collected, methodology, standards, sharing, curation, preservation…
DMPTool (https://dmp.cdlib.org/) extended with new functionalities:
– DOI allocation for a DMP
– Adding taxonomies to DMPTool
– Ability to associate taxonomies with sections of a DMP
– Adapted PDF / RTF generation for viewing the new features
– Generation of RDF/XML associated with a DMP
– Integration with the data storage and the LifeWatch Open Science Framework
9
Data Management Planning
Added value:
– Taxonomies of instruments, parameters (e.g. NASA SWEET), etc.
– Export in RDF, XML, PDF…
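As an illustration of the RDF/XML export and of taxonomies attached to DMP sections, here is a minimal sketch using only the Python standard library. It is not the DMPTool extension itself: the use of Dublin Core terms (`dcterms:identifier`, `dcterms:hasPart`, `dcterms:subject`), the function name `dmp_to_rdf_xml`, the section URI scheme and the example DOI and taxonomy URI are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCT_NS = "http://purl.org/dc/terms/"  # Dublin Core terms, assumed vocabulary


def dmp_to_rdf_xml(doi, title, sections):
    """Serialize a DMP as RDF/XML (hypothetical layout).

    sections maps a section title to a list of taxonomy term URIs.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dcterms", DCT_NS)

    root = ET.Element(f"{{{RDF_NS}}}RDF")
    plan_uri = f"https://doi.org/{doi}"

    # One rdf:Description for the plan, identified by its DOI.
    plan = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": plan_uri}
    )
    ET.SubElement(plan, f"{{{DCT_NS}}}identifier").text = doi
    ET.SubElement(plan, f"{{{DCT_NS}}}title").text = title

    # One rdf:Description per section, linked from the plan and
    # annotated with its taxonomy terms.
    for index, (section_title, term_uris) in enumerate(sections.items(), start=1):
        section_uri = f"{plan_uri}#section-{index}"
        ET.SubElement(
            plan, f"{{{DCT_NS}}}hasPart", {f"{{{RDF_NS}}}resource": section_uri}
        )
        section = ET.SubElement(
            root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": section_uri}
        )
        ET.SubElement(section, f"{{{DCT_NS}}}title").text = section_title
        for term_uri in term_uris:
            ET.SubElement(
                section, f"{{{DCT_NS}}}subject", {f"{{{RDF_NS}}}resource": term_uri}
            )

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder DOI and taxonomy URI, for illustration only.
    print(
        dmp_to_rdf_xml(
            "10.1234/example-dmp",
            "Example DMP",
            {"Data collection": ["https://example.org/taxonomy/instrument/thermometer"]},
        )
    )
```

Keeping each section as its own resource is one way to let a taxonomy term apply to a single section rather than to the whole plan.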
10
Data Acquisition
The open source PyVISA software and the Remote Instruments Application are installed on a remote computer.
Data acquisition is configured via a web service interface.
Data are sent to final storage.
The Remote Instruments Application is usable via an IPython Notebook, which serves as a friendly front-end interface.
11
Data Acquisition
Web service interface: REST interface for setting up instruments and periodic tasks.
Configuration storage: saves instrument connection parameters and periodic task configurations.
PyVISA and backends: libraries for connecting to instruments.
Readings temporary storage: stores readings before sending them.
Usable via an IPython/Jupyter Notebook as a friendly front end.
Readings are sent to storage via a simple REST API with the fields: country id, application id, instrument id, parameter id, data owner, reading data, reading time.
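The temporary storage and the readings REST call described above could look roughly like the following sketch. Only the seven field names come from the slide. The JSON encoding, the snake_case keys, the `ReadingBuffer` class and the endpoint URL are assumptions for illustration, not the actual Remote Instruments Application API.

```python
import json
from collections import deque
from datetime import datetime, timezone
from urllib import request

# Field names taken from the slide; the snake_case spelling is assumed.
READING_FIELDS = (
    "country_id", "application_id", "instrument_id", "parameter_id",
    "data_owner", "reading_data", "reading_time",
)


def build_reading(country_id, application_id, instrument_id, parameter_id,
                  data_owner, reading_data, reading_time=None):
    """Build one reading as a JSON-serializable dict."""
    if reading_time is None:
        reading_time = datetime.now(timezone.utc)
    return {
        "country_id": country_id,
        "application_id": application_id,
        "instrument_id": instrument_id,
        "parameter_id": parameter_id,
        "data_owner": data_owner,
        "reading_data": reading_data,
        "reading_time": reading_time.isoformat(),
    }


class ReadingBuffer:
    """Hold readings locally until they have been sent (hypothetical)."""

    def __init__(self, endpoint_url):
        self.endpoint_url = endpoint_url  # assumed storage endpoint
        self._pending = deque()

    def __len__(self):
        return len(self._pending)

    def add(self, reading):
        missing = [name for name in READING_FIELDS if name not in reading]
        if missing:
            raise ValueError(f"reading is missing fields: {missing}")
        self._pending.append(reading)

    def flush(self, send=None):
        """Send pending readings in order; return how many were sent.

        A reading is removed only after send() returns, so a failed
        send leaves it in the buffer for the next attempt.
        """
        send = send or self._post
        sent = 0
        while self._pending:
            send(self._pending[0])
            self._pending.popleft()
            sent += 1
        return sent

    def _post(self, reading):
        body = json.dumps(reading).encode("utf-8")
        req = request.Request(
            self.endpoint_url, data=body,
            headers={"Content-Type": "application/json"}, method="POST",
        )
        with request.urlopen(req, timeout=10) as response:
            response.read()
```

Passing a custom `send` callable to `flush` makes it possible to try the buffer in a notebook without a running storage service.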
12
LifeWatch OSF Features
The LifeWatch Open Science Framework is a tool for researchers and the general public that brings together data management features with a platform for analyzing those data.
Share and discover Data Management Plans (DMPs), datasets and software.
Preserve all records and make them citable via DOIs + local PIDs.
Different types of access rights.
13
LifeWatch OSF Features
Combine datasets and software to create analyses that can be executed on cloud resources (EGI FedCloud).
Preservation of the full data life cycle: reproducibility guaranteed.
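One way to picture an analysis that combines datasets and software is as a record listing the persistent identifiers of its inputs. The sketch below is purely hypothetical: the `AnalysisRecord` class, its fields and the fingerprint are not part of the LifeWatch OSF as described in the slides. It only shows how pinning inputs by identifier makes it possible to check that two runs used the same ingredients.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalysisRecord:
    """Hypothetical record tying an analysis to its inputs by PID/DOI."""

    dataset_pids: tuple
    software_pids: tuple
    parameters: tuple = ()  # (name, value) pairs, assumed to be JSON-friendly

    def fingerprint(self):
        """Return a SHA-256 digest that ignores the order of the inputs."""
        canonical = json.dumps(
            {
                "datasets": sorted(self.dataset_pids),
                "software": sorted(self.software_pids),
                "parameters": sorted(self.parameters),
            },
            sort_keys=True,
        )
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    record = AnalysisRecord(
        dataset_pids=("10.1234/dataset-a", "10.1234/dataset-b"),
        software_pids=("10.1234/software-x",),
        parameters=(("year", 2015),),
    )
    print(record.fingerprint())
```

Two records with the same identifiers and parameters produce the same fingerprint regardless of the order in which the inputs are listed.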
14
Alignment with the INDIGO-DataCloud Architecture
15
Conclusions
Preserving the data life cycle requires different types of tools for different steps: preserving datasets is not enough.
A DMP should not be a static item: it should be a useful and dynamic tool.
Different technologies can help researchers automate data gathering.
Towards Open Science: sharing data owned by researchers/users. Datasets are not enough.
A cloud framework can contribute to achieving this goal.
The INDIGO-DataCloud project integrates these environments with data so that they can be easily managed.
The approach targets biodiversity and environmental research but can be adapted to other fields.
16
Thanks for your attention
Fernando Aguilar (aguilarf@ifca.unican.es)
Jesús Marco (marco@ifca.unican.es)
Ana Yaiza Rodríguez (yaiza.rodriguez@aeonium.eu)
Instituto de Física de Cantabria (IFCA), Santander, Spain