Data Ingestion in EMSO Presented by Marco Pappalardo

Presentation transcript:

Data Ingestion in EMSO. Presented by Marco Pappalardo, Spacearth Technology Srl, Italy (marco.pappalardo@spacearth.net, marco.pappalardo@softwareengineering.it). INDIGO Summit on Data Ingestion, Catania, 12th May 2017. RIA-653549

Indigo Summit on Data Ingestion – Data Ingestion in EMSO. What is EMSO? The European Multidisciplinary Seafloor and water-column Observatory (EMSO) is a large-scale, distributed marine Research Infrastructure (RI) of fixed-point observatories. It serves marine science researchers, marine technology engineers, policy makers, and the public. It monitors natural hazards, climate change, and marine ecosystems. It comprises 11 nodes and 4 test sites. Catania - May 12, 2017

EMSO Nodes

Observatory what?

EMSO Generic Instrumentation Module: The EGIM is a sea-floor observatory. Data acquired by the EGIMs will be dispatched, through an EGIM Sensor Observation Service Gateway, both to the EMSO Regional Data Nodes and to the EMSODEV Data Management Platform. The EMSODEV (EMSO) Data Management Platform will collect, analyze, … and publish data.

Why EGIM? Goal: to develop and deploy EGIMs to measure a specific set of variables suitable for all sites and depths, including temperature, conductivity (salinity), pressure (depth), turbidity, dissolved oxygen, ocean currents, and passive acoustics. First deployment in December 2016 at Vilanova i la Geltrú.

EMSODEV Data Management Platform: The DMP includes a set of common services, compliant with the phases of the computational viewpoint of the ENVRI Reference Model v2.0: data acquisition; data curation (including data storage and partitioning, data quality checking and cataloguing services, import/export utilities, query services); data publishing (query preparation, preparation for import/export of curated data); data processing services (real-time and/or batch processing computing capabilities); data use (platform authentication and authorization).

DMP API: emsodev-api is a Spring Boot based RESTful web service. REST API docs are available within the deployed app through Swagger.

Data gathering from OBSEA SOS: Two raw data collectors exist. Pull Transfer Flow: data is retrieved via the API exposed by the SOS server available at the OBSEA observatory. Push Transfer Flow: data will be sent to a DMP service which “listens” to near-real-time updates on XML files describing sensor data and observations. [Diagram: the EMSODEV Data Management Platform obtains OBSEA data through the SOS server API operations GetCapabilities, GetObservation, and DescribeSensor.]
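The Pull Transfer Flow described above can be sketched as a set of request URLs for the three SOS operations named on the slide. This is a minimal sketch, not the platform's collector: the endpoint address, offering, procedure, and observed-property identifiers are hypothetical placeholders, and the parameter names follow the key-value-pair (KVP) binding of OGC SOS 2.0 as commonly deployed, so they should be checked against the server's own GetCapabilities response.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the slides do not give the OBSEA SOS address.
SOS_ENDPOINT = "https://sos.example.org/service"
SOS_VERSION = "2.0.0"


def sos_url(request, **params):
    """Build a KVP GET URL for one SOS operation."""
    query = {"service": "SOS", "request": request}
    # GetCapabilities negotiates the version; other operations state it.
    if request == "GetCapabilities":
        query["AcceptVersions"] = SOS_VERSION
    else:
        query["version"] = SOS_VERSION
    query.update(params)
    return f"{SOS_ENDPOINT}?{urlencode(query)}"


# 1. Discover offerings, procedures, and observed properties.
capabilities_url = sos_url("GetCapabilities")

# 2. Fetch the sensor description (identifier is a placeholder).
sensor_url = sos_url(
    "DescribeSensor",
    procedure="urn:example:sensor:ctd-1",
    procedureDescriptionFormat="http://www.opengis.net/sensorML/1.0.1",
)

# 3. Pull one day of observations (identifiers are placeholders).
observation_url = sos_url(
    "GetObservation",
    offering="urn:example:offering:ctd-1",
    observedProperty="sea_water_temperature",
    temporalFilter="om:phenomenonTime,2016-12-01T00:00:00Z/2016-12-02T00:00:00Z",
)

for url in (capabilities_url, sensor_url, observation_url):
    print(url)
```

A batch collector would typically issue the GetObservation request on a schedule, moving the time window forward on each run.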

Data Acquisition: Real-time data access: several standards, such as OGC Sensor Web Enablement (OGC SWE), specify interoperability interfaces and metadata encodings that enable real-time integration of heterogeneous sensor webs into the information infrastructure. SWE specifications such as Sensor Observation Service (SOS), Sensor Model Language (SensorML), and Observations & Measurements (O&M) will be supported. Metadata formats: extended Dublin Core format, ISO 19139, …


Data Curation: Sensor data will come from the SOS@EGIM either in asynchronous/batch (PULL) mode or in real-time (PUSH) mode. Both “Push” and “Pull” send (HTTP POST/PUT) formatted data to data store controllers: Distributed File Systems, NoSQL DBs, Time Series DBs, Streaming Store Controllers. Both PUSH and PULL transfer flows save metadata into the Metadata and Service Repository. OneData was evaluated as a candidate solution to enlarge this set of Data Storage solutions. Sensor data can be either retrieved via APIs exposed by an SOS server (Pull Transfer Flow) or sent to the DMP (platform) before being consolidated on the SOS server (Push Transfer Flow). Two main processes happen during each transfer flow: data scraping, which extracts parts of the marine observations coming or retrieved from the SOS server; and data munging/wrangling, which converts data from a "raw" format into another one that allows the data to be more conveniently consumed later.
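The two processes named above, scraping and munging, can be illustrated with a small sketch. It assumes a simplified observation document that uses the O&M 2.0 and GML 3.2 namespaces; the wrapper element, sample values, record layout, and controller URL are all illustrative assumptions, not the DMP's actual formats. The final step only prepares an HTTP POST to a data store controller and does not send it.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

NS = {
    "om": "http://www.opengis.net/om/2.0",
    "gml": "http://www.opengis.net/gml/3.2",
}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Illustrative raw input; real SOS responses carry more structure.
RAW_XML = """<observations xmlns:om="http://www.opengis.net/om/2.0"
    xmlns:gml="http://www.opengis.net/gml/3.2"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <om:OM_Observation>
    <om:phenomenonTime><gml:TimeInstant gml:id="t1">
      <gml:timePosition>2016-12-01T00:00:00Z</gml:timePosition>
    </gml:TimeInstant></om:phenomenonTime>
    <om:observedProperty xlink:href="sea_water_temperature"/>
    <om:result uom="degC">13.42</om:result>
  </om:OM_Observation>
  <om:OM_Observation>
    <om:phenomenonTime><gml:TimeInstant gml:id="t2">
      <gml:timePosition>2016-12-01T00:00:00Z</gml:timePosition>
    </gml:TimeInstant></om:phenomenonTime>
    <om:observedProperty xlink:href="sea_water_salinity"/>
    <om:result uom="PSU">38.10</om:result>
  </om:OM_Observation>
</observations>"""


def scrape(xml_text):
    """Scraping: pull the relevant parts out of each raw observation."""
    root = ET.fromstring(xml_text)
    for obs in root.iterfind(".//om:OM_Observation", NS):
        result = obs.find("om:result", NS)
        yield {
            "time": obs.findtext(".//gml:timePosition", namespaces=NS),
            "property": obs.find("om:observedProperty", NS).get(XLINK_HREF),
            "value": result.text,
            "uom": result.get("uom"),
        }


def munge(raw):
    """Munging: convert raw strings into a typed, flat record."""
    stamp = datetime.strptime(raw["time"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "timestamp": stamp.replace(tzinfo=timezone.utc).isoformat(),
        "parameter": raw["property"],
        "value": float(raw["value"]),
        "unit": raw["uom"],
    }


records = [munge(raw) for raw in scrape(RAW_XML)]

# Prepare (but do not send) a POST to a hypothetical store controller.
CONTROLLER_URL = "https://dmp.example.org/store/timeseries"
body = "\n".join(json.dumps(r) for r in records).encode("utf-8")
request = urllib.request.Request(
    CONTROLLER_URL,
    data=body,
    headers={"Content-Type": "application/x-ndjson"},
    method="POST",
)
print(request.get_method(), request.full_url, len(records), "records")
```

Keeping scraping and munging as separate functions lets the same munging step serve both the Pull and the Push flow, since only the source of the raw XML differs.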

Data Publishing: The platform will be equipped with DMP Tools, in addition to the API, in order to: activate the process of importing a dataset from external data sources (EMSO regional nodes); query data curated within the EMSODEV DMP; activate the process of defining (e.g. selecting a time range and a measured parameter) and generating a dataset to be exported outside the EMSODEV DMP. Medium- to long-term preservation is ensured by regional EMSO nodes. Long-term archiving will be ensured by national and international certified long-term data archives, such as those of the ICSU World Data System (PANGAEA) and the National Oceanographic Data Centers (NODC). A common approach for data preservation is still to be derived.
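The export step described above, defining a dataset by a time range and a measured parameter and then generating it, might look like the following sketch. The record layout, parameter names, sample values, and CSV output format are assumptions made for illustration; the slides do not specify how the DMP Tools represent or export curated data.

```python
import csv
import io
from datetime import datetime, timezone

FIELDS = ["timestamp", "parameter", "value", "unit"]

# Illustrative curated records; values are made up for the example.
CURATED = [
    {"timestamp": "2016-12-01T00:00:00+00:00",
     "parameter": "sea_water_temperature", "value": 13.42, "unit": "degC"},
    {"timestamp": "2016-12-01T00:00:00+00:00",
     "parameter": "sea_water_salinity", "value": 38.1, "unit": "PSU"},
    {"timestamp": "2016-12-01T01:00:00+00:00",
     "parameter": "sea_water_temperature", "value": 13.4, "unit": "degC"},
    {"timestamp": "2016-12-02T00:00:00+00:00",
     "parameter": "sea_water_temperature", "value": 13.35, "unit": "degC"},
]


def export_dataset(records, parameter, start, end):
    """Return CSV text for one parameter within the range [start, end)."""
    selected = [
        r for r in records
        if r["parameter"] == parameter
        and start <= datetime.fromisoformat(r["timestamp"]) < end
    ]
    selected.sort(key=lambda r: r["timestamp"])
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(selected)
    return buffer.getvalue()


csv_text = export_dataset(
    CURATED,
    parameter="sea_water_temperature",
    start=datetime(2016, 12, 1, tzinfo=timezone.utc),
    end=datetime(2016, 12, 2, tzinfo=timezone.utc),
)
print(csv_text)
```

The half-open time range means consecutive exports (one per day, for instance) never include the same observation twice.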

Data Use and Reuse: Complex interactions are mediated by virtual laboratories, which provide a persistent context for interactions between groups of users and components within the DMP. Experimental laboratory: a utility/tool allowing scientists/users to deploy datasets for processing and acquiring results. All laboratories must interact with a security service (AAI). Data produced will be available for usage beyond the original purpose. Adopted sensors are often multi-purpose and designed for multiple users and applications. Selection of certified repositories for long-term preservation/curation is in progress. Data is to be stored together with the minimum software, metadata and documentation. EMSO promotes standardization and integration of Regional EMSO Nodes data to improve overall accessibility and reusability of local node data via the EMSO Data Portal.

Demo

Acknowledgement: Daniele Baratta (Swing:It, Software Engineering Italia Srl); Michał Orzechowski (CYFRONET); Daniele Cosenza (Spacearth Technology Srl); Riccardo Delpopolo Carciopolo (Spacearth Technology Srl).

Thank you for watching

INDIGO and EUDAT Solutions. Currently: OneData, IAM, B2DROP, B2SHARE, B2FIND. In the future: EUDAT services to use; DMPonline; Future Gateways; Automated Integrity Tests.