ICOS on-demand atmospheric transport computation
A use case for interoperability of EGI and EUDAT services

Ute Karstens, André Bjärby, Oleg Mirzov, Roger Groth, Mitch Selander, Maggie Hellström, Alex Vermeulen
ICOS Carbon Portal @ Lund University

Diego Scardaci, Matthew Viljoen
EGI Foundation

Peter Gille, Michaela Barth
EUDAT and PDC Center for High Performance Computing, KTH Royal Institute of Technology, Stockholm

Integrated Carbon Observation System
“A pan-European research infrastructure for quantifying and understanding the greenhouse gas balance of the European continent”
- Collect high-quality observational data relevant to the greenhouse gas budget of Europe
- Make the ICOS data freely available to all interested parties
- Promote the use of the ICOS data for further scientific study
- Support modelling activities of the greenhouse gas fluxes in time and space
- Support verification of the effectiveness of policies aiming to reduce greenhouse gas emissions

Footprint tool for atmospheric sites
- Web-based service at ICOS Carbon Portal
- On-demand computation and visualization of footprints and GHG concentrations at atmospheric measurement stations
- Based on the Lagrangian atmospheric transport model STILT
- Use case for testing interoperability between EGI and EUDAT services in WP7/Task 7.2 of EUDAT2020
Application examples:
- Analysis of the sensitivity of GHG concentration signals at potential and existing ICOS atmospheric measurement stations to GHG emissions and fluxes
- Evaluation of measurement strategies
- Network design studies

STILT atmospheric transport model calculations
[Diagram: data flow between EUDAT B2SAFE, the EGI Federated Cloud and the ICOS Carbon Portal]
- Input at EUDAT B2SAFE: atmospheric observations, prior fluxes, emissions, meteorological driver fields (≈ 1 GB, ≈ 0.5-1 TB, ≈ 2-3 TB)
- Computation in the EGI Federated Cloud: STILT Lagrangian transport model, ≈ 670 CPU s per footprint => 1700 CPUh per station per year (one version of the diagram gives ≈ 300 CPU s per footprint => 750 CPUh/station/year)
- Output at EUDAT B2SAFE: station footprints and GHG concentrations, ≈ 1-2 TB per year

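The compute estimate on this slide can be checked with a few lines of arithmetic. The sketch below assumes that the per-footprint cost is given in CPU seconds and that one footprint is computed for every hour of the year. Neither assumption is stated on the slide, and the function name is illustrative only.

```python
HOURS_PER_YEAR = 365 * 24  # 8760; assumes one footprint per hour (not stated on the slide)


def cpu_hours_per_station_year(cpu_seconds_per_footprint,
                               footprints_per_year=HOURS_PER_YEAR):
    """Return the CPU hours needed to cover one station for one year."""
    return cpu_seconds_per_footprint * footprints_per_year / 3600.0


# The two per-footprint costs that appear on the slide.
for cpu_s in (300, 670):
    hours = cpu_hours_per_station_year(cpu_s)
    print(f"{cpu_s} CPU s per footprint -> {hours:.0f} CPUh per station per year")
```

Under these assumptions the results (about 730 and 1630 CPUh) are close to the 750 and 1700 CPUh quoted on the slide.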
Footprint tool workflow
[Diagram, built up over three slides. Components: users with ICOS CP accounts and AAI; a Web service VM with the Controller; Worker VMs running the Model; an NFS VM; datahub.egi.eu (OneData); PDC/KTH with ICOS Data. Data labels: Model Input, Meteo, Model Output (Particle Location, Footprints, GHG conc.). Later builds add a second Worker VM and further users (User 3, User 4, …), labelled "on demand".]

Components of the workflow
Generic simulation
- Linux VMs in EGI Federated Cloud
- Scala for backend development
- Akka Cluster for orchestration
- Docker-based version of model (or data processing tool)
- Long-term storage to archive model results
- Intermediate storage close to the computation
STILT-specific components
- Split model runs into small jobs to allow distribution over many cores/VMs to efficiently serve multiple users
- Handling of large numbers of small files (model output re-used as input)
Prototype: 4 VMs of medium size (8 CPUs, 16-32 GB RAM) + 4 TB storage
General framework will be applied to other types of model simulations and data processing tasks

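The idea of splitting a model run into small jobs and spreading them over several workers can be sketched as follows. This is only an illustration in Python, not the Scala/Akka Cluster backend named on the slide; the function names, the job size of 24 time slots, the round-robin assignment, and the station and worker labels are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def split_into_jobs(station, start, end, step_hours=1, slots_per_job=24):
    """Split the period [start, end) into jobs of at most slots_per_job time slots."""
    step = timedelta(hours=step_hours)
    jobs, slots = [], []
    t = start
    while t < end:
        slots.append(t)
        if len(slots) == slots_per_job:
            jobs.append({"station": station, "slots": slots})
            slots = []
        t += step
    if slots:  # last, possibly shorter, job
        jobs.append({"station": station, "slots": slots})
    return jobs


def assign_round_robin(jobs, workers):
    """Hand out jobs to workers in turn; returns {worker: [jobs]}."""
    plan = {w: [] for w in workers}
    for i, job in enumerate(jobs):
        plan[workers[i % len(workers)]].append(job)
    return plan


# One week of hourly footprints for a hypothetical station, three hypothetical workers.
jobs = split_into_jobs("STATION_A", datetime(2012, 1, 1), datetime(2012, 1, 8))
plan = assign_round_robin(jobs, ["worker-1", "worker-2", "worker-3"])
for worker, assigned in plan.items():
    print(worker, len(assigned), "jobs")
```

Small jobs keep any single request from occupying all workers for long, which is one way several users could be served at the same time.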
Storage and data services
Long-term storage for ICOS measurement data and elaborated products (e.g. STILT results)
- at B2SAFE instance at PDC/KTH (replicating to another B2SAFE node)
- using B2STAGE for data transfer (currently using iCommands/GridFTP)
- waiting for B2STAGE HTTP API for operational implementation at ICOS CP
- waiting for support of metadata handling for B2SAFE (GraphDB)
Intermediate storage for STILT model input and output
- requires handling of large numbers (≈ 1 million) of small files (1-5 MB)
- output might be re-used as input in further model runs
- Network File System on dedicated VM to serve input/output data
EGI DataHub for storage of data from multiple providers, e.g. meteorological 4D arrays (in future)
- OneData software solution already tested

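One common way to cope with very many small output files, and to re-use results that already exist, is a predictable directory layout plus an existence check before scheduling new work. The sketch below illustrates that idea only; the directory scheme, file naming, `.out` extension and function names are hypothetical and are not taken from the ICOS implementation.

```python
import tempfile
from pathlib import Path


def footprint_path(root, station, year, month, day, hour):
    """Hypothetical layout: root/station/YYYY/MM/station_YYYYMMDDHH.out"""
    name = f"{station}_{year:04d}{month:02d}{day:02d}{hour:02d}.out"
    return Path(root) / station / f"{year:04d}" / f"{month:02d}" / name


def missing_slots(root, station, slots):
    """Return the time slots whose output file does not exist yet."""
    return [s for s in slots if not footprint_path(root, station, *s).exists()]


# Demonstration in a temporary directory standing in for the shared file system.
with tempfile.TemporaryDirectory() as root:
    slots = [(2012, 1, 1, h) for h in range(0, 24, 3)]  # eight 3-hourly slots
    for s in slots[:2]:  # pretend the first two slots were computed earlier
        p = footprint_path(root, "STATION_A", *s)
        p.parent.mkdir(parents=True, exist_ok=True)
        p.write_bytes(b"")
    todo = missing_slots(root, "STATION_A", slots)
    print(len(todo), "of", len(slots), "slots still to compute")
```

With one directory per station and month, an hourly product puts at most 31 × 24 = 744 files in any single directory.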
Thank you!
More information about ICOS: www.icos-ri.eu
and the Carbon Portal: www.icos-cp.eu
Contact: ute.karstens@nateko.lu.se