eScience challenges in levees monitoring - lessons from "flood" projects. Marian Bubak, Department of Computer Science, AGH University of Science and Technology.

Presentation transcript:

eScience challenges in levees monitoring - lessons from "flood" projects Marian Bubak Department of Computer Science AGH University of Science and Technology Kraków, Poland eScience 2015, Munich, August 31 – September 4, 2015

Thanks to
Bartosz Balis, Daniel Harezlak, Maciej Malawski, Piotr Nowakowski, Bartosz Wilk, Tomasz Gubala, Marek Kasztelnik, Jan Meizner, Maciej Pawlik, ...
and colleagues from CrossGrid, K-WfGrid, UrbanFlood, ISMOP

Outline
- Motivation
- Interactive system (person in the loop)
- Exploitation of knowledge
- Building early warning systems
- IT support for levees monitoring
- Summary

Motivation: our area of research
- Investigation of methods for complex scientific collaborative applications
- Elaboration of environments and tools for eScience
- Integration of large-scale distributed computing infrastructures
- Knowledge-based approach to services, components, and their composition

Motivation: Krakow, May 2010

Flood - CrossGrid ( )
CrossGrid: Development of Grid Environment for Interactive Applications
ftp://ftp.cordis.europa.eu/pub/ist/docs/grids/crossgrid_achievement.pdf
L. Hluchy, V. D. Tran, O. Habala, B. Simo, E. Gatial, J. Astalos, M. Dobrucky: Flood Forecasting in CrossGrid Project, in Marios D. Dikaiakos (Ed.): Grid Computing, Second European AcrossGrids Conference, AxGrids 2004, Nicosia, Cyprus, January 28-30, Revised Papers, LNCS 3165, 2004
This paper presents a prototype of flood forecasting system based on Grid technologies. The system consists of workflow system for executing simulation cascade of meteorological, hydrological and hydraulic models, data management system for storing and accessing different computed and measured data, and web portals as user interfaces. The whole system is tied together by Grid technology and is used to support a virtual organization of experts, developers and users.

Flood - K-WfGrid (2004-2007)
K-WfGrid: Knowledge-based workflow system for Grid applications
ftp://ftp.cordis.europa.eu/pub/ist/docs/grids/k-wf-grid-interim-sheet_en.pdf
Ladislav Hluchý, Ondrej Habala, Martin Maliska, Branislav Simo, Viet D. Tran, Ján Astalos, Marian Babik: Grid Based Flood Prediction Virtual Organization. e-Science 2006, 4-6 December 2006, Amsterdam
This paper describes evolution of one such system -- a flood prediction application. The application consists of a set of simulation models, visualization tools, and various support components. During past six years it has evolved from a simple hydraulic modeling scenario into a sophisticated cascade of simulations, using state-of-the-art grid, workflow and knowledge management technologies, and is one of the first applications of the SOKU [1] concept in the field of computer simulations.

From IJkdijk to UrbanFlood (2008)
The IJkdijk consortium turns to FP7 to organize research on the development of:
- GeoSensing technology
- Sensor network telecommunication systems
- Sensor data processing facilities
- Smartness in sensors (sensor plug and play, data acquisition)
Robert Meijer, TNO ICT Groningen and University of Amsterdam

Smart levees

Monitoring and decisions
Stand-by mode
- Monitoring data collection (low frequency)
- Initial on-line analysis (trends, deviations in sensor readings)
- Presentation of external info: weather prediction, flood wave prediction, etc.
Threat assessment mode
- Increased frequency of sensor data collection
- Resource-intensive threat level evaluation
Alert mode
- Prediction of levee behavior
- Notification of authorities
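The three operating modes can be read as a small state machine. The sketch below is only an illustration of that reading: the normalized threat score, both thresholds, and the sampling intervals are hypothetical values chosen for the example, not figures from the slides.

```python
from enum import Enum


class Mode(Enum):
    STAND_BY = "stand-by"
    THREAT_ASSESSMENT = "threat assessment"
    ALERT = "alert"


# Hypothetical sampling intervals in seconds. The slides only state that
# collection is low-frequency in stand-by mode and more frequent afterwards.
SAMPLING_INTERVAL_S = {
    Mode.STAND_BY: 900,
    Mode.THREAT_ASSESSMENT: 60,
    Mode.ALERT: 60,
}


def select_mode(threat_score: float,
                assess_threshold: float = 0.3,
                alert_threshold: float = 0.7) -> Mode:
    """Map a normalized threat score in [0, 1] to an operating mode.

    Both thresholds are assumptions made for this example.
    """
    if not 0.0 <= threat_score <= 1.0:
        raise ValueError("threat_score must be within [0, 1]")
    if threat_score >= alert_threshold:
        return Mode.ALERT
    if threat_score >= assess_threshold:
        return Mode.THREAT_ASSESSMENT
    return Mode.STAND_BY
```

In such a design the monitoring loop would call `select_mode` after each on-line analysis step and look up the new collection interval in `SAMPLING_INTERVAL_S`.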

UrbanFlood - Early Warning System
[Figure: sensors (S) connected to a Control Centre, which serves Authorities, Science and Public]

Common Information Space
A platform facilitating development, deployment and execution of EWSs
EWS development
- EWS reference model
- EWS development framework
EWS deployment
- EWS blueprints
- EWS-factory-as-a-service
EWS execution
- CIS runtime services for resource allocation, self-monitoring, self-healing, mission-critical operation, and urgent computing

Flood simulation with CIS

Common Information Space
Domain resources exposed as Basic Services
- Data, sensors, apps wrapped as appliances and deployed onto clouds, …
Composite Services (Parts)
- Building blocks for EWSs
- Orchestrate domain resources towards complex application scenarios (e.g. area flood simulation)
Early Warning System
- A number of Parts deployed, connected, and configured for a specific setting (e.g. a dike section)

Flood EWS with CIS

INSPIRE-compliant Flood Simulation Service

CIS as a system factory
- On-demand resource provisioning (local resources, clouds)
- Horizontal scaling of infrastructure (more instances)
- Load balancing with lazy evaluation
- On-line availability monitoring
- Notifications about problems
- Automatic restart of failed components

ISMOP: towards a levee monitoring system
Investigations on monitoring and assessment of levees:
- Construction of an artificial levee
- New sensors for levee instrumentation
- Design and development of a sensor communication infrastructure
  - Optimal collection and transmission of sensor data
- Levee modeling and simulation
  - Comparison of simulated and real levee behavior
- Central System: software platforms for execution management, data management, visualization and decision support

ISMOP: Consortium
- Department of Computer Science AGH
- Department of Hydrogeology and Engineering Geology AGH
- Department of Geoinformatics and Applied Computer Science AGH
- NeoSentio, Kraków
- Sweco Hydroprojekt Kraków
in collaboration with the Czernichów Community

ISMOP central system use cases
Support for experiments on the artificial levee
- Controlled flooding of the artificial levee and on-line data collection
- Validation of models of levees
Elaboration of a decision support system
- Continuous monitoring of levees
- Automated data-driven and model-driven analyses
- Prediction of breaches

Experimental levee (1/4)

Experimental levee (2/4)

Experimental levee (3/4)

Experimental levee (4/4)

Assessment of levee breach threat via scenario matching

ISMOP Decision Support System

Challenges: visualization and decision support
Challenge: Interoperability with external systems (e.g. ISOK, regional flood protection agencies)
- Solution: Leveraging open standards (OGC, INSPIRE) for data & metadata models
Challenge: Visualization of relevant information to effectively support the decision making process
- Solution: research in progress…
Challenge: Adaptability to other domains (e.g. monitoring of communication infrastructure)
- Solution: Open domain-agnostic design (metadata and public APIs design are crucial)

Visualization and decision support

Challenges: execution management
Challenge: Scale up to 100s-1000s of kilometers of levees
- Solution: Monitored area divided into sections
- Sections managed by multiple instances of a Monitoring Application, dynamically deployed on demand
Challenge: Highly variable resource demands, from very low in stand-by mode to high in threat assessment mode
- Solution: Dynamic provisioning of resources from private or public clouds
- Autoscaling algorithms and policies
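To make the sectioning and autoscaling idea concrete, here is a minimal sketch. The section length, the `sections_per_vm` ratio, and the VM limits are hypothetical parameters; the slides do not specify an autoscaling policy, so this threshold-style rule is only one possible choice.

```python
import math


def split_into_sections(levee_km: float, section_km: float = 1.0) -> int:
    """Number of monitored sections for a levee of the given length."""
    if levee_km <= 0 or section_km <= 0:
        raise ValueError("lengths must be positive")
    return math.ceil(levee_km / section_km)


def target_vm_count(sections_in_assessment: int,
                    sections_per_vm: int = 8,
                    min_vms: int = 1,
                    max_vms: int = 64) -> int:
    """Simple autoscaling rule (an assumption, not the ISMOP policy).

    Keeps at least min_vms running for stand-by operation and grows
    linearly with the number of sections under threat assessment,
    capped at max_vms.
    """
    if sections_in_assessment < 0:
        raise ValueError("sections_in_assessment must be non-negative")
    needed = math.ceil(sections_in_assessment / sections_per_vm)
    return max(min_vms, min(max_vms, needed))
```

With these defaults, a system in stand-by mode keeps a single VM, while 100 sections entering threat assessment would request 13 VMs.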

Execution and Provisioning Platform (EXP)

Challenges: data management
Challenge: Diverse data sets (spatial, time series, binary, metadata) and data usage patterns
- Solution: Multiple data stores and models to address diverse needs
Challenge: Data-intensive processing
- Threat level evaluation scenario: up to 130 GB of data to search per 1 km of a levee
- Solution: Big data infrastructure
- Map-Reduce data search
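A Map-Reduce style search over pre-simulated scenarios could look roughly like the following sketch. The data layout (partitions of scenario id to time series), the mean squared difference metric, and all function names are assumptions for illustration; the slides name the technique but not its details.

```python
from functools import reduce
from typing import Dict, List, Tuple

Series = List[float]
Match = Tuple[float, str]  # (distance, scenario id)


def distance(a: Series, b: Series) -> float:
    """Mean squared difference between two equally long series."""
    if len(a) != len(b) or not a:
        raise ValueError("series must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def map_partition(measured: Series, partition: Dict[str, Series]) -> Match:
    """Map step: best match within one partition of scenarios."""
    return min((distance(measured, s), sid) for sid, s in partition.items())


def reduce_matches(left: Match, right: Match) -> Match:
    """Reduce step: keep the closer of two candidate matches."""
    return min(left, right)


def find_best_scenario(measured: Series,
                       partitions: List[Dict[str, Series]]) -> Match:
    """Run the map step on every non-empty partition, then reduce."""
    local = [map_partition(measured, p) for p in partitions if p]
    if not local:
        raise ValueError("no scenarios to search")
    return reduce(reduce_matches, local)
```

Here the map step runs sequentially in one process; in a real big data infrastructure each partition would be scanned on a separate worker and only the small `(distance, scenario id)` pairs would travel to the reducer.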

Data Access & Analytics Platform (DAP)

Urgent computing scenario
Goal: Assess flood risk for a large set of levees by a specified deadline
Solution: dynamic provisioning of cloud resources
The user specifies:
- Target area for flood threat assessment
- Time window size for current measurements
- Deadline to get results
The system:
- Generates a workflow representing all required computations and data dependencies
- Plans the workflow execution so as to meet the deadline
- Runs the workflow
- Monitors its execution and reconfigures resource allocation if needed
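The workflow generation step can be sketched as building a task graph with explicit data dependencies. The task names (`fetch`, `assess`, `aggregate`) and the per-section structure are hypothetical; the sketch uses Python's standard `graphlib` module (Python 3.9 or later) to derive a valid execution order.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def generate_workflow(sections: List[str]) -> Dict[str, Set[str]]:
    """Return a task graph: task name -> set of tasks it depends on.

    Hypothetical structure: for each levee section, sensor data is
    fetched first, then assessed; a final task aggregates all results.
    """
    graph: Dict[str, Set[str]] = {}
    assess_tasks: Set[str] = set()
    for section in sections:
        fetch = f"fetch:{section}"
        assess = f"assess:{section}"
        graph[fetch] = set()
        graph[assess] = {fetch}
        assess_tasks.add(assess)
    graph["aggregate"] = assess_tasks
    return graph


def execution_order(graph: Dict[str, Set[str]]) -> List[str]:
    """One valid sequential order that respects all dependencies."""
    return list(TopologicalSorter(graph).static_order())
```

The per-section `assess` tasks are independent of each other, which is what allows them to be treated as a bag of tasks and spread over as many VMs as the deadline requires.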

Levee breach threat assessment

Implementation of urgent computing

Resource provisioning model
Bag-of-tasks model
- Selection of dominating tasks
- Uniform task runtimes
Performance model: T = f(v, d, s, …)
- T: total computing time
- v: number of VMs
- d: time window in days
- s: number of tasks (sections)
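As an illustration of how a performance model of the form T = f(v, d, s, …) can drive provisioning, the sketch below assumes one specific and hypothetical form: tasks run in waves of v at a time, and each task takes a + b·d time units. This form and the parameters `a` and `b` are assumptions for the example, not the fitted model from the slides.

```python
import math
from typing import Optional


def total_time(v: int, d: float, s: int, a: float, b: float) -> float:
    """Hypothetical bag-of-tasks model with uniform task runtimes.

    v: number of VMs (one task per VM at a time)
    d: time window in days
    s: number of tasks (sections)
    a, b: assumed per-task runtime parameters, runtime = a + b * d
    """
    if v < 1 or s < 0 or d <= 0:
        raise ValueError("require v >= 1, s >= 0 and d > 0")
    task_runtime = a + b * d
    waves = math.ceil(s / v)  # rounds of parallel execution
    return waves * task_runtime


def min_vms_for_deadline(deadline: float, d: float, s: int,
                         a: float, b: float,
                         max_vms: int) -> Optional[int]:
    """Smallest VM count that meets the deadline, or None if impossible."""
    for v in range(1, max_vms + 1):
        if total_time(v, d, s, a, b) <= deadline:
            return v
    return None
```

A planner built this way would re-evaluate `min_vms_for_deadline` whenever monitoring shows the workflow falling behind, which corresponds to the reconfiguration step in the urgent computing scenario.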

Resource provisioning model

Simulations
Setup: private cloud infrastructure
- a node with 8 cores (Xeon E5-2650)
- virtual machines (1 vCPU, 512 MB RAM)
- data for simulated scenarios (244 MB total) on local disks
Simulations:
- sections
- 1-16 VMs
- 1-7 days time window
Warmup tasks:

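Resource provisioning - results
- Warmup tasks clearly separated as outliers
- Linear functions
- Parameters a, b, c determined using non-linear fit
- The model fits well to the data
[Plot residue: warmup tasks marked; fitted parameters a = 6.53, b = 9.41, value of c not recoverable; curves labelled by number of sections, up to 128 sections]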

Clouds for urgent computing (1/2)
Elasticity
- On-demand provisioning of VMs
- Job prioritization and preemption
Reliability
- Public cloud services are specifically designed to support systems with high availability demands
- Amazon: only five major outages in the years (only one for more than 6 h)
Safety
- A serious natural disaster may damage a local computing infrastructure
- Public clouds as an emergency computing infrastructure
- Data safety: public clouds as a reliable storage infrastructure for important but not sensitive data (example: pre-simulated scenarios data sets)

Clouds for urgent computing (2/2)
Cost-effectiveness
- Decision support systems for natural disasters generate 'spiky' workloads: perfect cloud use case
- Cheaper than maintenance of a dedicated infrastructure
- Day-to-day operation can be handled by a relatively small, low-cost on-premises infrastructure
Performance?
- Bag-of-tasks applications such as scenario identification perfectly fit the cloud
- What about CPU- and communication-intensive tightly-coupled simulations? HPC-in-the-Cloud is an emerging trend.

Summary (1/2)
Environmental models result in complex applications
- collaborative
- multi-scale, multi-domain
- time-critical
- with data- and resource-intensive scenarios
We have contributed to
- methods and tools for environmental computing
- advanced problem solving environments, virtual laboratories
- compositions of resources into complex scenarios
- accommodation to "spiky" behavior (variable workload)

Summary (2/2)
We have addressed
- Complex distributed systems
- Coordination of execution (workflow)
- Monitoring and management of services
- Allocation of resources to services
- Fault tolerance
- Provenance tracking
Sustainability issue - supporting technologies

More at

Acknowledgements This research was supported by the National Centre for Research and Development (NCBiR) under Grant No. PBS1/B9/18/2013.