eScience challenges in levees monitoring - lessons from "flood" projects Marian Bubak Department of Computer Science AGH University of Science and Technology Kraków, Poland eScience 2015, Munich, August 31 – September 4, 2015
Thanks to Bartosz Balis, Daniel Harezlak, Maciej Malawski, Piotr Nowakowski, Bartosz Wilk, Tomasz Gubala, Marek Kasztelnik, Jan Meizner, Maciej Pawlik... and colleagues from CrossGrid, K-WfGrid, UrbanFlood, ISMOP
Outline Motivation Interactive system (person in the loop) Exploitation of knowledge Building early warning systems IT support for levees monitoring Summary
Motivation: our area of research Investigation of methods for complex scientific collaborative applications Elaboration of environments and tools for eScience Integration of large-scale distributed computing infrastructures Knowledge-based approach to services, components, and their composition
Motivation: Krakow, May 2010
Flood - CrossGrid ( ) CrossGrid: Development of Grid Environment for Interactive Applications ftp://ftp.cordis.europa.eu/pub/ist/docs/grids/crossgrid_achievement.pdf L. Hluchy, V. D. Tran, O. Habala, B. Simo, E. Gatial, J. Astalos, M. Dobrucky: Flood Forecasting in CrossGrid Project, in Marios D. Dikaiakos (Eds): Grid Computing Second European AcrossGrids Conference, AxGrids 2004, Nicosia, Cyprus, January 28-30, Revised Papers, LNCS 3165, , 2004 This paper presents a prototype of flood forecasting system based on Grid technologies. The system consists of workflow system for executing simulation cascade of meteorological, hydrological and hydraulic models, data management system for storing and accessing different computed and measured data, and web portals as user interfaces. The whole system is tied together by Grid technology and is used to support a virtual organization of experts, developers and users.
Flood - K-WfGrid (2004-7) K-WfGrid: Knowledge-based workflow system for Grid applications ftp://ftp.cordis.europa.eu/pub/ist/docs/grids/k-wf-grid-interim-sheet_en.pdf Ladislav Hluchý, Ondrej Habala, Martin Maliska, Branislav Simo, Viet D. Tran, Ján Astalos, Marian Babik: Grid Based Flood Prediction Virtual Organization. e-Science 2006, 4-6 December 2006, Amsterdam This paper describes evolution of one such system -- a flood prediction application. The application consists of a set of simulation models, visualization tools, and various support components. During past six years it has evolved from a simple hydraulic modeling scenario into a sophisticated cascade of simulations, using state-of-the-art grid, workflow and knowledge management technologies, and is one of the first applications of the SOKU [1] concept in the field of computer simulations.
From IJkdijk to UrbanFlood (2008) The IJkdijk consortium turns to 7FP to organize research on the development of GeoSensing technology Sensor network telecommunication systems Sensor data processing facilities Smartness in sensors (sensor plug and play, data acquisition). Robert Meijer, TNO ICT Groningen and University of Amsterdam
Smart levees
Monitoring and decisions Stand-by mode – Monitoring data collection (low frequency) – Initial on-line analysis (trends, deviations in sensor readings) – Presentation of external info: weather prediction, flood wave prediction, etc. Threat assessment mode – Increased frequency of sensor data collection – Resource-intensive threat level evaluation Alert mode – Prediction of levee behavior – Notification of authorities
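The three operating modes can be read as a small state machine. The following is a minimal sketch under stated assumptions, not the UrbanFlood implementation: the `Mode` enum, the `next_mode` function, the normalized `threat_level` score, and the `ALERT_THRESHOLD` value are all hypothetical names and numbers chosen for illustration.

```python
from enum import Enum


class Mode(Enum):
    STANDBY = "stand-by"
    THREAT_ASSESSMENT = "threat assessment"
    ALERT = "alert"


# Hypothetical threshold on a normalized threat level in [0, 1].
ALERT_THRESHOLD = 0.7


def next_mode(current, deviation_detected, threat_level=0.0):
    """Pick the next operating mode for one levee section.

    deviation_detected: result of the cheap on-line analysis (trends, deviations).
    threat_level: result of the resource-intensive evaluation; it is only
    consulted once the section has left stand-by mode.
    """
    if current is Mode.STANDBY:
        # Stand-by never jumps straight to alert: the expensive evaluation runs first.
        return Mode.THREAT_ASSESSMENT if deviation_detected else Mode.STANDBY
    if threat_level >= ALERT_THRESHOLD:
        return Mode.ALERT
    return Mode.THREAT_ASSESSMENT if deviation_detected else Mode.STANDBY
```

In this sketch the cheap analysis gates the expensive one, which mirrors the ordering of the modes on the slide; the actual transition rules are not given in the source.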
UrbanFlood - Early Warning System [Figure: sensors (S), Control Centre, Authorities, Science, Public]
Common Information Space A platform facilitating development, deployment and execution of EWSs EWS development – EWS reference model – EWS development framework EWS deployment – EWS blueprints – EWS-factory-as-a-service EWS execution – CIS runtime services for resource allocation, self-monitoring, self-healing, mission-critical operation, and urgent computing
Flood simulation with CIS
Common Information Space Domain resources exposed as Basic Services – Data, sensors, apps wrapped as appliances and deployed onto clouds, … Composite Services (Parts) – Building blocks for EWSs – Orchestrate domain resources towards complex application scenarios (e.g. area flood simulation) Early Warning System – A number of Parts deployed, connected, and configured for a specific setting (e.g. a dike section)
Flood EWS with CIS
INSPIRE-compliant Flood Simulation Service
CIS as a system factory On-demand resource provisioning (local resources, clouds) Horizontal scaling of infrastructure (more instances) Load balancing with lazy evaluation On-line availability monitoring Notifications about problems Automatic restart of failed components
ISMOP: towards a levee monitoring system Investigations on monitoring and assessment of levees: Construction of an artificial levee New sensors for levee instrumentation Design and development of a sensor communication infrastructure – Optimal collection and transmission of sensor data Levee modeling and simulation – Comparison of simulated and real levee behavior Central System: software platforms for execution management, data management, visualization and decision support
ISMOP: Consortium Department of Computer Science AGH Department of Hydrogeology and Engineering Geology AGH Department of Geoinformatics and Applied Computer Science AGH NeoSentio, Kraków Sweco Hydroprojekt Kraków in collaboration with the Czernichów Community
ISMOP central system use cases Support for experiments on the artificial levee – Controlled flooding of the artificial levee and on-line data collection – Validation of models of levees Elaboration of a decision support system – Continuous monitoring of levees – Automated data-driven and model-driven analyses – Prediction of breaches
Experimental levee (1/4)
Experimental levee (2/4)
Experimental levee (3/4)
Experimental levee (4/4)
Assessment of levee breach threat via scenario matching
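One way to picture scenario matching is as a nearest-neighbour search of measured sensor readings against a library of pre-simulated scenarios. The sketch below is an illustration under assumptions, not the ISMOP algorithm: the RMSE distance, the function names, and the toy scenario data are all hypothetical.

```python
import math


def rmse(a, b):
    """Root-mean-square error between two equally long, non-empty series."""
    if not a or len(a) != len(b):
        raise ValueError("series must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def match_scenario(measured, scenarios):
    """Return (name, distance) of the pre-simulated scenario closest to the data."""
    return min(
        ((name, rmse(measured, series)) for name, series in scenarios.items()),
        key=lambda pair: pair[1],
    )


# Toy data (made up): pore pressure readings over four time steps.
scenarios = {
    "stable": [1.0, 1.0, 1.1, 1.1],
    "seepage": [1.0, 1.4, 1.9, 2.5],
}
measured = [1.0, 1.3, 2.0, 2.4]
best, dist = match_scenario(measured, scenarios)
```

With the toy data above, `best` is `"seepage"`. Each scenario can be scored independently of the others, so the search over a scenario library can be split into independent tasks.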
ISMOP Decision Support System
Challenges: visualization and decision support Interoperability with external systems (e.g. ISOK, regional flood protection agencies) – Solution: leveraging open standards (OGC, INSPIRE) for data & metadata models Visualization of relevant information to effectively support the decision making process – Solution: research in progress… Adaptability to other domains (e.g. monitoring of communication infrastructure) – Solution: open domain-agnostic design (metadata and public APIs design are crucial)
Visualization and decision support
Challenges: execution management Scale up to 100s-1000s of kilometers of levees – Solution: monitored area divided into sections, managed by multiple instances of a Monitoring Application, dynamically deployed on demand Highly variable resource demands: from very low in standby mode to high in threat assessment mode – Solution: dynamic provisioning of resources from private or public clouds; autoscaling algorithms and policies
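To make the link between per-section modes and resource demand concrete, here is a minimal sketch of one possible autoscaling policy. It is an assumption for illustration, not the policy used in ISMOP: the `SECTIONS_PER_VM` capacities, the mode names, and the `desired_vms` function are hypothetical.

```python
from collections import Counter

# Hypothetical capacities: how many sections one VM can serve in each mode.
SECTIONS_PER_VM = {"standby": 20, "threat_assessment": 1, "alert": 1}


def desired_vms(section_modes, min_vms=1, max_vms=64):
    """Return the VM count needed for the current mix of section modes.

    section_modes: iterable with one mode name per monitored levee section.
    The result is clamped to the range [min_vms, max_vms].
    """
    counts = Counter(section_modes)
    needed = 0
    for mode, n in counts.items():
        capacity = SECTIONS_PER_VM[mode]
        needed += -(-n // capacity)  # integer ceiling division
    return max(min_vms, min(max_vms, needed))
```

For example, 100 sections in standby need 5 VMs under these made-up capacities, while moving 10 of them to threat assessment raises the demand to 15.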
Execution and Provisioning Platform (EXP)
Challenges: data management Diverse data sets (spatial, time series, binary, metadata) and data usage patterns – Solution: multiple data stores and models to address diverse needs Data-intensive processing (threat level evaluation scenario: up to 130 GB of data to search per 1 km of a levee) – Solution: big data infrastructure, Map-Reduce data search
Data Access & Analytics Platform (DAP)
Urgent computing scenario Goal: Assess flood risk for a large set of levees by a specified deadline Solution: dynamic provisioning of cloud resources The user specifies: – Target area for flood threat assessment – Time window size for current measurements – Deadline to get results The system: – Generates a workflow representing all required computations and data dependencies – Plans the workflow execution so as to meet the deadline – Runs the workflow – Monitors its execution and reconfigures resource allocation if needed
Levee breach threat assessment
Implementation of urgent computing
Resource provisioning model Bag-of-tasks model – Selection of dominating tasks – Uniform task runtimes Performance model: T = f (v, d, s, …) – T – total computing time – v – number of VMs – d – time window in days – s – number of tasks (sections)
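The slide leaves the form of f unspecified. As a rough sketch only, one simple instance for a bag of uniform tasks runs the s tasks in waves of v and assumes each task's runtime is linear in the time window d. The coefficients `a` and `b` below are hypothetical placeholders, not the fitted values from the experiments, and warmup tasks are ignored.

```python
def total_time(v, d, s, a=2.0, b=1.0):
    """Estimate total computing time T for s uniform tasks on v VMs.

    Assumes one task per VM at a time, so the bag runs in ceil(s / v) waves,
    and a per-task runtime of a * d + b (hypothetical linear model, in seconds).
    """
    if v < 1 or s < 0 or d <= 0:
        raise ValueError("need v >= 1, s >= 0 and d > 0")
    waves = -(-s // v)  # integer ceiling division
    return waves * (a * d + b)


def min_vms_for_deadline(deadline, d, s, max_vms=1024, a=2.0, b=1.0):
    """Return the smallest v with total_time(v, d, s) <= deadline, or None."""
    for v in range(1, max_vms + 1):
        if total_time(v, d, s, a, b) <= deadline:
            return v
    return None
```

Inverting the model this way is what lets a planner turn a user-supplied deadline into a VM count. If a single task already exceeds the deadline, no number of VMs helps and the function returns `None`.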
Resource provisioning model
Simulations Setup: private cloud infrastructure – a node with 8 cores (Xeon E5-2650) – virtual machines (1VCPU, 512MB RAM) – data for simulated scenarios (244MB total) on local disks Simulations: – sections – 1-16 VMs – 1-7 days time window Warmup tasks:
Resource provisioning - results Warmup tasks clearly separated as outliers Linear functions Parameters a, b, c determined using non-linear fit The model fits well to the data [Figure residue: fitted parameters a = 6.53, b = 9.41, c = (value lost); legend entries include "128 sections" and "Warmup"]
Clouds for urgent computing (1/2) Elasticity – On-demand provisioning of VMs – Job prioritization and preemption Reliability – Public cloud services are specifically designed to support systems with high availability demands – Amazon: only five major outages in the years (only one for more than 6 h) Safety – A serious natural disaster may damage a local computing infrastructure – Public clouds as an emergency computing infrastructure – Data safety: public clouds as a reliable storage infrastructure for important but not sensitive data (example: pre-simulated scenario data sets)
Clouds for urgent computing (2/2) Cost-effectiveness – Decision support systems for natural disasters generate ‘spiky’ workloads: perfect cloud use case – Cheaper than maintenance of a dedicated infrastructure – Day-to-day operation can be handled by a relatively small, low-cost on-premises infrastructure Performance? – Bag-of-tasks applications such as scenario identification perfectly fit the cloud – What about CPU- and communication-intensive tightly-coupled simulations? HPC-in-the-Cloud is an emerging trend.
Summary (1/2) Environmental models result in complex applications – collaborative – multi-scale, multi-domain – time-critical – with data- and resource-intensive scenarios We have contributed to – methods and tools for environmental computing – advanced problem solving environments, virtual laboratories – compositions of resources into complex scenarios – accommodation to “spiky” behavior (variable workload)
Summary (2/2) We have addressed – Complex distributed systems – Coordination of execution (workflow) – Monitoring and management of services – allocation of resources to services – fault tolerance – provenance tracking Sustainability issue - supporting technologies
More at
Acknowledgements This research was supported by the National Centre for Research and Development (NCBiR) under Grant No. PBS1/B9/18/2013.