The MAPPER project receives funding from the EC's Seventh Framework Programme (FP7/ ) under grant agreement n° RI
Multiscale Applications on European e-Infrastructures
Marian Bubak, AGH Krakow, PL and University of Amsterdam, NL, on behalf of the MAPPER Consortium
Chalmers e-Science Initiative Seminar, 2 Dec 2011
2 About the speaker
AGH University of Science and Technology (1919): 15 faculties, students; 4000 employees
– Faculty of Electrical Engineering, Automatics, Computer Science and Electronics (1946): 4000 students, 400 employees
– Other 14 faculties
Department of Computer Science AGH (1980): 800 students, 70 employees
Academic Computer Centre CYFRONET AGH (1973): 120 employees
Distributed Computing Environments (DICE) Team
University of Amsterdam, Institute for Informatics, Computational Science
3 DICE team
Main research interests:
– investigation of methods for building complex scientific collaborative applications and large-scale distributed computing infrastructures
– elaboration of environments and tools for e-Science
– development of a knowledge-based approach to services, components, and their semantic composition and integration
Projects:
– CrossGrid: interactive compute- and data-intensive applications
– K-Wf Grid: knowledge-based composition of grid workflow applications
– CoreGRID: problem solving environments, programming models
– GREDIA: grid platform for media and banking applications
– ViroLab: GridSpace virtual laboratory
– PL-Grid: advanced virtual laboratory
– gSLM: service level management for grid and clouds
– UrbanFlood: Common Information Space for Early Warning Systems
– MAPPER: computational strategies, software and services for distributed multiscale simulations
– VPH-Share: federating cloud resources for development and execution of VPH computationally and data intensive applications
– Collage: Executable Papers; 1st award of Elsevier Competition at ICCS2011
4 Plan Multiscale applications Multiscale modeling Objectives of the MAPPER project Programming and Execution Tools MAPPER infrastructure ISR application scenario Summary
5 Vision
Distributed Multiscale Computing...
– Strongly science driven, application pull
...on existing and emerging European e-Infrastructures, and exploiting as much as possible services and software developed in earlier (EU-funded) projects.
– Strongly technology driven, technology push
6 Nature is multiscale Natural processes are multiscale – 1 H2O molecule – A large collection of H2O molecules, forming H-bonds – A fluid called water, and, in solid form, ice.
7 Multiscale modeling: Scale Separation Map
Nature acts on all the scales. We set the scales, and then decompose the multiscale system into single scale sub-systems and their mutual coupling.
[Figure: scale separation map; axes: spatial scale (up to L) and temporal scale (up to T)]
8 From a Multiscale System to many Singlescale Systems
– Identify the relevant scales
– Design specific models which solve each scale
– Couple the subsystems using a coupling method
[Figure: scale separation map; axes: spatial scale (up to L) and temporal scale (up to T)]
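The three steps above can be sketched in a few lines of Python. This is only an illustration, not MAPPER code: the `Scale` and `Submodel` classes, the `ssm_relation` helper and all numeric ranges are hypothetical. Each submodel is placed on the scale separation map as a range on the temporal and spatial axes, and two submodels are compared axis by axis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scale:
    """A range [finest, coarsest] on one axis of the scale separation map."""
    finest: float
    coarsest: float

    def relation(self, other: "Scale") -> str:
        # Two ranges are separated when one ends before the other begins.
        if self.coarsest < other.finest or other.coarsest < self.finest:
            return "separated"
        return "overlapping"


@dataclass(frozen=True)
class Submodel:
    """A single scale model with its temporal and spatial range (hypothetical)."""
    name: str
    time: Scale
    space: Scale


def ssm_relation(a: Submodel, b: Submodel) -> dict:
    """Compare two submodels on both axes of the scale separation map."""
    return {
        "temporal": a.time.relation(b.time),
        "spatial": a.space.relation(b.space),
    }


# Illustrative values only: seconds for time, metres for space.
fast = Submodel("fast_process", time=Scale(1e-3, 1.0), space=Scale(1e-5, 1e-3))
slow = Submodel("slow_process", time=Scale(3600.0, 86400.0), space=Scale(1e-5, 1e-3))

relation = ssm_relation(fast, slow)
print(relation)
```

The relation found on each axis is what would guide the choice of coupling method between the two submodels.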
9 Why multiscale models? There is simply no hope to computationally track complex natural processes at their finest spatio-temporal scales. – Even with the ongoing growth in computational power.
10 Minimal requirement
11 Multiscale computing Inherently hybrid models are best serviced by different types of computing environments When simulated in three dimensions, they usually require large scale computing capabilities. Such large scale hybrid models require a distributed computing ecosystem, where parts of the multiscale model are executed on the most appropriate computing resource. Distributed Multiscale Computing
12 Two paradigms
Loosely Coupled
– One single scale model provides input to another
– Single scale models are executed once
– Workflows
Tightly Coupled
– Single scale models call each other in an iterative loop
– Single scale models may execute many times
– Dedicated coupling libraries are needed
[Figure: two scale separation maps, one per paradigm; axes: spatial scale and temporal scale]
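The difference between the two paradigms can be shown with toy models. The sketch below assumes that each single scale model is a plain Python function; the function names and the toy models `double` and `add_one` are hypothetical and stand in for real simulation codes.

```python
def run_loosely_coupled(models, data):
    """Workflow style: each model runs once and feeds the next one."""
    for model in models:
        data = model(data)
    return data


def run_tightly_coupled(model_a, model_b, state, iterations):
    """Iterative style: the models exchange state in a loop and run many times."""
    for _ in range(iterations):
        state = model_a(state)
        state = model_b(state)
    return state


# Toy single scale models (placeholders for real solvers).
def double(x):
    return 2 * x


def add_one(x):
    return x + 1


loose = run_loosely_coupled([double, add_one], 3)
tight = run_tightly_coupled(double, add_one, 3, iterations=3)
print(loose, tight)
```

In the loosely coupled case a workflow engine is enough to chain the runs; in the tightly coupled case the exchange inside the loop is the part a coupling library has to provide.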
13 MAPPER: Multiscale APPlications on European e-infRastructures
– University of Amsterdam
– Max-Planck Gesellschaft zur Foerderung der Wissenschaften E.V.
– University of Ulster
– Poznan Supercomputing and Networking Centre
– Akademia Gorniczo-Hutnicza im. Stanislawa Staszica w Krakowie
– Ludwig-Maximilians-Universität München
– University of Geneva
– Chalmers Tekniska Högskola
– University College London
14 Motivation: user needs VPH, Fusion, Computational Biology, Material Science, Engineering: Distributed Multiscale Computing Needs
15 Applications 7 applications from 5 scientific domains brought under a common generic multiscale computing framework: virtual physiological human, fusion, hydrology, nano material science, computational biology. SSM, Coupling topology, (x)MML, Task graph, Scheduling
16 Ambition
– Develop computational strategies, software and services for distributed multiscale simulations across disciplines, exploiting existing and evolving European e-infrastructure
– Deploy a computational science infrastructure
– Deliver high quality components aiming at large-scale, heterogeneous, high performance multi-disciplinary multiscale computing
– Advance state-of-the-art in high performance computing on e-infrastructures: enable distributed execution of multiscale models across e-Infrastructures
17 High level tools: objectives Design and implement an environment for composing multiscale simulations from single scale models – encapsulated as scientific software components – distributed in various e-infrastructures – supporting loosely coupled and tightly coupled paradigm Support composition of simulation models: – using scripting approach – by reusable “in-silico” experiments Allow interaction between software components from different e-Infrastructures in a hybrid way. Measure efficiency of the tools
18 Requirements analysis Focus on multiscale applications that are described as a set of connected, but independent single scale modules and mappers (converters) Support describing such applications in uniform (standardized) way to: – analyze application behavior – support switching between different versions of the modules with the same scale and functionality – support building different multiscale applications from the same modules (reusability) Support computationally intensive simulation modules – requiring HPC or Grid resources – often implemented as parallel programs Support tight (with loop), loose (without loop) and hybrid (both) connection modes
19 Overview of tools
– MAPPER Memory (MaMe): a semantics-aware persistence store to record metadata about models and scales
– Multiscale Application Designer (MAD): visual composition tool transforming high level MML description into executable experiment
– GridSpace Experiment Workbench (EW): execution and result management of experiments on e-infrastructures via interoperability layers (AHE, QCG)
20 Multiscale modeling language
Uniformly describes multiscale models and their computational implementation on abstract level
Two representations: graphical (gMML), textual (xMML)
Includes description of
– scale submodules
– scaleless submodules (so called mappers and filters)
– ports and their operators (for indicating type of connections between modules)
– coupling topology
– implementation
Submodel execution loop in pseudocode:
    f := finit              /* initialization */
    t := 0
    while not EC(f, t):
        Oi(f, t)            /* intermediate observation */
        f := S(f, t)        /* solving step */
        t += theta(f)
    end
    Of(f, t)                /* final observation */
[Figure: corresponding symbols in gMML for the operators Oi, Of, S and finit]
Example for the in-stent restenosis application: IC – initial conditions, DD – drug diffusion, BF – blood flow, SMC – smooth muscle cells
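The submodel execution loop can be written as a small runnable Python function. This is a sketch that follows the pseudocode operator by operator (finit, EC, Oi, S, theta, Of); the argument names and the toy submodel that halves a value every 0.25 time units are assumptions made for illustration only.

```python
def run_submodel(f_init, end_condition, observe_intermediate, solve, time_step,
                 observe_final):
    """Generic submodel execution loop: finit, EC, Oi, S, theta, Of."""
    f = f_init                      # initialization (finit)
    t = 0.0
    while not end_condition(f, t):  # end condition (EC)
        observe_intermediate(f, t)  # intermediate observation (Oi)
        f = solve(f, t)             # solving step (S)
        t += time_step(f)           # time increment (theta)
    observe_final(f, t)             # final observation (Of)
    return f, t


# Toy submodel (hypothetical): the state is halved in every step of length 0.25.
intermediate = []
final = []

f_end, t_end = run_submodel(
    f_init=1.0,
    end_condition=lambda f, t: t >= 1.0,
    observe_intermediate=lambda f, t: intermediate.append((t, f)),
    solve=lambda f, t: 0.5 * f,
    time_step=lambda f: 0.25,
    observe_final=lambda f, t: final.append((t, f)),
)
print(f_end, t_end, len(intermediate))
```

In a coupled simulation the two observation callbacks are the places where a submodel would send data through its ports to other submodels or mappers.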
21 jMML library
Supports xMML analysis:
– Detection of initial models
– Constructing coupling topology (gMML)
– Generating task graph
– Deadlock detection
– Generating Scale Separation Map
Supports Graphviz or pdf formats
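To give an idea of what a coupling topology looks like in Graphviz format, the sketch below writes a directed graph in the DOT language. It does not use jMML and does not reproduce its output; the `to_dot` helper is hypothetical, and the edges between the ISR3D submodels (IC, SMC, BF, DD) and their labels are illustrative guesses, not the published topology.

```python
def to_dot(name, edges):
    """Render (source, target, label) edges as a Graphviz DOT digraph."""
    nodes = sorted({node for src, dst, _ in edges for node in (src, dst)})
    lines = [f"digraph {name} {{"]
    for node in nodes:
        lines.append(f'  "{node}";')
    for src, dst, label in edges:
        lines.append(f'  "{src}" -> "{dst}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


# Illustrative edges only (assumed, not taken from the ISR3D xMML file).
edges = [
    ("IC", "SMC", "initial conditions"),
    ("SMC", "BF", "geometry"),
    ("BF", "SMC", "shear stress"),
    ("SMC", "DD", "geometry"),
    ("DD", "SMC", "drug concentration"),
]

dot = to_dot("ISR3D", edges)
print(dot)
```

The resulting text can be saved to a file and rendered with the Graphviz `dot` command, for example `dot -Tpdf topology.dot -o topology.pdf`.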
22 MaMe - MAPPER memory Provides rich, semantics-aware persistence store for other components to record information. Based on a well-defined domain model containing MAPPER metadata defined in MML. Other MAPPER tools store, publish and reuse such metadata throughout the entire Project and its Consortium. Provides dedicated web interface for human users to browse and curate metadata
23 MAD: Application Designer User friendly visual tool for composing multiscale applications Supports importing application structure from xMML (section A and B) Supports composing multiscale applications in gMML (section B) with additional graphical specific information - layout, color etc. (section C) Transforms gMML into xMML Performs MML analysis to identify its loosely and tightly coupled parts Using information from MaMe and GridSpace EW, transforms gMML into executable formats with information needed for actual execution (section D) : – GridSpace Experiment – MUSCLE connection file (cxa.rb)
24 GridSpace Experiment Workbench
Supports execution of experiments on e-infrastructures via interoperability layers
Result management support
Uses Interpreter-Executor model of computation:
– Interpreter: a software package available on the infrastructure, usually programmatically accessible by DSL or script language, e.g. MUSCLE, LAMMPS, CPMD
– Executor: a common entity for hosts, clusters, grid brokers etc. capable of running Interpreters
Allows easy configuration of available executors and interpreters
[Figure: transforming example MML into executable GS Experiment]
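The Interpreter-Executor model can be pictured as a small registry. The sketch below is not the GridSpace API: the classes, the `find_executors` helper and the executor names are hypothetical, and the assignment of interpreters to executors is an assumed configuration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Interpreter:
    """A software package available on the infrastructure, e.g. MUSCLE."""
    name: str


@dataclass
class Executor:
    """A host, cluster or grid broker that can run some interpreters."""
    name: str
    interpreters: set = field(default_factory=set)

    def can_run(self, interpreter: Interpreter) -> bool:
        return interpreter.name in self.interpreters


def find_executors(executors, interpreter):
    """Return the names of all executors able to run the given interpreter."""
    return [e.name for e in executors if e.can_run(interpreter)]


# Assumed configuration, for illustration only.
executors = [
    Executor("ssh_cluster", {"MUSCLE", "LAMMPS"}),
    Executor("qcg_broker", {"MUSCLE"}),
]

muscle_targets = find_executors(executors, Interpreter("MUSCLE"))
cpmd_targets = find_executors(executors, Interpreter("CPMD"))
print(muscle_targets, cpmd_targets)
```

Keeping interpreters and executors as separate entities means that a new resource or a new package can be registered without touching the experiments that use them.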
25 User environment
– Registration of MML metadata: submodules and scales
– Application composition: from MML to executable experiment
– Execution of experiment using interoperability layer on e-infrastructure
– Result Management
26 Joint task force between MAPPER, EGI and PRACE
– Collaborate with EGI and PRACE to introduce new capabilities and policies onto e-Infrastructures
– Deliver new application tools, problem solving environments and services to meet end-users' needs
– Work closely with various end-user communities (involved directly in MAPPER) to perform distributed multiscale simulations and complex experiments
[Figure: timeline with milestones: MoU signed, taskforce established, 1st evaluation, 1st EU review, two apps selected on the MAPPER e-Infrastructure (EGI and PRACE resources); labels: Tier-2, Tier-1, Tier-0, MAPPER Taskforce, e-Infrastructure]
27 MAPPER e-infrastructure MAPPER pre-production infrastructure – Cyfronet, LMU/LRZ, PSNC, TASK, UCL, WCSS – Environment for developing, testing and deployment of MAPPER components Central services – GridSpace, MAD, MaMe, monitoring, web-site EGI-MAPPER-PRACE task force – SARA Huygens HPC system
28 2 scenarios in operation loosely coupled DMC tightly coupled DMC
29 In-stent restenosis
Coronary heart disease (CHD) remains the most common cause of death in Europe, being responsible for approximately 1.92 million deaths each year*
A stenosis is an abnormal narrowing of a blood vessel
Solution: a stent placed with a balloon angioplasty
Possible response, in 10% of the cases: abnormal tissue growth in the form of in-stent restenosis
– Multiscale, multiphysics phenomenon involving physics, biology, chemistry, and medicine
30 In-stent restenosis model A 3D model of in-stent restenosis (ISR3D) – why does it occur, when does it stop? – Ultimate goal: Facilitate stent design Effect of drug eluting stents Models: – cells in the vessel wall; – blood in the lumen; – drug diffusion; and – most importantly their interaction 3D model is computationally very expensive 2D model has published results* *H. Tahir, A. G. Hoekstra et al. Interface Focus, 1(3), 365–373
31 Scale separation map Four main submodels Same spatial scale Different temporal scale
32 Coupling topology Model – is tightly coupled (excluding initial condition) – has a fixed number of synchronization points – has one instance per submodel
33 MML of ISR3D
[Figure: gMML diagram of ISR3D; legend: start, stop, submodel, mapper, edge heads/tails (initialization, intermediate, finalization)]
34 ISR3D on MAPPER
35 Demo: Mapper Memory (MaMe) Semantics-aware persistence store Records MML-based metadata about models and scales Supports exchanging and reusing MML metadata for – other MAPPER tools via REST interface – human users via dedicated Web interface Ports and their operators
36 Demo: GridSpace EW for ISR3D
Obtains MAD generated experiment containing a configuration file for MUSCLE interpreter
Provides two executors for MUSCLE interpreter
– SSH on Polish NGI UI: cluster execution
– QCG: multisite execution
Uses QCG executor for running MUSCLE interpreter on QCG and staging input/output files
    # declare kernels which can be launched in the CxA
    cxa.add_kernel('submodel_instance1', 'my.submodelA')
    cxa.add_kernel('submodel_instance2', 'my.submodelB')
    …
    # configure connection scheme of the CxA
    cs = cxa.cs
    # configure unidirectional connection between kernels
    cs.attach 'submodel_instance1' => 'submodel_instance2' do
      tie 'portA', 'portB'
      …..
    end
    …
37 Computing ISR3D is implemented using the multiscale coupling library and environment (MUSCLE) Contains Java, Fortran and C++ submodels MUSCLE provides uniform communication between tightly coupled submodels MUSCLE can be run on a laptop, a cluster or multiple sites.
38 QCG Role Provides an interoperability layer between PRACE and EGI infrastructures Co-allocates heterogeneous resources according to the requirements of a MUSCLE application using an advance reservation mechanism Synchronizes the execution of application kernels in multi-cluster environment Efficiently executes and manages tasks on EGI and UCL resources Manages data transfers
39 ISR3D Results
40 ISR3D - conclusion
Before MAPPER, ISR2D ran fast enough, but ISR3D took too much execution time and a lot of time to couple
Now, ISR3D runs in a distributed fashion using the MAPPER tools and middleware
To get scientific results we will have to, and can, run many batch jobs
– Done in the MeDDiCa EU project
– Involves 1000s of runs
Also, the code can be parallelized to run faster
41 Summary Elaboration of a concept of an environment supporting developers and users of multiscale applications for grid and cloud infrastructures Design of the formalism for describing connections in multiscale simulations Enabling access to e-infrastructures Validation of the formalism against real applications structure by using tools Proof of concept for transforming high level formal description to actual execution using e-infrastructures
More about MAPPER