Distributed Computing Environments (DICE) team – product portfolio


1 Distributed Computing Environments (DICE) team – product portfolio
Piotr Nowakowski, Bartosz Baliś, Tomasz Bartyński, Tomasz Gubała, Daniel Harężlak, Marek Kasztelnik, Maciej Malawski, Jan Meizner, Katarzyna Rycerz, Bartosz Wilk, Marian Bubak
Department of Computer Science and ACC CYFRONET AGH University of Science and Technology, Kraków, Poland

2 Overview
The DICE team was established at ACC CYFRONET AGH in 2001, and has since developed a track record of successful deployments of customized IT solutions for e-Science. Notable products include:
GridSpace/GridSpace2: a set of virtual laboratories
Collage: a novel authoring environment for executable publications
Atmosphere: a hybrid computational cloud platform
ISMOP: a levee monitoring system
plgapp: convenient development of HPC applications
rimrock/PLG-Data: simplified access to HPC computational jobs and data storage resources
cDRS/Integromics/MetaBiobank/BIP/LSSP: a set of targeted applications for life sciences and medical science, exploiting HPC resources and offering convenient UIs with integrated security

3 GridSpace2 Virtual Laboratory
GridSpace2 facilitates development of computational experiments by linking together code run in dissimilar programming environments. GridSpace2 provides an alternative to drag-and-drop workflow construction environments (such as Taverna) and instead follows a scripting approach to workflow design. GridSpace2 provides a web interface where users can execute fragments of code and string them together into workflows. Each fragment has an associated interpreter, and individual fragments can be linked together in workflow-like fashion (with loops, conditional statements, etc.). Results can be visualized and stored for later download.
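The scripting approach described above can be illustrated with a minimal sketch. It is not GridSpace2 code: the `Snippet` class and the `run_snippet` and `run_experiment` helpers are hypothetical names, and the convention of passing results from one fragment to the next over standard input and output is an assumption made for the example.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Snippet:
    """One code fragment plus the interpreter command that runs it (hypothetical model)."""
    interpreter: list[str]  # command prefix, e.g. [sys.executable, "-c"]
    code: str


def run_snippet(snippet: Snippet, stdin_text: str = "") -> str:
    """Run the fragment in its interpreter, feeding the previous result on stdin."""
    result = subprocess.run(
        snippet.interpreter + [snippet.code],
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def run_experiment(snippets: list[Snippet], initial_input: str = "") -> str:
    """Chain snippets: the stdout of each one becomes the stdin of the next."""
    data = initial_input
    for snippet in snippets:
        data = run_snippet(snippet, data)
    return data


# Both fragments use the local Python interpreter so that the sketch runs anywhere;
# a real experiment could mix fragments written for different interpreters.
PYTHON = [sys.executable, "-c"]
experiment = [
    Snippet(PYTHON, "print(' '.join(str(n * n) for n in range(1, 5)))"),
    Snippet(PYTHON, "import sys; print(sum(int(x) for x in sys.stdin.read().split()))"),
]

if __name__ == "__main__":
    print(run_experiment(experiment).strip())
```

Because the experiment is an ordinary script, loops and conditional statements around `run_snippet` come for free from the host language, which is the main contrast with drag-and-drop workflow editors.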

4 GridSpace2 Virtual Laboratory
User-friendly web frontend
Experiment workbench: construct experiment plans from code snippets; interactively run experiments
Experiment Execution Environment: multiple interpreters; access to libraries, programs and services (gems)
Access to computing infrastructures: clusters, grids, clouds

5 GridSpace2 Virtual Laboratory
A set of visual tools aiding users in composing GridSpace experiments has been implemented in the MAPPER project. These add-ons support scientists in composing multiscale applications from previously registered models. High-level descriptions are transformed into GridSpace experiments.

6 GridSpace2 Virtual Laboratory

7 Collage Authoring Environment
Collage is an environment supporting authoring, deployment and provisioning of executable publications, i.e. publications which contain executable code and expose data sets.
Goal: extending the traditional publishing model with interactivity mechanisms; enabling readers (including reviewers) to replicate and verify experimentation results and browse large-scale result spaces.
A prototype, developed in response to the Elsevier Grand Challenge on Executable Publications, won first prize in the competition; the team has since collaborated on extending the SciVerse web portal with interactive publication features.
The Collage Authoring Environment launched a pilot special issue with the Computers & Graphics Journal.

8 Collage Authoring Environment
When a user requests access to an Executable Paper, the static contents of the paper are served by the Publisher Server. Dynamic contents, which are embedded in the publication, are instead served by a dedicated Executable Paper Engine, residing on a separate host, with access to HPC resources. Publisher and HPC provider roles are decoupled and follow mutually independent access policies.
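The split between static and dynamic contents can be pictured as a simple routing decision. The sketch below is only an illustration under stated assumptions: the host names, the URL layout and the `ContentItem` and `resolve_url` names are hypothetical and do not come from Collage itself.

```python
from dataclasses import dataclass

# Hypothetical host names: the source only states that the two roles
# reside on separate hosts with independent access policies.
PUBLISHER_SERVER = "https://publisher.example.org"
EXECUTABLE_PAPER_ENGINE = "https://engine.example.org"


@dataclass(frozen=True)
class ContentItem:
    item_id: str
    dynamic: bool  # True for embedded executable content, False for static content


def resolve_url(paper_id: str, item: ContentItem) -> str:
    """Return the URL of the host responsible for serving this content item."""
    base = EXECUTABLE_PAPER_ENGINE if item.dynamic else PUBLISHER_SERVER
    return f"{base}/papers/{paper_id}/items/{item.item_id}"


if __name__ == "__main__":
    print(resolve_url("paper-1", ContentItem("abstract", dynamic=False)))
    print(resolve_url("paper-1", ContentItem("experiment-2", dynamic=True)))
```

Keeping the decision in one place mirrors the decoupling described above: the publisher never needs HPC credentials, and the engine never needs to host the static paper.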

9 Collage Authoring Environment

10 Atmosphere Cloud Platform
Atmosphere is a hybrid computational cloud platform which integrates a variety of cloud resources, both public and private, and exposes a uniform access layer for end users. Atmosphere enables users to securely access resources, develop application services and expose them in a controlled manner. Atmosphere conceals the heterogeneous nature of the underlying cloud infrastructure and enables cloud application development with no prior knowledge regarding cloud computing. Atmosphere has been deployed as the principal means of accessing cloud resources in the VPH-Share and PL-Grid projects.
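A uniform access layer over heterogeneous cloud sites is commonly built with an adapter per backend. The sketch below illustrates that general pattern only; the class names (`CloudSite`, `PrivateSite`, `PublicSite`, `UniformAccessLayer`) and the instance-id format are hypothetical, and the actual Atmosphere design may differ.

```python
from abc import ABC, abstractmethod


class CloudSite(ABC):
    """Adapter interface that each cloud backend would implement (hypothetical)."""

    @abstractmethod
    def start_instance(self, image: str) -> str:
        """Start a virtual machine from the image and return a site-specific id."""


class PrivateSite(CloudSite):
    """Stand-in for a private cloud backend; it only fabricates ids locally."""

    def __init__(self) -> None:
        self._count = 0

    def start_instance(self, image: str) -> str:
        self._count += 1
        return f"private-{image}-{self._count}"


class PublicSite(CloudSite):
    """Stand-in for a public cloud backend; it only fabricates ids locally."""

    def __init__(self) -> None:
        self._count = 0

    def start_instance(self, image: str) -> str:
        self._count += 1
        return f"public-{image}-{self._count}"


class UniformAccessLayer:
    """Single entry point that hides which backend serves a request."""

    def __init__(self, sites: dict[str, CloudSite]) -> None:
        self._sites = sites

    def start(self, image: str, site: str) -> str:
        return self._sites[site].start_instance(image)


if __name__ == "__main__":
    layer = UniformAccessLayer({"private": PrivateSite(), "public": PublicSite()})
    print(layer.start("ubuntu", "private"))
    print(layer.start("ubuntu", "public"))
```

Callers only ever see `UniformAccessLayer.start`, so supporting another cloud stack means adding one adapter class rather than changing client code.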

11 Atmosphere Cloud Platform
A full range of user-friendly GUIs is provided to enable service creation, instantiation and access. A comprehensive online user guide is also available.
The GUIs work by invoking a secure RESTful API which is exposed by the Atmosphere host. We refer to this API as the Cloud Facade. Any operation which can be performed using the GUI may also be invoked programmatically by tools acting on behalf of the platform user; this includes standalone applications and workflow management environments.
Atmosphere provides a robust set of top-level interfaces (GUIs and APIs) and can integrate with a wide variety of cloud software stacks, both commercial and scientific.
[Diagram labels: End user; Application or Workflow environment; Atmosphere Registry (AIR); Atmosphere Ruby on Rails controller layer (core Atmosphere logic); Cloud sites]

12 Atmosphere Cloud Platform

13 ISMOP Levee Monitoring System
ISMOP stands for Computerized Levee Monitoring System (see ).
The goal of the infrastructure is to collect measurements (temperature, water pressure, etc.) from sensors deployed in flood embankments along the Vistula river, and to run CFD simulations to determine the likelihood of a catastrophic flood.
Calculations are performed with the use of CYF/DCS cloud resources.
User interfaces are available at
The project is ongoing.

14 ISMOP Levee Monitoring System
The system adapts to changing conditions and can operate in both standard and urgent modes. Computing resources are internally provisioned by Atmosphere.

15 ISMOP Levee Monitoring System

16 plgapp – scientific application development platform
plgapp is a platform for hosting lightweight web applications using high performance computing infrastructures. A development mode for applications is automatically provided where new versions can be uploaded and tested without affecting the production environment. plgapp applications are run in the user’s browser and all interaction with the computing infrastructure is performed with the help of JavaScript libraries (in most cases the set of libraries provided by the platform should suffice). In order to automate the process of transmitting application files to the platform, Dropbox support has been integrated.

17 plgapp – scientific application development platform
Each plgapp constitutes an isolated scientific portal with two execution environments:
Production environment, used by regular users
Testing environment, used by early adopters and to validate new features
Application files can be locally synchronized via web forms or with a Dropbox client
This ensures a tight implement-save-run loop for better programmer productivity
Production deployment of test applications can be done from within a dedicated web panel

18 plgapp – scientific application development platform

19 rimrock command execution interface
rimrock simplifies interaction with remote servers by allowing users to execute applications in batch and interactive modes. The platform can be used to easily interface with HPC resources through a friendly web-based interface. Application output can be fetched online and new input can be sent using a simple REST interface. Dedicated REST interfaces are provided for running HPC jobs. Users do not need to learn how to create JDL (Job Description Language) files: all that is necessary is to pass the command which is to be executed, and rimrock takes care of the rest.

20 rimrock command execution interface
Single point of access to the infrastructure via REST
JSON messages
HTTPS as the communication protocol
Access to both computation and data
Supports various middleware packages including local queues
Can reuse existing scripts
Submits jobs to all computing centers of the PL-Grid infrastructure
Security does not change: the standard user proxy is used to delegate requests
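The points above (a command passed as a JSON message over HTTPS, with the user proxy delegating the request) can be sketched as follows. This is a hedged illustration, not the documented rimrock API: the endpoint URL is a placeholder, and the JSON field names (`host`, `command`) and the `PROXY` header carrying a base64-encoded proxy are assumptions. The request is only constructed, never sent.

```python
import base64
import json
import urllib.request

# Placeholder endpoint: the real service address is not given in the slides.
RIMROCK_URL = "https://rimrock.example.org/api/process"


def build_command_request(host: str, command: str, proxy_pem: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking the service to run a command.

    The payload fields and the PROXY header are assumptions made for illustration.
    """
    payload = json.dumps({"host": host, "command": command}).encode("utf-8")
    encoded_proxy = base64.b64encode(proxy_pem.encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        RIMROCK_URL,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json", "PROXY": encoded_proxy},
    )


if __name__ == "__main__":
    request = build_command_request("login.example.org", "pwd", "dummy proxy contents")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Sending the request would be a call to `urllib.request.urlopen(request)`; it is left out here because the endpoint is a placeholder.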

21 rimrock command execution interface

22 DataNet
DataNet enables lightweight metadata and data management in applications exploiting high-performance computing (HPC) resources. DataNet can be used to create ad-hoc data models and deploy them as actionable repositories with fine-tuned access restrictions. The system employs a programming language-neutral RESTful approach to data storage and retrieval. Supported models include structured data (e.g. relational databases) as well as files. DataNet ensures scalability by interfacing with a number of available PaaS platforms, in addition to storage sites provided by HPC infrastructures.

23 DataNet
A web interface is used by users to create, extend and discover metadata models.
Model repositories are deployed in the PaaS cloud layer for scalable and reliable access from computing nodes through REST interfaces.
Data items from Storage Sites are linked from the model repositories.
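The idea of an ad-hoc model deployed as a repository with access restrictions can be sketched in memory. Everything here is hypothetical (the `Model` and `Repository` classes, the reader and writer sets, the example field names); a real DataNet repository is reached through REST interfaces rather than local method calls.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    """Ad-hoc data model: field names mapped to expected Python types (hypothetical)."""
    name: str
    fields: dict[str, type]
    readers: set[str] = field(default_factory=set)
    writers: set[str] = field(default_factory=set)


class Repository:
    """In-memory stand-in for a deployed model repository."""

    def __init__(self, model: Model) -> None:
        self.model = model
        self._items: list[dict] = []

    def add(self, user: str, item: dict) -> int:
        """Validate the item against the model, store it and return its id."""
        if user not in self.model.writers:
            raise PermissionError(f"{user} may not write to {self.model.name}")
        if set(item) != set(self.model.fields):
            raise ValueError("item fields do not match the model")
        for key, expected in self.model.fields.items():
            if not isinstance(item[key], expected):
                raise TypeError(f"{key} must be of type {expected.__name__}")
        self._items.append(dict(item))
        return len(self._items) - 1

    def get(self, user: str, item_id: int) -> dict:
        """Return a copy of the stored item if the user has read access."""
        if user not in self.model.readers:
            raise PermissionError(f"{user} may not read from {self.model.name}")
        return dict(self._items[item_id])


if __name__ == "__main__":
    # The file_url field illustrates linking a data item held on a storage site.
    model = Model(
        name="simulation_run",
        fields={"temperature": float, "file_url": str},
        readers={"alice", "bob"},
        writers={"alice"},
    )
    repo = Repository(model)
    item_id = repo.add("alice", {"temperature": 21.5, "file_url": "https://storage.example.org/run1.dat"})
    print(repo.get("bob", item_id))
```

The metadata record holds only a link to the file, which matches the description of data items on storage sites being linked from the model repositories.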

24 DataNet

25 PLG-Data
PLG-Data ( ) is a web application which simplifies access to data storage resources attached to PL-Grid high performance computing machines, including in particular the HPC clusters deployed at ACC CYFRONET AGH.
The aim of the application is to enable quick access to data stored and processed with the use of HPC clusters, including Prometheus and Zeus (at ACC CYFRONET AGH).
The platform provides a convenient web interface and internally communicates with the underlying computing resources using GridFTP with X.509 proxy certificates.
The platform is fully integrated with PL-Grid authentication and security mechanisms.

26 Comparative Drug Ranking System
cDRS is a web service developed by members of the DICE team to enable clinical virologists to assess the susceptibility of HIV strains to various treatment regimens. The system was developed in collaboration with the University of Amsterdam (UvA), originally in the context of the FP6 ViroLab project. cDRS has since been spun off into a standalone application, and also deployed as a cloud service in VPH-Share. Users can input lists of mutations in key parts of the HIV genotype (vs. reference strains), following which cDRS computes the susceptibility of the mutated virus to the available antiretroviral drugs.

27 DNA Microarray Integromics Platform
Integromics (DNA Microarray Integromics Platform) is a service developed by members of the DICE team in the context of the LifeScience Domain Grid, one of the application-oriented branches of PL-Grid, catering to the needs of the Polish life sciences community ( ). The platform enables users to conduct DNA microarray experiments and sift through their output data in search of useful correlations. Integromics supports a variety of popular DNA microarray data formats and can run a selection of analysis algorithms on input data. Results can be visualized and shared in a variety of ways. The platform is fully integrated with PL-Grid authentication and security mechanisms.

28 DNA Microarray Integromics Platform
The platform is provided as a web application backed by the PL-Grid Infrastructure. The main web server performs third-party authentication using PL-Grid OpenID, and subsequently serves authorized content. Large-scale computations are performed asynchronously in a job queue, and all user data is backed up to secure secondary storage for additional fault tolerance.

29 DNA Microarray Integromics Platform

30 MetaBiobank
The goal of MetaBiobank is to create a common entry point where the contents of various biobanks operated by Polish research centers can be jointly browsed and accessed ( ). MetaBiobank can import descriptions of samples and sample collections, and store them in the MIABIS format (standard biobank data sharing scheme, supported by the BBMRI consortium). Users can register their own biobanks and/or request access to other users’ biobanks and individual samples. A user-friendly web interface is provided. The platform is fully integrated with PL-Grid authentication and security mechanisms.

31 Brassica Information Portal
The Brassica Information Portal ( ), developed jointly with The Genome Analysis Center (TGAC UK), is a comprehensive repository of information related to the Brassica breeding community. BIP can be used to derive information regarding Brassica subspecies and varieties, plant trials and the relations between Brassica traits and genetic code fragments. Data accession mechanisms are built into the system, facilitating registration of new trials and plant varieties. Efficient query and visualization capabilities are provided. The system supersedes the CropStore database previously operated by TGAC.

32 Lead Structure Search Portal
LSSP (Lead Structure Search Portal) is another service developed for The Genome Analysis Center in Norwich, UK. The goal of the system is to increase the effectiveness of the lead structure search procedure for its users. Supported search strategies include structure-based experiments (for specific proteins) and ligand-based experiments (for specific ligand libraries). The system provides a front-end for a variety of batch processing tools (referred to as methods) which are supplied by TGAC. Methods can be executed sequentially, in a LIFO fashion, with the progress and outcome of each execution communicated to the end user in a user-friendly way. The system is currently under development.

33 Lead Structure Search Portal

34 Summary
In the course of our work we tackle diverse challenges on various layers of system-level science. Our solutions range from targeted tools which efficiently solve a specific problem, all the way to generic virtual laboratories. We specialize in e-Science applications that bring together high-performance computing with efficient web UIs to deliver better user experience for researchers.

35 For further information…
For a bird’s eye introduction to the team and to our key projects, visit our GitHub site at
More information regarding our interests and publications can be found on the DIstributed Computing Environments (DICE) team homepage at

