A Survey of Interactive Execution Environments

Slides:



Advertisements
Similar presentations
Polska Infrastruktura Informatycznego Wspomagania Nauki w Europejskiej Przestrzeni Badawczej Institute of Computer Science AGH ACC Cyfronet AGH The PL-Grid.
Advertisements

TouchDevelop: Productive Scripting on and for Mobile Devices and Web Services Thomas Ball Sebastian Burckhardt, Peli de Halleux, Michał Moskal, Nikolai.
QuIDE was used during the Quantum Computation classes at DCS AGH The students assessed the usability with the System Usability Scale QuIDE was compared.
Polish Infrastructure for Supporting Computational Science in the European Research Space GridSpace Based Virtual Laboratory for PL-Grid Users Maciej Malawski,
Next Generation Domain-Services in PL-Grid Infrastructure for Polish Science. Numerical Simulations of Metal Forming Production Processes and Cycles by.
Towards auto-scaling in Atmosphere cloud platform Tomasz Bartyński 1, Marek Kasztelnik 1, Bartosz Wilk 1, Marian Bubak 1,2 AGH University of Science and.
Advanced Grid-Enabled System for Online Application Monitoring Main Service Manager is a central component, one per each.
EUROPEAN UNION Polish Infrastructure for Supporting Computational Science in the European Research Space The Capabilities of the GridSpace2 Experiment.
Panel 22 July, 2015 Panel Data Intensive Science at HPCS 2015 – The International Conference on High Performance Computing & Simulation
In each iteration macro model creates several micro modules, sends data to them and waits for the results. Using Akka Actors for Managing Iterations in.
Experience with the OpenStack Cloud for VPH Applications Jan Meizner 1, Maciej Malawski 1,2, Piotr Nowakowski 1, Paweł Suder 1, Marian Bubak 1,2 AGH University.
Basic Grid Registry configuration – there is not any backup data Grid Registry configuration where every domain has duplicated information Find all services.
DataNet – Flexible Metadata Overlay over File Resources Daniel Harężlak 1, Marek Kasztelnik 1, Maciej Pawlik 1, Bartosz Wilk 1, Marian Bubak 1,2 1 ACC.
The Helix Nebula Marketplace HNX The European cloud marketplace for scientists, researchers, developers & public organisations Marc-Elian Bégin, CEO, Co-founder,
High Level Architecture (HLA)  used for building interactive simulations  connects geographically distributed nodes  time management (for time- and.
EC-project number: Universal Grid Client: Grid Operation Invoker Tomasz Bartyński 1, Marian Bubak 1,2 Tomasz Gubała 1,3, Maciej Malawski 1,2 1 Academic.
AKOGRIMO Integration of Grid services with mobile technologies; validation in e-health, e-learning and disaster management areas CoreGRID European Grid.
Datalayer Notebook Allows Data Scientists to Play with Big Data, Build Innovative Models, and Share Results Easily on Microsoft Azure MICROSOFT AZURE ISV.
HUSKY CONSULTANTS FRANKLIN VALENCIA WIOLETA MILCZAREK ANTHONY GAGLIARDI JR. BRIAN CONNERY.
Lightweight construction of rich scientific applications Daniel Harężlak(1), Marek Kasztelnik(1), Maciej Pawlik(1), Bartosz Wilk(1) and Marian Bubak(1,
Federating PL-Grid Computational Resources with the Atmosphere Cloud Platform Piotr Nowakowski, Marek Kasztelnik, Tomasz Bartyński, Tomasz Gubała, Daniel.
BIOLOGICAL SYSTEMS SIMULATIONS BASED ON NEGATIVE FEEDBACK EUROPEAN UNION EUROPEAN REGIONAL DEVELOPMENT FUND The work was co-funded by the European Regional.
NA-MIC National Alliance for Medical Image Computing Core 1b – Engineering Computational Platform Jim Miller GE Research.
High Level Architecture (HLA)  used for building interactive simulations  connects geographically distributed nodes  time management (for time- and.
GOOGLE APP ENGINE By Muktadiur Rahman. Contents  Cloud Computing  What is App Engine  Why App Engine  Development with App Engine  Quote & Pricing.
Parameter Sweep and Resources Scaling Automation in Scalarm Data Farming Platform J. Liput, M. Paciorek, M. Wrona, M. Orzechowski, R. Slota, and J. Kitowski.
Ocean Observatories Initiative Serving Ocean Model Data on the Cloud M. Meisinger, C. Farcas, E. Farcas, C. Alexander, M. Arrott, J. de La Beaujardière,
EUROPEAN UNION Polish Infrastructure for Supporting Computational Science in the European Research Space The Capabilities of the GridSpace2 Experiment.
1 TCS Confidential. 2 Objective : In this session we will be able to learn:  What is Cloud Computing?  Characteristics  Cloud Flavors  Cloud Deployment.
Spark and Jupyter 1 IT - Analytics Working Group - Luca Menichetti.
InSilicoLab – Grid Environment for Supporting Numerical Experiments in Chemistry Joanna Kocot, Daniel Harężlak, Klemens Noga, Mariusz Sterzel, Tomasz Szepieniec.
EGI-InSPIRE RI EGI Compute and Data Services for Open Access in H2020 Tiziana Ferrari Technical Director, EGI.eu
EGI-InSPIRE RI An Introduction to European Grid Infrastructure (EGI) March An Introduction to the European Grid Infrastructure.
Petr Škoda, Jakub Koza Astronomical Institute Academy of Sciences
Distributed Computing Environments (DICE) team – product portfolio
Seasonal School Demo and Assigments
Cloud Computing for Science
Demo of the Model Execution Environment WP2 Infrastructure Platform
Demo of the Model Execution Environment WP2 Infrastructure Platform
Model Execution Environment Current status of the WP2 Infrastructure Platform Marian Bubak1, Daniel Harężlak1, Marek Kasztelnik1 , Piotr Nowakowski1, Steven.
MIRACLE Cloud-based reproducible data analysis and visualization for outputs of agent-based models Xiongbing Jin, Kirsten Robinson, Allen Lee, Gary Polhill,
Foundations of Data Science
From VPH-Share to PL-Grid: Atmosphere as an Advanced Frontend
Status and Challenges: January 2017
Model Execution Environment for Investigation of Heart Valve Diseases
What is Cloud Computing - How cloud computing help your Business?
DICE - Distributed Computing Environments Team
Introduction to R Programming with AzureML
Recap: introduction to e-science
VI-SEEM Data Repository
PROCESS - H2020 Project Work Package WP6 JRA3
Coding in the Cloud This slide deck includes recorded video demonstrations of content from the live presentation. Joon-Yee.
NSF : CIF21 DIBBs: Middleware and High Performance Analytics Libraries for Scalable Data Science PI: Geoffrey C. Fox Software: MIDAS HPC-ABDS.
Cloud Computing Dr. Sharad Saxena.
High Performance Data Scientist
1ACC Cyfronet AGH, Kraków, Poland
Cloud Evolution Dennis Gannon
Department of Intelligent Systems Engineering
Brian Matthews STFC EOSCpilot Brian Matthews STFC
Mariusz Sterzel1 , Lukasz Dutka1, Tomasz Szepieniec1
Quoting and Billing: Commercialization of Big Data Analytics
Panel on Research Challenges in Big Data
Infrastructure for Personalised Medicine: It’s MEE that Matters!
Final Review 27th March Final Review 27th March 2019.
Enol Fernandez & Giuseppe La Rocca EGI Foundation
Big Data, Simulations and HPC Convergence
H2020 EU PROJECT | Topic SC1-DTH | GA:
Convergence of Big Data and Extreme Computing
Getting Started with GridLAB-D on the Cloud
H2020 EU PROJECT | Topic SC1-DTH | GA:
Presentation transcript:

A Survey of Interactive Execution Environments for Extreme Large-Scale Computations Katarzyna Rycerz1, Piotr Nowakowski2, Jan Meizner2, Bartosz Wilk2, Jakub Bujas1, Łukasz Jarmocik1, Michał Krok1, Przemysław Kurc1, Sebastian Lewicki1, Mateusz Majcher1, Piotr Ociepka1, Lukasz Petka1, Krzysztof Podsiadło1, Patryk Skalski1, Wojciech Zagrajczuk1, Michał Zygmunt1, and Marian Bubak1,2 1Department of Computer Science, AGH University of Science and Technology, al. Mickiewicza 30, 30-059 Kraków, Poland 2Academic Computer Centre Cyfronet AGH, Nawojki 11, 30-950, Kraków, Poland Goals To provide exascale ready computational and data services that will accelerate innovation To validate the services in real-world settings, both in scientific research and in industry pilot deployments: Square Kilometre Array – a large radiotelescope project medical informatics airline revenue management open data for global disaster risk reduction agricultural analysis based on Copernicus data Extreme Large Computing Services Based on “focus on services and forget about infrastructures” idea Support computational activities: analysis, data mining, pattern recognition, etc. Use heterogeneous research datasets (input and output data from modelling, simulation, visualization and other scientific applications stored in data centers and on storage systems available on European e-infrastructures) Support HPC and cloud based computations needed for various data analyses Survey of interactive execution environments Focus on: integration of scripting notebooks with HPC infrastructures to support building extreme large computing services extension mechanisms required to add support specific to exascale processing of large data sets ability to mix multiple languages in one document integration with cloud infrastructures Name Large data set support Integration with Cloud/HPC infrastructures Extension mechanisms R Notebook using additional custom libraries (e.g. for Apache SPARK) using custom libraries communicating with HPC queuing systems (e.g. SLURM) It is possible to develop custom engines for languages which are not natively supported. DataBricks the whole platform is based on Apache SPARK Available only on Amazon Web Services or Microsoft Azure almost none Beaker using additional custom libraries no specific support for HPC; Docker version available Users can add Beaker support for unsupported languages via a dedicated API.  Jupyter no mature solution for HPC; Docker version available Additional languages can be supported by writing a new Jupyter kernel. Cloud Datalab support for Google data services (e.g. BigQuery, Cloud Machine Learning Engine, etc.) restricted to the Google Cloud platform limited Zeppelin native support for Apache Spark can be run on HPC using connection to the YARN cluster support for additional languages can be added Summary DataBricks and Cloud Datalab must be run on specific cloud resources Zeppelin and DataBricks are based on Apache SPARK, which potentially limits their usage to that platform R Notebooks seems promising; however, some important features are only available with a commercial version of Rstudio BeakerX (successor to Beaker) and Cloud Data are based on the Jupyter solution Jupyter seems to be a suitable base for developing extreme large computing environments References Ciepiela, E., Harężlak, D., Kasztelnik, M., Meizner, J., Dyk, G., Nowakowski, P. and Bubak, M., 2013. The collage authoring environment: From proof-of-concept prototype to pilot service. Procedia Computer Science, 18, pp.769-778. Beaker Notebook webpage http://beakernotebook.com/features Databricks webpage https://databricks.com/product/databricks 4. Datalab webpage https://cloud.google.com/datalab/ 5. Jupyter webpage http://jupyter.org/ 6. Rstudio webpage https://www.rstudio.com 7. Zeppelin web page https://zeppelin.apache.org/ http://dice.cyfronet.pl Funding by EU H2020 grant 777533.