DIRAC services

Services
- FG-DIRAC
  - Maintenance and operation
  - Practically all the DIRAC@IN2P3 members are involved
  - How can this be presented to the benefit of the DIRAC@IN2P3 project? A testing ground?
- DIRAC4EGI
  - CPPM, together with UB and Cyfronet, offered to maintain the service
  - Awaiting the EGI answer
  - Should DIRAC@IN2P3 be involved?
  - Playing ground for various activities, e.g. cloud management, COMDIRAC, data management
- FG-DIRAC beyond France-Grilles
  - Merge FG-DIRAC and DIRAC4EGI
  - Keep them logically separate but technically unique
  - Service administration tools should be further developed
  - Part of the DIRAC@IN2P3 contract?

The cloud case

Clouds
- VM scheduler developed for the Belle MC production system
  - Dynamic VM spawning, taking Amazon EC2 spot prices and the Task Queue state into account
  - VMs are discarded automatically when no longer needed
- The DIRAC VM scheduler, by means of dedicated VM Directors, is interfaced to:
  - OCCI-compliant clouds: OpenStack, OpenNebula
  - Apache-libcloud API compliant clouds
  - Amazon EC2
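
To make the scheduling logic above concrete, here is a minimal Python sketch of spot-price- and queue-driven VM spawning through the Apache-libcloud API. It is an illustration only, not VMDIRAC code: the thresholds, instance size and image id are invented placeholders; only the libcloud calls themselves are real library API.

# Minimal sketch of queue- and spot-price-driven VM spawning with apache-libcloud.
# Thresholds, the instance size and the AMI id are invented placeholders; only the
# libcloud calls (get_driver, list_sizes, get_image, create_node, destroy_node) are real.
from libcloud.compute.types import Provider
from libcloud.compute.providers import get_driver

MAX_SPOT_PRICE = 0.05    # USD/hour, hypothetical budget limit
MIN_WAITING_JOBS = 10    # spawn a VM only if enough work is queued

def spawn_vm_if_needed(ec2_key, ec2_secret, waiting_jobs, current_spot_price):
    """Start one worker VM when the Task Queue is deep enough and the spot price is acceptable."""
    if waiting_jobs < MIN_WAITING_JOBS or current_spot_price > MAX_SPOT_PRICE:
        return None
    driver = get_driver(Provider.EC2)(ec2_key, ec2_secret, region='us-east-1')
    size = [s for s in driver.list_sizes() if s.id == 'm5.large'][0]
    image = driver.get_image('ami-0123456789abcdef0')   # placeholder image id
    # In VMDIRAC the equivalent call is issued by a dedicated VM Director.
    return driver.create_node(name='dirac-worker', image=image, size=size)

def retire_vm(driver, node):
    """Discard a VM once it reports no more activity."""
    driver.destroy_node(node)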

VMDIRAC 2: VM submission
- Cloud endpoint abstraction
  - Implementations: Apache-libcloud, ROCCI, EC2
- CloudDirector, similar to the SiteDirector
- ToDo
  - Cloud endpoint testing/monitoring tools for site debugging
  - Follow the endpoint interface evolution
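
As an illustration of the endpoint abstraction, the sketch below shows what a uniform cloud-endpoint interface with an Apache-libcloud-backed implementation could look like. The class and method names are invented for the example and are not the actual VMDIRAC 2 classes.

# Sketch of a cloud endpoint abstraction: one interface, pluggable back-ends
# (Apache-libcloud, ROCCI, EC2). Names are illustrative, not the VMDIRAC 2 classes.
from abc import ABC, abstractmethod

class CloudEndpoint(ABC):
    """Uniform view of one cloud site, as a CloudDirector-style agent would use it."""

    @abstractmethod
    def create_instances(self, number, user_data):
        """Start `number` VMs contextualized with `user_data`; return their ids."""

    @abstractmethod
    def stop_instance(self, instance_id):
        """Halt a single VM."""

    @abstractmethod
    def list_instances(self):
        """Return the ids of the VMs currently running at this endpoint."""

class LibcloudEndpoint(CloudEndpoint):
    """Back-end delegating to an already configured apache-libcloud driver."""

    def __init__(self, driver, image, size):
        self.driver, self.image, self.size = driver, image, size
        self._nodes = {}

    def create_instances(self, number, user_data):
        ids = []
        for i in range(number):
            # ex_userdata is an extension argument of e.g. the OpenStack and EC2 drivers
            node = self.driver.create_node(name='dirac-vm-%d' % i,
                                           image=self.image, size=self.size,
                                           ex_userdata=user_data)
            self._nodes[node.id] = node
            ids.append(node.id)
        return ids

    def stop_instance(self, instance_id):
        self.driver.destroy_node(self._nodes.pop(instance_id))

    def list_instances(self):
        return list(self._nodes)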

VMDIRAC 2: VM contextualization (current)
- Standard minimal images
  - No DIRAC-specific images and no image maintenance costs, but …
- Cloud-init mechanism only
  - Using a passwordless certificate passed as user data (the mardirac.in2p3.fr host certificate)
- Using bootstrapping scripts similar to LHCb Vac/Vcycle
  - Using Pilot 2.0
  - On-the-fly installation of DIRAC, CVMFS, …
  - Takes time; can be improved with custom images
- Starting the VirtualMachineMonitorAgent
  - Monitors and reports the VM state, VM heartbeats
  - Halts the VM in case of no activity
  - Gets instructions from the central service, e.g. to halt the VM
- Starting as many pilots as there are cores (single-core jobs)
- Starting one pilot for
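
For illustration, the user data handed to cloud-init could be assembled roughly as in the sketch below and passed at VM creation time via libcloud's ex_userdata option. The bootstrap URL, file paths and the write_files/runcmd layout are placeholders, not the real DIRAC contextualization payload.

# Sketch: build a cloud-init user-data document that drops a passwordless host
# certificate on the VM and runs a bootstrap script on first boot.
# The script URL and file paths are hypothetical placeholders.
def make_user_data(cert_path='/etc/grid-security/hostcert.pem',
                   bootstrap_url='https://example.org/dirac-vm-bootstrap.sh'):
    """Return a #cloud-config document as a string."""
    with open(cert_path) as f:
        # cloud-init block scalars need the embedded content indented deeper than the key
        cert_lines = ['      ' + line.rstrip('\n') for line in f]
    lines = ['#cloud-config',
             'write_files:',
             '  - path: /root/hostcert.pem',
             "    permissions: '0600'",
             '    content: |']
    lines += cert_lines
    lines += ['runcmd:',
              '  - [ bash, -c, "curl -sL %s | bash" ]' % bootstrap_url]
    return '\n'.join(lines) + '\n'

# Hypothetical usage with an already configured libcloud driver:
# user_data = make_user_data()
# node = driver.create_node(name='dirac-vm', image=image, size=size,
#                           ex_userdata=user_data)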

VMDIRAC 2: VM contextualization (in the works)
- Bootstrapping scripts shared with the recently introduced Pilot package
- A single pilot per VM, capable of running multiple payloads, single- or multi-core
  - Same logic as for multi-core queues
- VMMonitor agent with enhanced logic
  - Halting on no activity
  - Signaling pilots to stop
  - Machine/Job Features
- The goal: a fully functional dynamic cloud computing resource allocation system, taking group fair shares into account
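
The "single pilot per VM running multiple payloads" idea can be sketched as a simple core-packing loop, the same reasoning as for multi-core queues. This is a toy illustration: match_payload is a hypothetical callable standing in for the real job-matching step, and payload objects with start()/finished()/cores attributes are assumed.

# Toy sketch of one pilot filling a whole VM with single- and multi-core payloads.
# 'match_payload' is a hypothetical stand-in for the real job-matching call; it is
# expected to return a payload fitting within 'free_cores', or None.
import multiprocessing
import time

def run_pilot(match_payload, total_cores=None, poll_interval=30):
    total_cores = total_cores or multiprocessing.cpu_count()
    free_cores = total_cores
    running = []                       # list of (payload, cores) pairs
    while True:
        # Reclaim cores from payloads that have finished.
        for payload, cores in list(running):
            if payload.finished():
                running.remove((payload, cores))
                free_cores += cores
        # Ask for more work while cores are free (single- or multi-core payloads).
        payload = match_payload(free_cores) if free_cores > 0 else None
        if payload is not None:
            running.append((payload, payload.cores))
            free_cores -= payload.cores
            payload.start()
        elif not running:
            break                      # nothing running, nothing matched: VM can be halted
        else:
            time.sleep(poll_interval)  # wait for running payloads to release cores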

VMDIRAC 2: VM web application
- Enhanced monitoring and accounting
  - No Google tools!
- VM manipulation by administrators
  - Start, halt, and other instructions to the VMMonitor agent
- Possibility to connect to a VM to debug problems
  - Web terminal console
  - On-the-fly public IP assignment

The supercomputer case

The supercomputer case
- Multiple HPC centers are available to large scientific communities
  - E.g., HEP experiments have started to get access to a number of HPC centers
  - Using traditional HTC applications, filling the gaps of empty slots
  - Including HPC into their data production systems
- Advantages of federating HPC centers
  - More users and applications for each center, hence better efficiency of usage
  - Elastic usage: users can get more resources for a limited time period
- Example: Partnership for Advanced Computing in Europe (PRACE)
  - Common agreements on sharing HPC resources
  - No common interware for uniform access

The supercomputer case
- Unlike grid sites, HPC centers are not uniform
  - Different access protocols
  - Different user authentication methods
  - Different batch systems
  - Different connectivity to the outside world
- To include HPC centers in a common infrastructure, we have to find a way to overcome these differences
  - Pilot agents can be very helpful here
  - This needs effort from both the interware and the HPC center sides

HPC example
- The pilot is submitted to the batch system through a (GSI)SSH tunnel
- The pilot communicates with the DIRAC services through the Gateway proxy service
- Output is uploaded to the target SE through the SE proxy
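
A hedged sketch of the submission leg of this setup: copy the pilot wrapper to the HPC login node and hand it to the local batch system over SSH. The host name, paths and the use of SLURM's sbatch are assumptions for the example; the Gateway and SE proxy legs are not shown.

# Sketch: submit a pilot to an HPC batch system through an SSH connection.
# Host name, user, paths and the use of SLURM's sbatch are assumptions.
import subprocess

HPC_LOGIN = 'user@hpc-login.example.org'

def submit_pilot(pilot_script='pilot.sh', remote_dir='/scratch/dirac'):
    # Copy the pilot wrapper to the login node...
    subprocess.run(['scp', pilot_script, f'{HPC_LOGIN}:{remote_dir}/'], check=True)
    # ...and submit it to the local batch system over SSH.
    result = subprocess.run(
        ['ssh', HPC_LOGIN, f'cd {remote_dir} && sbatch {pilot_script}'],
        check=True, capture_output=True, text=True)
    return result.stdout.strip()   # e.g. "Submitted batch job 12345"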

Co-design problem of distributed HPC
- Common requirements for HPC centers
  - Outside-world connectivity
  - User authentication
    - SSO schema with federated identity providers
    - Users representing whole communities
  - Application software provisioning
  - Monitoring and accounting
    - Can be delegated to the interware level
- Support from the interware
  - Common model for describing HPC resources
  - Algorithms for HPC workload management with more complex payload requirement specifications
  - Uniform user interface
- Support from applications
  - Allow running in multiple HPC centers, e.g. with standardized MPI libraries
  - Granularity
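
As a purely illustrative example, the "common model for describing HPC resources" could be a structured record like the one below; the field names are invented and do not correspond to an agreed DIRAC schema.

# Illustrative description of one HPC center in a hypothetical common model.
# Field names are invented; they only show the kind of information the
# interware would need for matching more complex payload requirements.
hpc_resource = {
    'Name': 'Mesocenter-AMU',
    'AccessProtocol': 'gsissh',            # how pilots are delivered
    'BatchSystem': 'SLURM',
    'OutboundConnectivity': False,         # needs the Gateway and SE proxies
    'SoftwareProvisioning': 'cvmfs',
    'MaxCoresPerNode': 32,
    'MaxNodesPerJob': 64,
    'MaxWallTimeHours': 24,
    'MPIFlavours': ['OpenMPI-3'],
}

def matches(resource, payload):
    """Toy matching of a payload's requirements against a resource description."""
    return (payload.get('Cores', 1) <= resource['MaxCoresPerNode'] * resource['MaxNodesPerJob']
            and payload.get('WallTimeHours', 1) <= resource['MaxWallTimeHours'])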

Towards an Open Distributed Supercomputer Infrastructure
- A common project involving several supercomputer centers
  - Lobachevsky, NNU
  - HybriLIT, JINR, Dubna
  - CC/IN2P3, Lyon
  - Mesocenter, AMU, Marseille
  - LRZ, …
- The goal is to provide the necessary components to include supercomputers in a common infrastructure
  - Together with other types of resources
  - Based on the DIRAC interware technology
- Several centers are already connected
  - Simple "grid"-like applications and multi-core applications
  - Multi-processor, multi-node applications are in the works

Publications
Topics: Workflows, Big Data, HPC, Clouds, COMDIRAC
- Workflows
  - High-level workflow treatment
  - Metadata in workflows
- Big Data ??
- HPC
  - WMS for HPC (reservation, masonry, multi-core, multi-host)
  - WMS for hybrid HPC/HTC/Cloud systems
- Clouds
  - Managing cloud resources with community policies/shares/quotas
- COMDIRAC
  - Interface to a distributed computer (FSDIRAC included?)