1
DIRAC services
2
Services
FG-DIRAC
  Maintenance, operation
  Practically all the members are involved
  How can this be presented to the benefit of the project?
  Testing ground?
DIRAC4EGI
  CPPM together with UB and Cyfronet offered to maintain the service
  Awaiting the EGI answer
  Should be involved?
  Playing ground for various activities, e.g. cloud management, COMDIRAC, data management
FG-DIRAC beyond France-Grilles
  Merge FG-DIRAC and DIRAC4EGI
  Keep logically separate but technically unique
  Service administration tools should be further developed
  Part of the contract?
3
The cloud case
4
Clouds
VM scheduler developed for Belle MC production system
  Dynamic VM spawning taking Amazon EC2 spot prices and Task Queue state into account
  Discarding VMs automatically when no longer needed
The DIRAC VM scheduler, by means of dedicated VM Directors, is interfaced to
  OCCI compliant clouds: OpenStack, OpenNebula
  Apache-libcloud API compliant clouds
  Amazon EC2
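The spawning rule on this slide (start VMs only while the spot price is acceptable and work is waiting, discard them when idle) can be sketched as below. This is an illustrative sketch, not the VMDIRAC scheduler code: the function names, the `jobs_per_vm` and `max_vms` parameters, and the price threshold are all assumptions made for the example.

```python
import math


def vms_to_start(waiting_jobs, running_vms, jobs_per_vm,
                 spot_price, max_price, max_vms):
    """Return how many new VMs to request in one scheduling cycle.

    All parameters are hypothetical: waiting_jobs stands for the Task Queue
    state, spot_price/max_price for the current and acceptable spot price.
    """
    if spot_price > max_price:
        return 0  # too expensive right now, wait for a better price
    wanted = math.ceil(waiting_jobs / jobs_per_vm)  # VMs needed for the backlog
    room = max_vms - running_vms                    # VMs still allowed by the cap
    return max(0, min(wanted, room))


def vms_to_discard(waiting_jobs, idle_vms):
    """Discard every idle VM once the Task Queue is empty."""
    return idle_vms if waiting_jobs == 0 else 0
```

For instance, with ten waiting jobs, four jobs per VM, two VMs running and a cap of five, the sketch asks for three more VMs as long as the spot price is at or below the threshold.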
5
VMDIRAC 2
VM submission
  Cloud endpoint abstraction
  Implementation: Apache-libcloud, ROCCI, EC2
  CloudDirector similar to SiteDirector
ToDo
  Cloud endpoint testing/monitoring tools for site debugging
  Follow the endpoint interface evolution
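The idea of a cloud endpoint abstraction can be illustrated with a small interface that a director talks to, regardless of the backend. The sketch below is an assumption about the general shape only: `CloudEndpoint`, `InMemoryEndpoint` and `submit_vms` are hypothetical names, not VMDIRAC classes, and the in-memory backend stands in for real Apache-libcloud, ROCCI or EC2 implementations so that the example runs without any cloud access.

```python
import itertools
from abc import ABC, abstractmethod
from typing import Dict, List


class CloudEndpoint(ABC):
    """Hypothetical uniform interface over different cloud backends."""

    @abstractmethod
    def create_instance(self, image: str, user_data: str) -> str:
        """Start one VM and return its identifier."""

    @abstractmethod
    def stop_instance(self, instance_id: str) -> None:
        """Stop the VM with the given identifier."""

    @abstractmethod
    def list_instances(self) -> List[str]:
        """Return identifiers of the VMs currently running."""


class InMemoryEndpoint(CloudEndpoint):
    """Fake backend used only to exercise the interface."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._counter = itertools.count(1)
        self._running: Dict[str, str] = {}

    def create_instance(self, image: str, user_data: str) -> str:
        instance_id = f"{self.name}-{next(self._counter)}"
        self._running[instance_id] = image
        return instance_id

    def stop_instance(self, instance_id: str) -> None:
        self._running.pop(instance_id, None)

    def list_instances(self) -> List[str]:
        return sorted(self._running)


def submit_vms(endpoint: CloudEndpoint, image: str,
               user_data: str, count: int) -> List[str]:
    """Director-side helper: works with any CloudEndpoint implementation."""
    return [endpoint.create_instance(image, user_data) for _ in range(count)]
```

With such an interface, the director logic stays the same when a new backend is added; only a new `CloudEndpoint` subclass is needed.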
6
VMDIRAC 2
VM contextualization (current)
  Standard minimal images
    No DIRAC proper images, no image maintenance costs, but …
  Cloudinit mechanism only
    Using a passwordless certificate passed as user data
    mardirac.in2p3.fr host certificate
  Using bootstrapping scripts similar to LHCb Vac/Vcycle
    Using pilot 2.0
  On the fly installation of DIRAC, CVMFS, …
    Takes time, can be improved with custom images
  Starting VirtualMachineMonitorAgent
    Monitor and report the VM state, VM heartbeats
    Halt the VM in case of no activity
    Getting instructions from the central service, e.g. to halt the VM
  Starting as many pilots as there are cores (single core jobs)
  Starting one pilot for
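The monitoring rules listed on this slide (halt on no activity, halt on instruction from the central service, one pilot per core for single core jobs) can be sketched as follows. The names `VMState`, `should_halt` and `pilots_to_start`, as well as the idle timeout parameter, are assumptions made for illustration and are not taken from the VirtualMachineMonitorAgent code.

```python
from dataclasses import dataclass


@dataclass
class VMState:
    """Hypothetical snapshot of what the monitor knows about its VM."""
    last_activity: float          # timestamp (seconds) of the last payload activity
    halt_requested: bool = False  # instruction received from the central service


def should_halt(state: VMState, now: float, idle_timeout: float) -> bool:
    """Halt if the central service asked for it or the VM has been idle too long."""
    if state.halt_requested:
        return True
    return now - state.last_activity >= idle_timeout


def pilots_to_start(cores: int) -> int:
    """One pilot per core for single core jobs, and at least one pilot."""
    return max(1, cores)
```

A monitor loop would call `should_halt` at every heartbeat and shut the VM down when it returns true.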
7
VMDIRAC 2
VM contextualization in the works
  Bootstrapping scripts shared with the Pilot package introduced recently
  Single pilot per VM capable of running multiple payloads, single or multi-core
    Same logic as for multi-core queues
  VMMonitor agent enhanced logic
    Halting on no activity
    Signaling pilots to stop
  Machine Job Features
The goal
  Make a fully functional dynamic cloud computing resource allocation system taking into account group fair shares
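As an illustration of what allocation "taking into account group fair shares" could mean, the sketch below hands out VM slots one at a time to the group that is furthest below its share and still has waiting work. This is one simple fair-share scheme chosen for the example, not the algorithm used by DIRAC; the function name and the `shares` and `demand` dictionaries are hypothetical.

```python
def allocate_slots(total_slots, shares, demand):
    """Split VM slots between groups in proportion to their shares.

    shares: group name -> relative share (positive number)
    demand: group name -> number of slots the group could use right now
    A group never receives more slots than it demands.
    """
    allocation = {group: 0 for group in shares}
    for _ in range(total_slots):
        # Groups that still have unmet demand and a usable share.
        candidates = [g for g in shares
                      if shares[g] > 0 and allocation[g] < demand.get(g, 0)]
        if not candidates:
            break  # nobody can use another slot
        # Pick the group with the lowest allocation relative to its share;
        # the group name breaks ties so the result is deterministic.
        group = min(candidates, key=lambda g: (allocation[g] / shares[g], g))
        allocation[group] += 1
    return allocation
```

With shares of 2 and 1 for groups `a` and `b`, both with plenty of waiting jobs, six slots are split as four for `a` and two for `b`.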
8
VMDIRAC 2
VM web application
  Enhanced monitoring, accounting
    No Google tools!
  VM manipulation by administrators
    Start, halt, other instructions to the VMMonitor agent
  Possibility to connect to VM to debug problems
    Web terminal console
    On the fly public IP assignment
9
The supercomputer case
10
The supercomputer case
Multiple HPC centers are available for large scientific communities
  E.g., HEP experiments started to have access to a number of HPC centers
    Using traditional HTC applications
    Filling in the gaps of empty slots
    Including HPC into their data production systems
Advantages of federating HPC centers
  More users and applications for each center: better efficiency of usage
  Elastic usage: users can have more resources for a limited time period
Example: Partnership for Advanced Computing in Europe, PRACE
  Common agreements on sharing HPC resources
  No common interware for a uniform access
11
The supercomputer case
Unlike grid sites, HPC centers are not uniform
  Different access protocols
  Different user authentication methods
  Different batch systems
  Different connectivity to outside world
If we want to include HPC centers into a common infrastructure we have to find a way to overcome these differences
  Pilot agents can be very helpful here
  Needs effort from both interware and HPC center sides
12
HPC example
  Pilot submitted to the batch system through a (GSI)SSH tunnel
  Pilot communicates with the DIRAC service through the Gateway proxy service
  Output upload to the target SE through the SE proxy
13
Co-design problem of distributed HPC
Common requirements for HPC
  Outside world connectivity
  User authentication
    SSO schema with federated identity providers
    Users representing whole communities
  Application software provisioning
  Monitoring, accounting
    Can be delegated to the Interware level
Support from interware
  Common model for HPC resources description
  Algorithms for HPC workload management with more complex payload requirements specification
  Uniform user interface
Support from applications
  Allow running in multiple HPC centers, e.g. standardized MPI libraries
  Granularity
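A "common model for HPC resources description" could, for example, record the properties in which centers differ (access protocol, batch system, outside connectivity) and let the interware derive how a pilot should operate there. The sketch below is a minimal illustration under that assumption; `HPCCenter`, its fields and `pilot_plan` are hypothetical names and do not describe an existing DIRAC schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HPCCenter:
    """Hypothetical description of one HPC center."""
    name: str
    access_protocol: str         # e.g. "ssh" or "gsissh"
    batch_system: str            # e.g. "slurm"
    outbound_connectivity: bool  # can worker nodes reach outside services?


def pilot_plan(center: HPCCenter) -> dict:
    """Derive how a pilot is submitted and how it reaches services and storage."""
    direct = center.outbound_connectivity
    return {
        "submit_via": center.access_protocol,
        "batch_system": center.batch_system,
        # Without outside connectivity the pilot goes through proxy services.
        "service_access": "direct" if direct else "gateway proxy",
        "output_upload": "direct" if direct else "SE proxy",
    }
```

A center without outbound connectivity would then be served through the gateway proxy and SE proxy, while a well connected center needs neither.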
14
Towards Open Distributed Supercomputer Infrastructure
A common project involving several supercomputer centers
  Lobachevsky, NNU
  HybriLIT, JINR, Dubna
  CC/IN2P3, Lyon
  Mesocenter, AMU, Marseille
  LRZ, …
The goal is to provide the necessary components to include supercomputers into a common infrastructure
  Together with other types of resources
  Based on the DIRAC interware technology
Several centers are already connected
  Simple “grid”-like applications, multi-core applications
  Multi-processor, multi-node applications are in the works
15
Publications
Workflows
  High level workflow treatment
  Metadata in workflows
Big Data
  ??
HPC
  WMS for HPC (reservation, masonry, multi-core, multi-host)
  WMS for hybrid HPC/HTC/Cloud systems
Clouds
  Managing cloud resources with community policies/shares/quotas
COMDIRAC
  Interface to a distributed computer (FSDIRAC included?)