Workflow Management Software For Tomorrow
Fernando Barreiro, Kaushik De, Alexei Klimentov, Tadashi Maeno, Danila Oleynik, Pavlo Svirin, Matteo Turilli
[Architecture diagram: physics group production requests and analysis tasks flow through ProdSys2/DEFT and JEDI into the PanDA server, which dispatches jobs via the pilot scheduler (condor-g, ARC interface) to pilots on EGEE/EGI, OSG, NDGF and HPC worker nodes; Rucio handles distributed data management, AMI/pyAMI the meta-data, backed by the DEFT and PanDA databases]
HPC cross-experiment discussion, CERN, May 10, 2019
Outline
- WFM SW evolution (HEP)
- WFM SW on HPC from non-LHC
Workflow Management. PanDA. Production and Distributed Analysis System
PanDA Brief Story
- 2005: Initiated for US ATLAS (BNL and UTA)
- 2006: Support for analysis
- 2008: Adopted ATLAS-wide
- 2009: First use beyond ATLAS
- 2011: Dynamic data caching based on usage and demand
- 2012: ASCR/HEP BigPanDA project
- 2014: Network-aware brokerage
- 2014: Job Execution and Definition I/F (JEDI) adds complex task management and fine-grained dynamic job management
- 2014: JEDI-based Event Service
- 2015: New ATLAS Production System, based on PanDA/JEDI
- 2015: Management of heterogeneous computing resources: HPCs and clouds
- 2016: DOE ASCR project
- 2016: PanDA for bioinformatics
- 2017: COMPASS adopted PanDA; NICA (JINR); PanDA beyond HEP: BlueBrain, IceCube, LQCD
Global ATLAS operations: up to ~800k concurrent job slots, 25-30M jobs/month at >250 sites, ~1400 ATLAS users. First exascale workload manager in HENP: 1.3+ exabytes processed every year, exascale scientific data processing today.
[BigPanDA Monitor plot: concurrent cores run by PanDA on big HPCs, Grid and clouds]
Future Challenges for WorkFlow(Load) Management Software
- New physics workflows and technologies: machine learning training, parallelization, vectorization, ... and also new ways of organizing Monte Carlo campaigns
- Address computing model evolution and new strategies ("provisioning for peak"):
  - Incorporating new architectures (TPU, GPU, RISC, FPGA, ARM, ...)
  - Leveraging new technologies (containerization, no-SQL analysis models, high data reduction frameworks, tracking, ...)
  - Integration with networks (via DDM, via IS and directly)
  - Data popularity -> event popularity
- Address future complexities in workflow handling:
  - Machine learning and task Time To Complete prediction (a toy sketch follows this list)
  - Monitoring, analytics, accounting and visualization
  - Granularity and data streaming
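One item in the list above, task time-to-complete prediction, is straightforward to prototype as a regression problem. The sketch below is purely illustrative: the features, the synthetic data and the choice of model are assumptions made here, not anything from the PanDA/BigPanDA analytics work.

```python
# Toy sketch of task time-to-complete (TTC) prediction.
# Features, synthetic data and model choice are illustrative assumptions only.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 500

# Hypothetical per-task features: number of jobs, events per job,
# and average load at the chosen sites.
X = np.column_stack([
    rng.integers(10, 1000, n),    # jobs in the task
    rng.integers(100, 10000, n),  # events per job
    rng.uniform(0.0, 1.0, n),     # site load fraction
])
# Synthetic "true" completion time in hours, only so the example runs end to end.
y = 0.01 * X[:, 0] + 0.0005 * X[:, 1] + 24.0 * X[:, 2] + rng.normal(0.0, 2.0, n)

model = GradientBoostingRegressor().fit(X[:400], y[:400])
mae = np.abs(model.predict(X[400:]) - y[400:]).mean()
print(f"held-out MAE: {mae:.2f} hours")
```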
Future development. Harvester Highlights
Primary objectives:
- To have a common machinery for diverse computing resources
- To provide a common layer bringing coherence to different HPC implementations
- To optimize workflow execution for diverse site capabilities
- To address the wide spectrum of computing resources/facilities available to ATLAS and to experiments in general
New model: PanDA server - Harvester - pilot
The project was launched in Dec 2016 (PI: T. Maeno)
Harvester Status

What is Harvester
- A bridge service between workload/data management systems and resources, allowing (quasi-)real-time communication between them
- Flexible deployment model to work with various operational restrictions, constraints and policies in those resources
  - E.g. local deployment on an edge node for HPCs behind multi-factor authentication; central deployment + SSH + RPC for HPCs without outbound network connections; stage-in/out plugins for various data transfer services/tools; messaging via shared file system; ...
- Experiments can use Harvester by implementing their own plug-ins; Harvester is not tightly coupled with PanDA (see the sketch after this list)

Current Status
- Architecture design, coding and implementation completed
- Commissioning ~done
- Deployed on a wide range of resources
  - Theta/ALCF, Cori/NERSC, Titan/OLCF in production
  - Summit/OLCF, MareNostrum4/BSC under testing
  - Also on Google Cloud and the Grid (~all ATLAS sites)
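The plug-in point mentioned above is Harvester's main extensibility mechanism: site-specific submission, stage-in/out and messaging behaviour is supplied as plug-ins while the core stays resource-agnostic. Below is a minimal sketch of what such a plug-in layer could look like; the class and method names (Submitter, submit, LocalBatchSubmitter, SshSubmitter) and the host name are illustrative assumptions, not the real Harvester interfaces.

```python
# Illustrative sketch of a plug-in based submission layer, loosely modelled on
# the Harvester idea above. All class and method names here are hypothetical;
# they are NOT the real Harvester interfaces.
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class WorkerSpec:
    """A request for one pilot/worker on some resource."""
    queue: str          # logical PanDA queue name
    n_cores: int
    walltime_min: int


class Submitter(ABC):
    """Interface a site-specific plug-in implements."""

    @abstractmethod
    def submit(self, spec: WorkerSpec) -> str:
        """Return the command that would start a worker for this spec."""


class LocalBatchSubmitter(Submitter):
    """Edge-node deployment: talk to the local batch system directly."""

    def submit(self, spec: WorkerSpec) -> str:
        return (f"sbatch -N 1 -c {spec.n_cores} -t {spec.walltime_min} "
                f"run_pilot.sh {spec.queue}")


class SshSubmitter(Submitter):
    """Central deployment for HPCs without outbound connectivity:
    relay the same submission over SSH from a central host."""

    def __init__(self, login_host: str):
        self.login_host = login_host

    def submit(self, spec: WorkerSpec) -> str:
        inner = LocalBatchSubmitter().submit(spec)
        return f'ssh {self.login_host} "{inner}"'


# The core would pick one plug-in per queue from configuration; here the
# commands are only printed, not executed.
plugins = {
    "CORI": LocalBatchSubmitter(),
    "SUMMIT": SshSubmitter("login.example.org"),  # hypothetical login host
}
for queue, plugin in plugins.items():
    print(plugin.submit(WorkerSpec(queue=queue, n_cores=64, walltime_min=120)))
```

A real deployment would execute these commands (or go through a CE) and track the resulting workers; the point here is only the separation between a resource-agnostic core and per-site plug-ins.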
PanDA/Harvester deployment for OLCF
[Deployment diagram spanning OLCF, CERN and BNL]
Harvester for tomorrow (HPC only)
- Full-chain technical validation with Yoda
  - Yoda: ATLAS Event Service with MPI functionality running on HPCs
- Yoda + Jumbo jobs in production
  - Jumbo jobs: relax input file boundaries, pick up any event from the dataset
- Two-hop data stage-out with Globus Online + Rucio
- Containers integration
- Implementation of a capability to use CPU and GPU simultaneously within one node for MC and ML payloads (a toy packing sketch follows this list)
- Implementation of a capability to dynamically shape payloads based on real-time resource information
- Centralization of Harvester instances using CE, SSH, MOM, ...
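The CPU/GPU item above is essentially a node-level packing decision: keep the GPUs busy with ML payloads and hand the leftover cores to MC simulation. The toy sketch below shows that decision in isolation; the node geometry and payload requirements are made-up numbers, not Harvester or Yoda code.

```python
# Toy sketch of co-scheduling CPU (MC) and GPU (ML) payloads on one node.
# Node geometry and payload requirements are invented for illustration.
from dataclasses import dataclass


@dataclass
class Node:
    cores: int
    gpus: int


@dataclass
class Payload:
    name: str
    cores: int
    gpus: int


def pack_node(node: Node, ml: Payload, mc: Payload) -> list[Payload]:
    """Fill the GPUs with ML payloads first, then give every remaining core
    to MC simulation payloads."""
    plan = []
    cores_left = node.cores

    n_ml = node.gpus // ml.gpus if ml.gpus else 0
    for i in range(n_ml):
        plan.append(Payload(f"{ml.name}-{i}", ml.cores, ml.gpus))
        cores_left -= ml.cores

    n_mc = max(cores_left, 0) // mc.cores
    for i in range(n_mc):
        plan.append(Payload(f"{mc.name}-{i}", mc.cores, 0))

    return plan


# Example with assumed numbers: 42 usable cores and 6 GPUs per node.
for p in pack_node(Node(cores=42, gpus=6),
                   Payload("ml-training", cores=1, gpus=1),
                   Payload("mc-sim", cores=4, gpus=0)):
    print(p)
```

The dynamic-shaping item would then amount to re-running this kind of packing decision as real-time resource information changes, rather than fixing the split once per batch job.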
Simulation Science
[Slide comparing compute growth across HEP, simulation science and neuroscience]
"ENIAC": ~10^3 FLOPS -> "Summit": ~10^17 FLOPS
Computing has seen an unparalleled exponential development: in the last decades supercomputer performance grew 1000x every ~10 years. Almost all scientific disciplines have long embraced this capability.
Original slide from F. Schürmann (EPFL)
Pegasus WFMS/PanDA Collaboration started in October 2018
- December 2018: first working prototype, standard Pegasus example (split document / word count) tested on Titan (see the sketch after this list)
- Future plans / possible applications / open questions:
  - Test the same workflow in a heterogeneous environment: Grid + HPC (Summit/NERSC/...) with data transfer via Rucio or other tools
  - Possible application: data transfer for LQCD jobs from/to OLCF storage
  - Currently the Pegasus/PanDA integration works at the job level; how to integrate Pegasus with other PanDA components such as JEDI is still TBD
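For reference, the split/word-count example mentioned above is a small three-stage task graph: split the input into chunks, count words in each chunk in parallel, then merge the counts. The sketch below only reproduces that graph shape in plain Python; it does not use the Pegasus or PanDA APIs, and in the prototype each node of the graph would instead become a PanDA job brokered to Titan or the Grid.

```python
# Plain-Python sketch of the split / word-count task graph shape.
# It does NOT use the Pegasus or PanDA APIs; it only shows the three stages
# that those systems would broker as jobs.
TEXT = "workflow management software for tomorrow " * 100


def split(text: str, n: int) -> list[str]:
    """Stage 1: split the document into n chunks (last chunk takes the remainder)."""
    words = text.split()
    step = len(words) // n
    return [" ".join(words[i * step:(i + 1) * step if i < n - 1 else None])
            for i in range(n)]


def wc(chunk: str) -> int:
    """Stage 2: count words in one chunk (independent, parallelizable)."""
    return len(chunk.split())


def merge(counts: list[int]) -> int:
    """Stage 3: combine the per-chunk counts."""
    return sum(counts)


chunks = split(TEXT, n=4)
counts = [wc(c) for c in chunks]
print("total words:", merge(counts))
```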
Next Generation Executor Project (Rutgers U)
- Schedules and runs multiple tasks concurrently and consecutively in one or more batch jobs:
  - Tasks are individual programs
  - Tasks are executed within the walltime of each batch job
- Late binding: tasks are scheduled and then placed within a batch job at runtime (a toy sketch follows this list)
- Task and resource heterogeneity: scheduling, placing and running CPU/GPU/OpenMM/MPI tasks in the same batch job
  - Use single/multiple CPUs/GPUs for the same task and across multiple tasks
- Supports multiple HPC machines; requires limited development to support a new HPC machine
- Use cases: molecular dynamics and HEP payloads on Summit
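The late-binding item above (tasks placed into an already-running batch job, constrained by its remaining walltime) can be illustrated with a simple greedy loop. The sketch ignores concurrency and models only consecutive placement; the task names and durations are invented, and this is not the Next Generation Executor implementation.

```python
# Toy late-binding scheduler: place tasks into a running batch job only if
# they fit in the remaining walltime. Task names and durations are invented;
# this is not the Next Generation Executor code.
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    runtime_min: int
    name: str = field(compare=False)


def late_bind(tasks: list[Task], walltime_min: int) -> tuple[list[str], list[str]]:
    """Greedy first-fit-decreasing: run the longest tasks that still fit,
    defer the rest to the next batch job."""
    remaining = walltime_min
    placed, deferred = [], []
    for t in sorted(tasks, reverse=True):  # longest first
        if t.runtime_min <= remaining:
            placed.append(t.name)
            remaining -= t.runtime_min
        else:
            deferred.append(t.name)
    return placed, deferred


placed, deferred = late_bind(
    [Task(90, "md-equilibration"), Task(45, "hep-sim"),
     Task(30, "ml-train"), Task(120, "md-production")],
    walltime_min=180,
)
print("run in this batch job:", placed)
print("defer to the next batch job:", deferred)
```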
Status and near-term plans:
- Test runs with bulk task submission with Harvester and NGE
- Address bulk and performant submission (currently ~320 units in 4000 sec)
- Run at scale on Summit once issues with jsrun are addressed by IBM
- Conduct submission with relevant workloads from MD and HEP