1
Event Service
Wen Guan, University of Wisconsin
2
Content
Event Service
– Event Service Introduction
– Event Service queue setup
– Event Service Monitor
Yoda: Event Service on HPC
– Yoda on HPC
– Yoda on Edison
– Yoda on ARC
3
What is Event Service
Event-level processing
4
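Event Service Processing
[Diagram: the Pilot calls getJob on PanDA, then repeats getEvents, process, stage-out to the S3 objectstore, and updateEvent. The job's events are handed out in ranges: (1-10), (10-20) … (990-1000). A separate merge job is run by a Pilot: getJob, stage-in, merge, stage-out to dCache/dpm… storage.]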
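The event ranges named on this slide can be illustrated with a short sketch. It is not PanDA code: the function name `split_event_ranges` is hypothetical, and it assumes inclusive, non-overlapping bounds, whereas the slide labels its ranges (1-10), (10-20) and so on.

```python
def split_event_ranges(first_event, last_event, range_size):
    """Split [first_event, last_event] into consecutive inclusive ranges.

    Hypothetical helper for illustration only; the real range
    bookkeeping is done by PanDA and is not shown in the slides.
    """
    if range_size <= 0:
        raise ValueError("range_size must be positive")
    ranges = []
    start = first_event
    while start <= last_event:
        # The last range is shortened if the total is not a multiple
        # of range_size.
        end = min(start + range_size - 1, last_event)
        ranges.append((start, end))
        start = end + 1
    return ranges


if __name__ == "__main__":
    ranges = split_event_ranges(1, 1000, 10)
    print(len(ranges), ranges[0], ranges[-1])
```

Each tuple is one unit of work that a Pilot could fetch, process, stage out, and report back independently of the others.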
5
Difference between ES Job and Normal Job
Pilot runs getJob to request work from PanDA.
PanDA returns a payload, which can be normal or ES work.
– eventService=True for an ES job.
– A normal job doesn't have it.
Pilot parses the payload.
Pilot automatically selects a different process for each job type.
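A minimal sketch of that selection step, under stated assumptions: the payload is modelled as a plain dict, the flag may arrive as a boolean or as the string "True", and the function names are hypothetical. The actual Pilot code is not shown in the slides.

```python
def is_event_service_job(payload):
    """Return True if the payload carries eventService=True.

    Assumption: the payload is a dict and the flag may be a bool or
    a string such as "True". A normal job simply lacks the key.
    """
    value = payload.get("eventService", False)
    return str(value).strip().lower() == "true"


def select_process(payload):
    """Pick a processing path; the two labels are illustrative only."""
    return "event_service" if is_event_service_job(payload) else "normal"


if __name__ == "__main__":
    print(select_process({"eventService": True}))
    print(select_process({"jobName": "plain-job"}))
```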
6
Define ES Queue
Differences from a normal queue:
– corecount can be 1; it cannot be None.
– catchall: localEsMerge
   jobseed=es or std (non-ES) or all (ES and non-ES)
– jobseed is used by PanDA to schedule ES jobs to the queue.
– Attach an objectstore (OS)
   If no OS is attached, ES jobs will fail.
   In AGIS, associate an OS with the queue.
   A default OS is already attached to a queue.
Example: https://atlas-agis.cern.ch/agis/pandaqueue/detail/Arizona_ES/full/
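The queue requirements above can be expressed as a small pre-flight check. This is only a sketch: the dict keys (`corecount`, `jobseed`, `objectstores`) and the function `check_es_queue` are assumptions made for illustration and do not reflect the AGIS schema or API.

```python
VALID_JOBSEEDS = {"es", "std", "all"}


def check_es_queue(queue):
    """Return a list of problems that would keep ES jobs off this queue.

    `queue` is a hypothetical dict standing in for a queue definition.
    """
    problems = []
    if queue.get("corecount") is None:
        problems.append("corecount must be set (1 is allowed, None is not)")
    jobseed = queue.get("jobseed")
    if jobseed not in VALID_JOBSEEDS:
        problems.append("jobseed must be one of es, std, all")
    elif jobseed == "std":
        problems.append("jobseed=std accepts only non-ES jobs")
    if not queue.get("objectstores"):
        problems.append("no objectstore attached; ES jobs would fail")
    return problems


if __name__ == "__main__":
    good = {"corecount": 1, "jobseed": "es", "objectstores": ["some-os"]}
    bad = {"corecount": None, "jobseed": "std", "objectstores": []}
    print(check_es_queue(good))
    print(check_es_queue(bad))
```

An empty list means the sketch found nothing that contradicts the rules on this slide; it says nothing about other queue settings.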
7
Attach OS to queue (1)
8
Attach OS to queue (2)
9
Event Service Monitor
11
Event Service Monitor
12
Event Service Monitor
13
Summary
Easy to set up an ES queue.
Documentation available, comments on it welcome
– https://twiki.cern.ch/twiki/bin/view/PanDA/EventServiceOperations
   Including OS setup for a queue. Also some debug info.
– https://twiki.cern.ch/twiki/bin/view/PanDA/EventServer
Help:
– atlas-comp-event-service@cern.ch
Already many ES queues
14
Content
Event Service
– Event Service Introduction
– Event Service queue setup
– Event Service Monitor
Yoda: Event Service on HPC
– Yoda on HPC
– Yoda on Edison
– Yoda on ARC
15
Yoda on HPC
Purpose
– Make use of HPC with many CPUs in one job.
– No outbound internet connection
   This prevents us from running conventional ES.
Yoda:
– Run ES as a single MPI job.
16
Schematic view of Yoda
17
Yoda on NERSC (in production)
[Diagram. Frontend (login machine): the Pilot (RunJobHPCEvent) performs getJob (from PanDA), stageIn, getEventRanges, getOutputs (from HPCManager), and stageOut; the HPCManager with its Slurm plugin submits the job, polls the job, and polls outputs. HPC cluster: the HPCJob runs Yoda on rank 0 and a Droid on each of ranks 1 to n. Shared file system: input files, PFC, job.json, events.json, and outputs.]
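The Yoda/Droid split in this layout can be sketched without a cluster. In the toy below, Python threads stand in for MPI ranks and an in-memory queue stands in for the shared file system. The pull-based hand-out of event ranges and the names `droid` and `run_yoda` are assumptions for illustration; the slides do not describe how Yoda actually distributes work.

```python
import queue
import threading


def droid(rank, work, outputs):
    """Hypothetical Droid: take event ranges until none are left."""
    while True:
        try:
            event_range = work.get_nowait()
        except queue.Empty:
            return
        # Stand-in for real event processing: just record who did what.
        outputs.put((rank, event_range))


def run_yoda(event_ranges, n_droids):
    """Hypothetical Yoda (rank 0): publish ranges, then collect outputs."""
    work = queue.Queue()
    for event_range in event_ranges:
        work.put(event_range)
    outputs = queue.Queue()
    droids = [
        threading.Thread(target=droid, args=(rank, work, outputs))
        for rank in range(1, n_droids + 1)
    ]
    for thread in droids:
        thread.start()
    for thread in droids:
        thread.join()
    results = []
    while not outputs.empty():
        results.append(outputs.get())
    return results


if __name__ == "__main__":
    for rank, event_range in run_yoda([(1, 10), (11, 20), (21, 30)], 2):
        print("rank", rank, "processed", event_range)
```

Which rank handles which range varies from run to run; only the set of processed ranges is fixed. The real system runs as a single MPI job rather than as threads.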
18
Yoda on ARC (testing)
[Diagram. Frontend (login machine): RunJobHPCEvent performs getJob (from PanDA), stageIn, getEventRanges, getOutputs (from HPCManager), and stageOut; the HPCManager with its Slurm plugin submits the job, polls the job, and polls outputs. CE: the HPCManager with its MPI plugin, using mpirun. HPC cluster: the HPCJob runs Yoda on rank 0 and a Droid on each of ranks 1 to n. Shared file system: input files, job.json, events.json, and outputs.]
19
Yoda on ARC (testing)
[Diagram. ARC Control Tower and CE in front of the HPC cluster. HPC cluster: the HPCJob runs Yoda on rank 0 and a Droid on each of ranks 1 to n. Shared file system: Pilot, input files, job.json, events.json, and outputs.]
Releases the interactive node
20
Summary
Yoda is an ES solution on HPC
Production running on NERSC.
– Since last year on Edison HPC.
– Switched from PBS to Slurm on Edison.
– Tested on the new NERSC Cori system.
Yoda on ARC.
– Releases the interactive node.
– Simulated on NERSC Edison.
– Integration testing with ARC-CT.
– Will be tested on ARC sites.