ALICE Computing Model in Run3


1 ALICE Computing Model in Run3
Predrag Buncic CERN

2 The ALICE Online-Offline Project
Run 3
[Diagram labels: Multiplexer, Computer Farm, Complete Events, Memory Matrix]
Continuous detector reading: replace events with time windows (20 ms, 1000 events).
Predrag Buncic, The ALICE Online-Offline Project

3 The ALICE Online-Offline Project
O2 TDR

4 The ALICE Online-Offline Project
Run 3
[Diagram labels: 6 TB/s, 90 GB/s]
2.4 MW computing facility
100 k CPU cores, 5000 GPUs, 500 FPGAs
60 PB local disk buffer at P2
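As a rough reading of these figures, the sketch below computes the ratio between the two quoted rates and the time a 60 PB buffer would take to fill at 90 GB/s. It assumes, which the slide does not state, that 6 TB/s is the rate entering the facility and 90 GB/s is the rate written to the disk buffer, and that units are decimal (1 PB = 10^6 GB). The function names are illustrative only.

```python
# Back-of-the-envelope arithmetic on the slide's figures.
# ASSUMPTION: 6 TB/s is the input rate and 90 GB/s the rate written to disk;
# the slide shows the two numbers without labels.
INPUT_RATE_GB_S = 6_000.0   # 6 TB/s expressed in GB/s
OUTPUT_RATE_GB_S = 90.0     # 90 GB/s
BUFFER_PB = 60.0            # local disk buffer at P2

SECONDS_PER_DAY = 86_400
GB_PER_PB = 1_000_000       # decimal units assumed


def reduction_factor(rate_in_gb_s: float, rate_out_gb_s: float) -> float:
    """Ratio between the incoming and outgoing data rates."""
    return rate_in_gb_s / rate_out_gb_s


def fill_time_days(buffer_pb: float, rate_gb_s: float) -> float:
    """Days needed to fill the buffer when writing continuously at rate_gb_s."""
    return buffer_pb * GB_PER_PB / rate_gb_s / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"reduction factor: {reduction_factor(INPUT_RATE_GB_S, OUTPUT_RATE_GB_S):.1f}")
    print(f"buffer fill time: {fill_time_days(BUFFER_PB, OUTPUT_RATE_GB_S):.1f} days")
```

Under these assumptions the rates differ by a factor of about 67, and the buffer would hold a little under eight days of continuous writing.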

5 The ALICE Online-Offline Project
How to keep the cost down?
x100 Event rate
x10 Storage
Software performance
Use of hardware accelerators
Parameterized MC
Minimize remote I/O
Specialize sites for given workflow

6 The ALICE Online-Offline Project
What will change in Run 3?
[Diagram: facilities and their workflows]
  O2: RAW -> CTF -> ESD -> AOD
  T0/T1: CTF -> ESD -> AOD
  T2/HPC: MC -> CTF -> ESD
  AF: AOD -> HISTO, TREE
  [Other diagram labels: Reconstruction, Calibration, Analysis, Archiving, Simulation; multiplicities 1, 1..n, 1..3; data flows CTF, AOD]
Grid Tiers will be mostly specialized for a given role:
  O2 facility (2/3 of reconstruction and calibration)
  T1s (1/3 of reconstruction and calibration, archiving to tape)
  T2s (simulation)
All AODs will be collected on the specialized Analysis Facilities (AF), capable of processing ~5 PB of data within a ½ day timescale.
The goal is to minimize data movement and optimize processing efficiency.
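The role assignment described on this slide can be written down as a small lookup table. The following is a minimal sketch, not any actual ALICE configuration: the dictionary layout and the helper names `facilities_for` and `shares_are_complete` are hypothetical, and only the roles and the 2/3 and 1/3 shares come from the slide text.

```python
from fractions import Fraction

# Roles per facility, taken from the slide text.
# The data layout itself is illustrative, not an ALICE format.
ROLES = {
    "O2": {"reconstruction", "calibration"},
    "T1": {"reconstruction", "calibration", "archiving"},
    "T2": {"simulation"},
    "AF": {"analysis"},
}

# Share of reconstruction and calibration work per facility (from the slide).
RECO_SHARE = {"O2": Fraction(2, 3), "T1": Fraction(1, 3)}


def facilities_for(role: str) -> list[str]:
    """Return the facilities that are assigned the given role, sorted by name."""
    return sorted(name for name, roles in ROLES.items() if role in roles)


def shares_are_complete(shares: dict[str, Fraction]) -> bool:
    """Check that the listed shares account for all of the work."""
    return sum(shares.values()) == 1


if __name__ == "__main__":
    for role in ("reconstruction", "archiving", "simulation", "analysis"):
        print(role, "->", facilities_for(role))
    print("shares sum to 1:", shares_are_complete(RECO_SHARE))
```

Using exact fractions avoids floating-point rounding when checking that the two shares add up to the whole.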

7 The ALICE Online-Offline Project
Evolution of T2s
[Plots: CPU (cores), Disk (PB)]
We expect Grid resources (CPU, storage) to continue growing at a rate of 20% per year.
Since T2s will be used almost exclusively for simulation jobs (no input), and the resulting AODs will be exported to T1s/AFs, we expect storage needs on T2s to be significantly lower.
Simulation will remain the main consumer of CPU cycles.
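To make the 20% per year figure concrete, the sketch below applies compound growth to a normalized starting capacity. The starting value of 1.0 and the function name `projected` are placeholders; the slide's plots give the actual core and disk numbers, which are not reproduced in this transcript.

```python
def projected(initial: float, years: int, rate: float = 0.20) -> float:
    """Capacity after `years` of compound growth at `rate` per year."""
    return initial * (1.0 + rate) ** years


if __name__ == "__main__":
    # Normalized capacity: 1.0 stands for whatever is installed in year 0.
    for year in range(6):
        print(f"year {year}: x{projected(1.0, year):.2f}")
```

At this rate, capacity grows by roughly a factor of 2.5 over five years.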

8 The ALICE Online-Offline Project
Computing shares in Run 3
[Chart labels: 2/3, 1/3]
The relative importance of O2 vs. the Grid may change with time if Grid resources continue to grow as expected.
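One way to see how the balance could shift is the sketch below. It rests on several assumptions that the slide does not state: that the 2/3 share belongs to O2 and the 1/3 share to the Grid, that O2 capacity stays constant, and that the Grid grows at the 20% per year quoted on the previous slide. The function names are illustrative.

```python
import math


def grid_share(years: float, grid0: float = 1 / 3, o2: float = 2 / 3,
               rate: float = 0.20) -> float:
    """Grid fraction of total capacity after `years`.

    ASSUMPTIONS: O2 capacity is fixed at `o2`; Grid capacity starts at
    `grid0` and grows by `rate` per year.
    """
    grid = grid0 * (1.0 + rate) ** years
    return grid / (grid + o2)


def years_to_parity(grid0: float = 1 / 3, o2: float = 2 / 3,
                    rate: float = 0.20) -> float:
    """Years until Grid capacity equals O2 capacity under the same assumptions."""
    return math.log(o2 / grid0) / math.log(1.0 + rate)


if __name__ == "__main__":
    for year in range(0, 6):
        print(f"year {year}: Grid share {grid_share(year):.2f}")
    print(f"parity after {years_to_parity():.1f} years")
```

Under these assumptions the Grid would need to double, which takes about 3.8 years at 20% per year.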

9 The ALICE Online-Offline Project
Analysis Facilities
Motivation
  Analysis is the least efficient of all workloads that we run on the Grid.
  It is I/O bound, in spite of attempts to make it more efficient by using analysis trains.
Requirements
  20,000-30,000 cores and 5-10 PB of disk on a high-performance file system.
  To run organized analysis on local data, as we do today on the Grid, while using the available resources efficiently, we need to adapt and develop our analysis software.
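The file-system requirement can be estimated from numbers given in the talk: ~5 PB processed within half a day (from the "What will change in Run 3?" slide) on 20,000 to 30,000 cores. The sketch below is a rough estimate that assumes a single pass over the data, a half day of 12 hours, and decimal units; the function names are illustrative.

```python
GB_PER_PB = 1_000_000   # decimal units assumed
MB_PER_GB = 1_000


def aggregate_gb_s(data_pb: float, hours: float) -> float:
    """Sustained read rate needed to scan `data_pb` once in `hours`."""
    return data_pb * GB_PER_PB / (hours * 3600)


def per_core_mb_s(data_pb: float, hours: float, cores: int) -> float:
    """The same rate, shared evenly across `cores`."""
    return aggregate_gb_s(data_pb, hours) * MB_PER_GB / cores


if __name__ == "__main__":
    print(f"aggregate: {aggregate_gb_s(5, 12):.0f} GB/s")
    for cores in (20_000, 30_000):
        print(f"{cores} cores: {per_core_mb_s(5, 12, cores):.1f} MB/s per core")
```

This gives an aggregate rate on the order of 100 GB/s, or a few MB/s per core, which illustrates why the slide asks for a very performant file system.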

10 The ALICE Online-Offline Project
Reducing complexity: Data Management
Federating storage
Each cloud/region provides reliable data management and sufficient processing capability.
Whatever happens in a region stays in that region.

11 The ALICE Online-Offline Project
Reducing complexity: Workload Management
[Diagram labels: HPC; MC -> CTF; T1; Simulation Facility; AOD; ALICE GRID; AOD; HPC; AOD -> HISTO, TREE; Analysis Facility]
No need to track every job
Localizing execution of complex workflows
For this to work, batch systems would have to be able to execute complex workflows (many of them can, but we do not use them that way in the Grid context).

12 The ALICE Online-Offline Project
Top 5 concerns

