Predrag Buncic — Future IT Challenges for ALICE — Technical Workshop, November 6, 2015
Upgrade Roadmap
LHC schedule: Run 1 → Run 2 → Run 3 (major ALICE and LHCb upgrades) → Run 4 (major ATLAS and CMS upgrades)
The ALICE Experiment
Event display from a real Pb–Pb collision: each line is a particle traversing the ALICE apparatus, leaving a signal in the detectors it crosses.
Towards Higher Luminosities
- LHC proton–proton now: luminosity 7 × … cm^-2 s^-1, 25 ns bunch spacing
- LHC proton–proton after 2018: luminosity 4 × … cm^-2 s^-1
Data Acquisition (DAQ) Design Concept
- Acquire data from tens of millions of channels
- Store them in a matrix of hundreds of memories
- Multiplex to a computer farm
- Assemble and store data from the same event
(Diagram: detector channels → multiplexer → memory matrix → computer farm → complete events)
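The event-assembly step above can be sketched as grouping fragments by event ID. The channel count and fragment format here are purely illustrative, not the real DAQ interfaces:

```python
from collections import defaultdict

def build_events(fragments, n_channels=4):
    """Group detector fragments by event ID and return complete events.

    `fragments` is a list of (event_id, channel_id, payload) tuples;
    `n_channels` is the (hypothetical) number of channels an event must
    cover — a real detector has tens of millions.
    """
    events = defaultdict(dict)
    for event_id, channel_id, payload in fragments:
        events[event_id][channel_id] = payload
    # An event is "complete" once every channel has contributed a fragment
    return {eid: chans for eid, chans in events.items()
            if len(chans) == n_channels}
```

Incomplete events stay buffered (here they are simply dropped from the result), which is why the memory matrix in the diagram must hold hundreds of events in flight.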
Paradigm Shift: Continuous Readout
- Continuous detector readout: replace events with time windows (20 ms, … events), each a self-sufficient small dataset (~10 GB)
- Calibrate and reconstruct online: reduce the data volume and structure the data
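Slicing a continuous stream into fixed 20 ms time frames can be sketched as follows; the hit representation is a made-up stand-in for the real detector data format:

```python
def slice_into_timeframes(hits, frame_ms=20.0):
    """Partition continuously read-out hits into fixed-length time frames.

    `hits` is an iterable of (timestamp_ms, data) pairs; frame k is the
    self-contained dataset covering [k * frame_ms, (k + 1) * frame_ms).
    """
    frames = {}
    for t, data in hits:
        k = int(t // frame_ms)
        frames.setdefault(k, []).append((t, data))
    return frames
```

The key property is that each frame can be calibrated and reconstructed on its own, so frames can be dispatched independently to the processing farm.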
Run 3
Detector readout at 1.1 TB/s into the combined Online/Offline (O2) facility; 90 GB/s to storage after a x12 compression factor.
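A quick sanity check that the two rates on the slide are consistent with the quoted x12 compression factor:

```python
# Back-of-the-envelope check of the Run 3 data-reduction factor
readout_rate = 1.1e12   # detector readout, bytes/s (1.1 TB/s)
storage_rate = 90e9     # rate to storage, bytes/s (90 GB/s)
factor = readout_rate / storage_rate
print(round(factor, 1))  # ≈ 12.2, consistent with the quoted x12
```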
O2 Facility
- 60 PB of disk
- 10^5 CPUs
- 5000 GPUs
Roles of Tiers in Run 3
- O2 facility: reconstruction and calibration (RAW -> CTF -> ESD -> AOD)
- T0/T1: reconstruction, calibration, archiving (CTF -> ESD -> AOD)
- T2/HPC: simulation (MC -> CTF -> ESD -> AOD)
- Analysis Facilities (AF): analysis (AOD -> HISTO, TREE)
- CTF and AOD data are exchanged between the tiers
O2 Software Framework
- ALFA: a common development between the GSI/FAIR experiments and ALICE
- Distributed, multi-process application
- Each process can be multi-threaded and possibly adapted to use hardware accelerators
- Message queues for data exchange
(Diagram: tasks T1–T6 process input file(s) into an output file between times t0 and t1)
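The message-queue topology can be illustrated with a minimal staged pipeline. This sketch uses Python threads and standard-library queues purely to show the pattern; ALFA itself connects separate (possibly multi-threaded) processes via a dedicated messaging layer:

```python
import queue
import threading

def stage(q_in, q_out, transform):
    """One pipeline device: receive a message, transform it, forward it.
    None is used as an end-of-stream marker."""
    while (msg := q_in.get()) is not None:
        q_out.put(transform(msg))
    q_out.put(None)

def run_pipeline(data, transforms):
    """Wire one device per transform, connected by message queues."""
    queues = [queue.Queue() for _ in range(len(transforms) + 1)]
    devices = [threading.Thread(target=stage,
                                args=(queues[i], queues[i + 1], f))
               for i, f in enumerate(transforms)]
    for d in devices:
        d.start()
    for item in data:          # feed the first queue
        queues[0].put(item)
    queues[0].put(None)
    results = []
    while (msg := queues[-1].get()) is not None:
        results.append(msg)    # drain the last queue
    for d in devices:
        d.join()
    return results
```

Because stages only share data through queues, each one can be scaled, replaced, or moved to accelerator-equipped hardware without touching the others.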
Summary of Future IT Challenges for ALICE
- Detector readout at 1.1 TB/s and early data reduction in FLPs
  - FPGA-based data transfer, cluster finding and data compression
  - FPGA programming, tools and training
- FLP-to-EPN data transfer at 500 GB/s
  - Network modeling, simulation and cost optimization
- Lossy data compression on EPNs, requiring precise quasi-online calibration and time-frame-based reconstruction
  - Low-latency database (key-value store) for calibration
- Using ~10^5 cores and 5000 GPUs for synchronous processing
  - Management and software deployment/scheduling on heterogeneous clusters
  - Use of lightweight virtualization (containers): CernVM-like Docker containers?
- Provisioning and maintaining a 60 PB disk buffer to store compressed time frames and ensure subsequent calibration/reconstruction iterations
  - EOS or a cluster file system?
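A key-value calibration store with time validity could look roughly like this toy sketch; the class and interval scheme are hypothetical illustrations of the access pattern, not the actual O2 database design:

```python
import bisect

class CalibrationStore:
    """Toy key-value store for calibration objects with time validity.

    For each key, entries are kept sorted by their start-of-validity
    timestamp; a lookup returns the latest entry valid at the given time.
    """
    def __init__(self):
        self._data = {}  # key -> (sorted start times, payloads)

    def put(self, key, valid_from, payload):
        starts, payloads = self._data.setdefault(key, ([], []))
        i = bisect.bisect(starts, valid_from)
        starts.insert(i, valid_from)
        payloads.insert(i, payload)

    def get(self, key, at_time):
        starts, payloads = self._data[key]
        i = bisect.bisect(starts, at_time) - 1  # last start <= at_time
        if i < 0:
            raise KeyError(f"no {key!r} calibration valid at {at_time}")
        return payloads[i]
```

Reconstruction of a time frame then reduces to one low-latency lookup per detector and key, which is why a simple key-value interface is attractive here.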
Summary of Future IT Challenges for ALICE (continued)
- Extending O2-style computing to the Grid
  - Optimizing scheduling and throughput of distributed applications
  - Modeling the system to optimize cost and performance
- Using dedicated facilities or re-purposed T1/T2 sites for analysis
  - HPC-style big-data processing, 5 PB/day
- Software framework developments
  - Performance at all levels
  - Use of hardware accelerators
  - Efficient streaming I/O
  - Configuring and deploying distributed multi-process applications
- Simulation
  - Developing fast simulation tools
  - Use of opportunistic and HPC resources
- O2 facility design and implementation
  - A 2.4 MW facility on a limited budget
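Translating the quoted "5 PB/day" of analysis input into a sustained rate:

```python
# What does 5 PB/day of analysis input mean as a sustained data rate?
bytes_per_day = 5e15
seconds_per_day = 24 * 3600
rate_gb_s = bytes_per_day / seconds_per_day / 1e9
print(round(rate_gb_s))  # ~58 GB/s, sustained around the clock
```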