ALICE Upgrade for Run3: Computing
HL-LHC Trigger, Online and Offline Computing Working Group Topical Workshop, Sep 5th 2014

Timeline and resources considerations
ALICE upgrade in Run3:
– CPU needs proportional to track multiplicity and pileup
– Storage needs proportional to accumulated luminosity
ALICE upgrade basic estimates:
– Event rate: 50 kHz (Pb-Pb), 200 kHz (p-p, p-Pb)
– Data rate: 1.1 TB/s from the detectors; 13 GB/s average to storage after processing and compression
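A quick back-of-envelope check of these numbers, as a hedged Python sketch: the 1.1 TB/s and 13 GB/s figures are taken from the slide, while the 30-day running period used for the volume estimate is purely an illustrative assumption.

```python
# Back-of-envelope arithmetic for the rates quoted above.
# 1.1 TB/s detector output and 13 GB/s average to storage come from the slide;
# the 30-day heavy-ion period below is an illustrative assumption only.

detector_rate_gb_s = 1100.0   # 1.1 TB/s out of the detectors
storage_rate_gb_s = 13.0      # average rate to storage after processing/compression

reduction = detector_rate_gb_s / storage_rate_gb_s
print(f"Overall throughput reduction (compression + time averaging): ~{reduction:.0f}x")

# Hypothetical data volume written during one 30-day heavy-ion period:
seconds = 30 * 24 * 3600
volume_pb = storage_rate_gb_s * seconds / 1e6   # GB -> PB
print(f"~{volume_pb:.0f} PB written to storage in 30 days at 13 GB/s")
```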

Data processing and systems consolidation
RAW data rates and volume necessitate the creation of a combined online-offline facility (O2) for data compression, incorporating:
– DAQ functionality: detector readout, data transport and event building
– HLT functionality: data compression, clustering and tracking algorithms
– Offline functionality: calibration, full event processing and reconstruction, up to analysis object data (AOD)

Data flow and compression in O2
[Diagram] Internal O2 storage keeps data in an "intermediate" format; synchronous and asynchronous RAW data processing with increased precision, up to maximum compression. Grid storage holds AODs, simulation output and the compressed RAW data (custodial copy).

Hardware architecture of O2
[Diagram] Trigger and detectors (TPC, ITS, TRD, Muon, FTP, EMC, TOF, PHO) are read out over ~2500 DDL3 links in total into ~250 First Level Processors (FLPs; links of 2 x 10 or 40 Gb/s and 10 Gb/s). The FLPs feed ~1250 Event Processing Nodes (EPNs) through the 10 Gb/s farm network, and the EPNs write to data storage through the storage network.
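For orientation, the average per-node rates implied by these counts can be estimated as below. This is a naive even split using the 1.1 TB/s total from slide 2; the real load per FLP and per EPN varies strongly between detectors.

```python
# Naive per-node averages implied by the architecture numbers above.
# Assumes an even split of the 1.1 TB/s total input, which is only a rough
# illustration; the actual load differs a lot between detectors.

total_input_gb_s = 1100.0   # total detector output (from slide 2)
n_flp = 250                 # ~250 First Level Processors
n_epn = 1250                # ~1250 Event Processing Nodes

print(f"Average input per FLP: ~{total_input_gb_s / n_flp:.1f} GB/s")
print(f"Average input per EPN: ~{total_input_gb_s / n_epn:.2f} GB/s "
      f"(before the compression applied at the FLP/EPN level)")
```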

Software and process improvements for O2
'Offline quality' calibration:
– Critical for the data compression
– Compressed data kept in a format allowing reprocessing, i.e. finer-grain calibration is still possible
Use of FPGAs, GPUs and CPUs in combination:
– Software uses the specific advantages of each
– A well-tested approach in production (current HLT)
New framework to incorporate all tasks:
– ALFA (ALICE-FAIR), developed together with the FAIR collaboration at GSI Darmstadt

ALFA – online system for data processing
Technical challenges:
– Data transport
– Cluster infrastructure
– File systems and package distribution
– Access to large data sets
– Process orchestration
Algorithmic challenges:
– Fast algorithms
– Scalable processing instances
– Data model
Collaborative challenges:
– Combination of computer scientists (online) and physicists (offline)
The technical and algorithmic challenges are addressed in the already formed Computing Working Groups.

Simplified data processing scheme
[Diagram: First Level Processor → Event Processing Node → O2 storage]
Focus on online processing:
– Develop and/or reuse code for data generation
– Data transport mechanism
– Processing algorithms
– Store the results in O2 storage
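To make the chain above concrete, here is a minimal toy sketch of an FLP-to-EPN style transport using a PUSH/PULL message pattern (pyzmq). It only illustrates the pattern, not the ALFA/O2 transport layer; the port, payload format and "processing" step are placeholders.

```python
# Toy FLP -> EPN pipeline using a ZeroMQ PUSH/PULL pattern (pyzmq).
# Illustration only: port, payload and "processing" are placeholders,
# not part of the actual ALFA/O2 framework.
import threading
import zmq

N_FRAGMENTS = 5

def flp(endpoint="tcp://*:5555"):
    """Toy First Level Processor: pushes raw data fragments downstream."""
    sock = zmq.Context.instance().socket(zmq.PUSH)
    sock.bind(endpoint)
    for i in range(N_FRAGMENTS):
        sock.send(b"raw-fragment-%d" % i)   # stand-in for detector data
    sock.close()

def epn(endpoint="tcp://localhost:5555"):
    """Toy Event Processing Node: pulls fragments, 'compresses', stores."""
    sock = zmq.Context.instance().socket(zmq.PULL)
    sock.connect(endpoint)
    for _ in range(N_FRAGMENTS):
        fragment = sock.recv()
        compressed = fragment[:8]            # placeholder for real compression
        print("stored:", compressed)         # placeholder for O2 storage
    sock.close()

if __name__ == "__main__":
    # In a real system FLPs and EPNs run on separate machines;
    # one thread per role is enough to demonstrate the data flow here.
    consumer = threading.Thread(target=epn)
    consumer.start()
    flp()
    consumer.join()
```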

Data transport prototype
[Diagram: FLP and EPN processes distributed over the test nodes aidrefma01 to aidrefma08]
Existing and tested to the expected throughput.

Preparations for Run3 – tests of the processing chain
Cluster finding in the HLT for all detectors:
– New, more powerful HLT cluster
– System validated for the TPC over 2 years of data taking in Run1
– Largest data compression factor (x4)
Detector calibration online:
– Calibration algorithms of many detectors (currently run offline) moved into the HLT
– Critical step for good-quality tracking and data compression
Preparation of an O2 test setup of the expected size:
– Testbed for ALFA development

The Grid
General assumptions and roles:
– Will continue to do what it does now: MC simulation, end-user and organized data analysis, raw data reconstruction (reduced)
– Custodial storage for RAW data: the output of the O2 compressed data stream
– The Grid is expected to keep growing at the same rate as today
Search for additional resources:
– New sites
– Contributed resources from existing sites

Growth expectations
Based on a 'flat budget' and a modified Moore's law:
– Number of cores grows by 25% year on year, with power per core roughly constant
– Storage grows by 20% per year
Projected to 2020 this gives ~3-4x the current capacity.
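The "~3-4x" figure follows directly from compounding those yearly rates over the 2014-2020 horizon; the 25% and 20% rates are from the slide, while the six-year span is an assumption.

```python
# Compound-growth check of the projection above. The 25%/year (CPU) and
# 20%/year (storage) rates are from the slide; the 2014 -> 2020 horizon
# is an assumed six-year span.

def growth_factor(rate_per_year: float, years: int) -> float:
    """Total capacity factor after `years` years of compound growth."""
    return (1.0 + rate_per_year) ** years

years = 2020 - 2014
print(f"CPU capacity by 2020:     ~{growth_factor(0.25, years):.1f}x today")   # ~3.8x
print(f"Storage capacity by 2020: ~{growth_factor(0.20, years):.1f}x today")   # ~3.0x
```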

Grid upgrades – data processing
Assume sites will continue offering standard CPUs:
– 'GPU-ready' algorithms have to cope with CPU-only resources
Growth beyond the 'flat budget' scenario requires finding new resources (an ongoing process):
– Contributed cycles from existing, non-standard sources
– Supercomputer back-fill, which can potentially bring the equivalent of tens of thousands of CPU cores for MC
– Ongoing collaborative work with ATLAS (PanDA)
Support for various cluster configurations will continue and expand as needed:
– Standard batch / whole node / AI cloud on demand
– Increases flexibility and efficiency

Grid upgrades – data storage
Grid storage is expected to hold the analysis object data and the compressed RAW (custodial copy):
– The data volume shipped to the Grid will be reduced to a manageable size by the O2 facility
General expectation of cheaper storage, driven by the needs of the 'big players' (insert Facebook here):
– May push the current 20% yearly growth upward
– Perhaps slower media, hence a need for more sophisticated data access models
– The biggest impact is on the cost of storage for the O2 facility

Summary
ALICE upgrade for LHC Run3 (2020 and beyond):
– Event rate x100, data volume x10 compared to current rates
Data reduction to manageable levels in a dedicated O2 computing facility, incorporating DAQ/HLT and Offline functions:
– Heterogeneous hardware (CPU/GPU/FPGA)
– Data volume reduction by x10, to 13 GB/s, up to analysis-ready data containers
– Framework development, software demonstrators and a test cluster are in preparation
Grid resources are expected to grow at the usual rate:
– Intensive programme to incorporate new sites and contributed computing power at existing facilities