Workshop Computing Models status and perspectives

Workshop Computing Models
Bologna, 19 February 2015
ALICE CM: status and perspectives
Domenico Elia

Outline
Focus on Run2:
- physics programme and rates, detector upgrades
- basics for 2015-2018 operations
- software and process improvements
- infrastructure improvements
- resource considerations and requirements
Run3:
- deep change in the CM expected
- project being designed, Computing Upgrade TDR in progress

ALICE CM: Focus on Run2
Physics programme, upgrades
Targeting an integrated luminosity of 1 nb^-1 for PbPb:
- by combination of Run1 and Run2 statistics
- consistent with the ALICE approved programme
- 4-fold increase in instantaneous luminosity for PbPb
Double event rate of TPC/TRD:
- consolidation of TPC and TRD readout electronics
Increased capacity of HLT and DAQ:
- rate up to 8 GB/s to T0 (for heavy-ion data taking)

ALICE CM: Focus on Run2
Physics programme, upgrades
Detector upgrades:
- TRD (+5 modules), full azimuthal coverage
- PHOS (+1 module)
- DCAL (new calorimeter)

ALICE CM: Focus on Run2
Basics of Run2 operations
ALICE Grid model largely unchanged in Run2:
- integration of every new computing centre
- average of 2 replicas of analysis objects: dependency on resource stability, 1 copy for least popular data
- low differentiation of tasks:
  - T0/T1 still raw data keepers/producers
  - all other tasks (MC + analysis) performed everywhere
- tasks generally sent to data, but data can go to tasks if needed: jobs go to data, in case of failure read from closest replica (<5%)
- ALICE global data distribution by exclusive use of the xrootd protocol
- analysis input mostly on AODs (limited use of ESDs)
- push analyzers to organized trains (LEGO framework)
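
The brokering rule on this slide (jobs go to data, with a fallback read from the closest replica on failure) can be sketched roughly as below. This is only an illustration under stated assumptions: the Replica record, the site names and the distance metric are hypothetical and are not taken from AliEn or xrootd.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    site: str        # hypothetical name of the site holding a copy of the file
    distance: float  # hypothetical network distance from the job's site (0 = local)
    available: bool  # whether the storage element currently answers


def choose_replica(replicas: List[Replica]) -> Optional[Replica]:
    """Prefer the local copy; otherwise fall back to the closest reachable replica."""
    reachable = [r for r in replicas if r.available]
    if not reachable:
        return None  # no usable copy: the job cannot read its input
    return min(reachable, key=lambda r: r.distance)


# The local copy is down, so the read falls back to the closest remote replica.
replicas = [
    Replica("local-site", 0.0, available=False),
    Replica("near-site", 1.5, available=True),
    Replica("far-site", 9.0, available=True),
]
chosen = choose_replica(replicas)
print(chosen.site)
```

With an average of 2 replicas per analysis object, a fallback like this only helps when the second copy sits on a stable storage element, which is why the slide ties replication to resource stability.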

ALICE CM: Focus on Run2
Basics of Run2 operations
Computing tasks and workflow: [diagram]

ALICE CM: Focus on Run2
Wall time resource share 2014:
- RAW data processing: 3% @ T0/T1s only
- Individual analysis: 14% @ all centres (425 users)
- Organized analysis: 14% @ all centres
- MC productions: 69% @ all centres
Efficiency per workflow (average over all sites): [plot]

ALICE CM: Focus on Run2
Software and process improvements
Evolution of the CM:
- new version of the software framework (AliRoot 5.x)
- use of HLT for online raw data compression (factor 4): already tested in Run1, implies reduction of tape storage @ Tier-0/1
- use of HLT for calibration: move first calibration iteration to online
- use of HLT track seeds for offline reconstruction
- improve performance of GEANT4 simulation for ALICE
- further development of fast and parametrized simulation

ALICE CM: Focus on Run2
Software and process improvements
Evolution of the CM
Additional software and process improvements:
- start adapting ALICE distributed computing to Cloud, using the HLT farm for offline processing: corresponds to an additional 3% of CPU resources
- improving performance of the organized analysis trains
- speeding up and improving the efficiency of the analysis activity by active data management
- explore contributed resources, i.e. spare CPU cycles on supercomputers: collaborating with other experiments on this issue

ALICE CM: Focus on Run2
Infrastructure improvements
Focus on SE stability:
- major factor for successful analysis and high CPU efficiency
- goal for all SEs: > 98% availability
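
To give a feeling for what the 98% goal means in practice, the short sketch below converts an availability target into a downtime budget. It assumes that availability is simply the fraction of time a storage element is up, and the 30-day window is an arbitrary choice for illustration; neither assumption comes from the slides.

```python
HOURS_PER_DAY = 24


def allowed_downtime_hours(availability_target: float, period_days: int) -> float:
    """Maximum downtime (in hours) compatible with an availability target."""
    return (1.0 - availability_target) * period_days * HOURS_PER_DAY


def availability(uptime_hours: float, period_days: int) -> float:
    """Fraction of the period during which the service was up."""
    return uptime_hours / (period_days * HOURS_PER_DAY)


# Downtime budget for a 98% target over a 30-day window (window is an assumption).
budget = allowed_downtime_hours(0.98, 30)
print(f"{budget:.1f} hours of downtime allowed in 30 days")
```

Under these assumptions a storage element can be unavailable for roughly 14 hours in a month before it misses the target.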

ALICE CM: Focus on Run2
Infrastructure improvements
Focus on SE stability
LHCone programme:
- brings substantial improvement in inter-site connectivity
- allows for further dilution of boundaries between sites and tasks
- Europe largely covered, focus on South America and Asia
Network use will increase:
- large data volumes, more to transfer between sites
- remote access to storage

ALICE CM: Focus on Run2
Infrastructure improvements
Focus on SE stability
LHCone programme
Network use will increase
IPv6 adoption:
- IPv4 address depletion is already a fact for new sites
- ALICE services are IPv6 ready
- xrootd v.4 should be IPv6 ready (release end of May)
- other services are being brought into compliance

ALICE CM: Focus on Run2
Infrastructure improvements
Focus on SE stability
LHCone programme
Network use will increase
IPv6 adoption
Refurbishment of SAM/SUM tests:
- WLCG monitoring consolidation project, advanced status
Site tests will reflect more and more the VO tests:
- in the ALICE case provided by MonALISA

ALICE CM: Focus on Run2
Resource considerations
Basic assumptions for Run2 resource estimate:
- same CPU power needed for reconstruction
- 25% larger raw event size:
  - additional detectors, detector coverage
  - higher track multiplicity with increased beam energy and pileup
- MC productions: 100% pp, pPb + 30-40% PbPb events
Resource requirements for 2015:
- scrutinized and approved by CRSG in April 2014
- CPU request growth compatible with “flat budget”
- tape and disk resources increase after 2013/14 flat profile
- major demand on resources towards end of 2015 (PbPb run)

ALICE CM: Focus on Run2
Resource considerations
Resource requirements 2015-2017:
T1+T2 resource increase compared to previous year:
- 2015: 7%
- 2016: 25%
- 2017: 20%

ALICE CM: Focus on Run2
Resource considerations
Resource requirements 2015-2017:
T1+T2 resource increase compared to previous year:
- 2015: 66%
- 2016: 17%
- 2017: 17%

ALICE CM: Focus on Run2
Resource considerations
Resource requirements 2015-2017:
T1 resource increase compared to previous year:
- 2015: 57%
- 2016: 52%
- 2017: 26%
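
Because each percentage is quoted relative to the previous year, the increases compound. The sketch below multiplies them out for the three series quoted on the resource-requirement slides. The transcript does not say which resource type each series refers to, so the labels used here are neutral placeholders and an assumption of this example.

```python
from math import prod

# Year-on-year increases for 2015, 2016, 2017, expressed as fractions.
# The keys are neutral labels; the resource type behind each series
# is not stated in the transcript.
yearly_increase = {
    "T1+T2, first series": [0.07, 0.25, 0.20],
    "T1+T2, second series": [0.66, 0.17, 0.17],
    "T1 only": [0.57, 0.52, 0.26],
}


def cumulative_factor(increases):
    """Compound a list of year-on-year fractional increases into one factor."""
    return prod(1.0 + x for x in increases)


factors = {name: cumulative_factor(v) for name, v in yearly_increase.items()}
for name, factor in factors.items():
    print(f"{name}: x{factor:.2f} in 2017 relative to 2014")
```

The three series compound to factors of about 1.6, 2.3 and 3.0 respectively over the three years.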

ALICE CM: Run3
Perspectives for Run3
Detectors and running scenario:
- ALICE upgrade aiming at a high-statistics sample (10 nb^-1)
- continuous readout TPC, upgraded ITS
- 50 kHz PbPb interaction rate (current rate x100)
- ~1.1 TB/s detector readout
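
The two rates quoted for Run3 can be combined into a rough data volume per interaction and per hour of data taking. This is back-of-the-envelope arithmetic only, taking 1 TB as 10^12 bytes (an assumption) and ignoring duty cycle.

```python
READOUT_RATE_BYTES_PER_S = 1.1e12  # ~1.1 TB/s, with 1 TB = 10**12 bytes (assumption)
INTERACTION_RATE_HZ = 50e3         # 50 kHz PbPb interaction rate

# Average readout volume attributable to one interaction.
bytes_per_interaction = READOUT_RATE_BYTES_PER_S / INTERACTION_RATE_HZ
mb_per_interaction = bytes_per_interaction / 1e6

# Volume read out in one hour of continuous running.
pb_per_hour = READOUT_RATE_BYTES_PER_S * 3600 / 1e15

print(f"{mb_per_interaction:.0f} MB per interaction, {pb_per_hour:.2f} PB per hour")
```

About 22 MB per interaction and close to 4 PB per hour at the detector readout is the scale that motivates reducing data online rather than storing raw data.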

ALICE CM: Run3
Perspectives for Run3
Detectors and running scenario
Reconstruction strategy:
- data reduction by (partial) online reconstruction and compression
- store only reconstruction results, discard raw data:
  - demonstrated with TPC cluster finder running on HLT since PbPb 2011
  - using data structures optimized for lossless compression
  - using algorithms designed to allow for subsequent offline reconstruction passes with improved calibrations
- implies much tighter coupling between Online and Offline reconstruction software -> O2 Project
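
The idea of "data structures optimized for lossless compression" can be illustrated with a generic example: storing differences between sorted values instead of the values themselves usually makes a general-purpose compressor more effective, and the original data is recovered exactly. This is not the ALICE HLT format; the synthetic values, the 32-bit packing and the use of zlib are all assumptions chosen for the sketch.

```python
import random
import struct
import zlib


def pack(values):
    """Serialize integers as little-endian unsigned 32-bit words."""
    return struct.pack(f"<{len(values)}I", *values)


def unpack(blob):
    """Inverse of pack()."""
    return list(struct.unpack(f"<{len(blob) // 4}I", blob))


def delta_encode(values):
    """Keep the first value, then the differences between neighbours."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original values by cumulative summation."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


# Hypothetical stand-in for sorted detector coordinates: small positive steps.
rng = random.Random(0)
values, current = [], 0
for _ in range(10_000):
    current += rng.randint(1, 8)
    values.append(current)

plain = zlib.compress(pack(values))
restructured = zlib.compress(pack(delta_encode(values)))
restored = delta_decode(unpack(zlib.decompress(restructured)))

print(len(plain), len(restructured), restored == values)
```

The round trip is exact, which is the property that allows later offline passes to start from the stored reconstruction results.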

ALICE CM: Run3
Perspectives for Run3
Detectors and running scenario
Reconstruction strategy:
- from Detector Readout to Analysis, from DAQ, HLT to Offline: together, one computing framework -> O2 project

ALICE CM: Run3
Perspectives for Run3
Detectors and running scenario
Reconstruction strategy
Simulation strategy:
- migrate from G3 to G4: expect to profit from future G4 developments (multithreaded G4, G4 on GPU, ...)
- be ready to use contributed resources: supercomputers, volunteer computing resources
- must work on (fast) parametrized simulation: basic support exists in the current framework
- must make more use of embedding, event mixing

ALICE CM: Run3
Perspectives for Run3
Detectors and running scenario
Reconstruction strategy
Simulation strategy
O2 project and basics of the new CM:
- project started in March 2013
- reconstruction @ Tier-0 (online, using FPGA, GPU, MCCPU etc.)
- MC and analysis @ Tier-1/2s (AF on Demand)
- evolution of the framework AliRoot 5.x -> 6.x: work on AliRoot 6.x already started in collaboration with FAIR

ALICE CM: Run3
Perspectives for Run3
Software development timeline (Predrag Buncic):
- AliRoot 5.x (Root 5.x): evolution of current framework, improve the algorithms and procedures
- AliRoot 6.x (Root 6.x, C++11): new modern framework, optimized for I/O, FPGA, GPU, MIC...
- time axis from 2010 to 2024-26: Run 1, LS1, Run 2, LS2, Run 3, LS3, Run 4

ALICE CM: Run3
Perspectives for Run3
Detectors and running scenario
Reconstruction strategy
Simulation strategy
O2 project and basics of the new CM:
- adapting ALICE distributed computing to Cloud

ALICE CM: Run3
Perspectives for Run3
In order to reduce complexity, national or regional T1/T2 centers could transform themselves into Cloud regions, providing IaaS and reliable data services with very good network connectivity between the sites.

ALICE CM: Run3
Perspectives for Run3
Detectors and running scenario
Reconstruction strategy
Simulation strategy
O2 project and basics of the new CM
Computing Upgrade TDR to the LHCC meeting in June 2015:
- initially expected by October 2014
- delay not due to changes in the strategy but to:
  - enlarged scope (project to be presented in its global environment)
  - associating to the TDR all institutes showing active interest
  - more simulation, benchmark and prototyping results

Summary
Run2:
- data volume in the period 2015-2018 expected to be ~3x Run1
- focus of the Grid development will be on improving the analysis efficiency and decreasing the turnaround time of organized trains
- several other software and process improvements
- site performance and stability will continue to be a key factor for the success of the ALICE offline computing
- planned resource increase expected to meet the demands, working on data popularity monitoring and replica limitation
Run3:
- TDR in progress, going to be discussed with LHCC in June

Backup slides

ALICE CM: status and perspectives
Integrated data from Run1
Data taking:
- 2010: pp @ 0.9 – 7 TeV; PbPb @ 2.76 TeV (MB)
- 2011: pp @ 2.76 – 7 TeV (MB & rare); PbPb @ 2.76 TeV (MB & rare)
- 2012-2013: pp @ 8 TeV (rare); pPb @ 5.02 TeV (pilot, 2012); pPb @ 5.02 TeV (MB & rare, 2013)
Total data volume:
- 7.3 PB raw data:
  - 2 copies @ CERN (T0) + 1 replica @ 6 T1s
  - copies on tape @ T1s (“good” data only)
- 16 PB derived data

Status of the ALICE computing
Grid running profile
Progressive reduction of user analysis with increasing usage of “LEGO trains” in the last 2 years

ALICE CM: Focus on Run2
Efficiency per workflow (average over all sites): [plot]

Status of the ALICE computing
CPU efficiency
Improved a lot in the last few years:
- several steps from ~50-60% (2011) to current average (unweighted): Tier-0/1 ~85%, Tier-2 ~80%
Main actions:
- modifications of the OCDB structure
- improvement of raw processing
- more efficient analysis trains (LEGO framework)
- moving users from ESDs to AODs
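
The slide quotes an unweighted average, which can differ noticeably from an average weighted by the wall time each site delivers. The sketch below contrasts the two, taking CPU efficiency as CPU time divided by wall time. The per-site numbers are hypothetical and chosen only to make the difference visible.

```python
from typing import List, Tuple

# One (cpu_hours, wall_hours) pair per site; hypothetical numbers for illustration.
Site = Tuple[float, float]


def unweighted_efficiency(sites: List[Site]) -> float:
    """Mean of the per-site CPU/wall ratios: every site counts equally."""
    return sum(cpu / wall for cpu, wall in sites) / len(sites)


def weighted_efficiency(sites: List[Site]) -> float:
    """Total CPU time over total wall time: large sites dominate."""
    return sum(cpu for cpu, _ in sites) / sum(wall for _, wall in sites)


# A large efficient site and a small less efficient one.
sites = [(900.0, 1000.0), (70.0, 100.0)]
unweighted = unweighted_efficiency(sites)
weighted = weighted_efficiency(sites)
print(f"unweighted: {unweighted:.1%}, weighted: {weighted:.1%}")
```

In this toy case the unweighted figure is 80% while the wall-time-weighted figure is about 88%, so the choice of averaging matters when comparing tiers of very different size.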

Status of the ALICE computing
Computing sharing in 2013
ALICE total:
- 50/50 (T0+T1s)/T2s
- ~290M wall hours (260M in 2012)
Italian contribution:
- 50/50 T1/T2s
- ~43M wall hours (15% of the total)
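
The quoted share and the year-on-year growth follow directly from the wall-hour totals on this slide, as the short check below shows. The inputs are the rounded values from the slide, so the results are approximate.

```python
ALICE_WALL_HOURS_2013 = 290e6  # ~290M wall hours in 2013
ALICE_WALL_HOURS_2012 = 260e6  # 260M wall hours in 2012
ITALY_WALL_HOURS_2013 = 43e6   # ~43M wall hours from Italian sites in 2013

# Italian fraction of the 2013 total and growth of the total from 2012 to 2013.
italian_share = ITALY_WALL_HOURS_2013 / ALICE_WALL_HOURS_2013
yearly_growth = ALICE_WALL_HOURS_2013 / ALICE_WALL_HOURS_2012 - 1.0

print(f"Italian share: {italian_share:.1%}, growth 2012 to 2013: {yearly_growth:.1%}")
```

The share comes out at 14.8%, consistent with the 15% on the slide, and the total grew by roughly 11.5% between 2012 and 2013.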

Status of the ALICE computing
Italian contribution @ Tier-1
(L. Morganti, S. Taneja, CdG Tier-1 25.3.2014)
[Plots: TAPE (TB), DISK (TB)]
Low consumption of tape storage compared to requirements due to revised strategy of data preservation:
- ALICE analysis not using data on tape due to high latency of the tape system (ESDs and AODs data reside on disk)
- obsolete data permanently deleted and not saved to tape
-> new requirements reflect this practice

Status of the ALICE computing
Italian contribution @ Tier-2s
Site availability from EGI monthly reports (Jan 2013 – Mar 2014):
Site      Availability
CNAF      99.0%
BA        95.1%
CT        92.7%
PD-LNL    98.8%
TO        91.2%
Average   90.8%
Italian Tier-2s quite satisfactory, all above the average
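
The statement that all listed sites are above the average can be checked directly from the quoted figures. The sketch below computes each site's margin over the 90.8% average, using only the numbers on the slide.

```python
# Availability in percent, as quoted from the EGI monthly reports.
availability = {"CNAF": 99.0, "BA": 95.1, "CT": 92.7, "PD-LNL": 98.8, "TO": 91.2}
AVERAGE = 90.8

# Margin of each site over the average, in percentage points.
margins = {site: round(value - AVERAGE, 1) for site, value in availability.items()}
all_above = all(m > 0 for m in margins.values())

for site, margin in sorted(margins.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{site}: +{margin} points")
print("all above average:", all_above)
```

The margins range from 0.4 points (TO) to 8.2 points (CNAF).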