Workshop Computing Models status and perspectives

Presentation on theme: "Workshop Computing Models status and perspectives"— Presentation transcript:

1 Workshop Computing Models status and perspectives
Bologna, 19 February 2015
ALICE CM: status and perspectives
Domenico Elia
Workshop Computing Models / Bologna

2 Workshop Computing Models / Bologna 19.2.2015
Outline
Focus on Run2:
- physics programme and rates, detector upgrades
- basics for operations
- software and process improvements
- infrastructure improvements
- resource considerations and requirements
Run3: deep change in the CM expected
- project being designed, Computing Upgrade TDR in progress

4 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Physics programme, upgrades
Targeting an integrated luminosity of 1 nb^-1 for PbPb:
- by combination of Run1 and Run2 statistics
- consistent with the ALICE approved programme
4-fold increase in instantaneous luminosity for PbPb
Double event rate of TPC/TRD:
- consolidation of TPC and TRD readout electronics
Increased capacity of HLT and DAQ:
- rate up to 8 GB/s to T0 (for heavy-ion data taking)

5 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Physics programme, upgrades
Detector upgrades:
- TRD (+5 modules), full azimuthal coverage
- PHOS (+1 module)
- DCAL (new calorimeter)

8 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Basics of Run2 operations
ALICE Grid model largely unchanged in Run2:
- integration of every new computing centre
- average of 2 replicas of analysis objects: dependency on resource stability, 1 copy for the least popular data
- low differentiation of tasks:
  - T0/T1 still raw data keepers/producers
  - all other tasks (MC + analysis) performed everywhere
- tasks generally sent to data, but data can go to tasks if needed: jobs go to data; in case of failure they read from the closest replica (<5%)
- ALICE global data distribution by exclusive use of the xrootd protocol
- analysis input mostly on AODs (limited use of ESDs)
- push analyzers to organized trains (LEGO framework)
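
The "jobs go to data, read from the closest replica on failure" rule can be sketched as follows. This is only an illustration of the policy described on the slide, not the actual ALICE job broker: the function name, the site names and the `distance` table are all hypothetical.

```python
def choose_read_site(job_site, replica_sites, distance, local_read_ok):
    """Return the site a job should read its input from.

    job_site      -- site where the job runs (normally one holding the data)
    replica_sites -- sites holding a replica of the input
    distance      -- hypothetical cost table: distance[(from_site, to_site)]
    local_read_ok -- callable telling whether the local storage answers
    """
    # Normal case: the job was sent to the data and reads it locally.
    if job_site in replica_sites and local_read_ok(job_site):
        return job_site
    # Failure case: fall back to the closest remaining replica.
    remote = [s for s in replica_sites if s != job_site]
    if not remote:
        raise RuntimeError("no replica available for this input")
    return min(remote, key=lambda s: distance[(job_site, s)])


# Illustrative data only: two replicas, matching the "average 2 replicas" policy.
replicas = ["SiteA", "SiteB"]
distance = {("SiteA", "SiteB"): 10}

healthy = choose_read_site("SiteA", replicas, distance, lambda s: True)
degraded = choose_read_site("SiteA", replicas, distance, lambda s: False)
print(healthy, degraded)
```

With a single replica (the "least popular data" case) the fallback has nowhere to go, which is one way to see why the slide ties the replica count to storage stability.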

9 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Basics of Run2 operations
Computing tasks and workflow (diagram not reproduced in this transcript)

10 Efficiency per workflow
ALICE CM: Focus on Run2
Wall time resource share 2014:
- RAW data processing: 3% @ T0/T1s only
- Individual analysis: 14% @ all centres, 425 users
- Organized analysis: 14% @ all centres
- MC productions: 69% @ all centres
Efficiency per workflow, average over all sites (figure not reproduced in this transcript)

12 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Software and process improvements
Evolution of the CM:
- new version of the software framework (AliRoot 5.x)
- use of HLT for online raw data compression (factor 4): already tested in Run1, implies a reduction of tape use at Tier-0/1
- use of HLT for calibration: move the first calibration iteration online
- use of HLT track seeds for offline reconstruction
- improve the performance of GEANT4 simulation for ALICE
- further development of fast and parametrized simulation

13 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Software and process improvements
Evolution of the CM
Additional software and process improvements:
- start adapting ALICE distributed computing to the Cloud; using the HLT farm for offline processing corresponds to an additional 3% of CPU resources
- improving the performance of the organized analysis trains
- speeding up and improving the efficiency of the analysis activity through active data management
- explore contributed resources, i.e. spare CPU cycles on supercomputers; collaborating with other experiments on this issue

14 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Infrastructure improvements
Focus on SE stability:
- major factor for successful analysis and high CPU efficiency
- goal for all SEs: > 98% availability
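
As a rough illustration of the availability goal, the sketch below treats availability as the fraction of successful storage probes and compares it with the 98% target. The probe-based definition and all names are assumptions made for the example; the slides do not say how availability is measured.

```python
TARGET = 0.98  # goal quoted on the slide for all SEs


def availability(probe_results):
    """Fraction of successful probes; probe_results is a list of booleans."""
    if not probe_results:
        raise ValueError("no probe results")
    return sum(probe_results) / len(probe_results)


def meets_target(probe_results, target=TARGET):
    """True only if the availability is strictly above the target."""
    return availability(probe_results) > target


# Hypothetical sample: 200 probes, 3 of them failed.
results = [True] * 197 + [False] * 3
se_availability = availability(results)
print(f"{se_availability:.1%}", meets_target(results))
```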

15 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Infrastructure improvements
Focus on SE stability
LHCone programme:
- brings substantial improvement in inter-site connectivity
- allows for further dilution of boundaries between sites and tasks
- Europe largely covered, focus on South America and Asia
Network use will increase:
- large data volumes, more to transfer between sites
- remote access to storage

16 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Infrastructure improvements
Focus on SE stability
LHCone programme
Network use will increase
IPv6 adoption:
- IPv4 address depletion is already a fact for new sites
- ALICE services are IPv6 ready
- xrootd v.4 should be IPv6 ready (release end of May)
- other services are being brought into compliance

17 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Infrastructure improvements
Focus on SE stability
LHCone programme
Network use will increase
IPv6 adoption
Refurbishment of SAM/SUM tests:
- WLCG monitoring consolidation project, advanced status
- site tests will reflect more and more the VO tests: in the ALICE case provided by MonALISA

19 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Resource considerations
Basic assumptions for Run2 resource estimate:
- same CPU power needed for reconstruction
- 25% larger raw event size:
  - additional detectors, detector coverage
  - higher track multiplicity with increased beam energy and pileup
- MC productions: 100% pp, pPb % PbPb events
Resource requirements for 2015:
- scrutinized and approved by CRSG in April 2014
- CPU request growth compatible with “flat budget”
- tape and disk resources increase after 2013/14 flat profile
- major demand on resources towards end of 2015 (PbPb run)

21 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Resource considerations
Resource requirements:
T1+T2 resource increase compared to previous year:
- 2015: 7%
- 2016: 25%
- 2017: 20%

22 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Resource considerations
Resource requirements:
T1+T2 resource increase compared to previous year:
- 2015: 66%
- 2016: 17%
- 2017: 17%

23 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Resource considerations
Resource requirements:
T1 resource increase compared to previous year:
- 2015: 57%
- 2016: 52%
- 2017: 26%
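
Year-on-year increases compound, so the growth over the whole 2015-2017 period is the product of the yearly factors. The sketch below works this out for the three series quoted on the resource-requirement slides. The transcript does not say which resource type each series refers to, so the dictionary keys are placeholder labels, and the baseline is assumed to be the 2014 level.

```python
from math import prod


def cumulative_factor(yearly_increases_percent):
    """Total growth factor after applying each yearly increase in turn."""
    return prod(1 + p / 100 for p in yearly_increases_percent)


# Increases for 2015, 2016, 2017 as quoted on the slides (labels are placeholders).
series = {
    "T1+T2, first series": [7, 25, 20],
    "T1+T2, second series": [66, 17, 17],
    "T1 only": [57, 52, 26],
}

factors = {name: cumulative_factor(values) for name, values in series.items()}
for name, factor in factors.items():
    print(f"{name}: x{factor:.2f} relative to the assumed 2014 baseline")
```

Under these assumptions the three series grow by factors of roughly 1.6, 2.3 and 3.0 over the three years.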

24 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Run3
Perspectives for Run3
Detectors and running scenario:
- ALICE upgrade aiming at a high-statistics sample (10 nb^-1)
- continuous readout TPC, upgraded ITS
- 50 kHz PbPb interaction rate (current rate x100)
- ~1.1 TB/s detector readout
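
A quick back-of-the-envelope check of the Run3 numbers: dividing the readout bandwidth by the interaction rate gives the average data volume per interaction, and dividing the rate by the quoted factor of 100 gives the implied current rate. The sketch assumes decimal units (1 TB = 10^12 bytes) and that the readout bandwidth is spread evenly over interactions; both are simplifications for illustration.

```python
INTERACTION_RATE_HZ = 50_000   # 50 kHz PbPb interaction rate
READOUT_BYTES_PER_S = 1.1e12   # ~1.1 TB/s detector readout (decimal TB assumed)
RATE_INCREASE = 100            # "current rate x100"

# Average readout volume per interaction, in bytes.
bytes_per_interaction = READOUT_BYTES_PER_S / INTERACTION_RATE_HZ

# Interaction rate implied for the running conditions at the time of the talk.
current_rate_hz = INTERACTION_RATE_HZ / RATE_INCREASE

print(f"{bytes_per_interaction / 1e6:.0f} MB per interaction")
print(f"{current_rate_hz:.0f} Hz implied current rate")
```

About 22 MB per interaction at 50 kHz is the scale that motivates the online data reduction discussed on the following slides.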

25 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Run3
Perspectives for Run3
Detectors and running scenario
Reconstruction strategy:
- data reduction by (partial) online reconstruction and compression
- store only reconstruction results, discard raw data:
  - demonstrated with TPC cluster finder running on HLT since PbPb 2011
  - using data structures optimized for lossless compression
  - using algorithms designed to allow for subsequent offline reconstruction passes with improved calibrations
- implies much tighter coupling between Online and Offline reconstruction software → O2 project

26 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Run3
Perspectives for Run3
Detectors and running scenario
Reconstruction strategy:
- data reduction by (partial) online reconstruction and compression
- store only reconstruction results, discard raw data:
  - demonstrated with TPC cluster finder running on HLT since PbPb 2011
  - using data structures optimized for lossless compression
  - using algorithms designed to allow for subsequent offline reconstruction passes with improved calibrations
From detector readout to analysis, from DAQ and HLT to Offline: together, one computing framework → O2 project

27 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Run3
Perspectives for Run3
Detectors and running scenario
Reconstruction strategy
Simulation strategy:
- migrate from G3 to G4
  - expect to profit from future G4 developments
  - multithreaded G4, G4 on GPU …
- be ready to use contributed resources
  - supercomputers, volunteer computing resources
- must work on (fast) parametrized simulation
  - basic support exists in the current framework
- must make more use of embedding, event mixing

28 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Run3
Perspectives for Run3
Detectors and running scenario
Reconstruction strategy
Simulation strategy
O2 project and basics of the new CM:
- project started in March 2013
- Tier-0 (online, using FPGA, GPU, MCCPU etc.)
- MC and Tier-1/2’s (AF on Demand)
- evolution of the framework: AliRoot 5.x → 6.x
- work on AliRoot 6.x already started in collaboration with FAIR

29 Perspectives for Run3 Software development timeline (Predrag Buncic)
ALICE CM: Run3
Perspectives for Run3
Software development timeline (Predrag Buncic)
AliRoot 5.x: evolution of current framework
- Root 5.x
- improve the algorithms and procedures
AliRoot 6.x: new modern framework
- Root 6.x, C++11
- optimized for I/O
- FPGA, GPU, MIC…
Timeline axis labels (figure not reproduced): years 2010 to 2023; Run 1, LS1, Run 2, LS2, Run 3, LS3, Run 4

30 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Run3
Perspectives for Run3
Detectors and running scenario
Reconstruction strategy
Simulation strategy
O2 project and basics of the new CM:
- project started in March 2013
- Tier-0 (online, using FPGA, GPU, MCCPU etc.)
- MC and Tier-1/2’s (AF on Demand)
- evolution of the framework: AliRoot 5.x → 6.x
- adapting ALICE distributed computing to the Cloud

31 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Run3
Perspectives for Run3
In order to reduce complexity, national or regional T1/T2 centres could transform themselves into Cloud regions, providing IaaS and reliable data services with a very good network between the sites.

32 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Run3
Perspectives for Run3
Detectors and running scenario
Reconstruction strategy
Simulation strategy
O2 project and basics of the new CM
Computing Upgrade TDR to the LHCC meeting in June 2015:
- initially expected by October 2014
- delay not due to changes in the strategy but to:
  - enlarged scope (project to be presented in its global environment)
  - associating to the TDR all institutes showing active interest
  - more simulation, benchmark and prototyping results

33 Workshop Computing Models / Bologna 19.2.2015
Summary
Run2:
- data volume in the period expected to be ~3x Run1
- focus of the Grid development will be on improving the analysis efficiency and decreasing the turnaround time of organized trains
- several other software and process improvements
- site performance and stability will continue to be a key factor for the success of ALICE offline computing
- planned resource increase expected to meet the demands; working on data popularity monitoring and replica limitation
Run3:
- TDR in progress, to be discussed with the LHCC in June

34 Workshop Computing Models / Bologna 19.2.2015
Backup slides

35 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: status and perspectives
Integrated data from Run1
Data taking:
2010: 0.9 – 7 TeV TeV (MB) 2011: – 7 TeV (MB & rare) TeV (MB & rare) : 8 TeV (rare) TeV (pilot, 2012) TeV (MB & rare, 2013)
Total data volume:
- 7.3 PB raw data 2 CERN (T0) T1s copies on T1s (“good” data only)
- 16 PB derived data

36 Workshop Computing Models / Bologna 19.2.2015
Status of the ALICE computing
Grid running profile
Progressive reduction of user analysis with increasing usage of “LEGO trains” in the last 2 years

37 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Focus on Run2
Efficiency per workflow, average over all sites (figure not reproduced in this transcript)

38 Workshop Computing Models / Bologna 19.2.2015
Status of the ALICE computing
CPU efficiency
Improved a lot in the last few years, in several steps from ~50-60% (2011) to the current average (unweighted):
- Tier-0/1: ~85%
- Tier-2: ~80%
Main actions:
- modifications of the OCDB structure
- improvement of raw processing
- more efficient analysis trains (LEGO framework)
- moving users from ESDs to AODs
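
The slide quotes an unweighted average. To show why that qualifier matters, the sketch below computes CPU efficiency as CPU time divided by wall time and compares a per-site (unweighted) mean with a mean weighted by wall time. The efficiency definition is the usual one but is an assumption here, and the two sites and their numbers are invented for illustration.

```python
def cpu_efficiency(cpu_hours, wall_hours):
    """CPU efficiency of one site: CPU time used per wall time occupied."""
    return cpu_hours / wall_hours


def unweighted_average(sites):
    """Mean of per-site efficiencies; every site counts equally."""
    effs = [cpu_efficiency(cpu, wall) for cpu, wall in sites.values()]
    return sum(effs) / len(effs)


def weighted_average(sites):
    """Efficiency of the pooled resources; large sites dominate."""
    total_cpu = sum(cpu for cpu, _ in sites.values())
    total_wall = sum(wall for _, wall in sites.values())
    return total_cpu / total_wall


# Hypothetical (cpu_hours, wall_hours) pairs, not taken from the talk.
sites = {"large": (900.0, 1000.0), "small": (70.0, 100.0)}

unweighted = unweighted_average(sites)
weighted = weighted_average(sites)
print(f"unweighted {unweighted:.1%}, weighted {weighted:.1%}")
```

In this toy case the unweighted figure is 80% while the pooled figure is about 88%, so the two averages can differ noticeably when site sizes are uneven.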

39 Workshop Computing Models / Bologna 19.2.2015
Status of the ALICE computing
Computing sharing in 2013
ALICE total:
- 50/50 (T0+T1s)/T2s
- ~290M wall hours (260M in 2012)
Italian contribution:
- 50/50 T1/T2s
- ~43M wall hours (15% of the total)
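
The quoted figures can be cross-checked with simple arithmetic. The sketch below uses only the rounded values from the slide, so the results are approximate; the variable names are chosen for the example.

```python
TOTAL_2013 = 290e6   # ~290M wall hours, ALICE total in 2013
TOTAL_2012 = 260e6   # 260M wall hours in 2012
ITALY_2013 = 43e6    # ~43M wall hours, Italian contribution in 2013

# Italian share of the ALICE total (the slide rounds this to 15%).
italian_share = ITALY_2013 / TOTAL_2013

# Growth of the ALICE total from 2012 to 2013.
growth_2012_to_2013 = TOTAL_2013 / TOTAL_2012 - 1

# With the quoted 50/50 split, each of T1 and T2s delivers half of the Italian hours.
italian_t1_hours = ITALY_2013 * 0.5

print(f"Italian share: {italian_share:.1%}")
print(f"Growth 2012 to 2013: {growth_2012_to_2013:.1%}")
print(f"Italian T1: {italian_t1_hours / 1e6:.1f}M wall hours")
```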

40 Workshop Computing Models / Bologna 19.2.2015
Status of the ALICE computing
Italian Tier-1
L. Morganti, S. Taneja, CdG Tier
Table columns: TAPE (TB), DISK (TB) (values not reproduced in this transcript)
Low consumption of tape storage compared to requirements, due to a revised strategy of data preservation:
- ALICE analysis does not use data on tape because of the high latency of the tape system (ESD and AOD data reside on disk)
- obsolete data are permanently deleted and not saved to tape
→ new requirements reflect this practice

41 Workshop Computing Models / Bologna 19.2.2015
Status of the ALICE computing
Italian Tier-2’s
Site availability from EGI monthly reports (Jan 2013 – Mar 2014):
- CNAF: 99.0%
- BA: 95.1%
- CT: 92.7%
- PD-LNL: 98.8%
- TO: 91.2%
- Average: 90.8%
Italian Tier-2’s quite satisfactory, all above the average
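
The "all above the average" statement can be verified directly from the quoted numbers. The sketch below is a small check using the values on the slide; the margin calculation is added here only for illustration.

```python
# Availability in percent, as quoted on the slide.
availability = {"CNAF": 99.0, "BA": 95.1, "CT": 92.7, "PD-LNL": 98.8, "TO": 91.2}
AVERAGE = 90.8  # average quoted on the slide

# Margin of each site over the quoted average, in percentage points.
margins = {site: round(value - AVERAGE, 1) for site, value in availability.items()}

all_above = all(value > AVERAGE for value in availability.values())
closest = min(margins, key=margins.get)

print(all_above, closest, margins[closest])
```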

42 Workshop Computing Models / Bologna 19.2.2015
ALICE CM: Run3
Perspectives for Run3

