1 Outline: The LHCb Computing Model
Philippe Charpentier, CERN
ICFA workshop on Grid activities, Sinaia, Romania, 13-18 October 2006

2 LHCb in brief
ICFA Workshop on Grid, Sinaia, Romania, 13-17 October 2006. LHCb Computing Model, PhC.
- Experiment dedicated to studying CP violation
  - CP violation is responsible for the dominance of matter over antimatter
  - The matter-antimatter difference is studied using the b quark (beauty)
  - High-precision physics (the difference is tiny...)
- Single-arm spectrometer
  - Looks like a fixed-target experiment
  - Smallest of the four big LHC experiments
  - ~500 physicists
- Nevertheless, computing is also a challenge...

3 LHCb data processing software
[Diagram: the Gaudi-based application chain and its event model]
- Gauss (simulation): GenParts/MCParts -> MCHits
- Boole (digitisation): MCHits -> Raw-like data
- Moore (trigger): Raw data
- Brunel (reconstruction): Raw data -> (r)DST
- DaVinci (analysis): DST -> AOD
- Shared by all: Event model / physics event model, Conditions Database, Gaudi framework
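The application chain above can be summarized in a small sketch. This is illustrative Python, not LHCb code; the data-type labels are taken from the slide's diagram, and the helper name is hypothetical.

```python
# Illustrative map of the Gaudi-based processing chain: each application
# with its role, main input, and main output, in processing order.
chain = [
    ("Gauss",   "simulation",     "GenParts/MCParts", "MCHits"),
    ("Boole",   "digitisation",   "MCHits",           "Raw-like data"),
    ("Moore",   "trigger (HLT)",  "Raw data",         "trigger decision"),
    ("Brunel",  "reconstruction", "Raw data",         "(r)DST"),
    ("DaVinci", "analysis",       "DST",              "AOD / Ntuples"),
]

def downstream_of(app):
    """Return the applications that run after `app` in the chain."""
    names = [name for name, *_ in chain]
    return names[names.index(app) + 1:]
```

For example, everything after digitisation (trigger, reconstruction, analysis) is downstream of Boole.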

4 LHCb software stack
- Uses CMT for build and configuration (handling dependencies)
- LHCb projects:
  - Applications
    - Gauss (simulation), Boole (digitisation), Brunel (reconstruction), Moore (HLT), DaVinci (analysis)
  - Algorithms
    - Lbcom (common packages), Rec (reconstruction), Phys (physics), Online
  - Event model
    - LHCb
  - Software framework
    - Gaudi
  - LCG Applications Area
    - POOL, ROOT, COOL
  - LCG/external
    - External SW: Boost, Xerces... also middleware clients (LFC, GFAL, ...)
[Diagram: the stack, from the LCG externals (SEAL, POOL, Root, Ext. Libs, COOL, CORAL, Geant4, GENSER) through the Gaudi framework, the LHCb event model, and the component projects (Lbcom, Rec, Phys, Online) up to the applications (Gauss, Boole, Brunel, Moore, DaVinci, Panoramix)]

5 LHCb basic computing principles
- Raw data shipped in real time to Tier-0
  - Registered in the Grid (File Catalog)
  - Raw data provenance kept in a Bookkeeping database (query-enabled)
  - Resilience enforced by a second copy at the Tier-1s
  - Rate: ~2000 evts/s at ~35 kB/event, i.e. ~70 MB/s
  - 4 main trigger sources (with little overlap)
    - b-exclusive; dimuon; D*; b-inclusive
- All data processing, up to final Tuple or histogram production, is distributed
  - It is not even possible to reconstruct all data at Tier-0...
- Part of the analysis is not data-related
  - Extracting physics parameters on CP violation (toy MC, complex fitting procedures...)
  - This also uses distributed computing resources
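The quoted bandwidth follows directly from the trigger rate and event size. A one-line sanity check (variable names are mine, not from the talk):

```python
# Sanity check of the slide's numbers: 2000 events/s at ~35 kB/event.
event_rate_hz = 2000    # trigger output rate to storage
event_size_kb = 35      # current RAW event size estimate

bandwidth_mb_s = event_rate_hz * event_size_kb / 1000  # kB/s -> MB/s
```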

6 Basic principles (cont'd)
- LHCb runs jobs where the data are
  - All data are placed explicitly
- Analysis is made possible by reduction of datasets
  - Many different channels of interest
  - Very few events in each channel (from 10^2 to 10^6 events/year)
  - A physicist deals with at most ~10^7 events
  - Small and simple events
  - Final dataset manageable on a physicist's desktop (100s of GB)
- Calibration and alignment performed on a selected part of the data stream
  - Alignment and tracking calibration using dimuons (~200/s)
  - PID calibration using D* (~100/s)
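The "manageable on a desktop" claim can be checked with a back-of-the-envelope sketch, taking the DST event size and the post-analysis reduction factor from figures quoted later in the talk (this is not LHCb code; names are mine):

```python
# Rough size of a physicist's final dataset.
max_events = 10**7       # maximum events a physicist deals with
dst_event_kb = 110       # current DST size estimate (quoted later in the talk)
reduction_factor = 5     # event size reduction after analysis (quoted later)

dataset_gb = max_events * dst_event_kb / reduction_factor / 1e6
# a couple of hundred GB: indeed manageable on a desktop
```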

7 LHCb dataflow
[Diagram: Raw data flows from Online to the Tier-0 MSS-SE and on to the Tier-1 MSS-SEs; reconstruction (Raw/Digi -> rDST) and stripping (rDST+Raw -> DST) run against the Tier-1 MSS-SEs; simulation at Tier-2s produces Digi data; analysis at Tier-1s consumes DSTs]

8 Comments on LHCb distributed computing
- Only the last part of the analysis is foreseen to be "interactive"
  - Either analysing ROOT trees or using GaudiPython/PyROOT
- User analysis at Tier-1s: why?
  - Analysis is very delicate and needs careful file placement
  - Tier-1s are easier to check, and (in principle) less prone to outages
  - CPU requirements are very modest
- What is LHCb's concept of the Grid?
  - It is a set of computing resources working in a collaborative way
  - It provides computing resources for the collaboration as a whole
  - Recognition of contributions is independent of what type of jobs run at a site
    - There are no noble and less noble tasks; all are needed to make the experiment a success
    - Resources are not reserved for a site's own nationals
  - High availability of resources is the key issue

9 How to best achieve distributed computing?
- Data Management is paramount
  - It was almost completely absent from EDG
    - R&D took place but didn't deliver anything usable
  - (Too) few resources are allocated in EGEE
    - The successful packages were developed in close collaboration with the VOs
    - LFC, FTS: very close contact with users
    - SRM v2.2 specification: done at the experiments' request and with their participation
- Infrastructure is vital
  - Resource management
  - 24x7 support coverage
  - Reliable and powerful networks (OPN)
- Resource sharing is a must
  - Less support needed
  - Best resource usage (fewer idle CPUs, empty tapes, unused networks...)
  - ... but opportunistic resources should not be neglected

10 How to best achieve distributed computing (cont'd)
- Workload Management
  - This received most of the development effort (EDG, EGEE)
  - Developments were not (and are still not) done in close collaboration with users
    - Experiments participate in TCG meetings, but their experience is not sufficiently taken into account
  - Experiments had to develop their own solutions to implement what they needed
  - A bit of history...
    - 2000-2004: EDG (R&D)
    - 2004: the LCG ARDA RTAG, which generated great hopes...
    - 2004-: EGEE WMS re-engineering, still not fully exposed to the experiments and not at the expected level (although more stable)
      - Analysis tasks require ~99% efficiency
    - In parallel, experiments developed their own solutions to cope with these inefficiencies: AliEn, DIRAC
      - These also let them deal with heterogeneous Grids
      - ... and take advantage of opportunistic resources

11 LHCb distributed computing software
- Integrated WMS and DMS: DIRAC
  - Presentations by Andrei and Andrew on Sunday
- Distributed analysis portal: GANGA
  - Presentation by Ulrik on Friday
  - Uses the DIRAC WMS/DMS as back-end
- Main characteristics
  - Implements late job scheduling
  - Overlay network (pilot agents, central task queue)
  - Allows LHCb policy to be enforced
  - Reduces the level of support required from sites
  - LHCb services designed to be redundant and hence highly available (multiple instances with failover, VO-BOXes)
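The "late scheduling" idea above (a central task queue drained by pilot agents) can be illustrated with a toy sketch. This is not DIRAC code; all names are hypothetical, and real pilots match on much more than input data.

```python
import queue

# Toy model of late job scheduling: jobs sit in a central task queue,
# and a pilot agent running on a worker node pulls a job only after it
# has inspected its local environment (here: which datasets the site
# hosts). The binding of job to CPU happens at the last moment.
task_queue = queue.Queue()

def submit(job):
    """Add a job description to the central task queue."""
    task_queue.put(job)

def pilot_pull(site_datasets):
    """Called by a pilot agent: return the first queued job whose input
    data is available at this site, leaving the others queued."""
    deferred = []
    matched = None
    while not task_queue.empty():
        job = task_queue.get()
        if matched is None and job["input_data"] in site_datasets:
            matched = job
        else:
            deferred.append(job)
    for job in deferred:          # requeue jobs this site cannot run
        task_queue.put(job)
    return matched
```

A pilot at a site holding only `dst-B` would skip a reconstruction job needing `raw-A` and pick up an analysis job on `dst-B`, while the skipped job stays queued for a better-placed pilot.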

12 The LHCb Tier-1s
- 6 Tier-1s
  - CNAF (IT, Bologna)
  - GridKa (DE, Karlsruhe)
  - IN2P3 (FR, Lyon)
  - NIKHEF (NL, Amsterdam)
  - PIC (ES, Barcelona)
  - RAL (UK, Didcot)
- They contribute to
  - Reconstruction
  - Stripping
  - Analysis
- They keep copies on MSS of
  - Raw (2 copies, shared)
  - Locally produced rDST
  - DST (2 copies)
  - MC data (2 copies)
- They keep copies on disk of
  - DST (7 copies)

13 LHCb computing: a few numbers
- Event sizes: on persistent medium (not in memory)
- Processing time: best estimates as of today
- Requirements for 2008: 4 x 10^6 seconds of beam

Event size (kB)   TDR estimate   Current estimate
RAW               25             35
rDST              25             20
DST               100            110

Event processing (kSI2k.s)
Reconstruction    2.4
Stripping         0.2
Analysis          0.3
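The 2008 totals used on the following slides follow from the beam time above and the trigger rate quoted earlier. A sketch (variable names are mine):

```python
# 2008 event sample and raw data volume, from the quoted inputs.
beam_seconds = 4e6        # 4 x 10^6 s of beam in 2008
trigger_rate_hz = 2000    # ~2000 events/s to storage

events_2008 = beam_seconds * trigger_rate_hz   # 8 x 10^9 events
raw_tb_per_copy = events_2008 * 35 / 1e9       # at 35 kB/event, ~280 TB of RAW per copy
```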

14 Reconstruction requirements
- 2 passes per year:
  - 1 in quasi real time over a ~100-day period (2.8 MSI2k)
  - re-processing over the 2-month shutdown period (4.3 MSI2k)
    - Makes use of the filter farm at the pit (2.2 MSI2k); data shipped back to the pit

                    b-exclusive   Dimuon       D*           b-inclusive   Total
Input fraction      0.1           0.3          0.15         0.45          1.0
Number of events    8 x 10^8      2.4 x 10^9   1.2 x 10^9   3.6 x 10^9    8 x 10^9
MSS storage (TB)    16            48           24           72            160
CPU (MSI2k.yr)      0.15          0.45         0.23         0.68          1.52
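The table rows can be cross-checked: events per stream are the input fraction times the 8 x 10^9 total, and MSS storage is events times the 20 kB rDST size from the previous slide. A sketch, not LHCb code:

```python
# Cross-check of the reconstruction table.
total_events = 8e9
input_fraction = {"b-exclusive": 0.10, "dimuon": 0.30,
                  "D*": 0.15, "b-inclusive": 0.45}
rdst_kb = 20   # rDST size per event (previous slide)

events = {s: f * total_events for s, f in input_fraction.items()}
mss_tb = {s: n * rdst_kb / 1e9 for s, n in events.items()}
```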

15 Stripping requirements
- Stripping 4 times per year; 1 month of production outside of reconstruction
- Stripping has at least 4 output streams
- Only rDST+RAW stored for "non-b" channels, i.e. 55 kB; RAW + full DST for "b" channels, i.e. 110 kB
- Output on disk SEs at all Tier-1 centres

                              b-exclusive   Dimuon       D*           b-inclusive   Total
Input fraction                0.1           0.3          0.15         0.45          1.00
Reduction factor              10            5            5            100           9.57
Event yield per stripping     8 x 10^7      4.8 x 10^8   2.4 x 10^8   3.6 x 10^7    8.4 x 10^8
CPU (MSI2k.yr)                0.02          0.06         0.03         0.02          0.11
Storage per stripping (TB)    9             26           13           4             52
TAG (TB)                      1             2            1            4             8
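The event yields follow from the input fractions, the 8 x 10^9 yearly total, and the per-stream reduction factors. A sketch (stream names as in the table):

```python
# Cross-check of the stripping yields: fraction x total / reduction.
total_events = 8e9
streams = {  # stream: (input fraction, reduction factor)
    "b-exclusive": (0.10, 10),
    "dimuon":      (0.30, 5),
    "D*":          (0.15, 5),
    "b-inclusive": (0.45, 100),
}

yields = {s: f * total_events / r for s, (f, r) in streams.items()}
total_yield = sum(yields.values())   # ~8.4 x 10^8 events per stripping
```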

16 Simulation requirements
- Studies to measure the performance of the detector and event selection in particular regions of phase space
- Use large-statistics dimuon and D* samples for systematics, which reduces the Monte Carlo needs

            Application   Nos. of events   CPU time/evt (kSI2k.s)   Total CPU (MSI2k.yr)
Signal      Gauss         8 x 10^8         75                       1.9
            Boole         8 x 10^8         1                        0.03
            Brunel        8 x 10^7         2.4                      0.01
Inclusive   Gauss         8 x 10^8         75                       1.9
            Boole         8 x 10^8         1                        0.03
            Brunel        8 x 10^7         2.4                      0.01
Total                                                               3.87
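The "Total CPU" column is the event count times the per-event cost, converted to MSI2k.years. A sketch; the year length used is my assumption (~3.15 x 10^7 s):

```python
# Convert an event count and per-event CPU cost into MSI2k.years.
SECONDS_PER_YEAR = 3.15e7   # assumed: roughly one calendar year

def cpu_msi2k_years(n_events, ksi2k_s_per_event):
    return n_events * ksi2k_s_per_event / SECONDS_PER_YEAR / 1e3

gauss_signal = cpu_msi2k_years(8e8, 75)   # ~1.9 MSI2k.yr, as in the table
```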

17 Simulation storage requirements
- Simulation still dominates LHCb's CPU needs
- Current event size for a Monte Carlo DST (with truth info) is ~400 kB/evt
- Total storage needs: 64 TB in 2008
- Output at CERN, with another 2 copies distributed over the Tier-1 centres

            Output   Nos. of events   Storage/evt (kB)   Total storage (TB)
Signal      DST      8 x 10^7         400                32
            TAG      8 x 10^7         1                  0.1
Inclusive   DST      8 x 10^7         400                32
            TAG      8 x 10^7         1                  0.1
Total                                                    64
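The storage totals are again simple products. A sketch of the DST lines (the TAG contribution, ~0.2 TB, is neglected in the rounded total):

```python
# DST storage per sample: 8 x 10^7 events at ~400 kB/event.
dst_events = 8e7
mc_dst_kb = 400   # MC DST size including truth info

signal_dst_tb = dst_events * mc_dst_kb / 1e9   # 32 TB per sample
total_tb = 2 * signal_dst_tb                   # signal + inclusive; TAGs neglected
```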

18 Analysis requirements
- User analysis accounted for in the model is predominantly batch: ~30k jobs/year
- Jobs predominantly analyse ~10^6 events each
- CPU cost of 0.3 kSI2k.s/evt
- Analysis needs grow linearly with time in the early phase of the experiment

Nos. of physicists performing analysis       140
Nos. of analysis jobs per physicist/week     4
Event size reduction factor after analysis   5
Number of "active" Ntuples                   10
2008 CPU needs (MSI2k.yr)                    0.31
2008 disk storage (TB)                       80
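The "~30k jobs/year" and the CPU figure follow from the table. A sketch; the 52 weeks/year and the year length in seconds are my assumptions:

```python
# Where the analysis load estimate comes from.
physicists = 140
jobs_per_physicist_per_week = 4

jobs_per_year = physicists * jobs_per_physicist_per_week * 52   # ~30k

events_per_job = 1e6
ksi2k_s_per_event = 0.3
cpu_msi2k_yr = jobs_per_year * events_per_job * ksi2k_s_per_event / 3.15e7 / 1e3
# ~0.28 MSI2k.yr, in the ballpark of the 0.31 quoted on the slide
```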

19 Summary (incl. efficiencies) for 2008

CPU needs in 2008 (MSI2k.yr)
Recons.   Stripping   Simulation   Analysis
1.4       0.5         4.6          0.5

Data on disk in 2008 (TB)
RAW   rDST   Stripped   Simulation   Analysis
76    43     775        375          114

Data on tape in 2008 (TB)
RAW   rDST   Stripped   Simulation   Analysis
560   320    483        128          -

20 ICFA Workshop on Grid, Sinaia, Romania, 13-17 October 2006 LHCb Computing Model, PhC 20 Summary & evolution of requirements

21 Conclusions
- LHCb has proposed a Computing Model adapted to its specific needs (number of events, event size, low number of physics candidates)
- Reconstruction, stripping and analysis resources are located at Tier-1s (and possibly some Tier-2s with enough storage and CPU capacity)
- CPU requirements are dominated by Monte Carlo production, assigned to Tier-2s and opportunistic sites
  - With DIRAC, even idle desktops/laptops could be used ;-)
    - LHCb@home?
- Requirements are modest compared to the other experiments
- DIRAC is well suited and adapted to this computing model
  - Integrated WMS and DMS
- GANGA is used more and more for submitting user analysis to the Grid
- LHCb's computing should be ready when the first data come

22 Hot news! Stop press!
16 October 22:29 CET: Test jobs running successfully at NIPNE!
16 October 21:32 CET

