1 LHCb Computing and Grid Status Glenn Patrick LHCb(UK), Dublin – 23 August 2005

2 Computing completes the TDRs: the LHCb Technical Design Reports span Jan 2000 to June 2005, with the Computing TDR (June 2005) completing the series.

3 LHCb – June 2005: detector status picture (03 June 2005) with the HCAL, ECAL, LHCb Magnet and the muon filters MF1-MF3 and MF4 labelled.

4 Grid World – trigger and data flow: Online System (40 MHz) → Level-0 hardware (1 MHz) → Level-1 software (40 kHz) → HLT software (2 kHz) → Tier 0 (raw data: 2 kHz, 50 MB/s) → Tier 1s.
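
A quick consistency check of those raw-data numbers (the ~10^7 s of effective running per year is the usual assumption, not stated on the slide); a minimal Python sketch:

    # Rough check of the raw-data figures quoted above.
    # Assumption: ~1e7 seconds of effective data taking per year.
    hlt_rate_hz = 2_000          # HLT output rate (2 kHz)
    raw_throughput_mb_s = 50.0   # raw data rate to Tier 0 (50 MB/s)
    seconds_per_year = 1e7       # assumed running time per year

    event_size_kb = raw_throughput_mb_s * 1000 / hlt_rate_hz        # ~25 kB/event
    raw_volume_tb = raw_throughput_mb_s * seconds_per_year / 1e6    # ~500 TB/year

    print(f"Implied raw event size : {event_size_kb:.0f} kB")
    print(f"RAW data volume / year : {raw_volume_tb:.0f} TB (cf. 500 TB on slide 11)")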

5 HLT Output

                      b-exclusive  dimuon  D*    b-inclusive  Total
Trigger rate (Hz)     200          600     300   900          2000
Fraction              10%          30%     15%   45%          100%
Events/year (10^9)    2            6       3     9            20

The 200 Hz hot stream will be fully reconstructed on the online farm in real time; the "hot stream" (RAW + rDST) is written to Tier 0. The full 2 kHz of RAW data is written to Tier 0 for reconstruction at CERN and the Tier 1s. Calibration for proper-time resolution. Clean peak allows PID calibration. Understand bias on other B selections.
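
The events-per-year row follows from the trigger rates under the same assumed ~10^7 s year; a minimal check:

    # Reproduce the "Events/year" row from the trigger rates.
    # Assumption: ~1e7 s of effective data taking per year.
    seconds_per_year = 1e7
    rates_hz = {"b-exclusive": 200, "dimuon": 600, "D*": 300, "b-inclusive": 900}

    for stream, rate in rates_hz.items():
        print(f"{stream:12s}: {rate * seconds_per_year / 1e9:.0f} x 10^9 events/year")

    total = sum(rates_hz.values()) * seconds_per_year
    print(f"Total       : {total / 1e9:.0f} x 10^9 events/year")   # 20 x 10^9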

6 Data Flow: Simulation (Gauss) → Digitisation (Boole) → Reconstruction (Brunel) → Analysis (DaVinci). Data objects along the chain: MC Truth, Raw Data, DST, Stripped DST, Analysis Objects. All applications are built on the Gaudi framework and share the Detector Description, the Conditions Database and the Event Model / Physics Event Model.
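
The application chain can be pictured as a simple pipeline. The sketch below is purely illustrative: the stage names mirror the LHCb applications, but the classes and data fields are invented for the example and are not the Gaudi API:

    from dataclasses import dataclass, field

    @dataclass
    class Event:
        """Toy event record; real LHCb events live in the Gaudi event store."""
        mc_truth: dict = field(default_factory=dict)
        raw: dict = field(default_factory=dict)
        dst: dict = field(default_factory=dict)

    def gauss(event):    # simulation: generate MC truth
        event.mc_truth = {"particles": ["B0", "pi+", "pi-"]}
        return event

    def boole(event):    # digitisation: MC truth -> raw-data-like banks
        event.raw = {"banks": len(event.mc_truth["particles"])}
        return event

    def brunel(event):   # reconstruction: raw data -> DST
        event.dst = {"tracks": event.raw["banks"]}
        return event

    def davinci(event):  # analysis: keep only events passing a selection
        return event if event.dst["tracks"] >= 2 else None

    event = Event()
    for stage in (gauss, boole, brunel, davinci):
        event = stage(event)
        if event is None:
            break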

7 Grid Architecture: UK Tier 1 centre (RAL) + 4 virtual Tier 2 centres. LCG-2/EGEE is the world's largest Grid: ~16,000 CPUs and 5 PB over 192 sites in ~39 countries. GridPP provides ~3,000 CPUs at 20 UK sites.

8 Grid Ireland: EGEE is made up of regions. The UKI region consists of 3 federations: GridPP, Grid Ireland and the National Grid Service. ("We are here" marker on the map.)

9 LHCb Computing Model (14 candidates). The CERN Tier 1 is essential for accessing the "hot stream" for: 1. first alignment & calibration; 2. first high-level analysis.

10 LHC Comparison

Experiment   Tier 1                            Tier 2
ALICE        Reconstruction                    MC production
                                               Chaotic analysis
ATLAS        Reconstruction                    Simulation
             Scheduled analysis/stripping      Analysis
                                               Calibration
CMS          Reconstruction                    Analysis for 20-100 users
                                               All simulation production
LHCb         Reconstruction                    MC production
             Scheduled stripping               No analysis
             Chaotic analysis

11 Distributed Data
RAW DATA (500 TB): CERN holds the master copy; a 2nd copy is distributed over the six Tier 1s.
RECONSTRUCTION (500 TB/pass):
Pass 1: during data taking at CERN and the Tier 1s (7 months).
Pass 2: during the winter shutdown at CERN, the Tier 1s and the online farm (2 months).
STRIPPING (140 TB/pass/copy):
Pass 1: during data taking at CERN and the Tier 1s (7 months).
Pass 2: after data taking at CERN and the Tier 1s (1 month).
Pass 3: during the shutdown at CERN, the Tier 1s and the online farm.
Pass 4: before the next year's data taking at CERN and the Tier 1s (1 month).
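
For the raw data alone, the model above implies the following split (the even sharing of the second copy across the six Tier 1s is the nominal assumption; the arithmetic is only illustrative):

    # Raw-data bookkeeping implied by the slide.
    raw_tb = 500.0                  # one year of RAW data
    n_tier1 = 6                     # Tier 1s sharing the second copy

    cern_master_tb = raw_tb         # master copy kept at CERN
    per_tier1_tb = raw_tb / n_tier1 # 2nd copy spread over six Tier 1s
    print(f"RAW at CERN (master copy) : {cern_master_tb:.0f} TB")
    print(f"RAW per Tier 1 (2nd copy) : {per_tier1_tb:.1f} TB")   # ~83 TB each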

12 Stripping Job – 2005
Job logic: read INPUTDATA and stage the files in one go; check the file status and wait for any not yet staged; check file integrity (bad-file info is sent to the Prod DB); run DaVinci stripping over the file groups (group1 ... groupN); send the file info for good files; merge the resulting DST and ETC output. Uses SRM for storage access.
Stripping runs on reduced DSTs (rDST). Pre-selection algorithms categorise events into streams; events that pass are fully reconstructed and full DSTs written. CERN, CNAF and PIC used so far – sites based on CASTOR.
(A schematic of the job logic is sketched below.)
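
An outline of that stripping-job logic in pseudo-Python; all function names here are invented for illustration and are not DIRAC or Gaudi calls:

    # Illustrative outline of the 2005-style stripping job (names are invented).
    import time

    def stripping_job(input_lfns, stage, is_staged, check_integrity,
                      run_davinci_stripping, report_bad_file, send_file_info, merge):
        stage(input_lfns)                          # stage all input files in one go

        staged = []
        while len(staged) < len(input_lfns):       # wait for files to come online
            staged = [lfn for lfn in input_lfns if is_staged(lfn)]
            if len(staged) < len(input_lfns):
                time.sleep(60)                     # poll again in a minute

        good_outputs = []
        for lfn in staged:
            if not check_integrity(lfn):           # corrupt file: tell the Prod DB
                report_bad_file(lfn)
                continue
            dst, etc = run_davinci_stripping(lfn)  # DaVinci pre-selection on the rDST
            send_file_info(lfn)                    # bookkeeping for the good file
            good_outputs.append((dst, etc))

        return merge(good_outputs)                 # merge DSTs and event-tag collections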

13 LHCb Resource Profile

Global CPU (MSI2k.yr)   2006   2007   2008   2009   2010
CERN                    0.27   0.54   0.90   1.25   1.88
Tier-1s                 1.33   2.65   4.42   5.55   8.35
Tier-2s                 2.29   4.59   7.65   7.65   7.65
TOTAL                   3.89   7.78  12.97  14.45  17.88

Global DISK (TB)        2006   2007   2008   2009   2010
CERN                     248    496    826   1095   1363
Tier-1s                  730   1459   2432   2897   3363
Tier-2s                    7     14     23     23     23
TOTAL                    984   1969   3281   4015   4749

Global MSS (TB)         2006   2007   2008   2009   2010
CERN                     408    825   1359   2857   4566
Tier-1s                  622   1244   2074   4285   7066
TOTAL                   1030   2069   3433   7144  11632

14 Comparisons – CPU: integrated Tier 1 CPU for 2008, 2009 and 2010 by experiment (plot from Nick Brook), with the LHCb share indicated.

15 Comparisons – Disk: CERN, Tier-1 and Tier-2 disk from the LCG TDR (LHCC, 29.6.2005, Jurgen Knobloch); only 54% of the requirement is pledged.

16 UK Tier 1 Status
Total available (August 2005): CPU = 796 KSI2K (500 dual-CPU nodes), disk = 187 TB (60 servers), tape = 340 TB.
Minimum required Tier 1 in 2008: CPU = 4732 KSI2K, disk = 2678 TB, tape = 2538 TB.
LHCb(UK) 2008 (15% share): CPU = 663 KSI2K, disk = 365 TB, tape = 311 TB.
LHCb(UK) 2008 (1/6 share): CPU = 737 KSI2K, disk = 405 TB, tape = 346 TB.
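
The LHCb(UK) rows can be reproduced from LHCb's 2008 Tier-1 totals on slide 13 (CPU 4.42 MSI2k.yr = 4420 KSI2K, disk 2432 TB, MSS 2074 TB); the 15% and 1/6 UK shares are the assumptions stated on this slide:

    # Scale LHCb's 2008 Tier-1 requirements (slide 13) by the assumed UK share.
    lhcb_tier1_2008 = {"CPU (KSI2K)": 4420, "Disk (TB)": 2432, "Tape (TB)": 2074}

    for label, share in [("15% share", 0.15), ("1/6 share", 1 / 6)]:
        scaled = {k: round(v * share) for k, v in lhcb_tier1_2008.items()}
        print(label, scaled)
    # 15% share -> CPU ~663, disk ~365, tape ~311 (as on this slide)
    # 1/6 share -> CPU ~737, disk ~405, tape ~346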

17 UK Tier 1 Utilisation: The hardware purchase scheduled for early 2005 was postponed; PPARC discussions are ongoing. (Plot of capacity versus Grid and non-Grid usage; 70% marker.) Grid use is increasing, but the CPU is "undersubscribed" (and the efficiency of Grid jobs may be a problem): LHCb CPU/walltime was 69% over Jan-July 2005, while CPU/walltime < 50% for some ATLAS jobs.

18 UK Tier 1 Exploitation: usage plots for 2004 and 2005 (up to 17.8.05), showing the BaBar, LHCb and ATLAS shares.

19 UK Tier 1 Storage
The Classic SE is not sufficient as an LCG storage solution; SRM is now the agreed interface to storage resources. Lack of SRM prevented data stripping at the UK Tier 1. This year a new storage infrastructure was deployed for the UK Tier 1:
Storage Resource Manager (SRM) – interface providing a combined view of secondary and tertiary storage to Grid clients.
dCache – disk pool management system jointly developed by DESY and Fermilab. Single namespace to manage 100s of TB of data; access via GridFTP and SRM; interfaced to the RAL tape store.
CASTOR – under evaluation as a replacement for the home-grown (ADS) tape service. CCLRC to deploy a 10,000-tape robot?
LHCb now has a disk allocation of 8.2 TB, with 4 x 1.6 TB under dCache control (c.f. BaBar = 95 TB, ATLAS = 19 TB, CMS = 40 TB). The Computing Model says the LHCb Tier 1 should have ~122 TB in 2006...
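
Conceptually, SRM gives Grid clients one interface whether a replica already sits on disk or has to be recalled from tape. The toy sketch below only illustrates that idea; nothing in it is the real SRM protocol or the dCache/CASTOR API:

    # Toy model of the SRM idea: one interface over disk and tape back-ends.
    from abc import ABC, abstractmethod

    class StorageBackend(ABC):
        @abstractmethod
        def bring_online(self, surl: str) -> str:
            """Make the file available and return a transfer URL."""

    class DiskPool(StorageBackend):
        def bring_online(self, surl):
            return surl.replace("srm://", "gsiftp://")    # already on disk

    class TapeStore(StorageBackend):
        def bring_online(self, surl):
            # a real SRM would queue a tape recall and let the client poll
            print(f"staging {surl} from tape...")
            return surl.replace("srm://", "gsiftp://")

    def srm_get(surl: str, backend: StorageBackend) -> str:
        """What the client sees: ask for a file, get back somewhere to read it."""
        return backend.bring_online(surl)

    print(srm_get("srm://ral.example/lhcb/dst/file001.dst", TapeStore()))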

20 UK Tier 2 Centres

Committed resources available to each experiment at Tier 2 in 2007, relative to the size of an average Tier 2 in that experiment's computing model:

               CPU                            Disk
               ALICE  ATLAS  CMS   LHCb       ALICE  ATLAS  CMS   LHCb
London          0.0    1.0   0.8   0.4         0.0    0.2   0.3   11.0
NorthGrid       0.0    2.5   0.0   0.3         0.0    1.3   0.0   12.1
ScotGrid        0.0    0.2   0.0   0.2         0.0     -     -    39.6
SouthGrid       0.2    0.5   0.2   0.3         0.0    0.1   0.0    6.8

Hopefully more resources will come from future funding bids, e.g. SRIF3 (April 2006 – March 2008).
Under-delivered at Tier 1 + Tier 2 (March 2005): CPU = 2277 KSI2K out of 5184 KSI2K, disk = 280 TB out of 968 TB. Improving as hardware is deployed at the Tier 2 institutes.

21 Tier 2 Exploitation: over 40 sites in the UKI federation of EGEE and over 20 Virtual Organisations. (Accounting plot from the Grid Operations Centre, 17 Aug, GridPP only – does not include Grid Ireland – showing LHCb, CMS, ATLAS and BaBar; 800 data points. An improved accounting prototype is on the way, but you get the idea.) Tier 2 sites are a vital LHCb Grid resource.

22 DIRAC Architecture – a Services Oriented Architecture.
User interfaces: GANGA UI, user CLI, production manager, job monitor, bookkeeping query web page, FileCatalog browser.
DIRAC services: Job Management Service, JobMonitorSvc, JobAccountingSvc (with AccountingDB), InformationSvc, FileCatalogSvc, MonitoringSvc, BookkeepingSvc.
DIRAC resources: DIRAC CEs and DIRAC sites running Agents, LCG accessed through the Resource Broker (CE 1, CE 2, CE 3), and DIRAC Storage (disk files accessed via gridftp, bbftp, rfio).
(A sketch of the agent pull model is given below.)
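
DIRAC's site agents pull work from the central job queue rather than having jobs pushed to them. A minimal sketch of that pull loop follows; the class and method names are invented for illustration and are not the real DIRAC services:

    # Sketch of a DIRAC-style pull agent: the site agent asks the central
    # Job Management Service for work matching its local resources.
    class SiteAgent:
        def __init__(self, site, capacity, jms):
            self.site = site          # e.g. "DIRAC.SomeSite.uk" (made-up name)
            self.capacity = capacity  # free worker slots at the site
            self.jms = jms            # proxy to the central Job Management Service

        def cycle(self):
            while self.capacity > 0:
                job = self.jms.request_job(site=self.site)   # pull, not push
                if job is None:
                    break                                    # nothing matches us
                self.capacity -= 1
                self.run(job)

        def run(self, job):
            print(f"{self.site}: running job {job['id']}")
            self.jms.report_status(job["id"], "Running")

    # a real agent would loop forever, sleeping between cycles:
    #   while True: agent.cycle(); sleep(120)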

23 Data Challenge 2004: 187 M events produced. (Production-rate plot annotations: DIRAC alone; LCG in action, 1.8 x 10^6 events/day; LCG paused; LCG restarted; Phase 1 completed, 3-5 x 10^6 events/day.) 20 DIRAC sites + 43 LCG sites were used, with data written to the Tier 1s. Overall, 50% of the events were produced using LCG; by the end, 75% were produced by LCG. The UK was the second largest producer (25%) after CERN.

24 RTTC – 2005: Real Time Trigger Challenge, May/June 2005. 150 M minimum-bias events to feed the online farm and test the software trigger chain. Completed in 20 days (169 M events) on 65 different sites: 95% produced at LCG sites, 5% at "native" DIRAC sites. Average of 10 M events/day and ~4,000 CPUs.

Events produced by country:
UK              60 M   (37%)
Italy           42 M
Switzerland     23 M
France          11 M
Netherlands     10 M
Spain            8 M
Russia           3 M
Greece         2.5 M
Canada           2 M
Germany        0.3 M
Belgium        0.2 M
Sweden         0.2 M
Romania, Hungary, Brazil, USA   0.8 M
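
Summing the listed per-country totals and taking the UK fraction reproduces the 37% figure:

    # Check the UK share from the per-country totals above (in millions of events).
    events_m = {
        "UK": 60, "Italy": 42, "Switzerland": 23, "France": 11, "Netherlands": 10,
        "Spain": 8, "Russia": 3, "Greece": 2.5, "Canada": 2, "Germany": 0.3,
        "Belgium": 0.2, "Sweden": 0.2, "Romania/Hungary/Brazil/USA": 0.8,
    }
    total = sum(events_m.values())
    print(f"Total listed : {total:.1f} M events")
    print(f"UK share     : {events_m['UK'] / total:.0%}")   # ~37%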

25 Looking Forward (timeline 2005-2008: SC3, SC4, LHC service operation; cosmics, first beams, first physics, full physics run)
Next challenge: SC3 – Sept. 2005.
Analysis at Tier 1s – Nov. 2005.
Start of the DC06 processing phase – May 2006.
Alignment/calibration challenge – October 2006.
Ready for data taking – April 2007.
Excellent support from the UK Tier 1 at RAL: 2 application support posts at the Tier 1 were appointed in June 2005, BUT the LHCb(UK) technical co-ordinator is still to be appointed.

26 LHCb and SC3
Phase 1 (from Sept. 2005):
a) Movement of 8 TB of digitised data from CERN/Tier 0 to the LHCb Tier 1 centres in parallel over a 2-week period (~10k files). Demonstrate automatic tools for data movement and bookkeeping.
b) Removal of replicas (via LFN) from all Tier 1 centres.
c) Redistribution of 4 TB of data from each Tier 1 centre to Tier 0 and the other Tier 1 centres over a 2-week period. Demonstrate data can be redistributed in real time to meet stripping demands.
d) Moving of stripped DST data (~1 TB, 190k files) from CERN to all Tier 1 centres.
Phase 2 (from Oct. 2005):
a) MC production in Tier 2 centres with DST data collected in Tier 1 centres in real time, followed by stripping in the Tier 1 centres (2 months). Data stripped as it becomes available.
b) Analysis of stripped data in Tier 1 centres.
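
For scale, the Phase 1a transfer works out to a fairly modest sustained rate if the 2-week window is taken at face value:

    # Sustained rate implied by Phase 1a: 8 TB in ~10k files over 2 weeks.
    data_tb, n_files, days = 8.0, 10_000, 14

    rate_mb_s = data_tb * 1e6 / (days * 86_400)
    avg_file_mb = data_tb * 1e6 / n_files
    print(f"Average sustained rate : {rate_mb_s:.1f} MB/s")   # ~6.6 MB/s
    print(f"Average file size      : {avg_file_mb:.0f} MB")   # ~800 MB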

27 GridPP Status
GridPP1 – prototype Grid: £17M, complete, September 2001 – August 2004.
GridPP2 – production Grid: £16M, ~20% complete, September 2004 – August 2007.
Beyond August 2007? Funding from September 2007 will be incorporated as part of PPARC's request for planning input for LHC exploitation, to be considered by a panel (G. Lafferty, S. Watts & P. Harris) providing input to the Science Committee in the autumn. Input from ALICE, ATLAS, CMS, LHCb and GridPP.

28 LCG Status (timeline: LCG-2 (= EGEE-0) prototyping product in 2004-2005; LCG-3 (= EGEE-x?) product; SC3, SC4, LHC service operation, first beams, first physics, full physics run)
LCG has two phases.
Phase 1 (2002 – 2005): build a service prototype based on existing grid middleware; gain experience in running a production grid service; produce the TDR for the final system. The LCG and experiment TDRs have been submitted.
Phase 2 (2006 – 2008): build and commission the initial LHC computing environment. (Timeline marker: "we are here".)

29 UK: Workflow Control – Production Desktop, Gennady Kuznetsov (RAL).
Workflows are built from Steps, each made of Modules. Steps: Sim (Gauss), Digi (Boole), Reco (Brunel), each with a primary-event and a spill-over (minimum-bias) variant: Gauss / Gauss MB, Boole / Boole MB, Brunel / Brunel MB. Modules within a step: software installation, Gauss execution, check logfile, dir listing, bookkeeping report. Used for the RTTC and for current production/stripping. (A toy version of the step/module structure is sketched below.)
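
A toy rendering of that step/module decomposition; it is illustrative only (the real Production Desktop generates workflows for DIRAC, and the class names here are invented):

    # Toy step/module workflow in the spirit of the Production Desktop.
    class Module:
        def __init__(self, name, action):
            self.name, self.action = name, action
        def execute(self, context):
            print(f"  module: {self.name}")
            self.action(context)

    class Step:
        def __init__(self, name, modules):
            self.name, self.modules = name, modules
        def execute(self, context):
            print(f"step: {self.name}")
            for module in self.modules:
                module.execute(context)

    def noop(context):   # stand-in for real work (install software, run Gauss, ...)
        pass

    def standard_modules(app):
        return [Module("software installation", noop),
                Module(f"{app} execution", noop),
                Module("check logfile", noop),
                Module("dir listing", noop),
                Module("bookkeeping report", noop)]

    workflow = [Step("Sim (Gauss)", standard_modules("Gauss")),
                Step("Digi (Boole)", standard_modules("Boole")),
                Step("Reco (Brunel)", standard_modules("Brunel"))]
    for step in workflow:
        step.execute(context={})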

30 UK: LHCb Metadata and ARDA – Carmine Cioffi (Oxford).
Architecture: a web browser talks to a Tomcat servlet which uses the ARDA client API; the GANGA application also uses the ARDA client API; both connect to the ARDA bookkeeping server via TCP/IP streaming.
A testbed is underway to measure performance with the ARDA and ORACLE servers.

31 UK: GANGA Grid Interface – Karl Harrison (Cambridge), Alexander Soroko (Oxford), Alvin Tan (Birmingham), Ulrik Egede (Imperial), Andrew Maier (CERN), Kuba Moscicki (CERN).
Ganga prepares and configures job scripts for Gaudi and Athena applications, and lets the user submit, kill, get output, update status and store & retrieve job definitions on back-ends including AtlasPROD, DIAL, DIRAC, LCG2, gLite, localhost and LSF. Ganga 4 adds split, merge, monitor and dataset selection. Ganga 4 beta released 8th July. (An illustrative session is sketched below.)
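
For flavour, a Ganga-style session inside the Ganga Python prompt might look like the sketch below. It follows the general Ganga job model (application + backend), but the exact class names and attributes of the 2005 Ganga 4 beta may differ:

    # Illustrative Ganga session (names follow the Ganga job model; details
    # may differ from the Ganga 4 beta described in this talk).
    j = Job()
    j.application = DaVinci(version="v12r11", optsfile="jobOptions.opts")
    j.backend = Dirac()          # could equally be LCG() or Local()
    j.submit()

    jobs                         # list all jobs with their status
    j.peek("std.out")            # look at the output while the job runs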

32 UK: Analysis with DIRAC – Stuart Paterson (Glasgow). Software installation + analysis via the DIRAC WMS, using the DIRAC API for analysis job submission. If no input data are specified the job goes straight to the Task Queue; otherwise matching checks all SEs which have the data. An Agent picks the job up, the DIRAC job installs the required software (PACMAN-based DIRAC installation tools) and executes on a worker node, reading its data (given as LFNs) from the closest SE. See later talk!

Job description generated for a DaVinci analysis job:

    [
        Requirements = other.Site == "DVtest.in2p3.fr";
        Arguments = "jobDescription.xml";
        JobName = "DaVinci_1";
        OutputData = { "/lhcb/test/DaVinci_user/v1r0/LOG/DaVinci_v12r11.alog" };
        parameters = [ STEPS = "1"; STEP_1_NAME = "0_0_1" ];
        SoftwarePackages = { "DaVinci.v12r11" };
        JobType = "user";
        Executable = "$LHCBPRODROOT/DIRAC/scripts/jobexec";
        StdOutput = "std.out";
        Owner = "paterson";
        OutputSandbox = { "std.out", "std.err", "DVNtuples.root",
                          "DaVinci_v12r11.alog", "DVHistos.root" };
        StdError = "std.err";
        ProductionId = "00000000";
        InputSandbox = { "lib.tar.gz", "jobDescription.xml", "jobOptions.opts" };
        JobId = ID
    ]
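
The point of the DIRAC API is to hide that JDL from the user. As a rough illustration only, a submission in the later, public DIRAC Python API looks like the sketch below; the 2005 prototype described in this talk almost certainly differed in detail, and the input LFN is a made-up placeholder:

    # Indicative DIRAC-API-style analysis submission (modelled on the later
    # public DIRAC API, not the exact 2005 interface from this talk).
    from DIRAC.Interfaces.API.Dirac import Dirac
    from DIRAC.Interfaces.API.Job import Job

    job = Job()
    job.setName("DaVinci_1")
    job.setExecutable("jobexec", arguments="jobDescription.xml")
    job.setInputSandbox(["lib.tar.gz", "jobDescription.xml", "jobOptions.opts"])
    job.setOutputSandbox(["std.out", "std.err", "DVNtuples.root", "DVHistos.root"])
    job.setInputData(["/lhcb/user/p/paterson/example.dst"])   # placeholder LFN

    dirac = Dirac()
    print(dirac.submitJob(job))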

33 Conclusion: Half way there! But the climb gets steeper, and there may be more mountains beyond 2007. 2005: Monte-Carlo production on the Grid (DC03, DC04). 2007: data taking, data stripping, distributed reconstruction, distributed analysis.

