1
LHCb Readiness for Run 2 – 2015 WLCG Workshop, Okinawa
Stefan Roiser / CERN IT-SDC for LHCb Distributed Computing
2
Content
- Online changes with impact for Offline
- Offline Data Processing
- Offline Data Management
- Services LHCb relies on
3
Preamble – LHC Evolution
Parameter                              Run 1                     Planned for Run 2
Max beam energy                        4 TeV                     6.5 TeV
Transverse beam emittance              1.8 μm                    1.9 μm
β* (beam oscillation)                  0.6 m / LHCb 3 m          0.4 m / LHCb 3 m
Number of bunches                      1374                      2508
Max protons per bunch                  1.7 × 10^11               1.15 × 10^11
Bunch spacing                          50 ns                     25 ns
LHC max. luminosity (ATLAS & CMS)      7.7 × 10^33 cm^-2 s^-1    1.6 × 10^34 cm^-2 s^-1
LHCb max. luminosity                   4 × 10^32 cm^-2 s^-1 (leveled, see NB)
LHCb μ (avg. # collisions/crossing)    1.6                       1.2

NB: LHCb uses "luminosity leveling", i.e. the in-time pile-up, and with it the instantaneous luminosity, stays constant for LHCb during an LHC fill.
4
Pit & Online
5
Trigger Scheme
- The hardware trigger reduces the event rate to ~1 MHz
- The High Level Trigger computing farm is split: HLT1 performs a partial event reconstruction, and its output is buffered on local disks
- The HLT1 output is used for detector calibration and alignment (O(hours)); this was done offline in Run 1
- HLT2 runs deferred, with a signal event reconstruction very close to the offline reconstruction
- 12.5 kHz event rate goes to offline; at ~60 kB event size this is ~750 MB/s (the event rate was 4.5 kHz in Run 1) – see the sketch below
- NB: because of the deferred trigger, the HLT farm is only marginally available for offline data processing
- See also Marco's talk tomorrow on the further evolution for future runs
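As a quick sanity check of the quoted bandwidth, a minimal sketch using only the numbers from this slide (not LHCb code):

```python
# Back-of-the-envelope check of the HLT output bandwidth quoted on this slide:
# 12.5 kHz to offline at ~60 kB per event.

event_rate_hz = 12.5e3       # events per second sent to offline in Run 2
event_size_bytes = 60e3      # ~60 kB average event size

print(f"Run 2: ~{event_rate_hz * event_size_bytes / 1e6:.0f} MB/s")   # ~750 MB/s

# For comparison, the Run 1 rate of 4.5 kHz at the same event size:
print(f"Run 1 (same event size): ~{4.5e3 * event_size_bytes / 1e6:.0f} MB/s")
```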
6
HLT Output Stream Splitting
- 12.5 kHz to storage: 10 kHz Full (+Parked) stream, 2.5 kHz Turbo stream
- The 10 kHz go to the classic offline reconstruction / stripping on distributed computing resources; if needed, part of this can be "parked" and processed in LS2
- New concept of a "Turbo Stream" in Run 2 for ~2.5 kHz: wherever it is sufficient, the HLT output with its event reconstruction is taken directly for physics analysis (routing sketched below)
- Initially the RAW information is included; it will be stripped off offline
- S. Benson, "The LHCb Turbo Stream", T1, Village Center, Thu 10am
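Conceptually, the split could be pictured as in the following sketch; the function and the event flag are invented for illustration and are not the actual HLT stream definitions:

```python
# Purely illustrative routing of HLT2-accepted events into the two output
# streams described above. The 'event' dict and its flag are invented for
# this sketch; the real HLT uses its own event model and stream definitions.

def route_event(event):
    """Return the name of the output stream for an HLT2-accepted event."""
    if event["hlt_reco_sufficient_for_analysis"]:
        # ~2.5 kHz Turbo stream: the HLT event reconstruction is used directly
        # for physics analysis; the RAW banks are initially kept and stripped
        # off later offline.
        return "Turbo"
    # ~10 kHz Full (+Parked) stream: classic offline reconstruction/stripping;
    # part of it can be parked and processed during LS2.
    return "Full"

print(route_event({"hlt_reco_sufficient_for_analysis": True}))   # -> Turbo
```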
7
Data Processing
8
Offline Processing Workflow
[Workflow diagram: RAW -> Reconstruction (24h) -> FULL.DST (Buffer, migrated to Tape) -> Stripping (6h) -> unmerged (M)DST (O(MB), Buffer) -> Merging (30m) -> (M)DST (5 GB, Disk); the legend distinguishes Application, File Type and Storage Element]
- The RAW input file is available on the Disk Buffer
- Reconstruction runs ~24 h: 1 input RAW file, 1 output FULL.DST written to the Disk Buffer
- Asynchronous migration of the FULL.DST from the Disk Buffer to Tape
- Stripping (DaVinci) runs on 1 or 2 input files (~6 h/file) and outputs several unmerged (M)DST files (one per "stream") to the Disk Buffer
- The input FULL.DST is removed from the Disk Buffer asynchronously
- The above workflows are rerun for all files of one run
- Once a stream reaches 5 GB of unmerged (M)DSTs (up to O(100) files), Merging runs ~15-30 min and outputs one merged (M)DST file to Disk (sketched below)
- The input (M)DST files are removed from the Disk Buffer asynchronously
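The "merge once a stream reaches 5 GB" bookkeeping described above can be sketched as follows; this is a simplified illustration with invented names, not LHCbDIRAC code:

```python
# Simplified sketch of the per-stream accumulation and the "merge at 5 GB"
# trigger described above. Not LHCbDIRAC code: all names are invented.

from collections import defaultdict

MERGE_THRESHOLD = 5 * 1000**3            # merge a stream once it holds ~5 GB

class StreamMerger:
    def __init__(self):
        # unmerged (M)DST files on the disk buffer, grouped per stripping stream
        self.pending = defaultdict(list)          # stream -> [(lfn, size_bytes)]

    def add_stripping_output(self, stream, lfn, size_bytes):
        """Register one unmerged (M)DST file produced by a stripping job."""
        self.pending[stream].append((lfn, size_bytes))
        if sum(size for _, size in self.pending[stream]) >= MERGE_THRESHOLD:
            self.submit_merge(stream)

    def submit_merge(self, stream):
        """Merging runs ~15-30 min and writes one merged (M)DST to disk;
        the inputs are then removed from the disk buffer asynchronously."""
        inputs = self.pending.pop(stream)
        total_gb = sum(size for _, size in inputs) / 1e9
        print(f"merging {len(inputs)} files of stream '{stream}' ({total_gb:.1f} GB)")

merger = StreamMerger()
for i in range(1000):                             # many small unmerged files
    merger.add_stripping_output("EW.DST", f"/lhcb/unmerged/{i}.dst", 60e6)
```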
9
Offline Data Processing Changes
- What is reconstructed offline is supposed to be the final reconstruction pass
  - The calibration / alignment from the HLT is used also offline
  - No reprocessing (reconstruction) is foreseen before the end of Run 2
- Expecting a higher stripping retention because calibration and alignment are done online
  - Partly damped by moving most physics streams to the M(icro)DST format (MDST ~O(10 kB)/event vs. DST ~O(120 kB)/event)
- All files from one "LHCb run" are forced to reside on the same storage (see the sketch below)
  - A run is the smallest granularity for physics analysis files
  - E.g. this reduces the impact in case a disk breaks
- Workflow execution is now also possible on Tier 2 sites
  - Needed because of the increase of collected data
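A minimal sketch of the run-level colocation rule, assuming a simple deterministic choice of storage element by run number (the real placement logic and SE names differ):

```python
# Illustration of the "all files of one run reside on the same storage" rule.
# Choosing the storage element deterministically from the run number is only
# one way to enforce such a constraint; the actual LHCbDIRAC placement logic
# is not reproduced here, and the SE names are examples.

EXAMPLE_DST_SES = ["CERN-DST", "CNAF-DST", "GRIDKA-DST", "RAL-DST"]

def storage_element_for_run(run_number, ses=EXAMPLE_DST_SES):
    """Every analysis file of a run lands on the same SE, so a run stays the
    smallest granularity for analysis and a broken disk affects whole runs
    rather than random fractions of many runs."""
    return ses[run_number % len(ses)]

# Files of the same run always resolve to the same storage element:
assert storage_element_for_run(152345) == storage_element_for_run(152345)
print(storage_element_for_run(152345))
```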
10
Workflow Execution Location
[Diagram: examples of the processing chain (RAW -> Reco -> FULL.DST -> Stripping -> Merging) distributed over GRIDKA, CNAF and Manchester]
- The data processing workflow is executed by default at Tier 0/1 sites, i.e. (technically) the MONARC model; this stays the same as in Run 1
- For Run 2 we in addition allow
  - a Tier 2 site to participate remotely in a certain job type (most useful would be Reconstruction)
  - any Tier 2 to participate at any time in any job type (no static 1-to-1 "attaching" anymore)
- In principle the system also allows ANY site to participate in any job type remotely (a possible eligibility check is sketched below)
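A possible way to express the relaxed site eligibility in code, purely as an illustration (job types and site names are examples, not the LHCbDIRAC configuration):

```python
# Illustrative site-eligibility check reflecting the Run 2 relaxation above:
# Tier 0/1 run the processing by default (as in Run 1), and any Tier 2 that is
# currently online may pick up a data-processing job type remotely, with no
# static Tier 1 <-> Tier 2 attaching. Site names are examples only.

TIER01_SITES = {"CERN", "CNAF", "GRIDKA", "IN2P3", "PIC", "RAL", "SARA"}
PROCESSING_JOB_TYPES = {"Reconstruction", "Stripping", "Merging"}

def eligible_sites(job_type, tier2_sites_online):
    """Sites allowed to run a given data-processing job type in Run 2."""
    sites = set(TIER01_SITES)                  # default, unchanged from Run 1
    if job_type in PROCESSING_JOB_TYPES:
        sites |= set(tier2_sites_online)       # any Tier 2, at any time
    return sites

print(sorted(eligible_sites("Reconstruction", {"Manchester", "CBPF"})))
```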
11
All Workflows
- Data processing: Run 1 on T0/1, Run 2 on T0/1/2
- Monte Carlo: T2 (can also run on T0/1 sites if resources are available)
- User analysis: T0/1/2D (without input data it can also run on any T2)
- The very flexible computing model allows almost all workflows to be executed on every tier level / resource type
- Interested in running multicore jobs – especially on VMs – but there is no pressing need for it
- "Elastic MC": the job knows the work per event and at the start of the payload calculates on the fly how many events to produce until the "end of the queue" (sketched below)
- User analysis: the least amount of work, but the highest priority in the central task queue
- F. Stagni, "Jobs masonry with elastic Grid Jobs", T4, B250, Mo 5pm
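A minimal sketch of the Elastic MC sizing decision, assuming the payload is told the remaining queue time and the CPU cost per event (numbers are illustrative):

```python
# Sketch of the "Elastic MC" idea: the payload knows roughly the CPU work per
# event and, at start, sizes itself to fit into the time left in the queue.
# The numbers and the safety margin are illustrative, not LHCbDIRAC defaults.

def events_to_produce(seconds_left_in_queue, cpu_seconds_per_event,
                      safety_fraction=0.8):
    """Decide at payload start how many MC events fit before the slot ends."""
    usable = seconds_left_in_queue * safety_fraction   # headroom for upload etc.
    return max(0, int(usable // cpu_seconds_per_event))

# e.g. 36 h left in the batch slot and ~500 CPU-seconds per simulated event:
print(events_to_produce(36 * 3600, 500))   # -> 207
```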
12
Compute Resources
- Non-virtualized
  - "Classic Grid" – CE, batch system, …
  - Non-pledged – commercial, HPC, …
  - HLT farm – little use during Run 2
- Virtualized
  - BOINC – volunteer computing
  - Vac – self-managed cloud resources
  - Vcycle – interaction via IaaS
- A. McNab, "Managing virtual machines with Vac and Vcycle", T7, C210, Mo 5pm
- Expect a ramp-up of virtualized infrastructures during Run 2
- All environments are served by the same pilot infrastructure talking to one LHCb/DIRAC central task queue (sketched below)
- F. Stagni, "Pilots 2.0: DIRAC pilots for all the skies", T4, B250, Mo 2pm
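The common pilot model can be sketched as a simple pull loop against the central task queue; this is a conceptual mock-up, not the DIRAC pilot implementation:

```python
# Conceptual pilot loop: whatever the resource (grid CE, Vac/Vcycle VM, BOINC
# volunteer, HLT farm node), the same pilot pulls matching work from the one
# central LHCb/DIRAC task queue. This is a mock-up, not the DIRAC pilot itself.

import time

def run_pilot(match_job, run_payload, idle_timeout=600):
    """Fetch and execute payloads until no matching work shows up for a while."""
    idle_since = time.time()
    while True:
        job = match_job()                # ask the central task queue for work
        if job is None:
            if time.time() - idle_since > idle_timeout:
                return                   # nothing to do: release the slot / VM
            time.sleep(60)
            continue
        idle_since = time.time()
        run_payload(job)                 # MC, data processing or user analysis
```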
13
Data Management
14
Data Storage
- Introduced the concept of Tier 2D(isk) sites, i.e. Tier 2 sites with disk areas >= 300 TB
- No more direct processing from "tape caches" is foreseen
  - Instead, interact with the disk buffer via FTS3 and process from there, e.g. pre-staging of the "Legacy Run 1 Stripping" data (see the sketch below)
  - Should lead to a reduction of the disk cache size in front of tape
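A minimal sketch of such a pre-staging transfer submitted through the FTS3 Python ("easy") bindings; the endpoint and file names are placeholders, and the binding details may vary with the FTS3 client version:

```python
# Minimal sketch of replicating a file from a tape-backed storage element to a
# disk buffer through FTS3, using the FTS3 "easy" Python bindings. The REST
# endpoint and SURLs are placeholders, and the exact binding API can differ
# between FTS3 client versions, so treat this as an outline only.

import fts3.rest.client.easy as fts3

FTS_ENDPOINT = "https://fts3.cern.ch:8446"        # example FTS3 REST endpoint

def prestage(source_surl, dest_surl):
    context = fts3.Context(FTS_ENDPOINT)          # authenticates with the grid proxy
    transfer = fts3.new_transfer(source_surl, dest_surl)
    job = fts3.new_job([transfer], verify_checksum=True)
    return fts3.submit(context, job)              # returns the FTS3 job id

job_id = prestage(
    "srm://tape-se.example.org/lhcb/legacy-run1-stripping/some.full.dst",
    "srm://buffer-se.example.org/lhcb/buffer/some.full.dst",
)
print("submitted FTS3 job", job_id)
```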
15
Data Storage (ctd)
- Catalogs
  - File Catalog: provides replica information; recently migrated from the LCG File Catalog to the DIRAC File Catalog (a lookup is sketched below)
  - C. Haen, "Federating LHCb datasets using the Dirac File Catalog", T3, C209, Mo 4.45pm
  - Bookkeeping (unchanged): provides data provenance information
- Data Popularity
  - Data collected since 2012
  - M. Hushchyn, "Disk storage mgmt for LHCb on Data Popularity", T3, C209, Tue 6.15pm
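A replica lookup against the DIRAC File Catalog might look like the following sketch (placeholder LFN; the client bootstrap and return structure can vary between DIRAC versions):

```python
# Sketch of a replica lookup against the DIRAC File Catalog (which replaces the
# LCG File Catalog for LHCb). The LFN is a placeholder and the initialisation /
# return structure may differ slightly between DIRAC versions.

from DIRAC.Core.Base import Script
Script.parseCommandLine()                     # standard DIRAC client bootstrap

from DIRAC.Resources.Catalog.FileCatalog import FileCatalog

lfn = "/lhcb/LHCb/Collision15/EW.DST/00012345/0000/00012345_00000001_1.ew.dst"
result = FileCatalog().getReplicas(lfn)       # S_OK / S_ERROR return structure

if result["OK"]:
    for name, replicas in result["Value"]["Successful"].items():
        for se, pfn in replicas.items():      # one entry per storage element
            print(name, se, pfn)
else:
    print("lookup failed:", result["Message"])
```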
16
Data Access Operations
- Gaudi federation
  - In use since last fall: LHCb analysis jobs create a local replica catalog for their input data; if the local copy is not available, they fall back to the next remote replica
- Data access protocols
  - SRM: shall be used for tape interactions and for writing to storage (job output upload, data replication)
  - xroot: LHCb will construct TURLs for disk-resident input data on the fly, without SRM interaction (sketched below); this needs a single, stable xroot endpoint per storage element
  - HTTP/WebDAV: all storage sites are equipped with HTTP/WebDAV access; could be used as a second access protocol
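A minimal sketch of the on-the-fly TURL construction, assuming one stable xroot endpoint and path prefix per storage element (the values shown are examples, not the actual configuration):

```python
# Sketch of constructing an xroot TURL directly from the LFN, skipping SRM, as
# described above for disk-resident data. The endpoints and path prefixes are
# examples; the real values come from the LHCb/DIRAC storage configuration.

XROOT_ENDPOINTS = {
    # storage element -> (single stable xroot endpoint, site path prefix)
    "CERN-EOS":   ("root://eoslhcb.cern.ch", "/eos/lhcb/grid/prod"),
    "GRIDKA-DST": ("root://xrootd.gridka.de", "/lhcb"),
}

def lfn_to_turl(se, lfn):
    """Build the xroot TURL for a disk-resident replica of an LFN."""
    endpoint, prefix = XROOT_ENDPOINTS[se]
    return f"{endpoint}/{prefix}{lfn}"

print(lfn_to_turl("CERN-EOS",
                  "/lhcb/LHCb/Collision15/EW.DST/00012345/0000/file.dst"))
# -> root://eoslhcb.cern.ch//eos/lhcb/grid/prod/lhcb/LHCb/Collision15/...
```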
17
Underlying Services
18
Services
- CVMFS: a building block for LHCb distributed computing; distributes all software and conditions data
- CernVM: Vac, Vcycle and BOINC are using CernVM 3
- FTS3: vital for LHCb WAN transfers and tape interaction (pre-staging of input data)
- Several WLCG monitoring services in use: SAM3, dashboards, network monitoring; working on extracting perfSONAR data into LHCbDIRAC
- HTTP federation: builds on top of HTTP/WebDAV access and provides easy access to the LHCb data namespace; development of data consistency checks on top of it is ongoing
- F. Furano, "Seamless access to LHCb HTTP/WebDAV storage", Mo/Tue, Poster Session A
19
Summary
- LHCb is ready for Run 2
- Several changes are introduced for Run 2:
  - calibration/alignment in the HLT farm
  - closer integration of Tier 2 sites in data processing
  - new DIRAC file replica catalog deployed
  - disk-resident data access via direct xroot
- CVMFS and FTS3 are key "external" services
20
Goodie page