The LHCb Data Management System
Philippe Charpentier, CERN
On behalf of the LHCb Collaboration
CHEP 2012, New York City
LHCb Computing Model in a nutshell
- RAW files (3 GB) are transferred to Tier0
  - Verify migration and checksum
- Transfer to Tier1s
  - Each file is replicated once (a whole run goes to a single site)
- Reconstruction
  - At CERN and Tier1s (up to 300 HS06.hours)
  - If needed, Tier2s can also be used as "Tier1 co-processors"
- Stripping
  - On the Tier1 where the SDST is available
- MC simulation
  - Complex workflow: simulation, digitization, reconstruction, trigger, filtering
  - Running everywhere with low priority: Tier2s, Tier1s, CERN and unpledged resources (some non-Grid)
- User analysis
  - Running at Tier1s for data access, anywhere for MC studies
- Grid activities under the control of LHCbDirac
  - LHCb extensions of the DIRAC framework
  - Many presentations at CHEP (again this time, talks and posters): Poster #272 (S. Roiser), Poster #145 (F. Stagni)
Real data applications
- Reconstruction (Brunel):
  - 50,000 events (3 GB), about 20 HS06.s/evt
- Stripping (DaVinci, stripping and streaming):
  - 50,000 events (2.5 GB), about 2 HS06.s/evt
  - 14 streams
- Merge:
  - 40,000 events (5 GB)
- Input files are downloaded to local disk
- (A rough per-file CPU estimate follows below.)
[Workflow diagram: RAW -> Brunel (reconstruction) -> SDST -> DaVinci (stripping and streaming) -> DST1 -> Merge -> merged DST1]
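As a rough cross-check of the figures above, the per-event numbers can be turned into a per-file CPU cost; the sketch below is illustrative arithmetic only (values taken from this slide, not LHCbDirac code) and reproduces the order of magnitude quoted for reconstruction.

    # Rough per-file CPU estimate from the per-event figures quoted above
    # (illustrative arithmetic only, not LHCbDirac code).
    events_per_raw_file = 50_000
    recons_hs06_per_evt = 20.0     # HS06.s/event for Brunel reconstruction
    stripping_hs06_per_evt = 2.0   # HS06.s/event for DaVinci stripping

    recons_hours = events_per_raw_file * recons_hs06_per_evt / 3600.0
    stripping_hours = events_per_raw_file * stripping_hs06_per_evt / 3600.0

    print(f"Reconstruction: {recons_hours:.0f} HS06.hours per RAW file")   # ~278
    print(f"Stripping:      {stripping_hours:.0f} HS06.hours per SDST file")  # ~28

The reconstruction estimate of roughly 280 HS06.hours per RAW file is consistent with the "up to 300 HS06.hours" quoted in the Computing Model overview.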
Data replication
- Replication is performed using FTS
  - Active data placement
- Split into disk replicas (for analysis) and archive replicas (see the placement sketch below)
  - No disk replica for RAW and SDST (read only a few times)
- Using SRM spaces (T1D0 and T0D1)
[Workflow diagram as on the previous slide, annotated with the replica placement of each file type: from 1 Tier1 (scratch) up to CERN + 3 Tier1s]
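A minimal sketch of how such a placement policy could be expressed in code; the replica counts and the exact mapping of file types to space tokens below are illustrative assumptions, not the precise LHCb policy (only the space-token names LHCb-Tape and LHCb-Disk come from a later slide).

    # Illustrative placement policy: number of tape and disk replicas per file type.
    # The counts are assumptions for illustration, not the exact LHCb policy.
    PLACEMENT_POLICY = {
        "RAW":  {"tape": 2, "disk": 0},   # custodial copies only, no disk replica
        "SDST": {"tape": 2, "disk": 0},   # read only a few times
        "DST":  {"tape": 1, "disk": 4},   # archive copy plus disk replicas for analysis
    }

    def target_spaces(file_type):
        """Return the SRM space tokens to use for a given file type."""
        policy = PLACEMENT_POLICY[file_type]
        return ["LHCb-Tape"] * policy["tape"] + ["LHCb-Disk"] * policy["disk"]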
Datasets
- Granularity at the file level
  - Data Management operations (replicate, remove replica, delete file)
  - Workload Management: input/output files of jobs
- LHCbDirac perspective
  - DMS and WMS use LFNs to reference files
  - The LFN namespace refers to the origin of the file (see the sketch below)
    - Constructed by the jobs (uses the production and job number)
    - Hierarchical namespace for convenience
    - Used to define the file class (tape-sets) for RAW, SDST, DST
    - GUIDs are used for internal navigation between files (Gaudi)
- User perspective
  - A file is part of a dataset (consistent for physics analysis)
  - Dataset: specific conditions of the data, processing version and processing level
    - Files in a dataset should be exclusive and consistent in quality and content
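A hypothetical sketch of how a hierarchical LFN could be built from the production and job numbers; the directory layout and file-name pattern are assumptions for illustration, not the exact LHCb namespace convention.

    def build_lfn(activity, conditions, file_type, production, job, step, ext):
        """Construct an illustrative hierarchical LFN from job metadata.

        The layout is a hypothetical example, not the exact LHCb namespace.
        """
        basename = f"{production:08d}_{job:08d}_{step}.{ext}"
        return (f"/lhcb/{activity}/{conditions}/{file_type}/"
                f"{production:08d}/{job // 10000:04d}/{basename}")

    # e.g. build_lfn("LHCb", "Collision12", "BHADRON.DST", 12345, 67890, 1, "bhadron.dst")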
Replica Catalog (1)
- Logical namespace
  - Reflects somewhat the origin of the file (run number for RAW, production number for output files of jobs)
  - The file type is also explicit in the directory tree
- Storage Elements
  - Essential component of the DIRAC DMS
  - Logical SEs: several DIRAC SEs can physically use the same hardware SE (same instance, same SRM space)
  - Described in the DIRAC configuration
    - Protocol, endpoint, port, SAPath, Web Service URL
    - Allows autonomous construction of the SURL: SURL = srm://<endpoint>:<port><WSUrl><SAPath><LFN> (see the sketch below)
  - SRM spaces at Tier1s
    - Used to have as many SRM spaces as DIRAC SEs, now only 3:
      - LHCb-Tape (T1D0): custodial storage
      - LHCb-Disk (T0D1): fast disk access
      - LHCb-User (T0D1): fast disk access for user data
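A minimal sketch of the autonomous SURL construction from the DIRAC SE description; the configuration key names and example values are illustrative assumptions, not the actual DIRAC configuration schema.

    def surl_from_config(se_config, lfn):
        """Build a SURL from a DIRAC SE description (illustrative key names)."""
        # se_config is assumed to look like:
        # {"Protocol": "srm", "Host": "srm-lhcb.example.org", "Port": 8443,
        #  "WSUrl": "/srm/managerv2?SFN=", "Path": "/some/sa/path"}
        return "{Protocol}://{Host}:{Port}{WSUrl}{Path}".format(**se_config) + lfn

Because the SURL can be rebuilt from the SE description plus the LFN, only the DIRAC SE name needs to be stored per replica, which is what makes the logical-SE indirection cheap.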
Replica Catalog (2)
- Currently using the LFC
  - Master (write) service at CERN
  - Replication to Tier1s using Oracle Streams
  - Read-only instances at CERN and the Tier1s
    - Mostly for redundancy, no need for scaling
- LFC information
  - Metadata of the file
  - Replicas (a conceptual replica entry is sketched below)
    - The "host name" field is used for the DIRAC SE name
    - The SURL at creation time is stored for convenience (not used)
      - Allows lcg-util commands to work
  - Quality flag
    - A one-character comment used to temporarily mark a replica as unavailable
- Testing the scalability of the DIRAC File Catalog
  - Built-in storage usage capabilities (per directory)
  - Poster: A. Tsaregorodtsev
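Conceptually, one replica entry as used by LHCbDirac can be pictured as below; the field names and example values are illustrative, not the actual LFC schema.

    # Conceptual view of a single replica entry (field names and values illustrative).
    replica = {
        "lfn": "/lhcb/LHCb/Collision12/RAW/...",       # file metadata is kept separately
        "se": "CERN-RAW",                              # DIRAC SE name, stored in the "host name" field
        "surl": "srm://srm-lhcb.example.org:8443/...", # SURL at creation time, kept so lcg-util works
        "status": "-",                                 # one-character flag, e.g. set when temporarily unavailable
    }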
Bookkeeping Catalog (1)
- User selection criteria
  - Origin of the data (real or MC, year of reference)
    - e.g. LHCb/Collision12
  - Conditions of data taking or simulation (energy, magnetic field, detector configuration...)
    - e.g. Beam4000GeV-VeloClosed-MagDown
  - Processing Pass: the level of processing (reconstruction, stripping...) including the compatibility version
    - e.g. Reco13/Stripping19
  - Event Type: mostly useful for simulation, a single value for real data
    - 8-digit numeric code
  - File Type: defines which type of output files the user wants for a given processing pass (e.g. which stream)
    - RAW, SDST, BHADRON.DST (for a streamed file)
- Bookkeeping search
  - Using a path: /<Origin>/<Conditions>/<Processing Pass>/<Event Type>/<File Type> (see the sketch below)
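A small sketch of assembling such a query path from the selection criteria listed above; the helper name is hypothetical and the event type code in the example is an illustrative placeholder, not a real LHCb value.

    def bk_path(origin, conditions, processing_pass, event_type, file_type):
        """Assemble a Bookkeeping query path from the selection criteria above."""
        return "/{}/{}/{}/{}/{}".format(origin, conditions, processing_pass,
                                        event_type, file_type)

    # e.g. bk_path("LHCb/Collision12", "Beam4000GeV-VeloClosed-MagDown",
    #              "Reco13/Stripping19", "12345678", "BHADRON.DST")
    # (the 8-digit event type code here is an illustrative placeholder)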
Bookkeeping Catalog (2)
- Much more than a dataset catalog!
- Full provenance of files and jobs
  - Files are input to processing steps ("jobs") that produce files
  - All files ever created are recorded, and each processing step as well
    - Full information on the "job" (location, CPU, wall clock time...)
- BK relational database (see the sketch below)
  - Two main tables: "files" and "jobs"
  - Jobs belong to a "production"
  - "Productions" belong to a "processing pass", with a given "origin" and "condition"
  - Highly optimized search for files, as well as summaries
- Quality flags
  - Files are immutable, but can have a mutable quality flag
  - Files have a flag indicating whether they have a replica or not
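A minimal relational sketch of the two main tables and the provenance link between them; the column names are assumptions for illustration, not the actual Bookkeeping (Oracle) schema.

    import sqlite3

    # Minimal sketch of the "files"/"jobs" tables and their provenance relation
    # (column names are illustrative, not the actual Bookkeeping schema).
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE jobs (
            job_id      INTEGER PRIMARY KEY,
            production  INTEGER,            -- jobs belong to a production
            location    TEXT,               -- site where the step ran
            cpu_time    REAL,
            wall_time   REAL
        );
        CREATE TABLE files (
            file_id      INTEGER PRIMARY KEY,
            lfn          TEXT UNIQUE,
            produced_by  INTEGER REFERENCES jobs(job_id),  -- provenance: producing step
            quality_flag TEXT,               -- mutable quality flag on an immutable file
            has_replica  INTEGER             -- whether a replica currently exists
        );
        -- input files of a job: many-to-many relation completing the provenance graph
        CREATE TABLE input_files (
            job_id  INTEGER REFERENCES jobs(job_id),
            file_id INTEGER REFERENCES files(file_id)
        );
    """)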
Bookkeeping browsing
- Allows saving datasets
  - Filter, selection
  - Plain list of files
  - Gaudi configuration file
- Can return only the files that have a replica at a given location (see the sketch below)
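A small sketch of the location filter: keep only the files of a bookkeeping selection that have a replica at the requested site. The helper and the shape of the replica map are illustrative assumptions.

    def files_at_location(bk_files, replicas_by_lfn, site):
        """Keep only the files that have a replica at the requested site.

        `replicas_by_lfn` is assumed to map an LFN to the list of DIRAC SE
        names (or sites) holding a replica; names here are illustrative.
        """
        return [lfn for lfn in bk_files if site in replicas_by_lfn.get(lfn, [])]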
Dataset-based transformations
- The same mechanism is used for jobs (workload management tasks) and for data management
- Input datasets are based on a bookkeeping query
  - Tasks are data driven, used by both the WMS and the DMS (see the sketch below)
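A minimal sketch of the data-driven transformation idea: files newly returned by the bookkeeping query are grouped into tasks, which can equally be jobs (WMS) or transfers/removals (DMS). All names are illustrative, not the DIRAC Transformation System API.

    def extend_transformation(candidate_files, already_processed, make_task, group_size=1):
        """Create tasks for the new files matching the transformation's bookkeeping query.

        `candidate_files` is the current result of the bookkeeping query;
        `make_task` builds either a job (WMS) or a transfer/removal (DMS) task.
        """
        new_files = [lfn for lfn in candidate_files if lfn not in already_processed]
        return [make_task(new_files[i:i + group_size])
                for i in range(0, len(new_files), group_size)]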
Data management transformations
- Replication transformations
  - Use a policy implementing the Computing Model
  - Create transfer tasks depending on the original location and on space availability
  - Replication uses a tree (not all transfers go from the initial source)
    - Optimized w.r.t. recent channel bandwidth and SE locality (see the sketch below)
  - Transfers are performed using FTS whenever possible
    - Otherwise a third-party gridftp transfer is used
      - If there is no FTS channel, or for user files (special credentials)
  - Replicas are automatically registered in the LFC
- Removal transformations
  - Used for retiring datasets or reducing the number of replicas
  - Used exceptionally to completely remove files (tests, bugs...)
  - Replica removal is protected against removing the last replica!
  - Transfers and removals use the Data Manager credentials
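A simplified sketch of choosing the source for one transfer in a replication tree, preferring FTS channels ranked by recently observed bandwidth and falling back to third-party gridftp. The data structures and names are assumptions, not LHCbDirac code.

    # Simplified source selection for one replication task (illustrative only).
    def choose_transfer(existing_replicas, destination, fts_channels, bandwidth):
        """Pick the best source SE for a transfer to `destination`.

        `fts_channels` is a set of (source, destination) pairs with an FTS channel;
        `bandwidth` maps such pairs to a recently observed transfer rate.
        """
        candidates = [se for se in existing_replicas
                      if (se, destination) in fts_channels]
        if candidates:
            source = max(candidates, key=lambda se: bandwidth.get((se, destination), 0.0))
            return {"source": source, "destination": destination, "method": "FTS"}
        # no FTS channel (or user files needing special credentials): gridftp fallback
        return {"source": existing_replicas[0], "destination": destination,
                "method": "gridftp"}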
Staging: using files from tape
- If jobs use files that are not online (on disk):
  - Before submitting the job, stage the files from tape and pin them in the cache
- Stager agent
  - Also performs cache management
  - Throttles staging requests depending on the cache size and the amount of pinned data
  - Requires fine tuning (pinning and cache size)
    - The caching architecture is highly site dependent
    - Cache sizes are not published (except for Castor and StoRM)
- Jobs using staged files (see the sketch below)
  - First check that the file is still staged
    - If not, reschedule the job
  - Copy the file locally to the WN whenever possible
    - Space is released faster
    - More reliable access for very long jobs (reconstruction) or jobs using many files (merging)
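A hedged sketch of the two sides of this logic: the pre-submission staging/pinning step and the check done by the job itself at start-up. Function and method names are illustrative, not the DIRAC or SRM API.

    # Illustrative staging logic (names are not the actual DIRAC API).
    def ensure_online(lfns, storage, pin_lifetime=86400):
        """Request staging and pinning for any input file not currently on disk."""
        offline = [lfn for lfn in lfns if not storage.is_online(lfn)]
        for lfn in offline:
            storage.bring_online(lfn, pin_time=pin_lifetime)  # stage from tape and pin
        return offline  # job submission waits until this list is empty

    def job_prologue(lfns, storage):
        """At job start: verify the files are still staged, else reschedule the job."""
        if not all(storage.is_online(lfn) for lfn in lfns):
            raise RuntimeError("input no longer staged - reschedule the job")
        for lfn in lfns:
            storage.copy_to_local(lfn)  # download to the worker node whenever possible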
Data Management operational experience
- Transfer accounting
  - Per site, channel, user...
- Storage accounting
  - Per dataset, SE, user...
  - User quotas
    - Not strictly enforced...
Outlook
- Improvements on staging
  - Improve the tuning of cache settings
    - Depends on how caches are used by sites
  - Pinning/unpinning
    - Difficult if files are used by more than one job
- Popularity (see the sketch below)
  - Record dataset usage
    - Reported by jobs: number of files used in a given dataset
    - Account the number of files used per dataset per day/week
  - Assess dataset popularity
    - Relate usage to the dataset size
  - Take decisions on the number of online replicas
    - Taking into account the available space
    - Taking into account the expected need in the coming weeks
  - First rely on the Data Manager receiving an advice
    - Possibly move later to automated dynamic management
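A minimal sketch of the envisaged popularity accounting: jobs report how many files of a dataset they used on a given day, and the popularity is that usage normalised by the dataset size. All names and the exact metric are assumptions for illustration.

    from collections import defaultdict

    # Illustrative popularity bookkeeping (names and metric are assumptions).
    usage = defaultdict(int)          # (dataset, day) -> number of files used

    def record_usage(dataset, day, n_files):
        """Accumulate the per-day file usage reported by jobs for a dataset."""
        usage[(dataset, day)] += n_files

    def popularity(dataset, recent_days, dataset_size):
        """Files used over the given days, normalised by the dataset size."""
        used = sum(usage[(dataset, day)] for day in recent_days)
        return used / max(dataset_size, 1)

Such a per-size metric would let a popular small dataset keep several disk replicas while a large, rarely read one is reduced to its archive copy.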
Conclusions
- LHCb uses two views for data management:
  - The dataset view, as seen by users and productions
  - The file view, as seen by the Data Management tools (replication, removal)
- Datasets are handled by the LHCb Bookkeeping system (part of LHCbDirac)
- The file view is generic and handled by the DIRAC DMS
- The LHCb DMS gives full flexibility for managing data on the Grid
- In the future LHCb expects to use popularity criteria to decide on the number of replicas for each dataset
  - Should give more flexibility to the Computing Model