1
The LHCb Data Management System
Philippe Charpentier, CERN
On behalf of the LHCb Collaboration
CHEP 2012, New York City
2
LHCb Computing Model in a nutshell
- RAW files (3 GB) are transferred to the Tier0
  - Migration and checksums are verified
- Transfer to Tier1s
  - Each file is replicated once (a whole run goes to a single site)
- Reconstruction
  - At CERN and the Tier1s (up to 300 HS06.hours)
  - If needed, Tier2s can also be used as "Tier1 co-processors"
- Stripping
  - At the Tier1 where the SDST is available
- MC simulation
  - Complex workflow: simulation, digitization, reconstruction, trigger, filtering
  - Runs everywhere with low priority: Tier2s, Tier1s, CERN and unpledged resources (some non-Grid)
- User analysis
  - Runs at Tier1s for data access, anywhere for MC studies
- Grid activities are under the control of LHCbDirac
  - LHCb extensions of the DIRAC framework (cf. posters #272, S. Roiser and #145, F. Stagni)
  - Many presentations at CHEP (again this time, talks and posters)
3
Real data applications
- Reconstruction: 50,000 events (3 GB), average of 20 HS06.s per event (see the estimate below)
- Stripping: 50,000 events (2.5 GB), average of 2 HS06.s per event, 14 output streams
- Merging: 40,000 events (5 GB)
- Input files are downloaded to local disk on the worker node
[Workflow diagram: RAW → Brunel (reconstruction) → SDST → DaVinci (stripping and streaming) → DST1 → Merge → merged DST1]
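As a rough cross-check of these numbers, a back-of-the-envelope estimate (a sketch using only the figures quoted on these slides) reproduces the per-job CPU cost mentioned on the Computing Model slide:

# Back-of-the-envelope check using only the per-event costs quoted above.
RECO_EVENTS_PER_FILE = 50_000   # events in one 3 GB RAW file
RECO_COST_HS06_S = 20           # ~HS06.s per event for Brunel (reconstruction)
STRIP_COST_HS06_S = 2           # ~HS06.s per event for DaVinci (stripping)

reco_hs06_hours = RECO_EVENTS_PER_FILE * RECO_COST_HS06_S / 3600
strip_hs06_hours = RECO_EVENTS_PER_FILE * STRIP_COST_HS06_S / 3600

print(f"Reconstruction of one RAW file: ~{reco_hs06_hours:.0f} HS06.hours")   # ~278
print(f"Stripping of one SDST:          ~{strip_hs06_hours:.0f} HS06.hours")  # ~28
# ~278 HS06.hours is consistent with the "up to 300 HS06.hours" quoted earlier.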
4
Data replication
- Replication is performed using FTS
  - Active data placement
- Disk replicas (for analysis) are kept separate from archive replicas
  - No disk replica for RAW and SDST (read only a few times)
- Using SRM spaces (T1D0 and T0D1)
[Workflow diagram as on the previous slide, annotated with data placement: 1 Tier1 (scratch), CERN + 1 Tier1, CERN + 1 Tier1, CERN + 3 Tier1s]
5
Datasets
- Granularity is at the file level
  - Data Management operations (replicate, remove replica, delete file)
  - Workload Management: input/output files of jobs
- LHCbDirac perspective
  - The DMS and WMS use LFNs to reference files
  - The LFN namespace refers to the origin of the file
    - Constructed by the jobs (uses the production and job number; a sketch follows this slide)
    - Hierarchical namespace for convenience
    - Used to define the file class (tape-sets) for RAW, SDST, DST
  - GUIDs are used for internal navigation between files (Gaudi)
- User perspective
  - A file is part of a dataset (consistent for physics analysis)
  - Dataset: specific data conditions, processing version and processing level
    - Files in a dataset should be exclusive and consistent in quality and content
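A minimal sketch of such an origin-based LFN construction. The directory layout and helper name are illustrative, not the exact LHCbDirac convention:

def build_lfn(config, event_type, file_type, production, job, step=1):
    """Illustrative LFN built from the production and job numbers.

    Sketches a hierarchical, origin-based namespace that also encodes the
    file type; the real LHCbDirac layout differs in detail.
    """
    return (f"/lhcb/{config}/{event_type}/{file_type}/"
            f"{production:08d}/{job // 10000:04d}/"
            f"{production:08d}_{job:08d}_{step}.{file_type.lower()}")

# Hypothetical example: output of job 10234 in production 12345
print(build_lfn("LHCb/Collision12", "90000000", "BHADRON.DST", 12345, 10234))
# -> /lhcb/LHCb/Collision12/90000000/BHADRON.DST/00012345/0001/00012345_00010234_1.bhadron.dst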
6
Replica Catalog (1)
- Logical namespace
  - Reflects somewhat the origin of the file (run number for RAW, production number for job output files)
  - The file type is also explicit in the directory tree
- Storage Elements
  - Essential component of the DIRAC DMS
  - Logical SEs: several DIRAC SEs can physically use the same hardware SE (same instance, same SRM space)
  - Described in the DIRAC configuration
    - Protocol, endpoint, port, SAPath, Web Service URL
    - Allows autonomous construction of the SURL (a sketch follows this slide):
      SURL = <protocol>://<endpoint>:<port><Web Service URL>?SFN=<SAPath><LFN>
  - SRM spaces at Tier1s
    - Used to have as many SRM spaces as DIRAC SEs, now only 3:
    - LHCb-Tape (T1D0): custodial storage
    - LHCb-Disk (T0D1): fast disk access
    - LHCb-User (T0D1): fast disk access for user data
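A minimal sketch of this SURL construction. The SE description below is hypothetical (field names follow the list above, not the actual DIRAC configuration schema):

# Hypothetical DIRAC SE description; the real configuration schema differs.
CERN_DISK = {
    "Protocol": "srm",
    "Host": "srm-lhcb.cern.ch",
    "Port": 8443,
    "WSUrl": "/srm/managerv2?SFN=",     # Web Service URL
    "Path": "/castor/cern.ch/grid",     # SAPath
}

def lfn_to_surl(se, lfn):
    """Build the SURL autonomously from the SE description and the LFN."""
    return f"{se['Protocol']}://{se['Host']}:{se['Port']}{se['WSUrl']}{se['Path']}{lfn}"

print(lfn_to_surl(CERN_DISK, "/lhcb/user/j/jdoe/test.dst"))
# -> srm://srm-lhcb.cern.ch:8443/srm/managerv2?SFN=/castor/cern.ch/grid/lhcb/user/j/jdoe/test.dst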
7
Replica Catalog (2)
- Currently using the LFC
  - Master (write) service at CERN
  - Replication to the Tier1s using Oracle Streams
  - Read-only instances at CERN and at the Tier1s
    - Mostly for redundancy, no need for scaling
- LFC information
  - Metadata of the file
  - Replicas (illustrated below)
    - The "host name" field is used for the DIRAC SE name
    - The SURL at creation time is stored for convenience (not used)
      - Allows lcg-util commands to work
  - Quality flag
    - A one-character comment used to mark a replica as temporarily unavailable
- Testing the scalability of the DIRAC File Catalog (poster: A. Tsaregorodtsev)
  - Built-in storage-usage accounting (per directory)
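An illustrative view of how one LFC replica entry is used by LHCbDirac. The field names below are descriptive only, not an actual LFC client API:

# Field names are descriptive, not a real LFC API; values reuse the earlier example LFN.
lfn = "/lhcb/LHCb/Collision12/90000000/BHADRON.DST/00012345/0001/00012345_00010234_1.bhadron.dst"
replica = {
    "lfn": lfn,
    "host": "CERN-DST",   # the LFC "host name" field holds the DIRAC SE name
    "surl": "srm://srm-lhcb.cern.ch:8443/srm/managerv2?SFN=/castor/cern.ch/grid" + lfn,
    "status": "-",        # one-character flag, set to mark the replica temporarily unavailable
}
print(replica["host"], replica["status"])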
8
Bookkeeping Catalog (1)
- User selection criteria
  - Origin of the data (real or MC, year of reference)
    - e.g. LHCb/Collision12
  - Conditions of data taking or simulation (energy, magnetic field, detector configuration...)
    - e.g. Beam4000GeV-VeloClosed-MagDown
  - Processing Pass: the level of processing (reconstruction, stripping...), including the compatibility version
    - e.g. Reco13/Stripping19
  - Event Type: mostly useful for simulation, a single value for real data
    - 8-digit numeric code (e.g. 12345678, 90000000)
  - File Type: defines which type of output files the user wants for a given processing pass (e.g. which stream)
    - RAW, SDST, BHADRON.DST (for a streamed file)
- Bookkeeping search
  - Uses a path of the form /<Origin>/<Conditions>/<Processing Pass>/<Event Type>/<File Type>
    - e.g. /LHCb/Collision12/Beam4000GeV-VeloClosed-MagDown/Reco13/Stripping19/90000000/BHADRON.DST
9
Bookkeeping Catalog (2)
- Much more than a dataset catalog!
- Full provenance of files and jobs
  - Files are input to processing steps ("jobs") that produce files
  - All files ever created are recorded, and so is each processing step
    - Full information on the "job" (location, CPU and wall-clock time...)
- BK relational database (toy sketch below)
  - Two main tables: "files" and "jobs"
  - Jobs belong to a "production"
  - "Productions" belong to a "processing pass", with a given "origin" and "condition"
  - Highly optimized searches for files, as well as summaries
- Quality flags
  - Files are immutable, but carry a mutable quality flag
  - Files also carry a flag indicating whether they currently have a replica
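A toy sketch of the two-table provenance model described above. Table and column names are invented for illustration; the real schema holds far more information:

import sqlite3

# Toy "jobs"/"files" provenance model: every file points to the job that
# produced it, and a link table records which files each job read.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE jobs  (job_id INTEGER PRIMARY KEY, production INTEGER,
                        location TEXT, cpu_time REAL, wall_time REAL);
    CREATE TABLE files (file_id INTEGER PRIMARY KEY, lfn TEXT,
                        produced_by INTEGER REFERENCES jobs(job_id),
                        quality TEXT, has_replica INTEGER);
    CREATE TABLE job_inputs (job_id INTEGER REFERENCES jobs(job_id),
                             file_id INTEGER REFERENCES files(file_id));
""")

def ancestors(lfn):
    """One step up the provenance chain: file -> producing job -> its input files."""
    rows = db.execute("""
        SELECT parent.lfn
        FROM files child
        JOIN job_inputs ji ON ji.job_id = child.produced_by
        JOIN files parent  ON parent.file_id = ji.file_id
        WHERE child.lfn = ?""", (lfn,)).fetchall()
    return [r[0] for r in rows]

# ancestors(some_dst_lfn) -> LFNs of the files read by the job that produced it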
10
Bookkeeping browsing
- Allows datasets to be saved
  - Filtering and selection
  - As a plain list of files
  - As a Gaudi configuration file
- Can return only the files that have a replica at a given location
11
Dataset-based transformations
- The same mechanism is used for jobs (workload management tasks) and for data management
- Input datasets are defined by a bookkeeping query
  - Tasks are data-driven and used by both the WMS and the DMS
12
Data management transformations
- Replication transformations
  - Use a policy implementing the Computing Model
  - Create transfer tasks depending on the original location and on space availability
  - Replication follows a tree (not everything is copied from the initial source)
    - Optimized w.r.t. recent channel bandwidth and SE locality
  - Transfers are performed with FTS whenever possible (see the sketch below)
    - Otherwise, third-party gridftp transfers
      - When no FTS channel exists, or for user files (special credentials)
  - Replicas are automatically registered in the LFC
- Removal transformations
  - Used to retire datasets or to reduce the number of replicas
  - Used exceptionally to remove files completely (tests, bugs...)
  - Replica removal is protected against removing the last replica!
  - Transfers and removals use the Data Manager credentials
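A simplified sketch of the transfer-method choice made by these replication transformations. The function, the SE names and the channel list are invented for illustration; the real LHCbDirac agents are considerably more elaborate:

def choose_transfer_method(source_se, target_se, is_user_file, fts_channels):
    """Pick the transfer mechanism following the policy described above.

    fts_channels: set of (source, target) pairs for which an FTS channel exists.
    """
    if is_user_file:
        # User files need the user's credentials, so they go via third-party gridftp
        return "gridftp"
    if (source_se, target_se) in fts_channels:
        return "FTS"
    # No FTS channel between these SEs: fall back to third-party gridftp
    return "gridftp"

# Hypothetical channels and calls
channels = {("CERN-DST", "GRIDKA-DST"), ("CERN-DST", "IN2P3-DST")}
print(choose_transfer_method("CERN-DST", "GRIDKA-DST", False, channels))  # FTS
print(choose_transfer_method("CERN-DST", "RAL-USER", True, channels))     # gridftp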
13
Staging: using files from tape
- If a job uses files that are not online (on disk):
  - Before submitting the job, stage the files from tape and pin them in the cache
- Stager agent
  - Also performs cache management
  - Throttles staging requests depending on the cache size and the amount of pinned data (a sketch follows this slide)
  - Requires fine tuning (pinning and cache size)
    - The caching architecture is highly site dependent
    - Cache sizes are not published (except by Castor and StoRM)
- Jobs using staged files
  - First check that the file is still staged
    - If not, reschedule the job
  - Copy the file locally to the worker node whenever possible
    - Space is released faster
    - More reliable access for very long jobs (reconstruction) or jobs using many files (merging)
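A minimal sketch of the throttling idea described for the stager agent. Thresholds, file sizes and names are invented for illustration:

def requests_to_submit(cache_size_gb, pinned_gb, n_queued, file_size_gb=3.0,
                       max_pinned_fraction=0.8):
    """How many queued staging requests can be submitted now.

    Staging is throttled so that pinned data stays below a fraction of the
    (site-dependent, often unpublished) cache size.  All numbers are illustrative.
    """
    free_pin_budget_gb = max_pinned_fraction * cache_size_gb - pinned_gb
    if free_pin_budget_gb <= 0:
        return 0
    return min(n_queued, int(free_pin_budget_gb // file_size_gb))

# Example: 100 TB cache, 70 TB already pinned, 5000 RAW files (3 GB each) queued
print(requests_to_submit(100_000, 70_000, 5_000))  # -> 3333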
14
Data Management operational experience
- Transfer accounting
  - Per site, channel, user...
- Storage accounting
  - Per dataset, SE, user...
  - User quotas
    - Not strictly enforced...
15
Outlook
- Improvements on staging
  - Improve the tuning of the cache settings
    - Depends on how the caches are used by the sites
  - Pinning/unpinning
    - Difficult if files are used by more than one job
- Popularity
  - Record dataset usage
    - Reported by jobs: number of files used in a given dataset
    - Account the number of files used per dataset per day/week
  - Assess dataset popularity
    - Relate usage to the dataset size
  - Decide on the number of online replicas (see the sketch below)
    - Taking into account the available space
    - Taking into account the expected need in the coming weeks
  - First rely on the Data Manager acting on this advice
    - Possibly move later to automated dynamic management
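A toy sketch of such a popularity-based decision. Thresholds, the usage metric and the helper name are invented; the slide only states the principle of relating usage to dataset size and available space:

def target_disk_replicas(accesses_last_weeks, dataset_size_tb,
                         free_disk_tb, max_replicas=4):
    """Suggest a number of disk replicas from recent usage and dataset size.

    accesses_last_weeks: number of files of this dataset read recently.
    All thresholds are illustrative.
    """
    popularity = accesses_last_weeks / max(dataset_size_tb, 0.001)
    if popularity > 1000:
        wanted = max_replicas
    elif popularity > 100:
        wanted = 2
    elif popularity > 0:
        wanted = 1
    else:
        wanted = 0            # keep only the archive (tape) copy
    # Never ask for more disk than is actually available
    while wanted * dataset_size_tb > free_disk_tb and wanted > 1:
        wanted -= 1
    return wanted

print(target_disk_replicas(accesses_last_weeks=50_000, dataset_size_tb=20, free_disk_tb=100))  # -> 4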
16
Conclusions
- LHCb uses two views for data management:
  - A dataset view, as seen by users and productions
  - A file view, as seen by the Data Management tools (replication, removal)
- Datasets are handled by the LHCb Bookkeeping system (part of LHCbDirac)
- The file view is generic and handled by the DIRAC DMS
- The LHCb DMS gives full flexibility for managing data on the Grid
- In the future LHCb expects to use popularity criteria to decide on the number of replicas for each dataset
  - Should give more flexibility to the Computing Model