CMS Production Management Software Julia Andreeva CERN CHEP conference 2004

Overview
- Role of the CMS production system in the CMS Data Challenge of 2004 (DC04)
- Software components of the CMS production system (functionality, implementation, communication between them) and their use in DC04
- Problems encountered and DC04 lessons
- Conclusions

CMS DC04 tasks
The 2004 Data Challenge focused on the following tasks:
- Simulate a sustained data-taking rate equivalent to 25 Hz at a luminosity of 0.2 x 10^34 cm^-2 s^-1 for one month, which corresponds to 25% of the LHC startup rate
- Distribute the data files produced at Tier-0 (CERN) to the Tier-1 centers
- Run analysis at the Tier-1 centers as soon as data files become available there ("real-time" analysis)
- User analysis
The production system was responsible for the first task.

CMS DC04 phases
- Pre-challenge production (generation, simulation, digitization), started in July 2003
  [Diagram labels: RefDB, MSS, META information, data files]
- Reconstruction at Tier-0 (CERN), March-April 2004
  [Diagram labels: transfer-agent drop-box, Castor, POOL, META information, data files]
- Data distribution to the Tier-1 centers, March-April 2004
  [Diagram labels: MSS Tier-0, RefDB, MSS Tier-1, RLS, TMDB, updates, data files transfer]
- Real-time analysis, March-April 2004
- User analysis

Pre-challenge production (PCP)
- Pre-challenge production included the generation, simulation and digitization steps
- It ran in a distributed, heterogeneous environment:
  - 35 regional centers in Asia, Europe and the USA participated
  - Different local batch systems and two Grid flavors (Grid3 and LCG) were used
- The large volume of requested and produced data and the variety of environments both add to the complexity of the production software

PCP, planning and reality
According to the original planning:
- 50 million events
- Simulation: 5 months, until ~October 2003
- Digitization: October and November 2003 (1 Mevt/day)
- Shipping data to the T0: November and December 2003, ~1 TB per day for 2 months, not at all trivial at that time
- DC04 rehearsals, scaling up: Q4 2003, January 2004
The reality:
- More than 75 million events requested
- 50 million events simulated by the start of DC04
- Digitization code delivered in January
  - Only ~10 million digitized events delivered before the start of DC04
  - Digitization continued through DC04
- DC04 began (with essentially no rehearsal) on March 1st
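
To put the planning figures in perspective, the short calculation below converts them into a sustained transfer rate and a digitization duration. It is only a back-of-the-envelope sketch: the decimal terabyte, the 61-day length of November plus December, and the function names are assumptions made for illustration, not part of the original planning.

```python
# Back-of-the-envelope check of the planning numbers quoted on the slide.
# Assumptions (not from the slide): 1 TB = 1e12 bytes, 2 months = 61 days.

SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_TB = 1e12


def sustained_rate_mb_per_s(tb_per_day):
    """Average transfer rate in MB/s needed to ship tb_per_day every day."""
    return tb_per_day * BYTES_PER_TB / SECONDS_PER_DAY / 1e6


def days_to_digitize(total_events, events_per_day):
    """Days needed to digitize total_events at a constant daily rate."""
    return total_events / events_per_day


rate = sustained_rate_mb_per_s(1.0)      # planned ~1 TB per day
days = days_to_digitize(50e6, 1e6)       # 50 million events at 1 Mevt/day
total_tb = 1.0 * 61                      # ~1 TB/day over November + December

print(f"sustained rate: {rate:.1f} MB/s")
print(f"digitization time: {days:.0f} days")
print(f"total shipped: ~{total_tb:.0f} TB")
```

Under these assumptions the planned shipping corresponds to roughly 11.6 MB/s around the clock, and 50 million events at 1 Mevt/day take 50 days, which fits inside the two months planned for digitization.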

Production progress for digitization requests, July 2003-May 2004
[Figure: production progress plot. Annotations: ORCA_7_6_1 released; 2x10^33 digitization; start of DC04; end of DC04; 24 Mevents, 6 weeks]

Subsystems of the OCTOPUS project (the CMS production software project)
- RefDB: CMS Monte Carlo Reference Database. MySQL database with a Web interface (PHP, CGI)
- McRunjob: Monte Carlo Run Job. Workflow planner for production processing; a framework (object-oriented Python) for creating and submitting large batches of production jobs
- BOSS: Batch Object Submission System. Local book-keeping and real-time monitoring system (C++, MySQL)
- UpdateRefDB: updates RefDB with the meta information sent by every successful job (Perl module running at CERN as a cron job)
- DAR: Distribution After Release. Distribution system for CMS application software (Perl module that distributes a required version of a given CMS physics application to the production regional centers)

Overview of the CMS production cycle
[Diagram labels: RefDB; McRunjob; Phys. Group; Prod. Manager; Site Manager; Exec. Script; JDL; DAG; Local Batch System; Grid (LCG) Scheduler; DAGMan (MOP); BOSS; Local Farm; LCG; GRID3]

RefDB
Functionality:
- Recording the production requests made by physicists
- Distributing the work to the production RCs and tracing the production progress
- Central source of production instructions for the workflow planner
- Book-keeping and metadata catalog of the produced data
Interfaces:
- Physicist: getting information about available collections, submitting requests, browsing existing and inserting new datacards, executables and applications
- Production manager: assigning requests, following request processing, getting production statistics, managing applications, templates, datacards, software distributions and monitoring components, publishing information, requesting DAR distributions
- RC manager: getting an assignment, following request processing, getting production statistics, updating publishing information
- McRunjob: receives all necessary instructions and meta information for job creation and job splitting
- UpdateRefDB: delivers the job meta information of successfully completed jobs
- DAR: receives requests for creating a DAR distribution, with a description of the distribution content

McRunjob
Modular framework for:
- creating batches of production jobs, possibly combining several processing steps in one job
- submitting them to different types of environment: different batch systems or different Grid flavors
- following the progress of job processing through a tracking directory
- publishing the processed data collections
- creating the POOL XML catalog for the collection chain
- updating initialized COBRA META files with produced collection data

BOSS
- Provides local book-keeping and real-time monitoring
- The user creates filters that define which parameters in the standard output should be traced, or which actions to take when a given pattern is found in STDOUT
- Filters used by CMS production jobs are stored in RefDB
[Diagram labels: McRunjob; Executable; Scheduler; BOSS; BOSS DB; Worker node; Executable; BOSS wrapper]
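
The filter idea can be illustrated with a few lines of Python. This is a minimal sketch of pattern-based monitoring of a job's standard output, not the actual BOSS filter syntax or API: the patterns, the parameter names and the `scan_stdout` function are all hypothetical.

```python
import re

# Hypothetical filters: parameter name -> regex with one numeric capture group.
# These are illustrative only and do not reproduce real BOSS filter definitions.
FILTERS = {
    "run_number": re.compile(r"Run number:\s*(\d+)"),
    "events_processed": re.compile(r"Processed event\s+(\d+)"),
}

# Hypothetical pattern that should trigger an action (here: record an alert).
ERROR_PATTERN = re.compile(r"FATAL|Segmentation fault")


def scan_stdout(lines):
    """Return (tracked parameters, alert lines) found in the job output."""
    tracked = {}
    alerts = []
    for line in lines:
        for name, pattern in FILTERS.items():
            match = pattern.search(line)
            if match:
                tracked[name] = int(match.group(1))  # keep the latest value
        if ERROR_PATTERN.search(line):
            alerts.append(line.rstrip())
    return tracked, alerts


sample_output = [
    "Run number: 4711",
    "Processed event 1",
    "Processed event 2",
    "FATAL: could not open input file",
]
tracked, alerts = scan_stdout(sample_output)
print(tracked)
print(alerts)
```

In a real monitoring system the tracked values would be written to a book-keeping database while the job is still running; the sketch only collects them in memory.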

Data formats and applications used for reconstruction
- Reconstruction is done with ORCA (Object Oriented Reconstruction for CMS Analysis), which uses the CMS COBRA framework (Coherent Object Oriented Base for Reconstruction, Analysis and Simulation; C++)
- Reconstruction takes as input the output of the digitization step (simulation of the DAQ process), called "Digis"
- The output of the reconstruction step is called DST (data summary tapes)
- Both input and output files are in the POOL format

DC04 reconstruction
- Reconstruction was run at Tier-0 (CERN)
- There was no chance to test the reconstruction code at production scale before the start of DC04
- Several ORCA releases during the 2 months of the data challenge, with bug fixes and code improvements (ORCA 7_7_1, ORCA 8_0_0, ORCA 8_0_1)
- Many reruns on the same input data; in total ~24 million events were reconstructed
- 25 Hz corresponds to 1000 jobs per day using 100% of 500 CPUs; this rate was reached only during limited periods, not permanently
- The bottleneck was not running reconstruction at the given rate but the subsequent data distribution
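
The relation between 25 Hz, 1000 jobs per day and 500 CPUs can be unpacked with simple arithmetic. The sketch below derives the implied job size and per-event CPU time, assuming events are split evenly across jobs and all CPUs are busy all day; the function and key names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def reconstruction_load(rate_hz, jobs_per_day, cpus):
    """Derive per-job and per-event figures from the target rate.

    Assumes events are split evenly across jobs and that all CPUs
    are 100% busy around the clock.
    """
    events_per_day = rate_hz * SECONDS_PER_DAY
    events_per_job = events_per_day / jobs_per_day
    jobs_per_cpu_per_day = jobs_per_day / cpus
    hours_per_job = 24 / jobs_per_cpu_per_day
    cpu_seconds_per_event = cpus * SECONDS_PER_DAY / events_per_day
    return {
        "events_per_day": events_per_day,
        "events_per_job": events_per_job,
        "hours_per_job": hours_per_job,
        "cpu_seconds_per_event": cpu_seconds_per_event,
    }


load = reconstruction_load(rate_hz=25, jobs_per_day=1000, cpus=500)
for name, value in load.items():
    print(f"{name}: {value:g}")
```

Under these assumptions, 25 Hz means 2.16 million events per day, or 2160 events per job; each CPU runs two jobs of about 12 hours per day, which implies roughly 20 CPU seconds per reconstructed event.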

Integrating reconstruction in the production machinery
- The reconstruction code was released just before DC04 started
- Integrating reconstruction into the production machinery was done very quickly (on the order of a couple of days)
- Developing the procedure for publishing the produced data took longer: the multiple output collections produced by the writeStreams executable were not supported by the RefDB schema and required a workaround

DC04 data flow at Tier-0
[Diagram. Components: RefDB; McRunjob; T0 worker nodes; GDB Castor pool; Export Buffers; Transfer agent; RLS; TMDB; Castor tapes; Input Buffer; Transfer-agent drop-box. Flow labels: reconstruction instructions and meta information; reconstruction jobs; reconstructed data; checks what had arrived; updates; summaries of successful jobs; input data files (Digis); job meta information]

META information of the reconstruction job
Sent by the reconstruction job to RefDB:
- RunNumber (set equal to the input run number, predefined by RefDB, key value)
- POOL XML catalog fragment containing only the data files produced by the given run (no real PFN information, './' instead); used for recreating the POOL XML catalog for the whole data collection
- RUNID value (used by COBRA for attaching data files to COBRA META files)
- Time in seconds (clock time) to run the executable
- Validation status
Every successfully completed job sends this information by mail to RefDB in a summary file, which is processed by the UpdateRefDB module. As a result, the run is validated in RefDB.
Put into the transfer-agent drop-box:
- POOL XML catalog fragment, used by the transfer agent to update the RLS catalog
- Checksum file containing the checksums of the produced files, used by the transfer agent to update TMDB
- "Go" flag, indicating that the files of a given run are ready for distribution
This information is used for data distribution to the Tier-1 centers.
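
The drop-box hand-off described above can be sketched as follows. This is an illustration of the general pattern (write the payload files first, then a flag file that signals readiness), not the DC04 implementation: the file names and extensions, the use of CRC32 as the checksum, and the functions `publish_run` and `ready_runs` are all assumptions.

```python
import os
import tempfile
import zlib


def file_checksum(path, chunk_size=1 << 20):
    """CRC32 of a file, read in chunks (the choice of CRC32 is an assumption)."""
    crc = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def publish_run(dropbox, run_number, data_files, catalog_fragment):
    """Write catalog fragment and checksum file, then the "go" flag last."""
    os.makedirs(dropbox, exist_ok=True)
    prefix = os.path.join(dropbox, f"run_{run_number}")  # hypothetical naming
    with open(prefix + ".xml", "w") as handle:
        handle.write(catalog_fragment)
    with open(prefix + ".cksum", "w") as handle:
        for path in data_files:
            handle.write(f"{file_checksum(path)} {os.path.basename(path)}\n")
    # The flag is created last, so a reader never sees a half-written run.
    with open(prefix + ".go", "w"):
        pass
    return prefix


def ready_runs(dropbox):
    """Names of runs whose "go" flag is present (what a transfer agent would poll)."""
    return sorted(name[:-3] for name in os.listdir(dropbox) if name.endswith(".go"))


with tempfile.TemporaryDirectory() as workdir:
    data_file = os.path.join(workdir, "dst_0001.root")
    with open(data_file, "wb") as handle:
        handle.write(b"placeholder payload")
    dropbox = os.path.join(workdir, "dropbox")
    publish_run(dropbox, 4711, [data_file], "<!-- POOL XML catalog fragment -->")
    print(ready_runs(dropbox))
```

Writing the flag after the other files is the design point: the consumer only needs to look for flags and can assume everything belonging to a flagged run is complete.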

DC04 lessons
- The main problems in the production machinery were found in publishing information about available data collections to the users and in distributing the META information required for data access
- These problems are being addressed by current development driven by the production team:
  - The publishing procedure has been further automated and improved to solve performance problems
  - A distributed system for publishing the catalogs of data collections available CMS-wide is being developed. A first prototype of PubDB (Publishing Data Base) is already deployed at CERN, FZK, INFN and PIC
  - The user interface to RefDB for getting information about available collections has been improved following suggestions from the physics community

Conclusions
- The CMS production software proved to be quite flexible and was very quickly updated to support DC04 reconstruction
- No serious problems were found in running jobs at the given rate or in the production book-keeping system
- Ongoing development focuses on improving the system that publishes information about available data collections to the CMS physics community and distributes the meta information required for data access