LHCb Data Quality Check Procedures and Tools for Monte Carlo Production
Miriam Gandelman, IF-UFRJ
Eric van Herwijnen, CERN
http://lhcb-comp.web.cern
DIRAC: Distributed Infrastructure with Remote Agent Control
- Distributed MC production system
- Production task definition
- Software installation
- Job scheduling and monitoring
- Data Quality checking (with versioning)
- Automates tasks for local production managers
- Implemented with software agents
MC production architecture
[Diagram. Labels: Production service; Bookkeeping service; Data Quality service; agents at the production sites (CERN, Lyon, Rio UFRJ, etc…); "Get jobs"; "Bookkeeping data"; "Data Quality info".]
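The pull model suggested by the "Get jobs" label can be sketched as follows. This is a minimal in-memory illustration, not DIRAC code: the class names `ProductionService` and `Agent`, the method names, and the record layout are all hypothetical, and the real services are remote rather than Python lists.

```python
from collections import deque


class ProductionService:
    """Hypothetical stand-in for the central Production service."""

    def __init__(self, jobs):
        self._queue = deque(jobs)

    def get_job(self, site):
        # An agent asks for work; None means the queue is empty.
        return self._queue.popleft() if self._queue else None


class Agent:
    """Hypothetical site agent: pulls a job, then reports back."""

    def __init__(self, site, production, bookkeeping, quality):
        self.site = site
        self.production = production
        self.bookkeeping = bookkeeping  # stand-in for the Bookkeeping service
        self.quality = quality          # stand-in for the Data Quality service

    def run_once(self):
        job = self.production.get_job(self.site)
        if job is None:
            return False
        # Running the job itself is omitted; only the reporting is shown.
        self.bookkeeping.append({"job": job, "site": self.site})
        self.quality.append({"job": job, "site": self.site,
                             "status": "to be checked"})
        return True


production = ProductionService(["job-1", "job-2", "job-3"])
bookkeeping, quality = [], []
agents = [Agent(site, production, bookkeeping, quality)
          for site in ("CERN", "Lyon", "Rio UFRJ")]

# Agents take turns pulling work until the queue is empty.
while any([agent.run_once() for agent in agents]):
    pass
```

Because the agents pull jobs rather than having jobs pushed to them, a site only needs outbound access to the central services.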
Data challenge 1 (59 days between February and May 2003)
- Distributed system
- Data to be checked in different places
- More than 3 TB produced
Log file checking
[Diagram. Labels: checked log files; Python tools; reference tables (Rich, Trigger, Tracking); production log file (to be checked).]

Sample reference table (the slide shows the same sample under Rich, Trigger and Tracking):
Velo      98.09  0.18
Forward   91.57  0.36
Seed       5.49  0.31
Match      1.94  0.17
Upstream   3.3   0.21
Velott     2.38  0.18
Final     98.84  0.13
Number of Events Processed: 49912.0
Number of Ks Selected: 46121.0
Number of Ks per event: 0.924

Sample production log file:
VeloMonitor INFO Number of VeloClusters / Event
VeloMonitor INFO -------------------------------------
TrAnalyse INFO Type        Ghosts        Bad Chi2
TrAnalyse INFO MC
TrAnalyse INFO Velo         556   3.9%     0  0.0%
TrAnalyse INFO
TrAnalyse INFO Forward      322   6.7%     0  0.0%
TrAnalyse INFO Seed         480  11.6%    59  1.4%
TrAnalyse INFO Match         66  29.6%     0  0.0%
TrAnalyse INFO Upstream    1612  69.5%     9  0.4%
TrAnalyse INFO Unique Fwd   322   6.7%     0  0.0%
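As an illustration of what a Python log-checking tool has to do first, the sketch below extracts the TrAnalyse rows from log lines like the ones shown on this slide. It is not the production tool: the function name `parse_tranalyse` is hypothetical, and the reading of the four numeric columns (ghost count, ghost percentage, bad chi2 count, bad chi2 percentage) is an assumption based on the "Type Ghosts Bad Chi2" header.

```python
import re

# Matches rows such as "TrAnalyse INFO Velo 556 3.9% 0 0.0%".
# Column meaning is an assumption inferred from the header line.
ROW = re.compile(
    r"^TrAnalyse\s+INFO\s+(?P<type>[A-Za-z ]+?)\s+"
    r"(?P<ghosts>\d+)\s+(?P<ghost_pct>[\d.]+)%\s+"
    r"(?P<bad>\d+)\s+(?P<bad_pct>[\d.]+)%\s*$"
)


def parse_tranalyse(lines):
    """Return {track type: {ghosts, ghost_pct, bad_chi2, bad_chi2_pct}}."""
    table = {}
    for line in lines:
        match = ROW.match(line.strip())
        if match is None:
            continue  # header, separator, or another algorithm's output
        table[match.group("type")] = {
            "ghosts": int(match.group("ghosts")),
            "ghost_pct": float(match.group("ghost_pct")),
            "bad_chi2": int(match.group("bad")),
            "bad_chi2_pct": float(match.group("bad_pct")),
        }
    return table


SAMPLE_LOG = [
    "VeloMonitor INFO Number of VeloClusters / Event",
    "TrAnalyse INFO Type Ghosts Bad Chi2",
    "TrAnalyse INFO MC",
    "TrAnalyse INFO Velo 556 3.9% 0 0.0%",
    "TrAnalyse INFO",
    "TrAnalyse INFO Forward 322 6.7% 0 0.0%",
    "TrAnalyse INFO Seed 480 11.6% 59 1.4%",
    "TrAnalyse INFO Match 66 29.6% 0 0.0%",
    "TrAnalyse INFO Upstream 1612 69.5% 9 0.4%",
    "TrAnalyse INFO Unique Fwd 322 6.7% 0 0.0%",
]

table = parse_tranalyse(SAMPLE_LOG)
```

Lines that do not match the row pattern are skipped silently, so the same parser can be run over a full log file that mixes output from many algorithms.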
Production checks
- Of the digitization (Boole) and reconstruction (Brunel) programs
- Quality quantities identified (efficiencies, occupancies)
- Results printed on a web page (3 sigma deviations in red)
- For local managers, to check locally produced data
- For software managers, to do basic checks of new versions
- For physicists, in case problems are found during analysis
- Negative test results could raise an alarm
- First tests done in production
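The 3 sigma rule can be sketched as below. This is an illustration under stated assumptions, not the production code: it assumes each reference entry is a (value, uncertainty) pair, which is one plausible reading of reference rows such as "Velo 98.09 0.18"; the measured values are invented for the example; and the function names and the HTML layout are hypothetical.

```python
from html import escape


def flag_deviations(reference, measured, n_sigma=3.0):
    """Return {name: deviation in sigma} for quantities outside n_sigma."""
    flagged = {}
    for name, (ref_value, ref_error) in reference.items():
        if name not in measured:
            continue
        deviation = abs(measured[name] - ref_value)
        if deviation > n_sigma * ref_error:
            # Guard against a zero uncertainty in the reference table.
            flagged[name] = (deviation / ref_error
                             if ref_error > 0 else float("inf"))
    return flagged


def to_html_row(name, value, is_flagged):
    """Render one table row, in red when the quantity is flagged."""
    style = ' style="color:red"' if is_flagged else ""
    return f"<tr{style}><td>{escape(name)}</td><td>{value:.2f}</td></tr>"


# Assumed layout: quantity -> (reference value, reference uncertainty).
REFERENCE = {"Velo": (98.09, 0.18), "Forward": (91.57, 0.36),
             "Seed": (5.49, 0.31)}
# Hypothetical values from a new production log, for illustration only.
MEASURED = {"Velo": 98.20, "Forward": 89.90, "Seed": 5.60}

flagged = flag_deviations(REFERENCE, MEASURED)
rows = [to_html_row(name, value, name in flagged)
        for name, value in MEASURED.items()]
```

Here only Forward is flagged: it lies 1.67 from its reference, more than 3 x 0.36 = 1.08, while Velo and Seed stay well inside their limits.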
Histogram checking
[Diagram. Labels: Hbooks of checked data; ROOT reference file (reference); new Hbook file (to be checked); Python, C++ tools; web pages.]
Detailed check of new versions of programs
- Of the generation (SICBMC), digitization (Boole) and reconstruction (Brunel) programs
- Check e.g. multiplicities, efficiencies, particle identification
- Use the Kolmogorov test to compare histograms (1 = good, 0 = bad)
- Works for 1D histograms (until something better is available)
- The Geant4 team is developing new tools; we will use them when available
- First (minimal) set of histograms in production
- Need more experience with these comparisons
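A minimal sketch of a Kolmogorov comparison between two 1D histograms follows. It is a pure-Python illustration, not the code used in production and not ROOT's implementation: histograms are assumed to be plain lists of bin contents with identical binning, the function names are hypothetical, and applying the test to binned data is only approximate. The returned probability follows the convention above, close to 1 for compatible shapes and close to 0 for incompatible ones.

```python
import math


def kolmogorov_prob(z):
    """Kolmogorov probability Q(z) = 2 * sum_j (-1)^(j-1) exp(-2 j^2 z^2)."""
    if z < 0.2:
        return 1.0  # the series converges slowly here and Q is ~1
    total = 0.0
    for j in range(1, 101):
        term = 2.0 * (-1) ** (j - 1) * math.exp(-2.0 * j * j * z * z)
        total += term
        if abs(term) < 1e-12:
            break
    return min(max(total, 0.0), 1.0)


def compare_histograms(reference, candidate):
    """Compare two binned 1D distributions; return a probability in [0, 1]."""
    if len(reference) != len(candidate):
        raise ValueError("histograms must have the same binning")
    n_ref, n_new = float(sum(reference)), float(sum(candidate))
    if n_ref <= 0 or n_new <= 0:
        raise ValueError("histograms must not be empty")
    cum_ref = cum_new = max_distance = 0.0
    for ref_bin, new_bin in zip(reference, candidate):
        # Largest distance between the normalised cumulative distributions.
        cum_ref += ref_bin / n_ref
        cum_new += new_bin / n_new
        max_distance = max(max_distance, abs(cum_ref - cum_new))
    z = max_distance * math.sqrt(n_ref * n_new / (n_ref + n_new))
    return kolmogorov_prob(z)


REFERENCE_HIST = [10, 50, 200, 50, 10]
SAME_SHAPE = [10, 50, 200, 50, 10]
DISTORTED = [200, 50, 10, 10, 50]

p_same = compare_histograms(REFERENCE_HIST, SAME_SHAPE)
p_distorted = compare_histograms(REFERENCE_HIST, DISTORTED)
```

The test is sensitive to differences in shape only, since both histograms are normalised before comparison; a change in overall event count would need a separate check.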