LHCb system for distributed MC production (data analysis) and its use in Russia
NEC’2005, Varna, Bulgaria
Ivan Korolko (ITEP Moscow)
Outline
LHCb detector
Russian participation in LHCb
LHCb distributed computing system
  – DIRAC
  – GANGA
Plans for the future
LHCb detector
[Figure: detector layout with labels Yoke, Vertex, Shielding, Tracker, Calorimeters, RICH-2, Coil, Muon Detector, RICH-1]
Designed for comprehensive studies of CP violation with B mesons
500 physicists from 60 institutes
LHCb in numbers
LHCb nominal luminosity → $2\times10^{32}$ cm$^{-2}$ s$^{-1}$
Rate of p-p interactions → $2\times10^{7}$ per second
HLT output → 2000 Hz
RAW data per year → $2\times10^{10}$ events (500 TB)
Events with b quarks → $10^{5}$ per second (!)
Acceptance for b-events → 5-10%
Br. for CP channels → ~$10^{-5}$
Number of CP channels → ~50
~5 signal events in every second of LHCb operation – GREAT!
Have to select them from $1.5\times10^{7}$ background events
Signature of LHCb signals is not very bright ($P_t$, vertex)
Estimation of S/B ratio is a REAL CHALLENGE
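The estimates on this slide can be cross-checked with a few lines of arithmetic. The sketch below assumes $10^{7}$ seconds of data taking per year, a conventional figure that is not stated on the slide, and decimal units (1 TB = $10^{9}$ kB). All other inputs are the slide's own numbers, and the variable names are illustrative.

```python
# Back-of-envelope check of the "LHCb in numbers" figures.
# ASSUMPTION (not on the slide): 1e7 seconds of data taking per year.
SECONDS_PER_YEAR = 1e7

HLT_OUTPUT_HZ = 2000      # HLT output rate
RAW_VOLUME_TB = 500       # RAW data per year
B_EVENTS_PER_S = 1e5      # events with b quarks per second
BRANCHING_RATIO = 1e-5    # typical Br. of a CP channel
N_CHANNELS = 50           # number of CP channels


def signal_rate(acceptance):
    """Signal events per second for a given b-event acceptance (0..1)."""
    return B_EVENTS_PER_S * acceptance * BRANCHING_RATIO * N_CHANNELS


# RAW events per year follow from the HLT output rate.
events_per_year = HLT_OUTPUT_HZ * SECONDS_PER_YEAR

# Average RAW event size, ASSUMING 1 TB = 1e9 kB (decimal units).
event_size_kb = RAW_VOLUME_TB * 1e9 / events_per_year

# Signal rate at both ends of the quoted 5-10% acceptance.
signal_low = signal_rate(0.05)
signal_high = signal_rate(0.10)

print(f"RAW events per year: {events_per_year:.1e}")
print(f"RAW event size: {event_size_kb:.0f} kB")
print(f"Signal rate: {signal_low:.1f} to {signal_high:.1f} events/s")
```

Under these assumptions the HLT rate reproduces the quoted $2\times10^{10}$ events per year, 500 TB corresponds to about 25 kB per RAW event, and the signal rate comes out at 2.5 to 5 events per second, consistent with the ~5 quoted above.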
Russian participation in LHCb
IHEP (Protvino), INP (Novosibirsk), INR (Troitsk), ITEP (Moscow), PNPI (St. Petersburg)
SPD and Preshower, ECAL, HCAL, MUON system and RICH mirrors
Design, construction and maintenance of detectors
Development of reconstruction algorithms
Historical interests in B physics
History of LHCb DCs in Russia
K events, 1% contribution, only one centre (ITEP)
M events, 3% contribution, all 4 of our centres (IHEP, ITEP, JINR, MSU)
M events, 5% contribution, started to use LCG
2005: PNPI and INR have joined
LHCb Computing (TDR)
LHCb will use LCG-provided capabilities as much as possible
  – computing resources (CPU and storage)
  – software components
Generic basic services provided by LCG
  – workload management (job submission and follow-up)
  – data management (storage, file transfer)
Higher-level integration and LHCb-specific tools will be provided by the LHCb collaboration
  – software releases, packaging, software distribution
  – bookkeeping database
  – workload management tool (DIRAC)
  – distributed analysis tool (GANGA)
DIRAC (Distributed Infrastructure with Remote Agents’ Control)
Project combining LHCb-specific components with LCG general-purpose components
DIRAC is a lightweight system built to the following requirements:
  – support a rapid development cycle
  – be able to accommodate evolving GRID opportunities
  – be easy to deploy on various platforms
  – transparent, easy and possibly automatic updates
LHCb grid system for Monte-Carlo simulation and analysis
DIRAC design goals
Designed to be highly adaptable to the use of ALL computing resources available to the LHCb collaboration
  – LCG grid resources (mainly)
  – sites not participating in LCG (still)
  – desktop workstations (even)
Simplicity of installation, configuration and operation
  – DIRAC has run on PBS, Condor, LSF, LCG
The design goal was to create a robust and scalable system for the computing needs of the LHCb collaboration
  – running 10K concurrent jobs
  – queuing 100K jobs
  – handling 10M datasets
DIRAC architecture
Uses the paradigm of Service Oriented Architecture (SOA)
  – inspired by the OGSA/OGSI “grid services” concept
  – follows the LCG/ARDA RTAG architecture blueprint
ARDA inspiration
  – open architecture with well-defined interfaces
  – allowing for replaceable, alternative services
  – providing choices and competition
Implemented in Python using the XML-RPC service access protocol
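To make the "Python plus XML-RPC" choice concrete, here is a minimal sketch of a service exposed over XML-RPC and called by a client, using only the Python standard library. The service name `JobQueueService` and its methods `submit` and `count` are hypothetical and chosen for illustration; they are not DIRAC's actual interfaces.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class JobQueueService:
    """Hypothetical service; the name and methods are NOT DIRAC's real API."""

    def __init__(self):
        self._jobs = []

    def submit(self, description):
        """Store a job description and return its 1-based job id."""
        self._jobs.append(description)
        return len(self._jobs)

    def count(self):
        """Return the number of jobs submitted so far."""
        return len(self._jobs)


def start_service():
    """Start the service in a background thread and return the server."""
    # Port 0 asks the operating system for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_instance(JobQueueService())
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = start_service()
host, port = server.server_address

# The client sees only the interface, not the implementation behind it.
client = ServerProxy(f"http://{host}:{port}")
job_id = client.submit({"application": "simulation", "events": 500})
queued = client.count()

server.shutdown()
server.server_close()
print(job_id, queued)
```

Because the client depends only on the method names and the wire protocol, the class behind `register_instance` could be swapped for another implementation without touching the caller, which is the "replaceable, alternative services" property the slide asks for.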
Interfacing DIRAC to LCG
1) Use standard LCG middleware for job scheduling
  – straightforward, but not yet reliable enough
2) Reservation of computing resources with a pilot agent
  – send a simple script to the LCG RB, which downloads and installs a standard DIRAC agent (needs only Python on the LCG site)
  – WORKED PERFECTLY in 2004 and 2005
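The pilot-agent approach can be pictured as a pull model: the script that lands on a worker node first inspects the node, then asks a central queue for work that fits it. The sketch below simulates this with an in-memory queue. `CentralTaskQueue`, `describe_node`, `run_pilot` and the `min_disk_gb` requirement are hypothetical names introduced for illustration, and the matching rule is an assumption, not DIRAC's actual logic.

```python
import shutil
from collections import deque


class CentralTaskQueue:
    """Hypothetical stand-in for the central service holding waiting jobs."""

    def __init__(self, jobs):
        self._waiting = deque(jobs)

    def request_job(self, resources):
        """Hand out the first waiting job that the node can satisfy."""
        for job in list(self._waiting):
            if job["min_disk_gb"] <= resources["free_disk_gb"]:
                self._waiting.remove(job)
                return job
        return None

    def waiting(self):
        """Number of jobs still waiting for a suitable node."""
        return len(self._waiting)


def describe_node(workdir="."):
    """Runs on the worker node: report what is actually available there."""
    return {"free_disk_gb": shutil.disk_usage(workdir).free / 1e9}


def run_pilot(queue, resources):
    """Pull jobs until nothing matching is left; return the ids handled."""
    done = []
    while True:
        job = queue.request_job(resources)
        if job is None:
            break
        done.append(job["id"])  # a real agent would run the payload here
    return done


queue = CentralTaskQueue([
    {"id": 1, "min_disk_gb": 1},
    {"id": 2, "min_disk_gb": 50},
    {"id": 3, "min_disk_gb": 5},
])
# Fixed resources keep the example deterministic; on a real node one
# would call run_pilot(queue, describe_node()) instead.
finished = run_pilot(queue, {"free_disk_gb": 10})
print(finished, queue.waiting())
```

The point of the design is that a job is only assigned once a node has already been reserved and checked, so a broken or unsuitable site costs a pilot rather than a production job.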
DIRAC Authors
DIRAC development team
  TSAREGORODTSEV Andrei, GARONNE Vincent, STOKES-REES Ian, GRACIANI-DIAZ Ricardo, SANCHEZ-GARCIA Manuel, CLOSIER Joel, FRANK Markus, KUZNETSOV Gennady, CHARPENTIER Philippe
Production site managers
  BLOUW Johan, BROOK Nicholas, EGEDE Ulrik, GANDELMAN Miriam, KOROLKO Ivan, PATRICK Glen, PICKFORD Andrew, ROMANOVSKI Vladimir, SABORIDO-SILVA Juan, SOROKO Alexander, TOBIN Mark, VAGNONI Vincenzo, WITEK Mariusz, BERNET Roland
2004 DC Phase 1 Statistics
3 months – 65 TB of data produced, transferred and replicated
185M events, 425 CPU years across 60 sites
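A rough sense of scale can be derived from these totals. The sketch below assumes that "3 months" means a quarter of a year, that 1 TB is $10^{12}$ bytes and that a year has 365.25 days; none of these conventions is stated on the slide. Since the 65 TB figure covers produced, transferred and replicated data, the per-event size is an overall average and not the size of a single stored event.

```python
# Derived averages for the 2004 DC Phase 1 totals.
# ASSUMPTIONS (not on the slide): 3 months = 0.25 year,
# 1 TB = 1e12 bytes, 1 year = 365.25 days.
DATA_TB = 65
EVENTS = 185e6
CPU_YEARS = 425
DURATION_YEARS = 0.25
SECONDS_PER_YEAR = 365.25 * 86400

# CPUs that must be busy on average to deliver 425 CPU years in 3 months.
avg_busy_cpus = CPU_YEARS / DURATION_YEARS

# Average data volume and CPU time per simulated event.
bytes_per_event = DATA_TB * 1e12 / EVENTS
cpu_seconds_per_event = CPU_YEARS * SECONDS_PER_YEAR / EVENTS

print(f"Average busy CPUs: {avg_busy_cpus:.0f}")
print(f"Data per event: {bytes_per_event / 1e3:.0f} kB")
print(f"CPU time per event: {cpu_seconds_per_event:.1f} s")
```

With these conventions the production kept about 1700 CPUs busy on average, at roughly 350 kB and a little over a minute of CPU time per event.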
2004 DC Phase 1 Statistics
43 LCG sites (8 also DIRAC sites)
20 DIRAC sites
7 Russian sites: DIRAC 4, LCG 3
Distributed Analysis
GANGA application
  – developed in cooperation with ATLAS
  – uses DIRAC to submit jobs
185M events were produced in 3 months
Nobody was able to analyze them in 9 months
Plans for the nearest future
Participate in LHCb Data Challenges
  – producing MC for the collaboration (planned in the computing model)
  – we know how to produce MC and only need more resources
Concentrate on Distributed Analysis
  – testing the GANGA system at ITEP and IHEP
  – a much more difficult task
  – an absolutely different pattern of computer usage
  – work already started in June
Russian Tier2 Cluster planning
[Table: CPU, DISK, TAPE active, TAPE shelved, link to CERN; units kSI2k, TB, Mbps]
Significant increase of resources is planned for the nearest future
  – participation in LHCb DC (MC production)
  – distributed analysis
For further reading
LHCb reoptimized detector design and performance TDR, CERN/LHCC
LHCb Computing model, LHCb
LHCb Computing TDR, CERN/LHCC
DIRAC - Distributed Infrastructure with Remote Agent Control, A. Tsaregorodtsev et al., Proc. of CHEP2003, March 2003
Results of the LHCb Data Challenge 2004, J. Closier et al., Proc. of CHEP2004, Sept 2004