Grid Computing
Oxana Smirnova, NDGF / Lund University
R-ECFA meeting in Sweden, Uppsala, May 9, 2008
Computing challenges at LHC
“Full chain” of HEP data processing (slide adapted from Ch. Collins-Tooth and J. R. Catmore)
ATLAS Monte Carlo data production flow (10 Mevents)
Very different tasks/algorithms (ATLAS experiment in this example)
A single “job” lasts from 10 minutes to 1 day
Most tasks require large amounts of input and produce large output data (see the scale sketch below)
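To get a feel for the scale, the sketch below splits a 10 M-event campaign into Grid jobs; the events-per-job and seconds-per-event figures are illustrative assumptions, not actual ATLAS production settings.

    # Illustrative sketch only: splits a 10 M-event Monte Carlo campaign into Grid jobs.
    # The per-event cost and events-per-job figures are assumed for illustration,
    # not taken from ATLAS production settings.

    TOTAL_EVENTS = 10_000_000     # the "10 Mevents" campaign from the slide
    EVENTS_PER_JOB = 1_000        # assumed job size
    SEC_PER_EVENT = 30.0          # assumed full-simulation cost per event

    n_jobs = TOTAL_EVENTS // EVENTS_PER_JOB
    job_walltime_h = EVENTS_PER_JOB * SEC_PER_EVENT / 3600.0

    print(f"{n_jobs} jobs of ~{job_walltime_h:.1f} h each")
    # With ~10 000 such jobs, only a distributed (Grid) system with thousands
    # of cores can finish the campaign in days rather than years.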
LHC computing specifics
Data-intensive tasks: large datasets, large files, lengthy processing times, large memory consumption; high throughput is necessary
Very distributed computing and storage resources: CERN can host only a small fraction of the needed resources and services; the distributed computing resources are of modest size
Produced and processed data are hence distributed, too
Issues of coordination, synchronization, data integrity and authorization are outstanding
Software for HEP experiments
Massive pieces of software: written by very many different authors in different languages (C++, Java, Python, Fortran), with dozens of external components; each release occupies as much as ~10 GB of disk space
Frequent releases: every experiment produces a release as often as once a month during the preparation phase (which is now for LHC)
Difficult to set up outside the lab: experiments cannot afford to support different operating systems and computer configurations
ALICE, ATLAS, PHENIX etc., all in many versions: for a small university group it is very difficult to manage the different software sets and maintain the hardware
Solution: use the Grid
Grid is a result of IT progress
Graph from “The Triumph of the Light”, G. Stix, Scientific American, January 2001
Computer speed doubles every 18 months; network speed doubles every 9 months
Network vs. computer performance (a back-of-the-envelope check follows below):
1986 to 2000: computers 500 times faster, networks … times faster
2001 to 2010 (projected): computers 60 times faster, networks 4000 times faster
Excellent wide area networks provide for a distributed supercomputer: the Grid
The “operating system” of such a computer is Grid middleware
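A rough consistency check of these growth factors, assuming nothing beyond the doubling periods quoted above:

    # Back-of-the-envelope check of the growth factors quoted above, assuming
    # computer speed doubles every 18 months and network speed every 9 months.

    def growth(years, doubling_months):
        return 2 ** (years * 12 / doubling_months)

    print(f"1986-2000: computers ~x{growth(14, 18):.0f}, networks ~x{growth(14, 9):.0f}")
    print(f"2001-2010: computers ~x{growth(9, 18):.0f},  networks ~x{growth(9, 9):.0f}")
    # The doubling-law estimates land in the same ballpark as the slide's figures:
    # a few hundred for computers over 1986-2000, a few thousand for networks over 2001-2010.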
Some Grid projects (originally by Vicky White, FNAL)
Grids in LHC experiments
Almost all Monte Carlo and data processing today is done via the Grid
There are 20+ Grid flavors out there; almost all are tailored for a specific application and/or specific hardware
LHC experiments make use of 3 Grid middleware flavors: gLite, ARC and OSG
All experiments develop their own higher-level Grid middleware layers (a schematic sketch follows below):
ALICE: AliEn
ATLAS: PanDA and DDM
LHCb: DIRAC
CMS: ProdAgent and PhEDEx
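A purely hypothetical sketch of this layering, showing an experiment-level workload manager dispatching jobs to several middleware back-ends; all class and method names are invented for illustration and do not correspond to any real AliEn, PanDA, DIRAC or ProdAgent API.

    # Hypothetical illustration of the layering described above: an experiment-level
    # workload manager sits on top of several Grid middleware "flavors" and hides
    # their differences from the physicist. All names here are invented.

    class MiddlewareBackend:
        """One Grid flavor (e.g. gLite, ARC or OSG) seen through a uniform interface."""
        def __init__(self, name):
            self.name = name

        def submit(self, job):
            # A real backend would translate 'job' into its own job-description
            # language and talk to the corresponding Grid services.
            print(f"[{self.name}] submitted: {job}")

    class ExperimentWorkloadManager:
        """Experiment-level layer (in the spirit of PanDA, AliEn, DIRAC, ...)."""
        def __init__(self, backends):
            self.backends = backends

        def run_task(self, task, n_jobs):
            # Trivial round-robin brokering, only to show the idea of one task
            # being spread over several Grid flavors.
            for i in range(n_jobs):
                backend = self.backends[i % len(self.backends)]
                backend.submit(f"{task} (job {i})")

    manager = ExperimentWorkloadManager(
        [MiddlewareBackend("gLite"), MiddlewareBackend("ARC"), MiddlewareBackend("OSG")]
    )
    manager.run_task("MC simulation", n_jobs=6)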
ATLAS Experiment at CERN: a multi-Grid infrastructure (graphics from a slide by A. Vaniachine)
Nordic DataGrid Facility
Provides a unique distributed “Tier1” center via NorduGrid/ARC
Involves the 7 largest Nordic academic HPC centers, plus a handful of university centers (Tier2 service)
Connected to CERN directly with a GEANT 10 Gbit fiber link
Inter-Nordic shared 10 Gbit network from NORDUnet
Budget: staff only, 2 MEUR/year, funded by the Nordic research councils
Swedish contribution: SweGrid

Investment                                            Time    Cost, KSEK
Six clusters (6x100 cores) including 12 TB FC disk    Dec …   …
Disk storage part 1, 60 TB SATA                       May …   …
Disk storage part 2, 86.4 TB SATA                     May …   …

Center    Tape volume, TB    Cost, KSEK
HPC2N     …                  …
PDC       …                  …
NSC       …                  …

SweGrid centers:
Location             Profile
HPC2N (Umeå)         IT
UPPMAX (Uppsala)     IT, HEP
PDC (Stockholm)      IT
C3SE (Gothenburg)    IT
NSC (Linköping)      IT
Lunarc (Lund)        IT, HEP

Co-funded by the Swedish Research Council and the Knut and Alice Wallenberg Foundation
One technician per center
Middleware: ARC, gLite
1/3 allocated to LHC Computing
SweGrid and NDGF usage
Swedish contribution to LHC-related Grid R&D
NorduGrid (Lund, Uppsala, Umeå, Linköping, Stockholm and others): produces the ARC middleware; 3 core developers are in Sweden
SweGrid: tools for Grid accounting, scheduling and distributed databases, used by NDGF and other projects
NDGF: interoperability solutions
EU KnowARC (Lund, Uppsala + 7 partners): a 3 MEUR, 3-year project that develops the next-generation ARC; the project’s technical coordinator is in Lund
EU EGEE (Umeå, Linköping, Stockholm)
Summary and outlook
Grid technology is vital for the success of the LHC
Sweden contributes very substantially with hardware, operational support and R&D, with very high efficiency
Sweden signed the MoU with the LHC Computing Grid in March 2008: a pledge of long-term computing service for the LHC
SweGrid2 is coming: a major upgrade of SweGrid resources
The Research Council has granted 22.4 MSEK for investments and operation; 43 MSEK more are being requested for the coming years
Includes not just Tier1, but also Tier2 and Tier3 support