A. Minaenko, Meeting on Physics and Computing, 16 September 2009, IHEP, Protvino
Current Status and Near-Term Prospects of ATLAS Computing in Russia
Layout
– Ru-Tier-2 tasks
– Analysis Model for the First Year (AMFY)
– Ru-Tier-2 resources
– Ru-Tier-2 resources needed for 2010
– Current activity
– STEP09
ATLAS RuTier-2 tasks
– The Russian Tier-2 (RuTier-2) computing facility is planned to supply computing resources to all 4 LHC experiments, including ATLAS. It is a distributed computing centre comprising at the moment the computing farms of 6 institutions: ITEP, RRC-KI, SINP (all Moscow), IHEP (Protvino), JINR (Dubna), PNPI (St. Petersburg)
– The main RuTier-2 task is to provide facilities for physics analysis of the collected data, using mainly AOD, DPD and user-derived data formats
– It also includes the development of reconstruction algorithms using limited subsets of ESD and RAW data
– About 50 active ATLAS users are expected to carry out physics data analysis at RuTier-2
– A group "ru" has now been created in the framework of the ATLAS VO. It includes physicists intending to carry out analysis at RuTier-2, and the group list contains 61 names at the moment. The group will have the privilege of write access to local RuTier-2 disk resources (space token LOCALGROUPDISK)
– All the data used for analysis should be stored on disk, and some unique data (user and group DPD) are to be saved on tape
– The second important task is the production and storage of MC simulated data
– The full size of the data, and the CPU needed for their analysis, are proportional to the collected statistics; the resources needed must therefore grow steadily with the number of collected events
Analysis data formats (AMFY)
DPD format for first-year analysis:
– Recommendation: only one format between AOD and ntuple (merged D1PD/D2PD), called dAOD (suggest also renaming PerfDPD to dESD, without however changing their definition at this point)
– dAOD driven by group analysis needs, with the possibility of added group info (example: top D2PD, here produced directly from AOD: top-dAOD)
– Coordinated via PC (the signal of one group is useful for background studies of others)
– Should be exercised for MC09 with final formats for the first year
– Analyses not covered use AOD directly
All data formats used for data need to be available in MC as well
Where to run your jobs (AMFY)
Physics analysis should start from AOD (or dAOD):
– Note dESD have AOD objects included
– Analysing AOD in general needs DB access
– Oracle servers are only available at CERN/Tier-1s; they will be supplemented in the next weeks/months by increasing amounts of Frontier/Squid caches at Tier-2 (Tier-3, laptop, …) sites, along with POOL files in appropriate data elements
– DB Release (as used for reprocessing, simulation) will become the backup solution
Run jobs on Tier-2s to analyse (d)AOD and produce ntuples:
– Tier-1s may be involved for group production, Tier-3s depending on the availability of data sets
Skeleton physics analysis model 09/10 (AMFY)
[Diagram] Analysis flow: AOD (plus AODfix; 2-3 reprocessings from RAW in 2009/10, with release/cache) → dAOD (analysis-group-driven definitions coordinated by PC; may have added metadata to allow ARA-only analysis from this point; re-produced for reprocessed data and significant metadata updates; data super-set of good runs for the period) → user file via a PAT ntuple dumper in Athena [main selection & reco work], keeping track of tag versions of metadata, lumi-info etc.; DB access direct or via Frontier/Squid, POOL files, use of TAG. The user format [final complex analysis steps] may take several forms (left to the user): POOL file (ARA analysis), ROOT tree, histograms, … → results. Developed analysis algorithms are to be ported back to Athena as much as possible.
Current RuTier-2 resources for ATLAS

Site      CPU, kSI2k   Disk, TB   ATLAS disk, TB
IHEP          …            …            …
ITEP          …            …            …
JINR          …            …            …
RRC-KI        …            …            …
PNPI          …            …            …
SINP          …            …            …
MEPhI         …            …            …
FIAN          …            …            …
Total         …            …            …

Red – sites used for user analysis of ATLAS data (IHEP, JINR, RRC-KI, PNPI; cf. the data-distribution slide); the others are for simulation only
Now the main type of LHC grid job is official production, and CPU resources are at the moment shared dynamically by all 4 LHC VOs on the basis of equal access rights for each VO at each site
Later, when analysis jobs take a larger part of the RuTier-2 resources, the sharing of CPU resources will be made proportional to the VO disk share at each site
This year CPU resources will be increased by 1000 kSI2k and disk by 1000 TB
Estimate of resources needed to fulfil RuTier-2 tasks in 2010
– The announced effective time available for physics data taking during the long run is 0.6*10^7 seconds
– The ATLAS DAQ event recording rate is 200 events per second, i.e. the whole expected statistics is 1.2*10^9 events
– The current AOD event size is 150 KB, 1.5 times larger than the ATLAS computing model requirement, and it will hardly be decreased by the start of data taking
– The full expected size of the current AOD version is therefore 180 TB
– It is necessary to keep 30-40% of the previous AOD version for comparisons. This gives a full AOD size of 230 TB – DATADISK
– During the first years of LHC running a very important task is the study of detector performance, and this task requires more detailed information than is available in the AOD. ATLAS plans to use for this task "performance DPDs", prepared on the basis of ESD. The targeted full performance DPD size is equal to the full AOD size, i.e. another 230 TB – DATADISK
– The expected physics DPD size (official physics DPD produced by the physics analysis groups) is at the level of 0.5 of the full AOD size, i.e. 120 TB more – GROUPDISK
– 50 TB should be reserved for local users (ntuples, histograms kept at the LOCALGROUPDISK token, 1 TB per user) – LOCALGROUPDISK
– The expected size of simulated AOD for MC08 (10 TeV) events alone is 80 TB, so about 150 TB need to be reserved for the full simulated AOD – MCDISK
– It is also necessary to keep some samples of ESD and RAW events and, probably, calibration data
– So the minimal requirement for disk space (230 + 230 + 120 + 50 + 150 TB, plus the ESD/RAW samples) is at the level of 800 TB
– Using the usual CPU/disk ratio of 3/1, one gets an estimate of about 2400 kSI2k for the needed CPU resources (a worked check follows below)
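As a cross-check, the whole estimate above fits in a few lines of arithmetic. A minimal Python sketch; every input is a figure quoted on this slide, and the 30% retention of the previous AOD version is taken as the lower end of the quoted 30-40% range:

    # Worked 2010 resource estimate; all inputs are the numbers quoted above
    events   = 0.6e7 * 200                 # 1.2e9 recorded events
    aod_tb   = events * 150e3 / 1e12       # 150 KB/event -> 180 TB per AOD version
    aod_full = 1.3 * aod_tb                # keep ~30% of the previous version -> ~230 TB
    perf_dpd = aod_full                    # performance DPD targeted at full AOD size
    phys_dpd = 0.5 * aod_full              # physics DPD -> ~120 TB
    local_tb = 50                          # LOCALGROUPDISK, 1 TB for each of ~50 users
    mc_tb    = 150                         # MCDISK, full simulated AOD
    disk_tb  = aod_full + perf_dpd + phys_dpd + local_tb + mc_tb
    print(round(disk_tb))                  # ~785 TB, i.e. ~800 TB with ESD/RAW samples
    print(round(3 * disk_tb))              # 3/1 CPU/disk ratio -> ~2400 kSI2k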
ATLAS RuTier-2 and data distribution
– The sites of RuTier-2 are associated with the ATLAS Tier-1 SARA
– All 6 sites (IHEP, ITEP, JINR, RRC-KI, SINP, PNPI) are now included in the TiersOfAtlas list, and FTS channels are tuned for data transfers to/from the sites
– 4 of the sites (IHEP, JINR, RRC-KI, PNPI) will be used for ATLAS data analysis, and all physics data needed for analysis will be kept at these sites. The other 2 sites will be used for MC simulation only
– All sites participate successfully in the data transfer functional tests (next 2 slides). This is a coherent data transfer test Tier-0 → Tier-1s → Tier-2s for all clouds, using existing SW to generate and replicate data and to monitor the data flow. It is now a regular activity performed once per week; all the data transmitted during the FTs are deleted at the end of each week. The volume of data used for the functional tests is at the level of 10% of the volume expected during real data taking
– RuTier-2 is now subscribed to receive all simulated AOD, DPD and TAGs, as well as cosmic AOD. The data transfer is done automatically under the steering and control of the central ATLAS DDM (Distributed Data Management) group
– The currently used shares (40%, 30%, 15%, 15% for RRC-KI, JINR, IHEP, PNPI) correspond to the disk resources available for ATLAS at the sites
– A similar scheme will be used for real data transfer to RuTier-2: MC data are transferred to the MCDISK space token, and cosmic and future real data to the DATADISK space token at RuTier-2
Main Tasks of ATLAS STEP09 (first two weeks of June)
ATLAS STEP09 tasks for Ru-Tier-2
– The main activities at Ru-Tier-2 during STEP09 include data replication and the provision of facilities for simulation and analysis jobs, all at the rates and in the proportions expected during real data taking
– Data replication implies data transfer to the Ru-Tier-2 sites and includes two types of data:
 – About 80 TB of MC simulated AOD and DPD for long-term storage. These data are to be used by analysis jobs submitted by the ATLAS gangarobot during STEP09, and will be used later for real analysis carried out by ATLAS users. The data are stored at the MCDISK space token
 – About 110 TB of fake MC data, needed only to imitate the data flow at the rate corresponding to LHC running. These data are stored at the DATADISK space token
– The following DATADISK shares were defined for our sites: RRC-KI – 50%, JINR – 20%, IHEP and PNPI – 15% each. This corresponds to the free disk space available at the sites (see the sketch below)
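For illustration, the quoted shares translate directly into per-site DATADISK volumes; a trivial sketch, with the 110 TB figure taken from this slide:

    # Fake-data DATADISK volume per site implied by the STEP09 shares above
    faked_tb = 110
    shares = {"RRC-KI": 50, "JINR": 20, "IHEP": 15, "PNPI": 15}   # percent
    for site, pct in shares.items():
        print(site, faked_tb * pct / 100, "TB")   # 55, 22, 16.5 and 16.5 TB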
STEP09 Data Distribution
ATLAS STEP09 tasks for Ru-Tier-2
– Analysis jobs were submitted using two different backends: gLite WMS and Panda
– WMS jobs used POSIX I/O to fetch data to the WN, i.e. a quasi-online method. For this purpose the native protocols are normally used: dcap for dCache SEs, rfio for DPM SEs. Input size – about 30 GB/job
– Panda used the File Stager: at the beginning of the job, all needed files are fetched to the local disk of the WN with the lcg-cp command, which in turn uses the gsiftp protocol. Input size – about 10 GB/job
– ATLAS requested that local job schedulers be tuned to support the following shares between job types during STEP09:
 – Simulation jobs (Role=production) – 50%
 – Panda analysis jobs (Role=pilot) – 25%
 – All the others, including WMS analysis jobs – 25%
– For RDIG sites we additionally requested 5% for jobs of the group atlas/ru
– This requirement is crucial for successful analysis, because otherwise the numerous long simulation jobs simply crowd out the short analysis jobs
– None of this was done at the RDIG sites and, as I understand, at RRC-KI and PNPI even Role=pilot was not implemented for ATLAS
STEP09 Analysis Jobs Workflow
STEP09 DDM Results
STEP09 Analysis Global Results
STEP09 Analysis: 2 Backends
Analysis jobs in all clouds
Next slides: Ru-Tier-2 sites marked in red; in yellow – sites with a considerably larger contribution (better results) than IHEP
Analysis jobs in NL cloud
Panda Analysis Jobs
Panda Analysis Jobs
WMS Analysis Jobs
Problems with Analysis at Ru-Tier-2 sites
IHEP: the results are rather good for both backends. 40% of all events analysed in the NL cloud were treated here, and there are not many sites in the grid that made a considerably larger contribution (marked in yellow in the previous slides)

IHEP     Efficiency   Hz     CPU/Wall   #events
WMS      84%          9.7    39%        300 M
Panda    95%          12.4   38%        250 M

Problems:
– Poor scheduling, as at all the other sites. Even the fair share between VOs was not provided properly. This is due to the fact that different job types (production, pilot, ordinary) had different priorities, and job wait time had too large an influence. The required shares between the different groups of ATLAS jobs were not provided either. In general the results of such scheduling were unpredictable. Two slides from the STEP09 post-mortem follow, which hint that this is rather a MAUI scheduler problem
– ATLAS bug with the libgfal.so name during the ROOT library build. The problem affected WMS jobs, decreasing their efficiency

PNPI: there was one but severe problem – a very narrow SE → WNs bandwidth, which did not permit any real analysis at the site

PNPI     Efficiency   Hz     CPU/Wall   #events
WMS      21%          0.8    11%        4 M
Panda    0%           –      –          –
Problems with Analysis at Ru-Tier-2 sites
JINR: the results are not satisfactory, especially for WMS, but this is due to ATLAS rather than site problems

JINR     Efficiency   Hz     CPU/Wall   #events
WMS      4%           11.0   80%        6 M
Panda    55%          8.6    28%        215 M

Problems:
– The ATLAS bug with the libgfal.so name during the ROOT library build killed practically all WMS jobs. The site uses the gsidcap protocol rather than dcap, and in this case the library is always required
– Panda jobs use the lcg-cp command to fetch input files to the local WN disk at the very beginning of the job. lcg-cp in turn uses the gsiftp protocol for the data transfer. In the case of a dCache SE, all gsiftp traffic goes through gridftp doors placed on some nodes of the SE. At the very beginning of STEP09 there was only 1 gridftp door at the site; the number was then increased to 3, but in any case 1 or 3 Gbps is much less than the really available SE → WNs bandwidth, leading to a bottleneck and job failures. ATLAS should use only the native protocols (dcap/gsidcap, rfio) for intra-farm data transfers (see the sketch below)
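A back-of-the-envelope illustration of the gridftp-door bottleneck described above; the 10 GB/job input size and the 1-3 doors are from this slide, while the numbers of concurrent jobs are assumed round figures:

    # Stage-in time when all lcg-cp traffic is funnelled through gridftp doors
    GBIT_BPS = 1e9 / 8                        # bytes/s through one 1 Gbit/s door

    def stagein_seconds(input_gb, doors, jobs):
        """Time for one lcg-cp to copy its input when `jobs` concurrent
        transfers share `doors` doors of 1 Gbit/s each."""
        per_job_bw = doors * GBIT_BPS / jobs  # bytes/s available to each transfer
        return input_gb * 1e9 / per_job_bw

    print(stagein_seconds(10, doors=1, jobs=20))    # 1600 s: 20 jobs, 1 door
    print(stagein_seconds(10, doors=3, jobs=100))   # ~2670 s: 100 jobs, 3 doors,
                                                    # already over a 1800 s timeout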
Problems with Analysis at Ru-Tier-2 sites
RRC-KI: the results are also rather poor. The efficiency and frequency are low for both backends, as is CPU/Walltime for WMS

RRC-KI   Efficiency   Hz     CPU/Wall   #events
WMS      63%          1.6    10%        50 M
Panda    18%          6.6    39%        18 M

Problems:
– POSIX I/O fails when trying to read the next event
– lcg-cp times out after 1800 s. This and the above problem are due to a bottleneck in the SE → WNs pipeline. The farm network configuration needs to be seriously corrected to solve these problems
– lcg-cr fails on attempts to write small files (the log file and the output ROOT file with histograms) to SCRATCHDISK. This opposite traffic, SE ← WNs, is rather small and should not suffer from the bottleneck. The cause is evidently some SE configuration error, which has not yet been found and fixed
Job Scheduling Problem, slide from STEP09 post-mortem
Job Scheduling Problem, slide from STEP09 post-mortem
Summary STEP09
– The data transfer part of STEP09 was successful in general. Measures have been taken to make the RRC-KI SE work more stably; the external bandwidth at IHEP has already been increased to 1 Gbps
– The main problem: the farms at RRC-KI and PNPI need to be reconfigured to remove the bottlenecks. The farm design at the other sites needs to be checked and, probably, revised as well, to be ready for future challenges. Possible solutions are presented in the enclosed presentation
– Fix all the other problems found and repeat the analysis exercises as soon as possible
– A very serious and urgent problem: a new scheduler is needed, or MAUI must be tuned properly, if that is possible. This is not only our problem but rather a general LCG problem. Efficient analysis will not be possible without its solution
– It is necessary to measure the output bandwidth of our individual file servers of the different types, as well as the full SE output bandwidth at each site for each VO. This is needed to understand how many analysis jobs each site can accept (see the enclosed presentation and the sketch below)
– In the future: try to increase the frequency up to 20 Hz and CPU/Wall up to 100%
– Raise the question of the libgfal.so library problem, as well as of the use of the gsiftp protocol by ATLAS analysis jobs for intra-farm data transfers
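One possible form of that per-site capacity estimate; a sketch assuming the 20 Hz per-job target and the 150 KB AOD event size quoted in this talk, with the measured SE output bandwidth as the free parameter (the example bandwidth values are assumptions, not measurements):

    # How many concurrent analysis jobs a site can feed from its SE
    def max_analysis_jobs(se_bw_mb_s, target_hz=20, event_kb=150):
        per_job_mb_s = target_hz * event_kb / 1000.0   # 3 MB/s per job at 20 Hz
        return int(se_bw_mb_s / per_job_mb_s)

    for bw in (125, 375, 1250):                        # ~1, 3 and 10 Gbps in MB/s
        print(bw, "MB/s ->", max_analysis_jobs(bw), "jobs")
    # 125 MB/s -> 41 jobs, 375 -> 125, 1250 -> 416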