Profiling Analysis at Startup
US ATLAS T2/T3 Workshop at UChicago, August 19, 2009

Presentation transcript:

Profiling Analysis at Startup
Jim Cochran, Iowa State
Charge from Rob & Michael: best guess as to the loads analysis will place on the facilities (starting from physics working group plans, job and task definitions):
- likely dataset distributions to the Tier 2s (content and type)
- user submission and data retrieval activity
Outline:
- Review the run-1 load estimate (a moving target) and concerns
- Focus on expectations for the first 1-2 months
- Recommendation on dataset "distributions" (my personal suggestion)

To set the scale
From the US ATLAS db (Dec. 2008):
- Total people in US ATLAS = 664
- Total students = 140
- Total postdocs = 90
- Total research scientists = 105
- Total faculty = 140
- ~375 potential analyzers
During STEP09 we supported ~200 non-HC users (see Kaushik's talk), but the activity profile was likely not what we will have with real data.

ATLAS Computing Model
The model may be different for early data (i.e. the T2 analysis focus). Note that user analysis on the T1 is not part of the Computing Model (though there will be user analysis at BNL). We need to move the bulk of user activity from T1 → T2.

ATLAS Analysis Model – analyzer view
- ESD/AOD, D1PD, D2PD are POOL based; the D3PD is a flat ntuple.
- Contents are defined by the physics group(s): D1PDs are made in official production (T0) and remade periodically on the T1; DnPDs are produced outside official production on the T2 and/or T3 (by a group, sub-group, or university group).
- Flow: streamed ESD/AOD (T0/T1) → thin/skim/slim → D1PD → 1st-stage analysis (T2) → DnPD → ROOT histograms (T3).

Expected analysis patterns for early data
Assume the bulk of group/user activity will happen on the T2s/T3s (define the user-accessible area of the T1 as a T3 [BAF/WAF]).
Assume the final stage of analysis (plots) happens on T3s, since T2s are not interactive [except for the WAF].
Two primary modes:
(1) A physics group/user runs jobs on the T2s to make a tailored dataset (usually a D3PD; potential inputs: ESD, AOD, D1PD); the resultant dataset is then transferred to the user's T3 for further analysis.
(2) A group/user copies input files to a specified T3 (potential inputs: ESD, AOD, D1PD); on the T3 the group/user either generates a reduced dataset for further analysis or performs the final analysis on the input dataset.
The choice depends strongly on the capabilities of the T3, the size of the input datasets, etc. We also expect some users to run D3PD analysis jobs directly on the T2 analysis queues.

Analysis Requirements Study Group: Initial Estimate
Charge: estimate the resources needed for the analysis and performance studies planned for 2009 & 2010.
Motivation: management needs input from the US physics community in order to make decisions/recommendations regarding current and future facilities.
- Considerable overlap with some activities of the T3 Task Force (we worked together; many of the same people).
Basic idea:
(1) Predict, based on institutional polling, the US-based analyses.
(2) Classify them as performance, physics-early, or physics-late (sorted by input stream).
(3) Make assumptions about the repetition rate (expected to vary with time).
(4) Compute the needed storage and CPU-seconds (using benchmarks).
Received responses from 39/43 institutions; guessed for the missing 4.
Here "early" = months 1-4 and "late" = months 5-11.
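As an illustration of step (4), a minimal sketch of the bookkeeping, assuming placeholder analysis names, event counts, per-event benchmarks, and capacity figures (none of these are the values used in the real estimate):

```python
# Minimal sketch of step (4): for each planned analysis, multiply the events read
# per pass by a per-event CPU benchmark and the number of passes in the period,
# then compare the total against the Tier2 analysis capacity.
# All numbers below are placeholders, not the values used in the real estimate.

analyses = [
    # (name, events read per pass, kSI2k-s per event, passes in the period)
    ("egamma performance", 4e8 * 0.2, 1.0, 3),
    ("W/Z(e) physics",     4e8 * 0.1, 1.0, 2),
]

needed_ksi2k_s = sum(n_evt * cpu_per_evt * passes
                     for _, n_evt, cpu_per_evt, passes in analyses)

available_ksi2k = 4000            # placeholder analysis-dedicated capacity (kSI2k)
period_s = 4 * 30 * 24 * 3600     # months 1-4 of wall clock
available_ksi2k_s = available_ksi2k * period_s

print(f"needed:    {needed_ksi2k_s:.2e} kSI2k-s")
print(f"available: {available_ksi2k_s:.2e} kSI2k-s")
```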

Additional inputs & assumptions
Streaming fractions* were assumed for the following streams:
- Performance (ESD/pDPD): egamma, muon, track, W/Z(e), W/Z(m), W/Z(T)/mET, gamjet, minbias
- Physics 2009 (AOD/D1PD): egamma, muon, jet/mET
- Physics 2010 (AOD/D1PD): egamma, muon, jet/mET
* The pDPD streaming fractions were found on the pDPD TWiki in Jan 2009; they are no longer posted, so regard them as very preliminary. GB pointed out that I missed the jet pDPD (will fix in the next draft).
Number of data events, based on the current CERN 2009/2010 plan of 1.2 × 10^9 events with 1/3 before April 1 and 2/3 after:
- months 1-4: 2 × 10^6 s × 200 Hz = 4 × 10^8 events
- months 5-11: 4 × 10^6 s × 200 Hz = 8 × 10^8 events
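The event totals follow directly from live time times trigger rate; a one-line cross-check in Python, using only the rate and live times quoted above:

```python
rate_hz = 200                       # assumed average rate to storage
events_m1_4  = 2e6 * rate_hz        # 2 x 10^6 s of live time, months 1-4
events_m5_11 = 4e6 * rate_hz        # 4 x 10^6 s of live time, months 5-11
assert events_m1_4 == 4e8 and events_m5_11 == 8e8
assert events_m1_4 + events_m5_11 == 1.2e9   # matches the CERN 2009/2010 plan
```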

More inputs & assumptions
Institutional response summary, tallied per input stream:
- Performance (ESD → D3PD): egamma, muon, track, W/Z(e), W/Z(m), W/Z(T)/mET, gamjet, minbias
- Physics 2009 (AOD → D3PD): egamma, muon, jet/mET
- Physics 2010 (AOD → D3PD): egamma, muon, jet/mET
Assume analyses begun in 2009 continue in 2010.
In anticipation of the need for institutional cooperation (common DPDMaker, common D3PD), the same Performance / Physics (2009) / Physics (2010) tallies were regrouped under three scenarios: minimal cooperation, maximal cooperation, and supermax cooperation.

Tier2 CPU Estimation: Results
Compare the needed CPU time with the available CPU time (in units of 10^10 kSI2k-s):

  Scenario                    m1-4   m5-11
  all analyses independent     8.1     17
  minimal cooperation           -       -
  maximal cooperation           -       -
  supermax cooperation          -       -
  US Tier2s (available)         4       13

Note that having every analysis make its own D3PDs is not our model! We have always known that we will need to cooperate.
Conclusion: the available Tier2 CPU should be sufficient for analyses.
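For intuition on the table's units, a sketch converting an installed analysis capacity into 10^10 kSI2k-s over months 1-4; the capacity figure is a placeholder, not the official US pledge split:

```python
# Express an analysis CPU capacity in the table's units of 10^10 kSI2k-s.
analysis_capacity_ksi2k = 4000        # placeholder share of T2 CPU for analysis
period_s = 4 * 30 * 24 * 3600         # wall-clock seconds in months 1-4
available = analysis_capacity_ksi2k * period_s / 1e10
print(f"{available:.1f} x 10^10 kSI2k-s")   # ~4 with this placeholder
```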

Storage plans (as I understand them)
- 2 copies of AODs/D1PDs (data + MC) are distributed over the US T2s.
- 1 copy of the ESD (data only) is distributed over the US T2s (expect only for …); it may be possible to use perfDPDs in some cases.
- Included in the LCG pledge: T1: all AOD, 20% ESD, 25% RAW; each T2: 20% AOD (and/or 20% D1PD?).
Decided:
- D1PDs are initially produced from AODs as part of T0 production and replicated to T1s and T2s.
- D1PDs will be remade from AODs as necessary on the T1.
Not decided:
- Too early to make decisions about D2PDs.
- Streaming strategy for D1PDs (3 options under consideration; a very active area of discussion).
- Final content of the D1PDs.

Tier2 Storage Estimation: Results
Recall (from slide 20) that T2 beyond-pledge storage must accommodate 1 set of AODs/D1PDs, 1 set of ESDs, and the needs of users/groups (which is what we are trying to calculate).
Subtracting the AOD/D1PD and ESD storage from the beyond-pledge storage, we find the space available for individual users:
- m1-4: 0 TB
- m5-11: 0 TB (17 TB if we assume only 20% ESD)
We have insufficient analysis storage until the Tier2 disk deficiency is resolved; no level of cooperation is sufficient here.
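A minimal sketch of the subtraction, using the m5-11 sample volumes from the backup slide and a placeholder beyond-pledge capacity chosen only so the arithmetic reproduces the 0 TB / 17 TB figures quoted above:

```python
# User D3PD space = beyond-pledge disk minus one set of AOD/D1PD and one set of ESD.
beyond_pledge_tb = 425                    # placeholder capacity, m5-11
esd_tb, aod_tb, d1pd_tb = 840, 204, 36    # 1 copy each, m5-11 (data only)

for label, esd_kept in (("full ESD", esd_tb), ("20% ESD", 0.2 * esd_tb)):
    left = beyond_pledge_tb - (esd_kept + aod_tb + d1pd_tb)
    print(f"{label}: {max(left, 0):.0f} TB left for user D3PDs")   # 0 TB / 17 TB
```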

Tier3 CPU & Storage Estimation
The T3 CPU and disk calculations are essentially the same as the T2 calculations:
- Sum the kSI2k-s needed for all analyses at an institution (m1-4, m5-11) and compare with the kSI2k-s available in 1 hour at the institution's T3.
- Sum the TB needed for all analyses at the institution and compare with the TB available.
Each institution was sent its estimate and comparison (week of 3/23) and asked to correct/confirm; many responses so far (still being incorporated).
Most T3s have insufficient resources for their planned activities; we hope for a remedy from ARRA/NSF.

Scale of "missing" T3 resources
Summing the "missing" T3 resources (for institutions both with and without existing T3 hardware), assuming 2 kSI2k/core and no ARRA/NSF relief, gives the number of cores and TB missing in m1-4 and m5-11. This sets the scale for a potential "single" US analysis facility (T3af). The same totals can also be sorted by geographic region (Western, Midwestern, Eastern), again as # cores and TB for m1-4 and m5-11.
There are already 2 de-facto T3af's:
- Brookhaven Analysis Facility (BAF) (interactive T1)
- Western Analysis Facility (WAF) at SLAC (interactive T2)
The exact sizes of these "facilities" are difficult to pin down.
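A minimal sketch of the core-count conversion behind these tallies; only the 2 kSI2k/core figure comes from the slide, the shortfall is a placeholder:

```python
# Convert a summed CPU-time shortfall into a core count at 2 kSI2k/core.
missing_ksi2k_s = 3.0e10          # placeholder: total T3 shortfall, months 1-4
period_s = 4 * 30 * 24 * 3600     # months 1-4 of wall clock
ksi2k_per_core = 2
cores_needed = missing_ksi2k_s / period_s / ksi2k_per_core
print(f"~{cores_needed:.0f} cores")   # ~1450 with these placeholders
```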

T2→T3 data transfer
Need: sustained rates of hundreds of Mbps.

Single dq2_get command from ANL ASC (Tuning 0 / Tuning 1):
- AGLT2_GROUPDISK: - / 62 Mbps
- BNL-OSG_GROUPDISK: 52 Mbps / 272 Mbps
- SLACXRD_GROUPDISK: 27 Mbps / 347 Mbps
- SWT2_CPG_GROUPDISK: 36 Mbps / 176 Mbps
- NET2_GROUPDISK: 83 Mbps / 313 Mbps
- MWT2_UC_MCDISK: 379 Mbps / 423 Mbps

Single dq2_get command from Duke (Tuning 0 / Tuning 1):
- AGLT2_GROUPDISK: -
- BNL-OSG_GROUPDISK: 38 Mbps / 42 Mbps
- SLACXRD_GROUPDISK: 98 Mbps
- SWT2_CPG_GROUPDISK: 28 Mbps / ? Mbps
- NET2_GROUPDISK: 38 Mbps / 120 Mbps
- MWT2_UC_MCDISK: 173 Mbps

Tuning 0: no tuning. Tuning 1: see Sergei's talk.
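For scale, a quick estimate of what a sustained rate means for pulling a dataset down to a T3; both the dataset size and the rate below are illustrative assumptions, not measurements from the tables above:

```python
# Time to copy a dataset from a T2 to a T3 at a sustained network rate.
dataset_tb = 5          # illustrative: a skimmed D3PD set destined for a T3
rate_mbps = 300         # illustrative sustained rate after tuning
megabits = dataset_tb * 8e6             # 1 TB = 8 x 10^6 megabits
hours = megabits / rate_mbps / 3600
print(f"{hours:.1f} hours")             # ~37 hours at 300 Mbps
```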

Readiness Summary
US resources:
- Tier2 CPU: OK.
- Tier2 disk: analysis will be negatively impacted if the deficiency is not resolved.
- Tier3s: most have insufficient resources for their planned activities; we hope for ARRA/NSF relief.
- Incorporating T3s into the US T1/T2 system (and testing them) is an urgent priority.
- Support for T3s is expected to be a major issue (see tomorrow's talks).
Readiness testing:
- Expect ~200 US-based analyses to start in the first few months of running.
- Increasingly expansive robotic and user tests are being planned.
- By now the T2 analysis queues are in continuous use, but larger-scale testing is needed.
- Not well tested: large-scale data transfers from the T2s to T3s; this is urgent.

Expectations: 1st 1-2 months

Storage needed beyond the LCG pledge (not yet counting user D3PD needs)
Months 1-2: 1 × 10^6 s × 200 Hz = 2 × 10^8 events. We already know the T2 CPU is sufficient.

  Sample     Size/event   1 copy (m1-2)
  ESD        700 kB       140 TB
  perfDPD    NA           NA
  AOD        170 kB       34 TB
  D1PD       30 kB        6 TB
  Sim AOD    210 kB       126 TB

For now, assuming no extra factor due to inclusive streaming. Assuming the number of MC events = 3x the number of data events; we will not duplicate the Sim AOD [it will count only against the pledge].
This gives a total disk storage requirement (before user output needs) of m1-4: 90 TB.
Maybe OK if we're not using all pledged resources.
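The per-sample volumes are simply events times size/event; a short check using only the slide's numbers (the 3x MC factor applies only to the Sim AOD):

```python
# Check the months 1-2 storage arithmetic: events x size/event, in TB.
data_events = 1e6 * 200           # 1 x 10^6 s at 200 Hz
mc_events = 3 * data_events       # MC = 3x data (Sim AOD only)
kb_per_event = {"ESD": 700, "AOD": 170, "D1PD": 30, "Sim AOD": 210}

for sample, size_kb in kb_per_event.items():
    n_events = mc_events if sample == "Sim AOD" else data_events
    tb = n_events * size_kb * 1e3 / 1e12      # kB -> bytes -> TB
    print(f"{sample}: {tb:.0f} TB")           # 140, 34, 6, 126 TB
```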

Recommendation on dataset "distributions" (my personal suggestion)

Recall:
- 2 copies of AODs/D1PDs (data + MC) are distributed over the US T2s.
- 1 copy of the ESD (data only) is distributed over the US T2s (expect only for …); it may be possible to use perfDPDs in some cases.
- Included in the LCG pledge: T1: all AOD, 20% ESD, 25% RAW; each T2: 20% AOD (and/or 20% D1PD?).
AOD/D1PD storage should not be a problem (assume the distribution is handled automatically?).
For the first 2 months, all ESD and pDPD streams should be available on the T2s.
Should we consider associating specific streams with specific T2s? Who decides this? When?

Final Comment
We need to perform testing beyond the "20% of T2" level. In the lead-up to a big conference, we will likely be asked to suspend production and allocate all (or almost all) of the T2s to analysis; we need to test that we can support this (more intensive HC?).

Backup Slides

Storage needed beyond the LCG pledge (not yet counting user D3PD needs)

  Sample     Size/event   1 copy (m1-4)   1 copy (m5-11)
  ESD        700 kB       280 TB          840 TB
  perfDPD    NA           NA              NA
  AOD        170 kB       68 TB           204 TB
  D1PD       30 kB        12 TB           36 TB
  Sim AOD    210 kB       252 TB          756 TB

For now, assuming no extra factor due to inclusive streaming. Assuming the number of MC events = 3x the number of data events; we will not duplicate the Sim AOD [it will count only against the pledge].
This gives a total disk storage requirement (before user output needs) of m1-4: 180 TB; m5-11: 1080 TB (408 TB if we assume only 20% ESD).

Available Tier2 resources & what's left over for user D3PD output
The comparison covered, for CPU (kSI2k) and disk (TB), the 2009 pledge, 2009 installed, 2009 available to users, 2009 Q3 installed, and 2009 Q3 available to users (values from M. Ernst's JOG Apr 09 contribution; the 2009 CPU pledge is 6210 kSI2k). The US is currently behind on its 2009 disk pledge.
Assume m1-4 corresponds roughly to the 2009 capacity and m5-11 to the 2009 Q3 capacity.
Recall the ESD, AOD, and D1PD expectations (previous slide): m1-4: 180 TB; m5-11: 1080 TB (408 TB if we assume only 20% ESD).
Available for individual users (D3PD output): m1-4: 0 TB; m5-11: 0 TB (17 TB if we assume only 20% ESD).
Expect the actual allocation to be somewhat dynamic, monitored closely by the RAC.