SLUO LHC Workshop, SLAC, July 16-17, 2009

Analysis Model, Resources, and Commissioning
J. Cochran, ISU


Analysis Model, Resources, and Commissioning
J. Cochran, ISU

Caveat: for the purpose of estimating the needed resources, an analysis model is assumed; the AMFY* report will supersede it and may alter results.
Given time constraints, some details are in the back-up slides.

*AMFY = Analysis Model First Year

ATLAS Computing Model

- May be different for early data (i.e. D1PD, T2 analysis focus)
- Note that user analysis on T1 is not part of the Computing Model

ATLAS Analysis/Computing Model – simpleton view

- ESD/AOD, D1PD, D2PD – POOL-based; D3PD – flat ntuple
- Contents defined by physics group(s)
- ESD/AOD/D1PD made in official production at T0; remade as needed on T1
- D2PD produced outside official production on T2 and/or T3
- DnPD and root histos/plots produced on T3

Chain: streamed ESD/AOD → (thin/skim/slim) → D1PD → 1st-stage analysis → DnPD → root histos/plots

Claim: all analyses (data reduction) can be broken down into a series of transforms [input → output] (i.e. AOD → D1PD, D1PD → D3PD, D3PD → plots)
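The transform chain above is just function composition: each stage takes one dataset and produces a smaller one. A minimal sketch, with entirely hypothetical event records and reduction criteria (real transforms run in Athena/PanDA, not like this):

```python
# Sketch of the AOD -> D1PD -> D3PD -> plots chain as composed functions.
# Event contents, the "trigger" flag, and the kept variables are hypothetical.

def aod_to_d1pd(events):
    # skim: keep only events passing a (hypothetical) trigger flag
    return [e for e in events if e["trigger"]]

def d1pd_to_d3pd(events):
    # slim: keep only the variables needed downstream (flat-ntuple style)
    return [{"pt": e["pt"]} for e in events]

def d3pd_to_plot(events):
    # final stage: reduce to a quantity that would be histogrammed
    return sum(e["pt"] for e in events) / len(events)

aod = [{"trigger": True,  "pt": 40.0, "eta": 1.2},
       {"trigger": False, "pt": 15.0, "eta": 0.3},
       {"trigger": True,  "pt": 60.0, "eta": -2.1}]

mean_pt = d3pd_to_plot(d1pd_to_d3pd(aod_to_d1pd(aod)))
print(mean_pt)  # mean pt of the two triggered events: 50.0
```

The point of the claim is exactly this composability: resource estimates can be built per-transform and then summed over the chain.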

Expected analysis patterns for early data

Assume the bulk of group/user activity will happen on T2s/T3s (define the user-accessible area of a T1 as a T3af). Assume the final stage of analysis (plots) happens on T3s, since T2s are not interactive [except for SLAC].

Two primary modes:
(1) Physics group/(user?) runs jobs on T2s to make a tailored dataset, usually a D3PD (potential inputs: ESD, AOD, D1PD); the resultant dataset is then transferred to the user's T3 for further analysis.
(2) Group/user copies input files to a specified T3 (potential inputs: ESD, AOD, D1PD); on the T3 the group/user either generates a reduced dataset for further analysis or performs the final analysis on the input data set.

The choice depends strongly on the capabilities of the T3, the size of the input data sets, etc.

Resources

What resources are needed for analysis?
- Number and type of US-based analyses estimated (based on institutional polling), from the T3 report (Amir Farbin)
- Using known benchmarks (+ other assumptions), compute the needed storage and CPU-seconds

[Figure: Tier2 simulation load for 1 year (~2012, not 2010); horizontal axis: fraction fully simulated, vertical axis: fraction fast-simulated (ATLFAST II)]

Assume only 20% of T2 resources are available for analysis. Is this sufficient?

Tier2 CPU Estimation: Results

Compare needed CPU-seconds with available CPU-seconds (kSI2k-s needed per 10^10 events):
- all analyses independent: 17
- minimal cooperation: 8.5
- maximal cooperation: 4.3
- supermax cooperation: 1.0
US Tier2s: 13 kSI2k-s available per 10^10 events

Note that having every analysis make its own D3PDs is not our model! We have always known that we will need to cooperate.
Available Tier2 CPU should be sufficient for 2010 analyses.

Tier2 Storage Estimation: Results

US Plan:
- Copies of AODs/D1PDs (data + MC) distributed over the US T2s
- 1 copy of ESD (data only) distributed over the US T2s (expect only for ) (may be able to use perfDPDs in some cases)

Included in the LCG pledge: T1: all AOD, 20% ESD, 25% RAW; each T2: 20% AOD (and/or 20% D1PD?). The T2 pledge is somewhat beyond what's needed for 20% AOD, but the US is currently behind on its LCG pledge for storage.

TB needed (2010):
- all analyses independent: 3717
- minimal cooperation: 2379
- maximal cooperation: 613
- supermax cooperation: 143

Available for user analysis: 0 TB (17 TB if we assume only 20% ESD).
We have insufficient analysis storage until the Tier2 disk deficiency is resolved; no level of cooperation is sufficient here.
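The storage shortfall follows from the same table arithmetic; a sketch with the slide's 2010 numbers:

```python
# TB needed per cooperation scenario (2010 numbers from the slide) vs. the
# 17 TB available for user analysis if only 20% of the ESD is kept.
needed_tb = {
    "all analyses independent": 3717,
    "minimal cooperation": 2379,
    "maximal cooperation": 613,
    "supermax cooperation": 143,
}
available_tb = 17

shortfall_tb = {s: max(0, n - available_tb) for s, n in needed_tb.items()}
for scenario, short in shortfall_tb.items():
    print(f"{scenario}: short by {short} TB")
```

Unlike the CPU case, even the supermax scenario needs 143 TB against 17 TB available, which is why no level of cooperation closes the gap.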

Analysis Commissioning/Testing

- T2 analysis queues in existence and in use since Fall 2008
- Robotic stress tests (robotic submission, by experts, of example user jobs) have been running on some queues since Fall 2008 and on all queues since April 2009; they reached a peak as a major component of the STEP09 exercise (early June 2009)
- User tests are more difficult to organize: the US held a 4-day, 3-site Jamboree/Stress Test in Sep 2008 - useful but not nearly extensive enough
- US expert user testing included as part of STEP09 (and ongoing) - see Andy's talk
- Expanded US tests next week (including a T2 → T3 data transfer test); single-day "all hands" US test in mid-August (?)
- ATLAS-wide user tests now in planning stages

SLUO LHC Workshop, SLACJuly 16-17, Backup

Data Formats

Format                                                         Size (kB)/evt
RAW  - data output from DAQ (streamed on trigger bits)         1600
ESD  - event summary data: reco info + most RAW                500
AOD  - analysis object data: summary of ESD data               150
TAG  - event-level metadata with pointers to data files        1

Derived Physics Data (DPDs):
D1PD - subset, refined, little brother of AOD                  ~25
D2PD - specific to physics (sub)group, augmented, undefined    ~30
D3PD - flat roottuple                                          ~5
perfDPD - performance DPD, calibrations, etc. (early data)

Claim: all analyses can be broken down into a series of transforms [input → output] (i.e. AOD → D1PD, D1PD → D3PD, D3PD → plots)
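The per-event sizes above translate directly into dataset volumes; a small illustration, where the event count (10^9 events) is a hypothetical round number, not a number from the talk:

```python
# Per-event sizes in kB from the Data Formats table; dataset volume in TB
# for a hypothetical sample of 1e9 events (1 TB = 1e9 kB, decimal units).
sizes_kb = {"RAW": 1600, "ESD": 500, "AOD": 150, "D1PD": 25, "D3PD": 5, "TAG": 1}
n_events = 1_000_000_000  # hypothetical sample size

volumes_tb = {fmt: kb * n_events / 1e9 for fmt, kb in sizes_kb.items()}
for fmt, tb in volumes_tb.items():
    print(f"{fmt}: {tb:.0f} TB")
```

This is why the reduction chain matters: for the same sample, RAW is over 300x larger than a D3PD.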

Starting point: The Transforms

Claim: all analyses can be broken down into a series of transforms [input → output] (i.e. AOD → D1PD, D1PD → D3PD, D3PD → plots)

- Skimming - removing entire events
- Slimming - removing parts of objects
- Thinning - removing objects
- Augmenting - costs CPU, may increase output size
- Merging - concatenating files of the same type

Inputs: ESD, AOD, D1PD, D2PD, D3PD; outputs: ESD, AOD, D1PD, D2PD, D3PD, plots.

Assume (for 2009 & 2010 user analysis):
- T2 activity will be ESD/perfDPD → D3PD and AOD/D1PD → D3PD
- T3 activity will be D3PD → plots
(basic model most people are using now)
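The first three transform types differ only in what they drop; a toy sketch making the distinction concrete (the event records, the trigger flag, and the pt/ncells variables are all hypothetical):

```python
# Toy events: each event is a dict holding a list of jet objects.
events = [
    {"pass_trigger": True,  "jets": [{"pt": 50.0, "eta": 1.1, "ncells": 80},
                                     {"pt": 12.0, "eta": 2.5, "ncells": 30}]},
    {"pass_trigger": False, "jets": [{"pt": 25.0, "eta": 0.2, "ncells": 40}]},
]

def skim(events):
    # Skimming: remove entire events
    return [e for e in events if e["pass_trigger"]]

def thin(events, min_pt=20.0):
    # Thinning: remove objects (here, low-pt jets) within each event
    return [{**e, "jets": [j for j in e["jets"] if j["pt"] >= min_pt]}
            for e in events]

def slim(events):
    # Slimming: remove parts of objects (here, drop the ncells detail)
    return [{**e, "jets": [{"pt": j["pt"], "eta": j["eta"]} for j in e["jets"]]}
            for e in events]

out = slim(thin(skim(events)))
print(len(out), len(out[0]["jets"]))  # one event survives, with one jet kept
```

Skim shrinks the event count, thin shrinks the object count, slim shrinks each object; the reductions compose freely, which is what lets a D1PD or D3PD definition mix all three.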

Transform rates

Obtained from PanDA on FDR data - stable over the expected range of file sizes (number of events):
- ESD → D3PD: 13 Hz
- ESD → D3PD-small: 30 Hz
- ESD → D3PD-verysmall: 82 Hz
- AOD → D3PD: 14 Hz
- AOD → D3PD-small: 35 Hz
- AOD → D3PD-verysmall: 91 Hz
Rates correspond to kSpecInt2k. We don't yet have enough info to know which analyses will use standard, small, or verysmall; assume all standard. Choose 10 Hz for both ESD/pDPD → D3PD and AOD/D1PD → D3PD.

D3PD → plots: large variation (see ATL-COM-SOFT-002.pdf); HammerCloud tests on AOD find ~10 Hz. Depending on input/output file size and choice of analysis software, the rate varied from kHz. Choose 10 kHz for D3PD → plots.
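Per-core rates like these convert directly into wall-clock estimates; a sketch using the slide's rates, where the dataset size (10^8 events) and the number of concurrent analysis slots (500) are hypothetical illustration values, not figures from the talk:

```python
# Per-core transform rates (Hz, kSpecInt2k-normalized) from the slide.
rates_hz = {
    "ESD -> D3PD": 13, "ESD -> D3PD-small": 30, "ESD -> D3PD-verysmall": 82,
    "AOD -> D3PD": 14, "AOD -> D3PD-small": 35, "AOD -> D3PD-verysmall": 91,
}
n_events = 100_000_000  # hypothetical dataset: 1e8 events
n_cores = 500           # hypothetical number of concurrent analysis slots

days = {t: n_events / (hz * n_cores) / 86400 for t, hz in rates_hz.items()}
for transform, d in days.items():
    print(f"{transform}: {d:.2f} days")
```

Even at the slowest rate (13 Hz), 10^8 events finish in well under a day on 500 slots, which is why the CPU estimate on the earlier slide comes out comfortable while storage does not.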