PD2P, Caching etc. Kaushik De Univ. of Texas at Arlington ADC Retreat, Naples Feb 4, 2011.

Introduction
- Caching at T2 using PD2P and Victor works well
  - Six months of experience (more than three months with all clouds)
  - Almost zero complaints from users
  - Few operational headaches
    - Some cases of full disks, datasets disappearing…
    - Most issues addressed with incremental improvements such as space checking, rebrokering, storage cleanup and consolidation
    - What I propose today should solve the remaining issues
  - Many positives
    - No exponential growth in storage use
    - Better use of Tier 2 sites for analysis
- Next step: PD2P for Tier 1
  - This is not a choice but a necessity (see Kors' slides)
  - We should treat part of Tier 1 storage as a dynamic cache

Life Without ESD
- New plan: see the document and Ueda's slides
  - Reduction in the 2011 storage requirement from 27 PB to ~10 PB for 400 Hz (but could be as much as 13 PB)
  - Reduction of 2010 data from 13 PB to ~6 PB
- But we should go further
  - We are still planning to fill almost all T1 disks with pre-placed data
  - MC = = 24 PB = available space
  - Based on past experience, reality will be tougher and disk crises will hit us sooner; we should do things differently this time
  - We must trust the caching model

What can we do?
- Make some room for dynamic caches
  - For the discussion below, do not count the T0 copy
- Use the DQ2 tags custodial/primary/secondary rigorously (a policy sketch follows this list)
  - Custodial = LHC data = tape only (1 copy)
  - Primary = minimal, on disk at T1, so we have room for PD2P caching
    - LHC data primary == RAW (1 copy), AOD, DESD, NTUP (2 copies)
    - MC primary == Evgen, AOD, NTUP (2 copies only)
  - Secondary = copies made by ProdSys (ESD, HITS, RDO), by PD2P (all types except RAW, RDO, HITS) and by DaTri only
- Lifetimes: required strictly for all secondary copies (i.e. consider secondary == cached == temporary)
- Locations: custodial ≠ primary; primary ≠ secondary
- Deletions: any secondary copy can be deleted by Victor
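To make the tagging rules above concrete, here is a minimal sketch in Python of the replica policy on this slide expressed as a lookup table. The names, structure and the fallback behaviour are illustrative assumptions, not the actual DQ2/ProdSys configuration.

```python
# Hypothetical encoding of the replica policy on this slide; names are
# illustrative, not the real DQ2/ProdSys configuration.

# (data kind, data type) -> list of (tag, medium, copies)
REPLICA_POLICY = {
    ("data", "RAW"):   [("custodial", "tape", 1), ("primary", "disk", 1)],
    ("data", "AOD"):   [("primary", "disk", 2)],
    ("data", "DESD"):  [("primary", "disk", 2)],
    ("data", "NTUP"):  [("primary", "disk", 2)],
    ("mc",   "EVGEN"): [("primary", "disk", 2)],
    ("mc",   "AOD"):   [("primary", "disk", 2)],
    ("mc",   "NTUP"):  [("primary", "disk", 2)],
}

def planned_replicas(data_kind, data_type):
    """Primary/custodial replicas; anything else is a secondary (cached) copy
    which must carry a lifetime and can be deleted by Victor."""
    return REPLICA_POLICY.get((data_kind, data_type),
                              [("secondary", "disk", 1)])

if __name__ == "__main__":
    print(planned_replicas("data", "RAW"))  # tape custodial + one primary disk copy
    print(planned_replicas("mc", "ESD"))    # secondary only, temporary by definition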

Reality Check
- Primary copies (according to slide 4)
  - 2010 data ~ 4 PB
  - 2011 data ~ 4.5 PB
  - MC ~ 5 PB
  - Total primary = 14 PB
- Available space for secondaries > ~10 PB at the Tier 1s
  - Can accommodate additional copies, but only if 'hot'
  - Can accommodate some ESDs (expired gracefully after n months)
  - Can accommodate large buffers during reprocessing (new release)
  - Can accommodate better-than-expected LHC running
  - Can accommodate new physics-driven requests

Who Makes Replicas?
- RAW: managed by Santa Claus (no change)
  - 1 copy to TAPE (custodial), 1 copy to DISK (primary) at a different T1
- First-pass processed data: by Santa Claus (no change)
  - Tagged primary/secondary according to slide 4
  - Secondary copies will have a lifetime (n months)
- Reprocessed data: by PanDA
  - Tagged primary/secondary according to slide 4, with lifetime set
  - Additional copies made to a different T1 disk, according to MoU share, automatically based on slide 4 (not by AKTR anymore)
- Additional copies at Tier 1s: only by PD2P and DaTri
  - Must always set a lifetime
- Note: only PD2P makes copies to Tier 2s

Additional Copies by PD2P
- Additional copies at Tier 1s: always tagged secondary
  - Made if the dataset is 'hot' (defined on the next slide)
  - Use MoU share to decide which Tier 1 gets the extra copy
- Copies at Tier 2s: always tagged secondary
  - No change for the first copy: keep the current algorithm (brokerage); use the age requirement if we run into a space shortage (see Graeme's talk)
  - If the dataset is 'hot' (see next slide), make an extra copy
- Reminder: additional replicas are secondary, i.e. temporary by definition, and may/will be removed by Victor

What is 'Hot'?
- 'Hot' decides when to make a secondary replica
- The algorithm is based on additive weights (sketched below)
  - If w1 + w2 + w3 + ... + wN > N (a tunable threshold), make an extra copy
  - w1: based on the number of waiting jobs
    - nwait/2*nrunning, averaged over all sites
    - Currently disabled due to DB issues; needs to be re-enabled
    - Do not base it on the number of re-uses; that did not work well
  - w2: inversely based on age
    - Either Graeme's table, or continuous, normalized to 1 (newest data)
  - w3: inversely based on the number of copies
  - wN: other factors based on experience
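A minimal sketch of the additive-weight idea in Python. The ingredients listed above are kept (waiting-job pressure, age, number of copies, a tunable threshold), but the normalisation constants, the threshold value, the function names and the reading of "nwait/2*nrunning" as nwait/(2*nrunning) are my own assumptions, not the PD2P implementation.

```python
# Illustrative sketch of the 'hot' decision; constants and names are
# assumptions, only the structure (sum of weights vs. a tunable threshold)
# comes from the slide.

def w_waiting(nwait_by_site, nrunning_by_site):
    """w1: waiting-job pressure, nwait/(2*nrunning) averaged over sites."""
    ratios = [nwait / (2.0 * max(nrun, 1))
              for nwait, nrun in zip(nwait_by_site, nrunning_by_site)]
    return sum(ratios) / len(ratios) if ratios else 0.0

def w_age(age_days, max_age_days=365.0):
    """w2: inversely based on age, continuous and normalized to 1 for newest data."""
    return max(0.0, 1.0 - age_days / max_age_days)

def w_copies(ncopies):
    """w3: inversely based on the number of existing complete replicas."""
    return 1.0 / max(ncopies, 1)

def is_hot(dataset, threshold=2.0):
    """Make an extra secondary copy when the summed weights exceed the threshold."""
    score = (w_waiting(dataset["nwait"], dataset["nrunning"])
             + w_age(dataset["age_days"])
             + w_copies(dataset["ncopies"]))
    return score > threshold

# Example: a young dataset with long queues and a single replica scores 'hot'
ds = {"nwait": [80, 40], "nrunning": [20, 20], "age_days": 10, "ncopies": 1}
print(is_hot(ds))  # True with these illustrative numbers
```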

Where to Send 'Hot' Data?
- Tier 1 site selection (sketched below)
  - Based on MoU share
  - Exclude a site if the dataset size > 5% (as proposed by Graeme)
  - Exclude a site if it has too many active subscriptions
  - Other tuning based on experience
- Tier 2 site selection
  - Based on brokerage, as currently done
  - Negative weight based on the number of active subscriptions
  - Other tuning based on experience
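A rough sketch of the Tier-1 selection just described: pick the destination in proportion to MoU share after dropping sites that fail the size or subscription cuts. The share numbers, the cut values, the interpretation of the 5% cut as a fraction of free space, and all names are placeholders, not ADC configuration.

```python
import random

# Placeholder MoU shares (fractions); the real numbers live in ADC configuration.
MOU_SHARE = {"BNL": 0.23, "CERN": 0.10, "FZK": 0.11, "IN2P3": 0.13,
             "RAL": 0.10, "TRIUMF": 0.05, "OTHER": 0.28}

def choose_t1(dataset_size, free_space, active_subs,
              size_cut=0.05, max_subs=50):
    """Pick a Tier-1 for an extra 'hot' copy, weighted by MoU share.

    Exclude a site if the dataset would exceed size_cut of its free space
    (one possible reading of the 5% cut on the slide) or if it already has
    too many active subscriptions."""
    candidates = {site: share for site, share in MOU_SHARE.items()
                  if dataset_size <= size_cut * free_space.get(site, 0)
                  and active_subs.get(site, 0) < max_subs}
    if not candidates:
        return None  # nowhere suitable right now
    sites, weights = zip(*candidates.items())
    return random.choices(sites, weights=weights, k=1)[0]

# Example with made-up sizes in TB: FZK is busy, RAL is too small, BNL wins
print(choose_t1(dataset_size=2,
                free_space={"BNL": 500, "FZK": 300, "RAL": 20},
                active_subs={"BNL": 10, "FZK": 60, "RAL": 5}))
```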

What About Broken Subscriptions?
- Becoming an issue (see Graeme's talk)
  - PD2P already sends datasets within a container to different sites to reduce wait time for users
  - But what about datasets that take more than a few hours?
- Simplest solution
  - ProdSys imposes a maximum limit on dataset size
- Possible alternative
  - A cron/PanDA task breaks up datasets and rebuilds the container (a sketch follows this list)
- Difficult but also possible solution
  - Use _dis datasets in PD2P
  - Search DQ2 for _dis datasets in brokerage (there will be a performance penalty if we use this route)
  - But this is perhaps the most robust solution?
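If the cron/PanDA route were taken, the splitting step could look roughly like the sketch below: slice a large dataset's file list into bounded chunks and re-register them as sub-datasets inside the original container. The two register callbacks are stand-ins, not the real DQ2 client API.

```python
# Rough sketch of breaking a large dataset into sub-datasets; the two
# register_* callbacks are stand-ins for whatever catalogue calls would
# actually be used (they are NOT the real DQ2 client API).

def split_dataset(name, files, max_files=1000):
    """Yield (sub_dataset_name, file_chunk) pairs of bounded size."""
    for i in range(0, len(files), max_files):
        yield "%s_sub%03d" % (name, i // max_files), files[i:i + max_files]

def rebuild_container(container, name, files, register_dataset, add_to_container):
    """Replace one oversized dataset with smaller sub-datasets in its container."""
    for sub_name, chunk in split_dataset(name, files):
        register_dataset(sub_name, chunk)       # catalogue the new sub-dataset
        add_to_container(container, sub_name)   # attach it to the container

# Example with dummy callbacks, just to show the flow
rebuild_container("cont.X/", "data11.big", ["f%d" % i for i in range(2500)],
                  register_dataset=lambda n, fs: print("register", n, len(fs)),
                  add_to_container=lambda c, n: print("add", n, "to", c))
```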

Data Deletions will be Very Important
- Since we are caching everywhere (T1 + T2), Victor plays an equally important role as PD2P
- Asynchronously clean up all caches
  - Triggered by a disk-fullness threshold
  - Algorithm based on (age + popularity) & secondary (sketched below)
- Also automatic deletion of n-2 by AKTR/Victor
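A minimal sketch of the cleanup decision described above: trigger when the space is too full, then rank secondary replicas by age and (lack of) popularity and delete from the top until usage is back under a lower threshold. The scoring weights, watermark values and field names are illustrative assumptions, not Victor's actual algorithm; only the fullness trigger and the restriction to secondary replicas come from the slide.

```python
# Illustrative Victor-style cleanup candidate selection.

def cleanup_candidates(replicas, used, capacity, high_water=0.90, low_water=0.80):
    """Return replicas to delete, oldest/least-popular secondaries first."""
    if used / capacity < high_water:
        return []                      # below the fullness trigger, do nothing
    secondaries = [r for r in replicas if r["tag"] == "secondary"]
    # Higher score = better deletion candidate: old and rarely accessed
    secondaries.sort(key=lambda r: r["age_days"] - 10 * r["accesses_last_month"],
                     reverse=True)
    to_delete, freed = [], 0.0
    for r in secondaries:
        if (used - freed) / capacity <= low_water:
            break
        to_delete.append(r)
        freed += r["size_tb"]
    return to_delete

# Example: only secondary replicas are eligible; the primary one is never touched
replicas = [
    {"name": "d1", "tag": "secondary", "age_days": 200, "accesses_last_month": 0,  "size_tb": 30},
    {"name": "d2", "tag": "secondary", "age_days": 15,  "accesses_last_month": 12, "size_tb": 40},
    {"name": "d3", "tag": "primary",   "age_days": 300, "accesses_last_month": 0,  "size_tb": 50},
]
print([r["name"] for r in cleanup_candidates(replicas, used=460, capacity=500)])
```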

How Soon Can we Implement?
- Before LHC startup!
- Big initial load on ADC operations to clean up 2010 data and to migrate tokens
- Need some testing/tuning of PD2P before the LHC starts
- So we need a decision on this proposal quickly