DDM solutions for disk space resource optimization
Fernando H. Barreiro Megino (CERN-IT Experiment Support), on behalf of ATLAS DDM
WLCG Collaboration Workshop, 8 July 2010
EGI-InSPIRE RI-261323, www.egi.eu

Motivation
Optimize space resources:
– Minimize unused space at sites
– Keep sites full with interesting data
– Improve control and monitoring
Proposed solutions:
– DDM Accounting
– Data ranking
– Automatic replica reduction
– Automatic DDM site exclusion

DDM Accounting

DDM Accounting
Disk space and usage information at sites is collected by different agents and presented in a web frontend:
– Volumes of space, number of datasets and files registered at sites according to DDM (Vincent Garonne), with the possibility to break down volumes by metadata
– Total and used disk space according to SRM

DDM Accounting
Original main purpose: follow up on the consistency between the data volumes known to DDM and those reported by SRM.
A good example of consistency: FZK, running regular checks (Cedric Serfon).
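As an illustration only (the slides do not show the implementation), such a consistency check could boil down to comparing the two numbers per site or spacetoken; the function and the 5% tolerance below are hypothetical:

```python
# Hypothetical sketch of a DDM/SRM consistency check. The two inputs
# stand in for the real DDM catalogue query and SRM space report,
# neither of which is shown in the slides.

def consistency_ratio(ddm_registered_bytes: int, srm_used_bytes: int) -> float:
    """Fraction of SRM-reported used space that DDM knows about.

    A ratio close to 1.0 means the site is consistent; a large gap
    points at dark data (on storage but not in DDM) or lost files.
    """
    if srm_used_bytes == 0:
        return 1.0
    return ddm_registered_bytes / srm_used_bytes

# Example: SRM reports 95 TB used, DDM has 93 TB registered
ratio = consistency_ratio(93 * 10**12, 95 * 10**12)
if not 0.95 <= ratio <= 1.05:  # tolerance is an assumption, not from the talk
    print(f"Site looks inconsistent (ratio {ratio:.2f}): schedule a check")
```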

Federation reports
[Plots: space pledged per federation; space deployed by federations; distribution of resources amongst spacetokens. Annotated sites: IFAE, IFIC-LCG2, UAM; PIC; LIP-COIMBRA, LIP-LISBON, NCG-INGRID-PT]
Some federations don't deploy their pledge at once; other federations deploy more than they have pledged.

Management resource overviews
[Plot by Vincent Garonne]

Physics group reports
Usage is very different between groups.
Group quotas are converging to 25 TiB booked per site.

Data ranking

Monitoring of data ranking evolution
Distinguish Computing Model replicas from additional replicas:
– Custodial: master replica (must be kept)
– Primary: following the Computing Model (CM)
– Secondary: in excess of the CM, may be deleted
– Default: not ranked yet
[Plot annotations: introduction of the custodiality concept; start of meaningful ranking and bootstrapping; bootstrapping complete; space that could be recovered]
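To make the ranking concrete, here is a minimal sketch of the four custodiality states and the deletion rule they imply; the enum values mirror the slide, while the class and helper names are hypothetical:

```python
# Illustrative encoding of the custodiality ranking; only the four
# category names and their meanings come from the slide.
from enum import Enum

class Custodiality(Enum):
    CUSTODIAL = "custodial"   # master replica, must be kept
    PRIMARY = "primary"       # foreseen by the Computing Model
    SECONDARY = "secondary"   # in excess of the CM, may be deleted
    DEFAULT = "default"       # not ranked yet

def is_deletable(rank: Custodiality) -> bool:
    # Only secondary replicas are candidates for automatic deletion.
    return rank is Custodiality.SECONDARY
```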

Data ranking
Globally we seem to have significant unused and recoverable space, but this varies from site to site.

Automatic replica reduction

Motivation
Estimated pledge for T1 DATADISK until June 2011: 13 PB

Reminder on DDM Popularity (Angelos Molfetas, CERN)
Only actions by official tools can be considered.
Try it out: Popularity.cern.ch

Automatic replica reduction in theory (Andrii Tykhonov, Jozef Stefan Institute, Ljubljana, Slovenia)
An agent looks once a day for full MCDISK and DATADISK endpoints
– Full endpoint: free space < min(10%, 25.0 TB)
For each full endpoint it will try to select secondary replicas to delete
– Successfully cleaned endpoint: free space > min(20%, 50.0 TB)
Selection of replicas for deletion is based on DDM popularity:
– Older than 15 days
– From oldest to newest
– Not accessed recently
Datasets are moved into the deletion queue after a grace period of 24 hours.
The cleaning strategy has evolved since previous presentations; a sketch of the selection logic follows below.
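The thresholds in the sketch are taken from the slide; the Endpoint and Replica data shapes, and the popularity flag, are hypothetical stand-ins for the real DDM services:

```python
# Minimal sketch of the daily cleaning pass, assuming hypothetical
# Endpoint/Replica structures instead of the real DDM catalogue.
from dataclasses import dataclass

TB = 10**12  # terabyte in bytes

@dataclass
class Replica:
    dataset: str
    size: int                 # bytes
    age_days: int
    recently_accessed: bool   # from DDM popularity
    secondary: bool           # custodiality rank == secondary

@dataclass
class Endpoint:
    name: str
    capacity: int             # bytes
    free: int                 # bytes
    replicas: list

def is_full(ep: Endpoint) -> bool:
    # Full endpoint: free space < min(10% of capacity, 25 TB)
    return ep.free < min(0.10 * ep.capacity, 25 * TB)

def cleaning_target(ep: Endpoint) -> float:
    # Successfully cleaned endpoint: free space > min(20% of capacity, 50 TB)
    return min(0.20 * ep.capacity, 50 * TB)

def select_for_deletion(ep: Endpoint) -> list:
    """Pick secondary, old, unpopular replicas until the target is met."""
    candidates = [r for r in ep.replicas
                  if r.secondary and r.age_days > 15 and not r.recently_accessed]
    candidates.sort(key=lambda r: r.age_days, reverse=True)  # oldest first
    selected, freed = [], ep.free
    for r in candidates:
        if freed > cleaning_target(ep):
            break
        selected.append(r)
        freed += r.size
    # In the real system these lists are held for a 24 h grace period
    # before being moved into the deletion queue.
    return selected
```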

Automatic replica reduction in practice
Some sites cannot be completely cleaned: the data is too fresh and/or too popular.
Actual deletion started this week; until now our ops team was manually deleting the published lists.

Some notes on automatic replica reduction
We are at an early stage of experience with the system:
– Reduce secondary replicas more aggressively
– Follow closely the un-cleaned DATADISKs and MCDISKs
– Detect exceptions and provide fine tuning per site
Sites that deploy their disk space at once will be the ones to profit fully from this model.
Replica reduction is allied with the Dynamic Data Placement (D2PD) model.

Future work on automatic replica reduction
Last week's decision: expand automatic replica reduction to SCRATCHDISK
– SCRATCHDISK stores analysis output for users
– At the moment: fixed policy, datasets older than 30 days are automatically deleted
– In the future: a more flexible policy (sketched below)
  – Keep between 10% and 20% of the spacetoken free
  – Once the spacetoken has less than 10% free, remove data from oldest to newest until 20% is free
– Data ranking and popularity do not play a role for this spacetoken
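A hedged sketch of that proposed watermark policy, reusing the hypothetical Endpoint/Replica shapes from the earlier sketch; only the 10%/20% thresholds and the oldest-first order come from the slide:

```python
# Proposed SCRATCHDISK watermark policy: start cleaning below 10% free,
# stop once 20% is free, deleting from oldest to newest. Rank and
# popularity are deliberately ignored for this spacetoken.

LOW_WATERMARK = 0.10   # trigger cleaning below 10% free
HIGH_WATERMARK = 0.20  # stop cleaning once 20% is free

def scratchdisk_cleaning(ep) -> list:
    """Select datasets to delete on a SCRATCHDISK endpoint, oldest first."""
    if ep.free >= LOW_WATERMARK * ep.capacity:
        return []  # enough headroom, nothing to do
    victims, freed = [], ep.free
    for r in sorted(ep.replicas, key=lambda r: r.age_days, reverse=True):
        if freed >= HIGH_WATERMARK * ep.capacity:
            break
        victims.append(r)
        freed += r.size
    return victims
```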

Automatic DDM site exclusion

Automatic DDM site exclusion
– Published GOCDB and OIM downtimes are taken into account automatically
– Full spacetokens will be excluded as destination
– Exclusion is applied separately for read and write
– All the information is propagated to the Site Status Board
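One possible reading of the exclusion rule as code; the inputs and the decision that downtimes block both directions while a full spacetoken blocks only writes are assumptions, since the slide states the policy but not its implementation:

```python
# Illustrative exclusion rule combining downtimes and spacetoken
# occupancy. Inputs are hypothetical: in_downtime would come from
# GOCDB/OIM, free_fraction from the space accounting.

def exclusions(in_downtime: bool, free_fraction: float) -> dict:
    """Return read/write exclusion flags for a DDM endpoint."""
    excluded_read = in_downtime                          # downtime blocks everything
    excluded_write = in_downtime or free_fraction <= 0.0  # full token: no new data
    return {"read": excluded_read, "write": excluded_write}

# Example: a full spacetoken outside downtime is still readable
print(exclusions(in_downtime=False, free_fraction=0.0))
# {'read': False, 'write': True}
```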

Conclusions
– Effort from ATLAS Distributed Computing to optimize the space resources provided by sites
– Need to become more aggressive in secondary data deletion
– Reduce the need for manual operations
– Monitoring and deletion tools are in place and will be improved and tuned with experience

Acknowledgements
Simone Campana
Alessandro Di Girolamo
Stephane Jézéquel
I. Ueda

Backup slides

Computing model (slide from Kors Bos)

PanDA Dynamic Data Placement (D2PD) (slide from Kaushik De)