Storage federations, caches & WMS
Rob Gardner, Computation and Enrico Fermi Institutes, University of Chicago
BigPanDA Workshop at CERN, October 21, 2013

2 Evolving computing model & federation
Over the past few years the rigid T0/T1/T2/T3 hierarchy has been changing into a flatter, mesh-like infrastructure
– Made possible by faster, more reliable and affordable networks, and by new virtual peering projects such as LHCONE
This offers an opportunity to create new ways of accessing data from production and analysis jobs
– E.g. remove the restriction that CPU and data be located at the same site
Federation is a tool that allows jobs to flexibly reach datasets beyond the local site storage

3 R&D project to explore using XRootD
Explored for the past two years as an ATLAS computing R&D project – Federated ATLAS XRootD (FAX)
XRootD was chosen as a mature technology widely used in HEP:
– Efficient protocol for file discovery
– Provides a uniform interface to the various backend storage systems used in the WLCG (dCache and DPM are the most common)
– Close collaboration between the XRootD and ROOT teams
– Good results in ALICE and CMS

4 Data federation goals
Create a common ATLAS namespace across all storage sites, accessible from anywhere
Provide easy-to-use, homogeneous access to data
Identified initial use cases:
– Failover from stage-in problems with local storage
  o Now implemented, in production at several sites
– Gain access to more CPUs using WAN direct read access
  o Allow brokering to Tier 2s with partial datasets
  o Opportunistic resources without local ATLAS storage
– Use as a caching mechanism at sites to reduce local data management tasks
  o Eliminate cataloging, consistency checking and deletion services

5 A diversity of storage services
ATLAS uses 80+ WLCG computing sites organized in Tiers
Various storage technologies are used – dCache, DPM, Lustre, GPFS, StoRM, XRootD and EOS
Single source of deployment documentation: https://twiki.cern.ch/twiki/bin/view/AtlasComputing/JoiningTheATLASFederation
To ease support we have experts for the different technologies
To ease communication we have contact persons per national cloud
Name lookup is in transition from LFC to the Rucio namespace – expect a performance improvement

6 A deployed federation
Read-only access to 177 PB
Global namespace
Currently 41 federated sites
Regions covered: US, DE, UK, ES, and CERN
FAX (Federated ATLAS XRootD) is a way to unify direct access to the diversity of storage services used by ATLAS
Main use cases today:
– Tier 3 or opportunistic CPU
– Failover for PanDA queues
Future:
– Intelligent job brokering
– Event service

7 File lookup sequence
Data can be requested from any endpoint or redirector
Data are transferred directly from the server to the user
Searching for the file is fast; delivering it is what matters more
Ideally one should read from the endpoint with the best connection – usually that's the closest one
[Diagram: redirector hierarchy – global redirector (GLRD), regional redirectors (XRD-CENTRAL, ATLAS-XRD-EU, ATLAS-XRD-ASIA) and site endpoints (BNL, NET2, AGLT2, MWT2, OCHEP, SWT2, SLAC)]
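As a rough illustration of what the lookup looks like from the client side, the sketch below opens a file through a redirector with PyROOT. The redirector host and the global logical file name are placeholders, not values taken from the slides.

```python
# Minimal sketch (assumptions: a reachable FAX redirector host and a valid
# global logical file name; both names below are hypothetical placeholders).
import ROOT

REDIRECTOR = "root://some-fax-redirector.example.org:1094"          # hypothetical host
GLFN = "/atlas/rucio/mc12_8TeV:NTUP_SMWZ.01234567._000001.root.1"   # hypothetical gLFN

# TFile::Open contacts the redirector, which locates an endpoint holding the
# file; the actual reads then go directly to that data server.
f = ROOT.TFile.Open(REDIRECTOR + "/" + GLFN)
if f and not f.IsZombie():
    print("opened", f.GetName(), "size", f.GetSize(), "bytes")
    f.Close()
else:
    print("could not open file through the redirector")
```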

8 User at Tier 3 reading from a "nearby" Tier 2
FAX read access of a Tier 2 from a Tier 3: 317 jobs, ~500 MB/s aggregate read (plot: F. Luehring)

9 PanDA job failover using FAX
Currently PanDA only sends jobs to sites holding complete datasets.
If an input file cannot be obtained from local storage after 2 tries, it is read through FAX, provided it exists at another site.
Files from recent reprocessing have on average 2.8 copies in FAX, so there is a large chance of success.
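A hedged sketch of the failover logic described above. This is not the actual PanDA/pilot code; the retry count simply mirrors the slide's description, and the redirector host and copy tool invocation are placeholders.

```python
# Sketch of stage-in with FAX failover, assuming the slide's "2 local tries,
# then fall back to FAX" behaviour. Hosts and paths are hypothetical.
import subprocess

LOCAL_TRIES = 2
FAX_REDIRECTOR = "root://some-fax-redirector.example.org:1094"  # hypothetical

def stage_in(local_turl, glfn, dest):
    """Try the local storage first; on repeated failure, read via FAX."""
    for attempt in range(LOCAL_TRIES):
        if subprocess.call(["xrdcp", "-f", local_turl, dest]) == 0:
            return "local"
    # Local stage-in failed twice: fall back to the federation.
    fax_url = FAX_REDIRECTOR + "/" + glfn
    if subprocess.call(["xrdcp", "-f", fax_url, dest]) == 0:
        return "fax-failover"
    raise RuntimeError("stage-in failed both locally and via FAX")
```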

10 Failover comments
In production in the US and UK for a few months
Very low rates of usage (production storage elements are already very efficient) and no major problems seen
However, what happens if an unstable storage service fails altogether?
– Not yet seen; does a region's Tier 1 or a neighboring Tier 2 become stressed?
– Controlled testing planned

11 FAX usage in job brokering
One can broker jobs to sites that don't have all or part of the input data and use FAX to read it directly.
Beneficial in cases where:
– A site very often has full queues because some specific data exist only there
– A site has free CPUs and good connectivity, but not enough storage/data
– One can use external sources of CPU cycles (OSG, Amazon/Google cloud-based queues, …)
For this approach to be efficient, the system has to "know" the expected available bandwidth between a queue and all of the FAX endpoints, and control loads accordingly.
We are continuously collecting and storing the data needed to do this (the cost matrix).
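To make the idea concrete, here is a minimal sketch of a cost-aware brokering choice. The queue names, bandwidth numbers and threshold are all hypothetical, and the real PanDA brokerage is far more involved than this.

```python
# Minimal sketch of cost-aware brokering: among candidate queues, pick the one
# with the best expected read rate from the FAX endpoints that hold the input
# dataset. All names and numbers below are hypothetical.

# cost_matrix[(analysis_queue, fax_endpoint)] = measured rate in MB/s
cost_matrix = {
    ("ANALY_MWT2", "MWT2"): 95.0,
    ("ANALY_MWT2", "AGLT2"): 60.0,
    ("ANALY_SLAC", "MWT2"): 25.0,
    ("ANALY_SLAC", "AGLT2"): 18.0,
}

MIN_RATE_MB_S = 10.0  # below this, remote reading is not worth brokering

def broker(candidate_queues, endpoints_with_data):
    """Return the queue with the best expected FAX read rate, or None."""
    best_queue, best_rate = None, MIN_RATE_MB_S
    for queue in candidate_queues:
        # Expected rate: the best single endpoint reachable from this queue.
        rate = max(cost_matrix.get((queue, ep), 0.0) for ep in endpoints_with_data)
        if rate > best_rate:
            best_queue, best_rate = queue, rate
    return best_queue

print(broker(["ANALY_MWT2", "ANALY_SLAC"], ["MWT2", "AGLT2"]))  # -> ANALY_MWT2
```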

12 Cost matrix
Measures memory-to-memory transfer rates between 42 ANALY queues and each FAX endpoint
Test jobs are submitted by HammerCloud
Jobs send their results to ActiveMQ; a dedicated machine consumes them and stores them in the SSB alongside other network performance measurements (perfSONAR and FileTransferService)
[Diagram: data flow from HammerCloud, FAX, perfSONAR and FTS through a REST interface into the SSB (FAX cost matrix and Sonar views), feeding PanDA job brokerage]
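Below is a minimal sketch of the kind of aggregation the dedicated consumer might perform before the numbers reach the SSB. The message format, smoothing factor and update rule are assumptions, not the actual implementation.

```python
# Sketch of aggregating per-job transfer measurements into one cost-matrix
# entry per (queue, endpoint) pair, using an exponential moving average.
# The message fields and the smoothing factor are assumptions.
from collections import defaultdict

ALPHA = 0.3  # weight of the newest measurement

# cost_matrix[(queue, endpoint)] -> smoothed rate in MB/s
cost_matrix = defaultdict(float)

def consume(message):
    """message: dict like {'queue': 'ANALY_MWT2', 'endpoint': 'AGLT2',
    'bytes': 1.2e9, 'seconds': 14.5} reported by a HammerCloud test job."""
    rate = message["bytes"] / message["seconds"] / 1e6  # MB/s
    key = (message["queue"], message["endpoint"])
    old = cost_matrix[key]
    cost_matrix[key] = rate if old == 0.0 else ALPHA * rate + (1 - ALPHA) * old

# Example: two measurements for the same (queue, endpoint) pair
consume({"queue": "ANALY_MWT2", "endpoint": "AGLT2", "bytes": 1.2e9, "seconds": 14.5})
consume({"queue": "ANALY_MWT2", "endpoint": "AGLT2", "bytes": 0.9e9, "seconds": 20.0})
print(cost_matrix[("ANALY_MWT2", "AGLT2")])
```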

13 Simulating WAN accesses
HammerCloud-based test running real analysis code with remote access
100 concurrent jobs from each site
Arrow width is proportional to the average event rate
WAN performance can be as good as LAN – at this scale that is not unreasonable; more testing is ongoing
[Diagram: event rates between MWT2, SLAC, AGLT2 and BNL; plot by Friedrich G. Hönig]

14 Relative-to-local performance
[Plot by Friedrich G. Hönig]

15 Caching studies over WAN
Compare caching versus direct-access reads over a WAN with RTT ~3-5 ms

16 Caching studies
[Plot comparing three read modes: direct access, copy to cache, and local access]

17 The other meaning of "cache" (in Rucio)
A cache is a storage service which keeps additional copies of files to reduce response time and bandwidth usage.
In Rucio, a cache is an RSE (Rucio Storage Element) tagged as volatile.
Control of the cache content is usually handled by an external process or application (e.g. PanDA), not by Rucio.
The information about replica locations on volatile RSEs can have a lifetime.
Replicas registered on volatile RSEs are excluded from the Rucio replica management system (replication rules, quotas, replication locks).
There will be much celebration by site admins when this is realized!
From Vincent Garonne
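As a rough illustration of "an external process controls the cache content", here is a minimal sketch of a registry of cached replicas with lifetimes and expiry. It deliberately does not call the real Rucio client API, and all names and the lifetime policy are made up.

```python
# Minimal sketch of an external cache controller for a volatile storage
# element: it registers cached replicas with a lifetime and expires them
# itself, since (per the slide) Rucio's replica management ignores volatile
# RSEs. Illustrative only; does not use the Rucio API.
import time

CACHE_LIFETIME = 24 * 3600  # seconds; an assumed policy, not a Rucio default

class VolatileCache:
    def __init__(self, rse_name):
        self.rse_name = rse_name
        self.replicas = {}  # (scope, name) -> expiry timestamp

    def register(self, scope, name, lifetime=CACHE_LIFETIME):
        """Record that a file has been pulled into the cache."""
        self.replicas[(scope, name)] = time.time() + lifetime

    def expire(self):
        """Drop bookkeeping for replicas whose lifetime has passed."""
        now = time.time()
        expired = [key for key, expiry in self.replicas.items() if expiry < now]
        for key in expired:
            del self.replicas[key]
        return expired

cache = VolatileCache("MWT2_CACHE")  # hypothetical RSE name
cache.register("mc12_8TeV", "NTUP_SMWZ.01234567._000001.root.1")
print(len(cache.replicas), "replica(s) tracked;", len(cache.expire()), "expired now")
```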

18 Conclusions
A broadly deployed, functional XRootD federation covering vast portions of ATLAS data
First use cases are implemented – Tier 3 access, PanDA failover from production queues
Inputs for intelligent re-brokering of jobs are being collected
Caching studies and the development of caching services remain to be done