dCache – protocol developments and plans

Slides:



Advertisements
Similar presentations
Owen SyngeDCache Slide 1 D-Cache Status Report Owen Synge.
Advertisements

ATLAS T1/T2 Name Space Issue with Federated Storage Hironori Ito Brookhaven National Laboratory.
Storage: Futures Flavia Donno CERN/IT WLCG Grid Deployment Board, CERN 8 October 2008.
Storage Issues: the experiments’ perspective Flavia Donno CERN/IT WLCG Grid Deployment Board, CERN 9 September 2008.
Grid Lab About the need of 3 Tier storage 5/22/121CHEP 2012, The need of 3 Tier storage Dmitri Ozerov Patrick Fuhrmann CHEP 2012, NYC, May 22, 2012 Grid.
Overview Managing a DHCP Database Monitoring DHCP
MW Readiness Verification Status Andrea Manzi IT/SDC 21/01/ /01/15 2.
Status & Plan of the Xrootd Federation Wei Yang 13/19/12 US ATLAS Computing Facility Meeting at 2012 OSG AHM, University of Nebraska, Lincoln.
1 User Analysis Workgroup Discussion  Understand and document analysis models  Best in a way that allows to compare them easily.
INFSO-RI Enabling Grids for E-sciencE Enabling Grids for E-sciencE Pre-GDB Storage Classes summary of discussions Flavia Donno Pre-GDB.
WebFTS File Transfer Web Interface for FTS3 Andrea Manzi On behalf of the FTS team Workshop on Cloud Services for File Synchronisation and Sharing.
Grid Technology CERN IT Department CH-1211 Geneva 23 Switzerland t DBCF GT DPM / LFC and FTS news Ricardo Rocha ( on behalf of the IT/GT/DMS.
BNL Service Challenge 3 Status Report Xin Zhao, Zhenping Liu, Wensheng Deng, Razvan Popescu, Dantong Yu and Bruce Gibbard USATLAS Computing Facility Brookhaven.
Storage Interfaces Introduction Wahid Bhimji University of Edinburgh Based on previous discussions with Working Group: (Brian Bockelman, Simone Campana,
Andrea Manzi CERN On behalf of the DPM team HEPiX Fall 2014 Workshop DPM performance tuning hints for HTTP/WebDAV and Xrootd 1 16/10/2014.
EGI-Engage Data Services and Solutions Part 1: Data in the Grid Vincenzo Spinoso EGI.eu/INFN Data Services.
Storage Interfaces and Access pre-GDB Wahid Bhimji University of Edinburgh On behalf of all those who participated.
SESEC Storage Element (In)Security hepsysman, RAL 0-1 July 2009 Jens Jensen.
Enabling Grids for E-sciencE INFSO-RI Enabling Grids for E-sciencE Gavin McCance GDB – 6 June 2007 FTS 2.0 deployment and testing.
Wahid Bhimji (Some slides are stolen from Markus Schulz’s presentation to WLCG MB on 19 June Apologies to those who have seen some of this before)
DCache/XRootD Dmitry Litvintsev (DMS/DMD) FIFE workshop1Dmitry Litvintsev.
Andrea Manzi CERN EGI Conference on Challenges and Solutions for Big Data Processing on cloud 24/09/2014 Storage Management Overview 1 24/09/2014.
EMI is partially funded by the European Commission under Grant Agreement RI Future Proof Storage with DPM Oliver Keeble (on behalf of the CERN IT-GT-DMS.
Implementation of GLUE 2.0 support in the EMI Data Area Elisabetta Ronchieri on behalf of JRA1’s GLUE 2.0 Working Group INFN-CNAF 13 April 2011, EGI User.
WLCG Operations Coordination Andrea Sciabà IT/SDC GDB 11 th September 2013.
Riccardo Zappi INFN-CNAF SRM Breakout session. February 28, 2012 Ingredients 1. Basic ingredients (Fabric & Conn. level) 2. (Grid) Middleware ingredients.
The Grid Information System Maria Alandes Pradillo IT-SDC White Area Lecture, 4th June 2014.
Sitecore upgrades The Past, The Present, The Future.
Nov 05, 2008, PragueSA3 Workshop1 A short presentation from Owen Synge SA3 and dCache.
Security recommendations for dCache
dCache Paul Millar, on behalf of the dCache Team
Jean-Philippe Baud, IT-GD, CERN November 2007
Dynamic Extension of the INFN Tier-1 on external resources
WLCG IPv6 deployment strategy
WLCG Workshop 2017 [Manchester] Operations Session Summary
Scalable sync-and-share service with dCache
Ricardo Rocha ( on behalf of the DPM team )
OGF PGI – EDGI Security Use Case and Requirements
Report from WLCG Workshop 2017: WLCG Network Requirements GDB - CERN 12th of July 2017
StoRM: a SRM solution for disk based storage systems
Vincenzo Spinoso EGI.eu/INFN
Status of the SRM 2.2 MoU extension
gLite->EMI2/UMD2 transition
Storage Interfaces and Access: Introduction
CREAM Status and Plans Massimo Sgaravatto – INFN Padova
WLCG experiments FedCloud through VAC/VCycle in the EGI
SRM v2.2 / v3 meeting report SRM v2.2 meeting Aug. 29
Gfal/lcg-util -> Gfal2/gfal2-util
dCache Scientific Cloud
Introduction to Data Management in EGI
Storage Protocol overview
Taming the protocol zoo
SRM2 Migration Strategy
Marian Babik Andrea Sciabà
T-StoRM: a StoRM testing framework
Quality Control in the dCache team.
GFAL 2.0 Devresse Adrien CERN lcgutil team
EGI UMD Storage Software Repository (Mostly former EMI Software)
The INFN Tier-1 Storage Implementation
DCache things Paul Millar … on behalf of the dCache team.
WLCG and support for IPv6-only CPU
dCache: new and exciting features
SRM over SSL Paul Millar, on behalf of Alex Sim, Jean-Philippe Baud, Ricardo Rocha, Giuseppe LoPresti, Patrick Fuhrmann.
WLCG Collaboration Workshop;
Data Management cluster summary
Status and Future Steps
Monitoring of the infrastructure from the VO perspective
DGAS Today and tomorrow
CHIPP - CSCS F2F meeting CSCS, Lugano January 25th , 2018.
Summary of the dCache workshop
Presentation transcript:

dCache – protocol developments and plans Paul Millar (on behalf of the dCache team) 2015-01-13 pre-GDB: on Data Management & Data Preservation

Why am I here? We, WLCG, are re-evaluating the “protocol zoo” From a dCache point-of-view: dCache provides significant storage for WLCG. dCache provides significant storage capacity for many other communities. The dCache team needs to be efficient – not waste resources.

as a consequence, the dCache mantra Use standards.

What should we do with SRM? There has been considerable investment in SRM, SRM has some unique features: Transfer protocol negotiation, Asynchronous operations, Bulk operations, Transactions / 2-stage-commit upload. (Space reservations, Access-Latency, …) Battle tested, with over 10 years of production use → it works.

SRM unique features If WLCG continues to makes use of SRM unique features then it must either: Continue using SRM for those unique features Invent a new protocol to support unique features (NB. a protocol extension == a new protocol) Inventing a new protocol is bad: takes effort away from all software teams WLCG has a finite effort → implementing a new protocol means all WLCG storage will suffer. Recommendation: if WLCG uses SRM-unique features then stay with SRM for those features We (dCache) are continuing to improve SRM, both server and client.

Non-unique SRM features Storage accounting (WebDAV+RFC-4331) Not equivalent, but if space-reservations tied to namespace, it may be used instead. Along with DPM, we plan to implement support for this. Direct data transfers (not actually an SRM feature) Suggest using WebDAV for WAN and NFS for LAN (like GPFS+StoRM). We plan to drop dcap support once NFS is proven to work in production. 3rd party transfers (FTP, WebDAV+extension, xrootd+extension, NFSv4.2) Solution space is quite complex. Suggest using FTP for now and watch how WebDAV develops. N.B. SRM two-stage commit can prevent dark data. Namespace operations (FTP, WebDAV, xrootd, NFS) Recommend bulk operations use SRM; for a “small” number of operations, any protocol is fine – suggest WebDAV or NFS if available.

WebDAV We have added 3rd-party transfer support Supports current specification. Working with FTS/DPM developers to verify this. Some limitations of the current approach, some development may be needed. Will be adding support for RFC 4331. Provides information like du -ks <directory>. For many VOs, may be a substitute for SRM space accounting.

NFS Deployed at DESY for CMS (WNs and NAF) and other users: “Interactive” NAF has CMS permanently mounted. Grid WN was for a limited period, with some fraction of WNs using NFS. Operations are basically OK, but performance is under investigation. When network is working well, performance is comparable with dcap Statistics show a slight decrease in performance, but a good starting point. NFS protocol also allows the clients (WNs) reading data through the door, which is used as a fall-back if there's a problem. This works, but there is a large impact on performance. We're working on this by: tuning the client to be more resilient to minor, transitory problems, reducing this impact when the client falls back and reads through the door.

xrootd Plugins allow communities to extend basic functionality: CMS, ATLAS and ALICE make use of this Used for monitoring, name-to-name translation (federation), access-control, redirection, … Plugin framework will change in the near future: dCache v2.12 (~1st March) or v2.13 (~1st July) (don't panic) 3rd-party copy: currently no road-map.

Authentication Many people looking at SAML (or OpenID- Contect) to replace X.509 dCache team are investigating this in collaboration with LSDMA, EGI FedCloud and OGF. Medium term solutions will likely involve gateway services: Infrastructure continues to use X.509; online CAs generate X.509 certificates. Does not exclude any of the listed protocols.

Release versions: x.y.z Bug-fix releases Fix bugs and (very occasionally) back-port important features A “just upgrade the RPM” release Major releases New features, changes to configuration Try to minimise impact, but sites likely have to do something Golden releases: Popular with sites Signals that deprecated features will be removed in the next major release. Support period overlaps with previous golden release for ~1 year

Fix bugs and (very occasionally) back-port important features 2.13: Next Golden Release Release versions: x.y.z Bug-fix releases Fix bugs and (very occasionally) back-port important features A “just upgrade the RPM” release Major releases New features, changes to configuration Try to minimise impact, but sites likely have to do something Golden releases: Popular with sites Signals that deprecated features will be removed in the next major release. Support period overlaps with previous golden release for ~1 year

Fix bugs and (very occasionally) back-port important features 2.13: Next Golden Release Release versions: x.y.z Bug-fix releases Fix bugs and (very occasionally) back-port important features A “just upgrade the RPM” release Major releases New features, changes to configuration Try to minimise impact, but sites likely have to do something Golden releases: Popular with sites Signals that deprecated features will be removed in the next major release. Support period overlaps with previous golden release for ~1 year

RUN 2 Release versions: x.y.z Bug-fix releases 2.13: Next Golden Release Release versions: x.y.z Bug-fix releases Fix bugs and (very occasionally) back-port important features A “just upgrade the RPM” release Major releases New features, changes to configuration Try to minimise impact, but sites likely have to do something Golden releases: Popular with sites Signals that deprecated features will be removed in the next major release. Support period overlaps with previous golden release for ~1 year RUN 2

RUN 2 Release versions: x.y.z Bug-fix releases 2.13: Next Golden Release Release versions: x.y.z Bug-fix releases Fix bugs and (very occasionally) back-port important features A “just upgrade the RPM” release Major releases New features, changes to configuration Try to minimise impact, but sites likely have to do something Golden releases: Popular with sites Signals that deprecated features will be removed in the next major release. Support period overlaps with previous golden release for ~1 year RUN 2

One more thing...

Prometheus The goal: try to discover bugs before software hits production services The problem: We can't test all (ATLAS, CMS, LHCb, Alice, …) use-cases. The offer: prometheus.desy.de A test instance, rebuilt daily (data is lost overnight), Always the tip of development branch (currently 2.12.0-SNAPSHOT), Anyone from atlas, cms, lhcb, alice and dteam VOs can use this service right now. Verify their software-stack works with the next major-version dCache. Bug-fixes (which will be rolled out on production services) are available first in prometheus. If you're interested, start testing – contact me if there's any problems.

Thanks for listening, any questions?

Backup slides

DCAP vs NFS performance