HTTP solutions for data access, transfer, federation
Fabrizio Furano (presenter), on behalf of the IT-GT/DMS crew
CERN IT Department, CH-1211 Geneva 23, Switzerland

Outline
Intro
– Features of HTTP+DAV
– Our challenge
DPM
Federations
FTS
Davix

Intro
HTTP: open and extensible, usable in many application domains
– A standard in the way things can be embedded
– Supports features that became standards, like WebDAV or Metalink
Data access, data transfer and federation CAN be done
– We all agree. At the same time, it is difficult to find elsewhere a complete, HEP production-level storage/data access system that does what seems obvious to us.
There are quality components that support the low-level features (e.g. the protocol implementation)
– The Apache framework is a good example
Our challenge: build, on top of them, quality components that fulfil the HEP requirements
– Clustering of disk backends into storage pools + clustering of pools into sites (DPM)
  Keep the historical technologies available for as long as they are used (SRM, DPNS)
  Provide the needed data access protocols (GridFTP, HTTP, Xrootd, ...)
  HTTP/DAV support for XRootD is also being worked on
– Clustering of sites into federations
  Static, explicit indexing of the content of sites (LFC)
  Dynamic, high-performance discovery (Dynamic Federations)
– Sophisticated, scheduled transfers (FTS)
  This comes from EMI; the goal and the effort are shared among DPM, dCache and StoRM
  NEWS: the StoRM people now have a prototype for HTTP/DAV
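
Because the protocol is standard, a generic client already speaks to these frontends. Below is a minimal sketch (not project code) that lists a directory on a DAV-enabled storage endpoint with a plain PROPFIND; the endpoint URL and credential paths are hypothetical placeholders.

import xml.etree.ElementTree as ET
import requests

url = "https://dpm.example.org/dpm/example.org/home/myvo/"  # hypothetical

body = """<?xml version="1.0" encoding="utf-8"?>
<D:propfind xmlns:D="DAV:">
  <D:prop><D:getcontentlength/><D:getlastmodified/></D:prop>
</D:propfind>"""

# Depth: 1 asks for the collection itself plus its immediate children.
resp = requests.request("PROPFIND", url, data=body,
                        headers={"Depth": "1", "Content-Type": "application/xml"},
                        cert=("/path/to/usercert.pem", "/path/to/userkey.pem"),
                        verify="/etc/grid-security/certificates")
resp.raise_for_status()

# A 207 Multi-Status response carries one <D:response> element per entry.
for r in ET.fromstring(resp.content).findall("{DAV:}response"):
    href = r.find("{DAV:}href").text
    size = r.find(".//{DAV:}getcontentlength")
    print(href, size.text if size is not None else "-")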

SRM support
The idea is to keep it for as long as it is used
Missing areas of SRM that could be implemented over HTTP:
Get/set checksums
– We can already get checksums
– We could extend the DAV implementation to write them
Writing to a spacetoken is possible
– ?spacetoken=reservation
Querying space usage would need some implementation via the namespace
– RFC 4331, "Quota and Size Properties for DAV Collections"
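
For illustration, a hedged sketch of the two HTTP-side equivalents above: the ?spacetoken= query parameter is the one named on the slide, while the checksum retrieval uses the RFC 3230 Want-Digest header that several grid storage implementations honor. URL, token name and credential paths are hypothetical.

import requests

cert = ("/path/to/usercert.pem", "/path/to/userkey.pem")  # hypothetical
base = "https://dpm.example.org/dpm/example.org/home/myvo"

# Get a checksum: ask for an adler32 digest on a HEAD request (RFC 3230).
h = requests.head(f"{base}/data/file1", cert=cert,
                  headers={"Want-Digest": "adler32"})
print(h.headers.get("Digest"))   # e.g. "adler32=0f530a52", if supported

# Write into a space reservation: append the spacetoken query parameter
# from the slide to the upload URL.
with open("local.dat", "rb") as f:
    r = requests.put(f"{base}/data/file2?spacetoken=MYVO_DISK",
                     data=f, cert=cert)
r.raise_for_status()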

DPM
Rationale:
– Evolve smoothly from the historical design
– Make integration with other frontends/backends easier
Goals:
– Support the big DPM community
– Accommodate many types of data access frontends
– Evolve the frontends to HTTP/DAV with new, high-quality components
– Accommodate many types of backends (legacy pools, S3, HDFS, etc.)

DPM frontends: HTTP / DAV
Frontend based on Apache2 + mod_dav, using DMLite
Supports all the historical features, mapped onto DAV
– We implemented the old Cns API on top of DAV; no need to change the applications
Can be used both for get/put-style access (like GridFTP) and for direct access
– Some extras for full GridFTP equivalence:
  Multiple streams with Range/Content-Range
  Third-party copies using WebDAV COPY + GridSite delegation
– Random I/O:
  Vector reads and other optimizations are possible
Metalink support (failover, retrial), which comes naturally since the frontend is already DMLite based
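
As a sketch of the Range/Content-Range point, here is what a multi-stream download looks like from the client side, assuming a hypothetical URL on an endpoint that supports byte ranges (authentication omitted for brevity).

from concurrent.futures import ThreadPoolExecutor
import requests

url = "https://dpm.example.org/dpm/example.org/home/myvo/big.dat"  # hypothetical
nstreams = 4

# Most servers report the size on HEAD; each worker then fetches one slice.
size = int(requests.head(url).headers["Content-Length"])
chunk = (size + nstreams - 1) // nstreams

def fetch(i):
    start, end = i * chunk, min((i + 1) * chunk, size) - 1
    r = requests.get(url, headers={"Range": f"bytes={start}-{end}"})
    assert r.status_code == 206          # 206 Partial Content
    return start, r.content

with open("big.dat", "wb") as out, ThreadPoolExecutor(nstreams) as ex:
    for start, data in ex.map(fetch, range(nstreams)):
        out.seek(start)
        out.write(data)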

DPM performance: getting there
Together with our partners (in particular ASGC) we ran many rounds of performance tests
The conclusion is that HTTP data access is becoming competitive for the main use cases (transfer, analysis, LAN, WAN)
Many aspects matter:
– Tuning (Apache + backends, in our case)
– Technology
  TTreeCache in ROOT is very important for some use cases
No space here for tables... please ask if you are interested
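
Since the slide singles out TTreeCache, here is a hedged PyROOT sketch of what enabling it over HTTP looks like; the file URL and tree name are hypothetical, and it assumes a ROOT build with HTTP (TWebFile) support.

import ROOT

# Hypothetical remote file and tree; reads go over HTTP via TWebFile.
f = ROOT.TFile.Open("http://dpm.example.org/myvo/analysis/events.root")
tree = f.Get("Events")

tree.SetCacheSize(30 * 1024 * 1024)   # 30 MB TTreeCache
tree.AddBranchToCache("*", True)      # cache all branches

for i in range(tree.GetEntries()):
    tree.GetEntry(i)                  # many small reads become few large ones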

Clusterize & federate
One element of the HTTP ecosystem is the potential for this kind of loose WAN clustering, which we often call "federations"
HTTP is a good data access protocol; any browser supports the three key points:
– Native support for redirections
– Usable (= good performance and robustness) over WAN
– Support for direct data access and metadata access
HTTP can also be used as a clustering control protocol
– We can federate HTTP and DAV servers on the fly
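
To make the redirection point concrete, a small sketch with a generic client: a federator can answer a GET with a 302 plus a Location header pointing at the server that really holds the data, and standard clients follow it transparently. The federation URL is a hypothetical placeholder.

import requests

url = "https://federation.example.org/fed/myvo/data/file1"  # hypothetical

# Look at the redirection explicitly...
r = requests.get(url, allow_redirects=False)
if r.status_code in (302, 307):
    print("redirected to:", r.headers["Location"])

# ...or just let the client chase the redirect chain, as a browser would.
resp = requests.get(url)
print(len(resp.content), "bytes, finally served by", resp.url)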

HTTP support in GFAL2 + FTS3
We've demonstrated that HTTP WAN transfers have the necessary performance
Enabled third-party transfer on DPM (and it's coming in dCache)
– Currently works in "push mode"; any sink is supported
We developed a complete client API using HTTP
We've integrated HTTP support into GFAL2
– gfal2 now offers a way to integrate HTTP resources easily into grid data management in general
FTS3's HTTP support is based on gfal2's HTTP support
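
A hedged sketch of driving that gfal2 layer from the gfal2 Python bindings (gfal2-python), assuming they are installed; the endpoint URLs are hypothetical placeholders.

import gfal2

ctx = gfal2.creat_context()          # sic: "creat", as in the C API

src = "davs://dpm.example.org/dpm/example.org/home/myvo/file1"
dst = "davs://dcache.example.org/pnfs/example.org/data/myvo/file1"

info = ctx.stat(src)                 # metadata through the HTTP/DAV plugin
print("source size:", info.st_size)

params = ctx.transfer_parameters()
params.timeout = 300
ctx.filecopy(params, src, dst)       # third-party copy where both ends allow it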

LFC and the Dynamic Feds
Significant effort in the federation domain. The same DMLite + WebDAV stack supports:
– LFC: can be thought of as an HTTP redirector that
  allows us to browse the full namespace
  maps logical names onto physical names
  accepts file replica registration requests
– Dynamic Federations: namespace parts can also be reconstructed quickly, on the fly
  No need to register replicas: they are where they can be found (and their locations are cached)
  High resilience to site/cluster up/downtimes
  High resilience to network glitches
  Great flexibility; same access/browsing interface as the LFC
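
The intro slide mentioned Metalink; a federation answer can be consumed as a Metalink (RFC 5854) document listing the known replicas of a logical file. Below is a hedged sketch that parses such a document and fetches with failover; how the document is requested (Accept header or query parameter) varies by deployment, so that step is left out.

import xml.etree.ElementTree as ET
import requests

ML = "{urn:ietf:params:xml:ns:metalink}"   # Metalink 4 namespace (RFC 5854)

def replica_urls(metalink_xml):
    """Extract the replica URL list from a Metalink 4 document."""
    root = ET.fromstring(metalink_xml)
    return [u.text for u in root.findall(f"{ML}file/{ML}url")]

def fetch_with_failover(metalink_xml):
    for url in replica_urls(metalink_xml):
        try:
            r = requests.get(url, timeout=30)
            r.raise_for_status()
            return r.content               # first healthy replica wins
        except requests.RequestException:
            continue                       # replica down: try the next one
    raise IOError("all replicas failed")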

FTS support for HTTP third-party copy
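
A hedged sketch of the push-mode copy itself: the client sends a WebDAV COPY to the source, naming the sink in the Destination header, and the source pushes the bytes. URLs and credential paths are hypothetical; the GridSite credential delegation step is not shown.

import requests

src = "https://dpm.example.org/dpm/example.org/home/myvo/file1"   # hypothetical
dst = "https://dest.example.org/data/myvo/file1"                  # hypothetical
cert = ("/path/to/usercert.pem", "/path/to/userkey.pem")

r = requests.request("COPY", src, cert=cert,
                     headers={"Destination": dst})
print(r.status_code, r.reason)   # success is typically 201/202; some servers
                                 # stream progress lines in the response body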

The client
For clusterization and HEP access to work, we need some basic bricks:
– Redirection support in all the primitives of HTTP and DAV
– Flexible support for the various security protocols (e.g. GSI proxies, Kerberos, passwords, X.509)
– Mapping the "posix-like" operations (open, stat, close, ...) onto DAV requests/responses
Data access, data transfer and federation CAN be done with standard clients
– We all agree. At the same time, it is difficult to find elsewhere a complete, HEP production-level storage/data client that does what seems obvious to us.
– The features that users see are also proportional to the completeness and quality of the clients they use.
– Forcing app developers to implement features over XML parsing is ugly; that is the option without a complete client.
– We can use a standard client to implement what we need on top of it (see the sketch below).
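
As announced above, a minimal sketch (not Davix itself) of one such brick: mapping a posix-like stat() onto a PROPFIND with Depth: 0. The URL is a hypothetical placeholder and authentication is omitted.

import xml.etree.ElementTree as ET
import requests

def dav_stat(url):
    """Return (size, last_modified) for a remote file, stat()-style."""
    body = ('<?xml version="1.0" encoding="utf-8"?>'
            '<D:propfind xmlns:D="DAV:"><D:prop>'
            '<D:getcontentlength/><D:getlastmodified/>'
            '</D:prop></D:propfind>')
    r = requests.request("PROPFIND", url, data=body,
                         headers={"Depth": "0",
                                  "Content-Type": "application/xml"})
    r.raise_for_status()                   # expect 207 Multi-Status
    root = ET.fromstring(r.content)
    size = root.find(".//{DAV:}getcontentlength").text
    mtime = root.find(".//{DAV:}getlastmodified").text
    return int(size), mtime

print(dav_stat("https://dpm.example.org/dpm/example.org/home/myvo/file1"))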

Full-featured client wanted
Support for all these things had to be encapsulated in a good client
– Maybe this had to be a contribution from HEP to the HTTP world
– Feature support in the various clients is sparse
– We can use the standard clients, like libneon, as low-level clients and add the features
HTTP/WebDAV is a protocol; data access is an application
– Users want quality tools, not a specification
– Current HTTP tools are designed around generic HTTP requests
– Our use case is high-performance file access and file management
Our Davix client could be integrated in ROOT, modulo the TWebFile/TWebSystem wrapper
– This part is under discussion and investigation

Thank you... questions?

Contacts & info
LCGDM Wiki:
– Contains links to tech documents, papers, presentations, information, etc.
Some papers:
– DPM:
– Dyn Feds:
Some people to talk to:
– DPM, DMLite: Ricardo Rocha
– Dynamic Federations: Fabrizio Furano
– DAV, DMLite: Alejandro Alvarez
– Davix, GFAL: Adrien Devresse
– FTS3: Michail Salichos