Dynamic Federation of Grid and Cloud Storage
Fabrizio Furano, Oliver Keeble, Laurence Field
Speaker: Fabrizio Furano



Dynamic Storage Federations

Dynamic Federations: Dynafed

- Project started in 2011 in EMI as an exploration of storage federations with open protocols, in collaboration with the dCache team.
- The core is now a stable, protocol-agnostic component.
- Relies only on standard services at sites/endpoints.
- Our interest is in scalable performance, HTTP, WebDAV, cloud storage (S3, Microsoft Azure) and friendly tools.
- Various projects are using or evaluating it, inside and outside HEP.
- Its features with cloud storage are particularly interesting.
- Interplays well with FTS as a file movement workhorse.

What’s Dynafed

- Dynafed is a browser-friendly, realtime, scalable aggregator of HTTP/WebDAV/S3/MS-Azure metadata sources.
- Aggregates, caches and presents metadata; redirects clients to resources for reading or writing.
- Geography-aware redirections.
- Realtime detection of site availability, with no need to install anything special at the sites.
- Presentation is usually through WebDAV and HTML.
- Low-latency realtime behavior; can be used in LAN, WAN, or both.
- With S3 and Azure it keeps keys secret, natively exploiting the S3/Azure delegation scheme.
- Supports folders on S3 with no overhead.
- Supports sophisticated filename translations (e.g. the Rucio plugin).
- Applies uniform Apache-based authentication.
- Applies uniform authorization rules: Apache modules, libgridsite or its own flexible plugin-based rule engine.
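The geography-aware redirection and availability detection listed above can be illustrated with a minimal sketch. It assumes each replica carries a latitude/longitude and an availability flag, and that the client position is already known (for example from a GeoIP lookup, not shown). The names `Replica`, `haversine_km` and `pick_replica` are hypothetical and are not part of Dynafed; its actual selection logic may differ.

```python
import math
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical record for one copy of a file at one endpoint."""
    url: str
    lat: float
    lon: float
    online: bool = True


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points on Earth."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def pick_replica(replicas, client_lat, client_lon):
    """Return the closest replica that is currently online, or None."""
    candidates = [r for r in replicas if r.online]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda r: haversine_km(client_lat, client_lon, r.lat, r.lon),
    )
```

A redirector built this way would answer a client request with an HTTP redirect to `pick_replica(...).url`; marking a replica as offline makes the next closest copy win without any change on the client side.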

01 Oct 2015, DESY

- Prototype: 14/15 LHCb sites, 60 ATLAS sites.
- Geography-based, client-aware redirections.
- Flexible authentication/authorization, friendly with identity federations.
- Realtime detection of sites’ availability.
- Makes S3/Azure storage easy to use and mix.
- Scales it up and applies uniform security.
- On-the-fly friendly visualization.
- Full WebDAV access.
- Redirection-based.
- Robust against failures.
- Fully scalable.

[Figure: Site A (HTTP/S3/Azure) and Site B (HTTP/S3/Azure) hold .../dir1/file1, .../dir1/file2 and .../dir1/file3, with 2 replicas; the federation presents a single /dir1 containing /dir1/file1, /dir1/file2 and /dir1/file3.]
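The aggregation pictured on this slide, where the listings of two sites appear as one /dir1, can be sketched as a union of per-site listings that remembers which sites hold each path. This is only an illustration under stated assumptions: `federate_listings`, the site names and the convention that `None` means "site did not answer" are hypothetical, and Dynafed's real metadata cache is more involved.

```python
from collections import defaultdict


def federate_listings(site_listings):
    """Merge per-site directory listings into one federated view.

    site_listings maps a site name to an iterable of paths, or to None
    when the site is unreachable (assumption made for this sketch).
    Returns a dict mapping each path to the sorted list of sites holding it.
    """
    replicas = defaultdict(list)
    for site, paths in site_listings.items():
        if paths is None:
            # An unreachable site simply contributes nothing to the view.
            continue
        for path in paths:
            replicas[path].append(site)
    return {path: sorted(sites) for path, sites in sorted(replicas.items())}
```

With one site listing /dir1/file1 and /dir1/file2 and another listing /dir1/file2 and /dir1/file3, the merged view contains all three files, and /dir1/file2 is reported with two replicas. If one site stops answering, the files it shares with the other site stay visible.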

Easy access

- Main Dynafed testbed with dozens of Grid endpoints and several demos.
- The file being accessed is hosted in an XrdHTTP instance… somewhere.
- Data discovery is dynamic; no static indexing is involved.
- The HTTP ecosystem can give unprecedented flexibility to Grid data access, fully supporting the Grid workflows.

Browser view

Metalink

Pluggable authorization

- Embedded plugin that applies rules, e.g.:
  - For path /a/b/c, group1 can read/list.
  - For path /a/b/c, group2 and user3 can read/write/list/delete.
- Pluggable interface: can load plugins implementing authZ.
- Native Python plugin: uses the internal Python C API to execute a function. No process spawn, very fast!
- Authorization rules can be written as a Python function executed natively.
- The function is passed all the authentication info.
- Total flexibility: any rule can be written as a Python function.
- Caveat: it must be fast.
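As a rough illustration of an authorization rule written as a Python function, the sketch below encodes the two example rules for /a/b/c. The function name `is_allowed`, its arguments and the `RULES` table are hypothetical: the slides do not show the real plugin signature, so this should not be read as Dynafed's API. Denying by default is also an assumption of the sketch.

```python
# Each rule: (path prefix, allowed groups, allowed users, permitted operations).
# The table mirrors the two example rules on the slide.
RULES = [
    ("/a/b/c", {"group1"}, set(), {"read", "list"}),
    ("/a/b/c", {"group2"}, {"user3"}, {"read", "write", "list", "delete"}),
]


def is_allowed(user, groups, path, operation):
    """Return True if any rule grants `operation` on `path` to this identity."""
    for prefix, rule_groups, rule_users, operations in RULES:
        # Match the prefix itself or anything below it, but not "/a/b/cd".
        under_prefix = path == prefix or path.startswith(prefix.rstrip("/") + "/")
        if not under_prefix or operation not in operations:
            continue
        if user in rule_users or rule_groups & set(groups):
            return True
    return False  # assumption: deny when no rule matches
```

The function only does set lookups and string comparisons, in line with the caveat that such a rule runs on every request and must be fast.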

Seamless Cloud support

- S3 and Microsoft Azure provide different REST interfaces that are a sort of HTTP dialect.
- S3 in particular is scalability-oriented on the server side, which in a way makes non-scalable usage difficult.
- Simple and very fast access delegation mechanism.
- S3 supports hierarchical content (directories!) in buckets in a way that a vanilla client can’t easily exploit:
  - No concept of directory, just path prefix.
  - It defines a tree in the opposite direction with respect to a regular file system.
  - That’s the same choice as Dynafed!
- We wrote a simple Dynafed C++ plugin that exploits all of these in a friendly way and matches them on the fly.
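The "no directories, just path prefixes" point can be made concrete with a small sketch that derives one directory level from a flat list of object keys. S3 itself offers this grouping server-side through the `prefix` and `delimiter` parameters of its list-objects call; the function below only mimics the idea locally. `list_directory` and the sample keys are hypothetical and are not taken from the Dynafed plugin, which is written in C++.

```python
def list_directory(keys, directory):
    """Emulate a one-level directory listing over flat object keys.

    Returns (subdirectories, files) found directly under `directory`,
    both sorted. Keys use "/" as the separator and have no leading slash.
    """
    prefix = directory.strip("/")
    prefix = prefix + "/" if prefix else ""
    subdirs, files = set(), set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # a placeholder key standing for the directory itself
        head, separator, _ = rest.partition("/")
        if separator:
            subdirs.add(head + "/")  # deeper keys collapse into one entry
        else:
            files.add(head)
    return sorted(subdirs), sorted(files)
```

Because the hierarchy is inferred from the keys, a "directory" exists exactly as long as some key starts with its prefix, which is why it costs nothing extra to support folders on S3.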

Dynamic Cloud support

- Dynafed can federate any number of remote S3 buckets or Azure shares together with other non-cloud storage.
- This federation works as a single read/write WebDAV storage: totally seamless, extremely fast and scalable.
- It avoids having to distribute cloud keys to the clients; it works with short-term delegations.
- Users/jobs do not need to bother with S3 or Azure mechanics: they just use a clean URL and their credentials.
- Tested with MS Azure, Amazon S3 and Ceph S3 implementations.
- The federation can apply uniform, flexible authorization/authentication:
  - Can be X509, login/password, in principle any mechanism that works as an Apache module, plus authZ rules/functions.
- We used this mechanism in …
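The idea behind short-term delegation, where the federator keeps the secret key and hands the client a URL that is valid only for a limited time, can be sketched with a generic HMAC-signed URL. This is a deliberately simplified illustration and is neither the AWS signature algorithm nor the Azure shared access signature format; `sign_url`, `verify_url`, the query parameter names and the example host are all assumptions made for this sketch.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit


def _signature(base_url, secret_key, expires_at):
    """HMAC-SHA256 over the method, URL and expiry (illustrative layout)."""
    message = f"GET\n{base_url}\n{expires_at}".encode()
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def sign_url(base_url, secret_key, expires_at):
    """Federator side: attach an expiry time and a signature to the URL."""
    query = urlencode({
        "expires": expires_at,
        "signature": _signature(base_url, secret_key, expires_at),
    })
    return f"{base_url}?{query}"


def verify_url(signed_url, secret_key, now):
    """Storage side: accept only unexpired URLs with a valid signature."""
    parts = urlsplit(signed_url)
    query = parse_qs(parts.query)
    try:
        expires_at = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False
    base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
    expected = _signature(base_url, secret_key, expires_at)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(signature.encode(), expected.encode())
```

The client never sees `secret_key`: it receives only the signed URL, which stops working after `expires_at` and cannot be altered to point at another object without invalidating the signature.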

For more info

- Dynafed homepage:
- Full Dynafed documentation: doc/whitepaper/Doc_DynaFeds.pdf
- Demo testbed:
- Web FTS homepage:
- DAVIX (powerful HTTP/WebDAV/S3/Azure client):