IT-SDC : Support for Distributed Computing Dynamic Federation of Grid and Cloud Storage Fabrizio Furano, Oliver Keeble, Laurence Field Speaker: Fabrizio.

Presentation transcript:

IT-SDC : Support for Distributed Computing Dynamic Federation of Grid and Cloud Storage Fabrizio Furano, Oliver Keeble, Laurence Field Speaker: Fabrizio Furano

01 Oct 2015 IT-SDC Dynamic Federation of Grid and Cloud Storage
Grid+Cloud storage
- We report on work on the seamless integration of Cloud storage resources into HTTP-enabled workflows
  - I.e. how to use Cloud resources together with existing Grid and HEP distributed storage
- Limit the effort needed (possibly next to zero)
- Agility in adding/removing storage
- Make usage/management seamless
- Promote scalability, performance and software quality
- Preserve the sites’ administrative autonomy
- Allow “opportunism” in resource management
- Very simple technical requirements, easy to share with other communities

Why HTTP/DAV?
- Interesting technical features
  - Multitalented, covers most existing use cases while allowing new ones
  - Applications just access the data, wherever they are (very different from distributed FSs)
  - Supports direct access over WAN
  - Performance can be very high for applications that use it efficiently
- It’s there, whatever platform we consider
  - HTTP is moving much more data than HEP worldwide, although in different ways
  - We like browsers, they give a feeling of simplicity
- Goes towards convergence
  - One technology can accommodate multiple use cases, including interactive ones
  - Users can use their preferred devices and apps to access their data
  - Sophisticated custom applications are allowed
  - Can more easily be connected to commercial systems and apps
- Attractive for professionals to be trained in these systems
- Greater chances of being understood when you mention it

The Grid DM problem
- The reality of a production-grade distributed model is challenging
  - Just locating a resource in a huge index can be a challenge for scalability and speed
  - With tens or hundreds of sites, it is normal that they come and go
  - With thousands of disks, it is normal that quite a few break
  - Cloud storage is not immune, as it can be added or removed without notice, or have downtimes
- How to track “Where is file X now?”
  - This is different from “Where is file X supposed to be?”
- How to reduce the data management cost of finding it?

Where is file X … check now
- An idea pioneered since the 2000s with the Xrootd framework
- Where is file X? Spread the load by asking the working endpoints
  - Costs just a network round trip
  - Even over WAN, most of the time it’s quicker than a loaded DBMS
  - By construction the answer is correct at that moment, and it naturally models data losses
- Locations can be cached for some time
  - If a file is accessed now, it will likely be accessed again shortly (temporal locality principle)
- We can then have a frontend system that is able to locate files by asking the endpoints “Do you have file X at this moment?”
- If properly done, this mechanism can be extended to file listings, VERY useful to administrators
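The "ask the endpoints now, then cache the answer for a while" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Dynafed code: the endpoint names, the `has_file` stand-in for a network round trip, and the `Locator` class are all hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical endpoints: each name maps to the set of paths it holds right now.
ENDPOINTS = {
    "site-a": {"/dir1/file1", "/dir1/file2"},
    "site-b": {"/dir1/file2", "/dir1/file3"},
    "site-c": set(),
}


def has_file(endpoint, path):
    """Stand-in for one network round trip (e.g. an HTTP HEAD request)."""
    return path in ENDPOINTS[endpoint]


class Locator:
    """Asks all endpoints in parallel and caches the answer for ttl_seconds."""

    def __init__(self, endpoints, ttl_seconds=60.0, clock=time.monotonic):
        self.endpoints = list(endpoints)
        self.ttl = ttl_seconds
        self.clock = clock
        self.cache = {}  # path -> (expiry time, list of endpoints holding it)

    def locate(self, path):
        hit = self.cache.get(path)
        if hit and hit[0] > self.clock():
            return hit[1]  # still fresh: no endpoint is contacted
        # One query per endpoint, all in flight at once, so the total latency
        # is roughly that of the slowest endpoint rather than the sum.
        with ThreadPoolExecutor(max_workers=len(self.endpoints)) as pool:
            answers = list(pool.map(lambda e: has_file(e, path), self.endpoints))
        found = [e for e, ok in zip(self.endpoints, answers) if ok]
        self.cache[path] = (self.clock() + self.ttl, found)
        return found


locator = Locator(ENDPOINTS)
print(locator.locate("/dir1/file2"))
```

Because the answer comes from the endpoints themselves, a file that disappears from a site simply stops being reported once the cached entry expires.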

IT-SDC : Support for Distributed Computing Dynamic Storage Federations

Dynamic Federations: Dynafed
- Project started in 2011 in EMI as an exploration of storage federations with open protocols, in collaboration with the dCache team
- The core is now a stable, protocol-agnostic component
- Relies only on standard services in sites/endpoints
- Our interest is in scalable performance, HTTP, WebDAV, S3 and friendly tools
- Various projects are using or evaluating it, in HEP and outside HEP
- Its features with S3-based cloud storage are particularly interesting
- Interplays well with FTS as a file movement workhorse

What’s Dynafed
- Dynafed is a browser-friendly, realtime, scalable aggregator of HTTP/WebDAV/S3 metadata sources
- Aggregates/caches/presents metadata and redirects clients to resources for reading or writing, with geography-aware redirections
- Realtime detection of site up-ness, with no need to install anything special at the sites
- Presentation is usually through WebDAV and HTML
- Low-latency realtime behavior, can be used in LAN, WAN, or both
- With S3 it keeps keys secret, natively exploiting the S3 delegation scheme
- Supports folders on S3 with no overhead
- Supports Rucio file semantics (plugin)
- Can talk to external DBs or services, which can simply see it as a large WebDAV site
- Applies uniform Apache-based authentication
- Applies uniform authorization rules: Apache modules, libgridsite or its own rules
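As an illustration of what a geography-aware redirection could look like, the sketch below picks the closest replica that is currently up, using the great-circle distance. It assumes that the client position and the replica coordinates are already known; the slides do not say how they are obtained. All URLs and coordinates are hypothetical.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pick_replica(client_pos, replicas):
    """Return the URL of the nearest replica that is up, or None if none is."""
    alive = [r for r in replicas if r["up"]]
    if not alive:
        return None
    return min(alive, key=lambda r: haversine_km(client_pos, r["pos"]))["url"]


# Hypothetical replicas of the same file at three sites.
REPLICAS = [
    {"url": "https://geneva.example.org/dir1/file1", "pos": (46.2, 6.1), "up": True},
    {"url": "https://hamburg.example.org/dir1/file1", "pos": (53.55, 9.99), "up": True},
    {"url": "https://newyork.example.org/dir1/file1", "pos": (40.7, -74.0), "up": True},
]

paris = (48.85, 2.35)
print(pick_replica(paris, REPLICAS))
```

Filtering on the `up` flag before ranking by distance mirrors the combination of realtime up-ness detection and geography-aware redirection listed above.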

[Figure: Dynafed overview diagram]
- Label: DESY
- Prototype: 14/15 LHCb sites, 60 ATLAS sites
- Geography-based, client-aware redirections
- Flexible authentication/authorization, friendly with identity federations
- Realtime detection of sites’ up-ness
- Makes S3 storage easy to use, scales it up and applies uniform security
- Site A (HTTP/S3) and Site B (HTTP/S3) hold .../dir1/file1, .../dir1/file2 and .../dir1/file3 (label: “With 2 replicas”)
- Federated view: /dir1 containing /dir1/file1, /dir1/file2, /dir1/file3
- On-the-fly friendly visualization, full WebDAV access, redirection-based, robust against failures, fully scalable
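The aggregated view in the diagram can be illustrated with a small merge of per-site listings. This is a sketch only: the assignment of files to sites below is an assumption made for the example (the transcript does not preserve which site holds which file), and `federated_listing` is a hypothetical helper.

```python
def federated_listing(site_listings):
    """Merge per-site listings into one view: path -> sites holding a replica."""
    merged = {}
    for site, paths in site_listings.items():
        for path in paths:
            merged.setdefault(path, []).append(site)
    return dict(sorted(merged.items()))


# Assumed content of the two sites, for illustration only.
listings = {
    "Site A": ["/dir1/file1", "/dir1/file2"],
    "Site B": ["/dir1/file2", "/dir1/file3"],
}

for path, sites in federated_listing(listings).items():
    print(path, "->", ", ".join(sites))
```

A path reported by more than one site shows up once in the merged view, with several replicas to choose from at redirection time.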

Easy access
- Main Dynafed testbed with dozens of Grid endpoints and several demos
- The file being accessed is hosted in an XrdHTTP instance… somewhere
- Data discovery is dynamic, no static indexing involved
- The HTTP ecosystem can give unprecedented flexibility to Grid data access, fully supporting the Grid workflows

S3 support in the HTTP ecosystem
- S3 is a sort of very smart HTTP dialect
  - Scalability-oriented on the server side, to some extent it makes non-scalable usage difficult
  - Simple and very fast access delegation mechanism
- Supports hierarchical content (directories!) in buckets, in a way that a vanilla client can’t easily exploit
  - No concept of directory, just path prefix
  - It defines a tree in the opposite direction with respect to a regular file system
- We wrote a simple Dynafed C++ plugin that exploits all of this in a friendly way
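To make the "no directories, just path prefixes" point concrete, here is a sketch that emulates a directory listing over a flat set of object keys, in the spirit of the prefix and delimiter parameters of an S3 bucket listing. It works on an in-memory list and does not call any S3 service; the keys and the `list_dir` helper are hypothetical.

```python
def list_dir(keys, prefix, delimiter="/"):
    """Emulate a directory listing over flat object keys.

    Returns (files, subdirs): keys directly under the prefix, and the distinct
    "common prefixes" one level below it, which play the role of directories.
    """
    files, subdirs = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if not rest:
            continue  # the key is the prefix itself
        head, sep, _ = rest.partition(delimiter)
        if sep:
            subdirs.add(prefix + head + delimiter)  # deeper level: report the "folder"
        else:
            files.append(key)
    return sorted(files), sorted(subdirs)


# Hypothetical flat content of a bucket.
KEYS = ["dir1/file1", "dir1/file2", "dir1/sub/file3", "dir2/file4"]

print(list_dir(KEYS, "dir1/"))
print(list_dir(KEYS, ""))
```

The "directories" exist only as shared prefixes of the keys, which is why a client that expects a regular file system tree needs some help to browse a bucket.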

S3 support in the HTTP ecosystem
- Dynafed can federate any number of remote S3 buckets together with other non-S3 storage
- This S3 federation will work as a single read/write WebDAV storage, totally seamless, extremely fast and scalable
- This S3 federation avoids having to distribute S3 keys to the clients, as it works with short-term delegations
  - Users/jobs do not need to bother with S3 mechanics, they just use a clean URL
- Tested with the Amazon and Ceph S3 implementations
- This S3 federation can apply a uniform authorization/authentication scheme
  - Can be X509, login/password, in principle any mechanism that works as an Apache module
- We used this mechanism in …
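The general idea of a short-term delegation, where the secret stays with the federator and the client only receives a time-limited signed URL, can be sketched as follows. This is a generic HMAC-based illustration and deliberately not the actual S3 signature algorithm; the secret, host name, query parameter names and helper functions are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical secret: known to the federator and the storage, never to clients.
SECRET_KEY = b"hypothetical-secret-kept-by-the-federator"


def _signature(method, path, expires):
    message = f"{method}\n{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(method, base_url, path, now, lifetime=300):
    """Federator side: build a URL that is valid for `lifetime` seconds."""
    expires = int(now) + lifetime
    query = urlencode({"expires": expires,
                       "signature": _signature(method, path, expires)})
    return f"{base_url}{path}?{query}"


def verify_url(method, url, now):
    """Storage side: accept only unexpired URLs carrying a matching signature."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    expires = int(query["expires"][0])
    expected = _signature(method, parts.path, expires)
    return now <= expires and hmac.compare_digest(expected, query["signature"][0])


url = sign_url("GET", "https://bucket.example.org", "/dir1/file1", now=1000)
print(verify_url("GET", url, now=1100))
```

The client can follow the redirect and read the file, but it cannot reuse the URL after expiry or for a different method, and it never sees the key itself.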

The Data Bridge
- The Data Bridge is a component that bridges authentication domains for storage access
  - Context: volunteer computing (BOINC) and the Grid environment
- Built on Apache plus the Dynafed technology, with S3 buckets as backends
  - Dynafed exploits that, adding scalability, easy data presentation, uniform authorization and flexibility
  - Apache can also host other ‘standard’ authentication plugins
- It’s a generic idea to harmonize Cloud storage and multiple authentication domains, including Grid/X509/VOMS
  - BOINC users (user/password) need to receive the job description AND write the output
  - Grid agents (FTS, X509) need to read what the BOINC user wrote

[Figure: The Data Bridge. Components: BOINC User, Workload Manager, mysql, Apache, DynaFed, Grid (FTS, X509), multiple S3 buckets anywhere. Arrow labels: PUT/GET, HTTPS, redirect & sign.]

Initiatives with Data Bridge
- Pioneered the adoption of the Data Bridge
- Test4Theory
  - Pre-production mode, running a fraction of the production jobs through the Data Bridge
- BNL
  - Evaluation of ATLAS workflows involving Amazon S3, dCache, Grid clients and FTS3. Good feedback
- Human Brain Project and EGI
  - Under evaluation for accessing/sharing large repositories of brain scans from browser-based 3D apps

Dynafed project status
- The project is in a stable, low-development-overhead state, actively maintained and getting increased exposure
- Ideas for the next development cycles:
  - Redirection monitoring, to allow the logging of federator behavior for real-time monitoring and subsequent analytics
  - Metadata integration, beginning with the incorporation of space usage information, allowing the federator to expose grid-wide storage metrics
  - An HTTP-based endpoint realtime status/management subsystem
  - Semantic enhancements to the embedded rule-based authZ implementation, or maybe pluggable authZ
  - Study other similar Cloud storage systems, e.g. MS Azure
  - Deployment tests with other Apache security plugins, to natively support Identity Federations on Cloud storage
- Big potential and unprecedented flexibility could be the prize. Anyone willing to try and let us know?

For more info
- Dynafed homepage: http://lcgdm.web.cern.ch/dynamic-federations
- Full Dynafed documentation: whitepaper/Doc_DynaFeds.pdf
- Demo testbed:
- Web FTS homepage:
- DAVIX (powerful HTTP/WebDAV/S3 client):

Conclusion
- We can seamlessly federate Grid and Cloud storage, in a very flexible and “opportunistic” way
- Among ourselves we tend to call all this the “HTTP ecosystem”, referring to an ensemble of components that sustain each other’s usage and are very open to usage by “normal” professionals
  - Easier to share services with other communities
  - Easier to develop new services
  - Easier to use Grid services for interactivity, outreach and others
  - Easier to be understood by non-HEP professionals
- Focus on usability, flexibility, performance, scalability
- Everything is available as high-quality RPMs, mostly in EPEL
- Dynafed is a generic component, and we foresee other applications (e.g. clustering remote or local caches)
- We encourage collaborations and new ideas

IT-SDC : Support for Distributed Computing Backup hardcore slides

[Figure: federator architecture. Elements: Frontend (Apache2+DMLite), Federator, two Plugins, two SEs, metadata cache. Query label: “Where is file X ?”]
- The cache remembers what happened
- The next metadata interactions will very likely be fed by the cache
- The 2nd level cache can be shared among federators (memcached)

Dynafed: Simple assumptions
- Each site’s content is accessible via HTTP/DAV to the other sites
- Same path/name means same file
  - Modulo a site prefix, e.g. /data/prod01/myimage.jpg
  - Federated as:
  - More complex forms of translation are possible. Simpler is better.
- An identity recognized by the federator has to be recognized by the sites too
  - In some cases the federator can apply a form of delegation
- All participating sites allow the federator to read file metadata (HEAD and PROPFIND)
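The "same path means same file, modulo a site prefix" assumption amounts to a purely algorithmic name translation. A minimal sketch follows; the site names, the prefixes and the `/fed` federation root are hypothetical, since the example URLs from the slide are not preserved in the transcript.

```python
# Hypothetical per-site URL prefixes and federation root.
SITE_PREFIXES = {
    "site-a": "https://site-a.example.org/storage",
    "site-b": "https://site-b.example.org/pool/vo",
}
FEDERATION_ROOT = "/fed"


def to_federated(site, url):
    """Strip the site prefix: the remainder is the federated name."""
    prefix = SITE_PREFIXES[site]
    if not url.startswith(prefix + "/"):
        raise ValueError(f"{url} is not under the prefix of {site}")
    return FEDERATION_ROOT + url[len(prefix):]


def to_site(site, fed_path):
    """Inverse translation: where the federated name would live at a site."""
    if not fed_path.startswith(FEDERATION_ROOT + "/"):
        raise ValueError(f"{fed_path} is not in the federation namespace")
    return SITE_PREFIXES[site] + fed_path[len(FEDERATION_ROOT):]


a = "https://site-a.example.org/storage/data/prod01/myimage.jpg"
b = "https://site-b.example.org/pool/vo/data/prod01/myimage.jpg"
print(to_federated("site-a", a))
print(to_federated("site-b", b))
```

Both site URLs map to the same federated name, so they are treated as replicas of one file without consulting any catalogue.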

Dynafed Focus: performance
- Performance and scalability are of primary importance
  - Otherwise it’s useless...
- Fully C++, designed for parallelism
  - No limit to the number of outstanding clients/tasks
  - No global locks/serializations!
  - The endpoints are treated in a completely independent way
  - Thread pools and producer/consumer queues are used extensively (e.g. to stat N items in M endpoints while X clients wait for some items)
- Aggressive metadata caching
  - A relaxed, hash-based, in-memory partial name space
  - Juggles info in order to always contain what’s needed
- Spurred a high-performance DAV client implementation (DAVIX)
  - Wraps DAV calls into a POSIX-like API, sparing the difficulty of composing requests/responses
  - Loaded by the core as a “location” plugin
  - Available in ROOT 5 and 6 as TDavixFile

Dynafed system design
- A system that merely works is not sufficient
  - To be usable, it must prioritize speed, parallelism, scalability
- The core component is a plugin-based component originally called “Uniform Generic Redirector” (Ugr)
  - Can plug into an Apache server thanks to the DMLITE and DAV-DMLITE modules (by IT-GT)
  - Composes the aggregated metadata views on the fly by managing parallel tasks of information location
  - Never stacks up latencies!
- Makes a sparse collection of file/directory metadata browsable
- Able to redirect clients to hosts known to be working at that moment
- Built on the concept of a partial, volatile namespace made of objects
  - Objects are kept in an LRU: a fast 1st level namespace cache
  - Peak performance is ~500K to 1M hits/second per core
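A partial, volatile namespace kept in an LRU can be illustrated with a small cache built on `collections.OrderedDict`. This is a Python sketch of the general technique only; Ugr itself is written in C++ and its actual data structures are not described in these slides. The class name and metadata fields are hypothetical.

```python
from collections import OrderedDict


class LRUNamespace:
    """Holds metadata for at most `capacity` paths, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # path -> metadata, oldest entry first

    def get(self, path):
        if path not in self.items:
            return None  # miss: the caller would have to ask the endpoints
        self.items.move_to_end(path)  # mark as most recently used
        return self.items[path]

    def put(self, path, metadata):
        self.items[path] = metadata
        self.items.move_to_end(path)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used entry


ns = LRUNamespace(capacity=2)
ns.put("/dir1/file1", {"size": 10})
ns.put("/dir1/file2", {"size": 20})
ns.get("/dir1/file1")                # file1 becomes the most recently used
ns.put("/dir1/file3", {"size": 30})  # evicts file2
print(list(ns.items))
```

The namespace is "partial" because it only ever holds what was recently asked for, and "volatile" because any entry can be dropped and rebuilt from the endpoints.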

Dynafed: Name translations
- A sophisticated scheme of name translation is key to being able to federate almost any source of metadata
- UGR implements algorithmic translations and can accommodate non-algorithmic ones as well
  - Algorithmic is technically a much better choice
- A plugin could also query an external service (e.g. an LFC or a private DB)
- The metadata caching keeps the performance high, especially in the case of external translators

[Figure: deployment diagram]
- Clients come and are distributed through:
  - different machines (DNS alias)
  - different processes (Apache config)
- Clients are served by the UGR. They can browse/stat or be redirected for action.
- The architecture is multi/manycore friendly and uses a fast parallel caching scheme