IT-SDC : Support for Distributed Computing Dynamic Federations: scalable, high performance Grid/Cloud storage federations Fabrizio Furano - Oliver Keeble.


1 IT-SDC: Support for Distributed Computing
Dynamic Federations: scalable, high performance Grid/Cloud storage federations
Fabrizio Furano - Oliver Keeble - Adrien Devresse

2 17 Nov 2014 HTTP Dynamic Federations IT-SDC
- A project started a few years ago
- Context: promote and improve the usage of WebDAV/HTTP for high performance computing in the geographically distributed Grid environment
- Goal: a frontend that presents what a certain number of remote or local endpoints would present if put together
- Without indexing them beforehand
- Emphasis on scalability and flexibility
- Flexible algorithmic name translations to “mount” remote endpoints into an apparent namespace
- Use “industry standard” building blocks whenever possible
- These endpoints can be a very broad range of objects that act as data or metadata stores
- WebDAV and S3 are included
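The idea of “mounting” endpoints through algorithmic name translation can be illustrated with a minimal Python sketch. The mount table, the example.org hostnames and the `translate` function are all hypothetical and chosen only for illustration; they do not reflect the project's actual code or configuration format.

```python
from typing import Dict, List

# Hypothetical mount table: apparent-namespace prefix -> endpoint base URLs.
MOUNTS: Dict[str, List[str]] = {
    "/fed/lhcb": [
        "https://se1.example.org/data/lhcb",
        "https://se2.example.org:2880/lhcb",
    ],
    "/fed/atlas": ["https://s3.example.org/atlas-bucket"],
}


def translate(path: str, mounts: Dict[str, List[str]] = MOUNTS) -> List[str]:
    """Return the candidate endpoint URLs for a path in the apparent namespace.

    No index is consulted: the translation is a pure string rewrite, so
    the candidates still have to be probed to learn which ones exist.
    """
    candidates: List[str] = []
    for prefix, bases in mounts.items():
        # Match whole path components only ("/fed/lhcbx" must not match).
        if path == prefix or path.startswith(prefix + "/"):
            suffix = path[len(prefix):]
            candidates.extend(base + suffix for base in bases)
    return candidates
```

Because the translation is purely algorithmic, adding an endpoint in this sketch amounts to adding one entry to the table.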

3 Aggregation
[Diagram] What we want to see as users: /dir1 containing /dir1/file1, /dir1/file2 (with 2 replicas) and /dir1/file3
Storage/MD endpoint 1: .../dir1/file1, .../dir1/file2
Storage/MD endpoint 2: .../dir1/file2, .../dir1/file3
- Sites remain independent and participate in a global view
- All the metadata interactions are hidden and done on the fly
- NO metadata persistency needed here, just efficiency and parallelism
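The aggregation shown on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ENDPOINTS` dictionary stands in for remote storage/metadata endpoints (a real federator would issue WebDAV or S3 listing requests), and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

# Stand-ins for two remote endpoints, mirroring the slide's example.
ENDPOINTS: Dict[str, Dict[str, List[str]]] = {
    "endpoint1": {"/dir1": ["file1", "file2"]},
    "endpoint2": {"/dir1": ["file2", "file3"]},
}


def list_endpoint(name: str, directory: str) -> List[str]:
    """Pretend remote listing; returns the entries one endpoint holds."""
    return ENDPOINTS[name].get(directory, [])


def federated_listing(directory: str) -> Dict[str, List[str]]:
    """Query all endpoints in parallel and merge the results on the fly.

    Each entry maps to the endpoints that hold a replica of it.
    Nothing is persisted: the merged view exists only for this call.
    """
    names = sorted(ENDPOINTS)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        results = list(pool.map(lambda n: list_endpoint(n, directory), names))
    merged: Dict[str, List[str]] = {}
    for name, entries in zip(names, results):
        for entry in entries:
            merged.setdefault(entry, []).append(name)
    return dict(sorted(merged.items()))
```

Calling `federated_listing("/dir1")` yields three entries, with `file2` listed under both endpoints, which corresponds to the "2 replicas" case in the diagram.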

4 [Diagram labels: Federator, Plugin, Frontend (Apache2+DMLite), “Where is file X?”, Plugin, SE, Metadata cache, SE]
- The cache remembers what happened
- The next metadata interactions will very likely be fed by the cache
- The 2nd level cache can be shared among federators (memcached)
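A two-level metadata cache of the kind described here might look like the following Python sketch. It is an assumption-laden illustration: a plain dictionary stands in for the shared memcached layer, and the class name, the TTL policy and the `fetch` callback are hypothetical rather than taken from the project.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Entry = Tuple[float, str]  # (time stored, cached answer)


class TwoLevelCache:
    """First level: private to one federator. Second level: shared."""

    def __init__(self, shared: Dict[str, Entry], ttl: float = 60.0):
        self.local: Dict[str, Entry] = {}
        self.shared = shared  # stand-in for a memcached instance
        self.ttl = ttl

    def _fresh(self, item: Optional[Entry], now: float) -> bool:
        return item is not None and now - item[0] < self.ttl

    def lookup(self, key: str, fetch: Callable[[str], str],
               now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        # Try the private level first, then the shared one.
        for level in (self.local, self.shared):
            item = level.get(key)
            if self._fresh(item, now):
                self.local[key] = item  # promote shared hits locally
                return item[1]
        # Miss on both levels: ask the endpoints and remember the answer.
        value = fetch(key)
        self.local[key] = self.shared[key] = (now, value)
        return value
```

In this sketch two federators built on the same `shared` dictionary benefit from each other's lookups, and expired entries fall through to a fresh query.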

5 HTTP/WebDAV federation
- For an HTTP/WebDAV client it’s just a huge, distributed repository to query
- A solution to the “Where is file X?” problem
- High performance (thousands of client transactions per second) and reliability
- Makes real-time redirection choices, considering the worldwide status (instead of a static catalogue)
- Never out of sync with the storage elements’ status
- Can scale up the size of the repository
- Can scale up the number of clients
- On top of data/metadata access, it also allows browsing the federated apparent namespace
- Gives a friendly feel to users and sysadmins
- A solution to the “What’s in ‘directory’ (= path prefix) Y?” problem

6 HTTP/WebDAV federation
- For the sysadmins it’s a frontend service that contacts a set of remote endpoints
- Based on Apache plus some shared-library (.so) plugins
- Entirely C++ code
- No service-side metadata persistency needed
- Each endpoint provides WebDAV/HTTP/S3 content
- Spurred DAVIX: a complete, high performance HTTP/DAV/S3 client library
- Available in Fedora/EPEL
- The Federator needs only read-only access to metadata
- Each endpoint is “mounted” according to a directory prefix

7 Dynamic Federations
- This approach of dynamic “mounting” is very powerful
- Opens up a multitude of use cases, by composing a worldwide system from macro building blocks speaking HTTP and/or WebDAV
- Federate Grid storage
- Federate WebDAV Cloud services
- Add the content of fast-changing things, like file caches
- Add native S3 storage backends (a supported dialect)
- Accommodate any metadata sources, even two or more remote catalogues at the same time
- Clients are redirected to the replica that is closest to them
- The metric is pluggable; any other metric could be implemented
- Redirect only to endpoints that are working at that moment
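Replica selection with a pluggable metric, restricted to endpoints that are currently working, can be sketched as follows in Python. The `Replica` type, the great-circle distance metric and the example.org URLs are assumptions made for this illustration, not the federator's actual interfaces.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in degrees


@dataclass
class Replica:
    url: str
    location: Coord
    online: bool = True


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def choose_replica(client: Coord, replicas: List[Replica],
                   metric: Callable[[Coord, Coord], float] = haversine_km
                   ) -> Optional[Replica]:
    """Pick the online replica with the lowest metric value for this client."""
    alive = [r for r in replicas if r.online]
    if not alive:
        return None
    return min(alive, key=lambda r: metric(client, r.location))
```

Any callable with the same signature can replace `haversine_km`, for example one based on measured latency or on endpoint load.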

8 Look and feel
- What we see in the browser is an HTML rendering of a listing
- Everything is done on the fly
- Click on a file to download it (if your client is authorized by the endpoint SE through X509)
- Feed the URL of that file to any other client to download it
- Click on the strange icon to get a metalink
- A standard representation of the locations of a file, sorted by increasing distance from the requestor
- It’s supported by multi-source download apps
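As a rough illustration of what such a metalink carries, the Python sketch below builds a small document in the Metalink 4 XML format (RFC 5854), listing replica URLs in order of increasing distance. The function name, the input tuples and the example.org URLs are hypothetical, and the federator's real output may contain more fields than shown here.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

NS = "urn:ietf:params:xml:ns:metalink"  # Metalink 4 namespace (RFC 5854)


def build_metalink(filename: str, replicas: List[Tuple[str, float]]) -> str:
    """Build a metalink listing (url, distance) pairs, nearest first.

    Priority 1 marks the preferred source for a multi-source client.
    """
    root = ET.Element("metalink", {"xmlns": NS})
    file_el = ET.SubElement(root, "file", {"name": filename})
    ordered = sorted(replicas, key=lambda r: r[1])
    for priority, (url, _distance) in enumerate(ordered, start=1):
        url_el = ET.SubElement(file_el, "url", {"priority": str(priority)})
        url_el.text = url
    return ET.tostring(root, encoding="unicode")
```

A download tool that understands metalinks can then try the sources in priority order, or fetch from several of them at once.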

9 Look and feel, like a normal list

10 Interesting deployments and use cases

11 The demo frontend
- Our historical public testbed is a powerful machine at DESY
- http://federation.desy.de/fed
- Provided by the dCache team, cooperating with us
- Hosts several demos
- E.g. the Interleaved path, containing interleaved files from two sources
- Now hosts the stable LHCb prototype

12 Constructing an on the fly namespace
[Diagram] /fed
- /interleaved: 2 sites here, CERN (odd files) and DESY (even files)
- /lhcb: 14/19 LHCb sites here, 15PB online
- /XrdHTTP_README: bonus file coming from yet another endpoint placed on /fed

13 LHCb federation prototype
- The namespace of the storage elements of the LHCb experiment is quite simple and clean
- Many sites are now deploying WebDAV access
- Setup was simple, and now 14 sites (~15PB) are stably online
- It just works
- GeoIP-based redirection optimization is active
- Official site downtimes were always detected automatically
[Callout] The path /lhcb/LHCb/Collision12/BHADRONCOMPLETEEVENT.DST/00030613/0000/00030613_00000134_1.bhadroncompleteevent.dst remains constant, regardless of the site prefix it may have, such as https://ccdavlhcb.in2p3.fr:2880/ or https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/
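The observation that the path stays constant under different site prefixes can be expressed as a small Python sketch. The two prefixes are the ones quoted on the slide; the `logical_name` function and the assumption that a site URL is simply the prefix followed by the path (minus its leading slash) are illustrative guesses, not the prototype's configuration.

```python
from typing import List, Optional

# Site prefixes quoted on the slide.
SITE_PREFIXES: List[str] = [
    "https://ccdavlhcb.in2p3.fr:2880/",
    "https://fly1.grid.sara.nl:2882/pnfs/grid.sara.nl/data/",
]


def logical_name(url: str, prefixes: List[str] = SITE_PREFIXES) -> Optional[str]:
    """Strip a known site prefix and return the site-independent path."""
    for prefix in prefixes:
        if url.startswith(prefix):
            return "/" + url[len(prefix):]
    return None  # URL does not belong to any known site
```

Under that assumption, replicas at different sites map to the same logical name, which is what lets a federator present them as a single file.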

14 The BOINC data bridge
- BOINC lets you contribute computing power on your home PC to projects doing research in many scientific areas
- http://boinc.berkeley.edu/
- The LHC experiments have some interest in it, and have dedicated some effort to it. Some challenges were:
  - Seamlessly integrating the Grid storage auth domain with an external user-based auth domain
  - Optimizing and ruggedizing the data access to/from users (slow lines, distant home users, processes put to sleep, …)
  - Many clients, potentially large data bridge needed, scalability
- The BOINC Data Bridge is basically a Dynamic Federation with:
  - Write to the Data Bridge enabled
  - On-the-fly resource location among an undefined number of S3 backends
  - GeoIP optimization of file locations
  - Double-headed authentication: X509 (strong) and BOINC (Apache username/pwd) … it’s a bridge!
- https://indico.cern.ch/event/272793/

15 The BOINC data bridge
[Diagram labels: Apache ssl, FTS, S3, mysql, CRAB3 (X509), BOINC User (Apache auth), PUT/GET, HTTP redirect & sign, PUT/GET, Grid (X509), DynaFed, S3]
- Any number of S3 instances, in any place (prototype at CERN)
- Redirections will be optimized based on the client’s location
- Good files are moved asynchronously to the official Grid repos

16 NEP-101
- NEP-101 is a project to enable data-intensive applications to run on distributed clouds
- Batch services, Software distribution, Storage Federation, Image Distribution
- Need to use standard protocols and open-source components, and avoid anything HEP-specific
- Have multiple clouds and SEs in various locations; cloud jobs need to find SEs
- Planning to use EMI Dynamic Federation
Ryan Taylor – University of Victoria

17 North American Clouds
Ryan Taylor – University of Victoria

18 Federation Test Deployment
[Diagram labels: Storage Element, Web server]
- More SEs could be added for a production deployment (e.g. Melbourne did)
Ryan Taylor – University of Victoria

19 Different frontends
- Any application that can rely on a WebDAV namespace can work seamlessly on top of a federation
- Only caveat: it must support redirections (which several clients do)
- Hence, in principle, things like OwnCloud could work on a federation of endpoints or S3 buckets
- A geographically distributed repository
- We are willing to try

20 Federation of Cloud Storages
[Diagram] Amazon S3: /atlas/bucket1/file2, /atlas/bucket2/file1, /lhcb/bucket3/file6
Ceph S3: /atlas/specialdata/bucket5, /atlas/bucket2/file1
Openstack Swift: /atlas/bucket2/file1, /userBob/kitty.png
DynaFed: /atlas/specialdata/bucket5, /atlas/bucket2/file1, /lhcb/bucket3/file2, /userBob/kitty.png, /atlas/bucket1/file2, /lhcb/bucket3/file6
- One namespace for several Clouds
- Designed to scale
- Fully in memory
- Metadata caching
- Geo-Redirection
- Standard interface

21 Why federate Cloud Storage?
- Extend your existing resources with Cloud Storage
- Add/remove resources on demand
- Inter-Cloud data replication, huge composite repositories
- Geo-Redirection
- Load-balancing / failover
- Federated Identity
- Use your own authorization scheme
Answer: Combine the advantages of a federation with the flexibility of the Cloud

22 Conclusions and next steps
- Getting close to large production setups
- Bridge Web, Grid and Cloud tech with tools that scale
- Keep the usual metadata DBs for more batch-like use cases; use the dynamic system for real-time data access
- Make high performance data storage/access scale geographically in the Web+Grid+Cloud case
- A way to explore user-friendly interfaces for large geographically distributed repos
- New Dynafed release in one week
- Add missing endpoints to the LHCb prototype
- Use the new features to evaluate the ATLAS case (>40 sites, >200 spacetokens [= places to search])
- Support the BOINC data bridge and the Canadian project
- Get experience with aggregating many S3 endpoints
- Get experience in managing file caches instead of stable storage
- Investigate the possible usage in the context of personal file sync tools with very large, distributed repos

