Bringing cloud technology to distributed data infrastructures EGI CF 2013 Martin Hellmich (presenter) Jedrzej Rybicki Maciej Brzeźniak Date :


1 Bringing cloud technology to distributed data infrastructures EGI CF 2013 Martin Hellmich (presenter) Jedrzej Rybicki Maciej Brzeźniak Date :

2 A bit of context
Towards a pan-European Collaborative Data Infrastructure
Production Services: Safe Replication, Data Staging, Metadata, AAI
Research & Development: Scalable Federation Architectures Data Preservation Data Access and Transfer Workflows

3 Three Projects
Cloud storage integration
– iRODS managing an OpenStack Swift backend
– Extending DPM with S3 storage
In-storage processing
– Call Hadoop jobs from iRODS

4 My Goal
Show the projects
Find interest in the communities (we are interdisciplinary)
Start discussion about cloud integration
– Backend or frontend?
– Outsource or restructure?
– Where are the limitations?

5 The Cloud Integration Projects
iRODS-OpenStack
– Expose existing S3/OpenStack storage (managed otherwise)
– iRODS frontend protocols
– Local storage as cache
DPM-S3
– Add new storage to DPM
– Expose HTTP only (but grid-aware, X509, VOMS)
– Outsource storage and network traffic

6 iRODS-OpenStack Swift Maciej Brzezniak Date :

7 Sidestep: iRODS compound resources
iRODS resources: Cache Archive Virtual
iRODS compound resources:
– Virtual resource
– Maps from PUT/GET to POSIX
– Provides a cache
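The compound-resource idea above (a cache in front of an archive, with PUT/GET mapped to POSIX file access) can be sketched roughly as follows. This is an illustration only, not iRODS code: the class name `CompoundResource`, its methods, and the use of a plain dict as a stand-in for the object-store archive are all assumptions made for this example.

```python
import os


class CompoundResource:
    """Hypothetical sketch: a POSIX cache directory in front of an object archive."""

    def __init__(self, cache_dir, archive):
        self.cache_dir = cache_dir
        # 'archive' is any dict-like object; here it stands in for the
        # PUT/GET-only backend (for example an S3 or Swift bucket).
        self.archive = archive

    def _cache_path(self, name):
        # Flatten the logical name into one file name. A real system would
        # need a collision-free mapping; this is good enough for a sketch.
        return os.path.join(self.cache_dir, name.replace("/", "_"))

    def put(self, name, data):
        # Write to the POSIX cache first, then replicate to the archive.
        with open(self._cache_path(name), "wb") as f:
            f.write(data)
        self.archive[name] = data

    def get(self, name):
        path = self._cache_path(name)
        if not os.path.exists(path):
            # Cache miss: stage the object from the archive into the cache.
            with open(path, "wb") as f:
                f.write(self.archive[name])
        # Clients always read through the POSIX cache copy.
        with open(path, "rb") as f:
            return f.read()
```

Callers only see `put` and `get`; whether the bytes come from the local cache or had to be staged from the archive is hidden, which is the sense in which the compound resource is "virtual".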

8 iRODS managing an S3 backend
Ingredients:
– iRODS server
– S3 Driver (in C)
– iRODS-S3 Driver Glue
– Swift-to-S3 frontend
[Diagram: iRODS, Site, Disks, OpenStack Swift/S3]

9 Achievements
– Transparent cloud storage
– Cloud auth through central accounts
– Low overhead through iRODS
– Speedups with caching
Limitations:
– Filesize limit (2/5GB)
– Issue moving files inside the cloud
[Diagram: iRODS, Site, Disks, S3/OpenStack]

10 DPM-S3 Martin Hellmich Date :

11 DPM now uses dmlite
[Diagram: S3]

12 Sidestep: the S3 protocol
HTTP + custom headers
Access ID + Secret Key + HTTP Cmd + Time => Signature
The signature can be sent:
– In a header: Authorization: AWS AWSAccessKeyId:Signature
– In the URL: ?AWSAccessKeyId=AKIAIOSFODNN7EXAMPLE&Signature=NpgCjnDzr%2BWFzoENXmpNDUsSn8%3D&Expires=1175139620
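As a rough sketch of the "Access ID + Secret Key + HTTP Cmd + Time => Signature" step, the following computes a query-string signature in the style of S3 signature version 2 (HMAC-SHA1 over a newline-separated string, then base64). The function names, the path-style URL layout, and the host, bucket, and key values are assumptions made for this example; the slides do not show the actual signing code used by the projects.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_v2(secret_key, string_to_sign):
    """Return base64(HMAC-SHA1(secret_key, string_to_sign))."""
    digest = hmac.new(secret_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def presigned_get_url(host, bucket, key, access_id, secret_key, expires):
    """Build a time-limited GET URL (hypothetical helper, path-style addressing)."""
    resource = f"/{bucket}/{quote(key)}"
    # Verb, Content-MD5 and Content-Type (both empty for a plain GET),
    # expiry as Unix time, then the canonicalized resource.
    string_to_sign = f"GET\n\n\n{expires}\n{resource}"
    signature = sign_v2(secret_key, string_to_sign)
    return (f"https://{host}{resource}"
            f"?AWSAccessKeyId={quote(access_id, safe='')}"
            f"&Signature={quote(signature, safe='')}"
            f"&Expires={expires}")
```

Because the secret key never appears in the URL, such a link can be handed to any standard HTTP client, and it stops working once the `Expires` time has passed.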

13 Extending DPM with S3 Storage
Ingredients:
– dmlite
– dmlite-plugins-s3
– Amazon S3
– OpenStack Swift S3 frontend
– Ceph/RadosGW
[Diagram: Site, Disks, S3, Signed URL redirect]
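The "Signed URL redirect" in the diagram can be pictured as below: the local service only resolves the logical file name and answers with a redirect, so the data itself flows directly between the client and the S3 storage. This is a hypothetical sketch, not the dmlite-plugins-s3 implementation; the function name, the dict-based catalog, the injected `make_signed_url` callable, and the choice of status code 302 are all assumptions.

```python
from typing import Callable, Dict, Tuple


def redirect_to_replica(lfn: str,
                        catalog: Dict[str, Tuple[str, str]],
                        make_signed_url: Callable[[str, str], str]):
    """Answer a GET for a logical file name with a redirect to the object store.

    'catalog' maps logical file names to (bucket, key) pairs and stands in
    for the nameserver lookup; 'make_signed_url' is whatever routine
    produces a time-limited signed URL for that bucket and key.
    """
    if lfn not in catalog:
        return 404, {}
    bucket, key = catalog[lfn]
    # No file data passes through this service, only the redirect.
    return 302, {"Location": make_signed_url(bucket, key)}
```

A call such as `redirect_to_replica("/dpm/example/file", catalog, signer)` returns a status code and a headers dict; the paths and names used here are placeholders.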

14 Achievements
– Only nameserver traffic local
– Cloud storage managed with central account
– Grid-enabled HTTP
– Standard HTTP clients
– Filesize limit (or S3 client)
[Diagram: Site, Disks, S3, Signed URL redirect]

15 In-Storage Processing Jedrzej Rybicki & Benedikt von St. Vieth Date :

16 Motivation
Example HPC workflow:
[Diagram 1: Site, High Performance Computing, Storage, preprocessing]
[Diagram 2: Site, High Performance Computing, Storage + preprocessing]

17 Sidestep: iRODS rules
Condition: $objPath like /x/y/z/* Or $rescName == demoResc8
Rule: printHello { print_hello; }
Act freely on certain triggers
At least C and Python
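To illustrate the trigger idea outside of the iRODS rule language, here is a minimal Python sketch that checks the two example conditions and runs an action when either one holds. It is not iRODS code: the event dict, the function names, and the reading of "Or" as a logical alternative between the two conditions are assumptions, and the wildcard is approximated with shell-style matching.

```python
from fnmatch import fnmatchcase


def matches(event, path_pattern="/x/y/z/*", resc_name="demoResc8"):
    """Return True if the object path matches the pattern or the resource name is equal."""
    return (fnmatchcase(event["objPath"], path_pattern)
            or event["rescName"] == resc_name)


def print_hello(event):
    # Placeholder action; a real rule could start a processing job instead.
    return f"Hello from {event['objPath']}"


def on_event(event):
    """Run the action only for events that satisfy the condition."""
    if matches(event):
        return print_hello(event)
    return None
```

For example, `on_event({"objPath": "/x/y/z/data.h5", "rescName": "other"})` triggers the action because the path matches, while an event with an unrelated path and resource returns `None`.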

18 In-Storage Processing
Benedikt von St. Vieth & Jedrzej Rybicki

19 Achievements
– Everything is a file
– Easy job specification in Apache Pig
– Caching of results
– Predefined scripts or custom jobs?

20 Summary
– There are different ways to integrate cloud storage for different scenarios
– Storage-based computing can be made transparent

21 Thank you!
Project contacts
OpenStack/iRODS
– Maciej Brzezniak (PSNC)
DPM-S3
– Martin Hellmich (CERN)
In-storage processing on iRODS
– Jedrzej Rybicki / Benedikt von St. Vieth (JSC)
Any Questions?

