A Managed Object Placement Service (MOPS) using NEST and GridFTP Dr. Dan Fraser John Bresnahan, Nick LeRoy, Mike Link, Miron Livny, Raj Kettimuthu SCIDAC.


1 A Managed Object Placement Service (MOPS) using NEST and GridFTP Dr. Dan Fraser John Bresnahan, Nick LeRoy, Mike Link, Miron Livny, Raj Kettimuthu SCIDAC Center for Enabling Distributed Petascale Science (CEDPS) www.cedps.org

2 Overview Brief CEDPS overview Focus on data movement Managed Object Placement Service (MOPS) –Internal resource management (awareness) GFork capability –External awareness & interaction NEST (Network Storage Technology)

3 Petascale Data Challenge DOE facilities generate many petabytes of data (2 petabytes = all U.S. academic research libraries!) Remote users (at labs, universities, industry) need data! Rapid, reliable access is key to maximizing the value of $B facilities [Figure: massive data at DOE facilities; remote, distributed users]

4 Bridging the Divide (1): Move Data to Users When & Where Needed Fast: >10,000x faster than usual Internet Reliable: recover from many failures Predictable: data arrives when scheduled Secure: protect expensive resources & data Scalable: deal with many users & much data “Deliver this 100 terabytes to locations A, B, C by 9am tomorrow”

5 Bridging the Divide (2): Allow Users to Move Computation Near Data Science services: provide analysis functions near data source Flexible: easy integration of functions Secure: protect expensive resources & data Scalable: deal with many users & much data “Perform my computation F on datasets X, Y, Z”

6 Bridging the Divide (3): Troubleshoot End-to-End Problems Identify & diagnose failures & performance problems Instrument: include monitoring points in all system components Monitor: collect data in response to problems Diagnose: identify the source of problems “Why did my data transfer (or remote operation) fail?”

7 What is GridFTP? Widely used, open-source, production-quality data mover –Separate control and data channels –Parallel streams (~3-5x faster than TCP/IP) –Parallel stripes (multiple servers) –Partial file transfer –Multiple security options (GSI, SSH) –Third-party control –Extensible for both file systems & protocols
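
Partial file transfer and parallel streams fit together naturally: a file can be cut into byte ranges, and each range can travel on its own stream. The sketch below only illustrates that arithmetic. The function name `split_into_ranges` is hypothetical, and nothing here reflects the actual GridFTP wire protocol or its block-scheduling policy.

```python
from typing import List, Tuple


def split_into_ranges(total_bytes: int, streams: int) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs that cover [0, total_bytes) exactly once.

    Hypothetical helper: one range per stream, sizes differing by at most
    one byte. Streams that would receive zero bytes are omitted.
    """
    if total_bytes < 0 or streams < 1:
        raise ValueError("total_bytes must be >= 0 and streams must be >= 1")
    base, extra = divmod(total_bytes, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        # The first `extra` streams carry one additional byte each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue
        ranges.append((offset, length))
        offset += length
    return ranges
```

Each pair could then drive one partial transfer; a real client would also have to handle retries and reordering, which this sketch ignores.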

8 GridFTP Modularity Client Interfaces (clients) -globus-url-copy -C Library -RFT (3rd party) GridFTP Server -separate control, data -striping XIO Drivers (I/O) -TCP -UDT (UDP) -parallel streams -GSI -SSH Data Storage Interfaces (DSI; file systems) -POSIX -SRB -HPSS -NEST

9 GridFTP Advanced Configurations GFork (Internal awareness) –Robust unix fork/setuid model –Allows server state to be maintained across connections Dynamic backends –Stability in the event of backend failure –Growing resource pools for peak demands Storage/Access Allocation (External awareness) –NEST (Network Storage Technology)

10 Why is awareness important? Currently, GridFTP does everything it is asked If asked, GridFTP in a worst-case scenario could: –Use all available memory & buffers on the server –Write until the file system is full –Slow down all the transfers when overloaded (Worst-case scenarios do not happen very often) Many tools are designed to work around these limitations –SRM, dCache, … Services should be able to protect both themselves and their environments

11 GFork (Internal Awareness) [Diagram: clients open control channel connections to the server host; the GFork server, running a GridFTP plugin, forks one GridFTP server instance per connection; inherited links between the instances and the plugin form the state-sharing link]

12 External Awareness: Why storage allocations? Users need both temporary storage and long-term guaranteed storage. Administrators need a storage solution with configurable limits and policy. Administrators will benefit from NeST’s autonomous reclamation of expired storage allocations.

13 External Awareness: GridFTP + NeST GridFTP Server NeST Callout Disk Storage NeST Server NeST Client Negotiator globus-url-copy (Lot operations, etc.) (File transfers) (GSI-FTP)

14 Overview of NeST NeST: Network Storage Technology Lightweight: Configuration and installation can be performed in minutes. Multi-protocol: Supports Chirp, GridFTP, NFS, HTTP –Chirp is NeST’s internal protocol Secure: GSI authentication Allocation: NeST negotiates “mini storage contracts” between users and server.

15 Storage allocations in NeST Lot – abstraction for storage allocation with an associated handle –Handle is used for all subsequent operations on this lot Client requests lot of a specified size and duration. Server accepts or rejects client request.
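
The lot model described here (request a size and duration, receive a handle or a rejection, and let expired lots be reclaimed) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not NeST's implementation: `LotServer`, `request_lot`, and the explicit `now` argument are hypothetical names chosen so the example is deterministic.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Lot:
    handle: int        # used for all later operations on this lot
    size: int          # reserved bytes
    expires_at: float  # time at which the server may reclaim the lot


class LotServer:
    """Hypothetical server that grants lots against a fixed capacity."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.lots: Dict[int, Lot] = {}
        self._handles = itertools.count(1)

    def _reclaim_expired(self, now: float) -> None:
        # Expired lots are reclaimed without any client action.
        for handle in [h for h, lot in self.lots.items() if lot.expires_at <= now]:
            del self.lots[handle]

    def request_lot(self, size: int, duration: float, now: float) -> Optional[int]:
        """Return a handle if the request is accepted, or None if rejected."""
        self._reclaim_expired(now)
        used = sum(lot.size for lot in self.lots.values())
        if size <= 0 or duration <= 0 or used + size > self.capacity:
            return None
        handle = next(self._handles)
        self.lots[handle] = Lot(handle, size, now + duration)
        return handle
```

In this sketch a second 60-byte request against a 100-byte server is rejected while the first lot is live, and accepted once the first lot has expired.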

16 External Awareness Architecture Client GridFTP Server ACL Plugin DSI Plugin Main Codebase NEST

17 ACL Plugin Authorize/Init –Grant access Yes/No –Plugin establishes context (initializes state for future requests) Create/Modify/Read a file –Given pathname and size –Creates a transaction Update Transaction –Plugin may time out waiting –Progressively commit bytes as ‘complete’ –Finished flag
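
The three ACL plugin entry points listed on this slide can be pictured as a small interface. The sketch below is an assumption-laden illustration only: the real GridFTP plugin interface is not shown in these slides, and `QuotaAcl`, `authorize`, `begin`, and `update` are hypothetical names. The quota policy is likewise invented to give the plugin something to decide.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Transaction:
    path: str
    size: int
    committed: int = 0      # bytes marked 'complete' so far
    finished: bool = False  # the finished flag from the final update


class QuotaAcl:
    """Hypothetical ACL plugin: allow-list plus a per-session byte quota."""

    def __init__(self, allowed_ids: Iterable[str], quota: int) -> None:
        self.allowed_ids = set(allowed_ids)
        self.quota = quota
        self.context = None  # established by authorize()

    def authorize(self, identity: str) -> bool:
        """Authorize/Init: answer yes or no and set up per-session state."""
        if identity not in self.allowed_ids:
            return False
        self.context = {"identity": identity, "used": 0}
        return True

    def begin(self, path: str, size: int) -> Optional[Transaction]:
        """Create/Modify/Read: given pathname and size, open a transaction."""
        if self.context is None:
            raise RuntimeError("authorize() must succeed first")
        if self.context["used"] + size > self.quota:
            return None
        self.context["used"] += size
        return Transaction(path, size)

    def update(self, txn: Transaction, nbytes: int, finished: bool = False) -> None:
        """Update Transaction: progressively commit bytes as complete."""
        txn.committed = min(txn.size, txn.committed + nbytes)
        txn.finished = finished
```

The timeout behavior mentioned on the slide is left out here to keep the interface shape visible.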

18 Granting Access Client GridFTP Server ACL Plugin DSI Plugin Main Codebase Client connects GSI ID Allow? Y 230 Enter GSI Handshake Now known ID sent to auth plugin Do whatever needed to determine if allowed Notify client of access NEST

19 Receiving a File Client GridFTP Server ACL Plugin DSI Plugin Main Codebase Path/size Allow? Y 150 Begin RECV file Reserve Space Start transfer Receive Bytes Update Transaction Transaction Complete NEST

20 Notes Sending a file –Same interactions as receiving, only simpler (no space reservation) ACLs can be chained together –Chaining semantics still being worked out

21 Using NeST Init –NeST can use the client username/GSI subject to initialize. Create/modify –Reserve space with a given timeout –Pathname is the key to the transaction –If the timeout expires, the reservation and any uncommitted data are lost Update –Commit bytes, reset timeout. Complete –Clean up state
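
The create, update, and complete steps above amount to a table of reservations keyed by pathname, each with a deadline that every commit pushes back. A minimal sketch follows, assuming a caller-supplied clock (`now`) for determinism. `NestTransactions` and its method names are hypothetical and are not NeST's API.

```python
from typing import Dict, Optional


class NestTransactions:
    """Hypothetical table of space reservations keyed by pathname."""

    def __init__(self) -> None:
        self._open: Dict[str, dict] = {}

    def _expire(self, now: float) -> None:
        # An expired reservation is dropped together with its progress record.
        for path in [p for p, t in self._open.items() if now >= t["deadline"]]:
            del self._open[path]

    def create(self, path: str, size: int, timeout: float, now: float) -> bool:
        """Reserve `size` bytes for `path`; the reservation lapses after `timeout`."""
        self._expire(now)
        if path in self._open or size <= 0 or timeout <= 0:
            return False
        self._open[path] = {"size": size, "timeout": timeout,
                            "deadline": now + timeout, "committed": 0}
        return True

    def update(self, path: str, nbytes: int, now: float) -> bool:
        """Commit `nbytes` and reset the timeout; fails if expired or too large."""
        self._expire(now)
        txn = self._open.get(path)
        if txn is None or txn["committed"] + nbytes > txn["size"]:
            return False
        txn["committed"] += nbytes
        txn["deadline"] = now + txn["timeout"]
        return True

    def complete(self, path: str, now: float) -> Optional[int]:
        """Clean up state; return the committed byte count, or None if expired."""
        self._expire(now)
        txn = self._open.pop(path, None)
        return None if txn is None else txn["committed"]
```

Because each update resets the deadline, a slow but steady transfer keeps its reservation, while a stalled one loses it after one timeout interval.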

22 Conclusion Services must be able to protect themselves Awareness of environment (Internal & External) is key Managed Object Placement Service –Straightforward technology advancements –Capability greater than sum of parts Invitation to work together…

