GlobusWORLD 2006
Globus XIO
  The Globus Project™
  Argonne National Laboratory
  USC Information Sciences Institute
  Copyright (c) 2002 University of Chicago and The University of Southern California. All Rights Reserved.
  This presentation is licensed for use under the terms of the Globus Toolkit Public License. See for the full text of this license.
Globus XIO: eXtensible Input Output library
  Motivation
    –Application Development
    –Protocol Development/Experimentation
  Asynchronous Programming Refresher
  Basic Architecture
    –User API
    –Data Type and Common Functions
    –Example Program/globus-xioperf
  Performance/Overhead
  Driver Development
    –Wrapblock
Grid Communication
  Geographically dispersed resources
    –Compute nodes
    –Large data stores
    –Specialized scientific devices
      >APS, telescopes, environmental sensors
    –Collaborative sessions
  Varying network characteristics
    –Congested and uncongested links, LFNs (long fat networks), dedicated pipes/QoS
  Varying network protocols
    –GridFTP, HTTP, UDT, TCP, RBUDP, etc.
Typical Approach
  [Diagram: the Application reaches Disk through POSIX IO, each Network Protocol through its own Protocol API, and a Special Device through a Proprietary API]
Example Application
  [Diagram labels: Application; Collaborative Display; Video Input Stream; Proprietary API; Remote Compute Nodes; HPSS; GridFTP Client Library; HPSS Client Library]
Varying Environments
  [Diagram labels: Application; RBUDP; Mode E; TCP; Dedicated; LFN; Shared]
  Different networks have different optimal protocols
  Separate APIs for each protocol
Problems
  Development time
    –The application must use many different APIs.
    –Each API has its own semantics (and bugs).
      >Asynchronous/synchronous
      >Threaded/non-threaded
      >Different programming models in the same application
  Scalability and compatibility
    –New devices or protocols
      >The application must be modified as new protocols are invented.
      >The application must keep up with orthogonal research issues.
Observations
  Data stream IO abstraction
    –Many IO needs can be treated as a stream of bytes.
    –open/close/read/write functionality satisfies most requirements.
  Protocol details
    –The application rarely needs (or wants) to deal with protocol details.
    –Most needs can be satisfied at initialization time.
      >Example: TCP buffer sizes
Solution
  Globus XIO user API
    –Single API, single set of semantics
    –Simple open/close/read/write API
  Driver abstraction
    –Hides protocol details
    –Allows for extensibility
    –Drivers can be selected at runtime
    –Driver optimization parameters can be described at runtime
    –New protocols can be added to an already compiled application
Globus XIO Approach
  [Diagram: the Application talks only to Globus XIO; a Driver sits between Globus XIO and each of Disk, Network Protocol, and Special Device]
Example Application With Globus XIO
  [Diagram labels: Application; Collaborative Display via Globus XIO (display driver); Video Input Stream via Globus XIO (video capture driver); Remote Compute Nodes via Globus XIO (GridFTP driver); HPSS via Globus XIO (HPSS driver)]
Varying Environments
  [Diagram labels: Application; Globus XIO; Mode E Driver; RBUDP Driver; TCP Driver; Dedicated; LFN; Shared]
Creating New Drivers Faster
  Takes away mundane issues
    –Error handling
    –API design
    –Faster prototyping
      >wrapblock
  Assist API
    –Timeouts/cancels
    –Asynchronous/thread polling
    –Close/EOF barriers
Testing/Applications
  Fair performance evaluation
    –The same applications test performance
    –Apples-to-apples comparison
    –Variables are limited to protocol details
  New protocols can immediately be relevant to existing applications
    –Ex: XIO under MPIG/GridFTP
Quick Asynchronous Model Refresher
  Non-blocking
    –Many unrelated operations make progress at once
  Register events and wait for callbacks
    –A function pointer is given when the event is registered
    –When the event completes, that function is called to notify completion
  Shared state between operations
    –User-defined memory is shared across events as a void *
    –Finite state machines define the transitions
Example Async App
    typedef struct
    {
        int count_up;
        int count_down;
    } user_memory;

    /* pseudocode: one mutex and one condition variable shared by all callbacks */
    mutex;
    cond;

    main()
    {
        user_memory * um1;
        user_memory * um2;

        um1 = (user_memory *) calloc(1, sizeof(user_memory));
        um2 = (user_memory *) calloc(1, sizeof(user_memory));

        lock(mutex);
        register_callback(up_cb, um1);
        register_callback(up_cb, um2);
        /* wait until both counters have reached 100 */
        while(um1->count_up < 100 || um2->count_up < 100)
        {
            condwait(cond, mutex);
        }
        unlock(mutex);
    }

    void up_cb(void * user_arg)
    {
        user_memory * um;

        /* recover the per-operation state passed at registration */
        um = (user_memory *) user_arg;
        lock(mutex);
        um->count_up++;
        if(um->count_up < 100)
        {
            register_callback(up_cb, um);
        }
        else
        {
            condsignal(cond);
        }
        unlock(mutex);
    }
Example Async App
  user_memory structure
    –Threaded through the callbacks
    –State is re-inflated in each callback
  Multiple registrations at once
    –2 execution threads that progress at independent rates
  pthread mutex/cond variables
  Polling
    –In a threaded build there are polling threads
    –In a non-threaded build cond_wait is a macro that expands to polling code
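The register-and-callback pattern from the example above can be tried without threads. The following is a minimal single-threaded sketch, assuming a small in-memory event queue in place of the real polling code; `register_callback`, `poll_once`, and `run_two_counters` are hypothetical stand-ins written for this illustration, not Globus functions.

```c
#include <assert.h>

/* Hypothetical stand-ins for the slide's register_callback() and polling
 * code; none of this is Globus API. */
typedef void (*callback_t)(void *user_arg);

typedef struct
{
    callback_t func;
    void *user_arg;
} event_t;

#define MAX_EVENTS 16
static event_t queue[MAX_EVENTS];
static int head = 0, tail = 0, pending = 0;

/* Queue a callback to be run later by poll_once(). */
static void register_callback(callback_t func, void *user_arg)
{
    assert(pending < MAX_EVENTS);
    queue[tail].func = func;
    queue[tail].user_arg = user_arg;
    tail = (tail + 1) % MAX_EVENTS;
    pending++;
}

/* Run one queued callback; returns 0 when nothing is pending. */
static int poll_once(void)
{
    event_t ev;

    if (pending == 0)
    {
        return 0;
    }
    ev = queue[head];
    head = (head + 1) % MAX_EVENTS;
    pending--;
    ev.func(ev.user_arg);
    return 1;
}

typedef struct
{
    int count_up;
} user_memory;

/* Each callback advances its own counter and re-registers until done. */
static void up_cb(void *user_arg)
{
    user_memory *um = (user_memory *) user_arg;

    um->count_up++;
    if (um->count_up < 100)
    {
        register_callback(up_cb, um);
    }
}

/* Drive two independent counters to completion; returns callbacks run. */
static int run_two_counters(user_memory *um1, user_memory *um2)
{
    int dispatched = 0;

    register_callback(up_cb, um1);
    register_callback(up_cb, um2);
    while (um1->count_up < 100 || um2->count_up < 100)
    {
        dispatched += poll_once();
    }
    return dispatched;
}
```

Calling `run_two_counters` with two zeroed `user_memory` structures interleaves the two chains of callbacks, which mirrors the "non-threaded build" case where waiting is implemented by polling.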
Architecture
  [Diagram labels: Application; User API (Stacks, Handles, Attributes); XIO Framework; Driver stack of Transform Driver, Transform Driver, Transport Driver; net]
User API
  Simple open/close/read/write
  Hides protocol details from users
    –Hooks to reach a driver directly
      >Set protocol-specific optimization parameters
  Asynchronous operations
    –Register operations
    –Callbacks when an operation completes
    –Many operations can be outstanding at a time
    –Thread pools maximize concurrency
    –Blocking functions for convenience
Sample User API
    /* asynchronous read: returns immediately, read_callback fires on completion */
    globus_result_t result;
    globus_xio_handle_t handle;
    globus_byte_t buffer[256];
    globus_size_t size = 256;

    result = globus_xio_register_read(
        handle, buffer, size, 1, NULL, read_callback, NULL);
    if(result != GLOBUS_SUCCESS)
    {
        /* handle error */
    }

    void read_callback(
        globus_xio_handle_t handle,
        globus_result_t result,
        globus_byte_t * buffer,
        globus_size_t len,
        globus_size_t nbytes,
        globus_xio_data_descriptor_t data_desc,
        void * user_arg)
    {
        if(result != GLOBUS_SUCCESS)
        {
            /* handle error */
        }
        fwrite(buffer, 1, nbytes, stdout);
    }

    /* blocking read: returns once size bytes have arrived */
    globus_size_t nbytes;

    globus_xio_open();   /* arguments omitted */
    result = globus_xio_read(handle, buffer, size, size, &nbytes, NULL);
    fwrite(buffer, 1, nbytes, stdout);
    globus_xio_close();  /* arguments omitted */
Drivers
  Do all of the heavy lifting
  Implement the protocols
    –Ship data across a wire
    –Frame user data buffers
    –Monitor/log/alter user data buffers
  Dynamically loadable libraries
    –Implement a well-known interface
    –Loaded at runtime by the framework
  Many drivers can be chained together
    –They form a stack
Transport Drivers
  Ship data in and out of the process space
    –Across a kernel boundary to a device
    –Across a wire
    –The last driver in a stack
      >Source or sink of all data
    –XIO provides assistance for sockets
  Examples
    –TCP, UDP, File IO
Transform Drivers
  Transform
    –Alter, monitor, log, or otherwise manipulate user buffers
    –Do not move data outside of the process space
    –Change the order of operations or add operations
      >GSI handshake
    –Rely on a transport driver
  Examples
    –Compression, security, logging
    –Framing:
      >HTTP/MODE E
Stack
  An arrangement of drivers
  Transport
    –Exactly one per stack
    –Must be on the bottom
  Transform
    –Zero or many per stack
  Operation requests flow from the user down the stack.
  Read data flows up the stack
  Write data flows down the stack
  Example driver stack (top to bottom): Compression, Logging, TCP
Globus XIO Framework
  Mediates the interactions between users and drivers
  Manages the data buffers and IO operations
  Assists in the creation of drivers
    –Error handling and parameter checking
      >The driver can assume a friendly user
    –Asynchronous support
    –Close and EOF barriers
      >Guarantee no outstanding operations
      >Very helpful for minimizing race conditions
    –Internal API for leveraging other drivers
Handles
  [Diagram labels: User; operations; Handle; Driver Stack of Compression, Logging, TCP]
  Can be thought of as a connection
    –Similar to an FD
  The handle is bound to the stack.
  The user performs data operations on the handle.
  Each data operation is passed down the stack.
    –The data is compressed by the first driver.
    –The logging driver logs the exchange in syslog.
    –The TCP driver sends the compressed data across the wire.
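The way a write passes down a stack can be sketched in plain C with function pointers. This is only an illustration under stated assumptions: `demo_stack`, `upcase_write`, `log_write`, and `wire_write` are hypothetical names, upper-casing stands in for compression, a byte counter stands in for syslog, and an in-memory array stands in for the network. It is not how the Globus XIO framework is implemented.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_DRIVERS 8
#define WIRE_CAP 256

typedef struct demo_stack demo_stack;

/* A driver's write hook: process buf, then hand it to the driver below. */
typedef int (*write_fn)(demo_stack *s, int level, char *buf, size_t len);

struct demo_stack
{
    write_fn drivers[MAX_DRIVERS];  /* index 0 is the top of the stack */
    int count;
    char wire[WIRE_CAP];            /* stands in for the network */
    size_t wire_len;
    size_t logged_bytes;            /* filled in by the logging transform */
};

/* Pass an operation to the next driver down. */
static int pass_write(demo_stack *s, int level, char *buf, size_t len)
{
    assert(level + 1 < s->count);   /* a transport must sit at the bottom */
    return s->drivers[level + 1](s, level + 1, buf, len);
}

/* Transform: upper-case the buffer (a toy stand-in for compression). */
static int upcase_write(demo_stack *s, int level, char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
    {
        buf[i] = (char) toupper((unsigned char) buf[i]);
    }
    return pass_write(s, level, buf, len);
}

/* Transform: record how many bytes went by, then pass them on unchanged. */
static int log_write(demo_stack *s, int level, char *buf, size_t len)
{
    s->logged_bytes += len;
    return pass_write(s, level, buf, len);
}

/* Transport: bottom of the stack, copies bytes onto the "wire". */
static int wire_write(demo_stack *s, int level, char *buf, size_t len)
{
    (void) level;
    if (s->wire_len + len > WIRE_CAP)
    {
        return -1;
    }
    memcpy(s->wire + s->wire_len, buf, len);
    s->wire_len += len;
    return 0;
}

static void stack_init(demo_stack *s)
{
    memset(s, 0, sizeof(*s));
}

/* Add a driver below the ones already present (top-down order). */
static int stack_add_below(demo_stack *s, write_fn drv)
{
    if (s->count >= MAX_DRIVERS)
    {
        return -1;
    }
    s->drivers[s->count++] = drv;
    return 0;
}

/* User-level write: enters at the top of the stack. */
static int handle_write(demo_stack *s, char *buf, size_t len)
{
    assert(s->count > 0);
    return s->drivers[0](s, 0, buf, len);
}
```

Adding `upcase_write`, `log_write`, and `wire_write` in that order and then calling `handle_write` sends the buffer through both transforms before the transport stores it. Note that this sketch builds the stack top-down, whereas the sample program in the talk pushes the transport first.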
Sample Program
    /* build the stack bottom-up: TCP transport first, then the GSI transform */
    globus_xio_stack_init(&stack, NULL);
    globus_xio_driver_load("tcp", &driver);
    globus_xio_stack_push_driver(stack, driver);
    globus_xio_driver_load("gsi", &driver);
    globus_xio_stack_push_driver(stack, driver);

    globus_xio_handle_create(&handle, stack);
    /* use blocking open for example */
    /* contact_string is a placeholder: the literal is missing from the source */
    globus_xio_open(handle, contact_string, NULL);

    /* post a read and a write; both are outstanding at the same time */
    globus_xio_register_read(
        handle, buffer, size, size, NULL, read_callback, NULL);
    globus_xio_register_write(
        handle, buffer, size, size, NULL, write_callback, NULL);

    /* poll until both callbacks have fired */
    while(count < 2)
    {
        globus_poll();
    }
    globus_xio_close(handle, NULL);

    void read_callback(
        globus_xio_handle_t handle,
        globus_result_t result,
        globus_byte_t * buffer,
        globus_size_t len,
        globus_size_t nbytes,
        globus_xio_data_descriptor_t data_desc,
        void * user_arg)
    {
        count++;
    }

    void write_callback(
        globus_xio_handle_t handle,
        globus_result_t result,
        globus_byte_t * buffer,
        globus_size_t len,
        globus_size_t nbytes,
        globus_xio_data_descriptor_t data_desc,
        void * user_arg)
    {
        count++;
    }
Sample Program
  Just an example
    –Not thread safe
    –Return codes not checked
  Loading drivers
    –GSI transform over TCP transport
    –Loaded by string name (could be argv[1])
  Asynchronous programming
    –Multiple operations at once
    –Read and write are not serialized
    –Could be many operations on many handles
Attributes and Controls
  Change default behaviour
    –Set timeouts, change cancel behaviour, etc.
  Handle attributes
    –Associated with a handle
    –Immutable attributes: set at init/open time
    –Mutable attributes: can be changed throughout the handle's lifetime
  Driver-specific attributes
    –Hooks to the protocol optimization parameters
    –Ex: TCP buffer size, GSI subject name
Standard Controls
  Similar to UNIX ioctl()
  Enumeration of commands
    –Scoped to the driver
  Parameters depend on the specific command
  Need a reference to a driver
  Need a driver header
  Mainly used at open/init time
  Throughout the lifetime of a handle
    –globus_xio_handle_cntl()

    globus_result_t
    globus_xio_attr_cntl(
        globus_xio_attr_t attr,
        globus_xio_driver_t driver,
        int cmd,
        ...);

    example_func()
    {
        globus_xio_attr_init(&attr);
        /* non driver specific cntl */
        globus_xio_attr_cntl(
            attr, NULL, GLOBUS_XIO_ATTR_SET_TIMEOUT_OPEN, time);
        /* tcp specific control */
        globus_xio_driver_load("tcp", &tcp_driver);
        globus_xio_attr_cntl(
            attr, tcp_driver, GLOBUS_XIO_TCP_SET_NODELAY, GLOBUS_TRUE);
        /* build stack... */
        /* contact_string is a placeholder: the literal is missing from the source */
        globus_xio_open(handle, contact_string, attr);
        /* read/write logic */
    }
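The ioctl()-like shape of a cntl call, where the command number is scoped to the driver and the remaining parameters depend on the command, can be sketched with a variadic function. Everything below is an assumption made for illustration: `demo_attr_cntl`, the `DEMO_*` command values, and the `demo_attr` fields are hypothetical and are not the Globus XIO definitions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical command enumerations. The same number can mean different
 * things for different drivers because commands are scoped to a driver. */
enum { DEMO_ATTR_SET_TIMEOUT_OPEN = 1 };
enum { DEMO_TCP_SET_NODELAY = 1, DEMO_TCP_SET_SNDBUF = 2 };

typedef struct
{
    const char *name;
} demo_driver;

typedef struct
{
    int open_timeout_sec;
    int tcp_nodelay;
    int tcp_sndbuf;
} demo_attr;

/* driver == NULL selects the framework-level commands; otherwise the
 * command is interpreted in the scope of that driver. Returns 0 on
 * success and -1 for an unknown command. */
static int demo_attr_cntl(demo_attr *attr, const demo_driver *driver,
                          int cmd, ...)
{
    va_list ap;
    int rc = 0;

    va_start(ap, cmd);
    if (driver == NULL)
    {
        switch (cmd)
        {
        case DEMO_ATTR_SET_TIMEOUT_OPEN:
            attr->open_timeout_sec = va_arg(ap, int);
            break;
        default:
            rc = -1;
        }
    }
    else
    {
        /* Only one driver is modeled here, so its commands are handled
         * inline; a framework would call the driver's own cntl hook. */
        switch (cmd)
        {
        case DEMO_TCP_SET_NODELAY:
            attr->tcp_nodelay = va_arg(ap, int);
            break;
        case DEMO_TCP_SET_SNDBUF:
            attr->tcp_sndbuf = va_arg(ap, int);
            break;
        default:
            rc = -1;
        }
    }
    va_end(ap);
    return rc;
}
```

The design trade-off is the same as with ioctl(): one entry point covers every option, but the compiler cannot check that the variadic arguments match what the command expects, which is why the caller needs the driver's header.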
String Controls
  Make the application less driver aware
    –Allow the driver AND the driver options to be set via argv[1]
  Individual cntl() calls are encapsulated as a string
  The driver defines a string format according to a convention
    –key=value pairs
    –Separated by semicolons
    –TCP driver example:
      >"keepalive=Y;nodelay=N;port=5555;"
  Illustrated with XIOPerf
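The key=value convention is simple enough to parse by hand. The sketch below splits an option string of the form shown in the TCP example into pairs; `parse_opts` and `demo_opt` are hypothetical names for this illustration, and the real drivers may accept or reject inputs differently.

```c
#include <assert.h>
#include <string.h>

#define MAX_OPTS 8
#define MAX_LEN 32

typedef struct
{
    char key[MAX_LEN];
    char value[MAX_LEN];
} demo_opt;

/* Split "k=v;k=v;" into opts[]. Returns the number of pairs found, or -1
 * if a segment has no '=' or no key, or if a limit is exceeded. */
static int parse_opts(const char *spec, demo_opt *opts, int max_opts)
{
    int n = 0;
    const char *p = spec;

    while (*p != '\0')
    {
        const char *end = strchr(p, ';');
        const char *eq;
        size_t klen;
        size_t vlen;

        if (end == NULL)
        {
            end = p + strlen(p);    /* the last pair may omit the ';' */
        }
        if (end == p)
        {
            p++;                    /* skip an empty segment */
            continue;
        }
        eq = (const char *) memchr(p, '=', (size_t) (end - p));
        if (eq == NULL || eq == p)
        {
            return -1;
        }
        klen = (size_t) (eq - p);
        vlen = (size_t) (end - eq - 1);
        if (n >= max_opts || klen >= MAX_LEN || vlen >= MAX_LEN)
        {
            return -1;
        }
        memcpy(opts[n].key, p, klen);
        opts[n].key[klen] = '\0';
        memcpy(opts[n].value, eq + 1, vlen);
        opts[n].value[vlen] = '\0';
        n++;
        p = (*end == ';') ? end + 1 : end;
    }
    return n;
}
```

A driver could then walk the resulting pairs and translate each one into the matching cntl command, which is what lets an application stay unaware of individual driver options.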
XIOPerf
  Example bandwidth measuring tool
    –Similar to IPerf
    –Driver-independent transfer application
    –Dynamic building of a stack
    –Transparent setting of driver options
XIOPerf
  Server
    % globus-xioperf -D "tcp:rcvbuf=64K;sndbuf=128K;port=5555" -D gsi:subject="/DC=org/DC=doegrids/OU=People/CN=John Bresnahan " -s
    server listening on: gridftp.mcs.anl.gov:
    Connection established
    Closing connection
    Time: 00:
    Bytes recv: M
    Read BW: m/s
  Client
    % globus-xioperf -D "tcp:rcvbuf=128K;sndbuf=64K;port=5555" -D gsi:subject="/DC=org/DC=doegrids/OU=People/CN=John Bresnahan " -c gridftp.mcs.anl.gov:
    Connection established
    Time exceeded. Terminating.
    Closing connection
    Time: 00:
    Bytes sent: M
    Write BW: m/s
Performance
  How much overhead does the abstraction add?
  Interval measured: from user registration to the driver interface
    –Time from user space until the protocol can work
    –Exact time of overhead
  UC TeraGrid, 1.5 GHz dual Itanium
  Each additional driver adds about 0.125 µs
Performance
  Throughput with noop drivers
  Overhead doesn't affect bulk transfer bandwidth
Driver Development
  Native API drivers
    –Asynchronous model
    –Most scalable and efficient
    –More difficult to write
    –Can trip assertions if misused
    –Most internally developed drivers
      >TCP, GSI, UDP, HTTP, Token Bucket, ...
  Implemented with the Globus XIO user API
    –Mode E, BIDI
Wrapblock Driver Development
  Blocking API
    –Thread pooling and event callbacks morph async to sync
    –Threaded builds are recommended
  Easier to create
    –Designed to work with existing libraries
  UDT_Ref
    –Implemented in one day
GPT Packages
  Drivers should be GPT packages
  Much easier to create
  Dynamically loaded with libtool
    –GPT gives this for free
  Easier to plug into the Globus C / XIO environment
  Support for including third-party libraries
Interface Functions
  A set of function pointers
    –open/close/read/write implemented by the driver
    –cntl() functions for driver-specific hooks
    –Wrapped into a structure and registered with Globus XIO
  Calls to these functions are made expecting specific behaviours
    –Ex: the read() interface function should produce some data, and the write() interface function should consume data, etc.
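A structure of function pointers that is registered under a name and later looked up by that name can be sketched as follows. The names `demo_driver_ops`, `demo_register`, `demo_lookup`, and the loopback driver are hypothetical and exist only for this illustration; the actual Globus XIO driver structure and registration calls differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The interface every driver in this sketch must implement. */
typedef struct
{
    const char *name;
    int (*open)(void **handle);
    int (*close)(void *handle);
    int (*read)(void *handle, char *buf, size_t len, size_t *nbytes);
    int (*write)(void *handle, const char *buf, size_t len, size_t *nbytes);
} demo_driver_ops;

#define MAX_REGISTERED 8
static const demo_driver_ops *registry[MAX_REGISTERED];
static int registered = 0;

/* Make a driver available by name. */
static int demo_register(const demo_driver_ops *ops)
{
    if (registered >= MAX_REGISTERED)
    {
        return -1;
    }
    registry[registered++] = ops;
    return 0;
}

/* Find a driver by its string name; NULL if it was never registered. */
static const demo_driver_ops *demo_lookup(const char *name)
{
    int i;

    for (i = 0; i < registered; i++)
    {
        if (strcmp(registry[i]->name, name) == 0)
        {
            return registry[i];
        }
    }
    return NULL;
}

/* A toy loopback driver: write() consumes data, read() produces it. */
typedef struct
{
    char data[64];
    size_t len;
} loop_handle;

static loop_handle loop_storage;

static int loop_open(void **handle)
{
    loop_storage.len = 0;
    *handle = &loop_storage;
    return 0;
}

static int loop_close(void *handle)
{
    (void) handle;
    return 0;
}

static int loop_write(void *handle, const char *buf, size_t len,
                      size_t *nbytes)
{
    loop_handle *h = (loop_handle *) handle;

    if (len > sizeof(h->data))
    {
        len = sizeof(h->data);      /* short write when the buffer is full */
    }
    memcpy(h->data, buf, len);
    h->len = len;
    *nbytes = len;
    return 0;
}

static int loop_read(void *handle, char *buf, size_t len, size_t *nbytes)
{
    loop_handle *h = (loop_handle *) handle;

    if (len > h->len)
    {
        len = h->len;
    }
    memcpy(buf, h->data, len);
    memmove(h->data, h->data + len, h->len - len);
    h->len -= len;
    *nbytes = len;
    return 0;
}

static const demo_driver_ops loopback_driver = {
    "loopback", loop_open, loop_close, loop_read, loop_write
};
```

Because callers only ever see `demo_driver_ops`, a new driver can be added by registering another structure, without changing the code that calls `open`, `read`, or `write`.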
Example Interface Functions
    static globus_result_t
    globus_l_xio_udt_ref_read(
        void * driver_specific_handle,
        const globus_xio_iovec_t * iovec,
        int iovec_count,
        globus_size_t * nbytes)
    {
        globus_result_t result;
        xio_l_udt_ref_handle_t * handle;
        int rc;

        handle = (xio_l_udt_ref_handle_t *) driver_specific_handle;

        /* keep the signed return value: globus_size_t is unsigned, so an
         * error could not be detected after the cast */
        rc = UDT::recv(
            handle->sock, (char *)iovec[0].iov_base, iovec[0].iov_len, 0);
        /* need to figure out eof */
        if(rc <= 0)
        {
            *nbytes = 0;
            result = GlobusXIOUdtError("UDT::recv failed");
            goto error;
        }
        *nbytes = (globus_size_t) rc;

        return GLOBUS_SUCCESS;
    error:
        return result;
    }

    static globus_result_t
    globus_l_xio_udt_ref_write(
        void * driver_specific_handle,
        const globus_xio_iovec_t * iovec,
        int iovec_count,
        globus_size_t * nbytes)
    {
        globus_result_t result;
        xio_l_udt_ref_handle_t * handle;
        int rc;

        handle = (xio_l_udt_ref_handle_t *) driver_specific_handle;

        rc = UDT::send(
            handle->sock, (char *)iovec[0].iov_base, iovec[0].iov_len, 0);
        if(rc < 0)
        {
            *nbytes = 0;
            result = GlobusXIOUdtError("UDT::send failed");
            goto error;
        }
        *nbytes = (globus_size_t) rc;

        return GLOBUS_SUCCESS;
    error:
        return result;
    }
Applications
  GridFTP
    –The 4.1 server allows setting the data channel stack
      >--protocol_stack
      >--file_stack
  MPIG
    –Set different drivers for different networks
      >WAN, LAN, System
  Globus WS C core
    –Flexibility for future bindings
Other Bits
  Data descriptors
    –Associate metadata with buffers
    –Associate offsets if data is out of order
  Driver semantics
    –un/reliable, un/ordered, {bi,uni}-directional
    –Automatically converts
    –bidi driver, ordering driver
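One way to picture an offset carried alongside a buffer is a receiver that places each arriving buffer at the position its descriptor names, so arrival order stops mattering. The sketch below assumes a descriptor that holds nothing but an offset; `demo_data_desc` and `place_buffer` are hypothetical and do not reflect the contents of a real Globus XIO data descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: metadata that travels with one buffer. */
typedef struct
{
    size_t offset;  /* where this buffer belongs in the full stream */
} demo_data_desc;

/* Copy buf into dest at the offset named by the descriptor.
 * Returns -1 if the buffer would not fit inside dest. */
static int place_buffer(char *dest, size_t dest_len,
                        const char *buf, size_t len,
                        const demo_data_desc *dd)
{
    if (dd->offset > dest_len || len > dest_len - dd->offset)
    {
        return -1;
    }
    memcpy(dest + dd->offset, buf, len);
    return 0;
}
```

Placing the second half of a message before the first half still yields the message in order, because each buffer lands at its own offset.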
Related Work
  UNIX Streams
    –XIO is entirely in user space
    –Not asynchronous
    –XIO followed the asynchronous model already in the Globus Toolkit and GridFTP
Summary
  Simple abstraction to many protocols
    –User API
    –Wrapblock drivers
  Future work
    –Driver development guide (in progress)
    –RBUDP driver
  More Info – index.htmlhttp:// index.html –