Super Computing 2000
DOE SCIENCE ON THE GRID
Storage Resource Management For the Earth Science Grid
Scientific Data Management Research Group
NERSC, LBNL
Alex Sim
Arie Shoshani
What is the Earth System Grid?
  - Climate modeling: a mission-critical application area
  - High-resolution, long-duration climate modeling simulations produce tens of petabytes of data
  - Earth System Grid (ESG): a virtual collaborative environment connecting distributed centers, users, models, and data
Earth System Grid
ESG provides scientists with
  - Virtual proximity to the distributed data
  - Resources comprising a collaborative environment
ESG supports
  - Rapid transport of climate data between storage centers and users upon the user’s request
  - Integrated middleware and network mechanisms that broker and manage high-speed, secure, and reliable access to data and other resources in a wide-area system
  - A persistent testbed that provides virtual proximity and demonstrates reliable, high-performance data transport across a heterogeneous environment
Data volume and transmittal problems
  - High-speed data transfer in heterogeneous Grid environments
In this Demo
Will show
  - managing user requests
  - accessing files from multiple sites in a secure manner
  - selecting the “best” replica
Participating institutions for file replicas
  - SDSC: all the files for the demo on HPSS
  - about 15 disjoint files on disk in each of 5 locations: ISI, ANL, NCAR, LBNL, LLNL
  - some files are only on tape
  - size of files: MBs
The entire dataset is stored on HPSS at NERSC (LBNL)
  - use HRM (via CORBA) to request staging of files to HRM’s disk
  - use GSI-ftp (security-enhanced FTP) to transfer the file after it is staged
[Figure: Request Manager Coordination]
Request Manager
Request Manager: developed at LBNL
  - accepts a request to cache a set of logical file names
  - checks replica locations for each file
  - gets NWS bandwidth/latency for each replica location
  - selects the “lowest” cost location
  - initiates the transfer using GSI-FTP
  - monitors file transfer progress and responds to status commands
Client: PCMDI software (LLNL)
  - has its own “metadata” catalog
  - a lookup in the catalog generates the set of files needed to satisfy a user’s request
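The replica selection step can be illustrated with a minimal Python sketch. The slide does not define how "cost" is computed, so the formula here (latency plus file size divided by bandwidth) is an assumption, and the `Replica` class, its field names, and the host names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """One replica location with NWS-style estimates (hypothetical fields)."""
    host: str
    bandwidth_mbps: float  # estimated bandwidth, megabits per second
    latency_ms: float      # estimated latency, milliseconds


def estimated_cost(replica: Replica, file_size_mb: float) -> float:
    """Assumed cost model: latency plus transfer time, in seconds."""
    transfer_seconds = file_size_mb * 8 / replica.bandwidth_mbps
    return replica.latency_ms / 1000.0 + transfer_seconds


def select_replica(replicas: list[Replica], file_size_mb: float) -> Replica:
    """Return the replica location with the lowest estimated cost."""
    return min(replicas, key=lambda r: estimated_cost(r, file_size_mb))


replicas = [
    Replica("site-a.example.org", bandwidth_mbps=10, latency_ms=50),
    Replica("site-b.example.org", bandwidth_mbps=100, latency_ms=200),
    Replica("site-c.example.org", bandwidth_mbps=50, latency_ms=10),
]
best_for_large_file = select_replica(replicas, file_size_mb=100)
best_for_tiny_file = select_replica(replicas, file_size_mb=0.001)
```

Under this assumed model, bandwidth dominates for large files and latency dominates for very small ones, so the chosen site can differ by file size.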
FTP Services for GRID
Secured FTPs used for GRID:
  - GridFTP (developed at ANL)
      - Support for both client and server
      - Secured with grid security infrastructure (GSI)
      - Parallel streaming capability
  - gsi-wuftpd server (developed at WU)
      - Wuftp server with grid security infrastructure
  - gsi-ncftp client (ncftp.com)
      - Ncftp client with grid security infrastructure
  - gsi-pftpd (developed at SDSC)
      - For access to HPSS
      - Parallel ftp server with grid security infrastructure
Replica Catalog Service
Globus Replica Catalog
  - developed using LDAP
  - has the concept of a logical file collection
  - registers logical file names by collection
  - uses URL format for the location of a replica; this includes host machine, (port), path, file_name
  - may contain other parameters, e.g. file size
  - provides hierarchical partitioning of a collection in the catalog (does not have to reflect physical organization at any site)
  - provides a C API
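The mapping from a logical file name to replica URLs can be sketched in Python as follows. This is not the Globus Replica Catalog C API or its LDAP schema; it is a plain in-memory stand-in, and the collection name, file name, `gsiftp` URLs, and host names are all hypothetical.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical catalog: collection -> logical file name -> replica URLs.
catalog = {
    "demo-collection": {
        "file1.nc": [
            "gsiftp://host-a.example.org:2811/data/climate/file1.nc",
            "gsiftp://host-b.example.org/archive/file1.nc",
        ],
    },
}


def replica_locations(catalog, collection, logical_name):
    """Split each replica URL into host, optional port, path, and file name."""
    locations = []
    for url in catalog.get(collection, {}).get(logical_name, []):
        parts = urlparse(url)
        directory, file_name = posixpath.split(parts.path)
        locations.append({
            "host": parts.hostname,
            "port": parts.port,  # None when the URL gives no port
            "path": directory,
            "file_name": file_name,
        })
    return locations


locations = replica_locations(catalog, "demo-collection", "file1.nc")
```

The port is optional in the URL, matching the "(port)" notation on the slide, so callers should be prepared for `None`.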
Network Weather Service
Network Weather Service (NWS)
  - developed by U of Tennessee
  - requires installation at each participating host
  - provides pair-wise bandwidth/latency estimates
  - accessible through LDAP queries
Hierarchical Resource Manager
Hierarchical Resource Manager (HRM)
  - HRM: for managing access to tape resources (and staging to local disk)
      - An HRM uses a disk cache for staging
      - Functionality is generic but needs to be specialized for specific mass storage systems, e.g. HRM-HPSS, HRM-Enstore, ...
  - DRM: for managing disk resources
      - Under development
HRM Functionality
HRM functionality includes:
  - queuing of file transfer requests
  - reordering of requests to optimize Parallel FTP (ordered by files on the same tape)
  - monitoring of progress and error messages
  - rescheduling of failed transfers
  - enforcement of local resource policy
      - number of simultaneous file transfer requests
      - number of total file transfer requests per user
      - priority of users
      - fair treatment of users
Current implementation of an HRM system
Currently implemented for the HPSS system
All transfers go through HRM disk; reasons:
  - flexibility of pre-staging
  - disk is sufficiently cheap for a large cache
  - opportunity to optimize for requests for the same file
Functionality
  - Queuing file transfers
  - File queue management
  - File clustering parameter
  - Transfer rate estimation
  - Query estimation (total time)
  - Error handling
Queuing File Transfers
The number of Parallel FTPs to HPSS is limited
  - limit set by a parameter
  - parameter can be changed dynamically
HRM is multi-threaded
  - issues and monitors multiple Parallel FTPs
All requests beyond the PFTP limit are queued
File Catalog used to provide, for each file:
  - HPSS path/file_name
  - Disk cache path/file_name
  - File size
  - Tape ID
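The queuing rule above (at most a fixed number of concurrent Parallel FTPs, with the rest waiting) can be sketched in Python. This is a single-threaded illustration, not the multi-threaded HRM implementation; the `TransferQueue` class and its method names are hypothetical, and "starting" a transfer is modeled simply as moving a file name into the active set.

```python
from collections import deque


class TransferQueue:
    """Keeps at most `pftp_limit` transfers active; queues the rest in order."""

    def __init__(self, pftp_limit: int):
        self.pftp_limit = pftp_limit
        self.active = set()     # transfers currently running
        self.waiting = deque()  # requests beyond the limit, in arrival order

    def submit(self, file_name: str) -> None:
        self.waiting.append(file_name)
        self._dispatch()

    def complete(self, file_name: str) -> None:
        """Mark a transfer finished, freeing a slot for a queued request."""
        self.active.discard(file_name)
        self._dispatch()

    def set_limit(self, new_limit: int) -> None:
        """Change the limit dynamically; running transfers are not cancelled."""
        self.pftp_limit = new_limit
        self._dispatch()

    def _dispatch(self) -> None:
        while self.waiting and len(self.active) < self.pftp_limit:
            self.active.add(self.waiting.popleft())


queue = TransferQueue(pftp_limit=2)
for name in ["a", "b", "c", "d"]:
    queue.submit(name)
```

In this sketch, lowering the limit only affects future dispatches: transfers already in the active set run to completion.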
File Queue Management
Goal
  - minimize tape mounts
  - still respect the order of requests
  - do not postpone unpopular tapes forever
File clustering parameter (FCP)
  - If the file at the top of the queue is on Tape i and FCP > 1 (e.g. 4), then up to 4 files from Tape i will be selected to be transferred next
  - then, go back to the file at the top of the queue
Parameter can be set dynamically
[Figure: Order of file service; file labels F1(Ti), F3(Ti), F2(Ti), F4(Ti)]
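The file clustering rule can be sketched in Python as follows. This is one reading of the slide, not the HRM source: it assumes that the "up to FCP files" count includes the file at the top of the queue, and the function name, file names, and tape IDs are hypothetical.

```python
def service_order(queue, fcp):
    """Return the order in which queued files are served.

    queue: list of (file_name, tape_id) tuples in request order.
    fcp:   file clustering parameter (>= 1); assumed to count the top file.
    """
    pending = list(queue)
    order = []
    while pending:
        top_tape = pending[0][1]
        # Take up to `fcp` pending files on the same tape as the top file,
        # preserving their relative request order.
        batch = [i for i, (_, tape) in enumerate(pending) if tape == top_tape][:fcp]
        order.extend(pending[i] for i in batch)
        chosen = set(batch)
        pending = [f for i, f in enumerate(pending) if i not in chosen]
        # Loop back to whatever file is now at the top of the queue.
    return order


requests = [("f1", "T1"), ("f2", "T2"), ("f3", "T1"),
            ("f4", "T3"), ("f5", "T1"), ("f6", "T2")]
fifo = service_order(requests, fcp=1)
paired = service_order(requests, fcp=2)
clustered = service_order(requests, fcp=10)
```

With `fcp=1` the order is plain first-come, first-served. Larger values group more files per tape mount, and because each pass restarts from the top of the queue, the oldest pending request is always served in the next batch.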
[Figure: Reading Order from Tape for Different File Clustering Parameters. Panels: File Clustering Parameter = 1; File Clustering Parameter = 10]
[Figure: Typical Processing Flow]
[Figure: Typical Processing Flow with HRM]
Conclusion
  - Demo ran successfully at SC2000
  - Received “hottest infrastructure” award
  - Proved the ability to put together multiple middleware components using common standards, interfaces, and protocols
  - Proved the usefulness of the Storage Resource Management (SRM) concept for grid applications
  - Most difficult problem for the future: robustness in the face of
      - hardware failures
      - network failures
      - system failures
      - client failures