
Overview of GT4 Data Services

Globus Data Services: Talk Outline
Summarize capabilities and plans for data services in the Globus Toolkit Version 4.0.2:
- Extensible IO (XIO) system: a flexible framework for I/O
- GridFTP: fast, secure data transport
- The Reliable File Transfer Service (RFT): data movement services for GT4
- The Replica Location Service (RLS): a distributed registry that records the locations of data copies
- The Data Replication Service (DRS): integrates RFT and RLS to replicate and register files

The eXtensible Input/Output (XIO) System, GridFTP, and the Reliable File Transfer Service (RFT)
Bill Allcock, ANL

Technology Drivers
- Internet revolution: 100M+ hosts; collaboration & sharing are the norm
- Universal Moore's law: a factor of 10^3 every 10 years; sensors as well as computers
- Petascale data tsunami: the gating step is analysis
- And our old infrastructure?
[Figure: growth in sequenced genomes (114 genomes complete, 735 in progress), "You are here". Slide courtesy of Ian Foster]

Extensible IO (XIO) System
- Provides a framework that implements a Read/Write/Open/Close abstraction
- Drivers are written that implement the functionality (file, TCP, UDP, GSI, etc.)
- Different functionality is achieved by building protocol stacks
- GridFTP drivers will allow 3rd-party applications to easily access files stored under a GridFTP server
- Other drivers could be written to allow access to other data stores
- Changing drivers requires minimal change to the application code
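To illustrate the idea of composing behavior from a driver stack behind a fixed open/read/write/close interface, here is a minimal conceptual sketch in Python. It is not the Globus XIO C API; the driver classes and the stack-building helper are hypothetical and only mirror the transform/transport layering described above.

```python
# Conceptual sketch of an XIO-style driver stack (not the real Globus XIO API).

class Driver:
    """Base driver: each layer wraps the one below it."""
    def __init__(self, below=None):
        self.below = below
    def open(self, contact):  self.below and self.below.open(contact)
    def close(self):          self.below and self.below.close()
    def read(self, n):        return self.below.read(n)
    def write(self, data):    self.below.write(data)

class TCPTransport(Driver):          # transport driver: bottom of the stack
    def open(self, contact):
        import socket
        host, port = contact.split(":")
        self.sock = socket.create_connection((host, int(port)))
    def read(self, n):        return self.sock.recv(n)
    def write(self, data):    self.sock.sendall(data)
    def close(self):          self.sock.close()

class GzipTransform(Driver):         # transform driver: compress on write
    # (a full transform driver would also decompress on read)
    def write(self, data):
        import zlib
        self.below.write(zlib.compress(data))

def build_stack(drivers):
    """Compose drivers top-down, e.g. [GzipTransform, TCPTransport]."""
    below = None
    for cls in reversed(drivers):
        below = cls(below)
    return below

# The application only sees open/read/write/close; swapping the stack
# (e.g. adding a security or UDP driver) requires no application changes.
handle = build_stack([GzipTransform, TCPTransport])
```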

Globus XIO Framework
- Moves the data from the user to the driver stack
- Manages the interactions between drivers
- Assists in the creation of drivers (internal API for passing operations down the stack)
[Diagram: User API -> Framework -> Driver Stack (Transform, Transport)]

GridFTP
- A secure, robust, fast, efficient, standards-based, widely accepted data transfer protocol
- GridFTP Protocol:
  - Defined through the Global/Open Grid Forum
  - Multiple independent implementations can interoperate
- The Globus Toolkit supplies a reference implementation:
  - Server
  - Client tools (globus-url-copy)
  - Development libraries

GridFTP Protocol
- The FTP protocol is defined by several IETF RFCs
- Start with the most commonly used subset: standard FTP get/put operations, 3rd-party transfer
- Implement standard but often unused features: GSS binding, extended directory listing, simple restart
- Extend in various ways, while preserving interoperability with existing servers:
  - Parallel data channels
  - Striped transfers
  - Partial file transfers
  - Automatic & manual TCP buffer setting
  - Progress monitoring
  - Extended restart
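One of the extensions listed above, extended restart, requires the receiver to track which byte ranges have already arrived so that a failed transfer can resume without resending everything. The sketch below shows that bookkeeping in general terms; it is an illustration of the idea, not the GridFTP restart-marker wire format.

```python
# Illustrative "extended restart" bookkeeping: with parallel data channels,
# blocks arrive out of order, so the receiver tracks which byte ranges it
# has, and a restart can resend only the missing ranges.
def coalesce(ranges):
    """Merge overlapping or adjacent (start, end) byte ranges."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

def missing_ranges(received, file_size):
    """Byte ranges still needed to complete the file after a failure."""
    gaps, cursor = [], 0
    for start, end in coalesce(received):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < file_size:
        gaps.append((cursor, file_size))
    return gaps

# Example: parallel channels delivered these blocks before a failure.
print(missing_ranges([(0, 4_000_000), (8_000_000, 12_000_000)], 16_000_000))
# -> [(4000000, 8000000), (12000000, 16000000)]
```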

GridFTP Protocol (cont.)
Existing standards:
- RFC 959: File Transfer Protocol
- RFC 2228: FTP Security Extensions
- RFC 2389: Feature Negotiation for the File Transfer Protocol
- Draft: FTP Extensions
- GridFTP: Protocol Extensions to FTP for the Grid (Grid Forum Recommendation, GFD.20)

GT4 GridFTP Implementation
- 100% Globus code; no licensing issues
- Striping support is provided in 4.0
- IPv6 support included
- Based on XIO
- Extremely modular, to allow integration with a variety of data sources (files, mass stores, etc.):
  - Storage Resource Broker (SRB) DSI: allows use of GridFTP to access data stored in SRB systems
  - High Performance Storage System (HPSS) DSI: provides GridFTP access to hierarchical mass storage systems (HPSS 6.2) that include tape storage

[Diagram: GridFTP architecture. Client interfaces (globus-url-copy, C library, RFT for 3rd-party transfers) talk to the GridFTP server, which provides separate control and data channels, striping and fault tolerance, metrics collection, and access control. Data Storage Interfaces (DSI): POSIX, SRB, HPSS, NEST. XIO drivers: TCP, UDT (UDP), parallel streams, GSI, SSH. Below sit the file systems, I/O, and network.]

Control and Data Channels
- GridFTP (and FTP) use (at least) two separate socket connections:
  - A control channel for carrying the commands and responses
  - A data channel for actually moving the data
- The control channel and data channel can (optionally) be completely separate processes
[Diagram: control and data channels in a typical installation, as separate processes, and in a striped server]
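As an illustration of why the control and data channels are separate, the following sketch shows the shape of a third-party transfer: a client holds control connections to two servers and arranges for the data channel to flow directly between them. The ControlChannel class is a mock that only records the command sequence; the commands follow the shape of RFC 959 PASV/PORT/RETR/STOR, but this is an illustration, not a working FTP or GridFTP client (authentication, reply parsing, and error handling are omitted).

```python
# Conceptual sketch of a third-party transfer: the client drives two control
# channels and the servers move the data directly between themselves.

class ControlChannel:
    """Mock control channel that just logs commands and returns a canned reply."""
    def __init__(self, name):
        self.name = name
    def command(self, cmd):
        print(f"{self.name} <- {cmd}")
        return "227 Entering Passive Mode (10,0,0,1,78,52)"  # canned reply

def third_party_transfer(ctrl_src, ctrl_dst, path):
    # Ask the source server to listen for a data connection (passive mode);
    # its reply names the host/port where it is listening.
    reply = ctrl_src.command("PASV")

    # Tell the destination server where to connect for the data channel.
    ctrl_dst.command("PORT " + reply.split("(")[1].rstrip(")"))

    # Start the transfer: the destination stores what the source retrieves.
    # The file bytes flow server-to-server on the data channel; the client
    # only sees completion replies on the two control channels.
    ctrl_dst.command("STOR " + path)
    ctrl_src.command("RETR " + path)

third_party_transfer(ControlChannel("source"), ControlChannel("dest"),
                     "/data/example.dat")
```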

Parallel and Striped GridFTP Transfers
- A distributed GridFTP service that typically runs on a storage cluster:
  - Every node of the cluster is used to transfer data into/out of the cluster
  - The head node coordinates transfers
- Multiple NICs/internal busses lead to very high performance; maximizes use of Gbit+ WANs
[Diagram: a parallel transfer fully utilizes the bandwidth of the network interface on a single node; a striped transfer fully utilizes the bandwidth of a Gb+ WAN using multiple nodes backed by a parallel filesystem]

Striped Server Mode
- Multiple nodes work together *on a single file* and act as a single GridFTP server
- An underlying parallel file system allows all nodes to see the same file system, and it must deliver good performance (usually the limiting factor in transfer speed); i.e., NFS does not cut it
- Each node then moves (reads or writes) only the pieces of the file that it is responsible for
- This allows multiple levels of parallelism: CPU, bus, NIC, disk, etc.
  - Critical if you want to achieve better than 1 Gb/s without breaking the bank
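A minimal sketch of how a striped server might divide a single file among nodes: each node is responsible for an interleaved set of fixed-size blocks, so all nodes read or write disjoint pieces of the same file in parallel. The block size and round-robin layout here are illustrative assumptions, not the actual GridFTP striping policy.

```python
# Illustrative round-robin striping: which (offset, length) extents of a
# file does stripe node `rank` out of `num_nodes` handle?
def extents_for_node(file_size, rank, num_nodes, block_size=1 << 20):
    extents = []
    offset = rank * block_size
    while offset < file_size:
        length = min(block_size, file_size - offset)
        extents.append((offset, length))
        offset += num_nodes * block_size
    return extents

# Example: a 10 MB file striped over 4 nodes in 1 MB blocks.
# Node 0 gets offsets 0, 4 MB, 8 MB; node 1 gets 1 MB, 5 MB, 9 MB; etc.
for rank in range(4):
    print(rank, extents_for_node(10 * (1 << 20), rank, 4))
```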

TeraGrid Striping Results
- Ran a varying number of stripes
- Ran both memory-to-memory and disk-to-disk
- Memory-to-memory gave extremely good (nearly 1:1) linear scalability
- We achieved 27 Gb/s on a 30 Gb/s link (90% utilization) with 32 nodes
- Disk-to-disk we were limited by the storage system, but still achieved 17.5 Gb/s

[Figure: Memory-to-memory transfer over a 30 Gigabit/s network (San Diego — Urbana), throughput vs. degree of striping]

[Figure: Disk-to-disk transfer over a 30 Gigabit/s network (San Diego — Urbana), throughput vs. degree of striping]

Scalability Results

Lots Of Small Files (LOSF)
- Pipelining:
  - Many transfer requests outstanding at once
  - The client sends the second request before the first completes
  - The latency of the request is hidden in data transfer time
- Cached data channel connections:
  - Reuse established data channels (Mode E)
  - No additional TCP or GSI connect overhead
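A sketch of the pipelining idea, assuming a hypothetical client object with send_request() and wait_reply() operations (these are placeholders, not a real GridFTP client API): keeping several requests outstanding hides one round-trip latency per file behind the data transfers already in progress.

```python
# Illustrative pipelining over a hypothetical request/reply client.

# Serial: each small file costs a full round trip before any data moves.
def transfer_serial(client, files):
    for f in files:
        client.send_request(f)   # one round trip...
        client.wait_reply(f)     # ...per file, before the next request

# Pipelined: keep up to `window` requests in flight, so request latency
# overlaps with the transfers of the previous files.
def transfer_pipelined(client, files, window=10):
    in_flight = []
    for f in files:
        client.send_request(f)
        in_flight.append(f)
        if len(in_flight) >= window:
            client.wait_reply(in_flight.pop(0))
    for f in in_flight:
        client.wait_reply(f)
```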

“Lots of Small Files” (LOSF) Optimization: send 1 GB partitioned into equi-sized files over a 60 ms RTT, 1 Gbit/s WAN (16 MB TCP buffer).
[Figure: throughput (Mbit/s) vs. file size (KByte) / number of files. John Bresnahan et al., Argonne]

Reliable File Transfer Service (RFT)
- A service that accepts requests for third-party file transfers
- Maintains state in a database about ongoing transfers
- Recovers from RFT service failures; increased reliability because state is stored in a database
- Service interface:
  - The client can submit the transfer request, then disconnect and go away
  - Similar to a job scheduler, for transfer jobs
- Two ways to check status:
  - Subscribe for notifications
  - Poll for status (can check for missed notifications)

Reliable File Transfer (cont.)
- RFT accepts a SOAP description of the desired transfer
- It writes this description to a database
- It then uses the Java GridFTP client library to initiate 3rd-party transfers on behalf of the requestor
- Restart markers are stored in the database to allow restart in the event of an RFT failure
- Supports concurrency, i.e., multiple files in transit at the same time; this gives good performance on many small files
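To make the restart idea concrete, here is a minimal sketch of persisting per-transfer state and restart markers so that an interrupted transfer can resume from the last recorded offset after a service restart. The table layout and function names are illustrative assumptions, not RFT's actual schema or API.

```python
# Illustrative persistence of transfer state and restart markers (not RFT's schema).
import sqlite3

db = sqlite3.connect("transfers.db")
db.execute("""CREATE TABLE IF NOT EXISTS transfer (
                id INTEGER PRIMARY KEY, source TEXT, dest TEXT,
                status TEXT, restart_offset INTEGER)""")

def submit(source, dest):
    # Record the request before any bytes move, so it survives a crash.
    cur = db.execute("INSERT INTO transfer (source, dest, status, restart_offset) "
                     "VALUES (?, ?, 'pending', 0)", (source, dest))
    db.commit()
    return cur.lastrowid

def record_marker(transfer_id, offset):
    # Called periodically as bytes are confirmed written at the destination.
    db.execute("UPDATE transfer SET restart_offset = ? WHERE id = ?",
               (offset, transfer_id))
    db.commit()

def recover():
    # After a service restart, resume every unfinished transfer from its marker.
    for tid, src, dst, offset in db.execute(
            "SELECT id, source, dest, restart_offset FROM transfer "
            "WHERE status != 'done'"):
        resume_third_party_transfer(src, dst, start_at=offset)  # hypothetical call
```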

Reliable File Transfer: Third-Party Transfer
- Fire-and-forget transfer
- Web services interface
- Many files & directories
- Integrated failure recovery
- Has transferred 900K files
[Diagram: the RFT client sends SOAP messages to the RFT service and optionally receives notifications; the RFT service drives two GridFTP servers (protocol interpreter, master/slave DSI, IPC receiver and IPC link) that move the data over the data channel]

Data Transfer Comparison
[Diagram: with globus-url-copy, the client itself holds the control channels to both servers; with RFT, the client sends SOAP messages to the RFT service (with optional notifications) and the service manages the control channels on the client's behalf]

Globus Replica Location Service

Replica Management in Grids
- Data-intensive applications produce terabytes or petabytes of data, stored as millions of data objects
- Data is replicated at multiple locations for reasons of:
  - Fault tolerance: avoid single points of failure
  - Performance: avoid wide-area data transfer latencies; achieve load balancing
- Need tools for:
  - Registering the existence of data items and discovering them
  - Replicating data items to new locations

A Replica Location Service
- A Replica Location Service (RLS) is a distributed registry that records the locations of data copies and allows replica discovery
- Must perform and scale well: support hundreds of millions of objects and hundreds of clients
- E.g., the LIGO (Laser Interferometer Gravitational Wave Observatory) project:
  - RLS servers at 8 sites
  - Maintain associations between 3 million logical file names & 30 million physical file locations
- RLS is one component of a replica management system; other components include consistency services, replica selection services, reliable data transfer, etc.

A Replica Location Service
- A Replica Location Service (RLS) is a distributed registry that records the locations of data copies and allows discovery of replicas
- RLS maintains mappings between logical identifiers and target names
- The RLS framework was designed in a collaboration between the Globus project and the DataGrid project (SC2002 paper)

RLS Framework
- Local Replica Catalogs (LRCs) contain consistent information about logical-to-target mappings
- Replica Location Index (RLI) nodes aggregate information about one or more LRCs
- LRCs use soft state update mechanisms to inform RLIs about their state: relaxed consistency of the index
- Optional compression of state updates reduces communication, CPU, and storage overheads
- A membership service registers participating LRCs and RLIs and deals with changes in membership
[Diagram: a layer of Replica Location Indexes aggregating a set of Local Replica Catalogs]

Replica Location Service in Context
- The Replica Location Service is one component in a layered data management architecture
- It provides a simple, distributed registry of mappings
- Consistency management is provided by higher-level services

Components of RLS Implementation
- Common server implementation for LRC and RLI
- Front-end server:
  - Multi-threaded
  - Written in C
  - Supports GSI authentication using X.509 certificates
- Back-end server:
  - MySQL, PostgreSQL, or Oracle relational database
  - Embedded SQLite DB
- Client APIs: C, Java, Python
- Client command-line tool

RLS Implementation Features
- Two types of soft state updates from LRCs to RLIs:
  - Complete list of logical names registered in the LRC
  - Compressed updates: Bloom filter summaries of the LRC
- Immediate mode: incremental updates
- User-defined attributes: may be associated with logical or target names
- Partitioning (without Bloom filters): divide LRC soft state updates among RLI index nodes using pattern matching of logical names
- Currently, static membership configuration only (no membership service)

Alternatives for Soft State Update Configuration
- LFN list:
  - Send the list of logical names stored on the LRC
  - Can do exact and wildcard searches on the RLI
  - Soft state updates get increasingly expensive as the number of LRC entries increases: space, network transfer time, CPU time on the RLI
  - E.g., with 1 million entries, it takes 20 minutes to update MySQL on a dual-processor 2 GHz machine (CPU-limited)
- Bloom filters:
  - Construct a summary of LRC state by hashing logical names, creating a bitmap
  - Compression: updates are much smaller and faster
  - Supports a higher query rate
  - Small probability of false positives (lossy compression)
  - Lose the ability to do wildcard queries

Immediate Mode for Soft State Updates
- Immediate mode:
  - Send updates after 30 seconds (configurable) or after a fixed number (100 by default) of updates
  - Full updates are sent at a reduced rate
  - The tradeoff depends on the volatility of the data and the frequency of updates
  - Immediate mode updates the RLI quickly, reducing the period of inconsistency between LRC and RLI content
- Immediate mode usually sends less data, because full updates are sent less frequently
- Usually advantageous; an exception would be the initial loading of a large database
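A minimal sketch of the immediate-mode batching rule described above, assuming a hypothetical send_incremental_update() call on the RLI connection: changes are buffered and flushed either when a count threshold is reached or when a timer expires.

```python
# Illustrative immediate-mode batching of LRC -> RLI updates
# (thresholds match the defaults described above; the RLI call is hypothetical).
import time

class ImmediateModeSender:
    def __init__(self, rli, max_batch=100, max_delay=30.0):
        self.rli = rli
        self.max_batch = max_batch
        self.max_delay = max_delay
        self.pending = []
        self.last_flush = time.time()

    def record_change(self, logical_name, target_name, added=True):
        # Buffer each mapping change; flush when the batch is full.
        self.pending.append((logical_name, target_name, added))
        if len(self.pending) >= self.max_batch:
            self.flush()

    def tick(self):
        # Called periodically; flush if the oldest change is getting stale.
        if self.pending and time.time() - self.last_flush >= self.max_delay:
            self.flush()

    def flush(self):
        self.rli.send_incremental_update(self.pending)  # hypothetical RLI call
        self.pending = []
        self.last_flush = time.time()
```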

Performance Testing
- Extensive performance testing reported in the HPDC 2004 paper
- Performance of individual LRC (catalog) or RLI (index) servers: a client program submits operation requests to the server
- Performance of soft state updates: client LRC catalogs send updates to index servers
- Software versions: Replica Location Service, Globus Packaging Toolkit, libiODBC library, MySQL database, MyODBC library (with MySQL)

Testing Environment
- Local area network tests:
  - 100 Megabit Ethernet
  - Clients (either the client program or LRCs) on a cluster: dual Pentium III 547 MHz workstations with 1.5 gigabytes of memory running Red Hat Linux 9
  - Server: dual Intel Xeon 2.2 GHz processors with 1 gigabyte of memory running Red Hat Linux 7.3
- Wide area network tests (soft state updates):
  - LRC clients (Los Angeles): cluster nodes
  - RLI server (Chicago): dual Intel Xeon 2.2 GHz machine with 2 gigabytes of memory running Red Hat Linux 7.3

LRC Operation Rates (MySQL Backend)
- Up to 100 total requesting threads; clients and server on a LAN
- Query: request the target of a logical name
- Add: register a new mapping
- Delete: remove a mapping
[Figure: operation rates vs. number of requesting threads]

Comparison of LRC to Native MySQL Performance
- LRC overheads are highest for queries: the LRC achieves 70-80% of native rates
- Adds and deletes: ~90% of native performance for 1 client (10 threads)
- Similar or better add and delete performance with 10 clients (100 threads)

Bulk Operation Performance
- For user convenience, the server supports bulk operations, e.g., 1000 operations per request
- Adds and deletes are combined to maintain an approximately constant DB size
- For a small number of clients, bulk operations increase rates; e.g., 1 client (10 threads) performs 27% more queries and 7% more adds/deletes

Uncompressed Soft State Updates
- Perform poorly when multiple LRCs update an RLI
- E.g., with 6 LRCs of 1 million entries each updating an RLI, the average update takes ~5102 seconds in the local area
- Limiting factor: the rate of updates to the RLI database
- Advisable to use incremental updates

Bloom Filter Compression
- Construct a summary of each LRC's state by hashing logical names, creating a bitmap
- The RLI stores in memory one bitmap per LRC
- Advantages:
  - Updates are much smaller and faster
  - Supports a higher query rate (satisfied from memory rather than the database)
- Disadvantages:
  - Lose the ability to do wildcard queries, since logical names are not sent to the RLI
  - Small probability of false positives (configurable): relaxed consistency model
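A minimal Bloom filter sketch, assuming k hash functions derived from SHA-1 with different salts (an illustrative choice, not the hashing scheme RLS actually uses): adding a logical name sets k bits, and a membership query can return a false positive but never a false negative, which is why wildcard queries are lost and index consistency is relaxed.

```python
# Illustrative Bloom filter summary of an LRC's logical names
# (hashing scheme chosen for the sketch, not the one RLS uses).
import hashlib

class BloomFilter:
    def __init__(self, num_bits, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, name):
        # Derive several bit positions from salted SHA-1 digests of the name.
        for salt in range(self.num_hashes):
            digest = hashlib.sha1(f"{salt}:{name}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, logical_name):
        for pos in self._positions(logical_name):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, logical_name):
        # False positives are possible; false negatives are not.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(logical_name))

# An LRC would send only `bf.bits` to the RLI instead of its full name list.
bf = BloomFilter(num_bits=10_000_000)
bf.add("lfn://example/H-R-0000000-16.gwf")   # illustrative logical name
print(bf.might_contain("lfn://example/H-R-0000000-16.gwf"))  # True
```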

Bloom Filter Performance: Single Wide Area Soft State Update (Los Angeles to Chicago)

LRC database size  | Avg. time to send soft state update (seconds) | Avg. time for initial Bloom filter computation (seconds) | Size of Bloom filter (bits)
100,000 entries    | less than 1                                   | 2                                                        | 1 million
1 million entries  |                                               |                                                          | 10 million
5 million entries  |                                               |                                                          | 50 million

Scalability of Bloom Filter Updates
- 14 LRCs with 5 million mappings each send Bloom filter updates continuously in the wide area (unlikely in practice; represents a worst case)
- Update times increase when 8 or more clients send updates
- 2 to 3 orders of magnitude better performance than uncompressed updates (e.g., 5102 seconds with 6 LRCs)

Bloom Filter Compression Supports Higher RLI Query Rates
- Uncompressed updates: about 3000 queries per second
- Higher rates with Bloom filter compression
- Scalability limit: significant overhead to check 100 bitmaps
- Practical deployments: fewer than 10 LRCs updating an RLI

RLS Performance Summary
- Individual RLS servers perform well and scale up to:
  - Millions of entries
  - One hundred requesting threads
- Soft state updates of the distributed index scale well when using Bloom filter compression
- Uncompressed updates slow down as the size of the catalog grows; immediate mode is advisable

Current Work
- Ongoing maintenance and improvements to RLS:
  - RLS is a stable component
  - Good performance and scalability
  - No major changes to existing interfaces
- Recently added features:
  - WS-RLS: a WS-RF compatible web services interface to the existing RLS service
  - Embedded SQLite database for easier RLS deployment
  - Pure Java client implementation completed

Wide Area Data Replication for Scientific Collaborations
Ann Chervenak, Robert Schuler, Carl Kesselman (USC Information Sciences Institute); Scott Koranda (Univa Corporation); Brian Moe (University of Wisconsin Milwaukee)

Motivation
- Scientific application domains spend considerable effort managing large amounts of experimental and simulation data
- They have developed customized, higher-level Grid data management services
- Examples:
  - Laser Interferometer Gravitational Wave Observatory (LIGO) Lightweight Data Replicator system
  - High Energy Physics projects: EGEE system, gLite, LHC Computing Grid (LCG) middleware
  - Portal-based coordination of services (e.g., Earth System Grid)

Motivation (cont.)
- Data management functionality varies by application, but applications share several requirements:
  - Publish and replicate large datasets (millions of files)
  - Register data replicas in catalogs and discover them
  - Perform metadata-based discovery of datasets
  - May require the ability to validate correctness of replicas
  - In general, data updates and replica consistency services are not required (i.e., read-only accesses)
- These systems provide production data management services to individual scientific domains
- Each project spends considerable resources to design, implement & maintain its data management system, which typically cannot be re-used by other applications

Motivation (cont.)
- Long-term goals:
  - Generalize the functionality provided by these data management systems
  - Provide a suite of application-independent services
- The paper describes one higher-level data management service: the Data Replication Service (DRS)
- DRS functionality is based on the publication capability of the LIGO Lightweight Data Replicator (LDR) system
- DRS ensures that a set of files exists on a storage site: it replicates files as needed and registers them in catalogs
- DRS builds on lower-level Grid services, including:
  - Globus Reliable File Transfer (RFT) service
  - Replica Location Service (RLS)

Outline
- Description of the LDR data publication capability
- Generalization of this functionality: define characteristics of an application-independent Data Replication Service (DRS)
- DRS design
- DRS implementation in the GT4 environment
- Evaluation of DRS performance in a wide area Grid
- Related work
- Future work

A Data-Intensive Application Example: The LIGO Project
- Laser Interferometer Gravitational Wave Observatory (LIGO) collaboration
- Seeks to measure gravitational waves predicted by Einstein
- Collects experimental datasets at two LIGO instrument sites in Louisiana and Washington State
- Datasets are replicated at other LIGO sites
- Scientists analyze the data and publish their results, which may also be replicated
- Currently LIGO stores more than 40 million files across ten locations

The Lightweight Data Replicator
- LIGO scientists developed the Lightweight Data Replicator (LDR) system for data management
- Built on top of standard Grid data services:
  - Globus Replica Location Service
  - GridFTP data transport protocol
- LDR provides a rich set of data management functionality, including:
  - A pull-based model for replicating necessary files to a LIGO site
  - Efficient data transfer among LIGO sites
  - A distributed metadata service architecture
  - An interface to local storage systems
  - A validation component that verifies that files on a storage system are correctly registered in a local RLS catalog

LIGO Data Publication and Replication
Two types of data publishing:
1. Detectors at Livingston and Hanford produce data sets
   - Approx. a terabyte per day during LIGO experimental runs
   - Each detector produces a file every 16 seconds
   - Files range in size from 1 to 100 megabytes
   - Data sets are copied to the main repository at CalTech, which stores them in a tape-based mass storage system
   - LIGO sites can acquire copies from CalTech or from one another
2. Scientists also publish new or derived data sets as they perform analysis on existing data sets
   - E.g., data filtering or calibration may create new files
   - These new files may also be replicated at LIGO sites

Some Terminology
- A logical file name (LFN) is a unique identifier for the contents of a file
  - Typically, a scientific collaboration defines and manages the logical namespace
  - Guarantees uniqueness of logical names within that organization
- A physical file name (PFN) is the location of a copy of the file on a storage system
  - The physical namespace is managed by the file system or storage system
- The LIGO environment currently contains:
  - More than 25 million unique logical files
  - More than 145 million physical files stored at ten sites

Components at Each LDR Site
- Local storage system
- GridFTP server for file transfer
- Metadata Catalog: associations between logical file names and metadata attributes
- Replica Location Service:
  - Local Replica Catalog (LRC) stores mappings from logical names to storage locations
  - Replica Location Index (RLI) collects state summaries from LRCs
- Scheduler and transfer daemons
- Prioritized queue of requested files

LDR Data Publishing
- A scheduling daemon runs at each LDR site:
  - Queries the site's metadata catalog to identify logical files with specified metadata attributes
  - Checks the RLS Local Replica Catalog to determine whether copies of those files already exist locally
  - If not, puts the logical file names on a priority-based scheduling queue
- A transfer daemon also runs at each site:
  - Checks the queue and initiates data transfers in priority order
  - Queries the RLS Replica Location Index to find sites where the desired files exist
  - Randomly selects a source file from among the available replicas
  - Uses the GridFTP transport protocol to transfer the file to the local site
  - Registers the newly copied file in the RLS Local Replica Catalog
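The pull-based publishing loop above can be summarized in a short sketch. The catalog, queue, and transfer calls are hypothetical placeholders standing in for the LDR components just listed, not LDR's actual interfaces.

```python
# Illustrative sketch of the two LDR daemons (all calls are placeholders).
import random

def scheduling_daemon(metadata_catalog, lrc, queue, wanted_attributes):
    # Identify files this site should hold, and queue the ones that are missing.
    for lfn in metadata_catalog.query(wanted_attributes):
        if not lrc.has_local_copy(lfn):
            queue.put(lfn, priority=metadata_catalog.priority(lfn))

def transfer_daemon(rli, lrc, queue, local_url_prefix):
    # Pull queued files from a randomly chosen remote replica, then register them.
    while not queue.empty():
        lfn = queue.get_highest_priority()
        remote_catalogs = rli.catalogs_with(lfn)           # which LRCs know this LFN
        source_pfn = random.choice(
            [pfn for cat in remote_catalogs for pfn in cat.lookup(lfn)])
        dest_pfn = local_url_prefix + lfn
        gridftp_transfer(source_pfn, dest_pfn)             # placeholder transfer call
        lrc.register(lfn, dest_pfn)                        # record the new replica
```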

Generalizing the LDR Publication Scheme
- Want to provide a similar capability that is:
  - Independent of the LIGO infrastructure
  - Useful for a variety of application domains
- Capabilities include:
  - An interface to specify which files are required at the local site
  - Use of Globus RLS to discover whether replicas exist locally and where they exist in the Grid
  - Use of a selection algorithm to choose among available replicas
  - Use of the Globus Reliable File Transfer service and GridFTP data transport protocol to copy data to the local site
  - Use of Globus RLS to register new replicas

Relationship to Other Globus Services
At the requesting site, deploy:
- WS-RF services:
  - Data Replication Service
  - Delegation Service
  - Reliable File Transfer Service
- Pre-WS-RF components:
  - Replica Location Service (Local Replica Catalog, Replica Location Index)
  - GridFTP server

DRS Functionality
- Initiate a DRS request
- Create a delegated credential
- Create a Replicator resource
- Monitor the Replicator resource
- Discover replicas of desired files in RLS; select among replicas
- Transfer data to the local site with the Reliable File Transfer Service
- Register new replicas in RLS catalogs
- Allow client inspection of DRS results
- Destroy the Replicator resource
DRS is implemented in Globus Toolkit Version 4 and complies with the Web Services Resource Framework (WS-RF).
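The client-side flow listed above can be sketched as a sequence of calls against hypothetical service stubs. The method names below are placeholders, not the GT4 WS-RF client API; only the stage and status names ("discover", "transfer", "register", "Finished") follow the resource properties described later in the talk.

```python
# Illustrative DRS client workflow; service stubs and method names are placeholders.
import time

def replicate(delegation_svc, drs_svc, requests, lifetime_s=3600):
    # 1. Delegate a proxy credential; the Replicator uses it for RFT and RLS.
    credential_epr = delegation_svc.delegate(lifetime=lifetime_s)

    # 2. Create the Replicator resource for a set of (LFN, destination URL) pairs.
    replicator_epr = drs_svc.create_replicator(
        requests=requests,                # e.g. [("lfn:foo", "gsiftp://dest/foo")]
        credential=credential_epr,
        termination_time=time.time() + lifetime_s)

    # 3. Poll the "Status" resource property until the request finishes
    #    (a real client could instead subscribe for notifications).
    while True:
        status = drs_svc.get_resource_property(replicator_epr, "Status")
        if status in ("Finished", "Failed"):
            break
        time.sleep(10)

    # 4. Inspect per-file results, then let the resource expire at its
    #    termination time (or destroy it explicitly).
    return drs_svc.get_resource_property(replicator_epr, "Result")
```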

WSRF in a Nutshell
- Service
- State management: Resource, Resource Property (RP)
- State identification: Endpoint Reference (EPR)
- State interfaces: GetRP, QueryRPs, GetMultipleRPs, SetRP
- Lifetime interfaces: SetTerminationTime, ImmediateDestruction
- Notification interfaces: Subscribe, Notify
- ServiceGroups
[Diagram: a client holds an EPR to a Service backed by a Resource with RPs, accessed via GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, and Destroy]

Create Delegated Credential
- Initialize the user proxy certificate
- Create the delegated credential resource in the Delegation service and set its termination time
- The credential EPR is returned to the client
[Diagram for this and the following steps: the client and a service container hosting the Delegation, Data Replication, and RFT services, plus RLS (Replica Index and Replica Catalogs), GridFTP servers, and MDS]

Create Replicator Resource
- Create the Replicator resource in the Data Replication service, passing the delegated credential EPR
- Set the termination time
- The Replicator EPR is returned to the client
- The Replicator accesses the delegated credential resource

Monitor Replicator Resource
- The client periodically polls the Replicator RPs via GetRP or GetMultRP
- The Replicator resource is added to the MDS Index information service
- Subscribe to ResourceProperty changes for the “Status” RP and “Stage” RP
- Conditions may trigger alerts or other actions (Trigger service not pictured)

Query Replica Information
- Notification that the “Stage” RP value has changed to “discover”
- The Replicator queries the RLS Replica Index to find catalogs that contain the desired replica information
- The Replicator queries the RLS Replica Catalog(s) to retrieve mappings from logical names to target names (URLs)

Transfer Data
- Notification that the “Stage” RP value has changed to “transfer”
- Create the RFT Transfer resource, passing the credential EPR and setting the termination time; the Transfer resource EPR is returned
- The Transfer resource accesses the delegated credential resource
- RFT sets up the GridFTP server transfer of the file(s); data is transferred between the GridFTP server sites
- The Replicator periodically polls the “ResultStatus” RP via GetRP; when it is “Done”, it gets state information for each file transfer

Register Replica Information
- Notification that the “Stage” RP value has changed to “register”
- The Replicator registers the new file mappings in the RLS Replica Catalog
- The RLS Replica Catalog sends an update of the new replica mappings to the Replica Index

Client Inspection of State
- Notification that the “Status” RP value has changed to “Finished”
- The client inspects the Replicator state information for each replication in the request

Resource Termination
- The termination time (set by the client) eventually expires
- The resources (Credential, Transfer, Replicator) are destroyed

Performance Measurements: Wide Area Testing
- The destination for the pull-based transfers is located in Los Angeles:
  - Dual-processor, 1.1 GHz Pentium III workstation with 1.5 GBytes of memory and 1 Gbit Ethernet
  - Runs a GT4 container and deploys services including RFT and DRS, as well as GridFTP and RLS
- The remote site where the desired data files are stored is located at Argonne National Laboratory in Illinois:
  - Dual-processor, 3 GHz Intel Xeon workstation with 2 gigabytes of memory and 1.1 terabytes of disk
  - Runs a GT4 container as well as GridFTP and RLS services

DRS Operations Measured
- Create the DRS Replicator resource
- Discover source files for replication using the local RLS Replica Location Index and remote RLS Local Replica Catalogs
- Initiate a Reliable File Transfer operation by creating an RFT resource
- Perform the RFT data transfer(s)
- Register the new replicas in the RLS Local Replica Catalog

Experiment 1: Replicate 10 Files of Size 1 Gigabyte

Component of operation      | Time (milliseconds)
Create Replicator resource  | 317.0
Discover files in RLS       |
Create RFT resource         |
Transfer using RFT          |
Register replicas in RLS    |

- Data transfer time dominates
- Wide area data transfer rate of 67.4 Mbits/sec

Experiment 2: Replicate 1000 Files of Size 10 Megabytes

Component of operation      | Time (milliseconds)
Create Replicator resource  |
Discover files in RLS       | 9.8
Create RFT resource         |
Transfer using RFT          |
Register replicas in RLS    |

- The time to create the Replicator and RFT resources is larger, since state must be stored for 1000 outstanding transfers
- Data transfer time still dominates
- Wide area data transfer rate of 85 Mbits/sec