What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines.

Slides:



Advertisements
Similar presentations
GridFTP Challenges In Data Transport John Bresnahan Argonne National Laboratory The University of Chicago.
Advertisements

Cross-site data transfer on TeraGrid using GridFTP TeraGrid06 Institute User Introduction to TeraGrid June 12 th by Krishna Muriki
Data Management Expert Panel. RLS Globus-EDG Replica Location Service u Joint Design in the form of the Giggle architecture u Reference Implementation.
System Area Network Abhiram Shandilya 12/06/01. Overview Introduction to System Area Networks SAN Design and Examples SAN Applications.
Distributed Systems basics
Dr. Kalpakis CMSC 621, Advanced Operating Systems. Fall 2003 URL: Distributed System Architectures.
Esma Yildirim Department of Computer Engineering Fatih University Istanbul, Turkey DATACLOUD 2013.
GridFTP: File Transfer Protocol in Grid Computing Networks
The Google File System. Why? Google has lots of data –Cannot fit in traditional file system –Spans hundreds (thousands) of servers connected to (tens.
EEC-681/781 Distributed Computing Systems Lecture 3 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
4b.1 Grid Computing Software Components of Globus 4.0 ITCS 4010 Grid Computing, 2005, UNC-Charlotte, B. Wilkinson, slides 4b.
Introduction to client/server architecture
Data Grid Web Services Chip Watson Jie Chen, Ying Chen, Bryan Hess, Walt Akers.
Network Topologies.
GridFTP Guy Warner, NeSC Training.
1 The Google File System Reporter: You-Wei Zhang.
CSC 456 Operating Systems Seminar Presentation (11/13/2012) Leon Weingard, Liang Xin The Google File System.
Part Three: Data Management 3: Data Management A: Data Management — The Problem B: Moving Data on the Grid FTP, SCP GridFTP, UberFTP globus-URL-copy.
Globus Striped GridFTP Framework and Server Raj Kettimuthu, ANL and U. Chicago.
ALICE data access WLCG data WG revival 4 October 2013.
Data Management Kelly Clynes Caitlin Minteer. Agenda Globus Toolkit Basic Data Management Systems Overview of Data Management Data Movement Grid FTP Reliable.
Globus GridFTP: What’s New in 2007 Raj Kettimuthu Argonne National Laboratory and The University of Chicago.
Reliable Data Movement Framework for Distributed Science Environments Raj Kettimuthu Argonne National Laboratory and The University of Chicago.
GT Components. Globus Toolkit A “toolkit” of services and packages for creating the basic grid computing infrastructure Higher level tools added to this.
Why GridFTP? l Performance u Parallel TCP streams, optimal TCP buffer u Non TCP protocol such as UDT u Order of magnitude greater l Cluster-to-cluster.
Secure, Collaborative, Web Service enabled and Bittorrent Inspired High-speed Scientific Data Transfer Framework.
The Globus GridFTP Framework and Server John Bresnahan, Mike Link and Raj Kettimuthu (Presenting) Math & Computer Science Division, Argonne National Laboratory,
File and Object Replication in Data Grids Chin-Yi Tsai.
Reliable Data Movement using Globus GridFTP and RFT: New Developments in 2008 John Bresnahan Michael Link Raj Kettimuthu Argonne National Laboratory and.
Globus GridFTP and RFT: An Overview and New Features Raj Kettimuthu Argonne National Laboratory and The University of Chicago.
Reliable Data Movement Framework for Distributed Petascale Science Raj Kettimuthu Argonne National Laboratory and The University of Chicago.
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES Data Replication Service Sandeep Chandra GEON Systems Group San Diego Supercomputer Center.
UDT as an Alternative Transport Protocol for GridFTP Raj Kettimuthu Argonne National Laboratory The University of Chicago.
Managed Object Placement Service John Bresnahan, Mike Link and Raj Kettimuthu (Presenting) Argonne National Lab.
MapReduce and GFS. Introduction r To understand Google’s file system let us look at the sort of processing that needs to be done r We will look at MapReduce.
Presenters: Rezan Amiri Sahar Delroshan
Communicating Security Assertions over the GridFTP Control Channel Rajkumar Kettimuthu 1,2, Liu Wantao 3,4, Frank Siebenlist 1,2 and Ian Foster 1,2,3 1.
The Replica Location Service The Globus Project™ And The DataGrid Project Copyright (c) 2002 University of Chicago and The University of Southern California.
Fast Crash Recovery in RAMCloud. Motivation The role of DRAM has been increasing – Facebook used 150TB of DRAM For 200TB of disk storage However, there.
LFC Replication Tests LCG 3D Workshop Barbara Martelli.
GFS. Google r Servers are a mix of commodity machines and machines specifically designed for Google m Not necessarily the fastest m Purchases are based.
GridFTP GUI: An Easy and Efficient Way to Transfer Data in Grid
CEOS Working Group on Information Systems and Services - 1 Data Services Task Team Discussions on GRID and GRIDftp Stuart Doescher, USGS WGISS-15 May 2003.
GridFTP Richard Hopkins
CEDPS Data Services Ann Chervenak USC Information Sciences Institute.
HADOOP DISTRIBUTED FILE SYSTEM HDFS Reliability Based on “The Hadoop Distributed File System” K. Shvachko et al., MSST 2010 Michael Tsitrin 26/05/13.
A Managed Object Placement Service (MOPS) using NEST and GridFTP Dr. Dan Fraser John Bresnahan, Nick LeRoy, Mike Link, Miron Livny, Raj Kettimuthu SCIDAC.
A Fully Automated Fault- tolerant System for Distributed Video Processing and Off­site Replication George Kola, Tevfik Kosar and Miron Livny University.
CHAPTER 7 CLUSTERING SERVERS. CLUSTERING TYPES There are 2 types of clustering ; Server clusters Network Load Balancing (NLB) The difference between the.
 Introduction  Architecture NameNode, DataNodes, HDFS Client, CheckpointNode, BackupNode, Snapshots  File I/O Operations and Replica Management File.
AERG 2007Grid Data Management1 Grid Data Management GridFTP Carolina León Carri Ben Clifford (OSG)
ALCF Argonne Leadership Computing Facility GridFTP Roadmap Bill Allcock (on behalf of the GridFTP team) Argonne National Laboratory.
Super Computing 2000 DOE SCIENCE ON THE GRID Storage Resource Management For the Earth Science Grid Scientific Data Management Research Group NERSC, LBNL.
GridFTP Guy Warner, NeSC Training Team.
1 GridFTP and SRB Guy Warner Training, Outreach and Education Team, Edinburgh e-Science.
Cluster Computers. Introduction Cluster computing –Standard PCs or workstations connected by a fast network –Good price/performance ratio –Exploit existing.
DataGrid is a project funded by the European Commission EDG Conference, Heidelberg, Sep 26 – Oct under contract IST OGSI and GT3 Initial.
A Sneak Peak of What’s New in Globus GridFTP John Bresnahan Michael Link Raj Kettimuthu (Presenting) Argonne National Laboratory and The University of.
Diskpool and cloud storage benchmarks used in IT-DSS
dCache “Intro” a layperson perspective Frank Würthwein UCSD
Introduction to Data Management in EGI
Study course: “Computing clusters, grids and clouds” Andrey Y. Shevel
Gregory Kesden, CSE-291 (Storage Systems) Fall 2017
Gregory Kesden, CSE-291 (Cloud Computing) Fall 2016
Introduction to client/server architecture
University of Technology
Part Three: Data Management
CS 345A Data Mining MapReduce This presentation has been altered.
THE GOOGLE FILE SYSTEM.
Presentation transcript:

What is GridFTP? l High-performance, reliable data transfer protocol optimized for high-bandwidth wide-area networks l Based on FTP protocol - defines extensions for high- performance operation and security l We supply a reference implementation: u Server u Client tools (globus-url-copy) u Development Libraries l Multiple independent implementations can interoperate u Fermi Lab and U. Virginia have home grown servers that work with ours.

GridFTP l Two channel protocol like FTP l Control Channel u Communication link (TCP) over which commands and responses flow u Low bandwidth; encrypted and integrity protected by default l Data Channel u Communication link(s) over which the actual data of interest flows u High Bandwidth; authenticated by default; encryption and integrity protection optional

Performance l Disk transfer between Urbana, IL and San Diego, CA

GridFTP over UDT l GridFTP uses XIO for network I/O operations l XIO presents a POSIX-like interface to many different protocol implementations GSI TCP Default GridFTP GridFTP over UDT GSI UDT

Reliable File Transfer Service ( RFT) RFT Service RFT Client SOAP Messages Notifications (Optional) GridFTP Server GridFTP Server CC DC Persistent Store l GridFTP client l WSRF complaint fault-tolerant service

Architecture Overview RFT Client VO 1 RFT Service GridFTP Striped Server GridFTP Client A GridFTP Server GridFTP Server GridFTP Client B B VO 2 GridFTP Striped Server GridFTP Server RFT A

GridFTP Service On demand transfer service  When a connection is formed, resources are dedicated  GridFTP might say “not now”  Not a queuing service Transfer data as fast as possible Maximize resource usage  Without over heating!

What GridFTP Does Fast data transfer service Cluster to cluster copy tool Intra-cluster broadcast tool  Multi-cast transfers Scalable  Need more throughput, add more stripes

RFT Service Orchestrates transfers on client’s behalf  Third party transfers  Interacts with many GridFTP servers Sees a bigger picture VO level Queue requests  RFT should not say no Retry requests on failure  Optimizes its workload

What RFT Does Reliable service  DB backend  Recovers from GridFTP and RFT service failures Batch requests  Light weight sessions  Submit a Request  Wait for notifications Started, finished, failed, etc

GridFTP: On Demand Service Resources are limited  Data transfers are heavy weight operations  Sometimes hardware is too busy Adding another transfer can cause thrashing Collective system throughput goes down  GridFTP might say “no” Transfer requests happen immediately  We do not queue, or delay transfers  An established session means an active transfer

Why Doesn’t GridFTP Queue A GridFTP session is heavy weight  Idle sessions consume resources Backward compatible protocol Sometimes less is more  Goal: Maximize the collective throughput Sum of all active transfer rates  Too many transfers cause thrashing Results in lower collective throughput  Avoid overheating system resources It is in the systems best interest  We know what’s good for you

GridFTP Session Resources Even for an idle session  Active TCP control channel Part of the 959 protocol. A session is defined by a TCP connection  Fork/setuid process Robustness File system/OS permissions  OS buffer space Data channels require large TCP OS buffers Active transfers  Lots of memory/Net/Disk IO Avoid too small of partitions

If GridFTP Always Said Yes OOM: the out of memory handle  OS optimistic provision of TCP buffers  Random processes will be killed  Meltdown Shared FS overuse  Pushing the I/O throughput beyond optimal  Causing OOM on IOD machines Shares of bandwidth too small  1 Million transfers at 500b/s each?  OR 10 transfers at 100Mb/s each

Simultaneous Sessions Goal: Collective throughput  entire servers bytes transferred / time Not the number of transfers at once Only reasons for more than 1 connection  Provide an interactive service for many  One session does not use all of the local resource The remote side is the bottleneck  Hide control messaging overhead in another sessions data transfer payload

Remote Bottleneck Allow more than one simultaneous transfer to use all resources 10Gb/s

But We Want Queuing! May I offer you something in an RFT?  RFT says yes  Server side retries Light weight sessions  GridFTP does the heavy lifting  Queues up requests of pending transfers  Notification upon completion  Scalability Manages/Optimizes access to GridFTP Servers

GridFTP dst GridFTP src RFT Session Interactions RFT GridFTP src GridFTP dst Client Request Notification

Scalability GridFTP  Connection rejection is a feature It SHOULD say no  Intended to scale to system transfer rates Not beyond them To scale up add more nodes as stripes (dynamic backbends) ‏ Use faster NICs RFT  Intended to scale to memory It should not say no

GridFTP Broke My Cluster! GridFTP will push hardware as hard as it is allowed  But not harder sudo rm –rf /  Did sudo break the FS? ssh –u root host1 fork.bomb  Did sshd take down the host? globus-url-copy –tcp-bs 100GB  Did GridFTP break the cluster?

Resource Protection Limits need to be in place to protect  Knowing it is ok to say ‘no’ is step 1 What will hardware allow?  How fast are my disks?  How fast is my NIC?  How fast is can I send data while using the NetFS?  How many WAN transfers can I support with system memory?  How many simultaneous transfers can are reasonable to sustain?

Fast Transfer Resources CPU  Packet switching Memory  OS buffers (BWDP) ‏  User space buffers  WAN needs much more System bus Disk  Shared FS? (net also) ‏ Network  Router and LAN TCP Buffers CPU Bus

Cluster Components Disk  Shared I/O servers Net  Backplate bandwidth Systems  CPU/Memory  Are IODs and GridFTP servers co located? Shared IO Servers GridFTP Backends GridFTP Frontends

Connection Caps As a function of system memory  Cap = |mem| / (2MB + avg(BWDP)) ‏  Never more than |mem| / 4MB service gsiftp { instances = 20 socket_type = stream wait = no env += GLOBUS_LOCATION=… env += LD_LIBRARY_PATH=… server = /usr/local/globus-4.0.1/sbin/globus-gridftp-server server_args = -i -p 2811 disable = no } % globus-gridftp-server –connection-max 20

Connection Caps As a function of system bandwidth  Cap = min(FS.BW, Net.BW) / (Target average transfer rate) ‏ As a function of my gut   Best guess based on personal experience  Typically this is where collective BW plateaus

System Buffer Limits Limit the amount of OS space per conneciton  Auto tuning  16MB - 64MB % sysctl -w net.core.rmem_max= % sysctl -w net.core.wmem_max= % cat /proc/sys/net/ipv4/tcp_wmem % cat /proc/sys/net/ipv4/tcp_rmem

GFork Memory Manager Dynamically rations memory  10% of the allowed connections get 90% of the memory  Remaining session get half of available memory Allows for high connection limits  |mem| / 2MB

Conclusions GridFTP is an on demand service  OK to say no RFT is a VO level queuing service  please use it

What is OGSA DAI? l OGSA-DAI executes workflows l OGSA-DAI is not just for data access, also does data updates, transformations and delivery.

The Globus Replica Location Service l Distributed registry l Records the locations of data copies l Allows replica discovery l RLS maintains mappings between logical identifiers and target names l Must perform and scale well: u support hundreds of millions of objects u hundreds of clients l Mature and stable component of the Globus Toolkit Replica Location Indexes Local Replica Catalogs

Approach: Combine Pegasus Workflow Management with Globus Data Replication Service Workflow Planner: Pegasus Data Placement Service: Globus DRS Compute Cluster Storage Elements Jobs Data Transfer Workflow Tasks Staging Request Setup Transfers

Examples of Placement Policies Make N copies placed randomly on different sites Random One on my server, one on the same rack, one on another rack Topology-aware Query-based replication requests to push or pull data to make new replicas Publish/Subscribe Push replicas toward the “leaf” nodes (or access points) of the tree Tree-based dissemination Exploit locality of reference by creating replicas at any site where they are accessed Pervasive Place replicas at sites in order to optimize Quality-of-Service (QoS) criteria QoS Aware