
1 Modeling and Optimizing Large-Scale Wide-Area Data Transfers
Raj Kettimuthu, Gayane Vardoyan, Gagan Agrawal, and P. Sadayappan

2 Exploding data volumes
Climate: 36 TB in 2004, growing to 2,300 TB in 2012. Astronomy: MACHO et al. 1 TB, Palomar 3 TB, 2MASS 10 TB, GALEX 30 TB, Sloan 40 TB, Pan-STARRS 40,000 TB. Genomics: a 10^5 increase in data volumes in 6 years, reaching 100,000 TB.

3 Data movement
Datasets must frequently be transported over the WAN for analysis, visualization, and archival. Data movement bandwidths are not increasing at the same rate as dataset sizes, which is a major constraint for data-driven sciences. File transfer is the dominant data transfer mode, and GridFTP is widely used by scientific communities: thousands of servers deployed worldwide move more than 1 PB per day. The goal is to characterize, control, and optimize these transfers. Data movement bandwidth includes disk speeds, NIC speeds, and WAN rates. It is important not only to understand the characteristics of these transfers but also to be able to control and optimize them.

4 GridFTP
GridFTP is a high-performance, secure data transfer protocol optimized for high-bandwidth wide-area networks. It is based on the FTP protocol and defines extensions for high-performance operation and security. The Globus implementation of GridFTP is widely used. Globus GridFTP servers support usage statistics collection: the transfer type, size in bytes, start time of the transfer, transfer duration, etc. are collected for each transfer.

5 GridFTP usage log
To understand trends associated with WAN transfers, we looked at the GridFTP usage logs for a 24-hour period for the top 10 sites that transferred the most data. The transfer patterns show large variability: sometimes there are no transfers, and sometimes there are many concurrent transfers. This indicates that the overall load can vary substantially over time. While there is substantial variation over the 24-hour period, there is more stability over shorter periods. This figure shows the variance over the entire 24-hour period, the four disjoint 6-hour periods, the 24 disjoint 1-hour periods, and each disjoint 15-minute period. Variance drops significantly for shorter durations. Thus, one can measure the performance of data transfers at a certain time and obtain a good indicator of the load for the immediate future.
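A minimal sketch (synthetic throughput samples, not the paper's logs) of how the variance computation over progressively shorter disjoint windows might look:

```python
import numpy as np

# Hypothetical per-minute throughput samples over 24 hours (synthetic, in Mbps).
rng = np.random.default_rng(0)
throughput = rng.gamma(shape=2.0, scale=500.0, size=24 * 60)

def mean_window_variance(samples, window_minutes):
    """Average variance across disjoint windows of the given length."""
    n = len(samples) // window_minutes * window_minutes
    return samples[:n].reshape(-1, window_minutes).var(axis=1).mean()

for minutes in (24 * 60, 6 * 60, 60, 15):
    print(f"{minutes:4d}-minute windows: mean variance = "
          f"{mean_window_variance(throughput, minutes):,.0f}")
```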

6 Parallelism vs concurrency in GridFTP
Diagram: Data Transfer Nodes at Site A and Site B, each backed by a parallel file system, with GridFTP server processes linked by TCP connections. Parallelism = 3: a single GridFTP server process at each end uses three TCP connections for one transfer. Concurrency = 3: three GridFTP server processes at each end, each with its own TCP connection.

7 Parallelism vs concurrency
Concurrency turns out to be a more powerful control knob than parallelism for increasing throughput. For example, the average throughput for cc = 8 and p = 1 is more than 2.2 Gbps, whereas the average throughput for p = 8 and cc = 1 is less than 600 Mbps. As far as the NIC and WAN connections are concerned, parallelism and concurrency work in the same way; the multiple processes used when concurrency is increased appear to yield better I/O performance.

8 Problem formulation
Objective: control bandwidth allocation for transfers from a source to its destination(s). Most large transfers are between supercomputers, which have the ability to both store and process large amounts of data. When a site is heavily loaded, most of its bandwidth is consumed by a small number of sites. Goal: develop a simple model for GridFTP. Source concurrency (SC): the total number of ongoing transfers between endpoint A and all of its major transfer endpoints. Destination concurrency (DC): the total number of ongoing transfers between endpoint A and endpoint B. External load: all other activities on the endpoints, including transfers to other sites.

9 Modeling throughput
Linear models: models that consider only SC and DC, with a separate model built for each destination. A linear model between input variables X1, X2, ..., Xk and a target variable Y is Y' = a1*X1 + a2*X2 + ... + ak*Xk + b, where Y' is the prediction of the observed value of Y for the corresponding values of Xi. The two linear models are DT = a1*DC + a2*SC + b1 and DT = a3*(DC/SC) + b2, where DT is the throughput to the destination. Data to train and validate the models came from load variation experiments: we start with a baseline case in which all destinations have the same concurrency, increase the destination concurrency for one destination by 1, run the baseline case again, increase the destination concurrency for the next destination by 1, run the baseline again, and so on. We used three-fifths of the data for training and two-fifths for validation; after a model is fit to the training data, its error on that data is the training error, and its error on the held-out data is the validation error. Errors were above 15% for most cases.
Log models: some of the nonlinear dependencies between the terms (such as throughput saturation) can be captured by a model of the form Y' = X1^a1 * X2^a2 * ... * Xk^ak * 2^b, giving DT = SC^a4 * DC^a5 * 2^b3, or equivalently log(DT) = a4*log(SC) + a5*log(DC) + b3.
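A minimal sketch of fitting the log model by least squares on synthetic data; only the names SC, DC, and DT come from the slides, and log base 2 is used so the intercept matches the 2^b3 term:

```python
import numpy as np

# Synthetic training data: source concurrency (SC), destination concurrency (DC),
# and observed destination throughput (DT, in Mbps).
rng = np.random.default_rng(1)
SC = rng.integers(1, 16, size=60).astype(float)
DC = rng.integers(1, 8, size=60).astype(float)
DT = 900.0 * DC**0.8 / SC**0.3 * rng.lognormal(sigma=0.1, size=60)

# log2(DT) = a4*log2(SC) + a5*log2(DC) + b3 is linear in the coefficients,
# so ordinary least squares recovers a4, a5, b3.
X = np.column_stack([np.log2(SC), np.log2(DC), np.ones_like(SC)])
(a4, a5, b3), *_ = np.linalg.lstsq(X, np.log2(DT), rcond=None)

predicted = SC**a4 * DC**a5 * 2**b3
rel_err = np.abs(predicted - DT) / DT
print(f"a4={a4:.3f}, a5={a5:.3f}, b3={b3:.3f}, "
      f"mean relative error={rel_err.mean():.1%}")
```

In the setup described above, three-fifths of the data would be used for the fit and the remaining two-fifths held out to compute the validation error.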

10 Modeling throughput
The log model is better than the linear models, but errors are still high: a model based on just SC and DC is too simplistic, so we incorporate external load, i.e., the network, disk, and CPU activities outside the transfers. Two questions arise: how should the external load be measured, and how should it be included in the model(s)? The table shows the training and validation errors for the log model and the best of the non-log models. The log-based model is clearly better: the training and validation errors went down in every single case, by up to 27%. Still, the relative error rate is around 15% in many cases.

11 External load
Transfers are stable over a short duration but vary widely over the entire day. Multiple training data points with the same SC and DC were collected on different days and at different times; throughput differences for the same SC and DC are attributed to differences in external load. Three different functions for external load (EL) are considered:
EL1 = T − AT, where T is the throughput for transfer t and AT is the average throughput of all transfers with the same SC and DC as t
EL2 = T − MT, where MT is the maximum throughput of transfers with the same SC and DC as t
EL3 = T / MT
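A minimal sketch (the record layout is an assumption) of computing the three external-load functions from earlier transfers with the same SC and DC:

```python
def external_load(t, history, which="EL3"):
    """EL1 = T - AT, EL2 = T - MT, EL3 = T / MT, where T is the transfer's
    throughput and AT/MT are the average/max throughput of earlier transfers
    with the same SC and DC. `history` is a list of (sc, dc, throughput)."""
    same = [tp for sc, dc, tp in history if (sc, dc) == (t["sc"], t["dc"])]
    if not same:
        return None
    at, mt = sum(same) / len(same), max(same)
    if which == "EL1":
        return t["throughput"] - at
    if which == "EL2":
        return t["throughput"] - mt
    return t["throughput"] / mt
```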

12 Models with external load
Linear: DT = a6*DC + a7*SC + a8*EL + b4
Log: DT = SC^a9 * DC^a10 * AEL^a11 * 2^b5, where AEL^a11 = EL^a11 if EL > 0, and |EL|^(−a11) otherwise
Note that relative error rates for all destinations are less than 10% with the log + EL3 model and less than or equal to 15% for all the models.
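A minimal sketch of evaluating the log + EL model, with the piecewise AEL term as written on the slide; the coefficient values in the example call are made up:

```python
def ael(el, a11):
    """Piecewise AEL term from the slide: EL^a11 if EL > 0, |EL|^(-a11) otherwise.
    (EL == 0 is not covered on the slide and is not handled here.)"""
    return el ** a11 if el > 0 else abs(el) ** (-a11)

def predict_dt(sc, dc, el, a9, a10, a11, b5):
    """Log + EL model: DT = SC^a9 * DC^a10 * AEL^a11 * 2^b5."""
    return sc ** a9 * dc ** a10 * ael(el, a11) * 2 ** b5

# Illustrative call with made-up coefficients (not fitted values from the paper).
print(predict_dt(sc=8, dc=2, el=0.9, a9=-0.3, a10=0.8, a11=0.5, b5=9.5))
```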

13 Calculating external load in practice
In the model DT = a6*DC + a7*SC + a8*EL + b4, DC is the control knob, SC is given, and EL is unknown. Unlike SC and DC, the external load cannot be observed directly. Multiple data points with the same SC and DC were used to train the models, but in practice there may not be any recent transfers with the same SC and DC. If there are some recent transfers, the external load is unlikely to change substantially over a few minutes. Three strategies: use the most recent transfer's load as the current load; use the average load of transfers in the past 30 minutes as the current load; use the average load in the past 30 minutes with error correction.
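A minimal sketch (record layout and helper names are assumptions) of the two simpler strategies for estimating the current external load from recent transfers:

```python
from dataclasses import dataclass

@dataclass
class TransferRecord:          # assumed record layout, for illustration only
    end_time: float            # completion time, seconds since epoch
    el: float                  # external load inferred for that transfer

def most_recent_load(history):
    """Previous-transfer method: reuse the latest transfer's external load."""
    return max(history, key=lambda r: r.end_time).el

def recent_average_load(history, now, window_s=30 * 60):
    """Recent-transfers method: average external load over the past 30 minutes."""
    recent = [r.el for r in history if now - r.end_time <= window_s]
    return sum(recent) / len(recent) if recent else None
```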

14 Recent transfers load with error correction
Previous transfer method: the external load of the most recent transfer is taken as known and used to compute DT = a6*DC + a7*SC + a8*EL + b4.
Recent transfers method: the external load is averaged over the transfers in the past 30 minutes.
Recent transfers with error correction: a correction term derived from historic transfers is added, DT = a6*DC + a7*SC + a8*EL + b4 + e.
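The slide does not spell out how the correction term e is computed; one plausible sketch is to use the mean residual between observed and predicted throughput over historic transfers (helper names assumed):

```python
def correction_term(historic, predict_dt):
    """Mean residual e = observed DT - predicted DT over historic transfers.
    Each record is a dict with keys 'dc', 'sc', 'el', 'dt' (assumed layout)."""
    residuals = [r["dt"] - predict_dt(r["dc"], r["sc"], r["el"]) for r in historic]
    return sum(residuals) / len(residuals) if residuals else 0.0

def predict_with_correction(dc, sc, el, predict_dt, historic):
    """DT = a6*DC + a7*SC + a8*EL + b4 + e, with e estimated from past residuals."""
    return predict_dt(dc, sc, el) + correction_term(historic, predict_dt)
```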

15 Applying models to control bandwidth
Experimental setup: DTNs at 5 XSEDE sites (source: TACC; destinations: PSC, NCAR, NICS, Indiana, SDSC). Goal: control bandwidth allocation to destinations when the source is saturated. The models express throughput in terms of SC, DC, and EL; given a target throughput, determine the DC needed to achieve it. Often more than one destination is transferring data, so SC is also unknown. DC is limited to 20 to narrow the search space, but even then there is a large number of possible DC combinations (20^n). Heuristics limit the search space to (SCmax − ND + 1).
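The heuristic itself is not detailed on the slide. One plausible reading is that, with the source saturated, the per-destination concurrencies sum to roughly SCmax, so each destination's DC ranges over at most SCmax − ND + 1 values; a brute-force sketch under that assumption (interface names are made up):

```python
from itertools import product

def pick_concurrencies(targets, predict_dt, sc_max):
    """Pick a destination-concurrency allocation whose predicted throughputs best
    match per-destination targets, assuming the allocations sum to sc_max because
    the source is saturated. `predict_dt(dest, dc, sc)` is the fitted model
    (assumed interface); `targets` maps destination -> desired throughput."""
    dests = list(targets)
    max_dc = sc_max - len(dests) + 1          # every other destination gets >= 1
    best, best_err = None, float("inf")
    for combo in product(range(1, max_dc + 1), repeat=len(dests)):
        if sum(combo) != sc_max:
            continue
        err = sum(abs(predict_dt(d, dc, sc_max) - targets[d])
                  for d, dc in zip(dests, combo))
        if err < best_err:
            best, best_err = dict(zip(dests, combo)), err
    return best
```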

16 Experiments
Ratio experiments: allocate the available bandwidth at the source to the destinations using a predefined ratio, so each destination achieves a specific fraction of the bandwidth; four ratio combinations were used, picked based on the maximum throughputs that can be independently achieved by these destinations in various tests. Factoring experiments: increase a destination's throughput by a given factor when the source is saturated, e.g., a bandwidth increase granted because of certain priorities. Four models/methods (log EL1/EL3 models and RT/RTEC methods) were used. They were effective in predicting the throughputs: 83.6% of the errors are below 15%, and 65.5% of them are below 10%.

17 Results – Ratio experiments
Ratios are 4:5:6:8:9 for Kraken, Mason, Blacklight, Gordon, and Yellowstone; concurrencies picked by the algorithm were {1,3,3,1,1}; model: log with EL1; method: RTEC.
Ratios are 4:5:6:8:9 for Kraken, Mason, Blacklight, Gordon, and Yellowstone; concurrencies picked by the algorithm were {1,4,3,1,1}; model: log with EL3; method: RT.

18 Results – Factoring experiments
Increasing Yellowstone's baseline throughput by 1.5x: the concurrency picked by the algorithm for Yellowstone was 3. Increasing Gordon's baseline throughput by 2x: the concurrency picked by the algorithm for Gordon was 5.

19 Related work
Several models exist for predicting behavior and finding the optimal number of parallel TCP streams, but they assume uncongested networks or rely on simulations. Several studies developed models to find the optimal number of streams and the TCP buffer size for GridFTP; tuning the buffer size is not needed with TCP autotuning. The major difference in our work is the attempt to model GridFTP throughput based on end-to-end behavior: end-system load, destinations' capabilities, and concurrent transfers. There are also many studies on bandwidth allocation at the router; our focus is application-level control.

20 Summary
We set out to understand the performance of WAN transfers and to control bandwidth allocation at the GridFTP level for transfers between major supercomputing centers. Concurrency is a more powerful control knob than parallelism. Models help control bandwidth allocation: log models that combine total source concurrency, destination concurrency, and a measure of external load are effective, and methods that use both recent and historical experimental data are better at estimating external load.

21 Questions

