A Comparison of Web100, Active, and Passive Methods for Throughput Calculation
I-Heng Mei
8/30/02
Summary of Methods
- Active: throughput reported directly by the application
- Passive: Netflow records indicating the time and number of bytes transferred per flow (routing tables)
- Web100: exposes TCP connection variables through the /proc file system in Linux
  - For throughput: bytes sent and time for each connection
Tasks
- Integrate Web100 into bw-tests
- Stream-by-stream comparison between passive and Web100 data
- Overall correlation between active, passive, and Web100 throughputs
Integrating Web100: Old Way
- Use the Web100 “userland” scripts to:
  - Query for existing TCP connections
  - Query for the desired variables (~20) for each connection
- Why it didn’t work for us:
  - Lots of process overhead
  - Web100 forgets connections after 5 to 10 seconds, so not all variables were recorded for all streams
Integrating Web100: New Way
- Interact directly with the Web100 API (C/C++); see the sketch below
- Dramatically reduces overhead
- All variables can be recorded for all streams
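A minimal sketch of this direct-API approach, reading only DataBytesOut for brevity. The function names follow the libweb100 userland API as commonly used (web100_attach, web100_agent_find_var_and_group, web100_connection_head/next, web100_snap, web100_snap_read); treat the exact signatures and return conventions as assumptions to verify against the installed web100.h.

```c
/*
 * Sketch: read DataBytesOut for every TCP connection Web100 currently
 * tracks, using the libweb100 userland API directly (no helper scripts).
 * API names/signatures are assumptions to verify against <web100/web100.h>.
 */
#include <stdio.h>
#include <web100/web100.h>

int main(void)
{
    web100_agent *agent = web100_attach(WEB100_AGENT_TYPE_LOCAL, NULL);
    if (agent == NULL) {
        fprintf(stderr, "web100_attach failed (is the Web100 kernel patch loaded?)\n");
        return 1;
    }

    /* Locate the variable used for throughput: DataBytesOut. */
    web100_group *group;
    web100_var *var;
    if (web100_agent_find_var_and_group(agent, "DataBytesOut", &group, &var) != 0) {
        fprintf(stderr, "DataBytesOut not found\n");
        web100_detach(agent);
        return 1;
    }

    /* Walk every connection and snapshot the variable before Web100
     * forgets the connection a few seconds after it closes. */
    for (web100_connection *conn = web100_connection_head(agent);
         conn != NULL;
         conn = web100_connection_next(conn)) {
        web100_snapshot *snap = web100_snapshot_alloc(group, conn);
        if (snap == NULL)
            continue;
        if (web100_snap(snap) == 0) {   /* 0 == success assumed here */
            char buf[32];
            web100_snap_read(var, snap, buf);
            printf("cid %d: DataBytesOut = %s\n",
                   web100_get_connection_cid(conn),
                   web100_value_to_text(web100_get_var_type(var), buf));
        }
        web100_snapshot_free(snap);
    }

    web100_detach(agent);
    return 0;
}
```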
Stream-by-Stream Comparison
- For each transfer, create a table that lists the stream-by-stream stats given by the passive, Web100, and active (if available) methods; an illustrative layout is sketched below.
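As an illustration of the table layout only, a small C sketch that prints one such per-transfer comparison table; all numbers below are hypothetical placeholders, not measured data.

```c
/*
 * Sketch of the per-transfer comparison table: for each stream, print the
 * megabytes and elapsed time reported by the passive (Netflow), Web100,
 * and active (application) methods side by side.
 */
#include <stdio.h>

struct stream_stat {
    double mbytes;   /* megabytes reported for this stream */
    double seconds;  /* elapsed time reported for this stream */
};

static void print_row(int stream, struct stream_stat passive,
                      struct stream_stat web100, struct stream_stat active)
{
    printf("%6d | %8.1f %6.1f | %8.1f %6.1f | %8.1f %6.1f\n",
           stream,
           passive.mbytes, passive.seconds,
           web100.mbytes,  web100.seconds,
           active.mbytes,  active.seconds);
}

int main(void)
{
    printf("stream |  passive (MB, s)  |  web100 (MB, s)   |  active (MB, s)\n");

    /* Hypothetical 2-stream transfer. */
    struct stream_stat passive[] = { { 126.3, 10.4 }, { 125.8, 10.5 } };
    struct stream_stat web100[]  = { { 124.9, 10.2 }, { 124.5, 10.3 } };
    struct stream_stat active[]  = { { 124.9, 10.0 }, { 124.5, 10.0 } };

    for (int i = 0; i < 2; i++)
        print_row(i + 1, passive[i], web100[i], active[i]);
    return 0;
}
```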
Stream-by-Stream: Passive vs. Web100
- Netflow always reports more bytes sent.
  - Expected, since Netflow counts TCP headers and retransmissions, whereas DataBytesOut (the Web100 variable) does not.
- Netflow consistently reports a slightly longer elapsed time.
  - Web100 does not have an ‘elapsed time’ variable; elapsed time has to be derived from the Sender-/Receiver-/Congestion-Limited state times.
Stream-by-Stream: Active vs. Web100 (Iperf only)
- Bytes transferred are nearly identical. What causes the small discrepancies?
- Iperf reports a smaller elapsed time.
  - Expected: Iperf only counts the time spent transferring data, whereas Web100 counts the entire lifetime of the connection.
- Declining pattern.
Stream-by-Stream: Active vs. Passive (Iperf only)
- Iperf reports fewer bytes sent.
  - Expected, since retransmissions and TCP headers are counted by Netflow.
- Iperf reports a smaller elapsed time.
  - Expected: Netflow counts the entire lifetime of the connection as the elapsed time.
- Same declining pattern as for Active/Web100.
Overall Correlation: Throughput Calculation Methods
- Method 1: SUM(Mbits per stream / Time per stream)
- Method 2: SUM(Mbits per stream) / AVG(Time per stream)
- Method 3: SUM(Mbits per stream) / MAX(Time per stream)
- If all streams have the same elapsed time, then Method 1 == Method 2 == Method 3 (a sketch of all three follows).
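A minimal C sketch of the three methods. The per-stream megabit and second arrays are hypothetical; all streams take the same time here, so the three results coincide.

```c
/* Sketch of the three aggregate throughput calculations. */
#include <stdio.h>

/* Method 1: sum of per-stream throughputs. */
static double method1(const double *mbits, const double *secs, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += mbits[i] / secs[i];
    return sum;
}

/* Method 2: total megabits divided by the average stream time. */
static double method2(const double *mbits, const double *secs, int n)
{
    double total = 0.0, time_sum = 0.0;
    for (int i = 0; i < n; i++) {
        total += mbits[i];
        time_sum += secs[i];
    }
    return total / (time_sum / n);
}

/* Method 3: total megabits divided by the longest stream time. */
static double method3(const double *mbits, const double *secs, int n)
{
    double total = 0.0, max_time = 0.0;
    for (int i = 0; i < n; i++) {
        total += mbits[i];
        if (secs[i] > max_time)
            max_time = secs[i];
    }
    return total / max_time;
}

int main(void)
{
    /* Hypothetical 4-stream transfer: equal times, so all methods agree. */
    double mbits[] = { 100.0, 100.0, 100.0, 100.0 };
    double secs[]  = {  10.0,  10.0,  10.0,  10.0 };
    int n = 4;
    printf("method 1: %.1f Mbits/s\n", method1(mbits, secs, n));
    printf("method 2: %.1f Mbits/s\n", method2(mbits, secs, n));
    printf("method 3: %.1f Mbits/s\n", method3(mbits, secs, n));
    return 0;
}
```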
Overall Correlation: Correlation Tables
- One table for Passive/Web100, one for Active/Web100, and one for Active/Passive.
- Example row: 103 Iperf test runs (samples) from SLAC to Caltech.
  - Two data sets:
    - X: the set of 103 passive throughputs (using method 1)
    - Y: the set of 103 Web100 throughputs (using method 1)
  - The important stats are R and the error (sketched below).
- Each row corresponds to a unique combination of test, remote host, and calculation method.
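A sketch of the R and error statistics for two throughput series. R is taken to be the usual Pearson correlation coefficient; the slides do not spell out how the error is defined, so the mean relative difference used here is an assumption, and the throughput values are hypothetical.

```c
/*
 * Correlation (R) and an assumed error statistic for two throughput
 * series, e.g. passive vs. Web100 throughputs for the same test runs.
 */
#include <math.h>
#include <stdio.h>

/* Pearson correlation coefficient of x and y. */
static double pearson_r(const double *x, const double *y, int n)
{
    double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
    for (int i = 0; i < n; i++) {
        sx += x[i]; sy += y[i];
        sxx += x[i] * x[i]; syy += y[i] * y[i];
        sxy += x[i] * y[i];
    }
    double cov  = sxy - sx * sy / n;
    double varx = sxx - sx * sx / n;
    double vary = syy - sy * sy / n;
    return cov / sqrt(varx * vary);
}

/* Assumed error definition: mean relative difference between the series. */
static double mean_relative_error(const double *x, const double *y, int n)
{
    double err = 0;
    for (int i = 0; i < n; i++)
        err += fabs(x[i] - y[i]) / x[i];
    return err / n;
}

int main(void)
{
    /* Hypothetical throughputs (Mbits/s) for a handful of runs. */
    double passive[] = { 92.1, 88.4, 95.0, 90.2, 93.7 };
    double web100[]  = { 91.8, 88.0, 94.6, 89.9, 93.3 };
    int n = 5;
    printf("R = %.4f, error = %.4f\n",
           pearson_r(passive, web100, n),
           mean_relative_error(passive, web100, n));
    return 0;
}
```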
Passive/Web100 Correlation
- Very highly correlated, with very low error, for all tests.
- Summary stats: almost all rows have R ~ 1.
- Exceptions are mostly due to “long flows”:
  - A long flow is when one or more streams in a transfer report a grossly exaggerated elapsed time.
  - Occurs most often in the bbcp* tests.
Passive/Web100 Correlation: Effects of Long Flows
- On several occasions, the bbcpmem transfer to node1.nersc.gov suffered from a long flow.
- Method 2 and method 3 throughputs are severely lowered.
- Why is method 1 still highly correlated?
  - Each time a bbcpmem transfer to nersc.gov experienced a long flow, it was exactly 1 of the 8 streams that was affected.
  - The method 1 throughput for a transfer is not affected much when the long flows are few compared to the total number of flows; see the worked example below.
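A worked example with hypothetical numbers: 8 streams of 100 Mbits each, nominally 10 s per stream, with one stream's record showing an exaggerated 100 s elapsed time.

```c
/*
 * Hypothetical long-flow scenario: 8 streams x 100 Mbits, nominally 10 s
 * each, but one stream's elapsed time is reported as 100 s.
 */
#include <stdio.h>

int main(void)
{
    double mbits[8] = { 100, 100, 100, 100, 100, 100, 100, 100 };
    double secs[8]  = {  10,  10,  10,  10,  10,  10,  10, 100 }; /* long flow */
    int n = 8;

    double m1 = 0, total = 0, time_sum = 0, max_time = 0;
    for (int i = 0; i < n; i++) {
        m1 += mbits[i] / secs[i];           /* method 1 accumulates per-stream rates */
        total += mbits[i];
        time_sum += secs[i];
        if (secs[i] > max_time)
            max_time = secs[i];
    }

    printf("method 1: %5.1f Mbits/s (would be 80.0 without the long flow)\n", m1);
    printf("method 2: %5.1f Mbits/s\n", total / (time_sum / n));
    printf("method 3: %5.1f Mbits/s\n", total / max_time);
    return 0;
}
```

With these numbers method 1 gives 71.0 Mbits/s, method 2 about 37.6, and method 3 only 8.0, which is why method 1 stays usable when a single stream out of many goes long.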
Active/Web100 Correlation
- Good correlation and low error for all tests except Bbftp (summary stats).
- For Bbftp, only method 3 works:
  - Bbftp considers the elapsed time to be the duration of the whole process, which includes a lot of connection setup time.
  - Bbftp streams tend to vary greatly in elapsed time.
- For the other tests, any method provides good correlation between active and Web100.
- Still, there are cases of low correlation / high error. What causes those?
Low Correlation (Active/Web100)
- Not caused by long flows (those only affect Netflow).
- Example: Bbcpdisk to node1.mcs.anl.gov shows low correlation and high error.
  - The range of throughput values reported by Bbcpdisk is significantly larger than the values calculated from the Web100 data.
- Caused by “lingering sockets” (past study):
  - Bbcp makes system calls to close its sockets, but they linger for some time while the kernel finishes closing them.
  - The lingering time tends to be longer for transfers with many simultaneous streams and a large RTT.
  - This is not a ‘random’ event like long flows; it is a consistent occurrence.
Active/Passive Correlation
- Good correlation and low error for all tests except Bbftp (summary stats).
- Suffers from both long flows and lingering sockets.
- Again, only method 3 works for Bbftp.
- For the other tests, method 1 is best (it alleviates the long-flow problem).
Conclusions
- Overall, the active, passive, and Web100 throughputs are all well correlated.
- Long flows in the passive data:
  - Either ignore transfers with long flows, or
  - Use calculation method 1 for those tests, if applicable.
- The lingering-sockets problem:
  - Unavoidable for passive data.
  - With Web100 it can be handled if we constantly monitor the throughput variables during the transfer (currently we only look at the variables at the end of a transfer); see the sketch below.
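A sketch of that monitoring idea, assuming DataBytesOut is sampled once a second (the sample values below are hypothetical): the effective end of the transfer is the last sample at which the byte count still grew, so the lingering-socket tail can be trimmed off the elapsed time.

```c
/*
 * Trim the lingering-socket tail from an elapsed-time estimate by using
 * periodic (time, DataBytesOut) samples instead of the connection close.
 */
#include <stdio.h>

int main(void)
{
    /* Hypothetical samples, one per second since the connection opened. */
    double t[]     = { 0, 1, 2, 3, 4, 5, 6, 7, 8 };
    double bytes[] = { 0, 12e6, 25e6, 37e6, 50e6, 50e6, 50e6, 50e6, 50e6 };
    int n = 9;

    /* Find the last sample where the byte count was still increasing. */
    int last = 0;
    for (int i = 1; i < n; i++)
        if (bytes[i] > bytes[i - 1])
            last = i;

    double effective_secs = t[last] - t[0];
    double lingering_secs = t[n - 1] - t[last];
    printf("effective transfer time: %.1f s (lingering tail: %.1f s)\n",
           effective_secs, lingering_secs);
    printf("throughput: %.1f Mbits/s\n",
           bytes[last] * 8 / 1e6 / effective_secs);
    return 0;
}
```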
Conclusions (continued)
- Consider how the application calculates throughput:
  - Iperf sums the throughputs of the individual streams, so use method 1.
  - Bbftp divides total bytes by an elapsed time that covers the entire process, so use method 3 (or a variation that uses ‘absolute time’).
  - Bbcp* also divides total bytes by elapsed time, taking the elapsed time of the entire transfer, so use method 3 (but consider method 1 to alleviate the effect of long flows).