Transport Level Protocol Performance Evaluation for Bulk Data Transfers
Matei Ripeanu
The University of Chicago
http://www.cs.uchicago.edu/~matei/

Abstract: Before developing new protocols targeted at bulk data transfers, the achievable performance and limitations of the widely used TCP protocol should be carefully investigated. Our first goal is to explore TCP's bulk transfer throughput as a function of network path properties, number of concurrent flows, loss rates, competing traffic, etc. We use analytical models, simulations, and real-world experiments. The second objective is to repeat this evaluation for some of TCP's replacement candidates (e.g. NETBLT). This should allow an informed decision on whether to put effort into developing and/or using new protocols specialized for bulk transfers.

Application requirements (GriPhyN):
  Efficient management of 10s to 100s of PetaBytes (PB) of data, with many PBs of new raw data per year.
  Granularity: file size 10M to 1G.
  Large pipes: OC3 and up, high latencies.
  Efficient bulk data transfers.
  Gracefully share with other applications.
  Projects: CMS, ATLAS, LIGO, SDSS

(Rough) analytical steady-state throughput estimates (based on [Math96])

Main inefficiencies TCP is blamed for:
  Overhead. However, less than 15% of time is spent in TCP processing proper.
  Flow control. Claim: a rate-based protocol would be faster. However, there is no proof that this is better than (self) ACK-clocking.
  Congestion control. Underlying problem: the underlying layers do not give explicit congestion feedback, so TCP assumes any packet loss is a congestion signal. Not scalable.

Questions: Is TCP appropriate/usable? What about rate-based protocols?
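The formula behind the rough steady-state estimate did not survive in this transcript. As an illustration only, the sketch below assumes the widely cited approximation throughput ≈ (MSS / RTT) · C / sqrt(p), with C = sqrt(3/2) for periodic loss and one ACK per segment. Whether the poster used exactly this form and constant is an assumption, and the function names `steady_state_throughput_bps` and `flows_to_fill_link` are hypothetical helpers, not part of the original work.

```python
import math

# Assumed constant: periodic loss, an ACK for every segment.
# Delayed ACKs or random loss would give a smaller constant.
C_PERIODIC_LOSS = math.sqrt(3.0 / 2.0)


def steady_state_throughput_bps(mss_bytes, rtt_s, loss_rate, c=C_PERIODIC_LOSS):
    """Approximate steady-state throughput of one TCP flow, in bits/s.

    Uses the assumed model (MSS / RTT) * C / sqrt(p); it is only
    meaningful for a non-zero loss rate p.
    """
    if mss_bytes <= 0 or rtt_s <= 0:
        raise ValueError("MSS and RTT must be positive")
    if not 0.0 < loss_rate <= 1.0:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8.0 / rtt_s) * c / math.sqrt(loss_rate)


def flows_to_fill_link(link_bps, mss_bytes, rtt_s, loss_rate):
    """Idealized count of parallel flows needed to reach link capacity.

    Assumes flows are independent and each sees the same loss rate,
    which ignores the congestion losses the flows cause each other.
    """
    per_flow = steady_state_throughput_bps(mss_bytes, rtt_s, loss_rate)
    return math.ceil(link_bps / per_flow)


if __name__ == "__main__":
    # Parameters taken from the poster: OC3 (155 Mbps), 80 ms RTT, 0.1% loss.
    for mss in (1460, 9000):
        bps = steady_state_throughput_bps(mss, 0.080, 0.001)
        n = flows_to_fill_link(155e6, mss, 0.080, 0.001)
        print(f"MSS={mss}: {bps / 1e6:.2f} Mbit/s per flow, {n} flows to fill OC3")
```

Under these assumptions a single flow with MSS=1460 reaches only a few Mbit/s on an OC3 path with 80 ms RTT and 0.1% loss, which suggests why larger segments and parallel flows are attractive when link losses are high.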
Want to optimize:
  Link utilization
  Per-file transfer delay
  While maintaining "fair" sharing

TCP Refresher (figure, x-axis: Time):
  Slow Start (exponential growth)
  Congestion Avoidance (linear growth)
  Fast retransmit
  Packet loss discovered through fast recovery mechanism
  Packet loss discovered through timeout

Simulations (using NS []):
Simulation topology:
Significant throughput improvements can be achieved just by tuning the end-systems and the network path: set up proper window sizes, disable delayed ACK, use SACK and ECN, use jumbo frames, etc.
For high link loss rates, striping is a legitimate and effective solution.
  OC3 link, 80ms RTT, MSS=1460 initially
  OC12 link, 100ms RTT, MSS=1460 initially
  1Gbps, 1ms RTT links
  OC3, 35ms or OC12, 45ms

TCP striping issues
Widespread usage exposes scaling problems in the TCP congestion control mechanism:
  Unfair allocation: a small number of flows grabs almost all available bandwidth.
  Reduced efficiency: a large number of packets are dropped.
  Rule of thumb: have fewer flows in the system than the 'pipe size' expressed in packets.
Not 'TCP unfriendly' as long as link loss rates are high.
Even high link loss rates do not break unfairness.
  0.5GB striped transfer, OC3 link (155Mbps), RTT 80ms, MSS=9000, using up to 1000 flows
  Loss rate=0.1%
  Loss rate=0

Conclusions
TCP can work well with careful end-host and network tuning.
For fair sharing with other users, mechanisms are needed to provide congestion feedback and to distinguish genuine link losses from congestion indications.
In addition, admission mechanisms based on the number of parallel flows might be beneficial.

Striping
  Widely used (browsers, ftp, etc.)
  Good practical results
  Not 'TCP friendly'!
  RFC2140 / Ensemble TCP: share information and congestion management among parallel flows

MCS/ANL courtesy

Future work
What are optimal buffer sizes for bulk transfers?
Can we use ECN and large buffers to reliably detect congestion without using dropped packets as a congestion indicator?
Assuming the link loss rate pattern is known, can it be used to reliably detect congestion and improve throughput?

OC12, ANL to LBNL (56ms), Linux boxes
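The striping rule of thumb (fewer flows than the 'pipe size' expressed in packets) and the advice to set proper window sizes both rest on the bandwidth-delay product. The sketch below works through that arithmetic under simplifying assumptions: the nominal link rate is used as-is, header overhead is ignored, and one packet carries exactly one MSS of data. The function names are hypothetical and chosen only for this illustration.

```python
def bdp_bytes(link_bps, rtt_s):
    """Bandwidth-delay product in bytes: data in flight needed to fill the path."""
    if link_bps <= 0 or rtt_s <= 0:
        raise ValueError("link rate and RTT must be positive")
    return link_bps * rtt_s / 8.0


def pipe_size_packets(link_bps, rtt_s, mss_bytes):
    """Pipe size expressed in MSS-sized packets (headers ignored, an assumption)."""
    if mss_bytes <= 0:
        raise ValueError("MSS must be positive")
    return bdp_bytes(link_bps, rtt_s) / mss_bytes


def fewer_flows_than_pipe(n_flows, link_bps, rtt_s, mss_bytes):
    """True when the flow count respects the rule of thumb from the poster."""
    return n_flows < pipe_size_packets(link_bps, rtt_s, mss_bytes)


if __name__ == "__main__":
    # Striped-transfer scenario from the poster: OC3 (155 Mbps), RTT 80 ms, MSS=9000.
    link, rtt, mss = 155e6, 0.080, 9000
    print(f"window needed to fill the path: {bdp_bytes(link, rtt):.0f} bytes")
    print(f"pipe size: {pipe_size_packets(link, rtt, mss):.1f} packets")
    for flows in (10, 100, 1000):
        ok = fewer_flows_than_pipe(flows, link, rtt, mss)
        print(f"{flows} flows within rule of thumb: {ok}")
```

With these numbers the pipe holds well under 1000 jumbo-frame packets, so the upper end of the "up to 1000 flows" experiment lies outside the rule of thumb, consistent with the unfairness and packet-drop issues listed for striping.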