TCP/IP Overview & Performance
Richard Hughes-Jones, The University of Manchester
e-VLBI Network Meeting, 28 Jan 2005

Slide 1: TCP/IP Overview & Performance
Richard Hughes-Jones, The University of Manchester
MB-NG

Slide 2: TCP (Reno) - What's the problem?
• TCP has 2 phases:
  - Slowstart: probe the network to estimate the available bandwidth; exponential growth
  - Congestion Avoidance: main data transfer phase; transfer rate grows "slowly"
• AIMD and High Bandwidth, Long Distance networks
  Poor performance of TCP in high bandwidth wide area networks is due in part to the TCP congestion control algorithm.
  - For each ack in a RTT without loss: cwnd -> cwnd + a / cwnd (Additive Increase, a = 1)
  - For each window experiencing loss: cwnd -> cwnd - b (cwnd) (Multiplicative Decrease, b = 1/2)
• Packet loss is a killer!!
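The two AIMD rules on this slide can be tried out in a few lines. The following is a minimal sketch, not a model of a real TCP stack: it assumes cwnd is measured in packets, that one ack arrives per packet in flight, and that a single loss happens when the window is full. The function names are illustrative.

```python
def on_ack(cwnd: float, a: float = 1.0) -> float:
    """Additive increase, applied once per ack received without loss."""
    return cwnd + a / cwnd


def on_loss(cwnd: float, b: float = 0.5) -> float:
    """Multiplicative decrease, applied once per window that saw a loss."""
    return cwnd - b * cwnd


def rtts_to_recover(cwnd_at_loss: float) -> int:
    """Count round trips until cwnd is back at its value before the loss."""
    cwnd = on_loss(cwnd_at_loss)
    rtts = 0
    while cwnd < cwnd_at_loss:
        # Assumption: about cwnd acks arrive per round trip (at least one).
        for _ in range(max(1, int(cwnd))):
            cwnd = on_ack(cwnd)
        rtts += 1
    return rtts


if __name__ == "__main__":
    for window in (100, 1000, 10000):
        print(window, "packets ->", rtts_to_recover(window), "RTTs to recover")
```

The recovery time grows roughly as half the window in round trips, which is why a large window on a long path takes so long to reopen.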

Slide 3: TCP (Reno) - Details
• Time for TCP to recover its throughput from 1 lost packet given by: [formula not captured in the transcript]
• For rtt of ~200 ms: 2 min
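The formula itself did not survive in the transcript. A commonly quoted estimate, which follows from the AIMD rules (the window halves, then grows by one packet per round trip), is C * RTT^2 / (2 * MSS). The sketch below assumes that estimate; the link capacity and packet size are parameters because this slide does not state the values behind its own figure. With 1 Gbit/s and 1500-byte packets it gives about 1.6 s at 6.2 ms, in line with the MB-NG slide later in the talk.

```python
def reno_recovery_time_s(capacity_bps: float, rtt_s: float, mss_bytes: int = 1500) -> float:
    """Estimated time for standard TCP to regain full rate after one loss.

    Assumes the window was exactly filling the pipe (C * RTT / MSS packets),
    is halved on the loss, and then grows by one packet per round trip.
    """
    mss_bits = mss_bytes * 8
    return capacity_bps * rtt_s ** 2 / (2 * mss_bits)


if __name__ == "__main__":
    # Illustrative inputs: 1 Gbit/s path, round-trip times in seconds.
    for rtt in (0.0062, 0.025, 0.128):
        print(f"rtt {rtt * 1000:.1f} ms -> {reno_recovery_time_s(1e9, rtt):.1f} s")
```

Because the estimate scales with RTT squared, a 20-fold increase in round-trip time costs a 400-fold increase in recovery time.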

Slide 4: Investigation of new TCP Stacks
• The AIMD Algorithm: Standard TCP (Reno)
  - For each ack in a RTT without loss: cwnd -> cwnd + a / cwnd (Additive Increase, a = 1)
  - For each window experiencing loss: cwnd -> cwnd - b (cwnd) (Multiplicative Decrease, b = 1/2)
• High Speed TCP
  - a and b vary depending on current cwnd using a table
  - a increases more rapidly with larger cwnd, so cwnd returns to the 'optimal' size for the network path sooner
  - b decreases less aggressively and, as a consequence, so does the cwnd. The effect is that there is not such a decrease in throughput.
• Scalable TCP
  - a and b are fixed adjustments for the increase and decrease of cwnd
  - a = 1/100: the increase is greater than TCP Reno
  - b = 1/8: the decrease on loss is less than TCP Reno
  - Scalable over any link speed.
• Fast TCP
  - Uses round trip time as well as packet loss to indicate congestion, with rapid convergence to fair equilibrium for throughput.
• HSTCP-LP, H-TCP, BiC-TCP
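To see why the fixed Scalable TCP adjustments matter, the recovery after a single loss can be compared in round trips. This is a rough sketch under simplifying assumptions: cwnd is in packets, about cwnd acks arrive per round trip, Reno therefore gains about one packet per RTT, and Scalable TCP (cwnd + a per ack) gains a factor of about (1 + a) per RTT. The function names are illustrative.

```python
import math


def reno_recovery_rtts(cwnd_at_loss: float, b: float = 0.5) -> float:
    """Reno: window drops by b * cwnd, then grows about 1 packet per RTT."""
    return b * cwnd_at_loss


def scalable_recovery_rtts(a: float = 0.01, b: float = 0.125) -> float:
    """Scalable TCP: window drops to (1 - b) * cwnd, then grows by a factor
    of about (1 + a) per RTT, so recovery does not depend on the window size."""
    return math.log(1.0 / (1.0 - b)) / math.log(1.0 + a)


if __name__ == "__main__":
    for window in (100, 1000, 10000):
        print(
            f"cwnd {window}: Reno ~{reno_recovery_rtts(window):.0f} RTTs, "
            f"Scalable ~{scalable_recovery_rtts():.1f} RTTs"
        )
```

Note that the per-ack increase of 1/100 only exceeds Reno's 1/cwnd once cwnd is above 100 packets, which is the regime of the high bandwidth, long distance paths discussed here.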

Slide 5: Packet Loss and new TCP Stacks
• TCP Response Function
  - Throughput vs Loss Rate; further to right: faster recovery
  - Drop packets in kernel
[Plots: MB-NG, rtt 6 ms; DataTAG, rtt 120 ms]

Slide 6: Packet Loss and new TCP Stacks
• TCP Response Function
  - UKLight London-Chicago-London, rtt 180 ms
  - 2.6.6 Kernel
• Agreement with theory good
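The slides do not say which theoretical curve the measurements were compared with. For standard TCP, a widely used approximation of the response function is rate = (MSS / RTT) * sqrt(3/2) / sqrt(p), where p is the packet loss rate. The sketch below assumes that approximation and a 1460-byte segment size; both are assumptions, not values taken from the talk.

```python
import math


def reno_response_bps(loss_rate: float, rtt_s: float, mss_bytes: int = 1460) -> float:
    """Approximate steady-state throughput of standard TCP in bit/s.

    Assumes periodic random loss with probability loss_rate per packet and
    ignores timeouts and receiver-window limits.
    """
    if not 0.0 < loss_rate <= 1.0:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * math.sqrt(1.5 / loss_rate)


if __name__ == "__main__":
    # Round-trip times quoted on the slides: 6 ms, 120 ms, 180 ms.
    for rtt in (0.006, 0.120, 0.180):
        for p in (1e-4, 1e-6):
            mbit = reno_response_bps(p, rtt) / 1e6
            print(f"rtt {rtt * 1000:.0f} ms, loss {p:g}: ~{mbit:.0f} Mbit/s")
```

In this approximation throughput falls in proportion to 1/RTT, so the same loss rate costs 20 times more throughput at 120 ms than at 6 ms.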

Slide 7: High Throughput Demonstrations
[Diagram labels: Manchester (Geneva), host man03; London (Chicago), host lon01; Dual Xeon 2.2 GHz; 1 GEth; Cisco 7609, Cisco GSR, Cisco 7609; 2.5 Gbit SDH MB-NG Core]
• Send data with TCP
• Drop packets
• Monitor TCP with Web100

Slide 8: High Performance TCP - MB-NG
• Drop 1 in 25,000
• rtt 6.2 ms
• Recover in 1.6 s
[Plots: Standard, HighSpeed, Scalable]

Slide 9: High Performance TCP - DataTAG
• Different TCP stacks tested on the DataTAG Network
• rtt 128 ms
• Drop 1 in 10^6
• HighSpeed: rapid recovery
• Scalable: very fast recovery
• Standard: recovery would take ~20 mins

Slide 10: On the way to Higher Bandwidth

Slide 11: End Hosts & NICs - SuperMicro P4DP6
• Use UDP packets from udpmon to characterise Host & NIC
  - SuperMicro P4DP6 motherboard
  - Dual Xeon 2.2 GHz CPU
  - 400 MHz System bus
  - 66 MHz 64 bit PCI bus
[Plots: Latency, Throughput, Bus Activity]

Slide 12: Network switch limits behaviour
• End2end UDP packets from udpmon
  - Only 700 Mbit/s throughput
  - Lots of packet loss
  - Packet loss distribution shows throughput limited

Slide 13: TCP Window Scale factor not set correctly
• SC2004 London-Chicago-London tests
  - Server quality hosts: 2.8 GHz Dual Xeon; 133 MHz PCI-X bus
• TCP window scale factor should allow the pipe to be filled
• Delay*BW 22 Mbytes
• Web100 output shows:
  - Cwnd does not open
  - Data sent at line speed but as 1 burst/rtt
  - Data stops at Cwnd
  - Average throughput: 100 Mbit/s
  - Limited by sender
• Kernel configuration problem
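The 22 Mbyte Delay*BW figure and the need for a window scale factor can be checked with a short calculation. This sketch assumes a 1 Gbit/s path and the 177 ms London-Chicago-London round-trip time quoted on the bbftp slide; neither value is stated on this slide. It relies on the standard TCP limits: a 16-bit window field (65,535 bytes unscaled) and a window scale shift of at most 14.

```python
def delay_bandwidth_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep the pipe full."""
    return bandwidth_bps * rtt_s / 8


def min_window_scale(window_bytes: float) -> int:
    """Smallest window scale shift whose scaled 16-bit window covers window_bytes."""
    shift = 0
    while (65535 << shift) < window_bytes and shift < 14:
        shift += 1
    return shift


if __name__ == "__main__":
    bdp = delay_bandwidth_bytes(1e9, 0.177)  # assumed: 1 Gbit/s, 177 ms
    print(f"Delay*BW = {bdp / 1e6:.1f} Mbytes")
    print("window scale shift needed:", min_window_scale(bdp))
    # Without scaling the window is capped at 65,535 bytes per round trip.
    print(f"unscaled ceiling = {65535 * 8 / 0.177 / 1e6:.1f} Mbit/s")
```

If the scale factor negotiated at connection setup is too small, the advertised window can never reach the Delay*BW product, whatever the socket buffer size.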

Slide 14: Network & Disk Interactions
• Hosts:
  - Supermicro X5DPE-G2 motherboards
  - dual 2.8 GHz Xeon CPUs with 512 k byte cache and 1 M byte memory
  - 3Ware 8506-8 controller on 133 MHz PCI-X bus configured as RAID0
  - six 74.3 GByte Western Digital Raptor WD740 SATA disks
  - 64k byte stripe size
• Measure memory to RAID0 transfer rates with & without UDP traffic
  - Disk write: 1735 Mbit/s
  - Disk write + 1500 MTU UDP: 1218 Mbit/s (drop of 30%)
  - Disk write + 9000 MTU UDP: 1400 Mbit/s
[Plots: CPU load]

Slide 15: iperf Throughput + Web100
• SuperMicro on MB-NG network
  - HighSpeed TCP
  - Average: linespeed, 940 Mbit/s
  - DupACK? <10 (expect ~400)
• BaBar on Production network
  - Standard TCP
  - Average: 425 Mbit/s
  - DupACKs 350-400, re-transmits

Slide 16: Disk-Disk bbftp
• bbftp file transfer program uses TCP/IP
• UKLight: Path: London-Chicago-London; PCs: Supermicro + 3Ware RAID0
• MTU 1500 bytes; Socket size 22 Mbytes; rtt 177 ms; SACK off
• Move a 2 Gbyte file
• Web100 plots:
  - Standard TCP: average 825 Mbit/s
  - Scalable TCP: average 875 Mbit/s

Slide 17: Parameters to Consider - Only some of them!
• Server quality hosts
• Check that UDP packets use BW expected
  - Poor (old), or wrongly configured routers / switches
  - Overloaded access links: campus / country
• Hunt down packet loss at your desired sending rate
• Fill the pipe with packets in flight: set socket buffer to 2 * Delay*BW
• Kernel configuration settings:
  - Allow large socket buffer (TCP window) settings
  - Set length of transmit queue large (~2000)
  - TCP window scale factor should allow the pipe to be filled
  - Disallow "Moderation" in TCP stack
• Consider turning off SACKs in 2.4.x and maybe up to 2.6.6
• Large MTUs: reduces CPU load
• Enable Interrupt coalescence: reduces CPU load
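The "set socket buffer to 2 * Delay*BW" advice can be applied from an application with the standard SO_SNDBUF and SO_RCVBUF socket options. The sketch below is one way to do it in Python, not the method used in the talk; the helper name and the example bandwidth and rtt are assumptions. The buffers are requested before connecting so that they can influence the window scale negotiated at connection setup, and the granted sizes are read back because the kernel may cap the request at its configured maximum.

```python
import socket


def open_tuned_socket(bandwidth_bps: float, rtt_s: float, factor: float = 2.0):
    """Create an unconnected TCP socket with buffers of factor * Delay*BW.

    Returns (sock, requested_bytes, granted_send_bytes, granted_recv_bytes).
    """
    requested = round(factor * bandwidth_bps * rtt_s / 8)
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    for option in (socket.SO_SNDBUF, socket.SO_RCVBUF):
        try:
            sock.setsockopt(socket.SOL_SOCKET, option, requested)
        except OSError:
            # Some systems reject oversized requests instead of capping them;
            # in that case the default buffer size stays in place.
            pass
    granted_send = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    granted_recv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, requested, granted_send, granted_recv


if __name__ == "__main__":
    # Assumed example path: 1 Gbit/s, 177 ms round trip.
    sock, requested, snd, rcv = open_tuned_socket(1e9, 0.177)
    print(f"requested {requested} bytes, granted send {snd}, recv {rcv}")
    if snd < requested or rcv < requested:
        print("kernel limit is below the request; raise the system maximum")
    sock.close()
```

A granted size well below the request is the application-level symptom of the kernel configuration limits listed on this slide.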

Slide 18: Real Time TCP in e-VLBI

Slide 19: Does TCP delay the data?
• Work in progress!!
• Send blocks of data (10 kbytes) at regular intervals
• Drop every 10,000th packet
• Measure the arrival time of the data
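The measurement idea on this slide can be sketched on a single machine. The example below is only an illustration over the loopback interface, where no packets are lost; the tests described in the talk drop packets in the kernel on a real path, which this sketch does not attempt. The block count, the send interval, and all function names are assumptions.

```python
import socket
import threading
import time

BLOCK_BYTES = 10_000  # "10 kbytes" blocks, as on the slide


def recv_exact(conn: socket.socket, nbytes: int) -> bytes:
    """Read exactly nbytes from a TCP connection."""
    chunks = []
    remaining = nbytes
    while remaining:
        chunk = conn.recv(remaining)
        if not chunk:
            raise ConnectionError("sender closed early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def measure_block_delays(n_blocks: int = 20, interval_s: float = 0.005):
    """Send blocks at regular intervals; return per-block delay in seconds."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    arrivals = []

    def receiver():
        conn, _ = server.accept()
        with conn:
            for _ in range(n_blocks):
                recv_exact(conn, BLOCK_BYTES)
                arrivals.append(time.monotonic())  # time the full block arrived

    thread = threading.Thread(target=receiver)
    thread.start()
    send_times = []
    with socket.create_connection(("127.0.0.1", port)) as sender:
        sender.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.monotonic()
        for i in range(n_blocks):
            wait = start + i * interval_s - time.monotonic()
            if wait > 0:
                time.sleep(wait)  # hold the regular sending schedule
            send_times.append(time.monotonic())
            sender.sendall(bytes(BLOCK_BYTES))
        thread.join()  # keep the connection open until all blocks are read
    server.close()
    return [arrive - sent for sent, arrive in zip(send_times, arrivals)]


if __name__ == "__main__":
    delays = measure_block_delays()
    print(f"max delay {max(delays) * 1e3:.3f} ms over {len(delays)} blocks")
```

On a lossy long-distance path, blocks sent after a drop would be expected to arrive late while TCP retransmits and reopens its window, which is the effect the slide sets out to measure.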

Slide 20: More Information - Some URLs
• MB-NG project web site: http://www.mb-ng.net/
• DataTAG project web site: http://www.datatag.org/
• UDPmon / TCPmon kit + writeup: http://www.hep.man.ac.uk/~rich/net
• Motherboard and NIC Tests: www.hep.man.ac.uk/~rich/net/nic/GigEth_tests_Boston.ppt & http://datatag.web.cern.ch/datatag/pfldnet2003/
  "Performance of 1 and 10 Gigabit Ethernet Cards with Server Quality Motherboards", FGCS Special issue 2004
• TCP tuning information may be found at: http://www.ncne.nlanr.net/documentation/faq/performance.html & http://www.psc.edu/networking/perf_tune.html
• TCP stack comparisons: "Evaluation of Advanced TCP Stacks on Fast Long-Distance Production Networks", Journal of Grid Computing 2004

Slide 21: Backup Slides

Slide 22: UKLight in the UK

Slide 23: SC2004 UKLIGHT Overview
[Network diagram labels: MB-NG 7600 OSR, Manchester; ULCC UKlight; UCL HEP; UCL network; K2; Ci; Chicago Starlight; Amsterdam; SC2004; Caltech Booth; UltraLight IP; SLAC Booth; Cisco 6509; UKlight 10G, four 1GE channels; UKlight 10G; Surfnet/EuroLink 10G, two 1GE channels; NLR Lambda NLR-PITT-STAR-10GE-16; K2; Ci; Caltech 7600]

Slide 24: Topology of the MB-NG Network
[Network diagram. Key: Gigabit Ethernet; 2.5 Gbit POS Access; MPLS; Admin. Domains. Labels: UCL Domain; RAL Domain; Manchester Domain; UKERNA Development Network; Edge Router Cisco 7609; Boundary Router Cisco 7609 (two); hosts man01, man02, man03, lon01, lon02, lon03, ral01, ral02; HW RAID]

Slide 25: The Bandwidth Challenge at SC2003
• Peak bandwidth 23.21 Gbit/s
• 6.6 TBytes in 48 minutes
• Phoenix - Amsterdam: 4.35 Gbit/s, HighSpeed TCP, rtt 175 ms, window 200 MB
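Two quick checks put these numbers in context: the average rate implied by 6.6 TBytes in 48 minutes, and how the 200 MB window on the Phoenix - Amsterdam flow compares with its Delay*BW product. The sketch assumes decimal units (1 TByte = 10^12 bytes, 1 MB = 10^6 bytes), which the slide does not specify.

```python
def average_rate_bps(total_bytes: float, duration_s: float) -> float:
    """Mean transfer rate in bit/s."""
    return total_bytes * 8 / duration_s


def delay_bandwidth_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bytes in flight needed to sustain bandwidth_bps over rtt_s."""
    return bandwidth_bps * rtt_s / 8


if __name__ == "__main__":
    avg = average_rate_bps(6.6e12, 48 * 60)
    print(f"average rate: {avg / 1e9:.1f} Gbit/s (peak quoted: 23.21 Gbit/s)")

    bdp = delay_bandwidth_bytes(4.35e9, 0.175)
    print(f"Delay*BW at 4.35 Gbit/s, 175 ms: {bdp / 1e6:.1f} MB")
    print(f"window / Delay*BW: {200e6 / bdp:.2f}")
```

Under these assumptions the 200 MB window is roughly twice the Delay*BW product, in line with the "2 * Delay*BW" socket buffer guidance given earlier in the talk.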

Slide 26: Average Transfer Rates Mbit/s

App      TCP Stack   SuperMicro on MB-NG   SuperMicro on SuperJANET4   BaBar on SuperJANET4
Iperf    Standard    940                   350-370                     425
Iperf    HighSpeed   940                   510                         570
Iperf    Scalable    940                   580-650                     605
bbcp     Standard    434                   290-310                     290
bbcp     HighSpeed   435                   385                         360
bbcp     Scalable    432                   400-430                     380
bbftp    Standard    400-410               325                         320
bbftp    Scalable    430                   345-532                     380
apache   Standard    425                   260                         300-360
apache   HighSpeed   430                   370                         315
apache   Scalable    428                   400                         317

Rows with fewer than three values in the transcript (column assignment not recoverable):
bbftp HighSpeed: 370-390, 380
Gridftp Standard: 405, 240
Gridftp HighSpeed: 320
Gridftp Scalable: 335

Slide 27: bbftp: Host & Network Effects
• 2 Gbyte file; RAID5 disks: 1200 Mbit/s read, 600 Mbit/s write
• Scalable TCP
• BaBar + SuperJANET: instantaneous 220 - 625 Mbit/s
• SuperMicro + SuperJANET: instantaneous 400 - 665 Mbit/s for 6 sec, then 0 - 480 Mbit/s
• SuperMicro + MB-NG: instantaneous 880 - 950 Mbit/s for 1.3 sec, then 215 - 625 Mbit/s

Slide 28: Applications: Throughput Mbit/s
• HighSpeed TCP
• 2 GByte file, RAID5
• SuperMicro + SuperJANET
[Plots: bbcp, bbftp, Apache, Gridftp]
• Previous work used RAID0 (not disk limited)

Slide 29: Host, PCI & RAID Controller Performance
• RAID0 (striped) & RAID5 (striped with redundancy)
• 3Ware 7506 Parallel 66 MHz; 3Ware 7505 Parallel 33 MHz
• 3Ware 8506 Serial ATA 66 MHz; ICP Serial ATA 33/66 MHz
• Tested on Dual 2.2 GHz Xeon Supermicro P4DP8-G2 motherboard
• Disk: Maxtor 160GB 7200rpm 8MB Cache
• Read ahead kernel tuning: /proc/sys/vm/max-readahead

Slide 30: RAID Controller Performance
[Plots: Read Speed and Write Speed for RAID 0 and RAID 5]

