Slide 1: Protocols Working with 10 Gigabit Ethernet
Richard Hughes-Jones, The University of Manchester
ESLEA Closing Conference, Edinburgh, March 2007
www.hep.man.ac.uk/~rich/ then "Talks"
Slide 2
• Introduction
• 10 GigE on SuperMicro X7DBE
• 10 GigE on SuperMicro X5DPE-G2
• 10 GigE and TCP: monitor with web100, disk writes
• 10 GigE and Constant Bit Rate transfers
• UDP + memory access
• GÉANT 4 Gigabit tests
Slide 3: Udpmon Latency & Throughput Measurements
• UDP/IP packets sent between back-to-back systems
  - Similar processing to TCP/IP, but no flow control & congestion avoidance algorithms
• Latency
  - Round-trip times using request-response UDP frames
  - Latency as a function of frame size
  - Slope s is given by the sum of the per-byte costs: mem-mem copy(s) + PCI + Gig Ethernet + PCI + mem-mem copy(s)
  - Intercept indicates processing times + HW latencies
  - Histograms of 'singleton' measurements
  - Tells us about: behaviour of the IP stack, the way the HW operates, interrupt coalescence
• UDP throughput
  - Send a controlled stream of UDP frames spaced at regular intervals
  - Vary the frame size and the frame transmit spacing & measure: the time of first and last frames received; the number of packets received, lost & out of order; a histogram of the inter-packet spacing of the received packets; the packet-loss pattern; 1-way delay; CPU load; number of interrupts
  - Tells us about: behaviour of the IP stack, the way the HW operates, capacity & available throughput of the LAN / MAN / WAN
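The latency measurement above boils down to timing request-response exchanges of fixed-size UDP frames and repeating this across frame sizes. A minimal sketch in Python (not the actual udpmon code; the remote address, port, and the assumption that the far end echoes each frame back are illustrative):

```python
# Sketch of a UDP request-response latency probe in the spirit of udpmon's
# latency mode. Assumes a responder at REMOTE that echoes each frame back.
import socket
import statistics
import time

REMOTE = ("remote-host", 14196)     # hypothetical responder address
N_PROBES = 1000

def rtt_us(size: int) -> float:
    """Median round-trip time in microseconds for one frame size."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(1.0)
    payload = bytes(size)
    samples = []
    for _ in range(N_PROBES):
        t0 = time.perf_counter()
        sock.sendto(payload, REMOTE)
        sock.recvfrom(65535)                      # wait for the echoed frame
        samples.append((time.perf_counter() - t0) * 1e6)
    sock.close()
    return statistics.median(samples)

if __name__ == "__main__":
    # Latency vs frame size: the slope of this line is the per-byte cost,
    # the intercept the fixed processing and hardware latency.
    for size in range(64, 8972, 512):
        print(size, rtt_us(size))
```

Plotting RTT against frame size and fitting a straight line gives the slope and intercept discussed on the later slides.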
Slide 4: Throughput Measurements
• UDP throughput with udpmon
• Send a controlled stream of UDP frames spaced at regular intervals (n bytes, number of packets, wait time)
• Sender / receiver sequence (from the timeline diagram):
  - Sender sends "Zero stats"; receiver replies "OK done"
  - Sender sends the data frames at regular intervals, recording the time to send
  - Receiver records the time to receive and the inter-packet time (histogram)
  - Sender signals the end of the test; receiver replies "OK done"
  - Sender requests the remote statistics; receiver sends: no. received, no. lost + loss pattern, no. out-of-order, CPU load & no. of interrupts, 1-way delay
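The sender side of this sequence amounts to pacing UDP frames at a fixed spacing and recording the send times. A minimal sketch under the same caveats (hypothetical destination, sizes and spacing; the control messages and the receiver side are omitted):

```python
# Sketch of the paced UDP data-frame phase of a throughput test.
# Destination, packet size, count and spacing are illustrative values.
import socket
import time

DEST = ("remote-host", 14196)   # hypothetical receiver address
PKT_SIZE = 8972                 # UDP payload for a 9000-byte MTU (20-byte IP + 8-byte UDP headers)
N_PKTS = 10_000
SPACING_US = 12.0               # requested inter-packet gap

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
payload = bytearray(PKT_SIZE)

t_start = time.perf_counter()
next_send = t_start
for seq in range(N_PKTS):
    payload[0:4] = seq.to_bytes(4, "big")        # sequence number lets the receiver count loss/reorder
    sock.sendto(payload, DEST)
    next_send += SPACING_US * 1e-6
    while time.perf_counter() < next_send:       # busy-wait to hold the requested spacing
        pass
t_send = time.perf_counter() - t_start

print(f"time to send {t_send:.6f} s, "
      f"offered rate {N_PKTS * PKT_SIZE * 8 / t_send / 1e9:.2f} Gbit/s")
```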
Slide 5: High-end Server PCs for 10 Gigabit
• Boston/Supermicro X7DBE
• Two dual-core Intel Xeon Woodcrest 5130, 2 GHz; independent 1.33 GHz FSBs
• 530 MHz FB (fully buffered, serial) memory, parallel access to 4 banks
• Chipset: Intel 5000P MCH (PCIe & memory), ESB2 (PCI-X, GE etc.)
• PCI: 3 x 8-lane PCIe buses, 3 x 133 MHz PCI-X
• 2 x Gigabit Ethernet
• SATA
Slide 6: 10 GigE Back-to-Back: UDP Latency
• Motherboard: Supermicro X7DBE
• Chipset: Intel 5000P MCH
• CPU: 2 x dual-core Intel Xeon 5130, 2 GHz, 4096k L2 cache
• Memory bus: 2 independent, 1.33 GHz
• PCIe, 8 lane
• Linux kernel 2.6.20-web100_pktd-plus
• Myricom NIC 10G-PCIE-8A-R, fibre
• myri10ge v1.2.0 + firmware v1.4.10; rx-usecs=0 (coalescence OFF), MSI=1, checksums ON, tx_boundary=4096
• MTU 9000 bytes
• Latency 22 µs & very well behaved
• Latency slope 0.0028 µs/byte
• Back-to-back expectation 0.00268 µs/byte, from: memory 0.0004, PCIe 0.00054, 10GigE 0.0008, PCIe 0.00054, memory 0.0004
• Histogram FWHM ~1-2 µs
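The expected slope quoted on this slide is just the sum of the per-byte costs of each element in the data path; writing out the arithmetic:

\[
s_{\mathrm{expect}} = 0.0004 + 0.00054 + 0.0008 + 0.00054 + 0.0004 = 0.00268\ \mu\mathrm{s/byte}
\]

which agrees with the measured 0.0028 µs/byte to within about 5%.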
Slide 7: 10 GigE Back-to-Back: UDP Throughput
• Kernel 2.6.20-web100_pktd-plus
• Myricom 10G-PCIE-8A-R, fibre; rx-usecs=25, coalescence ON
• MTU 9000 bytes
• Max throughput 9.4 Gbit/s
• Notice the rate for 8972-byte packets
• ~0.002% packet loss in 10M packets, in the receiving host
• Sending host: 3 CPUs idle; 1 CPU ~90% in kernel mode, inc. ~10% soft int
• Receiving host: 3 CPUs idle; for <8 µs packet spacing, 1 CPU is 70-80% in kernel mode, inc. ~15% soft int
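For context, a rough upper bound on the user-data rate for 8972-byte UDP payloads can be worked out from standard Ethernet framing overheads (20-byte IP + 8-byte UDP headers inside the 9000-byte MTU, plus 14-byte Ethernet header, 4-byte FCS, 8-byte preamble and 12-byte inter-frame gap). This is an estimate, not a figure from the slide:

\[
R_{\mathrm{user}} \approx 10\ \mathrm{Gbit/s} \times \frac{8972}{9000 + 38} \approx 9.9\ \mathrm{Gbit/s}
\]

so the 9.4 Gbit/s plateau appears to be set by the hosts rather than by the wire.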
Slide 8: 10 GigE UDP Throughput vs Packet Size
• Motherboard: Supermicro X7DBE
• Linux kernel 2.6.20-web100_pktd-plus
• Myricom NIC 10G-PCIE-8A-R, fibre
• myri10ge v1.2.0 + firmware v1.4.10; rx-usecs=0, coalescence ON, MSI=1, checksums ON, tx_boundary=4096
• Steps at 4060 and 8160 bytes, within 36 bytes of 2^n boundaries
• Model the data transfer time as t = C + m * Bytes, where C includes the time to set up the transfers
  - The fit is reasonable: C = 1.67 µs, m = 5.4e-4 µs/byte
  - The steps are consistent with C increasing by 0.6 µs
• The Myricom driver segments the transfers, limiting the DMA to 4096 bytes; PCIe chipset dependent!
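A minimal sketch of how the t = C + m * Bytes model can be fitted with a least-squares straight line; the (size, time) pairs are placeholders, not the measured data from the talk:

```python
# Sketch: fit the linear transfer-time model t = C + m*Bytes.
# The (size, time) pairs below are hypothetical, for illustration only.
import numpy as np

sizes_bytes = np.array([1000, 2000, 3000, 4000, 6000, 8000], dtype=float)
times_us    = np.array([2.2,  2.8,  3.3,  3.8,  5.0,  6.1])   # placeholder measurements

m, C = np.polyfit(sizes_bytes, times_us, deg=1)       # slope (us/byte), intercept (us)
print(f"C = {C:.2f} us (setup cost), m = {m:.2e} us/byte")
print(f"asymptotic rate = {8 / m / 1e3:.1f} Gbit/s")  # 8 bits/byte / (us/byte) = Mbit/s; /1e3 -> Gbit/s
```

The reciprocal of the fitted slope gives the asymptotic per-byte rate of the transfer engine, and a jump in the fitted C at the 4096-byte DMA boundary reproduces the steps seen in the throughput curve.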
Slide 9: 10 GigE via Cisco 7600: UDP Latency
• Motherboard: Supermicro X7DBE
• PCIe, 8 lane
• Linux kernel 2.6.20 SMP
• Myricom NIC 10G-PCIE-8A-R, fibre; myri10ge v1.2.0 + firmware v1.4.10; rx-usecs=0 (coalescence OFF), MSI=1, checksums ON
• MTU 9000 bytes
• Latency 36.6 µs & very well behaved
• Switch latency 14.66 µs
• Switch internal: 0.0011 µs/byte (cf. PCIe 0.00054, 10GigE 0.0008)
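The quoted switch latency is consistent with simply subtracting the back-to-back latency of slide 6 from the latency measured through the Cisco 7600:

\[
t_{\mathrm{switch}} \approx 36.6\ \mu\mathrm{s} - 22\ \mu\mathrm{s} \approx 14.6\ \mu\mathrm{s}
\]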
Slide 10: The "SC05" Server PCs
• Boston/Supermicro X6DHE
• Two Intel Xeon Nocona 3.2 GHz, 2048k cache, shared 800 MHz FSB
• DDR2-400 memory
• Chipset: Intel 7520 Lindenhurst
• PCI: 2 x 8-lane PCIe buses, 1 x 4-lane PCIe bus, 3 x 133 MHz PCI-X
• 2 x Gigabit Ethernet
Slide 11: 10 GigE X7DBE to X6DHE: UDP Throughput
• Kernel 2.6.20-web100_pktd-plus
• Myricom 10G-PCIE-8A-R, fibre; myri10ge v1.2.0 + firmware v1.4.10; rx-usecs=25, coalescence ON
• MTU 9000 bytes
• Max throughput 6.3 Gbit/s
• Packet loss ~40-60% in the receiving host
• Sending host: 3 CPUs idle; 1 CPU is >90% in kernel mode
• Receiving host: 3 CPUs idle; for <8 µs packets, 1 CPU is 70-80% in kernel mode, inc. ~15% soft int
Slide 12
So now we can run at 9.4 Gbit/s. Can we do any work?
Slide 13: 10 GigE X7DBE to X7DBE: TCP iperf
• No packet loss
• MTU 9000
• TCP buffer 256k, BDP ~330k
• Cwnd: slow start, then slow growth; limited by the sender!
• Duplicate ACKs: one event of 3 DupACKs
• Packets re-transmitted
• iperf throughput 7.77 Gbit/s
(Web100 plots of the TCP parameters: Cwnd, duplicate ACKs, re-transmits, throughput in Mbit/s)
Slide 14: 10 GigE X7DBE to X7DBE: TCP iperf
• Packet loss 1 in 50,000 (recv-kernel patch)
• MTU 9000
• TCP buffer 256k, BDP ~330k
• Cwnd: slow start, then slow growth; limited by the sender!
• Duplicate ACKs: ~10 DupACKs for every lost packet
• Packets re-transmitted: one per lost packet
• iperf throughput 7.84 Gbit/s
(Web100 plots of the TCP parameters)
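Rough arithmetic gives a feel for how often these loss events occur; assuming ~8948-byte TCP segments on the 9000-byte MTU path (an assumption, the exact MSS is not given on the slide):

\[
\frac{7.84\times 10^{9}\ \mathrm{bit/s}}{8948 \times 8\ \mathrm{bit}} \approx 1.1\times 10^{5}\ \mathrm{segments/s}
\quad\Rightarrow\quad
\frac{1.1\times 10^{5}}{50\,000} \approx 2\ \mathrm{losses/s}
\]

so the Web100 plots should show a retransmission and a burst of DupACKs roughly every half second.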
Slide 15: 10 GigE X7DBE to X7DBE: CBR/TCP
• Packet loss 1 in 50,000 (recv-kernel patch)
• tcpdelay message 8120 bytes
• Wait 7 µs
• RTT 36 µs
• TCP buffer 256k, BDP ~330k
• Cwnd: dips as expected
• Duplicate ACKs: ~15 DupACKs for every lost packet
• Packets re-transmitted: one per lost packet
• tcpdelay throughput 7.33 Gbit/s
(Web100 plots of the TCP parameters)
Slide 16: B2B UDP with Memory Access
• Send UDP traffic back-to-back over 10GE
• On the receiver, run an independent memory-write task: 8000 kByte blocks (L2 cache is 4096 kByte), 100% user mode
• Achievable UDP throughput: mean 9.39 Gb/s (sigma 106); mean 9.21 Gb/s (sigma 37); mean 9.2 (sigma 30)
• Packet loss: mean 0.04%; mean 1.4%; mean 1.8%
• CPU load (top output on the receiver):
  Cpu0 :   6.0% us, 74.7% sy, 0.0% ni,   0.3% id, 0.0% wa, 1.3% hi, 17.7% si, 0.0% st
  Cpu1 :   0.0% us,  0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi,  0.0% si, 0.0% st
  Cpu2 :   0.0% us,  0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi,  0.0% si, 0.0% st
  Cpu3 : 100.0% us,  0.0% sy, 0.0% ni,   0.0% id, 0.0% wa, 0.0% hi,  0.0% si, 0.0% st
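A minimal sketch of the kind of independent memory-write load described above: it streams writes over 8000 kByte blocks, roughly twice the 4096 kByte L2 cache, so the writes always miss in cache and compete with the NIC for the memory system. Only the block and cache sizes come from the slide; everything else (language, loop structure) is an assumption, not the actual test code.

```python
# Memory-write load: repeatedly fill an 8000 kByte block, ~2x the 4096 kByte
# L2 cache, so every pass streams out to main memory. Pure user-mode work.
import numpy as np

BLOCK_BYTES = 8000 * 1024
block = np.empty(BLOCK_BYTES // 8, dtype=np.uint64)

value = 0
while True:                                     # run until killed: 100% user-mode CPU
    block[:] = value & 0xFFFF_FFFF_FFFF_FFFF    # write the whole block
    value += 1
```

In the test the task would presumably be pinned to one core, which is why top shows Cpu3 at 100% user while the system and soft-int load of the UDP receive sits on Cpu0.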
Slide 17: ESLEA-FABRIC: 4 Gbit Flows over GÉANT
• Set up a 4 Gigabit lightpath between GÉANT PoPs
  - Collaboration with Dante
  - GÉANT Development Network London to London, or London to Amsterdam, and the GÉANT Lightpath service CERN to Poznan
  - PCs in their PoPs with 10 Gigabit NICs
• VLBI tests: UDP performance
  - Throughput, jitter, packet loss, 1-way delay, stability
  - Continuous (days) data flows: VLBI_UDP and multi-Gigabit TCP performance with current kernels
  - Experience for FPGA Ethernet packet systems
• Dante interests: multi-Gigabit TCP performance
  - The effect of (Alcatel) buffer size on bursty TCP using bandwidth-limited lightpaths
Slide 18: Options Using the GÉANT Development Network
• 10 Gigabit SDH backbone
• Alcatel 1678 MCC
• Node locations: London, Amsterdam, Paris, Prague, Frankfurt
• Can do traffic routing, so long-RTT paths can be made
• Available now (2007)
• Less pressure for long-term tests
Slide 19: Options Using the GÉANT Lightpaths
• Set up a 4 Gigabit lightpath between GÉANT PoPs
  - Collaboration with Dante
  - PCs in Dante PoPs
• 10 Gigabit SDH backbone
• Alcatel 1678 MCC
• Node locations: Budapest, Geneva, Frankfurt, Milan, Paris, Poznan, Prague, Vienna
• Can do traffic routing, so long-RTT paths can be made
• Ideal: London to Copenhagen
Slide 20: Any Questions?
Slide 21: Backup Slides
Slide 22: 10 Gigabit Ethernet: UDP Throughput
• 1500-byte MTU gives ~2 Gbit/s
• Used a 16144-byte MTU, max user length 16080
• DataTAG Supermicro PCs
  - Dual 2.2 GHz Xeon CPU, FSB 400 MHz
  - PCI-X mmrbc 512 bytes
  - wire-rate throughput of 2.9 Gbit/s
• CERN OpenLab HP Itanium PCs
  - Dual 1.0 GHz 64-bit Itanium CPU, FSB 400 MHz
  - PCI-X mmrbc 4096 bytes
  - wire rate of 5.7 Gbit/s
• SLAC Dell PCs
  - Dual 3.0 GHz Xeon CPU, FSB 533 MHz
  - PCI-X mmrbc 4096 bytes
  - wire rate of 5.4 Gbit/s
Slide 23: 10 Gigabit Ethernet: Tuning PCI-X
• 16080-byte packets every 200 µs
• Intel PRO/10GbE LR adapter
• PCI-X bus occupancy vs mmrbc
  - Measured times
  - Times based on PCI-X timings from the logic analyser
  - Expected throughput ~7 Gbit/s; measured 5.7 Gbit/s
(Logic-analyser traces for mmrbc = 512, 1024, 2048 and 4096 bytes (5.7 Gbit/s), each showing the CSR access, PCI-X sequence, data transfer, and interrupt & CSR update phases)
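For reference, the raw capacity of a 64-bit, 133 MHz PCI-X bus, before any protocol overhead, is:

\[
64\ \mathrm{bit} \times 133\times 10^{6}\ \mathrm{Hz} \approx 8.5\ \mathrm{Gbit/s}
\]

so the ~7 Gbit/s expectation corresponds to roughly 80% bus efficiency once the CSR accesses and per-burst overheads seen in the traces are included.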
Slide 24: 10 Gigabit Ethernet: TCP Data Transfer on PCI-X
• Sun V20Z, 1.8 GHz to 2.6 GHz dual Opterons
• Connected via a 6509 switch
• XFrame II NIC
• PCI-X mmrbc 4096 bytes, 66 MHz
• Two 9000-byte packets back-to-back
• Average rate 2.87 Gbit/s
• Burst of packets, length 646.8 µs
• Gap between bursts 343 µs
• 2 interrupts per burst
(PCI-X trace showing the CSR access and data transfer phases)
Slide 25: 10 Gigabit Ethernet: UDP Data Transfer on PCI-X
• Sun V20Z, 1.8 GHz to 2.6 GHz dual Opterons
• Connected via a 6509 switch
• XFrame II NIC
• PCI-X mmrbc 2048 bytes, 66 MHz
• One 8000-byte packet: 2.8 µs for CSRs, 24.2 µs data transfer, effective rate 2.6 Gbit/s
• 2000-byte packets, wait 0 µs: ~200 ms pauses
• 8000-byte packets, wait 0 µs: ~15 ms between data blocks
(PCI-X trace showing the 2.8 µs CSR access and the data transfer)
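The quoted effective rate appears to follow from the 24.2 µs data-transfer time alone:

\[
\frac{8000\ \mathrm{bytes} \times 8\ \mathrm{bit/byte}}{24.2\ \mu\mathrm{s}} \approx 2.6\ \mathrm{Gbit/s}
\]

including the 2.8 µs of CSR accesses as well would lower it to about 2.4 Gbit/s.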
Slide 26: 10 Gigabit Ethernet: Neterion NIC Results
• X5DPE-G2 Supermicro PCs, back-to-back
• Dual 2.2 GHz Xeon CPU
• FSB 533 MHz
• XFrame II NIC
• PCI-X mmrbc 4096 bytes
• Low UDP rates, ~2.5 Gbit/s
• Large packet loss
• TCP:
  - One iperf TCP data stream: 4 Gbit/s
  - Two bi-directional iperf TCP data streams: 3.8 & 2.2 Gbit/s
Slide 27: SC|05 Seattle-SLAC 10 Gigabit Ethernet
• 2 lightpaths: routed over ESnet, and layer 2 over UltraScience Net
• 6 Sun V20Z systems per λ
• dCache remote disk data access
  - 100 processes per node
  - Each node sends or receives
  - One data stream is 20-30 Mbit/s
• Used Neterion NICs & Chelsio TOE
• Data also sent to StorCloud using fibre channel links
• Traffic on the 10 GE link for 2 nodes: 3-4 Gbit/s per node, 8.5-9 Gbit/s on the trunk