Protocols Working with 10 Gigabit Ethernet
Richard Hughes-Jones, The University of Manchester
CALICE, Mar 2007
www.hep.man.ac.uk/~rich/


1 Slides: www.hep.man.ac.uk/~rich/ then “Talks”

2
- Introduction to Measurements
- 10 GigE on SuperMicro X7DBE
- 10 GigE on SuperMicro X5DPE-G2
- 10 GigE and TCP – Monitor with web100 disk writes
- 10 GigE and Constant Bit Rate program
- UDP + memory access

3 Udpmon: Latency & Throughput Measurements
- UDP/IP packets sent between back-to-back systems
  - Similar processing to TCP/IP but no flow control & congestion avoidance algorithms
- Latency
  - Round trip times using Request-Response UDP frames
  - Latency as a function of frame size
    - Slope s given by: mem-mem copy(s) + pci + Gig Ethernet + pci + mem-mem copy(s)
    - Intercept indicates processing times + HW latencies
  - Histograms of ‘singleton’ measurements
  - Tells us about:
    - Behavior of the IP stack
    - The way the HW operates
    - Interrupt coalescence
- UDP Throughput
  - Send a controlled stream of UDP frames spaced at regular intervals
  - Vary the frame size and the frame transmit spacing & measure:
    - The time of first and last frames received
    - The number of packets received, lost, & out of order
    - Histogram of inter-packet spacing of received packets
    - Packet loss pattern
    - 1-way delay
    - CPU load
    - Number of interrupts
  - Tells us about:
    - Behavior of the IP stack
    - The way the HW operates
    - Capacity & Available throughput of the LAN / MAN / WAN
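The slope and intercept described above can be pulled out of a set of (frame size, latency) points with a straight-line fit. The following is a minimal sketch, not udpmon's own analysis code: `fit_line` is a hypothetical helper, and the data points are synthetic values made up for illustration, not measurements from the talk.

```python
def fit_line(sizes, latencies):
    """Ordinary least-squares fit of latency = intercept + slope * size."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(latencies) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, latencies))
    slope = sxy / sxx                    # us per byte: the per-byte transfer costs
    intercept = mean_y - slope * mean_x  # us: processing times + HW latencies
    return slope, intercept


# Synthetic, illustrative points only (frame size in bytes, latency in us).
sizes = [64, 1000, 2000, 4000, 8000]
latencies = [20.0 + 0.003 * s for s in sizes]

slope, intercept = fit_line(sizes, latencies)
print(f"slope = {slope:.4f} us/byte, intercept = {intercept:.1f} us")
```

The fitted slope is the quantity to compare against the sum of the per-byte costs of the memory copies, PCI transfers and Ethernet link.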

4 Throughput Measurements
- UDP Throughput with udpmon
- Send a controlled stream of UDP frames spaced at regular intervals
[Diagram: Sender and Receiver exchange over time]
- Sender: Zero stats. Receiver: OK done
- Sender: Send data frames at regular intervals (n bytes, number of packets, wait time)
  - Measured: time to send, time to receive, inter-packet time (histogram)
- Sender: Signal end of test. Receiver: OK done
- Sender: Get remote statistics. Receiver: Send statistics:
  - No. received
  - No. lost + loss pattern
  - No. out-of-order
  - CPU load & no. int
  - 1-way delay
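As a rough illustration of the statistics the receiver returns, the sketch below derives received, lost and out-of-order counts from a list of frame sequence numbers in arrival order, and a user-data rate from the first and last arrival times. The function names and the idea of a per-frame sequence number are assumptions made for this example; they are not taken from udpmon itself.

```python
def receive_stats(sent_count, seq_numbers):
    """Summarise one test run from sequence numbers listed in arrival order."""
    received = len(seq_numbers)
    lost = sent_count - len(set(seq_numbers))
    out_of_order = 0
    highest_seen = -1
    for seq in seq_numbers:
        if seq < highest_seen:
            out_of_order += 1  # arrived after a later-numbered frame
        else:
            highest_seen = seq
    return {"received": received, "lost": lost, "out_of_order": out_of_order}


def throughput_gbit(n_packets, bytes_per_packet, t_first_us, t_last_us):
    """User-data rate between the first and last received frames, in Gbit/s."""
    bits = n_packets * bytes_per_packet * 8
    return bits / (t_last_us - t_first_us) / 1000.0  # bit/us is Mbit/s


# Six frames sent; frame 4 never arrives and frame 2 arrives after frame 3.
print(receive_stats(6, [0, 1, 3, 2, 5]))
print(throughput_gbit(1000, 8972, 0.0, 8000.0))
```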

5 High-end Server PCs
- Boston/Supermicro X7DBE
- Two Dual Core Intel Xeon Woodcrest 5130
  - 2 GHz
  - Independent 1.33GHz FSBuses
- 530 MHz FD Memory (serial)
  - Parallel access to 4 banks
- Chipsets: Intel 5000P MCH – PCIe & Memory; ESB2 – PCI-X GE etc.
- PCI
  - 3 8 lane PCIe buses
  - 3* 133 MHz PCI-X
- 2 Gigabit Ethernet
- SATA

6 10 GigE Back2Back: UDP Latency
- Motherboard: Supermicro X7DBE
- Chipset: Intel 5000P MCH
- CPU: 2 Dual Intel Xeon 5130 2 GHz with 4096k L2 cache
- Mem bus: 2 independent 1.33 GHz
- PCI-e 8 lane
- Linux Kernel 2.6.20-web100_pktd-plus
- Myricom NIC 10G-PCIE-8A-R Fibre
- myri10ge v1.2.0 + firmware v1.4.10
  - rx-usecs=0 Coalescence OFF
  - MSI=1
  - Checksums ON
  - tx_boundary=4096
- MTU 9000 bytes
- Latency 22 µs & very well behaved
- Latency Slope 0.0028 µs/byte
- B2B Expect: 0.00268 µs/byte
  - Mem     0.0004
  - PCI-e   0.00054
  - 10GigE  0.0008
  - PCI-e   0.00054
  - Mem     0.0004
- Histogram FWHM ~1-2 µs
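The expected back-to-back slope is the sum of the per-byte costs listed on the slide. A short sketch of that arithmetic follows; the component values are the slide's, while `wire_time_us_per_byte` is a helper added here to show where the 10GigE term comes from (8 bits per byte at 10 Gbit/s).

```python
def wire_time_us_per_byte(rate_gbit):
    """Serialisation time per byte on a link of the given rate, in us."""
    return 8 / (rate_gbit * 1000.0)  # 8 bits / (bits per us)


# Per-byte costs in us/byte, as quoted on the slide.
components = [
    ("Mem", 0.0004),
    ("PCI-e", 0.00054),
    ("10GigE", wire_time_us_per_byte(10)),
    ("PCI-e", 0.00054),
    ("Mem", 0.0004),
]

expected_slope = sum(cost for _, cost in components)
measured_slope = 0.0028  # us/byte, from the slide

print(f"expected {expected_slope:.5f} us/byte, measured {measured_slope} us/byte")
```

The measured slope of 0.0028 µs/byte is within about 5% of the 0.00268 µs/byte sum.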

7 10 GigE Back2Back: UDP Throughput
- Kernel 2.6.20-web100_pktd-plus
- Myricom 10G-PCIE-8A-R Fibre
  - rx-usecs=25 Coalescence ON
- MTU 9000 bytes
- Max throughput 9.4 Gbit/s
- Notice rate for 8972 byte packet
- ~0.002% packet loss in 10M packets in receiving host
- Sending host, 3 CPUs idle
  - For 90% in kernel mode inc ~10% soft int
- Receiving host 3 CPUs idle
  - For <8 µs packets, 1 CPU is 70-80% in kernel mode inc ~15% soft int
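For context on the 9.4 Gbit/s figure, the sketch below estimates the highest user-data rate an 8972-byte UDP payload could reach on a 10 Gbit/s link. It assumes IPv4 without options, no VLAN tag, and the standard Ethernet per-frame overhead (header, FCS, preamble and inter-frame gap); none of these assumptions are stated on the slide, so treat the result as a back-of-envelope bound rather than the author's calculation.

```python
UDP_HEADER = 8         # bytes
IPV4_HEADER = 20       # bytes, assuming no IP options
ETH_HEADER = 14        # bytes, assuming no VLAN tag
ETH_FCS = 4            # bytes
ETH_PREAMBLE = 8       # bytes, preamble + start-of-frame delimiter
ETH_INTERFRAME = 12    # bytes, minimum inter-frame gap


def wire_bytes(udp_payload):
    """Bytes occupied on the wire by one UDP datagram sent in one frame."""
    return (udp_payload + UDP_HEADER + IPV4_HEADER
            + ETH_HEADER + ETH_FCS + ETH_PREAMBLE + ETH_INTERFRAME)


def max_user_rate_gbit(udp_payload, link_gbit=10.0):
    """Upper bound on user-data throughput for back-to-back frames."""
    return link_gbit * udp_payload / wire_bytes(udp_payload)


def frame_time_us(udp_payload, link_gbit=10.0):
    """Time one frame occupies the link, in us."""
    return wire_bytes(udp_payload) * 8 / (link_gbit * 1000.0)


print(f"{max_user_rate_gbit(8972):.3f} Gbit/s, {frame_time_us(8972):.4f} us per frame")
```

Under these assumptions an 8972-byte payload fills the 9000-byte MTU exactly, and the measured 9.4 Gbit/s sits roughly 5% below the bound.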

8 10 GigE Cisco 7600: UDP Latency
- Motherboard: Supermicro X7DBE
- PCI-e 8 lane
- Linux Kernel 2.6.20 SMP
- Myricom NIC 10G-PCIE-8A-R Fibre
  - myri10ge v1.2.0 + firmware v1.4.10
  - Rx-usecs=0 Coalescence OFF
  - MSI=1
  - Checksums ON
- MTU 9000 bytes
- Latency 36.6 µs & very well behaved
- Switch Latency 14.66 µs
- Switch internal: 0.0011 µs/byte
  - PCI-e   0.00054
  - 10GigE  0.0008

9 The “SC05” Server PCs
- Not ALL PCs work that well!
- Boston/Supermicro X7DBE
- Two Intel Xeon Nocona
  - 3.2 GHz
  - Cache 2048k
  - Shared 800 MHz FSBus
- DDR2-400 Memory
- Chipsets: Intel 7520 Lindenhurst
- PCI
  - 2 8 lane PCIe buses
  - 1 4 lane PCIe bus
  - 3* 133 MHz PCI-X
- 2 Gigabit Ethernet

10 10 GigE X7DBE → X6DHE: UDP Throughput
- Kernel 2.6.20-web100_pktd-plus
- Myricom 10G-PCIE-8A-R Fibre
  - myri10ge v1.2.0 + firmware v1.4.10
  - rx-usecs=25 Coalescence ON
- MTU 9000 bytes
- Max throughput 6.3 Gbit/s
- Packet loss ~40-60% in receiving host
- Sending host, 3 CPUs idle
  - 1 CPU is >90% in kernel mode
- Receiving host 3 CPUs idle
  - For <8 µs packets, 1 CPU is 70-80% in kernel mode inc ~15% soft int

11 So now we can run at 9.4 Gbit/s. Can we do any work?

12 10 GigE X7DBE → X7DBE: TCP iperf
- No packet loss
- MTU 9000
- TCP buffer 256k, BDP=~330k
- Cwnd
  - SlowStart then slow growth
  - Limited by sender!
- Duplicate ACKs
  - One event of 3 DupACKs
- Packets Re-Transmitted
- Throughput Mbit/s
  - Iperf throughput 7.77 Gbit/s
  - Not bad!
[Figure: Web100 plots of TCP parameters]

13 10 GigE X7DBE → X7DBE: TCP iperf
- Packet loss 1: 50,000 -recv-kernel patch
- MTU 9000
- TCP buffer 256k, BDP=~330k
- Cwnd
  - SlowStart then slow growth
  - Limited by sender!
- Duplicate ACKs
  - ~10 DupACKs every lost packet
- Packets Re-Transmitted
  - One per lost packet
- Throughput Mbit/s
  - Iperf throughput 7.84 Gbit/s
  - Even Better!
[Figure: Web100 plots of TCP parameters]

14 10 GigE X7DBE → X7DBE: CBR/TCP
- Packet loss 1: 50,000 -recv-kernel patch
- tcpdelay message 8120 bytes
- Wait 7 µs
- RTT 36 µs
- TCP buffer 256k, BDP=~330k
- Cwnd
  - Dips as expected
- Duplicate ACKs
  - ~15 DupACKs every lost packet
- Packets Re-Transmitted
  - One per lost packet
- Throughput Mbit/s
  - tcpdelay throughput 7.33 Gbit/s
[Figure: Web100 plots of TCP parameters]
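Two quick calculations help put the CBR numbers in context: the rate offered by sending one 8120-byte message every 7 µs, and the bandwidth-delay product for a 36 µs round trip. This is a sketch under two assumptions that the slide does not confirm: that the 7 µs wait is the full message period, and that the path runs at the 10 Gbit/s line rate. The slide quotes BDP as ~330k without units, and this sketch does not try to reproduce that figure.

```python
def offered_rate_gbit(message_bytes, period_us):
    """Offered user-data rate when one message is sent every period_us."""
    return message_bytes * 8 / period_us / 1000.0  # bit/us is Mbit/s


def bdp_bytes(rate_gbit, rtt_us):
    """Bandwidth-delay product in bytes for a given rate and round-trip time."""
    bits_in_flight = rate_gbit * 1000.0 * rtt_us  # Gbit/s equals 1000 bit/us
    return bits_in_flight / 8


print(f"offered: {offered_rate_gbit(8120, 7):.2f} Gbit/s")
print(f"BDP at 10 Gbit/s, 36 us RTT: {bdp_bytes(10, 36):.0f} bytes")
```

If the 7 µs figure is instead a gap added after each send completes, the offered rate would be lower than this estimate.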

15 B2B UDP with memory access
- Send UDP traffic B2B with 10GE
- On receiver run independent memory write task
  - L2 Cache 4096 k Byte
  - Write 8000k Byte blocks in loop
  - 100% user mode
- Achievable UDP Throughput
  - mean 9.39 Gb/s sigma 106
  - mean 9.21 Gb/s sigma 37
  - mean 9.2 sigma 30
- Packet loss
  - mean 0.04%
  - mean 1.4 %
  - mean 1.8 %
- CPU load:
  Cpu0 :   6.0% us, 74.7% sy,  0.0% ni,   0.3% id,  0.0% wa,  1.3% hi, 17.7% si,  0.0% st
  Cpu1 :   0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si,  0.0% st
  Cpu2 :   0.0% us,  0.0% sy,  0.0% ni, 100.0% id,  0.0% wa,  0.0% hi,  0.0% si,  0.0% st
  Cpu3 : 100.0% us,  0.0% sy,  0.0% ni,   0.0% id,  0.0% wa,  0.0% hi,  0.0% si,  0.0% st
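The CPU load lines are in the per-CPU format printed by top. A small sketch of how such lines could be turned into numbers is given below; `parse_cpu_line` is a hypothetical helper written for this example, and grouping sy + hi + si as "kernel-side" time is a choice made here, not something the slide defines.

```python
import re


def parse_cpu_line(line):
    """Parse one 'CpuN : x% us, y% sy, ...' line into (cpu, {field: percent})."""
    match = re.match(r"Cpu(\d+)\s*:\s*(.*)", line.strip())
    if match is None:
        raise ValueError(f"not a Cpu line: {line!r}")
    cpu = int(match.group(1))
    pairs = re.findall(r"([\d.]+)%\s*(\w+)", match.group(2))
    return cpu, {name: float(value) for value, name in pairs}


lines = [
    "Cpu0 : 6.0% us, 74.7% sy, 0.0% ni, 0.3% id, 0.0% wa, 1.3% hi, 17.7% si, 0.0% st",
    "Cpu3 : 100.0% us, 0.0% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si, 0.0% st",
]

for line in lines:
    cpu, load = parse_cpu_line(line)
    kernel_side = load["sy"] + load["hi"] + load["si"]  # system + hard + soft interrupts
    print(f"Cpu{cpu}: user {load['us']:.1f}%, kernel-side {kernel_side:.1f}%")
```

Read this way, Cpu0 is almost fully occupied with kernel and interrupt work for the UDP receive, while Cpu3 runs the memory write task entirely in user mode.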

16 Backup Slides

17 10 Gigabit Ethernet: Neterion NIC Results
- X5DPE-G2 Supermicro PCs B2B
- Dual 2.2 GHz Xeon CPU
- FSB 533 MHz
- XFrame II NIC
- PCI-X mmrbc 4096 bytes
- Low UDP rates ~2.5 Gbit/s
- Large packet loss
- TCP
  - One iperf TCP data stream 4 Gbit/s
  - Two bi-directional iperf TCP data streams 3.8 & 2.2 Gbit/s

