GGF4 Toronto Feb 2002 R. Hughes-Jones Manchester

Initial Performance Measurements
Gigabit Ethernet NICs
64 bit PCI Motherboards
(Work in progress Mar 02)

Collaboration:
• Boston Ltd. (Watford): SuperMicro Motherboards, CPUs, Intel GE NICs
• Brunel University: Peter Van Santen
• University of Manchester: Richard Hughes-Jones

The Measurements (1)
• Latency
  • Round trip times measured using Request-Response UDP frames
  • Latency as a function of frame size
    - Slope gives the sum of the individual per-byte transfer times (inverse data transfer rates) end-to-end:
      mem copy + PCI + Gig Ethernet + PCI + mem copy
  • Histograms of individual measurements
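
The latency method above can be sketched roughly as follows. This is a minimal illustration, not the tool used for the talk: it assumes a simple echo responder that returns a frame of the same size as the request (the slides do not say how the response size is chosen), and the names `fit_line`, `echo_server`, `measure_rtt_us` and `demo_loopback` are hypothetical. The slope of the fitted line, in us/byte, is the quantity compared against the expected per-byte transfer times.

```python
import socket
import threading
import time


def fit_line(sizes, latencies_us):
    """Least-squares fit of latency = intercept + slope * size.

    Returns (slope in us/byte, intercept in us).
    """
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(latencies_us) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, latencies_us))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def echo_server(sock, stop):
    """Assumed responder: send each request straight back to its sender."""
    sock.settimeout(0.1)
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(65535)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def measure_rtt_us(client, server_addr, size, repeats=50):
    """Return individual round trip times (us) for one frame size."""
    payload = bytes(size)
    samples = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        client.sendto(payload, server_addr)
        client.recvfrom(65535)
        samples.append((time.perf_counter() - t0) * 1e6)
    return samples


def demo_loopback(sizes=(64, 256, 512, 1024, 1400)):
    """Run the measurement over loopback and fit median latency vs size."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    stop = threading.Event()
    thread = threading.Thread(target=echo_server, args=(server, stop), daemon=True)
    thread.start()
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(1.0)
    try:
        medians = []
        for size in sizes:
            samples = sorted(measure_rtt_us(client, server.getsockname(), size))
            medians.append(samples[len(samples) // 2])
        return fit_line(list(sizes), medians)
    finally:
        stop.set()
        thread.join()
        client.close()
        server.close()
```

Calling `demo_loopback()` exercises only the local software stack, so its slope says nothing about PCI or Gigabit Ethernet; between two hosts the same fit would include those terms. The per-size sample lists returned by `measure_rtt_us` are what a latency histogram would be built from.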

The Measurements (2)
• UDP Throughput
  • Send a burst of UDP frames spaced at regular intervals
  • Vary the frame size and the frame transmit spacing
  • Record:
    - The time to send and the time to receive the frames
    - The number received, the number lost, the number out of order
    - The received inter-packet spacing
    - CPU load, number of interrupts
[Diagram: message sequence between the two hosts. Labels: Zero stats, OK done, Send data frames at regular intervals, Get remote statistics, Send statistics; measured quantities: Time to send, Time to receive, Inter-packet time]
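
A rough sketch of the throughput test described above. It assumes each frame carries a 4-byte sequence number at the start of its payload and that the receiver logs an arrival time per frame; neither detail is stated on the slides, and `send_burst` and `summarise_burst` are hypothetical names. CPU load and interrupt counts are not covered.

```python
import socket
import struct
import time


def send_burst(sock, addr, n_frames, frame_bytes, spacing_us):
    """Send n_frames UDP frames, one every spacing_us; return time to send (us).

    Busy-waits for each transmit slot, since sleep() is too coarse at
    microsecond spacings. frame_bytes must be at least 4.
    """
    padding = bytes(frame_bytes - 4)
    start = time.perf_counter()
    for seq in range(n_frames):
        target = start + seq * spacing_us * 1e-6
        while time.perf_counter() < target:
            pass
        sock.sendto(struct.pack("!I", seq) + padding, addr)
    return (time.perf_counter() - start) * 1e6


def summarise_burst(arrivals, n_sent, frame_bytes):
    """Summarise one burst at the receiver.

    arrivals: list of (sequence_number, receive_time_us) in arrival order.
    """
    n_recv = len(arrivals)
    lost = n_sent - len({seq for seq, _ in arrivals})

    # A frame is out of order if it arrives after a higher sequence number.
    out_of_order = 0
    highest = -1
    for seq, _ in arrivals:
        if seq < highest:
            out_of_order += 1
        else:
            highest = seq

    times = [t for _, t in arrivals]
    spacings = [b - a for a, b in zip(times, times[1:])]
    time_to_receive = times[-1] - times[0] if n_recv > 1 else 0.0
    # bits per microsecond is numerically Mbit/s (user data only)
    rate = (n_recv - 1) * frame_bytes * 8 / time_to_receive if time_to_receive else 0.0
    return {
        "received": n_recv,
        "lost": lost,
        "out_of_order": out_of_order,
        "mean_spacing_us": sum(spacings) / len(spacings) if spacings else 0.0,
        "time_to_receive_us": time_to_receive,
        "rate_mbit_s": rate,
    }
```

Repeating the burst while varying `frame_bytes` and `spacing_us` gives the throughput and loss as a function of frame size and transmit spacing.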

The Measurements (3)
• PCI Activity
• Logic Analyzer with:
  - PCI Probe cards in sending PC
  - Gigabit Ethernet Fiber Probe Card
  - PCI Probe cards in receiving PC
[Diagram: sending and receiving PCs (CPU, mem, chipset, NIC) linked by Gigabit Ethernet, with PCI probes and a Gigabit Ethernet probe feeding the Logic Analyser display]

Latency: Alteon AceNIC
Motherboard: SuperMicro 370DLE
Chipset: ServerWorks III LE
CPU: PIII 800 MHz
PCI: 64 bit 66 MHz
RedHat 7.1 Kernel

UDP Throughput: Alteon AceNIC
Motherboard: SuperMicro 370DLE
Chipset: ServerWorks III LE
CPU: PIII 800 MHz
PCI: 64 bit 66 MHz
RedHat 7.1 Kernel

PCI: Alteon AceNIC
Motherboard: SuperMicro 370DLE
Chipset: ServerWorks III LE
CPU: PIII 800 MHz
PCI: 64 bit
RedHat 7.1 Kernel
• ALT33102 PCI 33 MHz: 1400 bytes sent, wait 16 us
• ALT MHz: 1400 bytes sent, wait 16 us
• NIC cannot sustain 66 MHz
[Logic analyser traces: Send PCI, Receive PCI]

Latency: SysKonnect SK-9843
Motherboard: SuperMicro 370DLE
Chipset: ServerWorks III LE
CPU: PIII 800 MHz
PCI: 64 bit 33 MHz
RedHat 7.1 Kernel
• Latency low, good
• Latency well behaved
• Slope us/byte
• Expect: PCI GigE 0.008 PCI us/byte
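
As a worked illustration of the "Expect" terms, the per-byte times can be estimated from nominal bus and link rates. This sketch assumes theoretical peak PCI burst rates (bus width times clock, with no arbitration or setup overhead) and leaves out the memory-copy terms; the function names are hypothetical, and the results are estimates under those assumptions, not the values from the original slide.

```python
def pci_us_per_byte(width_bits, clock_mhz):
    """Peak PCI transfer time per byte: one bus-width word per clock cycle."""
    bytes_per_us = (width_bits / 8) * clock_mhz  # clock_mhz cycles per microsecond
    return 1.0 / bytes_per_us


def gige_us_per_byte(line_rate_gbit=1.0):
    """Serialisation time per byte on the wire: 8 bits at the line rate."""
    bits_per_us = line_rate_gbit * 1000.0
    return 8.0 / bits_per_us


def expected_slope_us_per_byte(width_bits, clock_mhz):
    """Sending PCI + Gigabit Ethernet + receiving PCI, memory copies omitted."""
    pci = pci_us_per_byte(width_bits, clock_mhz)
    return pci + gige_us_per_byte() + pci


if __name__ == "__main__":
    for mhz in (33, 66):
        print(f"64 bit {mhz} MHz PCI: {pci_us_per_byte(64, mhz):.5f} us/byte, "
              f"expected slope {expected_slope_us_per_byte(64, mhz):.5f} us/byte")
```

The Gigabit Ethernet term reproduces the 0.008 us/byte quoted on the slide; a measured slope well above the sum suggests extra per-byte costs such as memory copies.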

UDP Throughput: SysKonnect SK-9843
Motherboard: SuperMicro 370DLE
Chipset: ServerWorks III LE
CPU: PIII 800 MHz
PCI: 64 bit 33 MHz
RedHat 7.1 Kernel
• Max throughput 690 Mbit/s
• No packet loss
• Packet loss during drop

PCI: SysKonnect SK-9843
Motherboard: SuperMicro 370DLE
Chipset: ServerWorks III LE
CPU: PIII 800 MHz
PCI: 64 bit 66 MHz
RedHat 7.1 Kernel
• SK bytes sent, wait 100 us
• ~8 us for send or receive
[Logic analyser trace label: Gigabit Ethernet frame]

PCI: SysKonnect SK-9843
Motherboard: SuperMicro 370DLE
Chipset: ServerWorks III LE
CPU: PIII 800 MHz
PCI: 64 bit 66 MHz
RedHat 7.1 Kernel
• SK bytes sent, wait 20 us
• SK bytes sent, wait 10 us
• Frames are back-to-back: cannot go any faster!
[Logic analyser trace label: Gig Eth frames back to back]
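
To see why frames end up back-to-back once the requested spacing gets short, it helps to estimate how long one UDP frame occupies the wire. The sketch below assumes IPv4 without options and standard untagged Ethernet framing; the 1400-byte payload is an assumption borrowed from other slides, since the frame size on this slide did not survive. Function names are hypothetical.

```python
# Per-frame overhead in bytes, assuming IPv4 without options and no VLAN tag.
UDP_HEADER = 8
IP_HEADER = 20
ETH_HEADER = 14
ETH_FCS = 4
PREAMBLE_SFD = 8
INTER_FRAME_GAP = 12
OVERHEAD = UDP_HEADER + IP_HEADER + ETH_HEADER + ETH_FCS + PREAMBLE_SFD + INTER_FRAME_GAP


def wire_time_us(payload_bytes, line_rate_gbit=1.0):
    """Time one UDP frame occupies the link, including preamble and gap."""
    bits = (payload_bytes + OVERHEAD) * 8
    return bits / (line_rate_gbit * 1000.0)  # line rate in bits per microsecond


def max_user_rate_mbit_s(payload_bytes, line_rate_gbit=1.0):
    """Upper bound on UDP user-data throughput with frames back-to-back."""
    return payload_bytes * 8 / wire_time_us(payload_bytes, line_rate_gbit)


def is_back_to_back(payload_bytes, spacing_us):
    """True if the requested spacing is no longer than the wire time."""
    return spacing_us <= wire_time_us(payload_bytes)


if __name__ == "__main__":
    print(f"wire time for 1400 bytes: {wire_time_us(1400):.3f} us")
    print(f"max user data rate: {max_user_rate_mbit_s(1400):.1f} Mbit/s")
    print("10 us spacing back-to-back:", is_back_to_back(1400, 10))
```

Under these assumptions a 1400-byte payload needs about 11.7 us on the wire, so any requested spacing below that cannot be honoured by the link itself.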

Latency: Intel Pro/1000
Motherboard: SuperMicro 370DLE
Chipset: ServerWorks III LE
CPU: PIII 800 MHz
PCI: 64 bit 66 MHz
RedHat 7.1 Kernel
• Latency high
• Latency well behaved
• Slope us/byte
• Expect: PCI GigE 0.008 PCI us/byte

PCI: Intel Pro/1000
Motherboard: SuperMicro 370DLE
Chipset: ServerWorks III LE
CPU: PIII 800 MHz
PCI: 64 bit 66 MHz
RedHat 7.1 Kernel
• IT66M bytes sent
• CSR time: 1.75 us
• Data time: 0.25 us
• Interrupt delay: ~70 us
[Logic analyser trace label: 1400 response]

Throughput: Intel Pro/1000
Motherboard: SuperMicro 370DLE
Chipset: ServerWorks III LE
CPU: PIII 800 MHz
PCI: 64 bit 66 MHz
RedHat 7.1 Kernel
• Max throughput 910 Mbit/s
• No packet loss
• Packet loss during drop

Throughput: Intel Pro/1000
Motherboard: SuperMicro 370DLE
Chipset: ServerWorks III LE
CPU: PIII 800 MHz
PCI: 64 bit 66 MHz
RedHat 7.1 Kernel
• Losses occur in groups
• ~50 pkts every 140
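
One way to check whether losses come in groups is to look for runs of consecutive missing sequence numbers. This is a sketch under the assumption that every frame carries a sequence number starting at 0, which the slides do not state; `loss_bursts` and `burst_periods` are hypothetical helper names.

```python
def loss_bursts(received_seqs, n_sent):
    """Return a list of (first_lost_seq, run_length) for each run of lost frames."""
    received = set(received_seqs)
    bursts = []
    run_start = None
    for seq in range(n_sent):
        if seq not in received:
            if run_start is None:
                run_start = seq  # a new run of losses begins here
        elif run_start is not None:
            bursts.append((run_start, seq - run_start))
            run_start = None
    if run_start is not None:
        # the burst ended while frames were still being lost
        bursts.append((run_start, n_sent - run_start))
    return bursts


def burst_periods(bursts):
    """Distance in frames between the starts of successive loss runs."""
    starts = [start for start, _ in bursts]
    return [b - a for a, b in zip(starts, starts[1:])]
```

Roughly equal run lengths together with roughly equal periods would indicate periodic grouped loss rather than random single-frame drops.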

PCI: Intel Pro/1000
Motherboard: SuperMicro 370DLE
Chipset: ServerWorks III LE
CPU: PIII 800 MHz
PCI: 64 bit 66 MHz
RedHat 7.1 Kernel
• IT66M bytes sent, wait 11 us
  - ~4.7 us on send PCI bus
  - PCI bus ~45% occupancy
  - ~3.25 us on PCI for data recv
• IT66M bytes sent, wait 11 us
  - Packets lost
  - Action of pause packet?

Latency: Intel Pro/1000 on P4CD6+
Motherboard: SuperMicro P4CD6+
Chipset: Intel i860
CPU: Dual Xeon Prestonia (2cpu/die) 2.2 GHz
PCI: 64 bit 66 MHz
RedHat 7.1 Kernel
• Latency high
• Slope us/byte
• Expect: PCI GigE 0.008 PCI us/byte

Throughput: Intel Pro/1000 on P4CD6+
Motherboard: SuperMicro P4CD6+
Chipset: Intel i860
CPU: Dual Xeon Prestonia (2cpu/die) 2.2 GHz
PCI: 64 bit 66 MHz
RedHat 7.1 Kernel
• Max throughput 950 Mbit/s
• No packet loss
• Negligible packet loss

PCI: Intel Pro/1000 on P4CD6+
Motherboard: SuperMicro P4CD6+
Chipset: Intel i860
CPU: Dual Xeon Prestonia (2cpu/die) 2.2 GHz
PCI: 64 bit 66 MHz
RedHat 7.1 Kernel
• IT66M bytes sent, wait 1000 us
  - CSR time: us
  - Data time: 5.0 us
  - Interrupt delay: ~79 us
• IT66M bytes sent, wait 100 us (detail)
• Chipset limits PCI transfers with STOPs
• Try i870 Chipset

PCI: Intel Pro/1000 on P4CD6+
Motherboard: SuperMicro P4CD6+
Chipset: Intel i860
CPU: Dual Xeon Prestonia (2cpu/die) 2.2 GHz
PCI: 64 bit 66 MHz
RedHat 7.1 Kernel
• IT66M bytes sent, wait 11 us

Latency: Intel Pro/1000 on IBM board
Motherboard: IBM das
Chipset: ServerWorks CNB20LE
CPU: Dual PIII 1 GHz
PCI: 64 bit 33 MHz
RedHat 7.1 Kernel
• Latency high
• Latency well behaved
• Slope us/byte
• Expect: PCI GigE 0.008 PCI us/byte

Throughput: Intel Pro/1000 on IBM board
Motherboard: IBM das
Chipset: ServerWorks CNB20LE
CPU: Dual PIII 1 GHz
PCI: 64 bit 33 MHz
RedHat 7.1 Kernel
• Max throughput 930 Mbit/s
• No packet loss
• Packet loss during drop

PCI: Intel Pro/1000 on IBM board
Motherboard: IBM das
Chipset: ServerWorks CNB20LE
CPU: Dual PIII 1 GHz
PCI: 64 bit 33 MHz
RedHat 7.1 Kernel
• uva64m bytes sent, wait 11 us
  - ~9.3 us on send PCI bus
  - PCI bus ~82% occupancy
  - ~5.9 us on PCI for data recv

Latency: Intel Pro/1000 on P4DP6
Motherboard: SuperMicro P4DP6
Chipset: Intel E7500 (Plumas)
CPU: Dual Xeon Prestonia (2cpu/die) 2.2 GHz
Slot 4: PCI, 64 bit, 66 MHz
RedHat 7.2 Kernel
• Latency high but smooth
• Indicates interrupt coalescence
• Slope us/byte
• Expect: PCI GigE 0.008 PCI us/byte

Throughput: Intel Pro/1000 on P4DP6
• Max throughput 950 Mbit/s
• Some throughput drop for packets >1000 bytes
• Packet loss small, 800 – 1000 byte packets
Motherboard: SuperMicro P4DP6
Chipset: Intel E7500 (Plumas)
CPU: Dual Xeon Prestonia (2cpu/die) 2.2 GHz
Slot 4: PCI, 64 bit, 66 MHz
RedHat 7.2 Kernel

PCI: Intel Pro/1000 on P4DP6
• ITP bytes sent, wait 1000 us (slot 3 to slot 5)
  - Send: CSR time 2.0 us
  - Send: data time 3.25 us
  - Recv: data time 2.2 us
• ITP4001: detail of 1400 bytes sent (slot 4 to slot 4)
  - CSR time 2.2 us
  - Data time 3.2 us
• Small differences between slots
Motherboard: SuperMicro P4DP6
Chipset: Intel E7500 (Plumas)
CPU: Dual Xeon Prestonia (2cpu/die) 2.2 GHz
Slots: PCI, 64 bit, 66 MHz
RedHat 7.2 Kernel

PCI: Intel Pro/1000 on P4DP6
• ITP bytes sent, wait 8 us
  - ~5.14 us on send PCI bus
  - PCI bus ~68% occupancy
  - ~2 us on PCI for data recv
Motherboard: SuperMicro P4DP6
Chipset: Intel E7500 (Plumas)
CPU: Dual Xeon Prestonia (2cpu/die) 2.2 GHz
Slot 3-5: PCI, 64 bit, 66 MHz
RedHat 7.2 Kernel