Presentation is loading. Please wait.

Presentation is loading. Please wait.

DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 1 High Throughput: Progress and Current Results Lots of people helped: MB-NG team at UCL MB-NG.

Similar presentations


Presentation on theme: "DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 1 High Throughput: Progress and Current Results Lots of people helped: MB-NG team at UCL MB-NG."— Presentation transcript:

1 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 1 High Throughput: Progress and Current Results Lots of people helped: MB-NG team at UCL MB-NG team at Manchester Andrew McNab MB - NG

2 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 2 What’s Involved in Performance  End Hosts  NIC: operation & PCI design – use of modern PCI commands  Chipset & PCIX  CPU, memory & memory-bus  OS kernel: drivers & TCP UDP IP stack  Disk sub-system: disks, controllers & interconnects  Routers  Blades: operation  Switching / routing fabric: bus, crossbar, non-blocking operation  Policies  The Network (s)  Framing  Bandwidth  Load & Congestion

3 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 3 PC MB – NG SuperJANET4 Development Network (22 Mar 02) UCL OSM- 1OC48- POS-SS MCC OSM- 1OC48- POS-SS MAN Gigabit Ethernet 2.5 Gbit POS Access 2.5 Gbit POS core MPLS Admin. Domains SJ4 Dev PC 3ware RAID0 PC 3ware RAID0

4 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 4 Initial Measurements  UDP throughput very variable  TCP throughput very variable  Lon  Man < Man  Lon ?  Investigations:  Packet loss several %  Errors on interface (ifconfig)  Overruns

5 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 5 txqueuelen-vs-sendstalls  TCP throughput very variable  Lon  Man < Man  Lon ?  throughput very variable

6 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 6 Interrupt Coalescence Investigations  TCP mem-mem lon2-man1  Tx 64 Tx-abs 64  Rx 0 Rx-abs 128  820-980 Mbit/s +- 50 Mbit/s  Tx 64 Tx-abs 64  Rx 20 Rx-abs 128  937-940 Mbit/s +- 1.5 Mbit/s  Tx 64 Tx-abs 64  Rx 80 Rx-abs 128  937-939 Mbit/s +- 1 Mbit/s

7 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 7 24 Hours HighSpeed TCP mem-mem  TCP mem-mem lon2-man1  Tx 64 Tx-abs 64  Rx 64 Rx-abs 128  941.5 Mbit/s +- 0.5 Mbit/s

8 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 8 Raid0 Performance (1)  Maxdor 3.5 Series DiamondMax PLus 9 120 Gb ATA/133  Write Slight increase with number of disks  Read  3 Disks OK  Write 100 MBytes/s  Read 130 MBytes/s

9 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 9 Raid0 Performance (2)  Maxdor 3.5 Series DiamondMax PLus 9 120 Gb ATA/133  No difference for Write  Larger Stripe lower the performance  Write 100 MBytes/s  Read 120 MBytes/s

10 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 10 Gridftp Throughput HighSpeedTCP  Int Coal 64 128  Txqueuelen 2000  TCP buffer 1 M byte (rtt*BW = 750kbytes)  Interface throughput  Acks received  Data moved  520 Mbit/s  Same for B2B tests  So its not that simple!

11 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 11 Gridftp Throughput + Web100  Throughput Mbit/s:  See alternate 600/800 Mbit and zero  Cwnd smooth  No dup Ack / send stall / timeouts

12 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 12 Gridftp Throughput + Web100  Throughput Mbit/s vs Recv Window Size  Zero throughput independent of Recv Window Size  Bytes sent  Bytes received  Waits 0.4s at start !

13 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 13 http data transfers HighSpeed TCP  Apachie web server out of the box!  prototype client - curl http library  1Mbyte TCP buffers  2Gbyte file  Throughput 72 MBytes/s  Cwnd - some variation  No dup Ack / send stall / timeouts

14 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 14 http data transfers (2)  Limited by:  Sender  Receive window size

15 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 15  Int Coal 64 128  Txqueuelen 500 (no stall in rtt)  TCP buffer 750k byte (rtt*BW = 750k bytes)  1 stream every 60 s:  man1  lon2  man2  lon2  man3  lon2  Sample ever 10ms  Send rates:  940 Mbit/s  450 Mbit/s  300 Mbit/s TCP sharing man1-lon2 mem-mem web100

16 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 16  Int Coal 64 128  Txqueuelen 500 (no stall in rtt)  TCP buffer 750k byte (rtt*BW = 750k bytes)  1 stream every 60 s:  man1  lon2  man2  lon2  man3  lon2  Sample ever 10ms  Time in send limit:  Sender  Cwind  Recv wind TCP sharing man1-lon2 mem-mem web100

17 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 17  1Stream:  No Dup ACKs  No SACKs  No Sendstalls  Why does Cwnd vary TCP sharing man1-lon2 the WHY?

18 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 18  2Streams:  Many Dup ACKs  Many SACKs   Why does Cwnd have large variations 2 TCP streams man1-lon2 - the WHY?

19 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 19  2Streams:  Dips in throughput due to Dup ACK   ~4 losses /sec  A bit regular ?  Cwnd decreases:  1 point 33%  Ramp starts at 62%  Slope 70Bytes/us 2 TCP streams man1-lon2 - the WHY? (2) 1 sec

20 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 20  3Streams:  Dips in throughput due to Dup ACK  3 TCP streams man1-lon2 - the WHY? 10 sec

21 DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 21 TCP sharing man1-lon2 - the WHY?  There is (a) correlation 


Download ppt "DataTAG Meeting CERN 7-8 May 03 R. Hughes-Jones Manchester 1 High Throughput: Progress and Current Results Lots of people helped: MB-NG team at UCL MB-NG."

Similar presentations


Ads by Google