Published by Nickolas Ferguson. Modified over 9 years ago.
1
Experiences Tuning Cluster Hosts: 1GigE and 10GbE
Paul Hyder
Cooperative Institute for Research in Environmental Sciences, CU Boulder
(CIRES at NOAA/ESRL/GSD High Performance Computing)
Paul.Hyder at noaa.gov
2
Tuning Focus
• Cluster front ends and cron server hosts
• File transfer servers (scponly)
• BWCTL host
• Remote client hosts
• 10GbE testbed (7.2 Gb/sec uses ~49% of one 3 GHz CPU)
3
How We Apply the Well Known Rules
• Jumbo frames
  – 8K on hosts
  – 9K on network
• Tune TCP to match the bandwidth-delay product (BDP)
• Encourage application writers to use large read and write buffers
• Install tuned applications
  – PSC.edu patch to ssh (OpenSSH: channels.h)
      #define CHAN_TCP_PACKET_DEFAULT (32*1024)
      #define CHAN_TCP_WINDOW_DEFAULT (4*CHAN_TCP_PACKET_DEFAULT)
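The "match BDP" rule comes down to a little arithmetic. The sketch below is illustrative only: the function names and the example link speed and round-trip time are assumptions, not figures from the talk. It computes the bandwidth-delay product for a path and the throughput ceiling implied by a fixed window, such as the 128 KiB channel window produced by the patched `channels.h` values on this slide.

```python
def bdp_bytes(bandwidth_bits_per_sec, rtt_seconds):
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return bandwidth_bits_per_sec * rtt_seconds / 8


def window_limited_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput (bits/s) when at most one window is sent per RTT."""
    return window_bytes * 8 / rtt_seconds


# Values taken from the channels.h defines on the slide.
CHAN_TCP_PACKET_DEFAULT = 32 * 1024
CHAN_TCP_WINDOW_DEFAULT = 4 * CHAN_TCP_PACKET_DEFAULT  # 131072 bytes

# Hypothetical path (an assumption): 1 Gb/s with a 50 ms round-trip time.
example_bdp = bdp_bytes(1e9, 0.050)
example_cap = window_limited_bps(CHAN_TCP_WINDOW_DEFAULT, 0.050)

print(f"BDP: {example_bdp / 2**20:.2f} MiB")
print(f"128 KiB window cap: {example_cap / 1e6:.1f} Mb/s")
```

On such a path the buffer needed to fill the link is far larger than a 128 KiB window, which is why both the TCP buffers and the application-level windows need attention.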
4
Throughput Testing
• Iperf (2.0.2) from shell scripts
  – Vary buffer (-l) and window (-w)
  – Modify ifconfig and PCI configuration
  – Loop takes 3 days
• Bwctl with remote hosts
  – Anyone on NLR?
• Use scp/sftp/rsync as final test
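A parameter sweep of this kind might be generated as follows. This is a minimal sketch, not the scripts used for the talk: the server name, the buffer and window values, and the test duration are all hypothetical. Only the iperf 2 client flags `-c`, `-l`, `-w`, and `-t` are used, and the commands are printed rather than executed.

```python
import itertools
import shlex


def iperf_sweep(server, buffers, windows, seconds=10):
    """Yield one iperf client command per (buffer length, window size) pair."""
    for buf, win in itertools.product(buffers, windows):
        yield ["iperf", "-c", server, "-l", buf, "-w", win, "-t", str(seconds)]


# Hypothetical sweep values; a real loop would cover many more combinations.
commands = list(
    iperf_sweep(
        "hostB.example.org",
        buffers=["8K", "64K", "128K"],
        windows=["256K", "2M", "16M"],
    )
)

for cmd in commands:
    print(shlex.join(cmd))
```

Printing the commands first makes it easy to review the sweep before committing to a multi-day run.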
5
I’m Curious
• How much TCP tuning information do you provide users and admins?
• Are hosts being tuned?
• Does your internal LAN support jumbo frames?
6
GSD Cluster GigE Defaults
• [wr]mem_default: 2MB
• [wr]mem_max: 16MB
• ipv4/tcp_[wr]mem: 64KB 2MB 16MB (min, default, max)
• optmem_max: 512K
• txqueuelen: 10000
• netdev_max_backlog: 3000
• ipv4/tcp_sack and ipv4/tcp_timestamps: on
• Don’t touch ipv4/tcp_mem
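As a rough illustration, the defaults above can be rendered as `sysctl.conf`-style lines. The sketch assumes the slide's KB and MB are binary units (1024-based) and that the abbreviated names map to the standard Linux `net.core.*` and `net.ipv4.*` keys; the function names are hypothetical. `txqueuelen` is left out because it is a per-interface setting rather than a sysctl, and `tcp_mem` is deliberately untouched, as the slide advises.

```python
KB = 1024
MB = 1024 * KB  # assumption: the slide's sizes are binary units


def gige_sysctl_settings():
    """Map the slide's GigE defaults onto Linux sysctl keys."""
    tcp_triplet = f"{64 * KB} {2 * MB} {16 * MB}"  # min, default, max
    return {
        "net.core.rmem_default": 2 * MB,
        "net.core.wmem_default": 2 * MB,
        "net.core.rmem_max": 16 * MB,
        "net.core.wmem_max": 16 * MB,
        "net.ipv4.tcp_rmem": tcp_triplet,
        "net.ipv4.tcp_wmem": tcp_triplet,
        "net.core.optmem_max": 512 * KB,
        "net.core.netdev_max_backlog": 3000,
        "net.ipv4.tcp_sack": 1,
        "net.ipv4.tcp_timestamps": 1,
    }


def render(settings):
    """Format settings as 'key = value' lines, one per setting."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


settings = gige_sysctl_settings()
print(render(settings))
```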
7
Jumbo Frame Plot
8
What Doesn’t Work
• Jumbo frames
  – Switch fabrics
    • High density cards
    • Complex vLAN configurations
    • Stand alone GigE switches
  – Firewalls
  – ICMP for path MTU discovery
    • Disabled completely
    • Network devices don’t respond
9
Linux 2.6 and Jumbos
IP hostA.52434 > hostB.22: S 544:544(0) win 16304
IP hostB.22 > hostA.52434: S 207:207(0) ack 545 win 5792
...
IP hostA.52434 > hostB.22: . 2255:6599(4344) ack 2293 win 16304
IP hostA.52434 > hostB.22: P 6599:10943(4344) ack 2293 win 16304
IP router > hostA: icmp 36: hostB unreachable - need to frag (mtu 1500)
IP hostA.52434 > hostB.22: . 2255:3703(1448) ack 2293 win 16304
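The segment sizes in this trace can be checked with simple header arithmetic. The sketch below assumes IPv4 and TCP headers without options other than TCP timestamps (which the GigE defaults slide turns on); the constant and function names are illustrative. With a 1500-byte MTU this gives the 1448-byte payload seen in the retransmission after the ICMP "need to frag" message, and the earlier 4344-byte segments are exactly three times that size.

```python
IPV4_HEADER = 20       # bytes, no IP options assumed
TCP_HEADER = 20        # bytes, base TCP header
TIMESTAMP_OPTION = 12  # 10-byte timestamp option padded to 12 bytes


def tcp_payload_per_segment(mtu, timestamps=True):
    """TCP payload bytes that fit in one packet of the given MTU."""
    overhead = IPV4_HEADER + TCP_HEADER
    if timestamps:
        overhead += TIMESTAMP_OPTION
    return mtu - overhead


payload_1500 = tcp_payload_per_segment(1500)
print(payload_1500)                  # 1448
print(divmod(4344, payload_1500))    # (3, 0)
```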
10
Host Side Checks
• Interrupt aggregation (Linux NAPI)
• Memory to match buffer tuning
• More than one CPU
• Static ARP entries
11
Network Device Settings
• Static ARP entries or increase timeout
• Increase FDB timeouts
• Verify jumbo frame configuration
12
10GbE Quick Notes
• Know your PCI hardware (MMRBC, latency timer, and splits)
• TCP stack is ~0.200 ms
• Increase netdev_max_backlog to 30000 (throughput = backlog * 100 Hz * ave_bytes_pkt)
• Set *_cong to CERN values
• Write buffers in code ~128KB
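The backlog rule of thumb on this slide can be worked through numerically. This is a sketch under stated assumptions: it reads the rate factor in the slide's formula as a kernel timer tick rate of 100 per second, and it uses a hypothetical average packet size of 1500 bytes; the function name is illustrative.

```python
def backlog_throughput_bits(backlog_packets, ticks_per_sec, avg_bytes_per_packet):
    """Throughput bound in bits/s: backlog * tick rate * average packet bytes * 8."""
    return backlog_packets * ticks_per_sec * avg_bytes_per_packet * 8


# Assumed inputs: 100 ticks per second, 1500-byte average packets.
gige_bound = backlog_throughput_bits(3000, 100, 1500)
tengig_bound = backlog_throughput_bits(30000, 100, 1500)

print(f"backlog 3000:  {gige_bound / 1e9:.1f} Gb/s")
print(f"backlog 30000: {tengig_bound / 1e9:.1f} Gb/s")
```

Under these assumptions the GigE default of 3000 bounds throughput at about 3.6 Gb/s, below 10GbE line rate, while 30000 raises the bound to about 36 Gb/s.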
13
10G buffer plot
14
Questions?
15
Reference URLs
• http://www.psc.edu/networking/projects/hpn-ssh/
• http://dast.nlanr.net/Projects/Iperf/
• http://www.sublimation.org/scponly/
• http://e2epi.internet2.edu/bwctl/
  – http://abilene.internet2.edu/ami/bwctl_status.cgi/TCP/now
• http://www.tcptrace.org/
• http://ultralight.caltech.edu/
• http://staff.science.uva.nl/~delaat/articles/2003-7-10gige.pdf
• http://www.csm.ornl.gov/~dunigan/netperf/netlinks.html
• http://www.psc.edu/networking/projects/tcptune/
• http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26310.pdf