1 Commodity Flash-Based Systems at 40GbE - FIONA. Philip Papadopoulos*, Tom DeFanti, Larry Smarr, John Graham. Qualcomm Institute, UCSD. *Also San Diego Supercomputer Center

2 FIONA Internal Block Diagram (some channels close to saturation)
[Block diagram: Intel Haswell CPU (3.5 GHz, 32GB DDR-1600, 68GB/s memory bandwidth); LSI 9300 HBA (16 x 1200MB/s SAS3 ports) in a PCIe Gen3 slot; 2 x 40GbE NIC on x8 Gen3; I/O controller on x8 Gen2 with SATA3 ports. Link rates: PCIe Gen3 ~1GB/s per lane, SATA3 600MB/s, SAS3 1200MB/s.]
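The saturation claim can be checked with back-of-envelope arithmetic; a sketch, where the ~985MB/s usable per PCIe Gen3 lane and ~5GB/s per 40GbE port figures are assumptions, and the SSD read rate comes from the parts list on the next slide:

```shell
# Approximate aggregate bandwidths from the diagram, in MB/s.
echo "8 SSDs:        $((8 * 540)) MB/s"   # 4320 MB/s of flash reads
echo "PCIe Gen3 x8:  $((8 * 985)) MB/s"   # ~7880 MB/s usable in one x8 slot
echo "2 x 40GbE:     $((2 * 5000)) MB/s"  # 10000 MB/s, more than one x8 slot carries
```

With both 40GbE ports active, the network alone can outrun a single x8 Gen3 slot, which is why some channels sit close to saturation.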

3 FIONA – Classic. Approximate cost $6K; 4TB flash, 32TB hard disk.
- MB: Supermicro X10SRL-F, Socket 2011-v3
- CPU: Xeon E5-1620v3 (3.50GHz, 68GB/s memory bandwidth, ECC memory)
- SAS controller: LSI 9300-16i (12Gb/s x 16 ports, PCIe Gen3 x8)
- Network: Myricom (2 x 10GbE) OR Mellanox 1 x 40GbE (PCIe Gen3 x8)
- 8 x SSD: Intel 535 (540MB/s R, 520MB/s W each), via SAS-to-SATA cable 8643-4xSFF
- 8 x hard drive: Western Digital 4TB RED (SATA 2)

4 Performance Anomalies: Where we WERE a year ago. Very inconsistent performance.
Iperf testing:
- 10GbE (C = FIONA Classic)
- 10GbE (Rocks = 5-year-old Rocks node)
- 40GbE (R = FIONA Rackmount)
- Network is isolated
Most performance was within expectation EXCEPT 40GbE -> 10GbE when the thread count is > 3. The inconsistency also manifests itself in 40GbE-to-40GbE testing. Legend: R-To-C(2) == Rackmount-to-Classic (2 threads).
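A sweep like the one behind these results can be reproduced with iperf3; a minimal sketch, where the hostname fiona-classic and the thread counts are assumptions (run "iperf3 -s" on the receiver first):

```shell
# Sweep parallel-stream counts against an iperf3 server.
# -P sets the number of parallel streams, -t the duration in seconds.
# Here each client command is printed before it would be run.
for threads in 1 2 3 4 8; do
    cmd="iperf3 -c fiona-classic -P $threads -t 30"
    echo "$cmd"
    # Uncomment to actually run the sweep:
    # $cmd
done
```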

5 Long Story – The Short Version
- Mellanox 40GbE performance was very erratic on dual-socket systems: 100Mbps – 39Gbps of variation (over a few minutes).
- "Good" cores and "bad" cores? Not the problem.
- Mask interrupts to certain cores (affinity)? Not the problem.
- Did just about everything in the Mellanox tuning guide (30 pages). No help.
- Spent months and months trying to understand what was going on. Threw the "tuning guide" in the trash and kept searching.
- Looked at memory performance (stream). Not an issue.
- On a whim: turned OFF TCP offloading -> performance evened out at 25Gbps. Not peak speed, but consistent performance.
- This was a driver memory-buffer issue.
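Toggling offloads of this kind is done through ethtool; a sketch, where the interface name eth0 is an assumption, and which specific offload knob mattered in this case is also an assumption (LRO/GRO are the usual receive-side suspects):

```shell
# Inspect which offloads are currently enabled on the interface.
ethtool -k eth0

# Disable receive-side offloads. These settings do not persist
# across reboots, so add them to an init script if they help.
ethtool -K eth0 lro off gro off
```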

6 The Core Issue
- Definitively characterized the issue to Mellanox in Nov 2014 with 100% reproducibility.
- A "patched" driver was tested in December; problem fixed.
- Mellanox published the updated driver in Jan 2015.
- Each new release requires retesting to ensure no regressions.
[Diagram: dual-socket system; CPU0 and CPU1 each have local memory and are linked by QPI; the NIC's PCIe slot hangs off one socket. Pin the iperf receiver to any core on one socket == 39Gbps; pin it to any core on the other socket == 8Gbps.]
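The socket-dependent behavior in the diagram can be reproduced by pinning the receiver explicitly; a sketch, where the interface name eth0 and NUMA node 0 are assumptions:

```shell
# Which NUMA node does the NIC's PCIe slot attach to?
# (-1 means the kernel does not know.)
cat /sys/class/net/eth0/device/numa_node

# Pin the iperf3 receiver (and its memory) to a chosen node;
# repeat with the other node to compare throughput across QPI.
numactl --cpunodebind=0 --membind=0 iperf3 -s
```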

7 Performance Today with the Latest Driver
Compared: # ethtool -C eth0 adaptive-rx off vs. reboot (default)

8 Rackmount and Desktop Versions. Options for GPUs Online Parts Spreadsheet

9 Software
- CentOS 6.6/6.7 with the updated Mellanox driver
- ZFS (http://openzfs.org)
- Command-line tools included in the perfSONAR toolkit: iperf3, nuttcp, bwctl, owamp, ...
- Rational sysctl settings for windows/buffers
Data Access
- Standard Linux tools for local data access (NFS, Samba, SCP, ...)
- Data transfer tools (pick from your favorites): FDT, GridFTP, UDT-based, XRootD, custom code
We manage via the Rocks toolkit, but that is not absolutely essential.
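"Rational sysctl settings" for long fat 40GbE paths typically means raising the TCP window/buffer limits; a sketch, where the specific values are assumptions drawn from common perfSONAR/ESnet-style guidance, not the authors' exact configuration:

```shell
# Raise socket buffer ceilings (min default max for the tcp_* knobs).
# Persist by placing the same keys in /etc/sysctl.conf.
sysctl -w net.core.rmem_max=67108864
sysctl -w net.core.wmem_max=67108864
sysctl -w net.ipv4.tcp_rmem="4096 87380 33554432"
sysctl -w net.ipv4.tcp_wmem="4096 65536 33554432"
```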

