RDMA vs TCP experiment.


Goal
Environment
Test tool - iperf
Test Suites
Conclusion

Goal
  Test maximum and average bandwidth usage in 40 Gbps (InfiniBand) and 10 Gbps (iWARP) network environments.
  Compare CPU usage between the TCP and RDMA data transfer modes.
  Compare CPU usage between the RDMA READ and RDMA WRITE modes.
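The CPU-usage comparison in the goal can be made concrete with a small measurement harness. The following is a minimal sketch, not the method used in the experiment: it assumes CPU usage is taken as (user + system CPU time) divided by wall-clock time for the measured process, and `busy_copy` is a hypothetical stand-in for a real transfer loop.

```python
import os
import time


def cpu_utilization(work):
    """Run work() and return (cpu_seconds, wall_seconds, cpu_fraction)."""
    t0 = os.times()
    w0 = time.perf_counter()
    work()
    t1 = os.times()
    wall = time.perf_counter() - w0
    # CPU time charged to this process: user mode plus kernel (system) mode.
    cpu = (t1.user - t0.user) + (t1.system - t0.system)
    return cpu, wall, (cpu / wall if wall > 0 else 0.0)


def busy_copy():
    # Hypothetical workload: copy a 1 MiB buffer 200 times, standing in
    # for the memory copies a TCP transfer would perform.
    src = bytes(1 << 20)
    for _ in range(200):
        bytearray(src)


if __name__ == "__main__":
    cpu, wall, frac = cpu_utilization(busy_copy)
    print(f"cpu={cpu:.3f}s wall={wall:.3f}s utilization={frac:.0%}")
```

Running the same harness around a TCP transfer and an RDMA transfer would give directly comparable utilization figures, since both are normalized by wall-clock time.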

Environment
  40 Gbps InfiniBand
  10 Gbps iWARP
  Netqos03/client
  Netqos04/server
  Is there a switch between the two servers?

Tool - iperf
  Migrate iperf 2.0.5 to the RDMA environment with OFED (librdmacm and libibverbs).
  2000+ source lines of code added: from 8382 to 10562.
  iperf usage extended:
    -H: RDMA transfer mode instead of TCP/UDP
    -G: pr (passive read), pw (passive write)
        Data read from server. Server writes into clients.
    -O: output data file, for both the TCP server and the RDMA server
  Only one stream is used for the transfer.

Test Suites
  test suite 1: memory -> memory
  test suite 2: file -> memory -> memory
    test case 2.1: file(regular file) -> memory -> memory
    test case 2.2: file(/dev/zero) -> memory -> memory
    test case 2.3: file(lustre) -> memory -> memory
  test suite 3: memory -> memory -> file
    test case 3.1: memory -> memory -> file(regular file)
    test case 3.2: memory -> memory -> file(/dev/null)
    test case 3.3: memory -> memory -> file(lustre)
  test suite 4: file -> memory -> memory -> file
    test case 4.1: file(regular file) -> memory -> memory -> file(regular file)
    test case 4.2: file(/dev/zero) -> memory -> memory -> file(/dev/null)
    test case 4.3: file(lustre) -> memory -> memory -> file(lustre)
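The test matrix above follows a regular pattern: three endpoint kinds, each used as a source, a sink, or both. A short sketch that generates the same ten cases is shown here; the names `FILE_KINDS` and `build_cases` are illustrative and do not come from the original test scripts.

```python
# (source, sink) endpoint kinds, paired by position as on the slide.
FILE_KINDS = [
    ("regular file", "regular file"),
    ("/dev/zero", "/dev/null"),
    ("lustre", "lustre"),
]


def build_cases():
    """Return a dict mapping test-case id to its data path."""
    cases = {"1": "memory -> memory"}
    for i, (src, dst) in enumerate(FILE_KINDS, start=1):
        cases[f"2.{i}"] = f"file({src}) -> memory -> memory"
        cases[f"3.{i}"] = f"memory -> memory -> file({dst})"
        cases[f"4.{i}"] = f"file({src}) -> memory -> memory -> file({dst})"
    return cases


if __name__ == "__main__":
    for case_id, path in sorted(build_cases().items()):
        print(f"test case {case_id}: {path}")
```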

File choice
  File operations use the standard I/O library (fread, fwrite) and are cached by the OS.
  Input from /dev/zero tests the maximum application data transfer rate including the file read operation, with the disk removed as a bottleneck.
  Output to /dev/null tests the maximum application data transfer rate including the file write operation, with the disk removed as a bottleneck.
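The idea of using /dev/zero and /dev/null to exercise the file-operation path without touching a disk can be tried in isolation. The sketch below uses Python's buffered file objects as an analogue of fread/fwrite; it is not the modified iperf code, and the 1 MiB block size and 256 MiB total are arbitrary assumptions.

```python
import time


def copy_throughput(src="/dev/zero", dst="/dev/null",
                    block=1 << 20, total=256 << 20):
    """Copy `total` bytes from src to dst; return the rate in bytes/second."""
    moved = 0
    start = time.perf_counter()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while moved < total:
            chunk = fin.read(min(block, total - moved))
            if not chunk:  # source exhausted (cannot happen for /dev/zero)
                break
            fout.write(chunk)
            moved += len(chunk)
    elapsed = max(time.perf_counter() - start, 1e-9)
    return moved / elapsed


if __name__ == "__main__":
    rate = copy_throughput()
    print(f"/dev/zero -> /dev/null: {rate / 1e9:.2f} GB/s")
```

Pointing `src` or `dst` at a regular file instead gives a rough feel for how much of the end-to-end rate the disk takes away.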

Buffer choice
  RDMA operation block size is 10 MB; each block is transferred with a single RDMA READ/WRITE.
  A previous experiment shows that, in this environment, increasing the block size beyond 5 MB has little effect on the transfer speed.
  TCP read/write buffer size is the default TCP window size: 85.3 KByte (default).
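The 85.3 KByte figure is what iperf prints for a socket buffer of 87380 bytes (87380 / 1024 ≈ 85.3), which was a long-standing Linux default for the TCP receive buffer. The sketch below queries the defaults on the local machine; the values it reports depend on the kernel and its sysctl settings, so they may differ from the slide.

```python
import socket


def default_tcp_buffers():
    """Return (receive, send) buffer sizes in bytes for a fresh TCP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        rcv = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        snd = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return rcv, snd


def to_kbyte(nbytes):
    """Convert bytes to KByte using 1 KByte = 1024 bytes."""
    return nbytes / 1024


if __name__ == "__main__":
    rcv, snd = default_tcp_buffers()
    print(f"default SO_RCVBUF: {to_kbyte(rcv):.1f} KByte")
    print(f"default SO_SNDBUF: {to_kbyte(snd):.1f} KByte")
    print(f"87380 bytes = {to_kbyte(87380):.1f} KByte")
```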

Test case 1: memory -> memory CPU

Test case 1: memory -> memory Bandwidth
  RDMA speed is limited by the PCI Express bus.
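A back-of-the-envelope calculation shows how a PCI Express slot can cap a 40 Gbps link. The slides do not state the PCIe generation or lane count of the adapters, so the PCIe 2.0 x8 figures below are an assumption chosen for illustration. The per-lane rates and the 8b/10b line encoding are the standard values for PCIe 1.x/2.0 and for QDR InfiniBand.

```python
def usable_gbps(gt_per_lane, lanes, payload_bits=8, line_bits=10):
    """Raw data rate in Gbps after line encoding (8b/10b by default)."""
    return gt_per_lane * lanes * payload_bits / line_bits


if __name__ == "__main__":
    # Assumed slot: PCIe 2.0 x8 (5 GT/s per lane). Not stated in the slides.
    pcie = usable_gbps(5, 8)
    # QDR InfiniBand 4x: 10 Gbps signaling per lane, "40 Gbps" nominal.
    ib = usable_gbps(10, 4)
    print(f"PCIe 2.0 x8 data rate:  {pcie:.0f} Gbps")
    print(f"QDR InfiniBand 4x data: {ib:.0f} Gbps")
```

Under this assumption both the link and the slot top out at 32 Gbps before protocol overhead, and PCIe packet headers reduce the slot's usable payload rate further, which would be consistent with the bus being the limiting factor.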

Test case 2.1: (fread) file(regular file) -> memory -> memory CPU

Test case 2.1: (fread) file(regular file) -> memory -> memory Bandwidth
  Speeds are limited by the disk.

Test case 2.2 (five minutes) file(/dev/zero) -> memory -> memory CPU

Test case 2.2 (five minutes) file(/dev/zero) -> memory -> memory Bandwidth

Test case 3.1 (a 200G file is generated): memory -> memory -> file(regular file) CPU
  Bandwidths, limited by disk writes, are almost the same.

Test case 3.1 (a 200G file is generated): memory -> memory -> file(regular file) Bandwidth
  Bandwidths are almost the same!

Test case 3.2: memory -> memory -> file(/dev/null) CPU

Test case 3.2: memory -> memory -> file(/dev/null) Bandwidth

Test case 4.1: file(r) -> memory -> memory -> file(r) CPU

Test case 4.1: file(r) -> memory -> memory -> file(r) Bandwidth

Test case 4.2: file(/dev/zero) -> memory -> memory -> file(/dev/null) CPU

Test case 4.2: file(/dev/zero) -> memory -> memory -> file(/dev/null) Bandwidth

Conclusion
  For a single data transfer stream without disk operations, the RDMA transport is twice as fast as TCP, while RDMA incurs only 10% of the CPU load that TCP does.
  FTP includes two components: networking and file operations. Compared with the RDMA operation, the file operation (limited by disk performance) accounts for most of the CPU usage. Therefore, a well-designed file buffering mode is critical.

Future work
  Set up a Lustre environment, and configure Lustre with the RDMA function.
  Start up the FTP migration:
    Source control
    Bug database
    Documentation
    etc. (refer to The Joel Test)

Memory Cache Cleanup
  # sync
  # echo 3 > /proc/sys/vm/drop_caches
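For scripted test runs, the two cleanup commands above can be wrapped in a helper that is called between test cases so that each run starts with a cold cache. This is a sketch under the assumption that the script runs as root on Linux; the function name and its `path` parameter are illustrative. Writing 1 drops the page cache, 2 drops dentries and inodes, and 3 drops both.

```python
import os


def drop_caches(mode=3, path="/proc/sys/vm/drop_caches"):
    """Flush dirty pages to disk, then ask the kernel to drop clean caches."""
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2, or 3")
    os.sync()  # equivalent of the `sync` command
    with open(path, "w") as f:
        f.write(f"{mode}\n")  # equivalent of `echo 3 > ...`


if __name__ == "__main__":
    try:
        drop_caches()
        print("caches dropped")
    except OSError as exc:
        # Typically raised when not running as root.
        print(f"could not drop caches: {exc}")
```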