Initial Data Access Module & Lustre Deployment
Tan Li


2 Outline
- Disk I/O tests for netqos03 and netqos04
- Initial design for the file I/O module
  - Data reads with different functions and buffer sizes
  - Data reads with fread() with different waiting times and buffer sizes
  - Some conclusions
- Intro to Lustre setup
- Lustre deployment for the new servers

3 Initial Design for Data Access
Current data access module (block sizes: 100K, 1M, 10M, 100M, 500M for a 100G file)

4 Initial design for file I/O module
1. Header file: ftp_io.h
2. Data access functions:
   int ftp_open(char *path, int block_size, int mode);
   int ftp_read(int infile_fd, char *out_buf, int block_size);
   int ftp_write(int outfile_fd, char *in_buf, int block_size);
   int ftp_close(int close_fd, int block);
Usage of ftp_open(): the block size is passed to the function to decide the open method (open, fopen, or open with O_DIRECT), and the close method used by ftp_close() must match the one chosen by ftp_open(). mode=0 opens for read, and mode=1 opens for write.
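As a usage illustration (not taken from the slides), a minimal read loop over this interface might look like the sketch below; the file path, the 1M block size, and the error handling are hypothetical, and the buffer is aligned with posix_memalign() since ftp_open() may fall back to O_DIRECT for large blocks.

    #include <stdio.h>
    #include <stdlib.h>

    #include "ftp_io.h"                 /* interface from slide 4 */

    int main(void)
    {
        const int block_size = 1024 * 1024;       /* 1M blocks, hypothetical choice */
        char *buf = NULL;

        /* O_DIRECT typically requires an aligned buffer, so align to 512 bytes */
        if (posix_memalign((void **)&buf, 512, block_size) != 0)
            return 1;

        /* mode=0: open for read; ftp_open() chooses open/fopen or O_DIRECT internally */
        int fd = ftp_open("/data/testfile.dat", block_size, 0);
        if (fd < 0) {
            free(buf);
            return 1;
        }

        long total = 0;
        int n;
        while ((n = ftp_read(fd, buf, block_size)) > 0)
            total += n;

        /* pass the block size again so ftp_close() matches the open method */
        ftp_close(fd, block_size);
        free(buf);
        printf("read %ld bytes\n", total);
        return 0;
    }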

5 Initial design for file I/O module

6

7 (flowchart) Block size > 400K?
- No: open/fopen (read only for mode=0, write only for mode=1)
- Yes: open with O_DIRECT (read only for mode=0, write only for mode=1)
Then return the file descriptor.
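A minimal sketch of how ftp_open() could implement this flow, assuming the 400K threshold and the mode=0/1 convention from the earlier slides; it uses open() for both paths (the slides also mention fopen for the buffered case), and the exact flags and permission bits are assumptions.

    #define _GNU_SOURCE                 /* O_DIRECT is a GNU extension on Linux */
    #include <fcntl.h>

    #define FTP_DIRECT_THRESHOLD (400 * 1024)   /* 400K threshold from the flowchart */

    /* mode = 0: open for read, mode = 1: open for write */
    int ftp_open(char *path, int block_size, int mode)
    {
        int flags = (mode == 0) ? O_RDONLY
                                : (O_WRONLY | O_CREAT | O_TRUNC);

        /* large blocks bypass the page cache with O_DIRECT */
        if (block_size > FTP_DIRECT_THRESHOLD)
            flags |= O_DIRECT;

        return open(path, flags, 0644);   /* file descriptor, or -1 on error */
    }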

8 Initial design for file I/O module
Problem with O_DIRECT when writing data: with O_DIRECT, each block written must be a multiple of 512 bytes on our platform, so writing the last few bytes of the file is a problem.
Possible solutions:
1. Use the regular write() to output the remaining data.
2. Integrate the open function into the read and write functions.
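A rough sketch of solution 1, under the assumption that the unaligned bytes sit at the end of the file: the 512-byte-aligned prefix of the final block goes through the O_DIRECT descriptor, and the remaining tail is appended through a separate, regular descriptor. The helper name and the reopen-and-append approach are illustrative, not the module's actual code.

    #include <fcntl.h>
    #include <unistd.h>

    /* buf/len hold the final block of the file */
    static int write_last_block(int direct_fd, const char *path,
                                const char *buf, size_t len)
    {
        size_t aligned = len - (len % 512);       /* O_DIRECT needs multiples of 512 */

        if (aligned > 0 && write(direct_fd, buf, aligned) != (ssize_t)aligned)
            return -1;

        if (len > aligned) {
            /* regular buffered descriptor just for the last few bytes */
            int fd = open(path, O_WRONLY | O_APPEND);
            if (fd < 0)
                return -1;
            ssize_t tail = (ssize_t)(len - aligned);
            if (write(fd, buf + aligned, tail) != tail) {
                close(fd);
                return -1;
            }
            close(fd);
        }
        return 0;
    }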

9 Data reading test on fread()
1. Test results from the Linux time tool
2. Test results from nmon (recording data every two seconds)
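The slides do not show the test harness itself; the sketch below is one plausible way such a test could be run, reading a file with fread() at a given buffer size, with an optional wait between reads to mimic the waiting-time parameter, and reporting throughput. The file path and default values are placeholders.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        size_t buf_size = (argc > 1) ? (size_t)atol(argv[1]) : 100 * 1024;  /* e.g. 100K */
        useconds_t wait_us = (argc > 2) ? (useconds_t)atol(argv[2]) : 0;    /* wait per read */

        FILE *fp = fopen("/data/testfile.dat", "rb");   /* placeholder path */
        if (fp == NULL)
            return 1;

        char *buf = malloc(buf_size);
        if (buf == NULL)
            return 1;

        struct timespec start, end;
        clock_gettime(CLOCK_MONOTONIC, &start);

        size_t n, total = 0;
        while ((n = fread(buf, 1, buf_size, fp)) > 0) {
            total += n;
            if (wait_us > 0)
                usleep(wait_us);        /* simulated per-read waiting time */
        }

        clock_gettime(CLOCK_MONOTONIC, &end);
        double secs = (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9;
        printf("read %zu bytes in %.2f s (%.1f MB/s)\n", total, secs, total / secs / 1e6);

        free(buf);
        fclose(fp);
        return 0;
    }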

10 Data reading test on fread(): some conclusions
- Bandwidth grows as the buffer size increases, especially when the buffer size changes from 100K to 1000K (about 3x).
- Bandwidth is not sensitive to the wait time until the wait time reaches a certain threshold, and the larger the buffer size, the less sensitive the bandwidth is to the delay.
- CPU utilization is 0% when the buffer size is below 100K, and it grows as the buffer size increases.

11 iWARP and InfiniBand
- Hardware: InfiniBand uses a specialized I/O structure; iWARP is a set of mechanisms over Ethernet that move data management and network protocol processing onto the RNIC card.
- Transport method: InfiniBand is point-to-point; iWARP is end-to-end.
- Compatibility: InfiniBand requires specialized infrastructure; iWARP is fully compatible with existing Ethernet switching.
- Vendors: InfiniBand has only two (Mellanox and QLogic); iWARP is available from a broad range of vendors.

12 RoCEE
RoCEE = InfiniBand over Ethernet (IBoE). The RDMA over Converged Enhanced Ethernet (RoCEE) protocol proposal is designed to allow the deployment of RDMA semantics on a Converged Enhanced Ethernet fabric by running the IB transport protocol over Ethernet frames. In other words, it takes the InfiniBand transport layer and packages it into Ethernet frames, instead of using the iWARP protocol, for Ethernet-based high-performance cluster networking.

13 RoCEE
Problem 1: iWARP already provides the performance benefits that RoCEE targets.
Problem 2: RoCEE is hard to implement.
Problem 3: RoCEE depends on the deployment of 10GbE CEE infrastructure; currently only one vendor (Cisco) offers CEE switches, and they come at relatively high price points.

14 Thanks & Questions