HPCS Lab. High Throughput, Low Latency and Reliable Remote File Access. Hiroki Ohtsuji and Osamu Tatebe, University of Tsukuba, Japan / JST CREST

HPCS Lab. Motivation and Background
– Data-intensive computing is one of the most important issues in many areas
– Storage systems for the exabyte (10^18 bytes) era are needed
– Need a fast and reliable remote file access system

HPCS Lab. Motivation and Background (cont'd)
Data sharing
– Distributed file system: clients access the data via the network
Bottlenecks
– Wide-area network: long latency
– Storage cluster: network overhead, fault tolerance
Proposal: congestion avoidance
(Figure: client, metadata server and storage nodes.)

HPCS Lab. Remote file access with RDMA
Latency of Ethernet is at least 50 microseconds
– Software overhead
– Protocol processing
– Memory copies
Flash-memory-based storage devices
– 25 μs latency (e.g. Fusion-io ioDrive); HDD = 5 ms
The network becomes the bottleneck of the system

HPCS Lab. Usage of InfiniBand
IP over IB (IPoIB)
– Uses the IP protocol stack of the operating system
– Pros: can be used like an ordinary network adapter
– Cons: inefficient
SDP (Sockets Direct Protocol)
– Pros: easy to use (just set LD_PRELOAD)
– Cons: limited performance
RDMA (Verbs API)
– Pros: low latency
– Cons: no compatibility with the socket API; implementation cost
(Performance improves from IPoIB to SDP to RDMA, at increasing implementation cost.)
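
To give a feel for what the Verbs path involves compared with the socket-based options, the fragment below posts a one-sided RDMA read with libibverbs. It is a minimal sketch, not the authors' implementation: it assumes an already-created and connected RC queue pair qp, a locally registered memory region mr covering local_buf, and the remote buffer's address and rkey exchanged out of band.

    #include <infiniband/verbs.h>
    #include <stdint.h>
    #include <string.h>

    /* Post a one-sided RDMA READ: pull 'len' bytes from the remote buffer
     * into local registered memory. All device, protection-domain, memory
     * registration and connection setup is assumed to have happened already. */
    static int post_rdma_read(struct ibv_qp *qp, struct ibv_mr *mr,
                              void *local_buf, size_t len,
                              uint64_t remote_addr, uint32_t rkey)
    {
        struct ibv_sge sge;
        struct ibv_send_wr wr, *bad_wr = NULL;

        memset(&sge, 0, sizeof(sge));
        sge.addr   = (uintptr_t)local_buf;   /* local destination buffer */
        sge.length = len;
        sge.lkey   = mr->lkey;

        memset(&wr, 0, sizeof(wr));
        wr.opcode              = IBV_WR_RDMA_READ;   /* one-sided read */
        wr.sg_list             = &sge;
        wr.num_sge             = 1;
        wr.send_flags          = IBV_SEND_SIGNALED;  /* request a completion */
        wr.wr.rdma.remote_addr = remote_addr;        /* obtained out of band */
        wr.wr.rdma.rkey        = rkey;

        return ibv_post_send(qp, &wr, &bad_wr);
    }

The completion would then be reaped from the completion queue with ibv_poll_cq(). The omitted setup code is precisely the implementation cost the slide lists as a drawback of the Verbs API.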

HPCS Lab. Structure of OFED
OFED: drivers and libraries for InfiniBand.
(Figure from a Mellanox document: the OFED software stack, showing the Verbs API, SDP and IPoIB layers.)

HPCS Lab. Remote file access with RDMA: Architecture
– The client application reads files on the server over InfiniBand FDR (54.3 Gbps) using RDMA (RC transport)
– Storage: Fusion-io ioDrive
– Low-overhead remote file access with the Verbs API; data moves between client and server memory with CPU bypass
(Figure: client memory and server memory/storage connected by InfiniBand.)

HPCS Lab. Preliminary Evaluation: Throughput
A client accesses a file on the file server via InfiniBand with the Verbs API (sequential access).

HPCS Lab. Preliminary Evaluation: IOPS
Strided access with request sizes from 2 KB to 64 KB and a 1 MB stride (Verbs API).
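
As a hypothetical illustration of the access pattern described above (the slide does not show the benchmark code), the sketch below reads fixed-size requests whose starting offsets are 1 MB apart; req would range from 2 KB to 64 KB in the evaluation.

    #include <fcntl.h>
    #include <unistd.h>

    #define STRIDE (1L << 20)   /* 1 MB between request start offsets */

    /* Issue nreqs strided reads of 'req' bytes each; returns how many completed. */
    long strided_read(const char *path, size_t req, long nreqs, char *buf)
    {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return -1;
        long done = 0;
        for (long i = 0; i < nreqs; i++) {
            if (pread(fd, buf, req, (off_t)(i * STRIDE)) != (ssize_t)req)
                break;
            done++;
        }
        close(fd);
        return done;
    }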

HPCS Lab. Congestion avoidance by using redundant data
Concentration of access
– There are hotspot files on a storage node
Redundant data
– Provides fault tolerance
– Can also be used to avoid congestion

HPCS Lab. Redundant data: Basic structure
File1 is striped into data blocks d0 and d1 plus a parity block p0, so a missing block can be rebuilt from the other two: d1 XOR p0 = d0.
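
To make the parity relation concrete, here is a small illustrative C sketch of the XOR scheme above (not code from the authors' system): the parity block is the byte-wise XOR of the data blocks, and the same operation rebuilds a missing data block.

    #include <stddef.h>

    /* Byte-wise XOR of two blocks: used both for encoding (p0 = d0 ^ d1)
     * and for rebuilding (d0 = d1 ^ p0). */
    static void xor_blocks(const unsigned char *a, const unsigned char *b,
                           unsigned char *out, size_t len)
    {
        for (size_t i = 0; i < len; i++)
            out[i] = a[i] ^ b[i];
    }

    /* Encoding:   xor_blocks(d0, d1, p0, BLOCK_SIZE);  */
    /* Rebuilding: xor_blocks(d1, p0, d0, BLOCK_SIZE);  */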

HPCS Lab. Like RAID, but instead of disks connected with SCSI / SATA, the data and parity blocks are spread over storage nodes connected by the network.

HPCS Lab. Performance deterioration
(Figure: File1 (d0, d1, p0) and File2 (d0, d1, p0) striped across storage nodes 0, 1 and 2, with several clients. When the blocks the clients request land on the same storage node, that node becomes a hotspot and performance deteriorates.)

HPCS Lab. Performance deterioration (cont'd)
(Graph: sequential access to 16 GB on a RAM disk with an increasing number of clients, comparing the no-congestion and with-congestion cases.)

HPCS Lab. Congestion avoidance
When the storage node holding a requested data block is congested, the client instead reads the remaining blocks of the stripe from the other nodes (e.g. d1 and the parity p0) and rebuilds the requested block (d0) by decoding, spreading load away from the hotspot.
(Figure: File1 and File2 striped across storage nodes; clients bypass the congested node.)
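
A hypothetical sketch of the client-side read path implied by this slide, for a 2+1 stripe (d0 on node 0, d1 on node 1, p0 on node 2). The functions is_congested() and fetch_block() are stand-ins for the real client I/O layer, not part of the authors' system; how congestion is detected is listed as future work.

    #include <stdbool.h>
    #include <stdlib.h>
    #include <string.h>

    /* Stand-ins for the real client I/O layer; purely illustrative. */
    static bool is_congested(int node) { (void)node; return false; }
    static void fetch_block(int node, const char *id, unsigned char *buf, size_t len)
    {
        (void)node; (void)id;
        memset(buf, 0, len);   /* real code would issue an RDMA read here */
    }

    static void xor_blocks(const unsigned char *a, const unsigned char *b,
                           unsigned char *out, size_t len)
    {
        for (size_t i = 0; i < len; i++)
            out[i] = a[i] ^ b[i];
    }

    /* Read d0: normally fetch it directly from node 0; if node 0 is
     * congested, fetch d1 and p0 from the other nodes and decode
     * d0 = d1 XOR p0 instead of waiting on the hotspot. */
    void read_d0(unsigned char *buf, size_t len)
    {
        if (!is_congested(0)) {
            fetch_block(0, "d0", buf, len);
            return;
        }
        unsigned char *d1 = malloc(len);
        unsigned char *p0 = malloc(len);
        if (d1 && p0) {
            fetch_block(1, "d1", d1, len);
            fetch_block(2, "p0", p0, len);
            xor_blocks(d1, p0, buf, len);
        }
        free(d1);
        free(p0);
    }

The decoding step trades a small amount of client CPU (the XOR) for avoiding the congested node, which is the trade-off quantified on the evaluation slide below.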

HPCS Lab. Performance evaluation
Compare the cases: no congestion, with congestion, and with congestion avoided by rebuilding the striped data with decoding (復号あり, "with decoding"). The congestion-avoided case shows improvements of +28% and +15%.
(Graph: throughput versus number of clients for each case.)

HPCS Lab. Related work
– Stephen C. Simms et al.: Wide Area Filesystem Performance using Lustre on the TeraGrid
– Wu, J., Wyckoff, P. and Panda, D.: PVFS over InfiniBand: Design and Performance Evaluation
– Erasure Coding in Windows Azure Storage, USENIX ATC '12 (shortens latency by using redundant data)
– HDFS RAID

HPCS Lab. Conclusion and Future work
– Remote file access with InfiniBand RDMA
– Congestion avoidance using redundant data
Future work
– How to detect congestion
– Writing data without performance degradation (in progress)
(Figure: client and storage nodes.)