
© 2003 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice.

Performance Measurements of a User-Space DAFS Server with a Database Workload
Samuel A. Fineberg and Don Wilson
NonStop Labs

Outline
Background on DAFS and ODM
Prototype client and server
I/O tests performed
Raw benchmark results
Oracle TPC-H results
Summary and conclusions

What is the Direct Access File System (DAFS)?
Created by the DAFS Collaborative
– A group consisting of over 80 members from industry, government, and academic institutions
– The DAFS 1.0 spec was approved in September 2001
DAFS is a distributed file access protocol
– Data is requested from files, not blocks
– Based loosely on NFSv4
Optimized for local file-sharing environments
– Systems are in relatively close proximity
– Connected by a high-speed, low-latency network
Built on top of direct-access transport networks
– Initially targeted at Virtual Interface Architecture (VIA) networks
– The Direct Access Transport (DAT) API was later generalized and ported to other networks (e.g., InfiniBand, iWARP)

Characteristics of a Direct Access Transport
Connected model, i.e., VIs must be connected before communication can occur
Two forms of data transport
– Send/receive: two-sided
– RDMA read and write: one-sided
Both transports support direct data placement
– Receives must be pre-posted
Memory regions must be registered before they can be transferred through a DAT (see the sketch after this slide)
– Registration pins data in physical memory
– Registration establishes VM translation tables for the NIC
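The register-then-transfer pattern above is the core of any DAT. Below is a minimal C sketch of it, assuming hypothetical dat_* wrapper functions; the real VIPL 1.0 and DAT calls differ in names and signatures.

```c
/* Minimal sketch of the register-then-transfer pattern described above.
 * Function names (dat_register, dat_rdma_write, ...) are illustrative
 * placeholders, not the actual VIPL 1.0 or DAT API. */
#include <stddef.h>

typedef struct { void *addr; size_t len; unsigned handle; } mem_region;

extern int dat_register(void *buf, size_t len, mem_region *out); /* pins pages, programs NIC VM tables */
extern int dat_post_recv(mem_region *r);                         /* receives must be pre-posted */
extern int dat_rdma_write(mem_region *local, unsigned long remote_addr,
                          unsigned remote_handle);

int send_block(void *buf, size_t len,
               unsigned long remote_addr, unsigned remote_handle)
{
    mem_region r;
    /* 1. Registration pins the buffer and installs address translations
     *    on the NIC; it must precede any transfer through the DAT. */
    if (dat_register(buf, len, &r) != 0)
        return -1;
    /* 2. One-sided transfer: the remote CPU is not involved; data is
     *    placed directly into an already registered remote buffer. */
    return dat_rdma_write(&r, remote_addr, remote_handle);
}
```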

DAFS Details
Session based
– DAFS clients initiate sessions with a server
– A DAT/VIA connection is associated with each session
RPC-like command format
– Implemented with send/receive
– The server receives requests sent from clients
– The server sends responses to be received by the client
Open/Close
– Unlike NFSv2, files must be opened and closed (the protocol is not stateless)
Read/Write I/O modes (see the sketch after this slide)
– Inline: a limited amount of data is included in the request/response
– Direct: the server initiates an RDMA read or write to move the data
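To make the inline/direct distinction concrete, here is a sketch of how a server might dispatch a write in each mode. The request layout, the dat_rdma_read call, and the disk/response helpers are invented for illustration; this is not the DAFS wire format.

```c
/* Sketch of server-side handling of the two DAFS write modes.
 * The request layout and all helper functions are assumptions. */
#include <stddef.h>

typedef struct {
    int           is_inline;     /* chosen by the client per request */
    size_t        len;
    char          payload[8192]; /* inline data travels in the request itself */
    unsigned long client_addr;   /* for direct I/O: client's registered buffer */
    unsigned      client_handle;
} write_request;

extern int dat_rdma_read(void *dst, size_t len,
                         unsigned long remote_addr, unsigned remote_handle);
extern int disk_write(const void *buf, size_t len);
extern int send_response(int status);

int handle_write(write_request *req, void *bounce, size_t bounce_len)
{
    if (req->is_inline) {
        /* Inline: the data arrived with the request; just write it. */
        disk_write(req->payload, req->len);
    } else {
        /* Direct: pull the data from the client with a one-sided
         * RDMA read before writing it to disk. */
        if (req->len > bounce_len)
            return -1;
        dat_rdma_read(bounce, req->len, req->client_addr, req->client_handle);
        disk_write(bounce, req->len);
    }
    return send_response(0);
}
```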

Inline vs. Direct I/O
[Figure: request/response timelines for a client and server in each mode. Inline read or write: the client sends a request, the server performs the disk read or write, and the response (carrying any data) returns inline. Direct write: the client sends a request, the server issues an RDMA read to pull the data, performs the disk write, and sends the response. Direct read: the client sends a request, the server performs the disk read, pushes the data to the client with an RDMA write, and sends the response.]

Oracle Disk Manager (ODM)
A file access interface spec for the Oracle database
– Supported as a standard feature in Oracle 9i
– Implemented as a vendor-supplied DLL
– Files that cannot be opened using ODM fall back to the standard APIs
Basic commands (see the sketch after this slide)
– Files are created and pre-allocated, then committed
– Files are then identified (opened) and unidentified (closed)
– All read/write I/O uses an asynchronous odm_io command
– I/Os are specified as descriptors, multiple per odm_io call
– Multiple waiting mechanisms: wait for a specific I/O, or wait for any
– Other commands (e.g., resizing) are synchronous
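The descriptor-based asynchronous pattern looks roughly like the following sketch. The struct and function names only approximate the ODM style; they are not the actual ODM specification.

```c
/* Illustrative rendering of ODM's descriptor-based asynchronous I/O.
 * Types and function names are assumptions, not the real ODM spec. */
#include <stdint.h>

typedef struct {
    int      fd;       /* an identified (opened) file */
    uint64_t offset;
    void    *buf;
    uint32_t len;
    int      op;       /* read or write */
} io_desc;

extern int odm_io_submit(io_desc *descs, int ndescs); /* returns immediately */
extern int odm_io_wait_any(void);                     /* or wait for a specific I/O */

void issue_batch(io_desc *descs, int n)
{
    /* One call submits many descriptors; completion is decoupled
     * from submission, so the caller can keep the device busy. */
    odm_io_submit(descs, n);
    /* ... later, reap completions as they arrive ... */
    odm_io_wait_any();
}
```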

Prototype Client/Server
DAFS server
– Implemented for Windows 2000 and Linux (all testing was on Windows)
– Built on VIPL 1.0 using the DAFS 1.0 SDK protocol stubs
– All data buffers are pre-allocated and pre-registered
– Data-driven, multithreaded design
ODM client
– Implemented as a Windows 2000 DLL for Oracle 9i
– Multithreaded, to decouple asynchronous I/O from Oracle threads
– Inline buffers are copied; direct buffers are registered/deregistered as part of the I/O
– The inline/direct threshold is set when the library is initialized (see the sketch after this slide)
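The inline/direct threshold embodies a copy-versus-registration tradeoff; a minimal sketch, with hypothetical helper names, of the client-side decision:

```c
/* Sketch of the client-side inline/direct decision described above.
 * The threshold variable and helper functions are illustrative. */
#include <stddef.h>

extern size_t inline_threshold;  /* fixed when the library is initialized */

extern int send_inline(const void *buf, size_t len); /* copies into a pre-registered request buffer */
extern int send_direct(void *buf, size_t len);       /* registers buf, server RDMAs, then deregisters */

int client_write(void *buf, size_t len)
{
    if (len < inline_threshold)
        /* Small I/O: pay a memory copy, avoid per-I/O registration. */
        return send_inline(buf, len);
    /* Large I/O: pay registration/deregistration, avoid the copy;
     * the server moves the data with one-sided RDMA. */
    return send_direct(buf, len);
}
```

The right threshold depends on which cost dominates: copying scales with I/O size, while registration is a roughly fixed per-I/O cost, which is why the authors expose it as a tunable and vary it in the TPC-H runs.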

Test System Configuration
The goal was to compare local I/O with DAFS
Local I/O configuration
– A single system running Oracle on locally attached disks
DAFS/ODM I/O configuration
– One system running the DAFS server software with locally attached disks
– A second system running Oracle and the ODM client; files on the DAFS server are accessed through ODM over the network
Four-processor Windows 2000 Server systems
– 500MHz Xeon, 3GB RAM, dual-bus PCI 64/33
– ServerNet II (VIA 1.0 based) system area network
– 15K RPM disks attached via two PCI RAID controllers, configured as RAID 1/0 (mirrored-striped)

Experiments
Raw I/O tests
– Odmblast: streaming I/O test
– Odmlat: I/O latency test
– DAFS tests used the ODM DLL to access files on the DAFS server
– Local tests used a special local ODM library built on Windows unbuffered I/O
Oracle database test
– Standard TPC-H benchmark
– SQL-based decision support workload
– DAFS tests used the ODM DLL to access files on the DAFS server
– Local tests ran without ODM (Oracle uses Windows unbuffered I/O directly)

Odmblast
An ODM-based I/O stress test
– Intended to present a continuous load to the I/O system
– Issues many simultaneous I/Os (to allow for pipelining)
In our experiments, Odmblast streams 32 I/Os to the server (see the sketch after this slide)
– 16 I/Os per odm_io call
– Waits for the I/Os from the previous odm_io call
I/Os can be reads, writes, or a random mix
I/Os can be at sequential or random offsets
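A sketch of that pipelining pattern: two batches of 16 descriptors alternate, so up to 32 I/Os are in flight at once. The batch-handle bookkeeping and helper names are invented for illustration, reusing the descriptor idea from the ODM sketch above.

```c
/* Sketch of the Odmblast pipelining pattern. Helper functions and
 * batch-id bookkeeping are assumptions for illustration. */
#define BATCH 16

typedef struct {
    int fd; long long offset; void *buf; unsigned len; int op;
} io_desc;

extern int  odm_io_submit_batch(io_desc *descs, int n, int *batch_id);
extern void odm_io_wait_batch(int batch_id);
extern void fill_descriptors(io_desc *descs, int n); /* reads/writes, seq/random offsets */

void odmblast_loop(int iterations)
{
    io_desc descs[2][BATCH];
    int id[2] = { -1, -1 };

    for (int i = 0; i < iterations; i++) {
        int cur = i & 1;
        /* Wait only for the batch issued two steps ago before reusing
         * its descriptors; at steady state 32 I/Os stay outstanding. */
        if (id[cur] >= 0)
            odm_io_wait_batch(id[cur]);
        fill_descriptors(descs[cur], BATCH);
        odm_io_submit_batch(descs[cur], BATCH, &id[cur]);
    }
}
```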

Odmblast read comparison

Odmblast write comparison

Odmlat
I/O latency test
– How long does a single I/O take? (not necessarily related to the aggregate I/O rate)
– For these experiments: <16K = inline, ≥16K = direct
– Derived the components that make up I/O time using linear regression (see the sketch after this slide)
– More details in the paper
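As a rough illustration of the regression, one can model per-I/O latency as a fixed overhead plus a per-byte cost and fit both terms with ordinary least squares. The two-term model and the sample numbers below are assumptions; the paper's actual decomposition may include more components.

```c
/* Minimal least-squares sketch: model latency(n) = a + b*n, where a is
 * the fixed per-I/O overhead and b the per-byte transfer cost, fit from
 * (size, latency) samples. Model and data are illustrative assumptions. */
#include <stdio.h>

void fit_line(const double *x, const double *y, int n, double *a, double *b)
{
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (int i = 0; i < n; i++) {
        sx  += x[i];        sy  += y[i];
        sxx += x[i] * x[i]; sxy += x[i] * y[i];
    }
    *b = (n * sxy - sx * sy) / (n * sxx - sx * sx); /* per-byte cost */
    *a = (sy - *b * sx) / n;                        /* fixed overhead */
}

int main(void)
{
    /* Hypothetical (size in bytes, latency in microseconds) samples. */
    double size[] = { 4096, 8192, 16384, 32768, 65536 };
    double lat[]  = {  120,  135,   170,   240,   380 };
    double a, b;
    fit_line(size, lat, 5, &a, &b);
    printf("overhead = %.1f us, per-byte = %.5f us\n", a, b);
    return 0;
}
```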

Odmlat performance

Oracle-based results
Standard database benchmark: TPC-H
– Written in SQL
– Decision support benchmark
– Multiple ad-hoc query streams with an update thread
– 30GB database size
Oracle configuration
– All I/Os 512-byte aligned (required for unbuffered I/O)
– 16K database block size
– Database files distributed across two NTFS file systems
Measurements
– Compared average runtime for local vs. DAFS-based I/O
– Skipped the official TPC-H power metric
– Varied the inline/direct threshold for DAFS-based I/O

Oracle TPC-H Performance

Oracle TPC-H Operation Distribution

Oracle TPC-H CPU Utilization

TPC-H Summary
Local I/O is still faster
– Limited ServerNet II bandwidth
– Memory registration or copying overhead
– Windows unbuffered I/O is already very efficient
DAFS still has more capabilities than local I/O
– Capable of cluster I/O (RAC)
Memory registration is still a problem with DATs
– Registration caching can be problematic
  – Cannot guarantee that address mappings will not change
  – ODM has no means of notifying the NIC of mapping changes
– Need either better integration of the I/O library with Oracle, or better integration of the OS with the DAT
Transparency is expensive!

Conclusions
The DAFS server/ODM client achieved performance close to the limits of our network
– Local SCSI I/O was still faster
Running a database benchmark, DAFS TPC-H performance was within 10% of local I/O
– DAFS also provides the advantages of a network file system (i.e., clustering support)
Limitations of our tests
– ServerNet II bandwidth was inadequate, with no support for multiple NICs
– Needed to do client-side registration for all direct I/Os
The TPC-H benchmark was not optimally tuned
– Needed to bring client CPU closer to 100% (more disks, fewer CPUs, other tuning)
– CPU offload is not a benefit if I/O is the bottleneck
