NeST: Network Storage Technologies Building I/O Appliances on Commodity Systems John Bent, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau and Miron Livny.

Presentation transcript:

NeST: Network Storage Technologies
Building I/O Appliances on Commodity Systems
John Bent, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau and Miron Livny

Outline
- Introduction
- Case studies
- Storage modules
- Conclusion

Problem Statement
Appliances are attractive because they are robust, reliable, available and especially because they are easy to use. To fulfill these criteria, traditional network appliances impose policy decisions on their users and are built either as kernel modules or upon specially designed kernels.
“How to build portable, configurable I/O appliances?”

Goal
To create a network-storage “template” that produces a range of I/O appliances according to the storage needs of the target application and any constraints of the host system.
[Diagram labels: Network Storage Technologies; Target App; Host System; Perfect I/O Appliance]

Host system constraints
- Thread support
- Raw disk access
- Select interface

Target app. storage needs
- Invariant and variant storage needs
- Invariant
  - Reliable
  - Low latency
  - High bandwidth
  - Easy to administer
  - Cheap

Target app. storage needs
- Variant
  - Write concurrency
  - Replacement costs
  - Security and authentication needs
  - Communication protocol
  - Transfer unit

Outline
- Introduction
- Case studies
- Storage modules
- Conclusion

Building I/O appliances
- Four case studies
  - ReqEx
  - WiND
  - Web proxy cache
  - Condor checkpoint server

What is ReqEx?
[Diagram labels: ReqEx Staging Area; Huge tape library (terabytes); Queue of Reqs; Tape Robot]
A robot moves archived data one tape at a time to a temporary staging area.

What is ReqEx?
[Diagram labels: Perfect I/O Appliance; Condor Manager; ReqEx Staging Area; WAN; Compute cluster]
Data is transferred and stored locally to facilitate access by compute nodes.

ReqEx variant storage needs
- Write concurrency
  - No write (or read) concurrency
- Replacement costs
  - Tape robot is very slow; objects cannot be lost
- Security and authentication needs
  - Only owner can remove object
- Protocol
  - ReqEx can be linked with NeST client library
- Transfer unit
  - Whole object transfers only

What is WiND?

WiND variant storage needs
- Write concurrency
  - No write concurrency
- Replacement costs
  - Unknown
- Security and authentication needs
  - Unknown
- Protocol
  - Predefined specific WiND protocol
- Transfer unit
  - Disk blocks are accessed directly

What is a web proxy cache?
[Diagram labels: Local Area Network; Internet; Perfect I/O Appliance]
Frequently accessed objects can be stored locally to decrease request latencies.

Cache variant storage needs
- Write concurrency
  - No write concurrency
- Replacement costs
  - Negligible
- Security and authentication needs
  - None
- Protocol
  - HTTP
- Transfer unit
  - Whole object transfer only

What is the Condor ckpt server?
[Diagram label: Perfect I/O Appliance]
A Condor job runs on an execute machine. Keyboard activity causes the job to be evicted. A snapshot of the process is sent to the checkpoint server. When the job migrates to another idle machine, the checkpoint file is recovered and progress resumes.

CCS variant storage needs
- Write concurrency
  - No write concurrency
- Replacement costs
  - The running time of the job (could be months)
- Security and authentication needs
  - Unauthorized access cannot be allowed
- Protocol
  - Can link with NeST client library
- Transfer unit
  - Whole file transfer only
I see you’re discussing checkpointing. Don’t forget about incremental.
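The variant storage needs listed for the four case studies can be collected into one small data structure, which is roughly what a configuration file for the template might carry. The sketch below is only an illustration: the class name `VariantNeeds`, its field names, and the helper `needs_custom_protocol_handler` are hypothetical and do not come from NeST; the values are copied from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantNeeds:
    """Hypothetical record of one application's variant storage needs."""
    write_concurrency: bool
    replacement_cost: str
    security: str
    protocol: str
    transfer_unit: str


# Values transcribed from the case-study slides; "unknown" is kept as stated.
PROFILES = {
    "ReqEx": VariantNeeds(
        False, "tape robot is very slow; objects cannot be lost",
        "only owner can remove object", "NeST client library", "whole object"),
    "WiND": VariantNeeds(
        False, "unknown", "unknown",
        "predefined WiND protocol", "disk block"),
    "Web proxy cache": VariantNeeds(
        False, "negligible", "none", "HTTP", "whole object"),
    "Condor checkpoint server": VariantNeeds(
        False, "running time of the job (could be months)",
        "unauthorized access cannot be allowed",
        "NeST client library", "whole file"),
}


def needs_custom_protocol_handler(needs: VariantNeeds) -> bool:
    """True when the application cannot simply link the NeST client library."""
    return needs.protocol != "NeST client library"


if __name__ == "__main__":
    for name, needs in PROFILES.items():
        print(name, "->", "custom protocol" if needs_custom_protocol_handler(needs)
              else "client library")
```

Laid out this way, the four profiles differ mainly in replacement cost, security, protocol, and transfer unit, while none of them requires write concurrency.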

Outline
- Introduction
- Case studies
- Storage modules
- Conclusion

Storage modules
- Storage Management
- Concurrency Architectures
- Data Semantics
- Static Configuration
- Protocols
- Administrative Interface
- Runtime Adaptation
- Name Space

Configurable Components
- Concurrency architecture
- Data semantics
- Protocol layer
- Namespace
- Security and authentication
- Storage management

Concurrency architecture
[Diagram labels: NOB; POP; POT]
Easy... but uninteresting.
“How can multiple storage requests be interleaved to maximize system throughput?”
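One common way to interleave storage requests is a fixed pool of worker threads, so that one request's I/O wait overlaps with another's. The sketch below illustrates only that general technique and is not taken from NeST; the function names are hypothetical, and `time.sleep` stands in for a disk or network wait.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(request_id, delay=0.05):
    """Hypothetical request handler; the sleep stands in for an I/O wait."""
    time.sleep(delay)
    return request_id


def serve_serially(request_ids):
    """Baseline: each request waits for the previous one to finish."""
    return [handle_request(r) for r in request_ids]


def serve_with_pool(request_ids, workers=4):
    """Interleave requests across a fixed pool of worker threads.

    pool.map returns results in submission order, even though the
    requests themselves overlap in time.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle_request, request_ids))


if __name__ == "__main__":
    ids = list(range(8))
    start = time.perf_counter()
    serve_serially(ids)
    serial = time.perf_counter() - start
    start = time.perf_counter()
    serve_with_pool(ids)
    pooled = time.perf_counter() - start
    print(f"serial: {serial:.2f}s, pooled: {pooled:.2f}s")
```

Whether threads, processes, or a non-blocking event loop performs best depends on the host system, which is why the slides treat the concurrency architecture as a configurable component.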

Data semantics
- Must stored objects be protected from concurrent writes?
- Is transaction support necessary?
- What are the recovery costs for lost objects?

Protocol layer
- Most applications cannot link with NeST client libraries
- Most applications have their own specific communication protocols
“How can a protocol layer easily communicate with arbitrary networking protocols?”
[Image: Tower of Babel]
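One way to approach the question above is to have each protocol supply a small parser that translates its wire format into a single internal request type, so that the rest of the appliance never sees protocol details. The sketch below is an assumption-laden illustration, not NeST's design: `StorageRequest`, `parse_http`, `parse_simple`, and the "simple" line protocol are all hypothetical, and the HTTP parser handles only a bare request line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageRequest:
    """Hypothetical protocol-independent request seen by the storage core."""
    op: str    # "get" or "put"
    name: str  # object name


def parse_http(line: str) -> StorageRequest:
    """Translate an HTTP request line such as 'GET /a.dat HTTP/1.0'."""
    method, path, _version = line.split()
    ops = {"GET": "get", "PUT": "put"}
    return StorageRequest(ops[method], path.lstrip("/"))


def parse_simple(line: str) -> StorageRequest:
    """Translate a made-up 'OP name' line protocol (illustration only)."""
    op, name = line.split(maxsplit=1)
    return StorageRequest(op.lower(), name)


# Supporting a new protocol means registering one more parser here.
HANDLERS = {"http": parse_http, "simple": parse_simple}


def translate(protocol: str, line: str) -> StorageRequest:
    """Dispatch to the parser registered for the given protocol."""
    return HANDLERS[protocol](line)


if __name__ == "__main__":
    print(translate("http", "GET /a.dat HTTP/1.0"))
    print(translate("simple", "PUT a.dat"))
```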

Namespace
- Flat
- Hierarchical
“How do clients uniquely identify their stored objects?”
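The difference between the two namespace styles can be shown with a minimal in-memory sketch. The class and method names below are hypothetical and the code is not from NeST: a flat namespace maps each name directly to an object, while a hierarchical one resolves a path one component at a time.

```python
class FlatNamespace:
    """Every object lives in one table keyed by its full name."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        self._objects[name] = data

    def get(self, name):
        return self._objects[name]


class HierarchicalNamespace:
    """Objects live in nested directories, each represented by a dict."""

    def __init__(self):
        self._root = {}

    def put(self, path, data):
        *dirs, leaf = path.strip("/").split("/")
        node = self._root
        for d in dirs:
            node = node.setdefault(d, {})  # create directories on demand
        node[leaf] = data

    def get(self, path):
        node = self._root
        for part in path.strip("/").split("/"):
            node = node[part]
        return node

    def listdir(self, path):
        node = self._root
        for part in filter(None, path.strip("/").split("/")):
            node = node[part]
        return sorted(node)


if __name__ == "__main__":
    flat = FlatNamespace()
    flat.put("alice-ckpt", b"x")
    tree = HierarchicalNamespace()
    tree.put("/users/alice/ckpt", b"x")
    print(flat.get("alice-ckpt"), tree.listdir("/users"))
```

In the flat case, clients must agree on globally unique names themselves; in the hierarchical case, the directory path supplies part of that uniqueness.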

Security and authentication
- Ownership
- Privacy
- Encryption
- Authentication
- Access rights

Storage management
- Native filesystem
- Raw disk access
- Uninteresting from client perspective

Outline
- Introduction
- Case studies
- Storage modules
- Conclusion

Conclusions and future work
- Conclusions
  - None
- Future work
  - Lots
Maybe you should try a little harder.

Conclusions and future work
- How to most easily identify the variant storage needs of the target application?
  - Config file?
  - Installation script?
  - Run-time monitoring?
- How to ensure that performance is at least as good as an appliance specifically designed for the target application?