
1 Garth A. Gibson*, David F. Nagle**, William Courtright II*, Nat Lanza*, Paul Mazaitis*, Marc Unangst*, Jim Zelenka*. "NASD Scalable Storage Systems", USENIX99, Extreme Linux Workshop, Monterey, CA, June 1999. http://www.pdl.cs.cmu.edu/Publications/publications.html

2 Motivation
NASD minimizes server-based data movement by separating management and filesystem semantics from store-and-forward copying of data
Figure 1: standalone server with attached disks
–Look at the long path that requests and data take through OS layers and across multiple machines
Reference implementation of NASD for Linux 2.2, including NASD device code that runs on a workstation or PC masquerading as a storage subsystem or disk drive
NFS-like distributed file system that uses NASD subsystems or devices
NASD striping middleware for large striped files

3 Figure 1 -- NetSCSI and NASD
Figure 1 outlines the two data paths (message traces sketched below)
NetSCSI
–Clients ask the server for data; the server forwards the request to storage as a DMA command that returns the data directly to the client
–When the DMA is complete, status is returned to the server, collected, and forwarded to the client
NASD
–On first access, client contacts the server for access checks
–Server grants reusable rights, or capabilities
–Clients then present requests directly to storage
–Storage verifies the capability and replies directly
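
A compact way to contrast the two paths is as message traces. The step labels below are mine, not the paper's; note that the NASD trace involves the file manager only on first access.

```python
# Illustrative message traces for the two Figure 1 data paths.
NETSCSI_READ = [
    ("client", "server", "read(file, range)"),
    ("server", "drive",  "forwarded command: DMA result directly to client"),
    ("drive",  "client", "data, via direct DMA"),
    ("drive",  "server", "completion status"),
    ("server", "client", "status"),
]
NASD_READ = [
    ("client", "file manager", "lookup / access check (first access only)"),
    ("file manager", "client", "reusable capability"),
    ("client", "drive",  "read(object, range) + capability"),
    ("drive",  "client", "data + status, after verifying the capability"),
]
```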

4 NASD Interface
Read, write object data
Read, write object attributes
Create, resize, remove soft partitions
Construct copy-on-write version of an object
Logical version number on an object can be changed by the file manager to revoke capabilities (interface sketched below)
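
The operation list above reads as a small object API. A hypothetical sketch follows; the names, signatures, and attribute keys are illustrative, not the paper's actual RPC set.

```python
from dataclasses import dataclass, field

@dataclass
class NasdObject:
    data: bytearray = field(default_factory=bytearray)
    attributes: dict = field(default_factory=lambda: {"version": 0, "object_size": 0})

class NasdDrive:
    def __init__(self):
        self.partitions: dict = {}   # partition id -> {object id -> NasdObject}

    def create(self, part, oid):
        self.partitions.setdefault(part, {})[oid] = NasdObject()

    def read(self, part, oid, offset, length) -> bytes:
        return bytes(self.partitions[part][oid].data[offset:offset + length])

    def write(self, part, oid, offset, payload: bytes):
        obj = self.partitions[part][oid]
        if len(obj.data) < offset:   # zero-fill any gap before the write
            obj.data.extend(b"\0" * (offset - len(obj.data)))
        obj.data[offset:offset + len(payload)] = payload
        obj.attributes["object_size"] = len(obj.data)

    def revoke_capabilities(self, part, oid):
        # Bumping the logical version invalidates capabilities minted against
        # the old version, as the slide describes.
        self.partitions[part][oid].attributes["version"] += 1
```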

5 NASD Security
Security protocol (sketched below)
–Capability has a public portion, CapArg, and a private key, CapKey
–CapArg specifies what rights are being granted for which object
–CapKey is a keyed message digest of CapArg under a secret key shared only with the target drive
–Client sends CapArg with each request and generates a CapKey-keyed digest of the request parameters and CapArg
–Each drive knows its secret keys and receives CapArg with each request, so it can compute the client's CapKey and verify the request
–If any field of CapArg or the request has been changed, the digest comparison will fail
–Scheme protects the integrity of requests but does not protect the privacy of data
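
A minimal sketch of this capability scheme, using HMAC-SHA256 as the keyed digest. The actual NASD design used a different MAC, and the field layouts here are illustrative; the point is that the drive can recompute CapKey from its secret and never has to talk to the file manager.

```python
import hmac, hashlib

def mac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

# File manager side: mint CapKey from the secret it shares with the drive.
def mint_capability(drive_secret: bytes, cap_args: bytes) -> bytes:
    return mac(drive_secret, cap_args)            # given privately to the client

# Client side: prove possession of CapKey without revealing it.
def sign_request(cap_key: bytes, cap_args: bytes, request: bytes) -> bytes:
    return mac(cap_key, cap_args + request)

# Drive side: recompute CapKey, then verify the request digest. Any change to
# cap_args or request makes the comparison fail.
def verify(drive_secret: bytes, cap_args: bytes, request: bytes, digest: bytes) -> bool:
    cap_key = mac(drive_secret, cap_args)
    return hmac.compare_digest(mac(cap_key, cap_args + request), digest)
```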

6 Filesystems for NASD
Constructed a distributed file system with NFS-like semantics, tailored for NASD
Each file and directory occupies exactly one NASD object; offsets in files are the same as offsets in objects
File length and last-modify time correspond directly to NASD-maintained object attributes
Remainder of the file attributes is stored in an uninterpreted section of the object's attributes
Data-moving operations (read, write) and attribute reads (getattr) are sent directly to the NASD drive
–File attributes are either computed from NASD object attributes (e.g. modify times and object size) or stored in the uninterpreted filesystem-specific attribute (see the attribute-assembly sketch below)
Other requests are handled by the file manager
Capabilities are piggybacked on the file manager's responses to lookup operations
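
A hedged sketch of that attribute split: getattr answered straight from a NASD object, with size and modify time taken from drive-maintained attributes and the rest unpacked from the filesystem's uninterpreted section. The packing layout is an assumption, not the paper's actual format.

```python
import struct

def file_getattr(object_attrs: dict) -> dict:
    fattr = {
        "size":  object_attrs["object_size"],         # drive-maintained
        "mtime": object_attrs["object_modify_time"],  # drive-maintained
    }
    # Assume the filesystem packed mode/uid/gid as three little-endian
    # uint32s at the start of the opaque attribute area.
    mode, uid, gid = struct.unpack_from("<III", object_attrs["uninterpreted"])
    fattr.update(mode=mode, uid=uid, gid=gid)
    return fattr
```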

7 Access to Striped Files and Continuous Media
NASD-optimized parallel filesystem
–Filesystem manages objects that are not directly backed by data; a storage manager redirects clients to the component NASD objects that hold the data
NASD PFS supports the SIO low-level parallel filesystem interface on top of NASD-NFS files, striped using user-level Cheops middleware (striping math sketched below)
Figure 6
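
Illustrative round-robin striping arithmetic of the kind Cheops-style middleware performs: map a byte offset in a striped file to a component object and an offset within it. The parameters are assumptions, not values from the paper.

```python
def locate(offset: int, stripe_unit: int, num_objects: int) -> tuple[int, int]:
    stripe_index, within = divmod(offset, stripe_unit)
    component = stripe_index % num_objects
    component_offset = (stripe_index // num_objects) * stripe_unit + within
    return component, component_offset

# Example: 64 KB stripe unit across 4 component objects.
assert locate(0, 65536, 4) == (0, 0)
assert locate(65536, 65536, 4) == (1, 0)           # second stripe unit -> object 1
assert locate(4 * 65536, 65536, 4) == (0, 65536)   # wraps back to object 0
```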

8 Garth A. Gibson, David F. Nagle, Khalil Amiri, Jeff Butler, Fay W. Chang, Howard Gobioff, Charles Hardin, Erik Riedel, David Rochberg and Jim Zelenka. "A Cost-Effective, High-Bandwidth Storage Architecture". Proceedings of the 8th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), October 2-7, 1998, San Jose, CA, USA, pages 92-103.

9 Evolution of storage architectures
Local filesystem -- Simple: application, file management, concurrency control, and low-level storage management aggregated on one machine. Data makes one trip over a peripheral area network such as SCSI. Disks offer a fixed-size block abstraction
Distributed filesystem -- An intermediate server machine is introduced. Server offers a simple file access interface to clients
Distributed filesystem with RAID controller -- Interpose another computer, the RAID controller
Distributed filesystem that employs DMA -- Can arrange to DMA data to clients rather than copying through the server. HPSS is an example (although this is not how it is usually employed)
NASD-based DFS, NASD-Cheops-based DFS

10 Principles of NASD
Direct transfer -- data moves between drive and client without indirection or store-and-forward through a file server
Asynchronous oversight -- ability of the client to perform most operations without synchronous appeal to the file manager
Cryptographic integrity -- drives ensure that commands and data have not been tampered with, by generating and verifying cryptographic keyed digests
Object-based interface -- drives export variable-length objects instead of fixed-size blocks. Gives disk drives direct knowledge of the relationships between disk blocks and minimizes security overhead

11 Prototype Implementation
NASD prototype drive runs on a 133 MHz, 64 MB DEC Alpha 3000/400 with two Seagate ST52160 disks attached by two 5 MB/s SCSI busses
–Intended to simulate a controller and drive
NASD system implements its own internal object access, cache, and disk space management modules
Figure 6 -- performance for sequential reads and writes
–Sequential bandwidth as a function of request size
–NASD better tuned for disk access on reads that miss the cache
–FFS better tuned for cache accesses
–Write performance of FFS due to immediate acknowledgement of writes up to 64 KB

12 Scalability
13 NASD drives and up to 10 client machines, each linked by OC-3 ATM
Each client issues a series of sequential 2 MB read requests striped across four NASDs
Each NASD can deliver 32 MB/s from its cache to the RPC protocol stack
DCE RPC cannot push more than 80 Mb/s through a 155 Mb/s ATM link before the receiving client saturates
Figure 7 demonstrates close-to-linear scaling up to 10 clients (back-of-envelope bound below)
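
A back-of-envelope check on those numbers (my arithmetic, not the paper's measured result): if each client is capped by the ~80 Mb/s DCE RPC ceiling, linear scaling to 10 clients bounds the aggregate bandwidth.

```python
clients = 10
per_client_bps = 80e6                      # DCE RPC ceiling per 155 Mb/s ATM link
aggregate_MB_per_s = clients * per_client_bps / 8 / 1e6
print(aggregate_MB_per_s)                  # -> 100.0 MB/s upper bound
```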

13 Computational Requirements
Table 1 -- number of instructions needed to service a given request size, including all communications (DCE RPC, UDP/IP)
Overhead is mostly due to communications
Significantly more computation than the embedded processor of a drive such as the Seagate Barracuda can provide

14 Filesystems for NASD
NFS covered in the last paper
AFS -- lookup operations carried out by parsing directory files locally
AFS RPCs added to obtain and relinquish capabilities explicitly
AFS's sequential consistency provided by breaking callbacks (notifying holders of potentially stale copies) when a write capability is issued
–File manager doesn't know when a write operation actually arrives at a drive, so it must tell clients as soon as a write may occur
–No new callbacks on a file with an outstanding write capability
AFS enforces a per-volume quota on allocated disk space
–File manager allocates space when it issues a capability and keeps track of how much space is actually written (bookkeeping sketched below)
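
A hedged sketch of that quota bookkeeping: the file manager charges the volume when it issues a write capability, then settles to the space actually consumed. Entirely illustrative; the paper gives no code for this.

```python
class VolumeQuota:
    def __init__(self, limit_bytes: int):
        self.limit = limit_bytes
        self.reserved = 0     # space promised by outstanding write capabilities
        self.used = 0         # space actually written

    def issue_write_capability(self, max_bytes: int):
        if self.used + self.reserved + max_bytes > self.limit:
            raise IOError("per-volume quota exceeded")
        self.reserved += max_bytes

    def settle(self, max_bytes: int, actually_written: int):
        # Called once the file manager learns how much was really written.
        self.reserved -= max_bytes
        self.used += actually_written
```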

15 Active Disks
Provide full application-level programmability of drives
Customize functionality for data-intensive computations
NASD's object-based interface provides knowledge of data at the devices without having to use external metadata

