NeST: Network Storage
John Bent, Venkateshwaran V, Miron Livny, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau

NeST: Network storage
Flexible, commodity-based, software-only storage appliances
A storage appliance must be
– easy to deploy
– self-configurable to minimize administrative costs
– reliable but also recoverable
– secure against intrusion yet transparent to legitimate users

From commodity to appliance
Why build an appliance on a commodity system?
– Want to ride the commodity cost curve
    Avoid the high cost of specialized servers (e.g. NetApp)
– Leverage high availability of commodity systems
– Allow on-demand acquisition and deployment
Challenges of building on commodity systems
– Must be portable across multiple operating systems
    Additional challenge: achieving high performance at user level
– Must be adaptable across a range of storage devices

NeST structure
Protocol layer
– Pluggable protocols map diverse protocols into common control flows
– Analogous to Linux v-nodes
Transfer layer
– Different concurrency architectures maximize system throughput across diverse platforms
Storage layer
– Provides abstract interface to disks and memory
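
The idea of pluggable protocols feeding a common control flow can be illustrated with a minimal sketch. Everything named here (`Request`, `parse_http`, `parse_ftp`, `PROTOCOLS`, `dispatch`) is hypothetical and chosen for illustration; it is not NeST's actual interface, and the parsers handle only a toy subset of each protocol.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Request:
    """Protocol-neutral request handed to the common control flow."""
    op: str            # "get" or "put"
    path: str
    data: bytes = b""


def parse_http(line: str) -> Request:
    # Toy mapping (assumption): "GET /path ..." -> get, "PUT /path ..." -> put
    method, path = line.split()[:2]
    return Request(op=method.lower(), path=path)


def parse_ftp(line: str) -> Request:
    # Toy mapping (assumption): "RETR path" -> get, "STOR path" -> put
    verb, path = line.split(maxsplit=1)
    return Request(op={"RETR": "get", "STOR": "put"}[verb], path=path)


# Adding a protocol means registering one more parser; the control flow is untouched.
PROTOCOLS: Dict[str, Callable[[str], Request]] = {
    "http": parse_http,
    "ftp": parse_ftp,
}


def dispatch(protocol: str, line: str, store: Dict[str, bytes],
             payload: bytes = b"") -> bytes:
    """Common control flow shared by every protocol."""
    req = PROTOCOLS[protocol](line)
    req.data = payload
    if req.op == "get":
        return store[req.path]
    if req.op == "put":
        store[req.path] = req.data
        return b""
    raise ValueError(f"unsupported operation: {req.op}")
```

In this sketch, `dispatch("http", "GET /a.txt HTTP/1.0", store)` and `dispatch("ftp", "RETR /a.txt", store)` reach the same `get` branch, which is the property the protocol layer is meant to provide.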

NeST Structure (diagram)
Protocol Layer: GFTP, NeST, WiND, HTTP, NFS
Control Logic
Concurrency Architecture: pool of processes, nonblocking, pool of threads
Storage Layer: raw disk, local FS, RAID, memory

Many Protocols, Single Server
Single administrative interface
– Set policies, manage user accounts
Maintainable software
– Shared code base reduces duplication, increases maintainability
Different protocols for different purposes
– GridFTP for wide-area transfers
– Chirp for local-area accesses and reservations
Single point of control
– Storage quotas/guarantees can be supported
– Bandwidth can be controlled and QoS provided

Concurrency architecture
Three difficult goals
– Low latency
– High bandwidth
– Multiple simultaneous clients
No single portable solution
– Multiple models provide solutions on a range of different platforms
    Multi-threaded
    Multi-process
    Single-process event-driven
– Control logic dynamically selects the “best” model
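
The slides do not say how the control logic picks the “best” model. One plausible approach, sketched below under that assumption, is to try each model once and then prefer whichever has shown the highest observed throughput. The class `ModelSelector` and the model names are hypothetical, and the sketch considers bandwidth only, not latency.

```python
from typing import Dict, List, Optional


class ModelSelector:
    """Tracks observed throughput per concurrency model and picks the best."""

    def __init__(self, models: List[str]) -> None:
        self.models = list(models)
        self.total_bytes: Dict[str, int] = {m: 0 for m in self.models}
        self.total_seconds: Dict[str, float] = {m: 0.0 for m in self.models}

    def record(self, model: str, nbytes: int, seconds: float) -> None:
        """Record one completed transfer served under `model`."""
        self.total_bytes[model] += nbytes
        self.total_seconds[model] += seconds

    def throughput(self, model: str) -> Optional[float]:
        """Bytes per second seen so far, or None if the model is untried."""
        secs = self.total_seconds[model]
        return self.total_bytes[model] / secs if secs > 0 else None

    def choose(self) -> str:
        # Try every model at least once before comparing them.
        for m in self.models:
            if self.throughput(m) is None:
                return m
        # Then route new requests to the model with the best throughput.
        return max(self.models, key=lambda m: self.throughput(m))
```

A server would call `choose()` when a request arrives, serve it under the returned model, and call `record()` when the transfer completes, so the choice adapts to the platform it is running on.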

Storage Layer
Abstract storage models
– Virtual storage model akin to virtual protocol layer
– RAID, JBOD, etc.
– Memory storage model also a possibility
    Provide file system interface to remote memory
    Useful for store-and-forward buffering systems like Kangaroo
Single interface for storage resource management
– Reservations
– Access control
– User and group management
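
As a rough sketch of how an abstract storage model and reservation handling might fit together, the example below places a memory-backed store behind a small interface and enforces per-user reservations in one place. The names `StorageBackend`, `MemoryBackend`, and `StorageManager` are assumptions made for illustration, not NeST's real classes; access control and group management are left out.

```python
from abc import ABC, abstractmethod
from typing import Dict, Tuple


class StorageBackend(ABC):
    """Abstract storage model: disk, RAID, or memory would each implement this."""

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, path: str) -> bytes: ...


class MemoryBackend(StorageBackend):
    """Memory storage model: files live in a dictionary."""

    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> None:
        self._files[path] = data

    def read(self, path: str) -> bytes:
        return self._files[path]


class StorageManager:
    """Single point where reservations are granted and enforced."""

    def __init__(self, backend: StorageBackend, capacity: int) -> None:
        self.backend = backend
        self.capacity = capacity
        self._reserved: Dict[str, int] = {}            # user -> reserved bytes
        self._sizes: Dict[Tuple[str, str], int] = {}   # (user, path) -> bytes

    def reserve(self, user: str, nbytes: int) -> None:
        if sum(self._reserved.values()) + nbytes > self.capacity:
            raise RuntimeError("not enough unreserved space")
        self._reserved[user] = self._reserved.get(user, 0) + nbytes

    def used(self, user: str) -> int:
        return sum(n for (owner, _), n in self._sizes.items() if owner == user)

    def write(self, user: str, path: str, data: bytes) -> None:
        current = self._sizes.get((user, path), 0)   # size being overwritten
        if self.used(user) - current + len(data) > self._reserved.get(user, 0):
            raise PermissionError("write exceeds reservation")
        self.backend.write(f"{user}/{path}", data)   # per-user namespace
        self._sizes[(user, path)] = len(data)

    def read(self, user: str, path: str) -> bytes:
        return self.backend.read(f"{user}/{path}")
```

Because the manager talks only to `StorageBackend`, swapping `MemoryBackend` for a disk-backed implementation would not change the reservation logic.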

NeST, Condor and the Grid
Wide-area scheduling requirements:
– Buffers for wide-area data movements
– Staging for local replicas of distributed datasets
– Guaranteed availability for informed scheduling
– Flexible/dynamic management of user accounts
Example (diagram: distributed repository, local NeST):
1) Reserve space at NeST
2) Initiate transfer from distributed repository
3) Schedule jobs at local cluster
4) Jobs access data locally
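
The four-step example can be sketched in a few lines. This is an in-memory stand-in, not the NeST or Condor API: `LocalNeST`, `stage_and_run`, the dictionary used as the repository, and the plain functions used as jobs are all hypothetical. The point it illustrates is ordering: space is reserved before any data moves, so a failed reservation stops the workflow before jobs are scheduled.

```python
from typing import Callable, Dict, List


class LocalNeST:
    """Toy local storage appliance with space reservations."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.reserved = 0
        self.files: Dict[str, bytes] = {}

    def reserve(self, nbytes: int) -> None:
        if self.reserved + nbytes > self.capacity:
            raise RuntimeError("insufficient space for reservation")
        self.reserved += nbytes

    def store(self, name: str, data: bytes) -> None:
        self.files[name] = data


def stage_and_run(nest: LocalNeST, repository: Dict[str, bytes], name: str,
                  jobs: List[Callable[[bytes], object]]) -> List[object]:
    size = len(repository[name])
    nest.reserve(size)                      # 1) reserve space at NeST
    nest.store(name, repository[name])      # 2) transfer from the repository
    # 3) schedule jobs; 4) each job reads the local copy, not the repository
    return [job(nest.files[name]) for job in jobs]
```

For example, `stage_and_run(LocalNeST(10), {"d": b"abcd"}, "d", [len])` stages four bytes and runs one job against the local copy, while the same call with `LocalNeST(2)` raises before any transfer begins.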

Future work
HTTP administrative interface
Define metrics by which to measure
– Deployability
– Manageability
– Reliability
Allow reservations to have a “cost”
Provide mechanisms by which files can have arbitrary, searchable metadata