Slide 1: HPSS Features and Futures
Presentation to SCICOMP4
Randy Burris, ORNL's Storage Systems Manager
Oak Ridge National Laboratory, U.S. Department of Energy
Slide 2: Table of Contents
- Background – design goals and descriptions
- General information
  - Architecture
  - How it works
  - Infrastructure
- HPSS 4.3 – current release (as of Sept. 1)
- HPSS 4.5
- HPSS 5.1
  - Background
  - Main features
Slide 3: HPSS is…
- A file-based storage system – software only.
- Extremely scalable, targeting:
  - Millions of files
  - Multiple-petabyte capacity
  - Gigabyte/second transfer rates
  - Single files ranging to terabyte size
- Distributed:
  - Multiple nodes
  - Multiple instances of most servers
- Winner of an R&D 100 award (1997).
Slide 4: HPSS is…
- Developed by LLNL, Sandia, LANL, ORNL, NERSC, and IBM.
- Used in more than 40 very large installations:
  - ASCI (Livermore, Sandia, and Los Alamos labs)
  - High-energy physics sites (SLAC, Brookhaven, other US sites, and sites in Europe and Japan)
  - NASA
  - Universities
- Examples at ORNL:
  - As an archiving system: ARM
  - As a backup system: backups of servers, O2000
  - As an active repository: climate, bioinformatics, …
Slide 5: Example of the type of configuration HPSS is designed to support
[Diagram: HPSS server(s) and secondary server(s) connect a parallel RAID disk farm, a parallel tape farm, and local devices over a HIPPI/GigE/ATM network to workstation clusters, parallel systems, sequential systems, visualization engines, and frame buffers. Clients reach the system via HSI, NFS, FTP, and DFS over LANs, WANs, and the Internet. Throughput is scalable to the GB/s region.]
Slide 6: HPSS Software Architecture
[Diagram: HPSS software architecture; green components are defined in the IEEE Mass Storage Reference Model.]
- Common infrastructure services: communications, security, transaction manager, metadata manager, logging, 64-bit math libraries
- Servers: bitfile servers, storage servers, name servers, location servers, migration/purge, repack, movers, physical volume library, physical volume repositories
- Clients: Client API, PFS, data management applications
- System daemons: HSI, FTP & PFTP, NFS, DFS
- Other modules: storage system management (all components), NSL UniTree migration, installation
Slide 7: How's it work?
- The user stores a file using hsi, ftp, parallel ftp, or nfs.
- The file is sent to a particular Class of Service (COS), depending on user selection or defaults.
- The default COS specifies a hierarchy with disk at the top level and tape below it, so the file is first stored on disk (the HPSS cache).
- When enough time elapses or the cache gets full enough, the file is automatically copied to the next level – tape – and purged from disk (sketched below).
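To make the cache behavior concrete, here is a minimal Python sketch of a time- and capacity-driven migrate/purge policy. It is purely illustrative: the class names, thresholds, and policy rules are invented for this example and are not HPSS code.

```python
from dataclasses import dataclass, field
import time

@dataclass
class CachedFile:
    name: str
    size: int
    stored_at: float
    migrated: bool = False   # copied to the next level (tape)?
    purged: bool = False     # removed from the disk cache?

@dataclass
class DiskCache:
    capacity: int
    files: list = field(default_factory=list)

    def used(self) -> int:
        return sum(f.size for f in self.files if not f.purged)

def run_policy(cache: DiskCache, min_age_s: float, high_water: float, low_water: float):
    """Migrate files older than min_age_s, then purge already-migrated files
    when usage exceeds the high-water mark, until it drops below the
    low-water mark. Thresholds are hypothetical, for illustration only."""
    now = time.time()
    for f in cache.files:
        if not f.migrated and now - f.stored_at >= min_age_s:
            f.migrated = True            # copy to tape (next level in the COS hierarchy)
    if cache.used() > high_water * cache.capacity:
        for f in sorted(cache.files, key=lambda f: f.stored_at):
            if f.migrated and not f.purged:
                f.purged = True          # free the disk copy; the tape copy remains
                if cache.used() <= low_water * cache.capacity:
                    break
```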
Slide 8: HPSS Infrastructure
HPSS depends upon (i.e., is layered over):
- Operating system (AIX or Solaris for core servers)
- Distributed Computing Environment (DCE):
  - Security – authentication and authorization
  - Name service
  - Remote Procedure Calls
- Encina Structured File System – a flat-file system used to store metadata such as file names, segment locations, etc. Encina is built upon DCE.
- GUI – the Sammi product from Kinesix
- Distributed File System (DFS) – for some installations. DFS is built upon DCE.
Slide 9: HPSS 4.3 (newest released version)
- Support for new hardware:
  - StorageTek 9940 tape drives
  - IBM Linear Tape Open (LTO) tape drives and robots
  - Sony GY-8240 tape drives
- Redundant Arrays of Independent Tapes (RAIT):
  - An ASCI PathForward project contracted with StorageTek
  - Target is multiple tape drives striped with parity (illustrated in the sketch below)
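To make the idea behind RAIT concrete, here is a small Python sketch that stripes a block across several "drives" and appends one XOR parity stripe, so any single lost stripe can be rebuilt. It illustrates only the general parity-striping technique, not StorageTek's or HPSS's actual RAIT design.

```python
def stripe_with_parity(block: bytes, num_data_drives: int) -> list[bytes]:
    """Split a block across num_data_drives stripes and append one XOR
    parity stripe. Purely illustrative; not a real RAIT format."""
    stripe_len = -(-len(block) // num_data_drives)            # ceiling division
    padded = block.ljust(stripe_len * num_data_drives, b"\0")
    stripes = [padded[i * stripe_len:(i + 1) * stripe_len]
               for i in range(num_data_drives)]
    parity = bytearray(stripe_len)
    for s in stripes:
        for i, b in enumerate(s):
            parity[i] ^= b                                     # running XOR parity
    return stripes + [bytes(parity)]

def rebuild_missing(stripes: list[bytes], missing: int) -> bytes:
    """Reconstruct one missing stripe by XOR-ing all surviving stripes
    (including the parity stripe)."""
    stripe_len = len(next(s for i, s in enumerate(stripes) if i != missing))
    out = bytearray(stripe_len)
    for i, s in enumerate(stripes):
        if i == missing:
            continue
        for j, b in enumerate(s):
            out[j] ^= b
    return bytes(out)

# Usage: lose any one stripe and rebuild it from the rest.
stripes = stripe_with_parity(b"example payload striped across tapes", 4)
assert rebuild_missing(stripes, 2) == stripes[2]
```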
Slide 10: HPSS 4.3 (continued)
Mass configuration:
- Earlier, each device or server had to be individually configured through the GUI.
- That could be tedious and error-prone for installations with hundreds of drives or servers.
- Mass configuration takes advantage of the command-line interface (new with HPSS 4.2).
- It allows scripted configuration of devices and various types of servers (see the sketch below).
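A rough sketch of what scripted mass configuration can look like: generate one configuration command per drive from a template and feed the lines to the CLI. The command name, options, and drive names below are hypothetical placeholders, not the real HPSS command-line interface.

```python
# Hypothetical drive names and a hypothetical configuration command,
# used only to show the shape of a scripted (mass) configuration.
DRIVES = [f"9940_{i:03d}" for i in range(1, 101)]

def config_commands(drives, device_class="tape", mover_host="mover1"):
    for name in drives:
        yield (f"hpss_config_device --name {name} "      # invented command/flags
               f"--class {device_class} --mover {mover_host}")

if __name__ == "__main__":
    for cmd in config_commands(DRIVES):
        print(cmd)   # in practice, these lines would drive the real CLI
```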
Slide 11: HPSS 4.3 (continued)
Support for IBM High Availability configurations:
- HACMP (High Availability Cluster Multi-Processing) hardware feature
- HACMP-supporting AIX software
- Handles node and network-interface failures
- Essentially a controlled failover to a spare node
- Initiated manually
Slide 12: HPSS 4.3 (continued)
Other features:
- Support for Solaris 8
- Client API ported to Red Hat Linux
- Support for NFS v3
By the way: in our Probe testbed, we're running HPSS 4.3 on AIX 5L on our S80. Not certified, just trying it to see what happens.
Slide 13: HPSS 4.5 – target date 7/1/2002
Features:
- Implements an efficient, transparent interface for users to access their HPSS data
- Uses HPSS as an archive
- Available freely for Linux (no licensing fee)
Key requirements:
- Support HPSS access via XFS using DMAPI
- XFS/HPSS filesystems shall be accessible via NFS for transparent access
- Support archived filesets (rename/delete)
- Support on Linux
Slide 14: HPSS 4.5 (continued)
- Provide migration and purge from XFS based on policy
- Stage data from HPSS when data has been purged from XFS (a stage-on-access sketch follows this list)
- Support whole and partial file migration
- Support utilities for the following:
  - Create/delete XFS fileset metadata in HPSS
  - List HPSS filenames in an archived fileset
  - List XFS names of files
  - Compare archive dumps from HPSS and XFS
  - Delete all files from the HPSS side of an XFS fileset
  - Delete files older than a specified age from the HPSS side
  - Recover files deleted from XFS filesets but not yet deleted from HPSS
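As a rough illustration of the stage-on-access behavior described above, the sketch below assumes a DMAPI-like event loop: a read of a file whose data has been purged from XFS triggers a stage from HPSS before the read proceeds. The Archive and ReadEvent classes are hypothetical stand-ins, not the real DMAPI or HPSS interfaces.

```python
from dataclasses import dataclass

@dataclass
class ReadEvent:
    path: str
    resident: bool        # is the file's data still on XFS?

class Archive:
    """Hypothetical handle to the HPSS side of a managed XFS fileset."""
    def stage(self, path: str) -> bytes:
        print(f"staging {path} from HPSS back into XFS")
        return b"...file data..."

def handle_read(event: ReadEvent, archive: Archive) -> bytes:
    if not event.resident:
        data = archive.stage(event.path)   # bring the purged data back first
        event.resident = True
        return data
    return b"...data already resident on XFS..."

# Usage: a read of a purged file triggers a stage; a resident file does not.
archive = Archive()
handle_read(ReadEvent("/xfs/climate/run42.nc", resident=False), archive)
handle_read(ReadEvent("/xfs/climate/run43.nc", resident=True), archive)
```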
Slide 15: HPSS 5.1 – release date Jan. 2003
Background:
- HPSS was designed in 1992/1993 as a total rewrite of NSL UniTree.
- Goal: achieve speed using many parallel servers.
- The Distributed Computing Environment (DCE) was a prominent and promising infrastructure product.
- Encina's Structured File System (SFS) was the only product supporting distributed nested transactions.
- The management GUI was mandated to be Sammi, from Kinesix, because of anticipated reuse of NSL UniTree screens.
Slide 16: HPSS 5.1 Background (continued)
Today:
- DCE – future in doubt
- Encina's Structured File System:
  - Future in doubt
  - Performance problems
  - Nested transactions are no longer needed, nor are distributed transactions
- Sammi – relatively expensive and feature-poor
Slide 17: HPSS 5.1 Features
New basic structure:
- DCE still used – still no alternative
- Designing a "core" server combining the name server, the bitfile server, the storage server, and parts of the Client API
- Replacing SFS with a commercial DBMS (DB2), but the design and coding goal is easy replacement of the DBMS (see the sketch below)
- Considerable speed improvement expected: Oracle and DB2 were both roughly 10 times faster than SFS in a model run in ORNL's Probe testbed
- Reduced communication between servers
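A minimal sketch of the "easy DBMS replacement" goal, assuming the core server reaches metadata only through a narrow interface so that each DBMS needs just one backend class. The interface, class names, and operations are invented for illustration; they are not the actual HPSS metadata manager.

```python
from abc import ABC, abstractmethod

class MetadataBackend(ABC):
    """Hypothetical metadata interface the core server codes against."""
    @abstractmethod
    def insert_bitfile(self, bitfile_id: str, attrs: dict) -> None: ...
    @abstractmethod
    def lookup_bitfile(self, bitfile_id: str) -> dict: ...

class DB2Backend(MetadataBackend):
    def __init__(self, dsn: str):
        self.dsn = dsn          # real connection handling omitted in this sketch
        self.rows = {}
    def insert_bitfile(self, bitfile_id, attrs):
        self.rows[bitfile_id] = attrs
    def lookup_bitfile(self, bitfile_id):
        return self.rows[bitfile_id]

class CoreServer:
    """Combined name/bitfile/storage logic would sit here; it sees only the
    interface, so swapping DB2 for another DBMS means one new backend class."""
    def __init__(self, metadata: MetadataBackend):
        self.metadata = metadata
    def create_file(self, name: str, cos: int):
        self.metadata.insert_bitfile(name, {"cos": cos, "segments": []})

server = CoreServer(DB2Backend("db2://hpss-metadata"))   # hypothetical DSN
server.create_file("/home/user/results.dat", cos=1)
```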
Slide 18: HPSS Software Architecture
[The HPSS software architecture diagram from slide 6 is shown again for reference.]
Slide 19: New Java Admin Interface
User benefits:
- Fast
- Immediately portable to Unix, Windows, and Macintosh
- Picking up various manageability improvements
Developer benefits:
- Object-oriented, with much code sharing
- Central communication and processing engine with different presentation engines (see the sketch below):
  - GUI
  - ASCII, for the command-line interface
  - A third one, a Web interface, would be easy to add later
- Overall maintenance much easier – code generated from HPSS C structures
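A rough sketch of the shared-engine pattern described above: one central processing engine with interchangeable presentation front ends (GUI, ASCII/CLI, and later a Web interface). The class names and the sample operation are invented for illustration only.

```python
class AdminEngine:
    """Central communication/processing engine shared by all front ends."""
    def server_status(self) -> dict:
        return {"bitfile_server": "up", "movers": 8}      # placeholder data

class Presentation:
    def __init__(self, engine: AdminEngine):
        self.engine = engine
    def show_status(self):
        raise NotImplementedError

class AsciiPresentation(Presentation):
    def show_status(self):
        for name, value in self.engine.server_status().items():
            print(f"{name:20} {value}")

class GuiPresentation(Presentation):
    def show_status(self):
        # a real GUI would render widgets; here we only note the call path
        print("[GUI] rendering", self.engine.server_status())

engine = AdminEngine()
AsciiPresentation(engine).show_status()   # same engine, different presentations
GuiPresentation(engine).show_status()
```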
Slide 20: Future futures
These topics are under discussion; no guarantees. In each case, a gating factor is the availability of staff to do the development.
- Modification of HPSS's parallel FTP to comply with the GridFTP specifications. Interest from ASCI, Argonne, and others.
- GPFS/HPSS interface. Participants: LLNL, LBNL, Indiana University, and IBM; seeking further help.
- SAN exploitation – a gleam in the eye right now.
Slide 21: Questions?
- HPSS home page: http://www4.clearlake.ibm.com/hpss/
- HPSS tutorial: http://www.sdsc.edu/hpss/hpss1.html
- Center for Computational Sciences: http://www.ccs.ornl.gov
- Computer Science and Mathematics Division: http://www.csm.ornl.gov
- Probe testbed: http://www.csm.ornl.gov/PROBE