OLCF HPSS Performance Then and Now
Jason Hill, HPC Operations Storage Team Lead
2016-05-25
ORNL is managed by UT-Battelle for the US Department of Energy

2 Overview
Where we came from
Where we are today
Performance analysis
On the horizon
Questions

3 OLCF HPSS ~CY
2 x DDN SFA10K (10 GB/s ea)
–2 PB disk cache capacity
8 Disk movers
–Responsible for data ingress and migration to tape
–10 GbE networking (~1 GB/s each)
120 Oracle T10K-{A,B,C} tape drives
–~150 MB/s each for T10K-{A,B}
–~250 MB/s each for T10K-C

4 OLCF HPSS ~
2 x DDN SFA10K (10 GB/s ea)
–2 PB disk cache capacity
1 x NetApp E5560
–Files < 16 MB
–330 TB of capacity, 150 TB utilized
8 Disk movers
–Responsible for data ingress and migration to tape
–10 GbE networking (~1 GB/s each)
40 GbE network switch to Disk movers
120 Oracle T10K-{A,B,C} tape drives
–~150 MB/s each
32 Oracle T10K-D tape drives
–252 MB/s each

5 OLCF HPSS ~2014
Disk
2 x DDN SFA10K (10 GB/s ea)
–2 PB disk cache capacity
3 x DDN SFA12K (40 GB/s)
–~12 PB raw capacity for cache
1 x NetApp E5560
–Files < 16 MB
–330 TB of capacity, 150 TB utilized
20 Disk movers
–Responsible for data ingress and migration to tape
–40 GbE networking
Tape
120 Oracle T10K-{A,B,C} tape drives
–~150 MB/s each
32 Oracle T10K-D tape drives
–252 MB/s each
Network
2 x Arista GbE switches
–Connectivity to Disk movers
–Connectivity between Disk and Tape movers
–13 x 100 GbE ISLs

6 OLCF HPSS ~2015
Disk
1 x DDN SFA10K
5 x DDN SFA12K (40 GB/s)
–~20 PB raw capacity for cache
1 x NetApp E5560
–Files < 16 MB
–330 TB of capacity, 150 TB utilized
44 Disk movers
–40 GbE networking
Tape
120 Oracle T10K-{A,B,C} tape drives
–~150 MB/s each
72 Oracle T10K-D tape drives
–252 MB/s each
Network
2 x Arista GbE switches
–Connectivity to Disk movers
–Connectivity between Disk and Tape movers
–13 x 100 GbE ISLs

7 OLCF HPSS ~2016
Disk
5 x DDN SFA12K (40 GB/s)
–~20 PB raw capacity for cache
1 x NetApp E5560
–Files < 16 MB
–330 TB of capacity, 150 TB utilized
40 Disk movers
–40 GbE networking
Tape
112 Oracle T10K-D tape drives
–252 MB/s each
Network
2 x Arista GbE switches
–Connectivity to Disk movers
–Connectivity between Disk and Tape movers
–13 x 100 GbE ISLs
Metadata
NetApp EF560
–SAS connected; all flash
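As a rough cross-check of how the tape tier keeps pace with the disk cache, the nominal aggregate rates can be worked out directly from the counts above. The Python sketch below just multiplies out the 2016 figures; the device counts and per-device rates come from this slide, and everything else is back-of-the-envelope arithmetic rather than measured throughput.

    # Back-of-the-envelope aggregate rates for the ~2016 configuration,
    # using only the figures quoted on the slide above.
    DISK_ARRAYS = 5          # DDN SFA12K units
    DISK_GBPS_EACH = 40      # nominal GB/s per SFA12K
    TAPE_DRIVES = 112        # Oracle T10K-D drives
    TAPE_MBPS_EACH = 252     # nominal MB/s per drive

    disk_gbps = DISK_ARRAYS * DISK_GBPS_EACH
    tape_gbps = TAPE_DRIVES * TAPE_MBPS_EACH / 1000.0

    print(f"Nominal disk cache bandwidth: {disk_gbps} GB/s")
    print(f"Nominal aggregate tape bandwidth: {tape_gbps:.1f} GB/s")      # ~28.2 GB/s
    print(f"Disk-to-tape bandwidth ratio: {disk_gbps / tape_gbps:.1f}:1")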

8 Performance analysis
In 2013 we were asked how long it would take to put 1 PB of data into HPSS:
–Getting it to disk took 21 days
–Migration to tape took another 35 days
–Single directory, multiple files, processed serially from a single node
This seems inefficient, right?
Data points like this drive investments in hardware and in the user experience.
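To put the 2013 anecdote in throughput terms, the figures above work out to well under a gigabyte per second sustained in each stage. A minimal sketch of that arithmetic, assuming a decimal petabyte (10^15 bytes) and nothing beyond the numbers quoted on the slide:

    # Effective sustained rates for the 2013 data point:
    # 1 PB to the HPSS disk cache in 21 days, then 35 more days to tape.
    PB = 1e15                 # bytes, assuming a decimal petabyte
    SECONDS_PER_DAY = 86400

    disk_rate = PB / (21 * SECONDS_PER_DAY) / 1e9   # GB/s to disk cache (~0.55)
    tape_rate = PB / (35 * SECONDS_PER_DAY) / 1e9   # GB/s to tape (~0.33)

    print(f"Ingest to disk cache: ~{disk_rate:.2f} GB/s sustained")
    print(f"Migration to tape:    ~{tape_rate:.2f} GB/s sustained")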

9 Performance Analysis
In February a user moved 1.3 PB of data into HPSS in 8 days:
–Utilized the DTN scheduled queue
–Requested extended wall time (not needed in the end)
–Used the HTAR utility (see the sketch below)
Migration to tape took ~12 additional days to complete
A significant improvement over the 2013 data point
Other success stories are out there
[Table: daily totals for 2/10 through 2/16 (date, # files, TB transferred, daily total); per-day values not recoverable]
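The same arithmetic puts the February transfer at roughly 1.9 GB/s sustained into the disk cache (1.3 PB over 8 days), versus about 0.55 GB/s for the 2013 case. For reference, below is a minimal sketch of the kind of HTAR invocation involved, wrapped in Python so it could be launched from a DTN batch job; the HPSS destination and local directory names are hypothetical, and it assumes the htar client is on the node's PATH.

    # Minimal sketch: bundle a local directory into an HPSS-resident tar
    # archive with HTAR and check the exit status. Paths are hypothetical.
    import subprocess
    import sys

    hpss_archive = "/proj/abc123/run042.tar"   # hypothetical HPSS destination
    local_dir = "run042"                        # hypothetical local directory

    # htar -c creates the archive directly in HPSS, -v lists members, -f names it.
    result = subprocess.run(["htar", "-cvf", hpss_archive, local_dir])
    if result.returncode != 0:
        sys.exit(f"htar failed with exit code {result.returncode}")
    print(f"Archived {local_dir} to {hpss_archive}")

Splitting a large dataset across many such jobs in the DTN scheduled queue, rather than streaming it serially from one node, is presumably a large part of the gap between the two data points.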

10 More HPSS data points
It is not uncommon for ingest into HPSS to exceed 100 TB per day
HPSS currently has over 52 PB stored
Continuous investments to improve user experience and data security:
–Redundant Array of Inexpensive Tapes (RAIT) deployed (see the conceptual sketch below)
–Migrating away from T10K-{A,B,C} technology to make way for the next-generation tape drive and media
–Significant improvements in R/W speeds and capacities
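RAIT itself is internal to HPSS, but the underlying idea is the same parity scheme RAID uses, applied across a stripe of tapes. The Python sketch below is purely a conceptual illustration of single-parity striping, not HPSS code: it shows why losing any one tape in a stripe no longer means losing the data.

    # Conceptual single-parity striping (RAIT-style illustration, not HPSS code).
    # Data blocks go to N data tapes, their XOR goes to a parity tape; XOR of
    # the surviving blocks reconstructs any single lost block.
    from functools import reduce

    def xor_blocks(blocks):
        # XOR equal-length byte blocks together, byte by byte.
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

    data_stripe = [b"AAAA", b"BBBB", b"CCCC"]   # blocks on three data tapes
    parity = xor_blocks(data_stripe)             # block on the parity tape

    # Simulate losing the second tape and rebuilding its block from the rest.
    survivors = [data_stripe[0], data_stripe[2], parity]
    rebuilt = xor_blocks(survivors)
    assert rebuilt == data_stripe[1]
    print("Reconstructed lost block:", rebuilt)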

11 Looking to the future
Lustre purging is ongoing:
–Scratch areas are purged at 14 days
–Project areas are purged at 15 days
Storage controller failures and other unforeseen circumstances can cause data loss on $WORKDIR file systems (see the archival sketch below)
HSI/HTAR is currently available from login nodes, interactive DTN nodes, and scheduled DTN nodes
Working to deploy a Globus interface to HPSS
–Some hurdles to clear, but progress is meeting expectations
What are your hurdles to using HPSS more?
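Given the purge windows above, the practical takeaway is to archive anything you still need from $WORKDIR into HPSS before it ages out. Below is a minimal sketch of that kind of sweep, assuming hsi/htar are available on the node (as they are on the login and DTN nodes listed above); the HPSS path and the 12-day threshold are hypothetical choices for illustration, not site policy.

    # Minimal sketch: archive aging $WORKDIR subdirectories to HPSS ahead of
    # the scratch purge window. Paths and the age threshold are hypothetical.
    import os
    import subprocess
    import time

    workdir = os.environ["WORKDIR"]                  # assumes $WORKDIR is set
    hpss_dest = "/proj/abc123/workdir-archives"      # hypothetical HPSS path
    age_limit = 12 * 86400                           # archive before the 14-day purge

    now = time.time()
    for entry in os.scandir(workdir):
        if entry.is_dir() and now - entry.stat().st_mtime > age_limit:
            archive = f"{hpss_dest}/{entry.name}.tar"
            print(f"Archiving {entry.path} -> {archive}")
            subprocess.run(["htar", "-cf", archive, entry.path], check=True)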

12 Questions?
Jason Hill
hilljj at ornl dot gov
The research and activities described in this presentation were performed using the resources of the National Center for Computational Sciences at Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

13 Questions?
Thank you!!!
hilljj at ornl dot gov