BlueWaters Storage Solution
Michelle Butler, NCSA
January 19, 2016

Overview
- Disk Storage Solution - Hardware and Software
- Archive Storage Solution - Hardware and Software
- Lustre - what users need to know
- Our current work

Disk Storage Solution
[Diagram: the /home and /project file systems (2.16 PB) and the /scratch file system (21.6 PB), connected over QDR InfiniBand to esLogin nodes, I/E and DM nodes, HPSS, and users.]

Disk Storage System - Lustre 2.1 (OpenSFS), Sonexion Lustre appliance from Cray
/home and /project: 2.16 PB usable
- 3 cabinets with 6 Scalable Storage Units (SSUs) each
- Each SSU has two OSSs in a sister-pair failover configuration
- Each SSU has 8 OSTs of 10 drives each (80 drives), for 144 OSTs in total
- Software RAID in an 8+2 configuration with 2 hot-spare disks
/scratch: 21.60 PB usable
- 30 Sonexion cabinets, 1,440 OSTs
- Total aggregate throughput of > 1 TB/s
(A sketch of how a user can confirm this layout from a login node follows below.)
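For orientation, a user on a login node can check the OST layout and per-OST fill level of each file system with the standard Lustre `lfs df` client command. The following is a minimal sketch, assuming `lfs` is on the PATH and that the mount points carry the names used in these slides; it is not an official site tool.

```python
import subprocess

def ost_summary(mount_point):
    """Run `lfs df -h` on a Lustre mount and count the OSTs it reports.

    `lfs df` prints one line per MDT/OST; OST lines end in "[OST:<index>]".
    Assumes the standard Lustre `lfs` client utility is installed.
    """
    out = subprocess.run(["lfs", "df", "-h", mount_point],
                         capture_output=True, text=True, check=True).stdout
    ost_lines = [line for line in out.splitlines() if "[OST:" in line]
    return len(ost_lines)

if __name__ == "__main__":
    # Mount-point names taken from the slides; adjust for your own system.
    for fs in ("/scratch", "/home", "/project"):
        try:
            print(f"{fs}: {ost_summary(fs)} OSTs reported by lfs df")
        except (FileNotFoundError, subprocess.CalledProcessError):
            print(f"{fs}: lfs unavailable or not a Lustre mount on this node")
```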

Archive Storage System
- RFP1107 from the University of Illinois: 100 GB/s with 500 PB of media
- ~420 TS1140 drives, with hot spares added
- 7 SpectraLogic tape libraries with slots for 380 PB
- First half of the environment installed ~March 2012; second half in early 2013, after production start
- Buying 4 TB tapes in 20 PB increments
- Media was the largest portion of the cost

Lustre - User Perspective
- Lustre does not choose striping on behalf of the user; directories are given a default stripe depth.
- Users can change the stripe depth easily based on their data, but they do need some experience with these settings (see the striping sketch below).
- The stripe depth is set as a number of OSTs; a single file cannot be striped over more than 160 OSTs today.
- Stay tuned for best practices for the file systems, including stripe depth and block sizes.
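As an illustration of how a user changes striping, the sketch below wraps the standard Lustre `lfs setstripe` and `lfs getstripe` commands. It is a minimal sketch, not a Blue Waters recommendation: the directory path, the stripe count of 16, and the 4 MB stripe size are made-up example values.

```python
import subprocess

def set_default_striping(directory, stripe_count=16, stripe_size="4m"):
    """Set the default stripe layout that new files in `directory` inherit.

    Uses the standard `lfs setstripe` client command:
      -c  stripe count (number of OSTs a file is spread across; -1 = all)
      -S  stripe size  (bytes written to one OST before moving to the next)
    The values here are illustrative, not site recommendations.
    """
    subprocess.run(["lfs", "setstripe", "-c", str(stripe_count),
                    "-S", stripe_size, directory], check=True)

def show_striping(path):
    """Print the current stripe layout of a file or directory."""
    subprocess.run(["lfs", "getstripe", path], check=True)

if __name__ == "__main__":
    # Hypothetical scratch directory; substitute your own project path.
    target = "/scratch/my_project/output"
    set_default_striping(target, stripe_count=16, stripe_size="4m")
    show_striping(target)
```

The same effect comes from running `lfs setstripe -c 16 -S 4m <dir>` directly on the command line; files created in the directory afterward inherit that layout, while existing files keep the layout they were created with.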

Lustre - User Perspective: Current Work
- Job placement and which OSTs to use for reads and writes
- Working with I/O libraries (ADIOS, NetCDF, HDF5) on Lustre file system configurations (see the MPI-IO hint sketch below)
- Data movement to and from the archive: have all data in Lustre before a job begins, store the data that needs to be kept, and stay within quotas
- Parallel user commands such as grep, cp, mv, ...
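As one example of wiring an I/O library to the Lustre configuration, MPI-IO accepts striping hints when a file is first created. The sketch below uses mpi4py (an assumption for illustration; the slides do not name a specific interface) to set the ROMIO/Cray MPI-IO hints `striping_factor` and `striping_unit`; the output path and the hint values are hypothetical.

```python
from mpi4py import MPI

def open_striped_file(comm, path, stripe_count=32, stripe_size=4 * 1024 * 1024):
    """Create a file through MPI-IO with Lustre striping hints attached.

    `striping_factor` and `striping_unit` are hints recognized by ROMIO-based
    and Cray MPI-IO implementations; they only take effect when the file is
    first created on a Lustre file system. The values are examples only.
    """
    info = MPI.Info.Create()
    info.Set("striping_factor", str(stripe_count))  # number of OSTs to stripe over
    info.Set("striping_unit", str(stripe_size))     # stripe size in bytes
    mode = MPI.MODE_CREATE | MPI.MODE_WRONLY
    return MPI.File.Open(comm, path, mode, info)

if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    # Hypothetical output path; substitute a real scratch directory.
    fh = open_striped_file(comm, "/scratch/my_project/checkpoint.dat")
    fh.Write_at_all(comm.rank * 8, bytearray(b"example!"))  # 8 bytes per rank
    fh.Close()
```

Parallel HDF5 and NetCDF sit on top of MPI-IO and can pass an MPI_Info object with the same hints through their parallel create/open calls, which is the kind of library configuration work the bullet above refers to.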

Other Projects
- RAIT for HPSS
- Stability for these large file systems
- Backup of the /home and /project file systems
- Data movement from outside projects and other large NSF centers, and use of those resources
- OpenSFS directions for Lustre