JGI Data Migration Party! Kjiersten Fagnan, JGI/NERSC Consultant. September 27, 2013

Agenda
- Objectives
  - Describe the file systems and where data should be stored
  - Gain hands-on experience with data migration tools
  - Develop strategies for data management in your analysis
- Motivation
  - Where is my data???
  - Why so many file systems? Can't we keep /house?
  - What is a "high-performance" file system?
- Transferring data between file systems
  - File transfer protocols
  - Moving data from /house
  - Reading and writing data from my scripts
- Introduction to the NERSC Archive
  - Background
  - Mistakes to avoid

File system overview

Pop quiz!!
- What's the name of the file system that's retiring?
- Where should you write data from your compute jobs on the cluster?
- What file system do you land in when you log into Genepool (login nodes, gpints, etc.)?
- How many file systems are available to the JGI?
- Where do you have personal directories? What are the quotas on those directories?
- When was the last time you accessed a file on /house?

Timeline refresher: we're here already, with 8 weeks to go!

Don’t let this be you in December!

Old strategy
- /house was a collection of ALL the data at the JGI
- Number of files: 583 million
- Average time since a file was last accessed: 2 years!
- Backup policy: snapshots on some directories; backups of the entire system have not worked properly for ~1 year

New strategy – multiple file systems
- /projectb (working directories, 2.6 PB): SCRATCH/Sandboxes = "Wild West"; write here from compute jobs
- DnA (shared data, 1 PB): project directories, finished products, NCBI databases, etc.; read-only on compute nodes, read-write in the xfer queue
- WebFS (web services, 100 TB): small file system for web servers; mounted on gpwebs and in the xfer queue
- SeqFS (sequencer data, 500 TB): file system accessible to the sequencers at JGI

ProjectB SCRATCH (/projectb/scratch/<username>)
- Each user has 20TB of SCRATCH space
- There are 300 users with SCRATCH space on ProjectB; if all of those directories filled up, that would require 5.95 PB, more than the entirety of ProjectB
- PURGE POLICY: any file not used for 90+ days will be deleted

ProjectB SANDBOXES (/projectb/sandbox/<program>)
- Each program has a sandbox area; quotas total 1PB
- These directories are meant for active projects that require more than 90 days to complete, and are managed by each group
- Quotas are not easily increased; an increase requires JGI management approval
- This space is expensive
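Because the purge is driven by access time, you can preview which of your files would be candidates for removal before the policy kicks in. A minimal sketch, assuming GNU find and that $BSCRATCH points at your ProjectB scratch directory (as suggested by the environment variables in the hands-on later):

# List files not accessed in more than 90 days, with last access time, size and path
find "$BSCRATCH" -type f -atime +90 -printf '%A+  %s  %p\n' | sort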

DnA – Data n' Archive
- dm_archive (/global/dna/dm_archive): JAMO's data repository, where files stay on spinning disk until they expire; owned by the JGI archive account
- shared (migrating from ProjectB): /global/projectb/shared/<dir name> -> /global/dna/shared/<dir name>; NCBI databases, test datasets for benchmarks and software tests
- projectdirs (migrating from ProjectB): /global/projectb/projectdirs/<dir name> -> /global/dna/projectdirs/<dir name>; a place for data shared between groups that you do not want to register with JAMO (shared code, configuration files); will be backed up if less than 5TB (backups not in place yet)

WebFS
- Small file system for web server configuration files
- Ingest area for files uploaded through web services
- VERY SMALL and LOW PERFORMANCE file system; NOT intended for heavy I/O

SeqFS
- File system for the Illumina sequencers
- Predominantly used by the SDM group
- Raw data is moved from SeqFS to DnA with JAMO; you will only read the raw data from DnA, and you will never use SeqFS directly

Summary
- $HOME
  Purpose: store application code, compiled files. Pros: backed up, not purged. Cons: low performing; low quota.
- /projectb/scratch
  Purpose: large temporary files, checkpoints. Pros: highest performing. Cons: purged.
- /projectb/sandbox
  Purpose: active projects that need more than 90 days. Pros: highest performing; no purge. Cons: low quota.
- $DNAFS (/global/dna/)
  Purpose: for groups needing shared data access. Pros: optimized for reading data. Cons: shared file performance; read-only on compute nodes.
- $GSCRATCH
  Purpose: alternative scratch space. Pros: data available on almost all NERSC systems. Cons: shared file performance.

A high-performance parallel file system efficiently manages concurrent file access
- Your laptop has a file system, referred to as a "local file system"
- A networked file system allows multiple clients to access files, but treats concurrent access to the same file as a rare event
- A parallel file system builds on the concept of a networked file system:
  - efficiently manages hundreds to thousands of processors accessing the same file concurrently
  - coordinates locking, caching, buffering and file pointer challenges
  - scalable and high performing
- A file system is responsible for files, directories, access permissions, file pointers and file descriptors; moving data between memory and storage devices; coordinating concurrent access to files; managing the allocation and deletion of data blocks on the storage devices; and data recovery
- [Diagram: compute nodes connect through an internal network to I/O servers and a metadata server (MDS), which connect over an external network (likely FC) to disk controllers that manage failover and to the storage hardware (disks)]

Moving Data

Transfers within NERSC
- Recommended nodes for transfers from /house: dtn03.nersc.gov and dtn04.nersc.gov (the DTNs), or schedule jobs in the xfer queue
- Recommended nodes for transfers to/from ProjectB: schedule jobs in the xfer queue for transfers to DnA; use the DTNs or Genepool phase 2 nodes for transfers to the archive
- Recommended nodes for transfers to DnA: use the DTNs or genepool{10,11,12}.nersc.gov
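For interactive transfers from /house on a DTN, a plain rsync over the mounted file systems is usually enough. A minimal sketch, where the source and destination paths are placeholders for your own directories:

-bash-3.2$ ssh dtn03.nersc.gov
-bash-3.2$ rsync -avP /house/<your_group>/<your_dir>/ $BSCRATCH/<your_dir>/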

Using the xfer queue on Genepool
The batch system (UGE) is a great way to transfer data from ProjectB to DnA:

kmfagnan@genepool12 ~ $ cat projb_to_dna.sh
#!/bin/bash -l
#$ -N projb2dna
#$ -q xfer.q     (or -l xfer.c)
rsync <files> $DNAFS/projectdirs/<dir>
kmfagnan@genepool12 ~ $ qsub projb_to_dna.sh

- Each user can run up to 2 transfers at a time
- Only meant for transfers; no CPU-intensive jobs
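Once the job is submitted, you can track it with the usual UGE commands. A minimal sketch, using the job name from the script above:

kmfagnan@genepool12 ~ $ qstat -u $USER        # show your pending and running jobs, including the xfer job
kmfagnan@genepool12 ~ $ qstat -j projb2dna    # detailed status for the transfer job by name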

Data Transfer Nodes
- Nodes that are well connected to the file systems and the outside world
- 10Gb/s connection to the /house file system
- Optimized for data transfer
- Interactive
- No time limit
- Limited environment; NOT the same as the Genepool nodes

Let's move some data
- Log in to Genepool. What directory are you in?
- Do the following:
  echo $HOME
  echo $SCRATCH
  echo $BSCRATCH
  echo $GSCRATCH
  echo $DNAFS
- Pick a file and decide where you want to move it
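As a concrete version of the last step, here is a minimal sketch that copies a file from your home directory to your ProjectB scratch space and checks that it arrived (the file name is a placeholder, not a file that exists on the system):

-bash-3.2$ cp $HOME/example_data.txt $BSCRATCH/     # example_data.txt is a hypothetical file
-bash-3.2$ ls -l $BSCRATCH/example_data.txt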

Archive Basics

What is an archive?
- Long-term storage of permanent records and information
- Often data that is no longer modified or regularly accessed
- Storage time frame is indefinite, or as long as possible
- Archive data typically has, or may have, long-term value to the organization
- An archive is not a backup: a backup is a copy of production data, and the value and retention of backup data is short-term
- A backup is a copy of the data. An archive is the data.

Why should I use an archive?
- Data growth is exponential; file system space is finite
- 80% of stored data is never accessed after 90 days
- The cost of storing infrequently accessed data on spinning disk is prohibitive
- Important but less frequently accessed data should be stored in an archive to free faster disk for the processing workload

Features of the NERSC archive
- NERSC implements an "active archive"
- The NERSC archive supports parallel high-speed transfer and fast data access
  - Data is transferred over parallel connections on the NERSC internal 10Gb network
  - Access to the first byte in seconds or minutes, as opposed to hours or days
- The system is architected and optimized for ingest
- The archive uses tiered storage internally to facilitate high-speed data access
  - Initial data ingest goes to a high-performance FC disk cache
  - Data is migrated to an enterprise tape system and managed by HSM software (HPSS) based on age and usage
- The NERSC archive is a shared multi-user system
  - Shared resource with no batch system; inefficient use affects others
  - Session limits are enforced

Features of the NERSC archive, continued
- The NERSC archive is a Hierarchical Storage Management (HSM) system
- Highest performance requirements and access characteristics at the top level
- Lowest cost, greatest capacity at the lower levels
- Migration between levels is automatic, based on policies
- [Diagram: storage tiers ordered by increasing latency and capacity: fast disk, high-capacity disk, local disk or tape, remote disk or tape]

Using the NERSC Archive

How to Log In
- The NERSC archive uses an encrypted key for authentication
  - The key is placed in the ~/.netrc file at the top level of the user's home directory on the compute platform
  - All NERSC HPSS clients use the same .netrc file
  - The key is IP-specific; you must generate a new key for use outside the NERSC network
- Archive keys can be generated in two ways
  - Automatic (NERSC auth service): log into any NERSC compute platform using ssh, type "hsi", and enter your NERSC password
  - Manual (https://nim.nersc.gov/): under the "Actions" drop-down select "Generate HPSS Token", copy/paste the content into ~/.netrc, then chmod 600 ~/.netrc

Storing and Retrieving Files with HSI
- HSI provides a Unix-like command line interface for navigating archive files and directories
- Standard Unix commands such as ls, mkdir, mv, rm, chown, chmod, find, etc. are supported
- FTP-like interface for storing and retrieving files from the archive (put/get)

Store from file system to archive:
-bash-3.2$ hsi
A:/home/n/nickb-> put myfile
put 'myfile' : '/home/n/nickb/myfile' ( 2097152 bytes, 31445.8 KBS (cos=4))

Retrieve a file from the archive to the file system:
A:/home/n/nickb-> get myfile
get 'myfile' : '/home/n/nickb/myfile' (2010/12/19 10:26:49 2097152 bytes, 46436.2 KBS )

Use a full pathname or rename the file during transfer:
A:/home/n/nickb-> put local_file : hpss_file
A:/home/n/nickb-> get local_file : hpss_file
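HSI can also be driven non-interactively by passing a quoted command string on the command line, which is convenient inside scripts such as xfer-queue jobs. A minimal sketch (the directory name is illustrative):

-bash-3.2$ hsi "mkdir my_backups; cd my_backups; put myfile"
-bash-3.2$ hsi "ls -l my_backups"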

Storing and Retrieving Directories with HTAR
- HTAR stores a Unix tar-compatible bundle of files (an aggregate) in the archive
- Traverses subdirectories like tar
- No local staging space required; the aggregate is stored directly into the archive
- Recommended utility for storing small files
- Some limitations:
  - 5M member files
  - 64GB max member file size
  - 155/100 path/filename character limitation
  - Max archive file size currently 10TB (by configuration, not an HPSS limitation)

Syntax: htar [options] <archive file> <local file|dir>
Store:    -bash-3.2$ htar -cvf /home/n/nickb/mydir.tar ./mydir
List:     -bash-3.2$ htar -tvf /home/n/nickb/mydir.tar
Retrieve: -bash-3.2$ htar -xvf /home/n/nickb/mydir.tar [file…]

Avoiding Common Mistakes

Small Files
- Tape storage systems do not work well with large numbers of small files
- Tape is sequential media: tapes must be mounted in drives and positioned to specific locations for I/O to occur
- Mounting and positioning tapes are the slowest system activities
- Small file retrieval incurs delays due to the high volume of tape mounts and tape positioning
- Small files stored periodically over long periods of time can end up written to hundreds of tapes, which is especially problematic for retrieval
- Use HTAR when possible to optimize small file storage and retrieval
- Recommended file sizes are in the 10s to 100s of GB
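Before storing a directory, it is worth checking how many files it contains and how large it is, so you can decide whether to bundle it with HTAR first. A minimal sketch with standard tools (the directory name is illustrative):

-bash-3.2$ find ./mydir -type f | wc -l     # number of member files
-bash-3.2$ du -sh ./mydir                   # total size of the directory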

Large Directories
- Each HPSS system is backed by a single metadata server
- Metadata is stored in a single SQL database instance
- Every user interaction causes database activity
- Metadata-intensive operations incur delays
- Recursive operations such as "chown -R ./*" may take longer than expected
- Directories containing more than a few thousand files may become difficult to work with interactively

-bash-3.2$ time hsi -q 'ls -l /home/n/nickb/tmp/testing/80k-files/' > /dev/null 2>&1
real 20m59.374s
user 0m7.156s
sys 0m7.548s

Large Directories, continued: [Chart: hsi "ls -l" delay grows exponentially with the number of files in a directory]

Long-running Transfers
- Failure prone for a variety of reasons: transient network issues, planned/unplanned maintenance, etc.
- Many clients do not have the capability to resume interrupted transfers
- Can affect the archive's internal data management (migration) performance
- Recommend keeping transfers to 24 hours or less if possible
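One simple way to keep individual transfers well under 24 hours is to archive a large project tree one subdirectory at a time rather than in a single run. A minimal sketch using HTAR (the directory and archive names are illustrative):

# Create one HTAR aggregate per top-level subdirectory of ./myproject,
# so each transfer is smaller and easier to rerun if it fails.
for d in ./myproject/*/ ; do
    name=$(basename "$d")
    htar -cvf "myproject_${name}.tar" "$d"
done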

Hands-on Examples

Logging into archive: Hands-on
1. Using ssh, log into any NERSC compute platform
   -bash-3.2$ ssh dtn01.nersc.gov
2. Start the HPSS storage client "hsi"
   -bash-3.2$ hsi
3. Enter your NERSC password at the prompt (first time only)
   Generating .netrc entry...
   nickb@auth2.nersc.gov's password:
4. You should now be logged into your archive home directory
   Username: nickb  UID: 33065  Acct: 33065(33065)  Copies: 1  Firewall: off  [hsi.3.4.5 Wed Jul 6 16:14:55 PDT 2011][V3.4.5_2010_01_27.01]
   A:/home/n/nickb-> quit
5. Subsequent logins are now automated

Using HSI: Hands-on
1. Using ssh, log into any NERSC compute platform
   -bash-3.2$ ssh dtn01.nersc.gov
2. Create a file in your home directory
   -bash-3.2$ echo foo > abc.txt
3. Start the HPSS storage client "hsi"
   -bash-3.2$ hsi
4. Store the file in the archive
   A:/home/n/nickb-> put abc.txt
5. Retrieve the file and rename it
   A:/home/n/nickb-> get abc_1.txt : abc.txt
   A:/home/n/nickb-> quit
6. Compare the files
   -bash-3.2$ sha1sum abc.txt abc_1.txt
   f1d2d2f924e986ac86fdf7b36c94bcdf32beec15 abc.txt
   f1d2d2f924e986ac86fdf7b36c94bcdf32beec15 abc_1.txt
Note: checksums are supported in the next HSI release with 'hsi "put -c on local_file : remote_file"'

Using HTAR: Hands-on
1. Using ssh, log into any NERSC compute platform
   -bash-3.2$ ssh dtn01.nersc.gov
2. Create a subdirectory in your home directory
   -bash-3.2$ mkdir mydir
3. Create a few files in the subdirectory
   -bash-3.2$ echo foo > ./mydir/a.txt
   -bash-3.2$ echo bar > ./mydir/b.txt
4. Store the subdirectory in the archive as "mydir.tar" with HTAR
   -bash-3.2$ htar -cvf mydir.tar ./mydir
5. List the newly created aggregate in the archive
   -bash-3.2$ htar -tvf mydir.tar
6. Remove the local directory and its contents
   -bash-3.2$ rm -rf ./mydir
7. Extract the directory and files from the archive
   -bash-3.2$ htar -xvf mydir.tar

National Energy Research Scientific Computing Center
