Katie Antypas NERSC User Services Lawrence Berkeley National Lab 10 February 2012 JGI Compute User Training.


Today we want to share plans, introduce new services, test workflows, answer questions, and hear your feedback: new file systems and data management; beta web documentation; JGI User Survey results; fair-share batch systems. (Slide diagram: the Crius, Rhea, Theia, Kronos, Hyperion, Oceanus, Iapetus, and Themis clusters.)

Breakdown of NERSC users' science areas: NERSC serves over 4,000 users and 500 distinct projects spanning an array of science areas.

JGI users are similar to traditional NERSC users in their need for stable, reliable systems; large-scale data management and storage; fast queue turnaround; and access to millions of compute hours. JGI users also have special workflow, throughput, and software needs.

Where we have come from: one-on-one collaborations; MOU reached for NERSC to support JGI computational and IT systems; file system stabilization; cluster consolidation. (Timeline on slide: 2009, Spring 2010, Fall 2011, May to present; diagram shows the Crius, Rhea, Theia, Kronos, Hyperion, Oceanus, Iapetus, and Themis clusters.)

Merge JGI systems into Crius. (Diagram: the Crius, Rhea, Theia, Kronos, Hyperion, Oceanus, Iapetus, and Themis clusters.)

Next: move Crius to NERSC space. But before we do this, we want to make sure all pipelines and workflows are tested so that we cause minimal disruption to users. Primary benefit: access to the new 2 PB file system.

JGI sys-ops members have been incorporated into NERSC groups: Ilya Malinov, Jeremy Brand, Matt Dunford, Brian Yumae, Ravi Cheema, Patrick Hajek, and Fred Loebl. Networking, Security, Servers: Brent Draney. Computational Systems Group: Jay Srinivasan. Storage Systems Group: Jason Hick. Continue to contact JGI sys-ops for day-to-day problems.

The IT Steering Committee makes policy decisions regarding the cluster. JGI: Alex Copeland, Daniel Rokhsar, Harris Shapiro, Henrik Nordberg, Igor Grigoriev, James Bristow, Kostas Mavrommatis, Len Pennacchio, Nikolaos Kyrpidis, Ray Turner, Rob Egan, Victor Markowitz. NERSC: Brent Draney, Jason Hick, Jay Srinivasan, Jeff Broughton, Katie Antypas, Shane Canon. Contact your representative on the steering committee if you have concerns.

The NERSC consultants serve as user advocates: Katie Antypas (group leader); Woo-Sun Yang (tools/math libraries); Richard Gerber (astro/web services); Helen He (climate); Dave Turner (everything); Zhengji Zhao (materials science/chemistry); Mike Stewart (compilers); Harvey Wasserman (chemistry); Yushu Yao (data analytics); Jack Deslippe (materials science/chemistry); Eric Hjort (high energy physics); bioinformatics consultant (position open).

Logging into Phoebe

When you log in to Phoebe you will be in your "global home" directory. The full UNIX path is stored in the environment variable $HOME. Your $HOME quota is 40 GB and 1,000,000 inodes. We realize most users have a different home directory in /house; reference your old home directory as $OLD_HOME if you need it. We use $HOME to initialize your environment, so do not redefine it. Note that /house is available on Phoebe, but the NetApp file systems are not.
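A quick way to keep an eye on the 40 GB home quota is a small helper around `du`. This is only a sketch: the quota figure comes from the slide above, but the function name, the 90% warning threshold, and the example numbers are all illustrative.

```shell
# Sketch: compare home-directory usage (in KB) against the 40 GB quota.
# quota_check, the 90% threshold, and the sample values are illustrative.
quota_check() {
    local used_kb=$1
    local quota_kb=$((40 * 1024 * 1024))        # 40 GB expressed in KB
    if [ "$used_kb" -gt $((quota_kb * 90 / 100)) ]; then
        echo "over"                             # above 90% of quota: time to clean up
    else
        echo "ok"
    fi
}
# Real usage would be fed in like: quota_check "$(du -sk "$HOME" | cut -f1)"
quota_check 1000000      # roughly 1 GB used
quota_check 41000000     # roughly 39 GB used, past the 90% mark
```

The same pattern works for the inode limit by swapping `du -sk` for a `find "$HOME" | wc -l` count.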

Your default shell on Phoebe is bash. NERSC sets .bashrc for all users as a read-only file; do not change your .bashrc file! NERSC uses the .bashrc file to make global configuration changes for all users. Put your own customizations in .bashrc.ext. Want to change your shell? Just let us know.
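As a concrete illustration, customizations that would normally go in .bashrc belong in ~/.bashrc.ext instead. The specific exports and alias below are made-up examples, not required settings; the scratch path follows the projectb layout described later in this training.

```shell
# Illustrative ~/.bashrc.ext contents -- personal customizations live here,
# never in the NERSC-managed, read-only ~/.bashrc itself.
export PATH="$HOME/bin:$PATH"                   # prepend a personal bin directory
export SCRATCH_RUN="/projectb/scratch/$USER"    # convenience variable for your scratch area
alias ll='ls -lh'                               # a personal alias
echo "$SCRATCH_RUN"
```

Because .bashrc sources .bashrc.ext at login, these settings take effect in every new shell without touching the shared file.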

When you log in, the environment is pre-configured for you: the /jgi/tools bin and lib directories are in your path, and the batch system environment is set up.

Log in to Phoebe with ssh.

Jason Hick, Storage Systems Group, Lawrence Berkeley National Lab, 10 February 2012: A new 2 PB GPFS file system for the JGI, "projectb".

The new 2 PB "projectb" file system is available on Phoebe now. Some high-level specs for users: 2 PB capacity; XXX.

File systems best practices: unfortunately, disk is still expensive, and all of the JGI's data cannot be stored on disk within the current budget. Archive and delete data you no longer need. Disk usage will be controlled through quotas in some cases and purging in others.
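The "archive and delete" advice is safest as a two-step habit: verify the archive is readable before removing anything. The sketch below uses a local tarball purely for illustration; on the real systems the archive step would write to the tape archive instead, and the directory and file names here are made up.

```shell
# Sketch of an archive-then-delete step. tar is used locally for illustration;
# in practice the archive would go to tape storage rather than a local tarball.
archive_and_remove() {
    local dir=$1 tarball=$2
    tar -cf "$tarball" "$dir"       || return 1
    tar -tf "$tarball" > /dev/null  || return 1   # verify the archive lists cleanly
    rm -rf "$dir"                                 # only delete after verification
    echo "archived $dir"
}
mkdir -p finished_run && echo "reads" > finished_run/assembly.out
archive_and_remove finished_run finished_run.tar
```

The key design point is ordering: the delete only happens after the verification step succeeds, so an interrupted or corrupt archive never costs you the original data.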

There are two areas of storage within the layout of the "projectb" file system: /projectb/projectdirs/ holds group directories (PI, RD, fungal, metagenome, micro, plant), which are not purged but are subject to quota; /projectb/scratch/ holds user directories, which are purged.

It is important for every group to come up with a data retention policy: How long should we keep the raw data? Can the data be deleted, or should it be archived? Can we set up an automated way to archive and delete data?
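The "automated way" question above often comes down to a scheduled sweep that finds files past a retention cutoff. A minimal sketch, assuming a 90-day policy: the cutoff, paths, and file names are all illustrative, and the `touch -d` lines (GNU touch) merely fabricate an old and a new file for demonstration.

```shell
# Sketch of an automated retention sweep: list files older than a cutoff
# so they can be archived and then deleted. 90 days and all paths are illustrative.
retention_candidates() {
    find "$1" -type f -mtime +"$2"
}
mkdir -p raw_data
touch -d "200 days ago" raw_data/old.fastq   # fabricate stale raw data (GNU touch)
touch raw_data/new.fastq                     # freshly produced data
retention_candidates raw_data 90             # lists only the stale file
```

Run from cron, the output of such a sweep would feed the archive-and-delete step rather than deleting directly, so the policy stays reviewable.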