Introduction to HPC resources for BCB 660 Nirav Merchant

Topic Coverage
• What is parallel computing?
• General overview of HPC systems
• Overview of batch systems (and why we need them)
• Getting started with Ranger
• Understanding the default user environment
• Introduction to modules (and why we need them)
• Submitting your first job (and monitoring it)
• Moving your data in and out of HPC systems
• Q/A

What is computing?
• von Neumann architecture
  • Named after the Hungarian mathematician John von Neumann, who first authored the general requirements for an electronic computer in his 1945 papers.
  • Since then, virtually all computers have followed this basic design of:
    • Memory (RAM)
    • Control Unit (part of the CPU)
    • Arithmetic Logic Unit (ALU, also part of the CPU)
    • Input/Output (e.g. keyboard)

What does it look like (your computer) ? Image courtesy Univ. of Washington

What is Parallel Computing?
Parallel computing: use of multiple processors or computers working together on a common task.
• Each processor works on part of the problem
• Processors can exchange information
A good introduction to concepts for parallel programming is at:

Why we need it
• Traditional software is written to execute serially, i.e. one task at a time running on one CPU
• As the size of the data (tasks) increases, we need to utilize multiple CPUs
• The size of the data also has implications for how much RAM and disk space the task requires (we may need more RAM or disk than fits on one computer)

HPC systems: Not very different Image courtesy TACC at Univ of Texas

Some Terminology (Jargon) of HPC
• HPC: High Performance Computing = supercomputing
• Node: one self-contained computer (many of which are connected together to form a “cluster”)
• CPU = socket = processor, each containing one or more cores
• Interconnect: networking between nodes (can be fiber optic, or regular Ethernet like your computers), e.g. InfiniBand or GigE

More Terminology (Jargon) of HPC
• Scalability: ability to use additional resources to execute tasks faster
• Embarrassingly parallel: data-parallel tasks where each task is independent and not much communication or coordination is required among tasks
• Observed speedup: “wall time” taken for the serial task divided by the wall time for the parallel task
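The observed speedup defined above can be worked through as a small calculation. The following is a minimal Python sketch: the function names and the timings are hypothetical illustrations, not measurements from any TACC system, and the efficiency figure (speedup divided by the number of cores) is a common companion metric that the slide itself does not mention.

```python
def observed_speedup(serial_wall_s: float, parallel_wall_s: float) -> float:
    """Wall time of the serial run divided by wall time of the parallel run."""
    if serial_wall_s <= 0 or parallel_wall_s <= 0:
        raise ValueError("wall times must be positive")
    return serial_wall_s / parallel_wall_s


def parallel_efficiency(speedup: float, cores: int) -> float:
    """Speedup per core; 1.0 would mean perfect scaling (not from the slide)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return speedup / cores


if __name__ == "__main__":
    # Hypothetical timings: 3600 s for the serial run, 300 s on 16 cores.
    speedup = observed_speedup(3600.0, 300.0)
    efficiency = parallel_efficiency(speedup, 16)
    print(f"speedup: {speedup:.1f}x, efficiency: {efficiency:.2f}")
```

An embarrassingly parallel task would be expected to keep the efficiency close to 1.0 as cores are added; a task with heavy communication typically would not.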

Types of HPC
• Shared memory
  • All CPUs (processors) have access to shared RAM
• Distributed memory
  • Each CPU (processor) has its own local memory, but can be connected to other nodes via a fast interconnect

Again, why do we need it?
• Limits of single-CPU computing
  • Performance
  • Available memory (disk and RAM)
• Parallel computing allows one to:
  • Execute tasks that don’t fit on a single CPU
  • Complete tasks in a reasonable time
• Again, please check the introduction referenced earlier for basic parallel computing concepts

RANGER
• Compute power
  • 504 teraflops
  • 3,936 four-socket nodes
  • 62,976 cores, 2.0 GHz AMD Opteron
• Memory
  • 125 terabytes
  • 2 GB/core, 32 GB/node
• Disk subsystem
  • 1.7 PB storage (Lustre parallel file system)
  • 1 PB in /work filesystem
• Interconnect
  • 8 Gb/s InfiniBand
• Lonestar and other machines have similar (or much larger) specs
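The RANGER figures above are consistent with one another, which can be checked with a little arithmetic. This is a rough Python sketch using only the numbers on the slide, plus one assumption that is not stated there: that each core completes 4 floating-point operations per clock cycle. Under that assumption the peak rate comes out close to the quoted 504 teraflops.

```python
# Figures taken from the slide.
NODES = 3936
MEM_PER_CORE_GB = 2
MEM_PER_NODE_GB = 32
CLOCK_HZ = 2.0e9

# Assumption (not on the slide): floating-point operations per core per cycle.
FLOPS_PER_CYCLE = 4

# 32 GB per node at 2 GB per core implies 16 cores per node.
cores_per_node = MEM_PER_NODE_GB // MEM_PER_CORE_GB

# Total cores across the machine.
total_cores = NODES * cores_per_node

# Total memory in decimal terabytes (1 TB = 1000 GB).
total_memory_tb = total_cores * MEM_PER_CORE_GB / 1000

# Theoretical peak rate in teraflops under the assumption above.
peak_teraflops = total_cores * CLOCK_HZ * FLOPS_PER_CYCLE / 1e12

if __name__ == "__main__":
    print(f"cores per node: {cores_per_node}")
    print(f"total cores:    {total_cores}")
    print(f"total memory:   {total_memory_tb:.1f} TB")
    print(f"peak rate:      {peak_teraflops:.1f} teraflops")
```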

Filesystem Access
• HOME
  • Store your source code and build your executables here
  • Use $HOME to reference your home directory in scripts
• WORK
  • Store large files here
  • This file system is NOT backed up; use $ARCHIVE for important files!
  • Use $WORK to reference this directory in scripts
• SCRATCH
  • Store large input or output files here – TEMPORARILY
  • This file system is NOT backed up; use $ARCHIVE for important files!
  • Use $SCRATCH to reference this directory in scripts
• ARCHIVE
  • Massive, long-term storage and archive system
  • Check with staff before using this on your account
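Referencing these locations through their environment variables, rather than hard-coding paths, keeps scripts portable between accounts. Below is a minimal Python sketch of that idea. The function name `storage_dir` is hypothetical, and the fallback to the system temporary directory when a variable is unset (for example on a laptop) is an assumption made so the sketch runs anywhere; it is not TACC behavior.

```python
import os
import tempfile

# Variable names from the slide; ARCHIVE is left out because the slide
# says to check with staff before using it.
KNOWN_AREAS = ("HOME", "WORK", "SCRATCH")


def storage_dir(area: str) -> str:
    """Return the directory named by $HOME, $WORK or $SCRATCH.

    Falls back to the system temp directory if the variable is not set
    (an assumption for illustration only).
    """
    if area not in KNOWN_AREAS:
        raise ValueError(f"unknown storage area: {area!r}")
    return os.environ.get(area, tempfile.gettempdir())


if __name__ == "__main__":
    # Large temporary outputs go to SCRATCH, source code stays in HOME.
    print("outputs:", os.path.join(storage_dir("SCRATCH"), "run_001"))
    print("sources:", os.path.join(storage_dir("HOME"), "src"))
```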

Limits on your filesystem

How is it connected

MUST READ THIS
• Please visit the TACC new user guide for RANGER
• You will pick up many hints that will make your life MUCH easier when running tasks on TACC resources
• guides/ranger-user-guide
• (same as above)

Batch, Module system
• With multiple users we need a way to organize tasks
• We need a way to assign suitable resources to the tasks (track, prioritize)
• With multiple software packages we need a way to deal with version and dependency conflicts per task
• The batch scheduler used on all TACC systems is SGE (Sun Grid Engine), now owned by Oracle.

Batch submission

RANGER: Queue Options

Common SGE commands

Let's get working: ssh

Module Commands

Compbio stack/modules

But my favorite app is …
• Modules are for global use; it is hard to get cutting-edge code as modules (limited staff time)
• You can always compile and use your own versions without waiting for a module to be built
• When possible, build your applications from source rather than running pre-compiled binaries
• If you choose to use “make install”, you will need to pass an option to the “configure” script to change where it is installed: ./configure --prefix=$HOME/bin
• For best performance, use the Intel compilers
• For best compatibility, use the gcc compilers
• More in the “bleeding edge s/w” slide

Preparing for tasks
• The number of cores and nodes to use is set with: #$ -pe Nway 16*M
• N represents the number of cores to utilize per node (Ranger: 1≤N≤16, Lonestar: 1≤N≤12)
• M is the number of nodes to utilize
• The TOTAL number of cores used is thus: N*M
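The relationship between N, M and the directive can be made concrete with a short helper. This is a minimal Python sketch that only formats the `#$ -pe Nway 16*M` line described above; the function names are hypothetical. The default of 16 cores per node matches the Ranger limit on the slide. Passing 12 for Lonestar is an assumption by analogy, since the slide gives only the Lonestar range for N.

```python
def pe_directive(tasks_per_node: int, nodes: int, cores_per_node: int = 16) -> str:
    """Format the SGE parallel-environment line '#$ -pe Nway 16*M'.

    tasks_per_node is N, nodes is M. The second number on the line is
    cores_per_node * M, i.e. whole nodes are requested.
    """
    if not 1 <= tasks_per_node <= cores_per_node:
        raise ValueError(f"N must be between 1 and {cores_per_node}")
    if nodes < 1:
        raise ValueError("M must be at least 1")
    return f"#$ -pe {tasks_per_node}way {cores_per_node * nodes}"


def total_cores_used(tasks_per_node: int, nodes: int) -> int:
    """Total cores actually used: N * M."""
    return tasks_per_node * nodes


if __name__ == "__main__":
    # Hypothetical request: 8 cores on each of 4 Ranger nodes.
    print(pe_directive(8, 4))
    print("cores used:", total_cores_used(8, 4))
```

Choosing N below the per-node maximum leaves cores idle but gives each task a larger share of the node's memory, which can help memory-hungry jobs.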

Preparing a job submission

Some more SGE options

Working with bleeding edge s/w
• /wiki
• (same URL as above, just short)
• Let's look at the tutorial section towards the end of the page

More from that page

Getting data in and out
• SCP will work well for most smaller files
• Specialized options (bbcp and gridftp) need special end point installation
• As files get larger (10 GB+), moving them around gets time-consuming
• It is easier to move your data into the iPlant data store from your desktop/server (parallel transfers)
• Pull that data where you need it (and push more into it)
• Command line and GUI options (including dropbox for science)

iPlant data store
• Details at:
• Connecting from RANGER
  • module load irods
  • iinit
  • Answer the prompts using info from the above link
• You are now connected (without future need of passwords to the iPlant data store)

From RANGER, after loading the irods module (i.e. module load irods)

Parametric Launcher
• You have many tasks that you want to run and they are naturally parallel (“embarrassingly parallel”)
• Parametric Job Launcher: a simple utility for submitting multiple serial applications simultaneously
• % module load launcher
• 2 key components:
  • paramlist: execution commands
  • launcher.sge: job submission script
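A paramlist for many similar tasks is tedious to write by hand, so it can be generated. The sketch below is a hedged illustration in Python. It assumes the paramlist file holds one shell command per line, each run as an independent serial task. The tool name `./my_tool`, the input file names and both function names are hypothetical.

```python
from pathlib import Path
from typing import List, Sequence


def build_paramlist(command_template: str, inputs: Sequence[str]) -> List[str]:
    """Return one command line per input by filling in the {input} placeholder."""
    return [command_template.format(input=name) for name in inputs]


def write_paramlist(lines: Sequence[str], path: str = "paramlist") -> Path:
    """Write the commands to a file, one per line, and return its path."""
    out = Path(path)
    out.write_text("\n".join(lines) + "\n")
    return out


if __name__ == "__main__":
    # Hypothetical tool and inputs; each line is an independent serial task.
    commands = build_paramlist(
        "./my_tool --in {input} > {input}.out",
        ["sample_a.fa", "sample_b.fa", "sample_c.fa"],
    )
    for line in commands:
        print(line)
```

Because each line is independent, the number of lines is also the number of tasks, which is useful when deciding how many cores to request in the job submission script.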

Parametric Launcher
• Check /wiki/TACC_NGS_Course_Practical_1
• Look at the shrimp_launcher.sge for ideas

Gratitude
• TACC staff for slides
  • Matt Vaughn
  • Michael Gonzalez
  • And many more
• URL: guides/