INTRODUCTION TO HIGH PERFORMANCE COMPUTING AND TERMINOLOGY

INTRODUCTION
- Extreme Science and Engineering Discovery Environment (XSEDE)
- Described as the "most advanced, powerful, and robust collection of integrated advanced digital resources and services in the world"
- A five-year, $121 million project supported by the National Science Foundation

HIGH PERFORMANCE COMPUTING
- The practice of aggregating computing power in a way that delivers much higher performance than one could get from a typical desktop computer or workstation, in order to solve large problems in science, engineering, or business
- Practical applications:
  - Airline ticket purchasing systems
  - Finance companies calculating portfolio recommendations for clients

NODE
- A discrete unit of a computer system that runs its own instance of the operating system (e.g., a laptop is one node)
- Computing units:
  - Cores: each core can process a separate stream of instructions
  - Number of cores in a node = (number of processor chips) x (number of cores per chip); for example, two 8-core chips give 16 cores
- Note: modern nodes contain multiple processor chips that share memory and disk
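As a quick way to check the core count on a node, here is a minimal C sketch (an illustration, not taken from the original slides) that asks the operating system how many logical processors are currently online; the value it prints already reflects chips x cores per chip, plus hardware threads if enabled. _SC_NPROCESSORS_ONLN is available on Linux and most Unix-like systems.

    #include <stdio.h>
    #include <unistd.h>   /* sysconf, _SC_NPROCESSORS_ONLN */

    int main(void)
    {
        /* Logical processors currently online on this node
           (chips x cores per chip, x hardware threads if enabled). */
        long cores = sysconf(_SC_NPROCESSORS_ONLN);
        if (cores < 1) {
            perror("sysconf");
            return 1;
        }
        printf("This node exposes %ld logical cores\n", cores);
        return 0;
    }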

CLUSTER
- A collection of machines (nodes) that function in some way as a single resource (e.g., Stampede)
- Nodes of a cluster are assigned to users by a "scheduler"
- Job: an assignment of nodes to a certain user for a certain amount of time

GRID
- A software stack that facilitates sharing resources across networks and institutions
- Deals with heterogeneous clusters
- Most grids are cross-institutional groups of clusters with common software and common user authentication

Major grids in the U.S.
- OSG (Open Science Grid): institutions organize themselves into virtual organizations (VOs) with similar computing interests, and all install the OSG software
- XSEDE: similar to OSG, but usage is limited to a dedicated high-performance network

PARALLEL CODE
- A single thread of execution operating on multiple data items simultaneously
- Multiple threads of execution within a single process (see the sketch below)
- Multiple executables working on the same problem
- Any combination of the above
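To make the second item concrete, here is a minimal POSIX-threads sketch in C of multiple threads of execution within a single process; the thread count of 4 and the trivial work function are arbitrary choices for the example. Compile with -pthread.

    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4   /* arbitrary thread count for this sketch */

    /* Each thread runs this function on its own share of the work. */
    static void *work(void *arg)
    {
        long id = (long)arg;
        printf("thread %ld doing its share of the work\n", id);
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[NTHREADS];

        /* One process, several threads of execution. */
        for (long i = 0; i < NTHREADS; i++)
            pthread_create(&threads[i], NULL, work, (void *)i);

        for (long i = 0; i < NTHREADS; i++)
            pthread_join(threads[i], NULL);   /* wait for every thread */

        return 0;
    }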

TAXONOMY OF PARALLEL COMPUTERS

SHARED MEMORY
- Multiple cores have access to the same physical memory
- The cores may be part of multicore processor chips, or they may be on discrete chips
- Symmetric multiprocessor (SMP): access to any memory location is equally fast from all cores, also known as uniform memory access (UMA)
- Non-uniform memory access (NUMA): multiple chips are involved and memory access is not necessarily uniform
- Programs for shared-memory computers typically use multiple threads within the same process (see the OpenMP sketch below)
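A minimal OpenMP sketch in C of this shared-memory style (illustrative only, not code from the original slides): all threads belong to one process and read and write the same array, and the loop iterations are split among them by the runtime. The array size is an arbitrary choice; compile with -fopenmp.

    #include <stdio.h>
    #include <omp.h>

    #define N 1000000   /* arbitrary problem size for this sketch */

    int main(void)
    {
        static double a[N];
        double sum = 0.0;

        /* All threads share the array 'a'; iterations of the loop are
           divided among the threads, and the partial sums are combined
           by the reduction clause. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < N; i++) {
            a[i] = 0.5 * i;
            sum += a[i];
        }

        printf("sum = %f using up to %d threads\n",
               sum, omp_get_max_threads());
        return 0;
    }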

DISTRIBUTED MEMORY
- In a distributed-memory system, memory is associated with individual processors, and a processor is only able to address its own memory
- In a distributed-memory program, each task has a different virtual address space
- Programming with distributed arrays is called data-parallel programming, because each task works on a different section of an array of data (see the decomposition sketch below)
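To make "each task works on a different section of an array" concrete, here is a small C sketch of a block decomposition. In a real distributed-memory program the rank and task count would come from the runtime (e.g., MPI); here they are hard-coded illustrative values, as is the array length.

    #include <stdio.h>

    /* Which slice of a global array of n elements does task 'rank' own
       when the array is split into blocks among 'size' tasks?          */
    static void block_range(long n, int rank, int size, long *lo, long *hi)
    {
        long base = n / size;         /* minimum elements per task      */
        long rem  = n % size;         /* first 'rem' tasks get one more */
        *lo = rank * base + (rank < rem ? rank : rem);
        *hi = *lo + base + (rank < rem ? 1 : 0);   /* half-open [lo, hi) */
    }

    int main(void)
    {
        long n = 10;      /* illustrative global array length */
        int size = 4;     /* pretend there are 4 tasks        */

        for (int rank = 0; rank < size; rank++) {
            long lo, hi;
            block_range(n, rank, size, &lo, &hi);
            printf("task %d owns elements [%ld, %ld)\n", rank, lo, hi);
        }
        return 0;
    }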

DISTRIBUTED MEMORY

MESSAGE PASSING INTERFACE (MPI)
- In distributed-memory programming, the typical way for tasks to communicate and coordinate their activities is through message passing
- The Message Passing Interface (MPI) is a communication system designed as a standard for distributed-memory parallel programming
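A minimal MPI sketch in C of this pattern (an illustration, not code from the original slides): each task sums its own section of the data in its private address space, and the partial results are combined with the message-passing collective MPI_Reduce. The 100-elements-per-task figure is arbitrary. Compile with mpicc and launch with mpirun or mpiexec.

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this task's id        */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of tasks */

        /* Each task sums its own 100-element section of a global array
           (here the "data" are simply the global indices themselves). */
        const int n_local = 100;
        double local_sum = 0.0;
        for (int i = 0; i < n_local; i++)
            local_sum += (double)(rank * n_local + i);

        /* Combine the partial sums on rank 0 via message passing. */
        double global_sum = 0.0;
        MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE,
                   MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("global sum over %d tasks = %f\n", size, global_sum);

        MPI_Finalize();
        return 0;
    }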

MESSAGE-PASSING VS SHARED MEMORY
- The message-passing programming model
  - Widely used because of its portability
  - Some applications are too complex to code in this model while also balancing computation load and avoiding redundant computations
- The shared-memory programming model
  - Simplifies coding
  - Not portable, and often provides little control over interprocessor data-transfer costs

PERFORMANCE EVALUATION
- Parallel efficiency measures how effectively you are making use of your multiple processors
- 100% efficiency would mean that you are getting a factor-of-p speedup from using p processors
- Efficiency is defined in terms of speedup per processor (written out below)
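The definitions in LaTeX, where T(p) denotes the wall-clock run time on p processors (the numbers in the comment are an illustrative example, not results from the slides):

    % Speedup and parallel efficiency on p processors,
    % where T(p) is the wall-clock run time on p processors.
    \[
      S(p) = \frac{T(1)}{T(p)}, \qquad
      E(p) = \frac{S(p)}{p} = \frac{T(1)}{p\,T(p)}
    \]
    % Illustrative example: T(1) = 100 s and T(4) = 30 s give
    % S(4) = 100/30 \approx 3.3 and E(4) \approx 0.83, i.e. 83% efficiency.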

PERFECT SPEEDUP

- Parallel applications tend not to exhibit perfect speedup because:
  - There are inherently serial parts of a calculation
  - Parallelization introduces overhead into the calculation to copy and transfer data
  - The individual tasks compete for resources such as memory or disk
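The first of these limits, an inherently serial fraction of the work, is commonly quantified by Amdahl's law; the slide does not name it, so the bound below is offered only as a supporting formula, with f and the 5% figure as illustrative values.

    % Amdahl's law: if a fraction f of the work is inherently serial,
    % the speedup on p processors is bounded by
    \[
      S(p) \le \frac{1}{f + (1 - f)/p} \;\le\; \frac{1}{f}
    \]
    % e.g. with f = 0.05 (5% serial work), no number of processors
    % can deliver more than a 20x speedup.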

REFERENCES
- Introduction to Parallel Computing
- Parallel Programming Concepts and High-Performance Computing