Parallel Processing: Architecture Overview


Parallel Processing: Architecture Overview
Rajkumar Buyya
Grid Computing and Distributed Systems (GRIDS) Lab
The University of Melbourne, Melbourne, Australia
www.gridbus.org

Serial vs. Parallel
[Figure: a single queue of customers served by one counter (serial) versus the same queue feeding two counters at once (parallel).]

Overview of the Talk
- Introduction
- Why Parallel Processing?
- Parallel System H/W Architecture
- Parallel Operating Systems

Multi-Processor Computing System
[Figure: layered view of a multi-processor computing system. From top to bottom: applications, programming paradigms, threads interface, microkernel/operating system, and hardware; the underlying computing elements are processes, threads, and processors (P).]

Two Eras of Computing
[Figure: timeline from 1940 to about 2030 showing two overlapping eras, sequential and parallel. Each era advances through architectures, system software/compilers, applications, and problem-solving environments (PSEs), maturing from R&D to commercialization to commodity.]

History of Parallel Processing
The notion of parallel processing can be traced back to a tablet dated around 100 BC, which had three calculating positions capable of operating simultaneously. From this we can infer that the design was aimed at either "speed" or "reliability".

Motivating Factors
Just as we learned to fly not by constructing machines that flap their wings like birds, but by applying the principles of aerodynamics demonstrated by nature, parallel processing has been modeled on biological computation. The aggregate speed with which billions of neurons carry out complex calculations demonstrates the feasibility of parallel processing, even though an individual neuron's response time is slow (on the order of milliseconds).

Why Parallel Processing?
Computation requirements are ever increasing: visualization, distributed databases, simulations, scientific prediction (e.g., earthquakes), and so on. Silicon-based (sequential) architectures are reaching their physical processing limits, constrained by the speed of light and thermodynamics.

Human Architecture! Growth Performance
[Figure: human performance plotted against age (5 to 45 and beyond), contrasting early vertical growth with later horizontal growth.]

Computational Power Improvement
[Figure: computational power improvement (C.P.I.) plotted against the number of processors; the multiprocessor curve climbs while the uniprocessor line stays flat.]

Why Parallel Processing?
Hardware improvements such as pipelining and superscalar execution are not scaling well, and they require sophisticated compiler technology to extract performance from them. Techniques such as vector processing work well only for certain kinds of problems.

Why Parallel Processing?
Significant developments in networking technology are paving the way for cost-effective, network-based parallel computing. Parallel processing technology is mature and is being exploited commercially.

Parallel Programs
A parallel program consists of multiple active processes simultaneously solving a given problem. The communication and synchronization between these parallel processes form the core of the parallel programming effort, as the sketch below illustrates.
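As a concrete illustration, here is a minimal sketch in C using POSIX threads (assuming a UNIX-like system; compile with, e.g., gcc -pthread). Several workers update a shared counter, and a mutex provides the synchronization the slide refers to; all names are illustrative.

```c
#include <pthread.h>
#include <stdio.h>

#define NUM_WORKERS 4
#define INCREMENTS  100000

static long counter = 0;                 /* shared state (communication) */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Each worker repeatedly updates the shared counter. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&lock);       /* synchronization */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t tid[NUM_WORKERS];
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_create(&tid[i], NULL, worker, NULL);
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(tid[i], NULL);      /* wait for all workers */
    printf("counter = %ld\n", counter);  /* expect 400000 */
    return 0;
}
```

Without the mutex, the concurrent increments would race and the final count would be unpredictable, which is exactly why synchronization dominates parallel programming effort.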

Types of Parallel Systems
Tightly coupled systems:
- Shared Memory Parallel: the smallest extension to existing systems; program conversion is incremental
- Distributed Memory Parallel: completely new systems; programs must be reconstructed
Loosely coupled systems:
- Clusters: built using commodity systems; centralized management
- Grids: aggregation of distributed systems; decentralized management

Processing Elements Architecture

Processing Elements
Flynn proposed a classification of computer systems based on the number of instruction and data streams that can be processed simultaneously:
- SISD (Single Instruction, Single Data): conventional computers
- SIMD (Single Instruction, Multiple Data): data-parallel, vector computing machines
- MISD (Multiple Instruction, Single Data): systolic arrays
- MIMD (Multiple Instruction, Multiple Data): general-purpose machines

SISD: A Conventional Computer
[Figure: a single processor with one instruction stream, one data input stream, and one data output stream.]
Speed is limited by the rate at which the computer can transfer information internally. Ex: PCs, workstations.

The MISD Architecture
[Figure: one data input stream passes through processors A, B, and C, each driven by its own instruction stream, producing one output stream.]
More of an intellectual exercise than a practical configuration: a few machines have been built, but none is commercially available.

SIMD Architecture
[Figure: a single instruction stream drives processors A, B, and C in lockstep; each processor has its own data input and output stream, computing e.g. Ci <= Ai * Bi.]
Ex: Cray vector processing machines, Thinking Machines CM*, Intel MMX (multimedia support). A code sketch follows below.
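The Ci <= Ai * Bi operation above maps directly onto SIMD hardware. Below is a minimal sketch in C using x86 SSE intrinsics (assuming an SSE-capable CPU and a compiler that provides <xmmintrin.h>): a single multiply instruction operates on four data elements at once.

```c
#include <xmmintrin.h>   /* SSE intrinsics */

/* Elementwise C[i] = A[i] * B[i]; n assumed to be a multiple of 4. */
void simd_mul(const float *A, const float *B, float *C, int n)
{
    for (int i = 0; i < n; i += 4) {
        __m128 a = _mm_loadu_ps(&A[i]);          /* load 4 floats        */
        __m128 b = _mm_loadu_ps(&B[i]);
        _mm_storeu_ps(&C[i], _mm_mul_ps(a, b));  /* 1 instruction,
                                                    4 multiplies         */
    }
}
```

The single instruction stream (one mulps per iteration) applied to multiple data streams is precisely Flynn's SIMD category.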

MIMD Architecture
[Figure: processors A, B, and C, each with its own instruction stream and its own data input and output streams.]
Unlike SISD and MISD machines, a MIMD computer works asynchronously. Two variants:
- Shared memory (tightly coupled) MIMD
- Distributed memory (loosely coupled) MIMD

Shared Memory MIMD Machine
[Figure: processors A, B, and C each connect through a memory bus to a single global memory system.]
Communication: the source PE writes data to global memory and the destination PE retrieves it.
Easy to build, and conventional SISD operating systems can be ported easily.
Limitations: reliability and expandability. A failure in a memory component or any processor affects the whole system, and adding processors leads to memory contention.
Ex: Silicon Graphics supercomputers.
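As a sketch of how such shared-memory machines are commonly programmed today, here is a minimal OpenMP example in C (assuming a compiler with OpenMP support, compiled with e.g. -fopenmp). All processors read and write the same globally shared array, matching the write-to-global-memory communication model described above.

```c
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double a[N];   /* one array in global (shared) memory */
    double sum = 0.0;

    /* Each processor works on a slice of the same shared array. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = i * 0.5;

    /* Partial sums are combined through shared memory. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %f\n", sum);
    return 0;
}
```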

Distributed Memory MIMD
[Figure: systems A, B, and C, each a processor with its own local memory and memory bus, linked by IPC channels.]
Communication: inter-process communication (IPC) over a high-speed network, which can be configured as a tree, mesh, cube, etc.
Unlike shared-memory MIMD, it is readily expandable and highly reliable: a CPU failure does not affect the whole system.
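Distributed-memory MIMD machines are typically programmed with explicit message passing. Here is a minimal MPI sketch in C (assuming an MPI implementation such as MPICH or Open MPI; compile with mpicc and launch with mpirun -np 2): rank 0 sends a value across the interconnect to rank 1, since the processors share no memory.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        /* Explicit send over the IPC channel/network: no shared memory. */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}
```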

Laws of Caution...
- The speed of computation is proportional to the square root of system cost: Speed ∝ √Cost.
- The speedup achieved by a parallel computer increases as the logarithm of the number of processors: Speedup = log2 P. By this rule of thumb, 1024 processors would give only about a 10-fold speedup, since log2 1024 = 10.
[Figures: speed S plotted against cost C, and speedup plotted against the number of processors P, growing as log2 P.]

Caution...
Very fast developments in network computing and related areas have blurred concept boundaries and caused a lot of terminological confusion: concurrent computing, parallel computing, multiprocessing, supercomputing, massively parallel processing, cluster computing, distributed computing, Internet computing, grid computing, etc. At the user level, even well-defined distinctions such as shared memory versus distributed memory are disappearing due to new advances in technology. Good tools for parallel program development and debugging have yet to emerge.

Caution...
There are no strict boundaries delimiting the fields that contribute to parallel processing: computer architecture, operating systems, high-level languages, algorithms, databases, computer networks, and others all have a role to play.

Operating Systems for High Performance Computing

Types of Parallel Systems
- Shared Memory Parallel: the smallest extension to existing systems; program conversion is incremental
- Distributed Memory Parallel: completely new systems; programs must be reconstructed
- Clusters: a slower-communication form of distributed-memory system

Operating Systems for PP
MPP systems with thousands of processors require an OS radically different from current ones. Every CPU needs an OS to manage its resources and hide its details. Traditional operating systems are heavy, complex, and not suitable for MPP.

Operating System Models
An OS model is a framework that unifies the features, services, and tasks performed. Three approaches to building an OS:
- Monolithic OS
- Layered OS
- Microkernel-based (client-server) OS: suitable for MPP systems
Simplicity, flexibility, and high performance are crucial for an OS.

Monolithic Operating System
[Figure: application programs run in user mode; system services and hardware access are bundled together in kernel mode.]
Better application performance, but difficult to extend. Ex: MS-DOS.

Layered OS
[Figure: application programs run in user mode; beneath them, in kernel mode, are layers for system services, memory and I/O device management, and process scheduling, down to the hardware.]
Easier to enhance, since each layer of code accesses only the interface of the layer below, but application performance is lower. Ex: UNIX.

Traditional OS
[Figure: application programs run in user mode; the entire OS, as built by the OS designer, runs in kernel mode directly on the hardware.]

New Trend in OS Design
[Figure: application programs and OS servers both run in user mode; only a small microkernel runs in kernel mode on top of the hardware.]

Microkernel/Client-Server OS (for MPP Systems)
[Figure: in user mode, an application (with its thread library) exchanges send/reply messages, routed through the microkernel, with user-level servers such as the file server, network server, and display server; only the microkernel runs in kernel mode on the hardware.]
A tiny OS kernel provides only the basic primitives (process, memory, IPC); traditional services become user-level subsystems, while remaining competitive with monolithic systems on application performance. OS = Microkernel + User Subsystems. Ex: Mach, PARAS, Chorus, etc. A sketch follows below.
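To make the send/reply structure concrete, here is a hypothetical sketch in C of an application reading a file through a user-level file server. The msg_t format and the ipc_send primitive are invented for illustration and do not correspond to any real microkernel API; the point is that what used to be a system call becomes a message exchange routed through the microkernel.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical message format and IPC primitive -- every name below is
 * invented for illustration; real microkernels (Mach, QNX, Chorus)
 * differ in detail. */
typedef struct {
    int  op;          /* requested operation, e.g. OP_READ */
    int  handle;      /* object the operation applies to   */
    char data[64];    /* payload                           */
} msg_t;

enum { OP_READ = 1 };

/* In a real system this would trap into the microkernel, which merely
 * moves the message to the server's address space and blocks the
 * caller until the server replies. A stub stands in for that here. */
static int ipc_send(int server_port, msg_t *m)
{
    (void)server_port;
    strcpy(m->data, "file contents");  /* pretend the file server replied */
    return 0;
}

/* Client side: what used to be a read() system call becomes a
 * send/reply message exchange with a user-level file server. */
static int file_read(int file_server_port, int handle, msg_t *m)
{
    m->op = OP_READ;
    m->handle = handle;
    return ipc_send(file_server_port, m);
}

int main(void)
{
    msg_t m;
    if (file_read(/* hypothetical port */ 7, /* handle */ 3, &m) == 0)
        printf("read: %s\n", m.data);
    return 0;
}
```

The design consequence is the one the slide names: the kernel only moves messages, so the file system logic lives in an ordinary user-mode server that can be replaced or restarted without touching the kernel.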

A Few Popular Microkernel Systems
- Mach (CMU)
- PARAS (C-DAC)
- Chorus
- QNX
- (Windows)