Presentation On Parallel Computing  By Abdul Mobin  KSU ID: 810784768

Contents.  Introduction to Parallel Computing.  Introduction to Parallel Computers.  Approaches to Parallel Computing.  Approaches Based on Computation.  Types of Parallelism.  Classes of Parallel Computing.  Uses of Parallel Computing.  Applications of Parallel Computing.  References.  Queries.

Introduction  Parallel computing is a form of computation in which many instructions are carried out simultaneously operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently (in parallel). With the increased use of computers in every sphere of human activity, computer scientists are faced with two crucial issues today.  With the increased use of computers in every sphere of human activity, computer scientists are faced with two crucial issues today. 1.Processing has to be done faster like never before. 2. Larger or complex computation problems need to be solved.  Power consumption has been a major issue recently, as it causes a problem of processor heating.  The perfect solution is parallelism

Example Diagram

Contd..  To be solved in parallel, a computational problem should: Be able to be broken apart into discrete pieces of work that can be solved simultaneously.  Allow multiple program instructions to execute at any moment in time.  Be solvable in less time with multiple compute resources than with a single compute resource.  The compute resources are typically: A single computer with multiple processors/cores.  An arbitrary number of such computers connected by a network.
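As an illustration of these points, here is a minimal, assumed Python sketch (not part of the original slides) that breaks one problem, summing a large list, into discrete pieces that are solved simultaneously by multiple worker processes; the helper name partial_sum and the choice of four workers are assumptions made for the example.

from multiprocessing import Pool

def partial_sum(chunk):
    # Each worker solves one discrete piece of the problem independently.
    return sum(chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    # Split the work into four discrete pieces, one per worker process.
    chunks = [data[i::4] for i in range(4)]
    with Pool(processes=4) as pool:
        partials = pool.map(partial_sum, chunks)   # pieces solved concurrently
    print(sum(partials))                           # combine the partial results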

Parallel Computers:  Virtually all stand-alone computers today are parallel from a hardware perspective:  Multiple functional units (L1 cache, L2 cache, branch, prefetch, decode, floating-point, graphics processing (GPU), integer, etc.)  Multiple execution units/cores.  Multiple hardware threads.

Example Figure IBM BG/Q Compute Chip with 18 cores (PU) and 16 L2 Cache units (L2).

Example of Network Networks connect multiple stand-alone computers (nodes) to make larger parallel computer clusters.

Approaches To Parallel Computing  Flynn's Taxonomy:  SISD (Single Instruction, Single Data).  SIMD (Single Instruction, Multiple Data).  MISD (Multiple Instruction, Single Data).  MIMD (Multiple Instruction, Multiple Data).

1. Single Instruction, Single Data (SISD):  A sequential computer that exploits no parallelism in either the instruction or the data stream.  A single Control Unit (CU) fetches a single Instruction Stream (IS) from memory.  The CU then generates the appropriate control signals to direct a single Processing Element (PE) to operate on a single Data Stream (DS), i.e. one operation at a time.  Examples: uniprocessor machines such as traditional PCs and old mainframes. 2. Single Instruction, Multiple Data (SIMD):  A computer that applies a single instruction stream to multiple data streams, performing operations that can be naturally parallelized.  Examples: an array processor, a GPU (Graphics Processing Unit).
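As a rough, assumed illustration of the SIMD idea (not from the slides): NumPy's vectorized operations apply one logical instruction to many data elements at once, and on most CPUs they are backed by SIMD instructions where available.

import numpy as np

a = np.arange(8, dtype=np.int32)      # eight data elements: 0..7
b = np.full(8, 10, dtype=np.int32)    # eight data elements, all 10
c = a + b                             # one logical add applied to all eight elements
print(c)                              # [10 11 12 13 14 15 16 17]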

3. Multiple Instruction, Single Data (MISD):  Multiple instructions operate on a single data stream. This is an uncommon architecture, generally used for fault tolerance: heterogeneous systems operate on the same data stream and must agree on the result.  Example: the Space Shuttle flight control computer. 4. Multiple Instruction, Multiple Data (MIMD):  Multiple autonomous processors simultaneously execute different instructions on different data.  MIMD systems may use either a single shared memory space or a distributed memory space.  Example: a multi-core superscalar processor is an MIMD processor.

Pictorial Representation:

Approaches Based On Computation  Massively Parallel  Embarrassingly Parallel  Grand Challenge Problems

Massively Parallel Systems:  Massive parallelism signifies the presence of many independent units, or entire microprocessors, that run in parallel.  The term "massive" connotes hundreds, if not thousands, of such units.  The figure below shows the supercomputer named "The Earth Simulator".

Embarrassingly Parallel Systems  An embarrassingly parallel system is one for which no particular effort is needed to segment the problem into a very large number of parallel tasks.  Examples include browsing two websites simultaneously, or running two independent applications on a home computer.  Such problems lie at one end of the spectrum of parallelization, where tasks can be readily parallelized.  There is little or no communication between processes: each process can do its work without any interaction with the other processes.
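A minimal sketch (assumed, in Python) of an embarrassingly parallel workload: each task is independent, so the inputs can simply be mapped over a pool of worker processes with no inter-process communication. The function render_frame is a hypothetical stand-in for any independent unit of work.

from multiprocessing import Pool

def render_frame(frame_number):
    # Hypothetical independent task, e.g. rendering one frame of an animation.
    return frame_number * frame_number

if __name__ == "__main__":
    with Pool() as pool:
        results = pool.map(render_frame, range(100))  # no communication between tasks
    print(results[:5])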

Grand Challenge Problems  A grand challenge is a fundamental problem in science or engineering, with broad applications, whose solution would be enabled by the application of high performance computing resources that could become available in the near future.  "Grand Challenges" were United States policy goals set in the late 1980s for funding high-performance computing and communications research, in part in response to the Japanese Fifth Generation (or Next Generation) ten-year project.

Types Of Parallelism  Bit-Level  Instruction-Level  Data-Level  Task-Level

Bit-Level Parallelism  Bit-level parallelism is a form of parallel computing based on increasing the processor word size.  From the advent of very-large-scale integration (VLSI) chip fabrication technology in the 1970s until about 1986, advances in computer architecture came largely from increasing bit-level parallelism.  Increasing the word size reduces the number of instructions the processor must execute to operate on variables whose sizes exceed the word length.  For example, when an 8-bit processor needs to add two 16-bit integers, it must do so in two steps:  it first adds the 8 lower-order bits of each integer using the standard addition instruction,  and then adds the 8 higher-order bits using an add-with-carry instruction together with the carry bit from the lower-order addition.
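The two-step addition can be sketched in Python (an assumed simulation, not real 8-bit hardware): the low bytes are added first, and the high bytes are then added together with the carry from the low-byte addition.

def add16_on_8bit(x, y):
    lo = (x & 0xFF) + (y & 0xFF)          # step 1: standard 8-bit addition of low bytes
    carry = lo >> 8                       # carry out of the low-byte addition
    hi = (x >> 8) + (y >> 8) + carry      # step 2: add-with-carry on the high bytes
    return ((hi & 0xFF) << 8) | (lo & 0xFF)

print(hex(add16_on_8bit(0x12FF, 0x0001)))   # 0x1300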

Instruction-Level Parallelism  Instruction-level parallelism (ILP) is a measure of how many operations in a computer program can be performed simultaneously.  The potential overlap among instructions is called instruction-level parallelism.  There are two approaches to exploiting it: in hardware and in software.  The hardware level exploits dynamic parallelism, whereas the software level exploits static parallelism.  For example, the Pentium processor exploits parallelism dynamically at run time, while the Itanium processor relies on static, compiler-determined parallelism.  The instructions given to a computer can be grouped, or re-ordered, and then processed in parallel without changing the final result; this is instruction-level parallelism.

An Example 1. e = a + b 2. f = c + d 3. g = e * f  Here, instruction 3 depends on instructions 1 and 2.  However, instructions 1 and 2 can be processed independently.  If we assume that each operation completes in one unit of time, these three instructions can be completed in a total of two units of time, giving an ILP of 3/2.

Data-Level Parallelism  Data parallelism is a form of parallelization of computation across multiple processors in parallel computing environments.  It focuses on distributing the data across different parallel computing nodes, and contrasts with task parallelism, another form of parallelism.  Data parallelism emphasizes the distributed (parallelized) nature of the data, as opposed to the processing (task parallelism). Most real programs fall somewhere on a continuum between the two.  Example: consider a 2-processor system (CPUs A and B) in a parallel environment, where we wish to perform a task on some data d. We can tell CPU A to perform the task on one part of d and CPU B on another part simultaneously, thereby reducing the execution time.  The data can be assigned using conditional statements. As a specific example, consider adding two matrices: in a data-parallel implementation, CPU A could add all elements in the top half of the matrices while CPU B adds all elements in the bottom half.  Since the two processors work in parallel, the matrix addition takes roughly half the time of performing the same operation in serial on a single CPU.
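A hedged Python sketch of the matrix-addition example (the split into a top and bottom half, the use of the multiprocessing module and the helper add_rows are all assumptions made for illustration):

from multiprocessing import Pool
import numpy as np

A = np.arange(16).reshape(4, 4)        # first matrix
B = np.ones((4, 4), dtype=int)         # second matrix

def add_rows(bounds):
    lo, hi = bounds
    return A[lo:hi] + B[lo:hi]         # each CPU adds only its own rows

if __name__ == "__main__":
    with Pool(processes=2) as pool:
        # CPU "A" handles the top half, CPU "B" the bottom half, in parallel.
        top, bottom = pool.map(add_rows, [(0, 2), (2, 4)])
    C = np.vstack([top, bottom])
    print(np.array_equal(C, A + B))    # True: same result as the serial addition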

Task Parallelism  Task parallelism (also known as function parallelism or control parallelism) is a form of parallelization of computer code across multiple processors in parallel computing environments.  Task parallelism focuses on distributing execution processes (threads) across different parallel computing nodes. It contrasts with data parallelism, another form of parallelism.  In a multiprocessor system, task parallelism is achieved when each processor executes a different thread (or process) on the same or different data.  The threads may execute the same or different code, and in the general case different execution threads communicate with one another as they work.  Communication usually takes place by passing data from one thread to the next as part of a workflow.

An Example  As a simple example, if we are running code on a 2-processor system (CPUs "a" & "b") in a parallel environment and we wish to do tasks "A" and "B", it is possible to tell CPU "a" to do task "A" and CPU "b" to do task 'B" simultaneously, thereby reducing the run time of the execution.  Task parallelism emphasizes the distributed (parallelized) nature of the processing (i.e. threads), as opposed to the data (data parallelism). Most real programs fall somewhere on a continuum between task parallelism and data parallelism.

Implementation Of Parallel Computing In Software  When implemented in software (or rather, in algorithms), it is called 'parallel programming'.  An algorithm is split into pieces that are then executed in parallel, as seen earlier.  Important points to remember:  Dependencies: a typical scenario is when line 6 of an algorithm depends on lines 2, 3, 4 and 5.  Application checkpoints: saving the state of the algorithm, like creating a backup point.  Automatic parallelization: identifying dependencies and parallelizing algorithms automatically; this has achieved only limited success.  When implemented in hardware, it is called 'parallel processing': a chunk of work is divided for execution across units such as cores, processors and CPUs.
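The dependency point can be sketched as follows (an assumed Python example, not the presenter's): the independent steps, playing the role of lines 2-5, run in parallel, and the dependent step, playing the role of line 6, runs only after all of them have finished.

from concurrent.futures import ProcessPoolExecutor

def independent_step(i):
    return i * i                        # stands in for lines 2-5: no dependencies

def dependent_step(values):
    return sum(values)                  # stands in for line 6: needs all earlier results

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        partials = list(pool.map(independent_step, [2, 3, 4, 5]))  # run in parallel
    print(dependent_step(partials))     # executed only after its dependencies finish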

Classes Of Parallel Computing 1. Multicore Computing:  A multicore processor includes multiple execution units ("cores") on the same chip. It differs from a superscalar processor, which can issue multiple instructions per cycle from one instruction stream (thread); a multicore processor can issue multiple instructions per cycle from multiple instruction streams.  IBM's Cell microprocessor, designed for use in the Sony PlayStation 3, is a prominent example of a multicore processor. 2. Symmetric Multiprocessing:  A symmetric multiprocessor (SMP) is a computer system with multiple identical processors that share memory and connect via a bus.  Bus contention prevents bus architectures from scaling, so SMPs generally do not comprise more than 32 processors.  Because of the small size of the processors and the significant reduction in bus bandwidth requirements achieved by large caches, such symmetric multiprocessors are extremely cost-effective, provided that a sufficient amount of memory bandwidth exists.

3. Distributed Computing:  A distributed computer (also known as a distributed-memory multiprocessor) is a distributed-memory computer system in which the processing elements are connected by a network.  Distributed computers are highly scalable. 4. Cluster Computing:  A cluster is a group of loosely coupled computers that work together closely, so that in some respects they can be regarded as a single computer.  Clusters are composed of multiple standalone machines connected by a network.  The machines in a cluster do not have to be symmetric, but load balancing is more difficult if they are not. 5. Grid Computing:  Grid computing is the most distributed form of parallel computing. It makes use of computers communicating over the Internet to work on a given problem.  Because of the low bandwidth and extremely high latency of the Internet, grid computing typically deals only with embarrassingly parallel problems.  Many such distributed computing applications have been created.
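For the distributed-memory case, a common approach is message passing. The sketch below is an assumed example using the mpi4py package (it requires an MPI runtime and would be launched with something like mpiexec -n 4 python script.py); each process works on its own local data and the partial results are combined over the network.

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()                  # each process has its own private memory
size = comm.Get_size()

local = sum(range(rank * 100, (rank + 1) * 100))    # work on a local piece of the problem
total = comm.reduce(local, op=MPI.SUM, root=0)      # combine partial results via messages
if rank == 0:
    print("total computed by", size, "processes:", total)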

Uses Of Parallel Computing  Saves time and money.  Solves larger and more complex problems.  Provides concurrency.  In the natural world, many complex, interrelated events happen at the same time, yet within a temporal sequence.  Compared to serial computing, parallel computing is much better suited for modelling, simulating and understanding such complex, real-world phenomena.  Data is processed more quickly.

Applications of Parallel Computing  Parallel computing is used in many different fields, such as:  Science and Engineering.  Industrial and Commercial.  Aerospace.  Medicine.  Weather Forecasting.  Digital Media.  There are more than 500 fields in which parallel computing is being used to great effect.

References:  Michael Sipser, Introduction to the Theory of Computation.

Queries?  Thank You.