Parallel Processing & Distributed Systems, by Thoai Nam and Vu Le Hung


Khoa Công Nghệ Thông Tin – Đại Học Bách Khoa Tp.HCM (Faculty of Information Technology, Ho Chi Minh City University of Technology)

Chapter 1: Introduction
- Introduction
  - What is parallel processing?
  - Why do we use parallel processing?
- Applications
- Parallel Processing Terminology

Sequential Processing
- 1 CPU
- Simple
- Big problems???

Application Demands

Power Computing (1)
- More powerful processor
  - 50 MHz -> 100 MHz -> 1 GHz -> 2 GHz -> ... -> upper bound???
- Smart worker
  - Better algorithms
- Parallel processing

Power Computing (2)
- Processor
  - Performance of "killer micros": improves 50-60% / year
  - Transistor count: doubles every 3 years
  - On-chip parallel processing: 100 million transistors by the early 2000s (e.g., Intel's Hyper-Threading)
  - Clock rates rise only slowly: ~30% / year
- Memory
  - DRAM size: quadruples every 3 years
  - The speed gap between memory and processor keeps increasing!!!

Architectural Trend
- The greatest trend in VLSI is the increase in parallelism
  - Up to 1985: bit-level parallelism: 4-bit -> 8-bit -> 16-bit -> 32-bit
  - After the mid-80s: instruction-level parallelism, valuable but limited
- Coarse-grain parallelism is becoming more popular
- Today's microprocessors have multiprocessor support
- Bus-based multiprocessor servers and workstations (Sun, SGI, Intel, etc.)

Parallel Processing Terminology
- Parallel processing
- Parallel computer: a multi-processor computer capable of parallel processing
- Throughput: the amount of work completed per unit time
- Speedup:
  S = Time(most efficient sequential algorithm) / Time(parallel algorithm)
- Forms of parallelism:
  - Pipeline
  - Data parallelism
  - Control parallelism
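The speedup definition above is easy to encode; a minimal sketch (the helper name and the timings are ours, for illustration):

```python
def speedup(t_best_sequential, t_parallel):
    """S = Time(most efficient sequential algorithm) / Time(parallel algorithm)."""
    return t_best_sequential / t_parallel

# Hypothetical timings: a job taking 12 s sequentially and 4 s on 4 processors.
print(speedup(12.0, 4.0))  # 3.0 -- note: less than the processor count of 4
```

Note that the numerator is the *best* sequential time, not the parallel program run on one processor, so speedup can be well below the processor count.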

Pipeline
- A number of steps, called segments or stages
- The output of one stage is the input of the next stage

[Figure: Stage 1 -> Stage 2 -> Stage 3]
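The stage-to-stage flow can be sketched with chained generators, where each stage's output stream is the next stage's input (the stage bodies are made-up placeholders):

```python
# A minimal 3-stage pipeline sketch: items stream through the stages,
# and the output of each stage feeds the next one.
def stage1(items):
    for x in items:
        yield x + 1          # first processing step (placeholder work)

def stage2(items):
    for x in items:
        yield x * 2          # second processing step

def stage3(items):
    for x in items:
        yield x - 3          # third processing step

widgets = [1, 2, 3, 4]
result = list(stage3(stage2(stage1(widgets))))
print(result)  # [1, 3, 5, 7]
```

Because generators are lazy, widget w2 enters stage 1 while w1 is already further along, mirroring how a hardware pipeline overlaps work on successive inputs.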

Data Parallelism
- Applying the same operation simultaneously to elements of a data set
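A minimal sketch of data parallelism: one operation (`triple`, our placeholder) is applied to every element of a data set by a pool of workers. A thread pool stands in here for real parallel processors:

```python
from concurrent.futures import ThreadPoolExecutor

# Data parallelism: the SAME operation runs on every element of the data
# set. (A thread pool is used purely for illustration; a real data-parallel
# system would use multiple processors or SIMD hardware.)
def triple(x):
    return 3 * x

data = [1, 2, 3, 4, 5, 6]
with ThreadPoolExecutor(max_workers=3) as pool:
    result = list(pool.map(triple, data))
print(result)  # [3, 6, 9, 12, 15, 18]
```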

Pipeline & Data Parallelism

[Figure: widgets w1, w2, ... passing through stages A, B, C under three schemes: 1. sequential execution, 2. a 3-stage pipeline, 3. data parallelism on three processors]

Pipeline & Data Parallelism
- Pipelining is a special case of control parallelism
- T(s): sequential execution time
- T(p): pipelined execution time (with 3 stages)
- T(dp): data-parallel execution time (with 3 processors)
- S(p): speedup of the pipeline
- S(dp): speedup of data parallelism

  widgets  1   2      3      4    5      6      7      8      9       10
  T(s)     3   6      9      12   15     18     21     24     27      30
  T(p)     3   4      5      6    7      8      9      10     11      12
  T(dp)    3   3      3      6    6      6      9      9      9       12
  S(p)     1   1+1/2  1+4/5  2    2+1/7  2+1/4  2+1/3  2+2/5  2+5/11  2+1/2
  S(dp)    1   2      3      2    2+1/2  3      2+1/3  2+2/3  3       2+1/2
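Assuming each of the three steps takes one time unit, the entries follow from the model T(s) = 3w, T(p) = w + 2, T(dp) = 3*ceil(w/3); a sketch reproducing them (the function name is ours):

```python
from math import ceil
from fractions import Fraction

STAGES = 3  # the task has 3 steps, one time unit each

def timings(w):
    """Return T(s), T(p), T(dp) for w widgets under the slide's model."""
    t_s = STAGES * w                  # sequential: 3 units per widget
    t_p = w + (STAGES - 1)            # pipeline: 2 units to fill, then 1 widget/unit
    t_dp = STAGES * ceil(w / STAGES)  # 3 processors, each handles whole widgets
    return t_s, t_p, t_dp

for w in range(1, 11):
    t_s, t_p, t_dp = timings(w)
    # Exact fractions match the table, e.g. w=10 gives S(p) = S(dp) = 5/2.
    print(w, t_s, t_p, t_dp, Fraction(t_s, t_p), Fraction(t_s, t_dp))
```

The model makes the contrast visible: S(p) approaches 3 smoothly as the pipeline stays full, while S(dp) jumps around because 3 processors are only fully used when w is a multiple of 3.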


Control Parallelism
- Applying different operations to different data elements simultaneously
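For contrast with data parallelism, a control-parallelism sketch: three workers apply *different* operations (our made-up examples) to the data at the same time:

```python
from concurrent.futures import ThreadPoolExecutor

# Control parallelism: each worker runs a DIFFERENT operation concurrently
# (contrast with data parallelism, where every worker runs the same one).
def total(xs):
    return sum(xs)

def largest(xs):
    return max(xs)

def count_even(xs):
    return sum(1 for x in xs if x % 2 == 0)

data = [4, 7, 1, 8, 3, 6]
with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(f, data) for f in (total, largest, count_even)]
    results = [f.result() for f in futures]
print(results)  # [29, 8, 3]
```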

Scalability
- An algorithm is scalable if its level of parallelism increases at least linearly with the problem size.
- An architecture is scalable if it continues to yield the same performance per processor as the number of processors increases, even on larger problem sizes.
- Data-parallel algorithms are more scalable than control-parallel algorithms.