Parallel Computing Laxmikant Kale


2 Advent of parallel computing
“Parallel computing is necessary to increase speeds”
  - the cry of the ’70s
  - but processors kept pace with Moore’s law, doubling in speed roughly every 18 months
Now, finally, the time is ripe
  - uniprocessors are commodities (and processor speeds show signs of slowing down)
  - it is highly economical to build parallel machines

3 What is Parallel Computing?
Use of multiple processors to solve a single computational problem faster.
  - Distinct from distributed computing
Why parallel computing?
  - “Parallel computing is necessary to increase speeds” was the cry of the ’70s
  - but processors kept pace with Moore’s law, doubling in speed roughly every 18 months
Now, finally, the time is ripe
  - uniprocessors are commodities (and processor speeds show signs of slowing down)
  - it is highly economical to build parallel machines

4 Why parallel computing
It is the only way to increase speed beyond uniprocessors
  - Except, of course, waiting for uniprocessors to become faster!
  - Several applications require performance orders of magnitude higher than is feasible on uniprocessors
Cost effectiveness:
  - An older argument
  - In 1985, a supercomputer cost 2000 times more than a desktop, yet ran only 400 times faster
  - So: combine microcomputers to get speed at lower cost
  - Incremental scalability: in-between performance points are available with 20, 50, 100, ... processors
  - But: you may get a speedup lower than 400 on 2000 processors!
Microcomputers became faster, effectively killing supercomputers
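
The cost argument on this slide reduces to simple arithmetic. The Python sketch below uses only the two ratios quoted above (2000 times the cost, 400 times the speed). The function names and the sample efficiency values are illustrative assumptions, not figures from the lecture. It computes the parallel efficiency (speedup divided by processor count) at which 2000 desktops match one supercomputer, which works out to 400/2000 = 20%.

```python
# Ratios quoted on the slide for 1985.
COST_RATIO = 2000   # supercomputer cost / desktop cost
SPEED_RATIO = 400   # supercomputer speed / desktop speed


def break_even_efficiency(cost_ratio: int, speed_ratio: int) -> float:
    """Parallel efficiency at which `cost_ratio` desktops match one supercomputer.

    Efficiency is speedup divided by processor count, so matching a machine
    that is `speed_ratio` times faster using `cost_ratio` processors needs
    an efficiency of speed_ratio / cost_ratio.
    """
    return speed_ratio / cost_ratio


def speedup(processors: int, efficiency: float) -> float:
    """Speedup over one desktop for a given processor count and efficiency."""
    return processors * efficiency


if __name__ == "__main__":
    threshold = break_even_efficiency(COST_RATIO, SPEED_RATIO)
    print(f"break-even efficiency: {threshold:.0%}")
    # Hypothetical efficiencies, chosen only to show both sides of the threshold.
    for eff in (0.10, 0.20, 0.50):
        s = speedup(COST_RATIO, eff)
        verdict = "beats" if s > SPEED_RATIO else "does not beat"
        print(f"efficiency {eff:.0%}: speedup {s:.0f}, {verdict} the supercomputer")
```

Below the 20% threshold the slide's warning applies: 2000 processors deliver a speedup under 400, and the commodity machine loses its cost advantage.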

5 Technology Trends
The natural building block for multiprocessors is now also about the fastest!

6 Architectural Trends
The greatest trend across VLSI generations is the increase in parallelism
  - Up to 1985: bit-level parallelism: 4-bit -> 8-bit -> 16-bit
      - slows after 32-bit
      - adoption of 64-bit now under way, 128-bit far off (not a performance issue)
      - great inflection point when a 32-bit micro and cache fit on a chip
  - Mid 80s to mid 90s: instruction-level parallelism
      - pipelining and simple instruction sets, plus compiler advances (RISC)
      - on-chip caches and functional units => superscalar execution
      - greater sophistication: out-of-order execution, speculation, prediction, to deal with control transfer and latency problems

7 Economics
Commodity microprocessors are not only fast but CHEAP
Development cost is tens of millions of dollars ($5-100 million typical)
BUT many more are sold compared to supercomputers
  - Crucial to take advantage of the investment and use the commodity building block
  - Exotic parallel architectures are no more than special-purpose
Multiprocessors are being pushed by software vendors (e.g. database) as well as hardware vendors
Standardization by Intel makes small, bus-based SMPs a commodity
Desktop: a few smaller processors versus one larger one?
  - Multiprocessor on a chip

8 What to Expect?
Parallel machine classes:
  - Cost and usage define a class! The architecture of a class may change.
  - Desktops, engineering workstations, database/web servers, supercomputers
Commodity (home/office) desktop:
  - less than $5,000
  - possible to provide 5-25 processors for that price!
  - Driver applications:
      - games, video/signal processing
      - possibly “peripheral” AI: speech recognition, natural language understanding (?), smart spaces and agents
      - New applications?

9 Engineering workstations
Price: less than $100,000 (used to be):
  - the new acceptable price level may be $50,000
  - 100+ processors, large memory
  - Driver applications:
      - CAD (computer-aided design) of various sorts
      - VLSI
      - Structural and mechanical simulations
      - Etc. (many specialized applications)

10 Commercial Servers
Price range: variable ($10,000 to several hundred thousand dollars)
  - defining characteristic: usage
  - Database servers, decision support (MIS), web servers, e-commerce
High availability and fault tolerance are the main criteria
Trends to watch out for:
  - Likely emergence of specialized architectures/systems
      - E.g. Oracle’s “No Native OS” approach
Currently dominated by database servers and TPC benchmarks
  - TPC: Transaction Processing Performance Council; its benchmarks report transaction throughput (e.g. transactions per minute for TPC-C)
  - But this may change to data mining and application servers, with a corresponding impact on architecture

11 Supercomputers
“Definition”: an expensive system?!
  - Used to be defined by architecture (vector processors, ...)
  - More than a million US dollars?
  - Thousands of processors
Driving applications
  - Grand challenges in science and engineering:
  - Global weather modeling and forecasting
  - Rational drug design / molecular simulations
  - Processing of genetic (genome) information
  - Rocket simulation
  - Airplane design (wings and fluid flow, ...)
  - Operations research?? Not recognized yet
  - Other non-traditional applications?

12 Scientific Computing Demand

13 Engineering Computing Demand
Large parallel machines are a mainstay in many industries
  - Petroleum (reservoir analysis)
  - Automotive (crash simulation, drag analysis, combustion efficiency)
  - Aeronautics (airflow analysis, engine efficiency, structural mechanics, electromagnetism)
  - Computer-aided design
  - Pharmaceuticals (molecular modeling)
  - Visualization
      - in all of the above
      - entertainment (films like Toy Story)
      - architecture (walk-throughs and rendering)
  - Financial modeling (yield and derivative analysis)
  - etc.

14 What is Challenging?
Writing parallel programs is difficult
  - Office worker analogy
Issues of
  - Coordination: I thought you were going to get the pizza
  - Asynchrony: what happens before what
  - Race conditions: can’t determine which will happen first
And finally: Performance!
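
The race-condition issue named on this slide can be made concrete with a small Python sketch. It is a minimal illustration, not code from the lecture: the `Counter` class, the `run` helper, and the thread and iteration counts are all hypothetical. Several threads increment a shared counter. Without coordination, a thread may overwrite an update that another thread made between its read and its write. With a lock, the read-modify-write sequence cannot be interleaved.

```python
import threading


class Counter:
    """Hypothetical shared counter whose increment is a read-modify-write."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self) -> None:
        # Read, then write: another thread may update `value` in between,
        # and that update is then lost.
        current = self.value
        self.value = current + 1

    def safe_increment(self) -> None:
        # The lock ensures only one thread at a time runs this sequence.
        with self._lock:
            current = self.value
            self.value = current + 1


def run(increment_name: str, threads: int = 4, per_thread: int = 50_000) -> int:
    """Start `threads` workers that each call the chosen increment `per_thread` times."""
    counter = Counter()
    increment = getattr(counter, increment_name)

    def worker() -> None:
        for _ in range(per_thread):
            increment()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.value


if __name__ == "__main__":
    print("expected:", 4 * 50_000)
    print("with lock:", run("safe_increment"))
    # The unlocked result depends on thread scheduling and may vary run to run.
    print("without lock:", run("unsafe_increment"))
```

The locked version always reaches the expected total. The unlocked version can fall short by however many updates were lost, and whether it does on a given run depends on scheduling, which is what makes such bugs hard to reproduce.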