Parallel and Distributed Systems Instructor: Xin Yuan Department of Computer Science Florida State University

Parallel and distributed systems What is a parallel computer? –A collection of processing elements that communicate and cooperate to solve large problems fast. What is a distributed system? –A collection of independent computers that appears to its users as a single coherent system. –A parallel computer is implicitly a distributed system.

What are they? More than one processing element. Elements communicate and cooperate. Appear as one system. PDS hardware is everywhere. –diablo.cs.fsu.edu (2-CPU SMP) –Multicore workstations/laptops –linprog.cs.fsu.edu (4-node cluster; each node is a 2-CPU SMP with 8 cores) –Hadoop clusters –IBM Blue Gene/L at DOE/NNSA/LLNL (over 100,000 processors) –Data centers

Why parallel and distributed computing? Moore’s law: the number of transistors on a chip doubles every 18 months.

Why parallel and distributed computing? How to make good use of the increasing number of transistors on a chip? –Increase the computation width (the ’70s and ’80s): 4-bit -> 8-bit -> 16-bit -> 32-bit -> 64-bit -> … –Instruction-level parallelism (the ’90s): pipelining, LIW, etc. –ILP to the extreme (early 2000s): out-of-order execution, 6-way issue, etc. –Sequential program performance kept increasing: the clock rate kept rising, and clock cycles got smaller and smaller.

Why parallel and distributed computing? The fundamental limits on the speed of a sequential processor: –Power wall (higher frequency produces more heat) –Latency wall (the speed of light does not change)

Power wall: what will happen if we keep pushing ILP?

Latency wall Speed of light ≈ 3×10^8 m/s. One cycle at 4 GHz = 2.5×10^-10 s. The distance light can travel in one cycle: –3×10^8 m/s × 2.5×10^-10 s = 7.5 cm. Intel chip dimension = 1.47 in x 1.47 in = 3.73 cm x 3.73 cm.
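
A quick sanity check of this arithmetic in C (illustrative only; it assumes the rounded value 3.0e8 m/s for the speed of light):

    /* Recompute the latency-wall numbers from the slide above. */
    #include <stdio.h>

    int main(void) {
        double c = 3.0e8;        /* speed of light in m/s (rounded)         */
        double f = 4.0e9;        /* clock frequency: 4 GHz                  */
        double cycle = 1.0 / f;  /* one clock period: 2.5e-10 s             */
        double dist = c * cycle; /* distance light travels in one cycle (m) */

        printf("cycle time         = %.2e s\n", cycle);
        printf("distance per cycle = %.1f cm\n", dist * 100.0); /* 7.5 cm */
        return 0;
    }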

Why parallel and distributed computing? The power wall and the latency wall indicate that the era of single-thread performance improvement through Moore’s law is ending (or has already ended). The additional transistors on a chip are now applied to increase system throughput (the total number of instructions executed), not to reduce latency (the time for a single job to finish). –Improving ILP improves both. –The result is the multi-core era.

The march of multi-core (2004–now) –Mainstream processors: Intel quad-core Xeon (4 cores), AMD quad-core Opteron (4 cores), Sun Niagara (8 cores), IBM POWER6 (2 cores) –Others: Intel Teraflops research chip (80 cores), Cisco CRS-1 (180 cores) (network processors) –The goal: increase throughput without increasing the clock rate.

Why parallel and distributed computing? –Programming multi-core systems is fundamentally different from programming traditional computers. Parallelism needs to be explicitly expressed in the program. –This is traditionally referred to as parallel computing. PDC is not really a choice anymore.

for (i = 0; i < 500; i++) a[i] = 0; Performance: GO PARALLEL (see the sketch below).
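
For illustration (not part of the original slide), here is a minimal sketch of how a loop like this could be parallelized explicitly with OpenMP; the pragma line is where the programmer states the parallelism:

    /* Sketch: parallelizing the loop with OpenMP (compile with -fopenmp). */
    #define N 500

    int main(void) {
        int a[N];

        /* The programmer explicitly marks the loop as parallel;
           its iterations are divided among the available cores. */
        #pragma omp parallel for
        for (int i = 0; i < N; i++) {
            a[i] = 0;
        }
        return 0;
    }

For a 500-iteration loop the thread-management overhead would in practice outweigh any gain; the point is only that the parallelism has to be written down explicitly by the programmer.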

In the foreseeable future, parallel systems with multi-core processors are going mainstream. –There are lots of hardware, software, and human issues in mainstream parallel/distributed computing: How to make use of a large number of processors simultaneously? How to write parallel programs easily and efficiently? What kinds of software support are needed? Analyzing and optimizing parallel programs is still a hard problem. How to debug concurrent programs (see the sketch below)?
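
To make the debugging question concrete, here is a minimal sketch (an assumed example, not from the slides) of a classic concurrency bug: two threads increment a shared counter without synchronization, so the printed total varies from run to run:

    /* Sketch of a data race (compile with: gcc race.c -pthread).
       The final value of `counter` is nondeterministic and is usually
       less than the expected 2000000.                                 */
    #include <pthread.h>
    #include <stdio.h>

    #define ITERS 1000000

    long counter = 0;

    void *work(void *arg) {
        (void)arg;
        for (long i = 0; i < ITERS; i++)
            counter++;          /* unsynchronized read-modify-write */
        return NULL;
    }

    int main(void) {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, work, NULL);
        pthread_create(&t2, NULL, work, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %ld (expected %d)\n", counter, 2 * ITERS);
        return 0;
    }

Protecting the counter with a lock (for example a pthread_mutex_t) or an atomic operation removes the race; this is exactly the kind of coordination issue listed on the next slide.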

Parallel computing issues Data sharing – single versus multiple address spaces. Process coordination – synchronization using locks, messages, and other means (a message-passing sketch follows below). Distributed versus centralized memory. Connectivity – a single shared bus versus a network with many different topologies. Fault tolerance/reliability.
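
On the "multiple address spaces, coordination by messages" side of these issues, a minimal MPI sketch (illustrative only, not from the slides): each process has its own private memory, and data moves between processes only through explicit send and receive calls:

    /* Sketch: coordination by message passing with MPI.
       Compile with mpicc and run with, e.g., mpirun -np 2 ./a.out */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[]) {
        int rank, value;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            value = 42;   /* this data exists only in rank 0's address space */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d from rank 0\n", value);
        }

        MPI_Finalize();
        return 0;
    }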

Distributed systems issues Transparency –Access (data representation), location, migration, relocation, replication, concurrency, failure, persistence Scalability –In size and in geographic spread Reliability/fault tolerance

What will we be doing in this course? This is the first class in PDC. –Systematically introduce the most mainstream parallel computing platforms and programming paradigms. The coverage is broad but not necessarily in depth.