A few words about parallel computing

Presentation transcript:

A few words about parallel computing (COMP28112 Lecture 2)
Challenges/goals with distributed systems: heterogeneity, openness, security, scalability, failure handling, concurrency, transparency.

“Latency is not zero”
A client is given two options for retrieving some information from a remote server. The first option retrieves a single 100MB file (containing all the relevant information) at once. The second option makes 10 remote requests to the server, each returning a 1MB file (all 10 files are needed to recover the relevant information). If the latency of the network link between the client and the server is 1sec and the available bandwidth is 100MB/sec, which option is faster? What if the latency is 1msec? What do you conclude?
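A quick back-of-the-envelope check, assuming the simple cost model time = latency + size/bandwidth and that the 10 requests are issued one after another:

  Option 1: 1s + 100MB/(100MB/s)       = 2s
  Option 2: 10 × (1s + 1MB/(100MB/s))  = 10.1s
  With 1msec latency:
  Option 1: 0.001s + 1s                = 1.001s
  Option 2: 10 × (0.001s + 0.01s)      = 0.11s

With high latency the single bulk transfer wins; with low latency, fetching only the 10MB that is actually needed wins.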

Why would you need 100 (or 1000, for that matter) networked computers? What would you do? Common characteristic: chop into pieces!

Increased Performance
There are many applications (especially in science) where good response time is needed. How can we decrease execution time? By using more computers to work in parallel. This is not as trivial as it sounds, and there are limits (we’ll come back to this). This is parallel computing. Examples: weather prediction; long-running simulations; …

How do we speed things up?
Many scientific operations are inherently parallel:

  for(i=1; i<100; i++) A[i] = B[i] + C[i];

  for(i=1; i<100; i++)
    for(j=1; j<100; j++) A[i][j] = B[i][j] + C[i][j];

Different machines can operate on different data. Ideally, this may speed up execution by a factor equal to the number of processors (speedup = ts / tp = sequential_time / parallel_time). See the sketch below.
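A minimal sketch of how the first loop could be parallelised in practice, assuming an OpenMP-capable C compiler (e.g., gcc -fopenmp); the array names follow the slide:

  /* Element-wise addition with the iterations shared among threads. */
  #include <omp.h>

  #define N 100
  double A[N], B[N], C[N];

  void add(void)
  {
      #pragma omp parallel for   /* each thread gets a disjoint chunk of i */
      for (int i = 1; i < N; i++)
          A[i] = B[i] + C[i];
  }

Each iteration touches different array elements, so the threads never conflict; this is exactly the “different machines operate on different data” pattern.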

[Figure from Mesa et al., “Scalability of Macroblock-level Parallelism for H.264 Decoding”]

Parallel Computing vs Distributed Computing
Parallel computing: multiple CPUs in the same computer. Distributed computing: computers connected by a network. But don’t use this distinction as a universal rule: in practice, it is not clear where exactly we draw the line between the two. Some of the problems are common, but with a different flavour…

An example: IBM Deep Blue
A 30-node parallel computer. In May 1997 the machine won a six-game chess match against the then world champion, Garry Kasparov. The machine could evaluate about 200 million positions per second, searching up to 40 moves ahead. In parallel computing, we need to deal with:
Communication (the nodes of a parallel computer may need to send data to each other)
Synchronisation (think about the problems with process synchronisation from COMP25111: Operating Systems)

Not all applications can be parallelised!
Compare the two sequences below. In the left-hand sequence the first three assignments are independent of each other and could run in parallel; in the right-hand sequence every statement depends on the one before it, so it is inherently sequential.

  x=3          x=3
  y=4          y=x+4
  z=180*224    z=y+x
  w=x+y+z      w=x+y+z

What can you parallelise? If the proportion of an application that you can parallelise is x (0<x<1), then the maximum speedup you can expect from parallel execution is 1/(1-x). The running time, tp, of a programme running on p CPUs, which when running on one machine runs in time ts, will be: tp = to + ts*(1 - x + x/p), where to is the overhead added due to parallelisation (e.g., communication, synchronisation, etc.). (Read about Amdahl’s law.)
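To see where the 1/(1-x) bound comes from, ignore the overhead to and divide the sequential time by the parallel time:

  speedup = ts / tp = ts / (ts × (1 - x + x/p)) = 1 / (1 - x + x/p)

As p → ∞, the x/p term vanishes, so the speedup can never exceed 1/(1-x): the sequential fraction (1-x) sets a hard ceiling, no matter how many processors you add.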

Example
Take x=0.2. Then tp = to + ts*(0.8 + 0.2/p). Assume that to=1 and ts=1000; then tp = 801 + 200/p:
p=1, tp=1001; p=2, tp=901; p=4, tp=851; …
In practice, to is a function of the number of processors, p. Assume instead that tp = to*p + ts*(0.8 + 0.2/p), i.e., tp = 800 + p + 200/p:
p=1, tp=1001; p=2, tp=902; p=4, tp=854; p=10, tp=830; p=100, tp=902; p=200, tp=1001; …
Past a certain point (here around p=14), the growing overhead outweighs the shrinking parallel part, and adding processors makes things worse.
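The numbers above are easy to reproduce. A small C sketch, using the values assumed on this slide (ts=1000, x=0.2, and an overhead that grows linearly with p):

  #include <stdio.h>

  int main(void)
  {
      const double ts = 1000.0, x = 0.2;        /* values from the slide */
      const int ps[] = {1, 2, 4, 10, 14, 100, 200};

      for (int k = 0; k < 7; k++) {
          int p = ps[k];
          /* overhead to = p (assumed linear in p, as on the slide) */
          double tp = (double)p + ts * ((1.0 - x) + x / p);
          printf("p=%3d  tp=%7.1f\n", p, tp);
      }
      return 0;
  }

Running it confirms the values above and shows the minimum near p=14 (tp ≈ 828).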

How to parallelise an application
Try to identify instructions that can be executed in parallel: these instructions should be independent of each other (the result of one should not depend on another). Try to identify instructions that are executed on multiple data, so that different machines can operate on different data. (Loops are the biggest source of parallelism.) Dependences can be represented schematically as a graph; see the loop example below.
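A minimal C illustration of the independence test above (hypothetical arrays): the iterations of the first loop are independent and can be distributed across machines; the second loop carries a dependence from one iteration to the next, so it must run sequentially.

  for (int i = 1; i < 100; i++)
      A[i] = B[i] + C[i];      /* iterations independent: parallelisable */

  for (int i = 1; i < 100; i++)
      A[i] = A[i-1] + B[i];    /* iteration i needs A[i-1]: sequential */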

In Summary...
Several challenges/goals: heterogeneity, openness, security, scalability, failure handling, concurrency, transparency.
Parallel computing: not really a topic for this module; some of the problems are common, but the priorities are different.
Reading: Coulouris, Sections 1.4, 1.5 (all of Chapter 1 covered); Tanenbaum, Section 1.2 (all of Chapter 1 covered).
Next time: Architectures and Models.