COMP28112 Lecture 2 (30-Sep-16): A few words about parallel computing

Presentation transcript:

Slide 1: A few words about parallel computing
Challenges/goals with distributed systems: Heterogeneity, Openness, Security, Scalability, Failure Handling, Concurrency, Transparency.

Slide 2: "Latency is not zero"
Two options are available to a client to retrieve some information from a remote server. The first option allows the client to retrieve a single 100MB file (containing all the relevant information) at once. The second option requires the client to make 10 remote requests to the server, each returning a 1MB file (all 10 files are needed by the client to retrieve the relevant information). If the latency of the network link between the client and the server is 1 sec and the available bandwidth is 100MB/sec, which option would be faster? What if the latency is 1 msec? What do you conclude?
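A quick back-of-the-envelope check (a sketch, not part of the slides; it assumes the time per request is latency + size/bandwidth, that the ten small requests are issued one after the other, and that transfer_time is a hypothetical helper):

#include <stdio.h>

/* Hypothetical helper: total time for a number of sequential requests,
   each costing latency plus size/bandwidth. */
static double transfer_time(int requests, double mb_per_request,
                            double latency_s, double bandwidth_mb_s) {
    return requests * (latency_s + mb_per_request / bandwidth_mb_s);
}

int main(void) {
    printf("latency 1 s : one 100MB file %.3f s, ten 1MB files %.3f s\n",
           transfer_time(1, 100.0, 1.0, 100.0),    /* 2.000  */
           transfer_time(10, 1.0, 1.0, 100.0));    /* 10.100 */
    printf("latency 1 ms: one 100MB file %.3f s, ten 1MB files %.3f s\n",
           transfer_time(1, 100.0, 0.001, 100.0),  /* 1.001  */
           transfer_time(10, 1.0, 0.001, 100.0));  /* 0.110  */
    return 0;
}

With 1 s latency the single large transfer wins (2 s vs 10.1 s); with 1 ms latency the ten small requests win (0.11 s vs 1.001 s), because they move far less data once latency stops dominating.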

Slide 3: Why would you need 100 (or 1000, for that matter) networked computers?
What would you do with them?

Slide 4: Increased Performance
There are many applications (especially in science) where good response time is needed. How can we reduce execution time?
–By using more computers to work in parallel (parallel computing). This is not as trivial as it sounds and there are limits (we'll come back to this).
Examples: weather prediction; long-running simulations; ...

Slide 5: How do we speed up things?
Many scientific operations are inherently parallel:

for (i = 1; i < 100; i++)
    A[i] = B[i] + C[i]

for (i = 1; i < 100; i++)
    for (j = 1; j < 100; j++)
        A[i,j] = B[i,j] + C[i,j]

Different machines can operate on different data. Ideally, this may speed up execution by a factor equal to the number of processors (speedup = ts / tp = sequential_time / parallel_time).
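As an illustration only (not part of the slides), a minimal OpenMP sketch of the first loop; the pragma is one possible way to split the independent iterations across CPUs, and it assumes a compiler with OpenMP support (e.g. gcc -fopenmp):

#include <stdio.h>
#define N 100
double A[N], B[N], C[N];

int main(void) {
    for (int i = 0; i < N; i++) { B[i] = i; C[i] = 2 * i; }  /* some sample data */

    #pragma omp parallel for      /* each thread handles a chunk of the i range */
    for (int i = 1; i < N; i++)
        A[i] = B[i] + C[i];       /* iterations are independent of each other */

    printf("A[50] = %.1f\n", A[50]);  /* 150.0 */
    return 0;
}

Without OpenMP the pragma is simply ignored and the loop runs sequentially, which is part of what makes loop-level parallelism so convenient.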

Slide 6: (figure) Mesa et al., "Scalability of Macroblock-level Parallelism for H.264 Decoding".

Slide 7: Parallel Computing vs Distributed Computing
Parallel computing: multiple CPUs in the same computer.
Distributed computing: multiple computers connected by a network.
But don't use the distinction above as a universal rule: in practice, it is not clear where exactly we draw the line between the two. Some of the problems are common, but with a different flavour...

Slide 8: An example: IBM Deep Blue
A 30-node parallel computer. In May 1997 the machine won a 6-game chess match against the then world champion. The machine could evaluate about 200 million positions per second, looking up to 40 moves ahead.
In parallel computing, we need to deal with:
–Communication (the nodes of a parallel computer may need to send data to each other)
–Synchronisation (think about the problems with process synchronisation from COMP25111: Operating Systems).

Slide 9: Not all applications can be parallelised!
x=3
y=4
y=x+4
z=180*224
z=y+x
w=x+y+z
What can you parallelise?
If the proportion of an application that you can parallelise is x (0<x<1), then the maximum speedup you can expect from parallel execution is 1/(1-x). The running time, tp, of a programme running on p CPUs, which runs in time ts on one machine, will be equal to tp = to + ts*(1-x + x/p), where to is the overhead added due to parallelisation (e.g., communication, synchronisation, etc.). (Read about Amdahl's law.)

Slide 10: Example
x=0.2. Then tp = to + ts*(0.8 + 0.2/p). Assume that to=1, ts=1000; then tp = 801 + 200/p:
p=1, tp=1001; p=2, tp=901; p=4, tp=851; ...
In practice, to is a function of the number of processors, p. Assume that tp = to*p + ts*(0.8 + 0.2/p); then tp = 800 + p + 200/p:
p=1, tp=1001; p=2, tp=902; p=4, tp=854; p=10, tp=830; p=100, tp=902; p=200, tp=1001; ...
Further reading: "Amdahl, the story of a law that gets broken all the time" – "The total amount of work divided by the length of the critical path is a much better way to get an upper bound on useful parallelism for a given task".
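A small sketch reproducing these numbers (assuming x=0.2 and ts=1000, and comparing the fixed overhead to=1 against an overhead that grows linearly with p):

#include <stdio.h>

int main(void) {
    const double x = 0.2, ts = 1000.0, to = 1.0;
    const int ps[] = {1, 2, 4, 10, 100, 200};

    for (int i = 0; i < 6; i++) {
        double p = ps[i];
        double fixed   = to     + ts * ((1.0 - x) + x / p);  /* tp = to + ts*(0.8 + 0.2/p)   */
        double growing = to * p + ts * ((1.0 - x) + x / p);  /* tp = to*p + ts*(0.8 + 0.2/p) */
        printf("p=%3d  fixed overhead tp=%6.0f  growing overhead tp=%6.0f\n",
               ps[i], fixed, growing);
    }
    /* With no overhead at all, the best possible speedup is 1/(1-x) = 1.25. */
    return 0;
}

Note how, once the overhead grows with p, the running time bottoms out and then climbs back to 1001 at p=200: beyond some point, adding processors makes things worse.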

Slide 11: How to parallelise an application
1. Try to identify instructions that can be executed in parallel. These instructions should be independent of each other (the result of one should not depend on the other).
2. Try to identify instructions that are executed on multiple data, so that different machines operate on different data. (Loops are the biggest source of parallelism.)
Dependencies can be represented schematically – see the graph on the slide!
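For illustration (not from the slides), a short sketch contrasting a loop whose iterations are independent with one that carries a dependency and therefore cannot be split across machines as it stands:

#define N 100
double A[N], B[N], C[N];

void independent(void) {
    /* No iteration reads another iteration's result: iterations can run in parallel. */
    for (int i = 1; i < N; i++)
        A[i] = B[i] + C[i];
}

void dependent(void) {
    /* Loop-carried dependency: iteration i needs A[i-1], so iterations must run in order. */
    for (int i = 1; i < N; i++)
        A[i] = A[i - 1] + B[i];
}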

Slide 12: Summary...
Several challenges / goals:
–Heterogeneity, openness, security, scalability, failure handling, concurrency, transparency.
Parallel computing: not really a topic for this module; some of the problems are common, but the priorities are different.
Reading:
–Coulouris (4th or 5th edition): skim through Chapter 1.
–Tanenbaum, Section 1.2 (all of Chapter 1 is covered).
–Read about Amdahl's law!
Next time: Architectures and Models.