Slide 1: Parallel Processing (CS 676) Overview
Jeremy R. Johnson
Slide 2: Goals
Parallelism: to run large and difficult programs fast.
Course: to become effective parallel programmers.
- "How to Write Parallel Programs"
- "Parallelism will become, in the not too distant future, an essential part of every programmer's repertoire"
- "Coordination - a general phenomenon of which parallelism is one example - will become a basic and widespread phenomenon in CS"
Why?
- Some problems require extensive computing power to solve.
- The most powerful computer is, by definition, a parallel machine.
- Parallel computing is becoming ubiquitous.
- Distributed and networked computers with simultaneous users require coordination.
Slide 3: Top 500
The TOP500 list ranks the world's fastest supercomputers by LINPACK performance.
Slide 4: LINPACK Benchmark
Solve a dense N x N system of linear equations, Ax = y, using Gaussian elimination with partial pivoting, costing about 2/3 N^3 + 2 N^2 floating-point operations. High-Performance LINPACK (HPL), introduced by Jack Dongarra, is used to measure performance for the TOP500.
Slide 5: Example - LU Decomposition
Solve a linear system by finding the LU decomposition A = PLU.
Slide 6: Big Machines
- Cray-2, DoE / Lawrence Livermore National Laboratory (1985): 3.9 gigaflops, 8-processor vector machine
- Cray X-MP/4, DoE, LANL, ... (1983): 941 megaflops, 4-processor vector machine
Slide 7: Big Machines
- Cray Jaguar, ORNL (2009): 1.75 petaflops, 224,256 AMD Opteron cores
- Tianhe-1A, NSC Tianjin, China (2010): 2.507 petaflops, 14,336 Xeon X5670 processors, 7,168 Nvidia Tesla M2050 GPUs
Slide 8: Need for Parallelism
Slide 9: Multicore - Intel Core i7
Slide 10: Multicore
- IBM Blue Gene/L (2004-2007): 478.2 teraflops, 65,536 compute nodes
- Cyclops64: 80 gigaflops, 80 multiply-accumulate cores at 500 MHz
Slide 11: Multicore
Slide 12: Multicore
Slide 13: GPU
Nvidia GTX 480: 1.34 teraflops, 480 stream processors (700 MHz), Fermi chip with 3 billion transistors
Slide 14: Google Servers
- 2003: 15,000 servers, ranging from 533 MHz Intel Celerons to dual 1.4 GHz Intel Pentium IIIs
- 2005: 200,000 servers
- 2006: upwards of ... servers
15
Drexel Machines Tux 5 nodes –4 Quad-Core AMD Opteron 8378 processors (2.4 GHz) –32 GB RAM Draco 20 nodes –Dual Xeon Processor X5650 (2.66 GHz) –6 GTX 480 –72 GB RAM 4 nodes –6 C2070 GPUs Parallel Processing15
Slide 16: Programming Challenge
"But the primary challenge for an 80-core chip will be figuring out how to write software that can take advantage of all that horsepower."
Read more: http://news.cnet.com/Intel-shows-off-80-core-processor/21001006_36158181.html?tag=mncol#ixzz1AHCK1LEc
Slide 17: Basic Idea
One way to solve a problem fast is to break it into pieces and arrange for all of the pieces to be solved simultaneously. The more pieces, the faster the job goes - up to the point where the pieces become too small for the cost of breaking them up and distributing them to be worth the bother. A "parallel program" is a program that uses this breaking-up-and-handing-out approach to solve large or difficult problems.
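The break-into-pieces idea can be sketched in a few lines. The example below splits a summation into independent chunks and combines the partial results; threads are used only to keep the sketch self-contained (for CPU-bound Python work, a `ProcessPoolExecutor` would sidestep the GIL and give real speedup):

```python
# Minimal sketch of the "break into pieces" approach: divide the work,
# solve the pieces concurrently, combine the partial results.
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(n, pieces=4):
    """Sum 0..n-1 by splitting the range into `pieces` independent chunks."""
    # Chunk boundaries: piece i covers [i*n//pieces, (i+1)*n//pieces).
    bounds = [(i * n // pieces, (i + 1) * n // pieces) for i in range(pieces)]
    with ThreadPoolExecutor(max_workers=pieces) as pool:
        # Each chunk is summed independently - this is the parallel part.
        partials = pool.map(lambda b: sum(range(b[0], b[1])), bounds)
    # Combine step: reduce the partial results.
    return sum(partials)
```

Note how the slide's caveat shows up here: with very small `n` and many `pieces`, the cost of creating and scheduling the chunks outweighs the work in each one.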