Download presentation
Presentation is loading. Please wait.
Published byNelson Warner Modified over 8 years ago
1
Intro to Parallel and Distributed Processing Some material adapted from slides by Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet, Google Distributed Computing Seminar, 2007 (licensed under Creation Commons Attribution 3.0 License) Slides from: Jimmy Lin The iSchool University of Maryland
2
Web-Scale Problems? How to approach problems in: – Biocomputing – Nanocomputing – Quantum computing –…–… It all boils down to… – Divide-and-conquer – Throwing more hardware at the problem Clusters, Grids, Clouds Simple to understand… a lifetime to master…
3
Cluster computing Improve performance and availability over single computers Compute nodes located locally, connected by fast LAN A few nodes to large number Group of linked computers working closely together, in parallel Typically machines were homogeneous Sometimes viewed as forming single computer
4
Cluster computing Can be: – Tightly coupled Share resources (memory, I/O) – Loosely coupled Connected on network but don’t share resources
5
Memory Typology: Shared Memory Processor
6
Memory Typology: Distributed MemoryProcessorMemoryProcessor MemoryProcessorMemoryProcessor Network
7
Memory Typology: Hybrid Memory Processor Network Processor Memory Processor Memory Processor Memory Processor
8
Clusters Nodes have computational devices and local storage, networked disk storage allows for sharing Static system, centralized control Data Centers Multiple clusters at a datacenter Google cluster 1000s servers – Racks 40-80 servers
9
Grids Grids: – Large number of users – Large volume of data – Large computational task involved Connecting resources through a network
10
Grid Term Grid borrowed from electrical grid Users obtains computing power through Internet by using Grid just like electrical power from any wall socket
11
Grid Computing Heterogeneous, geographically dispersed Loosely coupled – Nodes do not share resources – connect by network WAN Thousands of users worldwide Can be formed from computing resources belonging to multiple organizations Although can be dedicated to single application, typically for variety of purposes
12
Grid By connecting to a Grid, can get: – needed computing power – storage spaces and data – Specialized equipment Middleware needed to use grid Each user - a single login account to access all resources Can connect a data center (cluster) to a grid Compute grids, data grids
13
Grids Resources - owned by diverse organizations – Virtual Organization Sharing of data and resources in dynamic, multi- institutional virtual organizations – Academia – Access to specialized resources not available locally
14
An Illustrative Example NASA research scientist: – collected microbiological samples in the tidewaters around Wallops Island, Virginia. Needs: – high-performance microscope at National Center for Microscopy and Imaging Research (NCMIR), University of California, San Diego. Samples sent to San Diego and used NPACI’s Telescience Grid and NASA’s Information Power Grid (IPG) to view and control the output of the microscope from her desk on Wallops Island. Viewed the samples, and move platform holding them, making adjustments to the microscope.
15
Example (continued) The microscope produced a huge dataset of images This dataset was stored using a storage resource broker on NASA’s IPG Scientist was able to run algorithms on this very dataset while watching the results in real time
16
EU Grid (European Grid Infrastructure) SE – storage element, CE – computing element
17
Grids Domains as diverse as: – Global climate change – High energy physics – Computational genomics – Biomedical applications Volunteer Grid Computing – SetiAtHome SetiAtHome – World Community Grid World Community Grid – FoldingAtHome FoldingAtHome – EinsteinAtHome EinsteinAtHome
18
Grids Grid is: – heterogeneous, geographically distributed, independent site – Gridware manages the resources for Grids Grid differs from: – Parallel computing – grid is more than a single task on multiple machines – Distributed system – grid is more than distributing the load of a program across two or more processes – Cluster computing – grid is more than homogeneous sites connected by LAN (grid can be multiple clusters)
19
Grid Precursor to Cloud computing but no virtualization Some say there wouldn’t be cloud computing if there hadn’t been grid computing “Virtualization distinguishes cloud from grid” Zissis
20
Parallel/Distributed Background
21
Flynn’s Taxonomy Instructions Single (SI)Multiple (MI) Data Multiple (MD) SISD Single-threaded process MISD Pipeline architecture SIMD Vector Processing MIMD Multi-threaded Programming Single (SD)
22
SISD DDDDDDD Processor Instructions
23
SIMD D0D0 Processor Instructions D0D0 D0D0 D0D0 D0D0 D0D0 D1D1 D2D2 D3D3 D4D4 … DnDn D1D1 D2D2 D3D3 D4D4 … DnDn D1D1 D2D2 D3D3 D4D4 … DnDn D1D1 D2D2 D3D3 D4D4 … DnDn D1D1 D2D2 D3D3 D4D4 … DnDn D1D1 D2D2 D3D3 D4D4 … DnDn D1D1 D2D2 D3D3 D4D4 … DnDn D0D0
24
MISD DDDD D D D Processor Instructions D
25
MIMD DDDDDDD Processor Instructions DDDDDDD Processor Instructions
26
The Cloud - Parallelization
27
Divide and Conquer SIMD or MIMD? “Work” w1w1 w2w2 w3w3 r1r1 r2r2 r3r3 “Result” “worker” Partition Combine
28
Different Workers Different threads in the same core Different cores in the same CPU Different CPUs in a multi-processor system Different machines in a distributed system
29
Choices, Choices, Choices Commodity vs. “exotic” hardware Number of machines vs. processor vs. cores Bandwidth of memory vs. disk vs. network Different programming models
30
Parallelization Problems How do we assign work units to workers? What if we have more work units than workers? What if workers need to share partial results? How do we aggregate partial results? How do we know all the workers have finished? What if workers die? What is the common theme of all of these problems? “Work” w1w1 w2w2 w3w3 r1r1 r2r2 r3r3 “Result” “ w o r k e r” Partition Combine
31
General Theme? Parallelization problems arise from: – Communication between workers – Access to shared resources (e.g., data) Thus, we need a synchronization system! This is tricky: – Finding bugs is hard – Solving bugs is even harder
32
Managing Multiple Workers Difficult because – (Often) don’t know the order in which workers run – (Often) don’t know where the workers are running – (Often) don’t know when workers interrupt each other Thus, we need what structures?: – Semaphores (lock, unlock) – Conditional variables (wait, notify, broadcast) – Barriers Still, lots of problems: – Deadlock, livelock, race conditions,... Moral of the story: be careful! – Even trickier if the workers are on different machines
33
Patterns for Parallelism Parallel computing has been around for decades Here are some “design patterns” …
34
Master/Slaves slaves master GFS
35
Producer/Consumer PC P/C Bounded buffer GFS
36
Work Queues C P P P C C shared queue WWWWW Multiple producers/consumers
37
Rubber Meets Road From patterns to implementation: – pthreads, OpenMP for multi-threaded programming – MPI (message passing interface) for clustering computing –…–… The reality: – Lots of one-off solutions, custom code – Write your own dedicated library, then program with it – Burden on the programmer to explicitly manage everything MapReduce to the rescue!
38
Parallelization Ajax: the “front-end” of cloud computing – Highly-interactive Web-based applications MapReduce: the “back-end” of cloud computing – Batch-oriented processing of large datasets – Highly parallel – solve problem in parallel, reduce e.g. count words in document Hadoop MapReduce: a programming model and software framework – for writing applications that rapidly process vast amounts of data in parallel on large clusters of compute nodes.
39
Multiple threads on a VM If 2 hyper threads on 4 cores, VM thinks have 8 If want all 8 threads at once?
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.