1
Data Parallel Algorithms
Article by W. Daniel Hillis and Guy L. Steele, Jr.
Presented by: Alan Moser
Tuesday, June 28, 2005
2
Overview
Review: Data Parallel vs. Control Parallel
Connection Machine Programming Model
Differences of the Connection Machine Model
Algorithms
-Summation
-Prefix summation by doubling
-Finding the end of a Linked List
-All partial sums of a Linked List
-Matching up elements of two Linked Lists
3
Data vs. Control Parallel
Data parallel (SIMD): the same instruction is executed synchronously by all processors, each on its own data item.
Control parallel (MIMD): each processor may execute a different instruction from the same code, asynchronously, on multiple data items.
4
Connection Machine Programming Model
Consists of two parts:
1. Front-end computer
-a traditional SISD computer that serves as the controller (a VAX or Symbolics 3600)
2. Array of Connection Machine processors
-each with its own local memory
-to the front end, the processor array appears as memory
5
Executing Instructions & Selecting Processors
Executes in SIMD fashion
-a single instruction stream from the front end acts on multiple data items
Each processor has a state bit, or context flag
-a context flag set to 1 means the CPU is selected
Instructions are one of two types
-Conditional: only the selected CPUs execute
-Unconditional: all CPUs execute, regardless of context flag
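The difference between conditional and unconditional instructions can be sketched in a few lines of Python. This is only an illustrative simulation: the function names `run_conditional` and `run_unconditional` are made up for this example, and each list position stands in for one processor.

```python
def run_conditional(context, values, op):
    """Apply op only on processors whose context flag is 1 (hypothetical helper)."""
    return [op(v) if flag == 1 else v for flag, v in zip(context, values)]


def run_unconditional(values, op):
    """Apply op on every processor, ignoring the context flags (hypothetical helper)."""
    return [op(v) for v in values]


context = [1, 0, 1, 1, 0]          # one context flag per processor
values = [10, 20, 30, 40, 50]      # one data item per processor

conditional_result = run_conditional(context, values, lambda v: v + 1)
unconditional_result = run_unconditional(values, lambda v: v + 1)
# conditional_result   == [11, 20, 31, 41, 50]
# unconditional_result == [11, 21, 31, 41, 51]
```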
6
Differences of the Connection Machine Model
General pointer-based communication
Virtual Processors
7
General Communication?
Typical computers of the fine-grained SIMD style restrict communication to patterns, such as a grid or tree, that are wired into the hardware.
The Connection Machine model allows any CPU to communicate with any other CPU, while other CPUs communicate concurrently, via a SEND instruction.
8
SEND Instruction
The SEND instruction takes two operands:
1. The address of the data to be sent
2. A processor pointer
-i.e. the CPU number and the field within that CPU in which the data is to be placed
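A rough Python simulation of SEND may help make the two operands concrete. Everything here is an assumption made for illustration: each processor's memory is modeled as a dict, the field names `msg`, `ptr`, and `inbox` are invented, and the sketch assumes no two processors send to the same destination field.

```python
def send(memory, selected, message_field, pointer_field):
    """Simulate one SEND: each selected CPU delivers its message to the
    (cpu, field) location named by its pointer. Hypothetical model only."""
    # Read phase: collect every outgoing message before any write happens,
    # so that all sends behave as if they occurred at the same instant.
    outgoing = []
    for cpu, cell in enumerate(memory):
        if selected[cpu]:
            dest_cpu, dest_field = cell[pointer_field]
            outgoing.append((dest_cpu, dest_field, cell[message_field]))
    # Write phase: deliver the messages.
    for dest_cpu, dest_field, message in outgoing:
        memory[dest_cpu][dest_field] = message


# Four CPUs; CPU i points at the "inbox" field of CPU (i + 1) mod 4.
memory = [
    {"msg": 100 + i, "ptr": ((i + 1) % 4, "inbox"), "inbox": None}
    for i in range(4)
]
selected = [True, True, False, True]   # CPU 2 is not selected, so it does not send

send(memory, selected, "msg", "ptr")
inboxes = [cell["inbox"] for cell in memory]
# inboxes == [103, 100, 101, None]
```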
9
Virtual Processors
The Connection Machine model is abstracted from the hardware that supports it (i.e. the number and size of its processors).
Programs are described in terms of virtual processors.
10
Benefits of Virtual Processors
The same program can run unchanged on different sizes of the Connection Machine
The number of CPUs may be regarded as expandable rather than fixed
CPUs may be allocated dynamically, "on the fly"
-the processor-cons instruction allocates memory, and the memory comes with its own CPU attached
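One way to picture virtual processors is to let each physical processor loop over a share of the virtual ones. The sketch below is an assumption about how such multiplexing could be simulated, not a description of the actual hardware; the function name and the block-wise assignment of virtual to physical processors are invented for this example.

```python
def run_on_virtual_processors(values, op, physical_count):
    """Apply op once per virtual processor, using only physical_count
    physical processors (hypothetical multiplexing scheme)."""
    n = len(values)
    result = list(values)
    per_cpu = -(-n // physical_count)        # ceiling division: virtual CPUs per physical CPU
    for slot in range(per_cpu):              # sequential time steps
        for cpu in range(physical_count):    # these run in parallel on real hardware
            v = cpu * per_cpu + slot         # index of the virtual processor served now
            if v < n:
                result[v] = op(values[v])
    return result


data = list(range(8))
on_two = run_on_virtual_processors(data, lambda v: 2 * v, physical_count=2)
on_eight = run_on_virtual_processors(data, lambda v: 2 * v, physical_count=8)
# Both calls give [0, 2, 4, 6, 8, 10, 12, 14]; only the number of time steps differs.
```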
11
Data Parallel Algorithms
Summation
Prefix Summation
Finding the end of a Linked List
All Partial Sums of a Linked List
Matching the elements of two Linked Lists
12
Summation of an Array
for j := 1 to log2 n do
  for all k in parallel do
    if ((k + 1) mod 2^j) = 0 then
      x[k] := x[k - 2^(j-1)] + x[k]
    fi
  od
od
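The summation loop can be simulated sequentially in Python. This is a sketch under two assumptions: n is a power of two, and indices start at 0. The copy `old` stands in for the synchronous read that all processors perform before any of them writes; the name `parallel_sum` is chosen for this example.

```python
def parallel_sum(x):
    """Simulate the tree summation; the total ends up in the last element."""
    x = list(x)
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes n is a power of two")
    steps = n.bit_length() - 1              # log2(n) for a power of two
    for j in range(1, steps + 1):
        old = list(x)                       # snapshot: all reads happen before writes
        for k in range(n):                  # "for all k in parallel"
            if (k + 1) % (2 ** j) == 0:
                x[k] = old[k - 2 ** (j - 1)] + old[k]
    return x


summed = parallel_sum([1, 2, 3, 4, 5, 6, 7, 8])
# summed == [1, 3, 3, 10, 5, 11, 7, 36]; the total 36 is in the last position
```

Only log2 n steps are needed, but fewer processors do useful work at each step: n/2 in the first, then n/4, and finally one.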
13
Diagram of Summation of Array
14
Prefix Summation of an Array
for j := 1 to log2 n do
  for all k in parallel do
    if k >= 2^(j-1) then
      x[k] := x[k - 2^(j-1)] + x[k]
    fi
  od
od
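A sequential Python simulation of prefix summation by doubling is sketched below. It assumes 0-based indices and uses the guard k >= 2^(j-1), which is what keeps the index k - 2^(j-1) in range; `stride` plays the role of 2^(j-1). The function name is chosen for this example.

```python
def prefix_sum_by_doubling(x):
    """Simulate the doubling scan: afterwards x[k] holds x[0] + ... + x[k]."""
    x = list(x)
    n = len(x)
    stride = 1                              # stride = 2^(j-1) in the pseudocode
    while stride < n:
        old = list(x)                       # snapshot: all reads happen before writes
        for k in range(n):                  # "for all k in parallel"
            if k >= stride:
                x[k] = old[k - stride] + old[k]
        stride *= 2
    return x


prefix = prefix_sum_by_doubling([1, 2, 3, 4, 5, 6, 7, 8])
# prefix == [1, 3, 6, 10, 15, 21, 28, 36]
```

Unlike the plain summation, most processors stay busy at every step, which is why all n partial sums are available after about log2 n steps.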
15
Diagram of Prefix Summation of Array by Doubling
16
Count Instruction
Every CPU unconditionally examines its context flag and computes 1 if it is set, 0 if it is clear
An unconditional summation of the integer values is then performed
Used to count the number of selected CPUs
-an implicit use of the summation algorithm
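The count instruction can be imitated in Python as a flag-to-integer conversion followed by a tree summation. The sketch below is illustrative only: it uses a pairwise reduction that accumulates into element 0 and works for any n, which is one possible summation order and not necessarily the one the machine uses. The name `count_selected` is invented here.

```python
def count_selected(context):
    """Count the processors whose context flag is set, using a pairwise tree sum."""
    vals = [1 if flag else 0 for flag in context]   # every CPU computes 1 or 0
    n = len(vals)
    if n == 0:
        return 0
    stride = 1
    while stride < n:
        # Each pair (k, k + stride) is independent, so a step could run in parallel.
        for k in range(0, n - stride, 2 * stride):
            vals[k] += vals[k + stride]
        stride *= 2
    return vals[0]


selected_count = count_selected([1, 0, 1, 1, 0, 1])
# selected_count == 4
```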
17
Enumerate Instruction
Every CPU unconditionally examines its context flag and computes 1 if it is set, 0 if it is clear
An unconditional prefix summation of the integer values is then performed
Used to count and number the selected CPUs
-an implicit use of prefix summation
Result: every CPU receives a count of the number of selected processors that precede it (including itself)
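As a sketch, enumerate can be simulated as a prefix summation over the 0/1 values derived from the context flags. The version below follows the slide's description, so each count includes the processor itself; the name `enumerate_selected` is chosen for this example (and avoids shadowing Python's built-in `enumerate`).

```python
def enumerate_selected(context):
    """For each CPU, count the selected CPUs up to and including itself."""
    x = [1 if flag else 0 for flag in context]      # every CPU computes 1 or 0
    stride = 1
    while stride < len(x):
        old = list(x)                               # snapshot: reads before writes
        for k in range(stride, len(x)):             # "for all k in parallel"
            x[k] = old[k - stride] + old[k]
        stride *= 2
    return x


numbering = enumerate_selected([1, 0, 1, 1, 0, 1])
# numbering == [1, 1, 2, 3, 3, 4]
```

The selected processors (positions 0, 2, 3, and 5) receive the distinct numbers 1, 2, 3, and 4, which is what makes the result usable for numbering them.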
18
Finding the end of a Linked List
for all k in parallel do
  chum[k] := next[k]
  while chum[k] != null and chum[chum[k]] != null do
    chum[k] := chum[chum[k]]
  od
od
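The pointer-doubling loop can be simulated in Python with list indices as pointers and `None` as null. This is a sketch under those assumptions; the function name and the example list are made up. A snapshot of `chum` is taken at each iteration to imitate the synchronous reads.

```python
def find_list_ends(next_cell):
    """Return chum, where chum[k] is the last cell of k's list
    (or None if k is itself the last cell)."""
    chum = list(next_cell)
    while True:
        old = list(chum)                    # snapshot: all reads happen before writes
        active = [
            k for k in range(len(old))
            if old[k] is not None and old[old[k]] is not None
        ]
        if not active:                      # no processor satisfies the while condition
            break
        for k in active:                    # "for all k in parallel"
            chum[k] = old[old[k]]           # jump twice as far as before
    return chum


# One list stored out of order: 0 -> 3 -> 4 -> 2 -> 1, so cell 1 is the end.
ends = find_list_ends([3, None, 1, 4, 2])
# ends == [1, None, 1, 1, 1]
```

The distance covered by each chum pointer doubles per iteration, so a list of n cells needs about log2 n iterations.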
19
Linked List after each Iteration of loop
Figure: the original linked list, followed by the linked list after each iteration of the loop.
20
All Partial Sums of a Linked List
for all k in parallel do
  chum[k] := next[k]
  while chum[k] != null do
    value[chum[k]] := value[k] + value[chum[k]]
    chum[k] := chum[chum[k]]
  od
od
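A Python simulation of the partial-sums loop is sketched below, again with list indices as pointers and `None` as null. The function name and sample data are assumptions for illustration. Snapshots of `value` and `chum` imitate the synchronous step; within a single list each cell is the chum of at most one other cell, so the writes do not collide.

```python
def list_partial_sums(next_cell, value):
    """Return the running sums along the list, stored per cell."""
    chum = list(next_cell)
    value = list(value)
    while any(c is not None for c in chum):
        old_value = list(value)             # snapshots: all reads happen before writes
        old_chum = list(chum)
        for k in range(len(chum)):          # "for all k in parallel"
            target = old_chum[k]
            if target is not None:
                value[target] = old_value[k] + old_value[target]
                chum[k] = old_chum[target]
    return value


# List order is 0 -> 3 -> 4 -> 2 -> 1, carrying the values 10, 40, 50, 30, 20.
sums = list_partial_sums([3, None, 1, 4, 2], [10, 20, 30, 40, 50])
# sums == [10, 150, 130, 50, 100]
# Read in list order (cells 0, 3, 4, 2, 1) this is 10, 50, 100, 130, 150.
```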
21
Linked List after execution of chum[k]:=next[k]
Figure: the original linked list, the list after execution of chum[k] := next[k], and the list after the first iteration of the while loop.
22
Linked List after last Iteration of while loop
Final product shown without chum pointers
23
Matching the elements in two Linked Lists
for all k in parallel do
  friend[k] := null
od
friend[list1] := list2
friend[list2] := list1
for all k in parallel do
  chum[k] := next[k]
  while chum[k] != null do
    if friend[k] != null then
      friend[chum[k]] := chum[friend[k]]
    fi
    chum[k] := chum[chum[k]]
  od
od
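The matching algorithm can be simulated in Python as sketched here. The sketch assumes both lists live in one shared `next_cell` array, uses `None` for null, and advances `chum` on every iteration whether or not the cell has a friend yet, since a cell that stalled would fall out of step with its partner. The function name and the sample lists are invented for this example.

```python
def match_lists(next_cell, list1, list2):
    """Pair up corresponding cells of two lists: friend[k] is k's partner."""
    n = len(next_cell)
    friend = [None] * n
    friend[list1] = list2                   # the two heads start out as friends
    friend[list2] = list1
    chum = list(next_cell)
    while any(c is not None for c in chum):
        old_friend = list(friend)           # snapshots: all reads happen before writes
        old_chum = list(chum)
        for k in range(n):                  # "for all k in parallel"
            if old_chum[k] is None:
                continue                    # this processor has left the while loop
            if old_friend[k] is not None:
                # My chum's friend is my friend's chum.
                friend[old_chum[k]] = old_chum[old_friend[k]]
            chum[k] = old_chum[old_chum[k]]
    return friend


# list1 is 0 -> 1 -> 2 and list2 is 3 -> 4 -> 5.
friends = match_lists([1, 2, None, 4, 5, None], list1=0, list2=3)
# friends == [3, 4, 5, 0, 1, 2]
```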
24
Two original Linked Lists
27
Properties of Matching two Linked Lists
It is possible to match two lists of different lengths
If list2 is made the friend of list1 but not vice versa, the friend components of the cells of list2 are unaffected (they remain null)
The algorithm can process many lists, or pairs of linked lists, simultaneously
28
Reference
W. Daniel Hillis and Guy L. Steele, Jr. "Data Parallel Algorithms." Communications of the ACM, Volume 29, Number 12, December 1986.