Data Parallel Algorithms

Article by W. DANIEL HILLIS and GUY L. STEELE, JR.
Presented by: ALAN MOSER
Tuesday, June 28, 2005

Overview
Review: Data Parallel vs. Control Parallel
Connection Machine Programming Model
Differences of the Connection Machine Model
Algorithms
-Summation
-Prefix summation by doubling
-Finding the end of a Linked List
-All partial sums of a Linked List
-Matching up elements of two Linked Lists

Data vs. Control Parallel
Data parallel (SIMD): the same instruction is executed synchronously by all processors on multiple data items.
Control parallel (MIMD): each processor may execute a different instruction from the same code, asynchronously, on multiple data items.

Connection Machine Programming Model
Consists of two parts:
1. Front-end computer
-traditional SISD computer that serves as the controller (a VAX or Symbolics 3600)
2. Array of Connection Machine processors
-each with its own local memory
-to the front end, the processor array appears as memory

Executing Instructions & Selecting Processors
Executes in SIMD fashion
-a single instruction stream from the front end acts on multiple data items
Each processor has a state bit, or context flag
-a context flag set to 1 means the CPU is selected
Instructions are one of two types
-Conditional: only the selected CPUs execute
-Unconditional: all CPUs execute, regardless of context flag
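The context-flag rule can be illustrated with a minimal Python sketch that simulates the processor array with ordinary lists. The function names `conditional` and `unconditional` and the list layout are assumptions made for illustration; they are not Connection Machine instructions.

```python
def conditional(op, data, context):
    """Apply op only on processors whose context flag is 1 (hypothetical helper)."""
    return [op(x) if flag else x for x, flag in zip(data, context)]


def unconditional(op, data):
    """Apply op on every processor, ignoring the context flags (hypothetical helper)."""
    return [op(x) for x in data]


data = [10, 20, 30, 40]   # one data item per simulated processor
context = [1, 0, 1, 0]    # processors 0 and 2 are selected

doubled_selected = conditional(lambda x: 2 * x, data, context)
doubled_all = unconditional(lambda x: 2 * x, data)
```

In this sketch an unselected processor simply keeps its old value when a conditional instruction is issued.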

Differences of the Connection Machine Model
General pointer-based communication
Virtual processors

General Communication?
Typical fine-grained SIMD computers restrict communication to patterns, such as a grid or tree, that are wired into the hardware. The Connection Machine model allows any CPU to communicate with any other CPU, while other CPUs communicate concurrently, via a SEND instruction.

SEND Instruction
The SEND instruction takes two operands:
1. The address of the data to be sent
2. A processor pointer, i.e. the CPU number and the field within that CPU in which the data is to be placed
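A rough Python sketch of what one SEND step does, with each simulated processor's local memory held in a dictionary. The function `send`, the field names, and the representation of a processor pointer as a (list of destination indices, destination field) pair are all assumptions of this sketch, not the real instruction format. The sketch also assumes that no two processors send to the same destination.

```python
def send(memory, src_field, pointers, dst_field):
    """Simulate one SEND step (hypothetical helper).

    Processor k reads memory[k][src_field] and writes it into field
    dst_field of processor pointers[k]. A pointer of None means that
    processor sends nothing. All reads happen before any write, as in
    one synchronous step.
    """
    messages = [(dest, memory[k][src_field])
                for k, dest in enumerate(pointers) if dest is not None]
    for dest, value in messages:
        memory[dest][dst_field] = value
    return memory


# Four simulated processors, each with two fields of local memory.
memory = [{"out": v, "inbox": None} for v in ("a", "b", "c", "d")]

# Processor k sends to processor (k + 1) mod 4; any other pattern works too.
send(memory, "out", [1, 2, 3, 0], "inbox")
inboxes = [m["inbox"] for m in memory]
```

The destination list is arbitrary data, which is the point of pointer-based communication: the pattern is not fixed by a grid or tree.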

Virtual Processors
The Connection Machine model is abstracted from the hardware that supports it (i.e. the number and size of its processors).
Programs are described in terms of virtual processors.

Benefits of Virtual Processors
The same program can run unchanged on different sizes of the Connection Machine.
The number of CPUs may be regarded as expandable rather than fixed.
CPUs may be allocated dynamically, "on the fly"
-the processor-cons instruction allocates memory, and the memory comes with its own CPU attached

Data Parallel Algorithms
Summation
Prefix Summation
Finding the end of a Linked List
All Partial sums of a Linked List
Matching the elements of two Linked Lists

Summation of an Array
for j := 1 to log n do
  for all k in parallel do
    if ((k + 1) mod 2^j) = 0 then
      x[k] := x[k - 2^(j-1)] + x[k]
    fi
  od
od
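The loop above can be simulated sequentially. The following is a minimal Python sketch, assuming a 0-indexed array whose length is a power of two; the name `parallel_sum` and the snapshot copy used to imitate synchronous execution are choices of this sketch.

```python
def parallel_sum(x):
    """Tree summation by doubling; len(x) is assumed to be a power of two."""
    x = list(x)
    n = len(x)
    steps = n.bit_length() - 1          # log2(n) for a power of two
    for j in range(1, steps + 1):
        old = x[:]                      # every processor reads before any writes
        for k in range(n):              # conceptually, all k at the same time
            if (k + 1) % (2 ** j) == 0:
                x[k] = old[k - 2 ** (j - 1)] + old[k]
    return x[n - 1]                     # the total ends up in the last element


total = parallel_sum([1, 2, 3, 4, 5, 6, 7, 8])
```

For eight elements the outer loop runs three times, and after the last pass the final element holds the sum of the whole array.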

Diagram of Summation of Array

Prefix Summation of an Array
for j := 1 to log n do
  for all k in parallel do
    if (k >= 2^(j-1)) then
      x[k] := x[k - 2^(j-1)] + x[k]
    fi
  od
od
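A minimal Python sketch of prefix summation by doubling, simulated sequentially on a 0-indexed array. The sketch guards each update with k >= 2^(j-1), which is exactly the condition under which x[k - 2^(j-1)] is a valid element. The name `prefix_sums` and the snapshot copy are assumptions of this sketch.

```python
def prefix_sums(x):
    """Inclusive prefix sums by doubling, on a 0-indexed list of any length."""
    x = list(x)
    n = len(x)
    steps = (n - 1).bit_length()        # ceil(log2(n)); 0 when n == 1
    for j in range(1, steps + 1):
        d = 2 ** (j - 1)                # distance doubles on every pass
        old = x[:]                      # every processor reads before any writes
        for k in range(n):              # conceptually, all k at the same time
            if k >= d:
                x[k] = old[k - d] + old[k]
    return x


sums = prefix_sums([1, 2, 3, 4, 5, 6, 7, 8])
```

Unlike plain summation, every element does useful work on every pass, and the result holds all partial sums rather than only the total.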

Diagram of Prefix Summation of Array by Doubling

Count Instruction
Every CPU unconditionally examines its context flag and computes 1 if it is set, 0 if it is clear, then performs an unconditional summation of the integer values.
Used to count the number of selected CPUs (implicit use of the summation algorithm).

Enumerate Instruction
Every CPU unconditionally examines its context flag and computes 1 if it is set, 0 if it is clear, then performs an unconditional prefix summation of the integer values.
Used to count and number the selected CPUs (implicit use of prefix summation).
Result: every CPU receives a count of the number of selected processors that precede it (including itself).
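Count and Enumerate can be sketched in Python on a list of context flags. The built-in `sum` and `itertools.accumulate` stand in, sequentially, for the parallel summation and prefix summation; the function names are hypothetical.

```python
from itertools import accumulate


def count(context):
    """Number of selected processors: a summation over the 0/1 flags."""
    return sum(context)


def enumerate_selected(context):
    """Inclusive prefix sum of the flags: for each processor, the number of
    selected processors up to and including itself."""
    return list(accumulate(context))


context = [1, 0, 1, 1, 0, 1]
n_selected = count(context)
numbering = enumerate_selected(context)
```

In the result, each selected processor holds a distinct number from 1 up to the count, which is what makes Enumerate useful for numbering the selected CPUs.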

Finding the end of a Linked List
for all k in parallel do
  chum[k] := next[k]
  while chum[k] != null and chum[chum[k]] != null do
    chum[k] := chum[chum[k]]
  od
od
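A minimal Python sketch of this pointer-doubling loop, simulated sequentially. Cells are list indices and `None` plays the role of null; the name `find_end` and the snapshot copy that imitates synchronous execution are assumptions of this sketch.

```python
def find_end(next_):
    """Return chum, where chum[k] is the last cell of the list containing k,
    or None if cell k is itself the last cell."""
    n = len(next_)
    chum = list(next_)
    while True:
        # Processors whose while-condition still holds in this step.
        active = [k for k in range(n)
                  if chum[k] is not None and chum[chum[k]] is not None]
        if not active:
            break
        old = chum[:]                   # every processor reads before any writes
        for k in active:
            chum[k] = old[old[k]]       # jump twice as far as before
    return chum


# A five-cell list 0 -> 1 -> 2 -> 3 -> 4.
ends = find_end([1, 2, 3, 4, None])
```

Because the distance covered by each chum pointer doubles on every pass, a list of n cells needs about log n passes rather than n.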

Diagram: Original Linked List, and Linked List after each Iteration of loop

All Partial Sums of a Linked List
for all k in parallel do
  chum[k] := next[k]
  while chum[k] != null do
    value[chum[k]] := value[k] + value[chum[k]]
    chum[k] := chum[chum[k]]
  od
od
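The same simulation style gives a minimal Python sketch of the partial-sums loop. Cells are list indices, `None` plays the role of null, and the name `list_partial_sums` is an assumption of this sketch.

```python
def list_partial_sums(next_, value):
    """Replace each cell's value with the sum of the values from the head of
    its list up to and including that cell."""
    n = len(next_)
    chum = list(next_)
    value = list(value)
    while any(c is not None for c in chum):
        old_chum, old_value = chum[:], value[:]   # synchronous snapshot
        for k in range(n):                        # conceptually, all k at once
            c = old_chum[k]
            if c is not None:
                value[c] = old_value[k] + old_value[c]
                chum[k] = old_chum[c]
    return value


# A five-cell list 0 -> 1 -> 2 -> 3 -> 4 holding the values 1..5.
partial = list_partial_sums([1, 2, 3, 4, None], [1, 2, 3, 4, 5])
```

At any step each cell is the chum of at most one other cell, so no two processors write to the same value field.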

Diagram: Original Linked List; Linked List after execution of chum[k] := next[k]; Linked List after first Iteration of while loop; Linked List after last Iteration of while loop (final product shown without chum pointers)

Matching the elements in two Linked Lists
for all k in parallel do
  friend[k] := null
od
friend[list1] := list2
friend[list2] := list1
for all k in parallel do
  chum[k] := next[k]
  while chum[k] != null do
    if friend[k] != null then
      friend[chum[k]] := chum[friend[k]]
    fi
    chum[k] := chum[chum[k]]
  od
od
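A minimal Python sketch of the matching algorithm, simulated sequentially with both lists stored in one array of cells. `None` plays the role of null and the name `match_lists` is hypothetical. The sketch advances chum[k] on every iteration, whether or not cell k has a friend yet, following the pointer-doubling pattern of the other list algorithms; that placement is an assumption of this sketch.

```python
def match_lists(next_, list1, list2):
    """Return friend, where friend[k] is the cell at the same position in the
    other list, or None if the other list is too short."""
    n = len(next_)
    friend = [None] * n
    friend[list1] = list2               # the two heads are matched by hand
    friend[list2] = list1
    chum = list(next_)
    while any(c is not None for c in chum):
        old_chum, old_friend = chum[:], friend[:]   # synchronous snapshot
        for k in range(n):                          # conceptually, all k at once
            c = old_chum[k]
            if c is None:
                continue
            if old_friend[k] is not None:
                # My chum's friend is my friend's chum.
                friend[c] = old_chum[old_friend[k]]
            chum[k] = old_chum[c]
    return friend


# list1 is 0 -> 1 -> 2 and list2 is 3 -> 4 -> 5, in the same cell array.
friends = match_lists([1, 2, None, 4, 5, None], 0, 3)
```

With lists of different lengths, the extra cells of the longer list end up with a null friend.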

Two original Linked Lists

Properties of Matching two Linked Lists
It is possible to match two lists of different lengths.
If list2 is made the friend of list1 but not vice versa, then list2 will have friend components that are null (unaffected).
This algorithm can process many lists, or pairs of linked lists, simultaneously.

Reference
W. Daniel Hillis and Guy L. Steele, Jr. "Data Parallel Algorithms," Communications of the ACM, Volume 29, Number 12, December 1986.
