Data Parallel Algorithms


1 Data Parallel Algorithms
Article by W. DANIEL HILLIS and GUY L. STEELE, JR. Presented by: ALAN MOSER Tuesday June 28, 2005

2 Overview
Review: Data Parallel vs. Control Parallel
Connection Machine Programming Model
Differences of the Connection Machine Model
Algorithms
-Summation
-Prefix summation by doubling
-Finding the end of a Linked List
-All partial sums of a Linked List
-Matching up elements of two Linked Lists

3 Data vs. Control Parallel
Data parallel (SIMD): the same instruction is executed synchronously by all processors, each on its own data item.
Control parallel (MIMD): each processor may execute a different instruction from the same code, asynchronously, on its own data items.

4 Connection Machine Programming Model
Consists of two parts:
1. Front-end computer
-a traditional SISD computer that serves as the controller (a VAX or Symbolics 3600)
2. Array of Connection Machine processors
-each with its own local memory
-to the front end, the processor array appears as memory

5 Executing Instructions & Selecting Processors
Executes in SIMD fashion
-A single instruction stream from the front end acts on multiple data items.
Each processor has a state bit, or context flag
-A context flag set to 1 means the CPU is selected
Instructions are one of two types
-Conditional: only the selected CPUs execute
-Unconditional: all CPUs execute, regardless of context flag
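The effect of the context flag can be illustrated with a small sequential simulation. This is only a sketch: the names `run_conditional` and `run_unconditional` are hypothetical, and a Python list stands in for the processor array.

```python
def run_conditional(values, context, op):
    """Apply op only on processors whose context flag is 1.

    Unselected processors keep their old value (hypothetical helper).
    """
    return [op(v) if flag else v for v, flag in zip(values, context)]


def run_unconditional(values, op):
    """Apply op on every processor, ignoring the context flags."""
    return [op(v) for v in values]


values = [1, 2, 3, 4]      # one data item per simulated processor
context = [1, 0, 1, 0]     # 1 = selected, 0 = not selected

doubled_selected = run_conditional(values, context, lambda v: v * 2)
doubled_all = run_unconditional(values, lambda v: v * 2)
```

Here `doubled_selected` changes only positions 0 and 2, while `doubled_all` changes every position.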

6 Differences of the Connection Machine Model
General pointer-based communication
Virtual Processors

7 General Communication?
Typical fine-grained SIMD computers restrict communication to patterns, such as a grid or tree, that are wired into the hardware. The Connection Machine model allows any CPU to communicate with any other CPU, while other CPUs communicate concurrently, via a SEND instruction.

8 SEND Instruction
The SEND instruction takes two operands:
1. The address of the data to be sent
2. A processor pointer
-i.e. the CPU number and the field within that CPU into which the data is to be placed.
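A rough way to picture SEND is a two-phase simulation: every processor first reads its message and destination pointer, then all messages are delivered. The sketch below makes several assumptions that are not in the slides: each processor's memory is a Python dict, a pointer is a `(processor, field)` tuple, and when two messages target the same field the later one simply overwrites the earlier one.

```python
def send(memory, data_field, pointer_field):
    """Simulate one SEND step over all processors (hypothetical model).

    memory: list of dicts, one dict of named fields per processor.
    Each processor sends memory[p][data_field] to the (processor, field)
    pair stored in memory[p][pointer_field]; None means "send nothing".
    """
    # Phase 1: all processors read their operands before any write happens.
    messages = []
    for fields in memory:
        pointer = fields.get(pointer_field)
        if pointer is not None:
            messages.append((pointer, fields[data_field]))
    # Phase 2: deliver. Collisions are resolved by "last write wins",
    # which is an assumption of this sketch only.
    for (dest_proc, dest_field), data in messages:
        memory[dest_proc][dest_field] = data
    return memory


# Three processors arranged in a ring, each sending to its successor.
memory = [
    {"data": 10, "ptr": (1, "inbox")},
    {"data": 20, "ptr": (2, "inbox")},
    {"data": 30, "ptr": (0, "inbox")},
]
send(memory, "data", "ptr")
inboxes = [fields["inbox"] for fields in memory]
```

After the call, each processor's `inbox` holds the value sent by its predecessor in the ring.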

9 Virtual Processors
The Connection Machine model is abstracted from the hardware that supports it (i.e. from the number and size of its processors). Programs are described in terms of virtual processors.

10 Benefits of Virtual Processors
The same program can run unchanged on different sizes of the Connection Machine.
The number of CPUs may be regarded as expandable rather than fixed.
CPUs may be allocated dynamically, "on the fly":
-the processor-cons instruction allocates memory, and the memory comes with its own CPU attached

11 Data Parallel Algorithms
Summation
Prefix Summation
Finding the end of a Linked List
All Partial sums of a Linked List
Matching the elements of two Linked Lists

12 Summation of an Array
for j := 1 to log n do
  for all k in parallel do
    if ((k + 1) mod 2^j) = 0 then
      x[k] := x[k - 2^(j-1)] + x[k]
    fi
  od
od
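The summation loop can be tried out with a sequential simulation of the parallel step. This is a minimal sketch, assuming 0-based indexing and an array length that is a power of two; the function name `parallel_sum` is hypothetical. A snapshot of the array is taken before each step so that every simulated processor reads the values from before the step, as a synchronous machine would.

```python
def parallel_sum(x):
    """Simulate the log n step tree summation; the total ends up in x[n-1]."""
    x = list(x)
    n = len(x)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("this sketch assumes a power-of-two length")
    steps = n.bit_length() - 1           # log2(n) for a power of two
    for j in range(1, steps + 1):
        old = list(x)                    # values as they were before this step
        for k in range(n):               # "for all k in parallel"
            if (k + 1) % (2 ** j) == 0:
                x[k] = old[k - 2 ** (j - 1)] + old[k]
    return x[n - 1]


total = parallel_sum([1, 2, 3, 4, 5, 6, 7, 8])
```

For the eight-element input above, three steps are simulated and `total` is 36.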

13 Diagram of Summation of Array

14 Prefix Summation of an Array
for j := 1 to log n do
  for all k in parallel do
    if (k >= 2^(j-1)) then
      x[k] := x[k - 2^(j-1)] + x[k]
    fi
  od
od
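Prefix summation by doubling can be simulated sequentially in the same way. In this sketch (0-based indexing, hypothetical name `prefix_sum_by_doubling`) the guard is written as `k >= offset`, where `offset` is 2^(j-1), the distance actually read in step j, so that every element with a neighbor at that distance takes part. The number of steps is taken as the ceiling of log2 n, which is an assumption for lengths that are not powers of two.

```python
def prefix_sum_by_doubling(x):
    """Return all inclusive prefix sums of x using ceil(log2 n) doubling steps."""
    x = list(x)
    n = len(x)
    steps = (n - 1).bit_length() if n > 0 else 0   # ceil(log2(n))
    for j in range(1, steps + 1):
        offset = 2 ** (j - 1)
        old = list(x)                    # snapshot: reads see pre-step values
        for k in range(n):               # "for all k in parallel"
            if k >= offset:
                x[k] = old[k - offset] + old[k]
    return x


sums = prefix_sum_by_doubling([1, 2, 3, 4, 5, 6, 7, 8])
```

For this input `sums` is `[1, 3, 6, 10, 15, 21, 28, 36]`, the same result as a running total computed left to right.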

15 Diagram of Prefix Summation of Array by Doubling

16 Count Instruction
Every CPU unconditionally examines its context flag and computes 1 if it is set, 0 if it is clear, then performs an unconditional summation of the integer values.
Used to count the number of selected CPUs (implicit use of the summation algorithm).

17 Enumerate Instruction
Every CPU unconditionally examines its context flag and computes 1 if it is set, 0 if it is clear, then performs an unconditional prefix summation of the integer values.
Used to count and number the selected CPUs (implicit use of prefix summation).
Result: every CPU receives a count of the number of active processors that precede it (including itself).
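Count and Enumerate can be sketched on top of the same doubling scan. The function names `enumerate_selected` and `count_selected` are hypothetical, the context flags are modeled as a Python list of 0/1 values, and Count is derived here from the last prefix value rather than from a separate tree summation, which is a simplification of this sketch.

```python
def enumerate_selected(context):
    """Inclusive prefix sum of the 0/1 context flags, computed by doubling."""
    x = [1 if flag else 0 for flag in context]
    n = len(x)
    offset = 1
    while offset < n:
        old = list(x)                    # snapshot of pre-step values
        for k in range(n):               # "for all k in parallel"
            if k >= offset:
                x[k] = old[k - offset] + old[k]
        offset *= 2
    return x


def count_selected(context):
    """Number of selected processors: the last value of the prefix sums."""
    numbering = enumerate_selected(context)
    return numbering[-1] if numbering else 0


context = [1, 0, 1, 1, 0, 1]
numbering = enumerate_selected(context)
total = count_selected(context)
```

With these flags, `numbering` is `[1, 1, 2, 3, 3, 4]` and `total` is 4; the selected processors at positions 0, 2, 3 and 5 receive the distinct numbers 1 through 4.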

18 Finding the end of a Linked List
for all k in parallel do
  chum[k] := next[k]
  while chum[k] != null and chum[chum[k]] != null do
    chum[k] := chum[chum[k]]
  od
od
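The pointer-doubling loop can be simulated by storing the list in an array of successor indices. This is a sketch under assumptions: `None` stands for null, cells are identified by array index, and the function name `find_end` is hypothetical. The loop runs while any simulated processor still satisfies the while condition, and each step reads from a snapshot so that all updates appear simultaneous.

```python
def find_end(next_):
    """After the loop, chum[k] is the last cell of k's list (None for the last cell itself)."""
    n = len(next_)
    chum = list(next_)
    while True:
        old = list(chum)                 # snapshot of pre-step pointers
        active = [k for k in range(n)
                  if old[k] is not None and old[old[k]] is not None]
        if not active:                   # no processor still in its while loop
            break
        for k in active:                 # "for all k in parallel"
            chum[k] = old[old[k]]        # jump twice as far as before
    return chum


# The list 0 -> 1 -> 2 -> 3 -> 4, stored as successor indices.
chum = find_end([1, 2, 3, 4, None])
```

For the five-cell list above the loop body runs twice and `chum` becomes `[4, 4, 4, 4, None]`.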

19 Linked List after each Iteration of loop
[Figure: the original linked list, and the linked list after each iteration of the loop]

20 All Partial Sums of a Linked List
for all k in parallel do
  chum[k] := next[k]
  while chum[k] != null do
    value[chum[k]] := value[k] + value[chum[k]]
    chum[k] := chum[chum[k]]
  od
od
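The same array representation gives a sequential simulation of the partial-sums loop. As before, this is a sketch: `None` stands for null, the name `list_partial_sums` is hypothetical, and snapshots of `value` and `chum` are taken at each step so that every simulated processor reads pre-step values.

```python
def list_partial_sums(next_, value):
    """Return value[] where each cell holds the sum of itself and all cells before it in the list."""
    n = len(next_)
    value = list(value)
    chum = list(next_)
    while any(c is not None for c in chum):
        old_chum = list(chum)
        old_value = list(value)
        for k in range(n):               # "for all k in parallel"
            target = old_chum[k]
            if target is None:
                continue                 # this processor has left its while loop
            # Each cell is the chum of at most one other cell, so writes never collide.
            value[target] = old_value[k] + old_value[target]
            chum[k] = old_chum[target]
    return value


# The list 0 -> 1 -> 2 -> 3 -> 4 with values 1..5.
sums = list_partial_sums([1, 2, 3, 4, None], [1, 2, 3, 4, 5])
```

For this list `sums` is `[1, 3, 6, 10, 15]`. The cells need not be stored in list order: only the `next_` pointers determine which values are accumulated.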

21 Linked List after execution of chum[k] := next[k]
[Figure: the original linked list, the linked list after execution of chum[k] := next[k], and the linked list after the first iteration of the while loop]

22 Linked List after last Iteration of while loop
Final product shown without chum pointers

23 Matching the elements in two Linked Lists
for all k in parallel do
  friend[k] := null
od
friend[list1] := list2
friend[list2] := list1
for all k in parallel do
  chum[k] := next[k]
  while chum[k] != null do
    if friend[k] != null then
      friend[chum[k]] := chum[friend[k]]
    fi
    chum[k] := chum[chum[k]]
  od
od
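The matching loop can also be simulated sequentially. In this sketch both lists live in one array of successor indices, `list1` and `list2` are the indices of the two head cells, `None` stands for null, and the name `match_lists` is hypothetical. The chum pointer is advanced on every step whether or not the cell already has a friend, so that every simulated processor eventually leaves the loop.

```python
def match_lists(next_, list1, list2):
    """Return friend[], pairing the i-th cell of one list with the i-th cell of the other."""
    n = len(next_)
    friend = [None] * n
    friend[list1] = list2
    friend[list2] = list1
    chum = list(next_)
    while any(c is not None for c in chum):
        old_chum = list(chum)            # snapshots: reads see pre-step values
        old_friend = list(friend)
        for k in range(n):               # "for all k in parallel"
            if old_chum[k] is None:
                continue                 # this processor has left its while loop
            if old_friend[k] is not None:
                # My chum's friend is my friend's chum.
                friend[old_chum[k]] = old_chum[old_friend[k]]
            chum[k] = old_chum[old_chum[k]]
    return friend


# Two lists in one array: 0 -> 1 -> 2 and 3 -> 4 -> 5.
friend = match_lists([1, 2, None, 4, 5, None], 0, 3)
```

For these two three-cell lists `friend` is `[3, 4, 5, 0, 1, 2]`. If the lists have different lengths, the extra cells of the longer list keep a null friend.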

24 Two original Linked Lists

27 Properties of Matching two Linked Lists
It is possible to match two lists of different lengths.
If list2 is the friend of list1 but not vice versa, then list2 will have friend components that are null (unaffected).
This algorithm can process many lists, or pairs of linked lists, simultaneously.

28 Reference
W. Daniel Hillis and Guy L. Steele, Jr. "Data Parallel Algorithms." Communications of the ACM, Volume 29, Number 12, December 1986.

