1
Heiko Schröder, 2003
Parallel Architectures
-- Various communication networks
-- State-of-the-art technology
-- Important aspects of routing schemes
-- Known results (theory)
-- The internet
2
Routing Models
Store-and-forward (packet switching) model:
-- The packet is the entity that is routed: one packet per edge per time unit.
-- Queues can be allowed to build up in nodes; try to keep them short.
Circuit switching (path lockdown):
-- The entire path from source to destination is dedicated to the packet.
Wormhole routing
Static routing problems: all packets are present when routing commences. (Dynamic routing: packets arrive at arbitrary times.)
Types of static routing problems (general assumption: each processor sends only one packet):
One-to-one:
-- Each packet has precisely one destination.
-- At most one packet is destined for each processor.
Many-to-one: more than one packet can have the same destination.
One-to-many: a single packet can have more than one destination (copies).
Hot spots = bottlenecks (example: many-to-one); try to avoid them!
3
Wormhole routing
Used in clusters
4
Hot potato routing
-- Try to move as many packets as possible in a “good” direction.
-- Very good average performance!
-- Hot potato routing on the internet
5
Greedy routing
-- Move along the row to the correct column, then move along the column to the destination.
-- Queues are possible.
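The row-then-column rule can be sketched in a few lines. This is only an illustration, assuming a 2D mesh whose nodes are addressed as (row, column) pairs; the function name `greedy_mesh_path` is hypothetical and does not come from the slides.

```python
def greedy_mesh_path(source, destination):
    """Nodes visited by greedy routing: first along the row, then along the column.

    Assumption: nodes are (row, column) pairs on a 2D mesh.
    """
    (row, col), (dest_row, dest_col) = source, destination
    path = [(row, col)]

    # Phase 1: move along the row until the correct column is reached.
    step = 1 if dest_col > col else -1
    while col != dest_col:
        col += step
        path.append((row, col))

    # Phase 2: move along the column to the destination row.
    step = 1 if dest_row > row else -1
    while row != dest_row:
        row += step
        path.append((row, col))

    return path
```

The path length equals the Manhattan distance between source and destination. Packets that turn into the same column at the same time have to share column edges, which is where queues can form.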
6
Butterfly network
-- Unique path
-- FFT
-- Routing
-- Sorting
7
Benes network
8
Benes network
[Figure: Benes network with inputs 1 to 16 and outputs 1 to 16]
9
Packet-Routing Algorithms
-- Packet routing is most important in parallel architectures.
-- Meshes have a large diameter.
-- Benes networks: fast routing, but no fast way of finding the paths is known (they might be computed off-line, which might not be suitable).
-- On-line algorithms?
10
Greedy Routing in BF
[Figure: butterfly with rows 000 to 111 (1 to N) and levels 0 to log N (= k)]
The greedy path from $(u_1 u_2 \dots u_k,\ 0)$ to $(v_1 v_2 \dots v_k,\ k)$ corrects one address bit per level:
$(u_1 u_2 \dots u_{k-1} u_k,\ 0) \to (v_1 u_2 \dots u_{k-1} u_k,\ 1) \to (v_1 v_2 \dots u_{k-1} u_k,\ 2) \to \dots \to (v_1 v_2 \dots v_{k-1} v_k,\ k)$
Example endpoints: $(u_1 u_2 \dots u_{(k-1)/2}\,00\dots0,\ 0)$ and $(00\dots0\,u_{(k-1)/2} \dots u_2 u_1,\ k)$
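A minimal sketch of the greedy path in the butterfly, assuming rows are written as k-bit strings and that the step into level i replaces bit i (counted from the left) with the destination's bit. The function name `greedy_butterfly_path` is hypothetical.

```python
def greedy_butterfly_path(source_row, dest_row, k):
    """Nodes (row, level) on the greedy path from level 0 to level k.

    Assumption: rows are k-bit strings and bit 1 is the leftmost bit.
    """
    row = list(source_row)
    path = [("".join(row), 0)]
    for level in range(1, k + 1):
        # Moving into `level` fixes bit `level` to the destination's bit.
        row[level - 1] = dest_row[level - 1]
        path.append(("".join(row), level))
    return path
```

For example, routing from row 110 to row 011 with k = 3 visits (110, 0), (010, 1), (010, 2), (011, 3).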
11
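Greedy Routing: Worst Cases
Route N packets in a butterfly according to a permutation $\pi: [1,N] \to [1,N]$. Example: bit-reversal permutation.
[Figure: butterfly with rows 000 to 111 (1 to N) and levels 0 to log N (= k)]
Greedy path of a packet under bit reversal (k odd):
$(u_1 u_2 \dots u_{(k-1)/2}\,00\dots0,\ 0) \to (0\,u_2 \dots u_{(k-1)/2}\,00\dots0,\ 1) \to \dots \to (00\dots0\,u_{(k-1)/2}\,00\dots0,\ (k-3)/2) \to (00\dots0,\ (k-1)/2) \to (00\dots0,\ (k+1)/2) \to (00\dots0\,u_{(k-1)/2}\,0\dots0,\ (k+3)/2) \to \dots \to (00\dots0\,u_{(k-1)/2} \dots u_2 u_1,\ k)$
$2^{(k-1)/2} = \sqrt{N/2}$ paths go through the edge from $(00\dots0,\ (k-1)/2)$ to $(00\dots0,\ (k+1)/2)$.
Time: at least $\sqrt{N/2}$ steps, since an edge carries only one packet per time unit.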
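The congestion caused by bit reversal can be checked by brute force for small k. The sketch below assumes N = 2^k rows numbered from 0 and a greedy rule that corrects the most significant bit first; the helper names are hypothetical.

```python
from collections import Counter


def bit_reverse(row, k):
    """Reverse the k-bit binary representation of row."""
    result = 0
    for _ in range(k):
        result = (result << 1) | (row & 1)
        row >>= 1
    return result


def greedy_rows(src, dst, k):
    """Rows visited at levels 0..k when the most significant bit is fixed first."""
    rows, row = [src], src
    for i in range(k):
        mask = 1 << (k - 1 - i)
        row = (row & ~mask) | (dst & mask)
        rows.append(row)
    return rows


def max_edge_congestion(k):
    """Largest number of bit-reversal packets whose greedy paths share one edge."""
    load = Counter()
    for src in range(1 << k):
        rows = greedy_rows(src, bit_reverse(src, k), k)
        for level in range(k):
            # An edge is identified by its level and its two end rows.
            load[(level, rows[level], rows[level + 1])] += 1
    return max(load.values())
```

For odd k the result matches the $2^{(k-1)/2}$ bound, for instance 2 for k = 3 and 4 for k = 5.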
12
Oblivious Routing
Definition: A routing algorithm is called oblivious if the path of each packet depends only on the addresses of its source and destination. Example: greedy routing.
Theorem: Let G=(V,E) be any N-node degree-d network. Then for every oblivious routing algorithm there exists a one-to-one packet routing problem that takes $\Omega(\sqrt{N}/d)$ steps to complete. Proof: see Leighton.
Thus a “good” routing algorithm cannot be oblivious (or greedy): it has to take other packets and/or congestion into account.
13
Routing via sorting
Routing can be (and often is) done via sorting. Merge-sort on the hypercube and on hypercubic networks takes time $O(\log^2 N)$. Much better results are known: it might be possible to sort in time $O(\log N)$ (unknown for hypercubic networks).
If M<N keys need to be sorted, it is advisable to “pack” first, then sort, then “spread”.
14
Packing on the butterfly
[Figure: packets A to E packed into adjacent rows of a butterfly with rows 000 to 111]
Unique greedy path => monotone packing without collisions. Proof?
Destination unknown => first determine the destination!
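The two steps (first determine destinations, then route greedily) can be illustrated as follows. This is a sketch under stated assumptions, not the slide's own construction: destinations are taken to be the ranks of the packets, and the greedy path is assumed to correct the least significant bit first, which is the traversal direction in which the collision-free property was checked here. All function names are hypothetical.

```python
from itertools import accumulate


def pack_destinations(occupied):
    """Map each occupied row to its rank among the occupied rows (0-based)."""
    ranks = accumulate(int(flag) for flag in occupied)  # inclusive prefix sum
    return {row: rank - 1
            for row, (flag, rank) in enumerate(zip(occupied, ranks)) if flag}


def greedy_rows(src, dst, k):
    """Rows visited when bit 0 (least significant) is corrected first (assumption)."""
    rows, row = [src], src
    for bit in range(k):
        mask = 1 << bit
        row = (row & ~mask) | (dst & mask)
        rows.append(row)
    return rows


def packing_is_collision_free(occupied, k):
    """True if no two packets ever visit the same (level, row) node."""
    seen = set()
    for src, dst in pack_destinations(occupied).items():
        for level, row in enumerate(greedy_rows(src, dst, k)):
            if (level, row) in seen:
                return False
            seen.add((level, row))
    return True
```

Checking every subset of occupied rows for k = 3 finds no collision, which is consistent with the claim but is of course not a proof.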
15
[Figure labels: neighbor / not neighbor; distance < 4 / distance >= 4]
16
Prefix sum
A complete binary tree is a sub-graph of the butterfly.
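One common way to compute prefix sums on a complete binary tree is an up-sweep followed by a down-sweep. The sketch below simulates this sequentially; it assumes the number of values is a power of two and stores the tree in heap order. The function name `tree_prefix_sums` is hypothetical.

```python
def tree_prefix_sums(values):
    """Inclusive prefix sums via an up-sweep and a down-sweep on a binary tree.

    Assumption: len(values) is a power of two. Node i has children 2i and 2i+1,
    and the leaves occupy positions n..2n-1.
    """
    n = len(values)
    tree = [0] * n + list(values)

    # Up-sweep: every inner node stores the sum of the leaves below it.
    for node in range(n - 1, 0, -1):
        tree[node] = tree[2 * node] + tree[2 * node + 1]

    # Down-sweep: offset[node] is the sum of all leaves left of node's subtree.
    offset = [0] * (2 * n)
    for node in range(1, n):
        offset[2 * node] = offset[node]
        offset[2 * node + 1] = offset[node] + tree[2 * node]

    return [offset[n + i] + values[i] for i in range(n)]
```

On a parallel machine each tree level would be processed in one step, so both sweeps take a number of steps proportional to log n.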
17
Wrapped butterfly (WBF)
18
0/1 principle
If an oblivious comparison-exchange algorithm sorts all input sets consisting solely of 0s and 1s, then it sorts all input sets with arbitrary values.
Proof (by contradiction): Assume it sorts all input sets consisting solely of 0s and 1s, but it fails to sort some sequence of arbitrary values. Instead of the correct output
$x_1 \le x_2 \le x_3 \le \dots \le x_{k-1} \le x_k \le \dots \le x_n$
it outputs
$x_1 \le x_2 \le x_3 \le \dots \le x_{k-1} < x_r \dots x_k \dots$
Now replace all $x_i$ with $i \le k$ with 0s and all others with 1s. Because this replacement is monotone, every comparison-exchange moves the 0s and 1s exactly as it moved the original values.
[Figure: comparison-exchange elements applied to the 0/1 values that replace $x_k$, $x_s$ and $x_c$; the 0 for $x_k$ lands in the wrong position]
A 0 ends up where $x_k$ ended up, i.e. in a wrong position -- contradiction!
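In practice the 0/1 principle means that a comparator network on n inputs can be validated with 2^n binary inputs instead of all n! orderings. A minimal sketch, assuming a network is given as a list of index pairs (i, j) with i < j; the function names and the example network are illustrative, not taken from the slides.

```python
from itertools import product


def apply_network(network, values):
    """Run an oblivious comparison-exchange network on a sequence of values."""
    data = list(values)
    for i, j in network:
        # Comparator (i, j): the smaller value goes to position i.
        if data[i] > data[j]:
            data[i], data[j] = data[j], data[i]
    return data


def sorts_all_zero_one_inputs(network, n):
    """Check the network on every 0/1 input of length n."""
    return all(apply_network(network, bits) == sorted(bits)
               for bits in product((0, 1), repeat=n))


# A well-known 5-comparator sorting network for 4 inputs.
FOUR_INPUT_NETWORK = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
```

If `sorts_all_zero_one_inputs(FOUR_INPUT_NETWORK, 4)` holds, the principle guarantees that the network sorts arbitrary values as well.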
19
Inductive proof: butterfly sorts bitonic sequences
Use the 0/1 principle.
Bitonic sequence: a concatenation of two sorted sequences (of arbitrary length) that are sorted in opposite directions.
20
Inductive step
Case 1: at least n/2 1s => the max half consists of n/2 1s; the min half is bitonic.
Case 2: at most n/2 1s => the max half is bitonic; the min half consists of n/2 0s.
Cases 3 & 4: [figure with 0/1 patterns]
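The inductive step corresponds to one stage of a bitonic merge: compare element i with element i + n/2, send the minima to one half and the maxima to the other, then recurse on both halves. A sequential sketch, assuming the input length is a power of two; the function names are hypothetical.

```python
def bitonic_merge(seq):
    """Sort a bitonic sequence whose length is a power of two."""
    n = len(seq)
    if n == 1:
        return list(seq)
    half = n // 2
    # One butterfly stage: element i is compared with element i + n/2.
    low = [min(seq[i], seq[i + half]) for i in range(half)]
    high = [max(seq[i], seq[i + half]) for i in range(half)]
    # Both halves are bitonic again, and no element of low exceeds one of high.
    return bitonic_merge(low) + bitonic_merge(high)


def bitonic_sort(seq):
    """Sort by recursively building a bitonic sequence and merging it."""
    n = len(seq)
    if n == 1:
        return list(seq)
    half = n // 2
    ascending = bitonic_sort(seq[:half])
    descending = bitonic_sort(seq[half:])[::-1]
    return bitonic_merge(ascending + descending)
```

For 0/1 inputs, one of the two halves produced by the first stage is constant (all 0s or all 1s), which is exactly the case distinction above.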
21
Time/Area complexity? For sorting on BF
Last merge: log n steps
Previous merge: log n - 1 steps
...
First merge: 1 step
Total time: (log n + 1) log n / 2 steps
Time: $\Theta(\log^2 n)$
Area of butterfly: $\Theta(n^2)$ -- # of crossing points!
$AT^2 = \Theta(n^2 \log^4 n)$: not quite optimal.
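Written out, the totals above follow from summing the merge times and squaring the result. The derivation assumes that the i-th merge takes i steps and that the butterfly area is $\Theta(n^2)$.

```latex
\begin{align*}
T(n) &= \sum_{i=1}^{\log n} i = \frac{(\log n + 1)\,\log n}{2} = \Theta(\log^2 n),\\
A\,T^2 &= \Theta(n^2)\cdot\Theta(\log^2 n)^2 = \Theta(n^2 \log^4 n).
\end{align*}
```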
22
Sorting on the ISA
Repeat log n times: vertical merge; horizontal merge.
23
Sorting on the ISA
[Figure labels: 2x2, 4x2]
24
Horizontal merge
In-shuffle; out-shuffle -- result? Sorted!
Only one dirty row -- prove!
25
In-shuffle
Same number per column
26
Time/Area complexity for sorting on the mesh
Time for a merge step from k x k to 2k x 2k: Ck
Total time: $O(\sqrt{n} \log n)$ on a $\sqrt{n} \times \sqrt{n}$ mesh (remark: … is possible)
Area: $n \log n$
$AT^2 = n^2 \log^3 n$ ($n^2 \log^2 n$ is possible)
AT:
-- BF: $AT = n^2 \log^2 n$
-- Mesh: $AT = n^{3/2} \log^2 n$
27
Warshall’s algorithm
for k:=1 to n do
  for i:=1 to n do
    for j:=1 to n do
      a_ij := F(a_ij, a_ik, a_kj)
[Figure: a_ij is updated from a_ik and a_kj]
Algebraic path problem. Examples:
1.) $a_{ij} := a_{ij} \lor (a_{ik} \land a_{kj})$ -- start with adjacency matrix A (transitive closure)
2.) $d_{ij} := \min(d_{ij},\ d_{ik} + d_{kj})$ -- start with distance matrix D (all shortest paths) [also carry first/last node on path]
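The triple loop and its two instances can be sketched sequentially as below. The update function F is passed in as a parameter; the names `algebraic_path`, `transitive_closure` and `all_shortest_paths` are hypothetical, and missing edges in the distance matrix are assumed to be encoded as `math.inf`.

```python
import math


def algebraic_path(matrix, combine):
    """Warshall-style triple loop: a[i][j] := F(a[i][j], a[i][k], a[k][j])."""
    n = len(matrix)
    a = [row[:] for row in matrix]  # work on a copy of the input
    for k in range(n):
        for i in range(n):
            for j in range(n):
                a[i][j] = combine(a[i][j], a[i][k], a[k][j])
    return a


def transitive_closure(adjacency):
    """Example 1: start with a boolean adjacency matrix."""
    return algebraic_path(adjacency, lambda a_ij, a_ik, a_kj: a_ij or (a_ik and a_kj))


def all_shortest_paths(distance):
    """Example 2: start with a distance matrix (math.inf where there is no edge)."""
    return algebraic_path(distance, lambda d_ij, d_ik, d_kj: min(d_ij, d_ik + d_kj))


INF = math.inf
```

Carrying the first or last node of each path, as the slide suggests, would need a second matrix that is updated whenever the minimum changes; that part is left out of this sketch.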
28
Parallaxis versus ISA
[Figure: data flow of a_ik, a_kj and a_ij (a_1j, a_i1, a_2j, a_i2, ...) for Parallaxis and for the ISA]
Figure labels: V:=C V C:=A V
Adjacency matrix in C
Instructions (only a suggestion): C:=CN C:=CE C:=CW C:=CS A:=C C:=A V:=C AC CA VC