
Heiko Schröder, 2003 Parallel Architectures 1 Various communication networks State of the art technology Important aspects of routing schemes Known results.


1 Heiko Schröder, 2003: Parallel Architectures. Various communication networks; state-of-the-art technology; important aspects of routing schemes; known results (theory); the internet.

2 Routing Models
Store-and-forward (packet switching) model:
-- The packet is the entity: one packet per edge per time unit.
-- Queues may be allowed to build up in nodes; try to keep them short.
Circuit switching (path lockdown):
-- The entire path from source to destination is dedicated to the packet.
Wormhole routing.
Static routing problems: all packets are present when routing commences. (Dynamic routing: packets arrive at arbitrary times.)
Types of static routing problems (general assumption: each processor sends only one packet):
-- One-to-one: each packet has precisely one destination, and at most one packet is destined for each processor.
-- Many-to-one: more than one packet can have the same destination.
-- One-to-many: a single packet can have more than one destination (copies).
Hot spots = bottlenecks (example: many-to-one); try to avoid them!
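The three problem types above can be told apart mechanically. The following is a minimal sketch, assuming a hypothetical input format in which `destinations[p]` lists the destination nodes of the single packet sent by processor `p`; the function name and the extra label "many-to-many" are illustrative and do not appear on the slides.

```python
from collections import Counter

def classify(destinations):
    """Classify a static routing problem and report its hot spots.

    destinations[p] is the list of destination nodes of the one packet
    sent by processor p (hypothetical format chosen for this sketch).
    """
    fan_out = max(len(d) for d in destinations)          # copies per packet
    load = Counter(t for d in destinations for t in d)   # packets per target
    fan_in = max(load.values())
    if fan_out == 1 and fan_in == 1:
        kind = "one-to-one"
    elif fan_out == 1:
        kind = "many-to-one"
    elif fan_in == 1:
        kind = "one-to-many"
    else:
        kind = "many-to-many"   # label added for completeness, not on the slides
    hot_spots = sorted(t for t, c in load.items() if c > 1)
    return kind, hot_spots
```

A node that receives more than one packet is reported as a hot spot, which is the bottleneck the slide warns about.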

3 Wormhole routing: used in clusters.

4 Hot potato routing: try to move as many packets as possible in a "good" direction. Very good average performance! Hot potato routing on the internet.

5 Greedy routing: move along the row to the correct column, then move along the column. Queues can build up.
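The row-then-column rule can be sketched as a path computation. This is only an illustration, assuming that nodes are addressed as `(row, col)` pairs on a mesh; the function name is hypothetical.

```python
def greedy_mesh_path(src, dst):
    """Row-first greedy path on a mesh; nodes are (row, col) pairs."""
    (r, c), (dr, dc) = src, dst
    path = [(r, c)]
    step = 1 if dc >= c else -1
    while c != dc:      # phase 1: move along the row to the correct column
        c += step
        path.append((r, c))
    step = 1 if dr >= r else -1
    while r != dr:      # phase 2: move along the column to the destination
        r += step
        path.append((r, c))
    return path
```

Because the path depends only on source and destination, several packets can be forced through the same turning node, which is where queues may form.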

6 Butterfly network: unique path; FFT; routing; sorting.

7 Benes network

8 Benes network (figure: 16 inputs and 16 outputs, each numbered 1 to 16)

9 Packet-Routing Algorithms: most important in parallel architectures. Meshes have a big diameter. Benes networks allow fast routing, but no fast way of finding the paths is known (they might be computed off-line, which might not be suitable). On-line algorithms?

10 Greedy Routing in the Butterfly (BF)
Figure: butterfly with rows 000 to 111 (rows 1 to N) and levels 0 to log N (= k).
A packet from row $u_1 u_2 \dots u_k$ to row $v_1 v_2 \dots v_k$ follows the path
$(u_1 u_2 \dots u_{k-1} u_k, 0) \to (v_1 u_2 \dots u_{k-1} u_k, 1) \to (v_1 v_2 \dots u_{k-1} u_k, 2) \to \dots \to (v_1 v_2 \dots v_{k-1} v_k, k)$.
Example: $(u_1 u_2 \dots u_{(k-1)/2}\,00\dots0, 0) \to (00\dots0\,u_{(k-1)/2} \dots u_2 u_1, k)$.
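The greedy path in the butterfly replaces one source bit by the corresponding destination bit per level. Here is a small sketch of that rule, assuming that rows are k-bit integers and that bit 1 of the slide is the most significant bit (the figure's labelling is not recoverable, so this is an assumption); the function name is illustrative.

```python
def butterfly_path(u, v, k):
    """Greedy path from row u at level 0 to row v at level k.

    Rows are k-bit integers; the bit corrected first is assumed to be the
    most significant one. Returns the list of (row, level) nodes visited.
    """
    row = u
    path = [(row, 0)]
    for level in range(1, k + 1):
        mask = 1 << (k - level)              # bit corrected at this level
        row = (row & ~mask) | (v & mask)     # copy that bit from v
        path.append((row, level))
    return path
```

For example, `butterfly_path(0b101, 0b010, 3)` visits rows 101, 001, 011, 010 at levels 0 to 3.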

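11 Greedy Routing: Worst Cases
Route N packets in a butterfly according to a permutation $\pi : [1,N] \to [1,N]$. Example: bit-reversal permutation.
Figure: butterfly with rows 000 to 111 (rows 1 to N) and levels 0 to log N (= k).
Path of a packet whose source row ends in zeros (k odd):
$(u_1 u_2 \dots u_{(k-1)/2}\,00\dots0, 0) \to (0 u_2 \dots u_{(k-1)/2}\,00\dots0, 1) \to \dots \to (00\dots0\,u_{(k-1)/2}\,00\dots0, (k-3)/2) \to (00\dots0, (k-1)/2) \to (00\dots0, (k+1)/2) \to (00\dots0\,u_{(k-1)/2}\,0\dots0, (k+3)/2) \to \dots \to (00\dots0\,u_{(k-1)/2} \dots u_2 u_1, k)$
$2^{(k-1)/2} = \sqrt{N/2}$ paths go through the edge from $(00\dots0, (k-1)/2)$ to $(00\dots0, (k+1)/2)$. With one packet per edge per time unit, the time is at least $\sqrt{N/2}$ steps.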
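The congestion caused by bit reversal can be checked by brute force. The sketch below counts how many greedy paths visit each butterfly node, assuming rows are k-bit integers with the most significant bit corrected first; both function names are illustrative.

```python
from collections import Counter

def bit_reverse(x, k):
    """Reverse the k-bit binary representation of x."""
    return int(format(x, f"0{k}b")[::-1], 2)

def max_node_load(k):
    """Largest number of greedy paths through one node under bit reversal."""
    load = Counter()
    for u in range(1 << k):
        v = bit_reverse(u, k)
        row = u
        load[(row, 0)] += 1
        for level in range(1, k + 1):
            mask = 1 << (k - level)            # bit corrected at this level
            row = (row & ~mask) | (v & mask)
            load[(row, level)] += 1
    return max(load.values())
```

For odd k the maximum is reached in the middle levels and equals 2^((k-1)/2), so it grows like the square root of N.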

12 Oblivious Routing
Definition: a routing algorithm is called oblivious if the path of each packet depends only on the addresses of its source and destination. Example: greedy routing.
Theorem: let G = (V,E) be any N-node degree-d network. Then for every oblivious routing algorithm there exists a one-to-one packet routing problem which will take at least [bound not preserved in the transcript] steps to complete. Proof: see Leighton.
Thus a "good" routing algorithm cannot be oblivious (or greedy); it has to take into account other packets and/or congestion.

13 Routing via sorting
Routing can be (and often is) done via sorting. Merge-sort on the hypercube and hypercubic networks can be done in time $O(\log^2 N)$. Much better results are known; it might be possible to sort in time $O(\log N)$ (unknown for hypercubic networks). If M < N keys need to be sorted, it is advisable to "pack" first, then sort, then "spread".

14 Packing on the butterfly
Figure: packets A, B, C, D, E on a butterfly with rows 000 to 111.
The greedy path is unique, and monotone packing along greedy paths causes no collisions. Proof? If the destination is unknown, first determine the destination!
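The no-collision claim can be tested by brute force on small butterflies. The sketch below assumes that the destination of a packet is its rank among the occupied rows (a prefix count) and that row bits are corrected from least significant to most significant; under the opposite order, packing can collide (for example, sources 00 and 10 both pass through row 00 after one step). The function names are illustrative.

```python
def packing_paths(sources, k):
    """Greedy butterfly paths that pack the given rows into rows 0, 1, 2, ...

    Returns one list of visited rows (levels 0..k) per packet. Bits are
    corrected from least to most significant (assumption of this sketch).
    """
    paths = []
    for rank, u in enumerate(sorted(sources)):   # destination = rank
        row, path = u, [u]
        for t in range(k):
            mask = 1 << t
            row = (row & ~mask) | (rank & mask)
            path.append(row)
        paths.append(path)
    return paths

def collision_free(paths):
    """True if no two packets occupy the same row at the same level."""
    return all(len(set(level)) == len(level) for level in zip(*paths))
```

A short argument for why this works: if two packets met after t steps, their ranks would agree in the low t bits (so the ranks differ by at least 2^t) while their sources would agree in the high k-t bits (so the sources differ by less than 2^t). In a monotone packing the rank difference never exceeds the source difference, so this cannot happen.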

15 Figure labels: neighbor / not neighbor; distance < 4 / distance >= 4.

16 Prefix sum: the complete binary tree is a subgraph of the butterfly.
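Since a complete binary tree is available inside the butterfly, a prefix sum can be computed with one sweep up the tree and one sweep down. The following sequential sketch simulates both sweeps; it assumes a power-of-two input length and a heap-style tree layout, neither of which is specified on the slide.

```python
def tree_prefix_sum(values):
    """Inclusive prefix sums via an up-sweep and a down-sweep on a
    complete binary tree; len(values) must be a power of two."""
    n = len(values)
    # Heap layout: tree[1] is the root, tree[n + i] is leaf i.
    tree = [0] * n + list(values)
    for node in range(n - 1, 0, -1):          # up-sweep: subtree sums
        tree[node] = tree[2 * node] + tree[2 * node + 1]
    offset = [0] * (2 * n)                    # sum of everything left of a subtree
    for node in range(1, n):                  # down-sweep: pass offsets to children
        offset[2 * node] = offset[node]
        offset[2 * node + 1] = offset[node] + tree[2 * node]
    return [offset[n + i] + values[i] for i in range(n)]
```

Applied to a 0/1 vector that marks occupied rows, the result gives each packet its rank, which is the destination needed for packing. On a parallel machine each sweep takes a number of steps proportional to the tree depth, log n.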

17 Wrapped butterfly (WBF)

18 0/1 principle
If an oblivious comparison-exchange algorithm sorts all input sets consisting solely of 0s and 1s, then it sorts all input sets with arbitrary values.
Proof (by contradiction): assume it sorts all input sets consisting solely of 0s and 1s, but it fails to sort some sequence of arbitrary values. Instead of the correct output
$x_1 \le x_2 \le x_3 \le \dots \le x_{k-1} \le x_k \le \dots \le x_n$
it outputs
$x_1 \le x_2 \le x_3 \le \dots \le x_{k-1} < x_r \dots x_k \dots$
where position k is the first wrong one, so $x_r > x_k$.
Now replace all $x_i$ with $i \le k$ by 0s and all others by 1s. Each comparison-exchange commutes with this order-preserving replacement, so the 0/1 run produces the replaced values of the output above. The 1 that replaced $x_r$ therefore precedes the 0 that replaced $x_k$: a 0 ends up where $x_k$ ended up, i.e. in a wrong position. Contradiction!
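The 0/1 principle turns the verification of a comparison network into a finite test over 2^n inputs. A minimal sketch, assuming a network is given as a list of comparator index pairs; the two example networks `GOOD` and `BROKEN` are illustrative and not taken from the slides.

```python
from itertools import permutations, product

def run_network(network, values):
    """Apply comparators (i, j) in order: minimum to position i, maximum to j."""
    data = list(values)
    for i, j in network:
        if data[i] > data[j]:
            data[i], data[j] = data[j], data[i]
    return data

def sorts_all(network, inputs):
    """True if the network sorts every input in `inputs`."""
    return all(run_network(network, x) == sorted(x) for x in inputs)

# A standard 5-comparator sorting network on 4 inputs, and the same
# network with its last comparator removed.
GOOD = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
BROKEN = GOOD[:-1]
```

Checking `GOOD` on the 16 binary inputs agrees with checking it on all 24 permutations, and `BROKEN` already fails on a binary input such as (0, 1, 0, 1).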

19 Inductive proof: the butterfly sorts bitonic sequences. Use the 0/1 principle. Bitonic sequence: a concatenation of two sorted sequences (of arbitrary length), sorted in opposite directions.

20 Inductive step
Case 1: at least n/2 1s. The max half consists of n/2 1s; the min half is bitonic.
Case 2: at most n/2 1s. The max half is bitonic; the min half consists of n/2 0s.
Cases 3 and 4: exchange the roles of 0 and 1.
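The inductive step corresponds to one level of compare-exchange operations between positions i and i + n/2, followed by recursion on the min half and the max half. The following sequential sketch illustrates this, assuming power-of-two lengths; the function names are illustrative.

```python
def bitonic_merge(seq):
    """Sort a bitonic sequence whose length is a power of two."""
    n = len(seq)
    if n == 1:
        return list(seq)
    half = n // 2
    # One butterfly level: compare-exchange positions i and i + n/2.
    lo = [min(seq[i], seq[i + half]) for i in range(half)]   # min half
    hi = [max(seq[i], seq[i + half]) for i in range(half)]   # max half
    return bitonic_merge(lo) + bitonic_merge(hi)

def bitonic_sort(seq):
    """Sort by recursively sorting both halves and merging them."""
    n = len(seq)
    if n == 1:
        return list(seq)
    half = n // 2
    left = bitonic_sort(seq[:half])
    right = bitonic_sort(seq[half:])
    return bitonic_merge(left + right[::-1])   # ascending then descending
```

Each call to `bitonic_merge` on n keys uses log n levels of compare-exchange operations, matching the step count of one merge on the butterfly.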

21 Time/area complexity for sorting on the BF
Last merge: log n steps; previous merge: log n - 1 steps; ...; first merge: 1 step.
Total time: $(\log n + 1)\log n / 2$ steps, i.e. $\Theta(\log^2 n)$.
Area of the butterfly: $\Theta(n^2)$ (number of crossing points!).
$AT^2 = \Theta(n^2 \log^4 n)$: not quite optimal.
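The totals on this slide follow from summing the merge times and squaring the result; written out as a worked derivation (logarithms to base 2):

```latex
\begin{align*}
T(n) &= \sum_{i=1}^{\log n} i = \frac{(\log n + 1)\,\log n}{2} = \Theta(\log^2 n), \\
A\,T^2 &= \Theta(n^2) \cdot \Theta(\log^2 n)^2 = \Theta(n^2 \log^4 n).
\end{align*}
```

For n = 8, for instance, the merges take 1 + 2 + 3 = 6 steps, which equals (3 + 1) * 3 / 2.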

22 Sorting on the ISA: repeat log n times: vertical merge; horizontal merge.

23 Sorting on the ISA (figure labels: 2x2, 4x2)

24 Horizontal merge: in-shuffle; out-shuffle. Result? Sorted! Only one dirty row: prove!

25 In-shuffle: same number per column.
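The transcript does not define the two shuffles. Assuming they carry their usual meaning from perfect card shuffles, applied here to a plain list of even length rather than to the columns of the ISA, they can be sketched as follows; the function names are illustrative.

```python
def out_shuffle(seq):
    """Interleave the two halves, first half leading: a0 b0 a1 b1 ..."""
    half = len(seq) // 2
    return [x for pair in zip(seq[:half], seq[half:]) for x in pair]

def in_shuffle(seq):
    """Interleave the two halves, second half leading: b0 a0 b1 a1 ..."""
    half = len(seq) // 2
    return [x for pair in zip(seq[half:], seq[:half]) for x in pair]
```

The out-shuffle keeps the first and last elements in place, while the in-shuffle moves them inward.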

26 Time/area complexity for sorting on the mesh
Time for a merge step from k x k to 2k x 2k: Ck.
Total time: $O(\sqrt{n}\,\log n)$ on a $\sqrt{n} \times \sqrt{n}$ mesh (remark: [bound not preserved in the transcript] is possible).
Area: $n \log n$.
$AT^2 = n^2 \log^3 n$ ($n^2 \log^2 n$ is possible).
AT comparison: BF: $AT = n^2 \log^2 n$; mesh: $AT = n^{3/2} \log^2 n$.
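The products quoted on this slide can be recomputed from the stated area and time bounds, taking $T = \log^2 n$ and $A = n^2$ for the butterfly as given on the earlier complexity slide:

```latex
\begin{align*}
\text{Mesh:}\quad A\,T^2 &= (n \log n)\,(\sqrt{n}\,\log n)^2 = n^2 \log^3 n, \\
A\,T &= (n \log n)\,(\sqrt{n}\,\log n) = n^{3/2} \log^2 n, \\
\text{BF:}\quad A\,T &= n^2 \cdot \log^2 n = n^2 \log^2 n.
\end{align*}
```

By the AT measure the mesh is therefore better than the butterfly by a factor of $\sqrt{n}$.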

27 Warshall's algorithm
for k:=1 to n do
  for i:=1 to n do
    for j:=1 to n do
      a_ij := F(a_ij, a_ik, a_kj)
Algebraic path problem. Examples (e.g. all shortest paths):
1.) $a_{ij} := a_{ij} \lor (a_{ik} \land a_{kj})$ -- start with the adjacency matrix A
2.) $d_{ij} := \min\{ d_{ij} ;\ (d_{ik} + d_{kj}) \}$ -- start with the distance matrix D [also carry the first/last node on the path]
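The triple loop can be written once and instantiated with different choices of F. A minimal sequential sketch, in which the names `algebraic_path`, `transitive_closure` and `all_shortest_paths` are illustrative and matrices are plain lists of lists:

```python
import math

def algebraic_path(matrix, combine):
    """Warshall-style triple loop; `combine` plays the role of F."""
    n = len(matrix)
    a = [row[:] for row in matrix]            # work on a copy
    for k in range(n):
        for i in range(n):
            for j in range(n):
                a[i][j] = combine(a[i][j], a[i][k], a[k][j])
    return a

def transitive_closure(adj):
    """Example 1: a_ij := a_ij or (a_ik and a_kj), on a Boolean adjacency matrix."""
    return algebraic_path(adj, lambda x, y, z: x or (y and z))

def all_shortest_paths(dist):
    """Example 2: d_ij := min(d_ij, d_ik + d_kj); use math.inf for missing edges."""
    return algebraic_path(dist, lambda x, y, z: min(x, y + z))
```

This sketch returns only reachability or distances; carrying the first or last node on each path, as the slide suggests, would need an additional matrix.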

28 Parallaxis versus ISA
Figure labels: a_kj, a_ik, a_ij (Parallaxis); a_1j, a_ij, a_i1, a_2j, a_i2, a_ij (ISA).
Assignments shown in the figure (operator symbols lost in the transcript): V := C [?] V; C := A [?] V.
Adjacency matrix in C.
Instructions (only a suggestion): C:=CN, C:=CE, C:=CW, C:=CS, A:=C, C:=A, V:=C, A [?] C, C [?] A, V [?] C.


