Download presentation
Presentation is loading. Please wait.
Published byHenry Perkins Modified over 9 years ago
1
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.1 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Fault Tolerant Computing ECE 655 Part 8 Networks - 3
2
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.2 Hypercube Networks H n - An n-dimensional hypercube network - 2 nodes A 0-dimensional hypercube H 0 - a single node H n constructed by connecting the corresponding nodes of two H n-1 networks The edges added to connect corresponding nodes are called dimension-(n-1) edges n Dimension-0 edge H 1
3
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.3 Hypercube - Examples Dimension-0 edges Dimension-1 edges Dimension-2 edges Dimension-3 edges H 4 H 1 H 2 H 3 H 3
4
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.4 Routing in Hypercubes Specific numbering to simplify routing Number expressed in binary - if nodes i and j are connected by a dimension-k edge, the names of i and j differ in only the k-th bit position Example - nodes 0000 and 0010 differ in only the 2 bit position - connected by a dimension-1 edge Example - a packet needs to travel from node 14=1110 to node 2=0010 in an H network Possible routings - 1110 0110 (dimension 3) 0010 (dimension 2) 1110 1010 (dimension 2) 0010 (dimension 3) 2 4 2 Dimension-0 edges Dimension-1 edges Dimension-2 edges Dimension-3 edges 1
5
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.5 Routing - General In general - the distance between source and destination is the number of different bits in addresses Going from X to Y can be accomplished by traveling once along each dimension in which they differ X = x... x ; Y=y... y Define z = x y - is the exclusive-or operator Packet must traverse an edge in every dimension i for which z = 1 n- 1 0 0 i i i i
6
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.6 Fault Tolerance in Hypercubes Hn (for n 2) can tolerate link failures - multiple paths from any source to any destination Node failures can disrupt the operation One way is to increase the number of communication ports of each node from n to n+1 and connecting these extra ports through additional links to one or more spare nodes Example - two spare nodes - each a spare for 2 nodes of an Hn- 1 sub-cube Spare nodes may require 2 ports - can be reduced by using several crossbar switches whose outputs is connected to the corresponding spare node Number of ports of the spare node is reduced to n+ 1 - same as for all other nodes n- 1
7
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.7 An H 4 Hypercube with Two Spare Nodes S S
8
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.8 Different Method of Fault-Tolerance Duplicating the processor in a few selected nodes Each additional processor - spare also for any of the processors in the neighboring nodes Example - nodes 0, 7, 8, 15 in H 4 - modified to duplex nodes Every node now has a spare at a distance no larger than 1 Replacing a faulty processor by a spare results in an additional communication delay
9
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.9 Routing in Injured Hypercubes Routing algorithm must be modified to route around the faulty nodes or links Basic idea - list the dimensions along which the packet must travel, and traverse them one by one As edges are traversed and are crossed off the list If, due to a link or a node failure, the desired link is not available - another edge in the list, if any, is chosen for traversal If packet arrives at some node to find all dimensions on its list down - it backtracks to the previous node and tries again
10
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.10 Formal Routing Algorithm - Notations TD - list of dimensions that the message has traveled on - in order of traversal; TD - in reversed order - exclusive-or operation carried out k times, sequentially Example - a means (a a ) a D - destination, S - source, d=D S ( - bitwise exclusive-or operation on corresponding bits of D and S) SC(A) - set of nodes visited if we travel on each of the dimensions listed in set A Example - at node 0010 - SC(1,3)={0000,1000} R k i 3 1 2 3 i=1
11
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.11 Notations - Cont. e - n-bit vector consisting of a 1 in the i-th bit position and 0 everywhere else Example - e = 100 Packets are assumed to consist of (I) d; d=D S (II) Message being transmitted (the ``payload'') (III) List of dimensions taken so far - TD - append operation TD x - append x to the list TD transmit(j) - send packet (d e, message, TD j) along the j-th-dimensional link from the present node i 2 n j 3
12
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.12 Routing Algorithm for Injured Hypercubes
13
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.13 Example - H 3 H 3 with faulty node 011 node 000 wants to send a packet to 111 At 000, d=111 - sends the message out on dimension-0, to node 001 At 001, d = 110 and TD=(0) - attempts dimension-1 edge - impossible Bit 2 of d is also 1 - checks and finds that the dimension-2 edge to 101 is available - message is sent to 101 and then to 111 Exercise - What if both 011 and 101 are down? Dimension-0 edges Dimension-1 edges Dimension-2 edges
14
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.14 Reliability of Point-to-Point Networks Not necessarily a regular structure - often more than one path between any two nodes Terminal Reliability - the probability that there exists an operational path between two specific nodes, given the probabilities of link failures Example - calculating the terminal reliability for the source-sink pair N - N 4 1
15
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.15 Terminal Reliability - Example I Three paths from N to N P ={X,X } P ={X,X,X } p (q ) - probability that link X is good (faulty) Nodes are assumed fault-free - if not, their failure probability is incorporated into outgoing links Set of paths must be modified to an equivalent set of mutually exclusive events - otherwise some events will be counted more than once Mutually exclusive events - (I) P up ; (II) P up and P down ; (III) P up and both P and P down 4 1 1 1,2 2,4 2 3 3,4 1,3 1,2 2,3 3,4 i,j 3 2 1 1 1 2
16
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.16 Calculating Terminal Reliability Terminal reliability of a network with m paths P,..., P from source to sink E (E ) - event in which path P is operational (faulty) R = P(Operational Path Exists) = P( E ) Set of events can be decomposed into mutually exclusive events - O. P. Exists = E ( E E ) ( E E E ) ... ( E E … E ) R = P ( E ) +P ( E E ) +P ( E E E ) +... + P ( E E … E ) 1 m m i=1 i i i 1 i 1 1 1 2 2 m-1 m 3 1 1 1 1 2 2 3 m _ _ _ _ _ _ _ _ _ _ _
17
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.17 Terminal Reliability - Cont. The last expression can be rewritten using conditional probabilities R = P ( E ) +P ( E ) P ( E /E ) +P ( E ) P ( E E /E ) +... + P ( E )P(E … E /E ) The problem is calculating the probabilities P(E … E /E ) To identify the links which must must fail so that E occurs but not E,…, E, conditional sets are used S = P - P = { x | x P and x P } Identifying disjoint events in the general case is not always straightforward 1 1 1 2 2 2 m 3 3 1 m-1 m i-1 1 i _ _ _ _ __ _ i 1 j/i i i j j
18
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.18 Terminal Reliability - Example II This six-node network has 9 links - 6 uni-directional and 3 bi-directional All paths from N 1 to N 6 - Paths are ordered from shortest to longest
19
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.19 Calculating Terminal Reliability for Example II The first term for the reliability equation is P(E )=p p p To calculate the second term in the reliability equation - the conditional set is used S = P - P = {x, x } At least one of the links in this set must fail so that P is faulty (while P is operational) The second term in the probability equation - p p p (1-p p ) 1/2 2 1 1,3 3,5 1 2 1,2 3,5 1,3 5,6 2,5 1 5,6 1,3 3,5
20
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.20 Example II - Cont. For calculating other terms in the sum - intersection of several conditional sets must be considered Calculating the fourth term - expression for P - the conditional sets are: S ={x }; S ={x,x,x }; S ={x,x } S is included in S - if S is faulty, S is faulty S can be ignored The fourth term in the reliability equation - p p p p (1-p )(1-p p ) 3/4 1/4 5,6 4 2/4 2,5 1,3 1,2 5,6 2,4 3,5 4,5 4,6 5,6 1,2 2,4 1,2 1/4 2/4
21
Copyright 2004 Koren & Krishna ECE655/Koren Part.8.21 Example II - Cont. Calculating the third term S = {x,x,x } ; S = {x,x } The two conditional sets are not disjoint The event both S and S are faulty needs to be divided into disjoint events: (I) x is faulty (II) x is operational and both x and x are faulty (III) Both x and x are up, and both x and x are faulty Resulting expression for third term p p p (q + p q q + p p q q ) Remaining terms - calculated similarly Terminal reliability is the sum of all thirteen terms 2/3 1/3 1,3 5,6 3,5 2,5 5,6 1/3 2/3 5,6 1,3 1,2 1,3 2,5 3,5 2,5 5,6 3,5 2,4 4,6 1,3 2,5
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.