RAIN Technology (Reliable Array of Independent Nodes)


1 RAIN Technology (Reliable Array of Independent Nodes)
Presented by: Ravikrishnan P.

2 Contents RAID flavors (RAID 0, RAID 1, RAID 5)
RAIN architecture, features of RAIN, RAIN platform, RAIN project goals, distributed storage, advantages

3 INTRODUCTION The name of the original research project is RAIN, which stands for Reliable Array of Independent Nodes. The RAIN technology originated in a research project at the California Institute of Technology (Caltech), in collaboration with NASA's Jet Propulsion Laboratory. It is a component that stores data across distributed processors and retrieves it even if some of the processors fail.

4 RAID flavors Commonly used: RAID 0, RAID 1, RAID 5, RAID 10
Other types, used rarely: RAID 2, 3, 4, 6, 50, …

5 RAID 0 RAID 0 is 'data striping': data is split into blocks spread across multiple disks, improving throughput but providing no redundancy.

6 RAID 1 RAID 1 is 'data mirroring'.
Two copies of the data are held on two physical disks, and the data is always identical.

7 RAID 5 "Distributed parity" is the key phrase here: parity blocks are spread across all the disks in the array, so the data on any single failed disk can be reconstructed.
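Distributed parity boils down to XOR: the parity block of a stripe is the XOR of its data blocks, so any one lost block equals the XOR of the survivors. A minimal sketch in Python (my own illustration; block contents and stripe size are made up, not from the presentation):

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

# One stripe across a 4-disk RAID 5 array: 3 data blocks + 1 parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)                      # parity = d0 ^ d1 ^ d2

# The disk holding d1 dies: its block is the XOR of the surviving blocks.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]
```

The same arithmetic explains why RAID 5 survives exactly one disk failure: with two blocks missing, the single parity equation no longer determines them.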

8 RAIN s/w architecture Protocol stack: Application → MPI/PVM → TCP/IP → RAIN → network connections (Ethernet, Myrinet, ATM, ServerNet)
ServerNet: interconnect with error correction. ATM: Asynchronous Transfer Mode protocol. PVM: Parallel Virtual Machine. MPI: Message Passing Interface.

9 RAIN The RAIN layer sits between TCP/IP and the network connections (Ethernet, Myrinet, ATM, ServerNet). It provides a global state-sharing protocol, can be used on different platforms, provides group communication, and manages IP addresses: assigning and releasing addresses from a pool of IPs.

10 Myrinet NIC card; Myrinet switch: 10 Gb/s

11 Features of RAIN Communication: bundled interfaces, link monitoring, fault-tolerant interconnect topology
Group membership: identifies the healthy nodes that are participating in the cluster
Data storage

12 RAIN platform Heterogeneous network of nodes and switches
Computing/storage nodes with multiple network interfaces, connected by a variety of network types (bus and switched).

13 RAIN testbed Myrinet switches, 10 Pentium boxes with multiple NICs, Myrinet & Ethernet.
Come by and see our RAIN demos!

14 Proof of Concept: Video Server
Video client & server on every node. [Diagram: nodes A–D connected through two switches.]

15 DATA STORED IN DISKS Insufficient storage to replicate all the data on each node.

16 Node Failure

17 Node Failure

18 Node Failure Dynamically switch to another node.

19 k-of-n Code Erasure-correcting code: each of the four columns stores one data symbol and one parity symbol: (a, d+c), (b, d+a), (c, a+b), (d, b+c)
Data can be recovered from any k of the n columns, e.g. b = (a+b) + a and d = (d+c) + c (addition is XOR).
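The slide's layout can be checked mechanically. Below is a hedged sketch (my own code, not the project's, with "+" implemented as per-bit XOR) that encodes four symbols into the four columns shown and recovers all of them from any two surviving columns:

```python
from itertools import product

# Column i stores (data symbol, parity of two other symbols), as on the slide:
# col0=(a, d^c), col1=(b, d^a), col2=(c, a^b), col3=(d, b^c).
PARITY = [(3, 2), (3, 0), (0, 1), (1, 2)]   # indices XORed into each parity

def encode(symbols):
    return [(symbols[i], symbols[j] ^ symbols[k])
            for i, (j, k) in enumerate(PARITY)]

def decode(available):
    """Recover (a, b, c, d) from any 2 of the 4 columns.

    available: dict {column index: (data, parity)}. Works bit by bit:
    for each bit position, try all 16 assignments and keep the unique
    one consistent with the surviving columns.
    """
    out = [0, 0, 0, 0]
    for bit in range(8):                     # one byte per symbol
        for cand in product((0, 1), repeat=4):
            ok = all(cand[i] == (d >> bit) & 1 and
                     (cand[j] ^ cand[k]) == (p >> bit) & 1
                     for i, (d, p) in available.items()
                     for j, k in [PARITY[i]])
            if ok:
                for i in range(4):
                    out[i] |= cand[i] << bit
                break
    return out

cols = encode([0x41, 0x42, 0x43, 0x44])      # a, b, c, d
# Lose columns 1 and 3; recover everything from columns 0 and 2.
assert decode({0: cols[0], 2: cols[2]}) == [0x41, 0x42, 0x43, 0x44]
```

Each column carries one data symbol plus one parity symbol, so the code stores 4 symbols in 8, and the chain of overlapping parities lets any two columns determine the rest.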

20 Link Failure

21 Link Failure

22 Link Failure Dynamically switch to another network path.

23 Switch Failure

24 Switch Failure

25 Switch Failure Dynamically switch to another network path.

26 Node Recovery

27 Node Recovery Continuous reconfiguration (e.g., load-balancing).

28 Fault-tolerant interconnect technologies
Goal: connect compute nodes to a network of switches so as to maximize the network's resistance to partitioning. How do you connect n nodes to a ring of n switches?

29 A ring of switches: a naïve solution
Given degree-2 nodes and degree-4 switches, how do we connect them?

30 A ring of switches: a naïve solution

31 A ring of switches: the naïve solution is easily partitioned.

32 resistance to partitioning
With switches in a ring, make the connections as non-local as possible: place nodes on diagonals.

33 resistance to partitioning
With switches in a ring, make the connections as non-local as possible: place nodes on diagonals.

34 resistance to partitioning
Diagonal construction with degree-2 compute nodes and degree-4 switches: tolerates any 3 switch failures, which is optimal; for degrees 2 and 4, no construction can tolerate more switch failures while avoiding partitioning. The construction generalizes to arbitrary node/switch degrees and to other switch networks, in particular a fully connected network.
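One way to see why non-local attachment matters is to brute-force switch failures and test whether the surviving nodes stay connected. The checker below is my own sketch (the graph model and names are assumptions, not the project's code); it confirms that the naïve layout, with each node on two adjacent switches, is partitioned by as few as two switch failures:

```python
from itertools import combinations

def is_partitioned(n_switches, attach, dead_switches):
    """True if the surviving compute nodes split into >1 component.

    Switches form a ring; node i is attached to the switches in attach[i].
    A node whose switches are all dead counts as failed, not partitioned.
    """
    dead = set(dead_switches)
    adj = {}
    def link(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    for s in range(n_switches):                 # ring edges between live switches
        t = (s + 1) % n_switches
        if s not in dead and t not in dead:
            link(('s', s), ('s', t))
    live_nodes = []
    for i, switches in enumerate(attach):       # node-to-switch edges
        alive = [s for s in switches if s not in dead]
        if alive:
            live_nodes.append(('n', i))
            for s in alive:
                link(('n', i), ('s', s))
    if len(live_nodes) <= 1:
        return False
    seen, frontier = {live_nodes[0]}, [live_nodes[0]]   # BFS/DFS from one node
    while frontier:
        u = frontier.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                frontier.append(v)
    return any(n not in seen for n in live_nodes)

def tolerates(n_switches, attach, f):
    """No set of up to f switch failures partitions the live nodes."""
    return not any(
        is_partitioned(n_switches, attach, dead)
        for k in range(1, f + 1)
        for dead in combinations(range(n_switches), k)
    )

# Naïve layout: node i on adjacent switches i and i+1. Easily partitioned.
naive = [(i, (i + 1) % 8) for i in range(8)]
assert not tolerates(8, naive, 2)
```

Killing switches 0 and 2, for example, isolates switch 1 together with the two nodes attached only to it, which is exactly the "easily partitioned" behavior the earlier slide points out.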

35 Point-to-Point Connectivity
Is the path from node A to node B up or down? Each node has only a local view of the network: what node A sees.

36 Connectivity Bi-directional communication.
Each link is seen as up or down by each node; the model is bi-directional, i.e. up/down with respect to both sends and receives. The mechanism used is pings and time-outs: each node sends out pings, and a node may time out, deciding the link is down.
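The ping/time-out mechanism can be sketched as a small monitor. This is an illustrative simplification with an injected clock (names and the timeout value are my own, not the RAIN implementation):

```python
class LinkMonitor:
    """Per-link up/down state from pings and time-outs (toy sketch)."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heard = {}                  # peer -> time of last ping heard

    def heard_from(self, peer, now):
        """Record a ping (or ping reply) received from peer."""
        self.last_heard[peer] = now

    def status(self, peer, now):
        """'up' if we heard from peer within the timeout, else 'down'."""
        last = self.last_heard.get(peer)
        if last is None or now - last > self.timeout:
            return 'down'
        return 'up'

mon = LinkMonitor(timeout=3)
mon.heard_from('B', now=10)
assert mon.status('B', now=12) == 'up'        # within the timeout
assert mon.status('B', now=14) == 'down'      # timed out: link seen as down
mon.heard_from('B', now=15)
assert mon.status('B', now=16) == 'up'        # link seen as up again
```

Since each endpoint runs its own monitor, the two ends of a link can briefly disagree, which is exactly why the group membership layer is needed to turn these local views into a consistent global one.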

37 group membership A consistent global view, given only local, point-to-point connectivity information. Must cope with link/node failures and dynamic reconfiguration. Crucial for distributed services and computation: who's alive? Who's performing what?

38 group membership Token-Ring based Group Membership Protocol
Our solution is simple yet effective: begin with an initial ordering of the nodes into a ring.

39 group membership Token-Ring based Group Membership Protocol
A token carrying the group membership list and a sequence number circulates the ring A → B → C → D; the sequence number is updated at every node. The token starts at A as "1: ABCD".

40 group membership Token-Ring based Group Membership Protocol
A records sequence number 1 and passes the token ("1: ABCD") to B.

41 group membership Token-Ring based Group Membership Protocol
B updates the token to "2: ABCD", records sequence number 2, and passes it on.

42 group membership Token-Ring based Group Membership Protocol
C updates the token to "3: ABCD" and records sequence number 3.

43 group membership Token-Ring based Group Membership Protocol
D updates the token to "4: ABCD" and records sequence number 4.

44 group membership Token-Ring based Group Membership Protocol
The token returns to A, which records sequence number 5.

45 group membership Node or link fails: node B stops responding (A: 5, B: 2, C: 3, D: 4).

46 group membership Node or link fails: B's state is no longer visible to the ring.

47 group membership Node or link fails: A tries to pass the token to B and times out.

48 group membership Node or link fails: A retries and times out again.

49 group membership Node or link fails: if a node is inaccessible,
it is excluded and bypassed. A removes B from the membership list and sends the token "5: ACD" to C.

50 group membership C updates the token to "6: ACD" and records sequence number 6.

51 group membership D receives the token and records sequence number 7.

52 group membership The token continues around the reduced ring A → C → D.

53 group membership Node with token fails: the node currently holding the token fails.

54 group membership Node with token fails: the token is lost with it.

55 group membership Node with token fails: the other nodes time out waiting for the token.

56 group membership Node with token fails: if the token is lost,
it is regenerated.

57 group membership Node with token fails: the surviving nodes each regenerate a token.

58 group membership Two tokens are regenerated: "5: ACD" and "6: AD".

59 group membership When regenerated tokens compete, the highest sequence number prevails: "6: AD" wins over "5: ACD".

60 group membership The winning token continues to circulate; A records sequence number 7.

61 group membership Node recovers: a previously failed node comes back up.

62 group membership Node recovers: recovering nodes are added back into the ring.

63 group membership The recovering node is inserted into the membership list; the token becomes "7: ADC".

64 group membership The token "8: ADC" circulates and nodes record sequence number 8.

65 group membership The token "9: ADC" continues around the restored ring.

66 group membership Sequence numbers keep advancing (10) as the ring operates normally.

67 group membership Features: dynamic reconfiguration as nodes fail and recover.
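The whole walkthrough above (bump the sequence number at every live node, exclude unreachable nodes, and resolve competing regenerated tokens by highest sequence number) can be sketched in a few lines. This is my own toy simplification, not the project's protocol code:

```python
# Toy pass of the token-ring group membership protocol sketched above.

def circulate(alive, token):
    """Pass the token once around the ring, bypassing dead nodes.

    token is (sequence number, membership list). Each live node bumps
    the sequence number; unreachable nodes are excluded from membership.
    """
    seq, members = token
    for node in list(members):
        if node in alive:
            seq += 1                                      # updated at every node
        else:
            members = [m for m in members if m != node]   # exclude & bypass
    return seq, members

token = (1, ['A', 'B', 'C', 'D'])
# B fails: one circulation excludes it while the sequence number advances.
token = circulate(alive={'A', 'C', 'D'}, token=token)
assert token == (4, ['A', 'C', 'D'])

# Token lost (its holder died): survivors regenerate candidate tokens;
# the one with the highest sequence number prevails.
regenerated = [(4, ['A', 'C', 'D']), (5, ['A', 'D'])]
assert max(regenerated) == (5, ['A', 'D'])
```

Comparing regenerated tokens by sequence number works because the number is bumped at every hop, so the token seen most recently always carries the freshest membership view.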

68 distributed storage [Diagram: an array of disks across the cluster.]

69 distributed storage Focus: reliability and performance.
[Diagram: encoded data bits striped across the disks.]

70 RAIN project: goals Identify and develop the key building blocks for efficient, reliable distributed computing and storage systems: networks, communication, storage, and APIs. Aims: reliability, performance, functionality, and distributed solutions that work.

71 advantages There is no limit on the size of a RAIN cluster.
There is no concept of a master-slave relation. A RAIN cluster can tolerate multiple node failures. It is highly efficient in traffic management. New nodes can be added to the cluster to participate in load sharing.

72 Conclusion Applications demonstrated: high-availability video server, high-availability web server,
and a distributed checkpointing mechanism.

73 Thank you Any queries?

