1
Application Layer Multicast for Earthquake Early Warning Systems
Valentina Bonsi - April 22, 2008
2
Agenda
- Architecture of an Earthquake Early Warning System
- Crisis Alert & Early Warning dissemination
- Application Layer Multicast implementation
- Enhancing reliability of ALM
3
Earthquake Early Warning System
- Sensors (seismometers, etc.) feed an earthquake assessment center, which
  1) determines the magnitude and location of the event
  2) predicts the peak ground motion expected in the region around the event
  This part is the focus of the ElarmS system (www.elarms.com)
- The Crisis Alert system disseminates the warning to the recipients: schools, municipalities, transportation authorities, companies, and the public
4
Crisis Alert – current implementation
- Notification can be triggered automatically when a message from the earthquake assessment center is received
  - Dedicated engine to process early warning policies
  - No human interaction required
- Dissemination modalities:
  - CREW protocol
  - Application Layer Multicast (Scribe on Pastry, http://research.microsoft.com/~antr/scribe/)
- Organization-based dissemination: no dissemination to the public
5
Application Layer Multicast – Scribe implementation
Layering, from bottom to top:
- Internet
- Pastry: object location and routing substrate for P2P systems
- Scribe: application-level multicast infrastructure
- Application level
6
Pastry
- Pastry node: a device connected to the Internet and running the Pastry node software, assigned a unique 128-bit nodeId
- Node state (see the sketch below):
  - Routing table: log_{2^b} N rows and 2^b - 1 columns; the entries in the i-th row share the first i digits with the current node, and each entry is the closest existing node with the appropriate prefix
  - Leaf set: the L/2 nodes with the numerically closest larger nodeIds and the L/2 nodes with the numerically closest smaller nodeIds
  - Neighborhood set: the M nodes that are physically closest to the local node
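A minimal sketch of this node state, assuming b = 4 and leaf/neighborhood set sizes of 16; the class and field names are illustrative, not FreePastry's API:

```python
# Minimal sketch of Pastry node state (assumptions: b = 4, L = M = 16;
# names are illustrative, not FreePastry's API).
from dataclasses import dataclass, field

B = 4                     # digits are in base 2^b (hexadecimal for b = 4)
ID_BITS = 128
DIGITS = ID_BITS // B     # 32 hex digits per nodeId
L = 16                    # leaf set size
M = 16                    # neighborhood set size

@dataclass
class PastryNodeState:
    node_id: int
    # routing table: DIGITS rows; row i holds, per digit value, a nodeId that
    # shares the first i digits with us (the slot for our own digit is unused)
    routing_table: list = field(default_factory=lambda:
        [[None] * (2 ** B) for _ in range(DIGITS)])
    leaf_set_smaller: list = field(default_factory=list)  # L/2 closest smaller ids
    leaf_set_larger: list = field(default_factory=list)   # L/2 closest larger ids
    neighborhood_set: list = field(default_factory=list)  # M physically closest nodes

def digit(node_id: int, i: int) -> int:
    """Return the i-th most significant base-2^b digit of a 128-bit id."""
    shift = ID_BITS - B * (i + 1)
    return (node_id >> shift) & (2 ** B - 1)
```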
7
Pastry: routing
Algorithm (a sketch follows below):
1. If the message key falls within the range of the leaf set, forward the message to the numerically closest leaf, i.e. the destination
2. Otherwise, look in the routing table for a node that shares one more digit with the key than the current node does, and forward the message to it
3. If no such entry exists, forward the message to a known node that shares at least as many digits with the key as the current node but is numerically closer to the key
Properties:
- The algorithm always converges
- Number of routing steps: expected log_{2^b} N; worst case (many simultaneous node failures): ~N
- Delivery is guaranteed unless L/2 nodes with consecutive nodeIds fail simultaneously
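A sketch of that routing decision, building on the node-state sketch above; the fallback case (step 3) is simplified and the helper names are mine, not Pastry's:

```python
# Sketch of the Pastry next-hop decision (builds on PastryNodeState above).
def shared_prefix_len(a: int, b: int) -> int:
    i = 0
    while i < DIGITS and digit(a, i) == digit(b, i):
        i += 1
    return i

def next_hop(state: PastryNodeState, key: int):
    """Return the nodeId to forward to, or None if this node is the destination."""
    leaf_set = state.leaf_set_smaller + [state.node_id] + state.leaf_set_larger
    # 1) key within the leaf set range: deliver to the numerically closest leaf
    if min(leaf_set) <= key <= max(leaf_set):
        closest = min(leaf_set, key=lambda n: abs(n - key))
        return None if closest == state.node_id else closest
    # 2) routing table entry sharing one more digit with the key
    p = shared_prefix_len(state.node_id, key)
    entry = state.routing_table[p][digit(key, p)]
    if entry is not None:
        return entry
    # 3) rare case: any known node with an equally long shared prefix that is
    #    numerically closer to the key than we are
    known = [n for row in state.routing_table for n in row if n is not None] + leaf_set
    candidates = [n for n in known
                  if shared_prefix_len(n, key) >= p and abs(n - key) < abs(state.node_id - key)]
    return min(candidates, key=lambda n: abs(n - key)) if candidates else None
```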
8
Pastry: self-organization & adaptation
Node arrival:
- The new node X must know a nearby Pastry node A
- X asks A to forward a JOIN message with key X; it is routed to Z, the node with the nodeId numerically closest to X
- All nodes that have seen the JOIN send their state tables to X
- X initializes its state with the collected information (leaf set from Z, neighborhood set from A)
- X notifies every node in its state tables
Node departure:
- Leaf set failure, detected during routing (sketch below): contact the live leaf with the largest index on the side of the failed node (smaller or larger than the local node), and choose the new leaf from that node's leaf set
- Neighborhood set failure, detected during periodic contact: ask the other members of the neighborhood set for their sets and add the closest new neighbor
- Routing table failure, detected during routing: contact other entries in the same routing-table row and ask for the node in the same position as the failed one; if needed, continue the search among the nodes in the next routing-table row
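A sketch of the leaf-set repair step, reusing the node state above; fetch_leaf_set is a hypothetical "send me your leaf set" remote call, not part of Pastry's API:

```python
# Sketch of leaf-set repair after detecting a failed leaf (fetch_leaf_set is a
# hypothetical RPC returning the contacted node's leaf set as a list of ids).
def repair_leaf_set(state: PastryNodeState, failed: int, fetch_leaf_set):
    # work on the half of the leaf set on the failed node's side
    side = state.leaf_set_smaller if failed < state.node_id else state.leaf_set_larger
    if failed in side:
        side.remove(failed)
    if not side:
        return
    # contact the live leaf with the largest index on that side,
    # i.e. the one farthest from the local node
    extreme = min(side) if failed < state.node_id else max(side)
    # choose a replacement we do not already know from that node's leaf set
    for candidate in fetch_leaf_set(extreme):
        if candidate not in side and candidate not in (state.node_id, failed):
            side.append(candidate)
            break
    side.sort()
```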
9
Scribe
- Event notification infrastructure for topic-based publish-subscribe applications
- Scribe system: a network of Pastry nodes running the Scribe software
- Topic creation: a node sends a CREATE message with the topicId as destination; Pastry delivers it to the node whose nodeId is numerically closest to the topicId (the rendez-vous point)
10
Scribe: membership management
Node subscription:
- The node sends a SUBSCRIBE message with key = topicId
- Pastry routes the message towards the rendez-vous point, but each node along the route (see the sketch below):
  - checks whether it is already a forwarder for the topic
  - if not, adds the topic to its list and sends its own SUBSCRIBE message for the same topic
  - adds the sender to its children list for the topic and stops the original message
Node unsubscription:
- The node unsubscribes locally and checks whether it has children for the topic
- If not, it sends an UNSUBSCRIBE message to its parent (possible only after having received at least one event from the parent)
Disseminating a message:
- If the publisher knows the rendez-vous point's IP address, it sends a PUBLISH message to it directly
- Otherwise it uses Pastry to locate it, sending a PUBLISH message with the topicId as key
- The rendez-vous node disseminates the message along the tree
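A sketch of how a node along the route could handle a SUBSCRIBE, and how a forwarder relays a published event; route_towards and send_to stand in for Pastry routing and direct delivery and are not Scribe's actual API:

```python
# Sketch of a Scribe forwarder (illustrative names; route_towards and send_to
# are assumed primitives for Pastry routing and direct node-to-node delivery).
from collections import defaultdict

class ScribeNode:
    def __init__(self, node_id, route_towards, send_to):
        self.node_id = node_id
        self.route_towards = route_towards   # route a message with a given key
        self.send_to = send_to               # send a message to a known node
        self.children = defaultdict(set)     # topic_id -> child node ids

    def on_subscribe(self, topic_id, sender_id):
        already_forwarder = topic_id in self.children
        # absorb the message: the sender becomes our child for this topic
        self.children[topic_id].add(sender_id)
        # if we were not yet a forwarder, graft ourselves onto the tree by
        # subscribing towards the rendez-vous point in turn
        if not already_forwarder:
            self.route_towards(topic_id, ("SUBSCRIBE", topic_id, self.node_id))

    def on_event(self, topic_id, event):
        # dissemination along the tree: relay the event to all children
        for child in self.children[topic_id]:
            self.send_to(child, ("EVENT", topic_id, event))
```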
11
Scribe: reliability
- Scribe provides reliable and ordered delivery as long as the TCP connections between the nodes in the tree do not fail
Repairing the multicast tree (a sketch follows below):
- Children periodically confirm their interest in the topic by sending a message to their parents
- Parents send heartbeats to their children; when a child stops receiving heartbeats, it sends a new SUBSCRIBE message with the topicId as key, and Pastry routes it to the rendez-vous point, rebuilding the tree
Root failure:
- The root's state is replicated on the k closest nodes
- When the children detect the failure, a new tree is built with one of these nodes as root
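A sketch of the child-side repair timer; the timeout value and callback names are assumptions made for this example, not part of the Scribe design:

```python
# Sketch of heartbeat-based tree repair on the child side (the 30 s timeout
# and the resubscribe callback are assumptions made for this example).
import time

class TreeRepair:
    HEARTBEAT_TIMEOUT = 30.0   # seconds without a parent heartbeat before repair

    def __init__(self, topic_id, resubscribe):
        self.topic_id = topic_id
        self.resubscribe = resubscribe        # sends SUBSCRIBE with key = topic_id
        self.last_heartbeat = time.monotonic()

    def on_heartbeat(self):
        self.last_heartbeat = time.monotonic()

    def check(self):
        # called periodically: if the parent looks dead, re-route a SUBSCRIBE
        # towards the rendez-vous point so Pastry grafts us onto a live branch
        if time.monotonic() - self.last_heartbeat > self.HEARTBEAT_TIMEOUT:
            self.resubscribe(self.topic_id)
            self.last_heartbeat = time.monotonic()
```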
12
Scribe: experimental evaluation
Setup:
- Simulator that models propagation delay but not queuing, packet losses, or cross traffic
- 100,000 nodes; 1,500 groups of varying size (11 to 100,000 members)
Delay penalty (compared with IP multicast):
- Max delay penalty: 1.69 for 50% of groups, 4.26 at most
- Average delay penalty: 1.68 for 50% of groups, 2 at most
Node stress:
- Mean number of children tables per node: 2.4 (max 40)
- Mean number of entries per node: 6.2 (max 1059)
Link stress (compared with IP multicast):
- Mean number of messages per link: Scribe 2.4, IP multicast 0.7
- Max number of messages per link: Scribe 4031, IP multicast 950
Scalability with many small groups (50,000 nodes, 30,000 groups of 11 nodes each):
- Mean number of entries per node: 21.2 (naïve multicast: 6.6)
- The problem is that many long paths with no branches are created; an algorithm collapses the tree by removing nodes that are not members and have only one child per table
13
Reliability of ALM
- In the ALM implementation we consider, the nodes that subscribe to a topic are organized in a tree
- If one of the nodes fails, the whole sub-tree rooted at that node is unreachable until the tree is rebuilt
- It is therefore impossible to guarantee that all nodes receive the early warning: rebuilding the tree can take a few seconds, which can be too late
- Idea: build a graph instead of a tree, in which each node's subscription is maintained by k nodes
- Assuming a single-failure model and k = 2, when a node fails the sub-tree rooted at it is still notified by the other node that maintains those subscriptions, while the graph is repaired
14
Membership management
Node subscription (a sketch follows below):
- The node sends one SUBSCRIBE message to the topicId and waits for k SUBSCRIBE ACKs
- The node that receives the SUBSCRIBE performs the Scribe subscription and disseminates k-1 SUBSCRIBE messages to its replicas
- The subscribing node can receive:
  - SUBSCRIBE ACK: add the new parent
  - SUBSCRIBE FAILED: it is impossible to add another parent in the current situation (retry when a node joins or leaves)
  - SUBSCRIBE LOST: re-send the SUBSCRIBE message
Node unsubscription:
- Same as Scribe, but the UNSUBSCRIBE message is sent to all parents
Message dissemination:
- Same as Scribe, but each node maintains a cache of sent messages in order to avoid re-multicasting
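A sketch of the subscriber side of this k-parent scheme; the message names follow the slide, while the class name and the send_subscribe callback are illustrative:

```python
# Sketch of the subscriber in the k-parent scheme (send_subscribe is an
# assumed primitive that routes a SUBSCRIBE message with key = topic_id).
class GraphSubscriber:
    def __init__(self, topic_id, k, send_subscribe):
        self.topic_id = topic_id
        self.k = k
        self.send_subscribe = send_subscribe
        self.parents = set()

    def subscribe(self):
        # one SUBSCRIBE; the receiving node fans out k-1 copies to its replicas
        self.send_subscribe(self.topic_id)

    def on_reply(self, kind, parent_id=None):
        if kind == "SUBSCRIBE_ACK":
            self.parents.add(parent_id)         # gained one of the k parents
        elif kind == "SUBSCRIBE_FAILED":
            pass                                # retry later, when membership changes
        elif kind == "SUBSCRIBE_LOST":
            self.send_subscribe(self.topic_id)  # re-send the subscription

    def fully_subscribed(self) -> bool:
        return len(self.parents) >= self.k
```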
15
Multicast graph maintenance
Repairing the multicast graph (a sketch follows below):
- Children periodically check that their parents are alive and confirm their interest in the topic by sending a SUBSCRIPTION CONFIRM message
- When a child detects a parent's failure, it sends a new SUBSCRIBE message
Root failure:
- The root's state is replicated on the k closest nodes
- If the root fails, messages with key = topicId are routed to one of the root replicas (because it now has the id closest to the topic)
- The children of the root detect the failure and send a new SUBSCRIBE message
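A sketch of that periodic check, reusing the GraphSubscriber sketch above; is_alive and send_confirm are assumed primitives, not part of the design described on the slide:

```python
# Sketch of a child's periodic parent maintenance in the graph-based scheme
# (is_alive and send_confirm are assumed primitives, not part of Scribe).
def maintain_parents(subscriber: GraphSubscriber, is_alive, send_confirm):
    dead = {p for p in subscriber.parents if not is_alive(p)}
    for parent in subscriber.parents - dead:
        # confirm continued interest in the topic to each live parent
        send_confirm(parent, subscriber.topic_id)
    if dead:
        subscriber.parents -= dead
        # replace the lost parents by re-subscribing towards the rendez-vous point
        subscriber.send_subscribe(subscriber.topic_id)
```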
16
Graph-based ALM: issues
- The dissemination path can contain loops, so sent messages are cached (a sketch follows below); the cache should key each message by a unique id = hash(source + message id)
- How to choose the replica nodes:
  - Pastry routing delivers a message to the node with the id closest to the message destination, so each subscription is maintained in the node that would have been the parent in the tree structure Scribe generates, and in the k-1 nodes with ids closest to it
  - The root is replicated in the same way: if it fails, one of the k-1 replicas becomes the new root and already has the list of subscribed nodes
- The position of the root is dynamic: every time a new node is added, it can become the new rendez-vous point if its id is the closest to the topicId
  - Issues can arise in this case because potentially only the old rendez-vous point will subscribe to the new one; if the old rendez-vous point then fails, dissemination stops
  - Two possible solutions:
    - Impose additional constraints on the fan-out of nodes (h children)
    - Fix the rendez-vous point to be a "stable" node
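A sketch of the duplicate-suppression cache; the slide only fixes the key as hash(source + id), so the hash function and the eviction policy here are arbitrary choices:

```python
# Sketch of the sent-message cache that breaks dissemination loops
# (SHA-1 and the FIFO eviction policy are arbitrary choices for this example).
import hashlib

class SeenCache:
    def __init__(self, max_entries=10000):
        self.max_entries = max_entries
        self.seen = set()
        self.order = []                      # insertion order, for FIFO eviction

    @staticmethod
    def key(source_id: str, msg_id: str) -> str:
        return hashlib.sha1(f"{source_id}:{msg_id}".encode()).hexdigest()

    def should_forward(self, source_id: str, msg_id: str) -> bool:
        k = self.key(source_id, msg_id)
        if k in self.seen:
            return False                     # already multicast: drop to break the loop
        self.seen.add(k)
        self.order.append(k)
        if len(self.order) > self.max_entries:
            self.seen.discard(self.order.pop(0))
        return True
```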
17
Testing plan
Reliability:
- Message dissemination delay in normal conditions and with failures
- Failure models: independent failures, geographical failures
Scalability:
- The set of EEW receivers includes schools, transportation companies, and cities
- Test with an increasing number of receivers for each organization