1
Corona: A High Performance Publish-Subscribe System for the World Wide Web. Cornell University, Ithaca, NY. Networked Systems Design and Implementation (NSDI), May 2006.
2
Motivation: Web content changes rapidly. Frequently updated content is growing in popularity: weblogs, wikis, news sites. Existing Web protocols do not provide a mechanism for automatically notifying users of updates.
3
State of the Art: Uncoordinated polling tools, e.g. micronews syndication tools such as RSS readers. They are based on naïve repeated polling and suffer from poor performance and scalability: subscribers are tempted to poll at faster rates to detect updates quickly; different subscribers repeatedly and independently poll for the same content; content providers have to handle the resulting high bandwidth load.
4
Background and Related Work: Publish-Subscribe Systems. Can be classified as topic based or content based. Prior research focused on content filtering and event delivery mechanisms. Main drawback: incompatibility with the current Web architecture.
5
Background and Related Work: Publish-Subscribe Systems. Topic-based systems are based on several decentralized mechanisms: group communication (Isis); shared object spaces (Linda, TSpaces, JavaSpaces); rendezvous points (TIBCO, Herald). Content-based publish-subscribe systems that use in-network content filtering and aggregation: SIENA, Gryphon, Elvin, Astrolabe.
6
Background and Related Work: Micronews Systems. Micronews feeds are short descriptions of frequently updated information in XML-based formats such as RSS and Atom, accessed via HTTP through URLs and supported by feed readers. Commercial services have started disseminating micronews updates to users (Bloglines, NewsGator, Queoo); they use fragile servers and relentless polling. FeedTree is a recent system for micronews dissemination that uses a structured overlay and cooperative polling, and shares updates between peers. CAM and WIC use resource allocation techniques similar to Corona's, but are limited to a single node.
7
Corona: Cornell Online News Aggregator. A topic-based publish-subscribe system for the Web. Interoperates with the current pull-based architecture of the Web. URLs of Web content serve as topics or channels. Any Web object identifiable by a URL can be monitored with Corona.
8
Corona Architecture
9
Uses a structured overlay network (Pastry), which provides decentralization, good failure resilience, and high scalability. The key feature that enables Corona to achieve fast update detection is cooperative polling: multiple nodes are assigned to periodically poll the same channel, and updates detected by any polling node are shared. The number of nodes that poll each channel is determined by analyzing the tradeoff between update performance and network load. Corona poses this tradeoff as an optimization problem.
10
Pastry: The network is organized into a ring. Each node is assigned an id from a circular numeric space. The ids are treated as a sequence of k digits of base b (b is a power of 2). Routing uses pointers to neighbors in each direction plus “long distance” contacts: the entry in the i’th row and j’th column of a node's routing table points to a node whose id shares i prefix digits with the present node and whose (i + 1)th digit is j.
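The routing-table rule above can be sketched in a few lines. This is a minimal illustration, not Pastry's implementation: the helper names `shared_prefix_len` and `routing_slot` are hypothetical, ids are assumed to be equal-length hexadecimal strings (b = 16), and the sample ids are borrowed from the routing example later in the deck.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two ids have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def routing_slot(own_id: str, other_id: str, base: int = 16):
    """Return the (row, column) of the routing-table entry that other_id
    could fill in own_id's table, or None when the ids are identical.

    Row i    = number of shared prefix digits.
    Column j = value of other_id's (i + 1)th digit.
    """
    i = shared_prefix_len(own_id, other_id)
    if i == len(own_id):
        return None
    return i, int(other_id[i], base)


# Shares the prefix "65a"; the next digit of the other id is 2.
print(routing_slot("65a1fc", "65a2b0"))
# Shares no prefix; the first digit of the other id is d (13).
print(routing_slot("65a1fc", "d46a1c"))
```

Each hop in prefix routing moves to a node found through such an entry, so the shared prefix with the destination grows by at least one digit per hop.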
11
Pastry: Routing table of a Pastry node with NodeId = 65a1x, b = 16, where x stands for the remaining digits. Each entry is chosen so that the node it points to is the closest to the present node according to a scalar proximity metric, such as the round-trip time.
12
Pastry Routing a message from node 65a1fc to node d46a1c
13
Analytical Modeling: Pastry. The NodeId space can be viewed as “level sectors”: the nodes in a sector of level l share the first l digits of the NodeId. Example with b=2, k=4 [figure: ring showing the ids 0000, 0010, 0100, 0110, 0111, 1000, 1010, 1011, 1101, 1111, with nested sectors labeled l=0 through l=4].
14
Pastry: This way, the nodes in the i’th row of a routing table are in the same sector of level i as the present node. If the nodes are uniformly distributed, only $\log_b N$ levels of the routing table are populated (the smallest sector is of level $\log_b N$). Any node can be reached in $\log_b N$ hops.
15
The Polling Scheme: Corona assigns nodes in well-defined wedges of the Pastry ring to poll each channel. Each channel is assigned an id from the same circular numeric space. A channel with polling level l is polled by all nodes with at least l matching prefix digits in their ids.
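The wedge rule can be illustrated with a short sketch. The function names are hypothetical, and the ids are borrowed from the b=2, k=4 example earlier in the deck; treating 0110 as both a node id and the channel id is an assumption made only for this illustration.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two ids have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def pollers(channel_id: str, node_ids, level: int):
    """Nodes whose ids match at least `level` prefix digits of the channel id."""
    return [n for n in node_ids if shared_prefix_len(n, channel_id) >= level]


# Hypothetical population: ids from the b=2, k=4 example.
nodes = ["0000", "0010", "0100", "0110", "0111",
         "1000", "1010", "1011", "1101", "1111"]
channel = "0110"  # assumed channel id

for level in range(5):
    print(level, pollers(channel, nodes, level))
```

Lowering the level by one widens the wedge by one prefix digit, so the set of pollers at level l always contains the set at level l + 1.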
16
The Polling Scheme Any problems with the scheme? Polling levels selection What about topology?
17
The Polling Scheme Cooperative polling in Corona
18
Analytical Modeling: The polling level of a channel quantifies its performance-overhead tradeoff. A channel at level $l$ has, on average, $N/b^l$ nodes polling it, which can cooperatively detect updates in time $\frac{\tau}{2}\cdot\frac{b^l}{N}$ on average, where $\tau$ is the polling interval. The collective load placed on the content server of the channel is proportional to the number of polling nodes, $N/b^l$.
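A small numeric sketch of this tradeoff follows. It assumes a channel at level l is polled by about N/b^l nodes whose polls are evenly spread over the polling interval τ, giving an expected detection time of (τ/2)·(b^l/N). The function names are hypothetical; N = 1024 and τ = 30 minutes are taken from the simulation setup in this deck, and b = 16 is an assumption.

```python
def expected_pollers(n_nodes: int, base: int, level: int) -> float:
    """Average number of nodes polling a channel at the given level."""
    return n_nodes / base ** level


def expected_detection_time(n_nodes: int, base: int, level: int,
                            poll_interval: float) -> float:
    """Average update detection time under cooperative polling."""
    return (poll_interval / 2) * base ** level / n_nodes


N, b, tau = 1024, 16, 30.0  # nodes, digit base (assumed), minutes

for level in range(3):
    print(level,
          expected_pollers(N, b, level),
          expected_detection_time(N, b, level, tau))
```

Each step down in level multiplies the number of pollers (and the server load) by b and divides the expected detection time by b.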
19
Performance-overhead tradeoff approaches: Corona-Lite. Performance goal: minimize the average update detection time while bounding the network load on content servers. The overall update performance is the average of the update detection time of each channel, weighted by the number of clients subscribed to the channel. The target network load is the total number of subscriptions in the system. Notation: $M$ = number of channels; $N$ = number of nodes; $b$ = base of the structured overlay; $T$ = performance target; $l_i$ = polling level of channel i; $q_i$ = number of clients for channel i; $s_i$ = content size for channel i; $u_i$ = update interval for channel i.
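As a hedged sketch, the Corona-Lite goal can be written as an optimization problem in the notation above. It assumes a channel at level $l_i$ is polled by about $N/b^{l_i}$ nodes, so that its detection time scales with $b^{l_i}/N$ and its server load with $s_i N/b^{l_i}$; the content-size weighting on both sides of the constraint is an assumption.

```latex
\begin{aligned}
\min_{l_1,\dots,l_M}\quad & \sum_{i=1}^{M} q_i \,\frac{b^{l_i}}{N} \\
\text{subject to}\quad & \sum_{i=1}^{M} s_i \,\frac{N}{b^{l_i}} \;\le\; \sum_{i=1}^{M} q_i\, s_i
\end{aligned}
```

The right-hand side is the load the subscribers would impose by polling on their own; with all $s_i = 1$ it reduces to the total number of subscriptions.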
20
Performance-overhead tradeoffs: Is the average a meaningful metric? Other suggestions? User-effected weights. Update detection time: what about the propagation to the user? Comments on the selection of the polling interval? Other parameters that could be accounted for? User inputs to tackle stickiness, 24-hour polling.
21
Performance-overhead tradeoff approaches: Corona-Lite. Clients of popular channels gain greater benefits than clients of less popular channels. Yet, Corona-Lite does not suffer from “diminishing returns”. Corona-Lite performance can vary depending on the current workload.
22
Performance-overhead tradeoff approaches: Corona-Fast. Corona-Fast provides stable update performance, maintained steadily at a desired level through changes in the workload. It minimizes the total network load on the content servers while meeting a target average update detection time. Notation: $M$ = number of channels; $N$ = number of nodes; $b$ = base of the structured overlay; $T$ = performance target; $l_i$ = polling level of channel i; $q_i$ = number of clients for channel i; $s_i$ = content size for channel i; $u_i$ = update interval for channel i.
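To make the Corona-Fast goal concrete, here is a brute-force sketch on a tiny, made-up instance. It is not how Corona solves the problem (the deck attributes that to Honeycomb); it only enumerates all level assignments. Assumptions: server load of channel i is s_i·N/b^l, detection delay is normalized to b^l/N (so a single poller gives 1.0), and the target T is expressed in the same normalized units. The function name and the instance values are hypothetical.

```python
from itertools import product


def solve_corona_fast(q, s, n_nodes, base, max_level, target):
    """Exhaustively choose polling levels that minimize total server load
    while keeping the client-weighted average delay at or below target.

    Returns (load, levels), or None if no assignment meets the target.
    """
    best = None
    for levels in product(range(max_level + 1), repeat=len(q)):
        # Client-weighted average of the normalized detection delay.
        delay = sum(qi * base ** l / n_nodes
                    for qi, l in zip(q, levels)) / sum(q)
        if delay > target:
            continue
        # Total load placed on content servers by all pollers.
        load = sum(si * n_nodes / base ** l for si, l in zip(s, levels))
        if best is None or load < best[0]:
            best = (load, levels)
    return best


# Hypothetical instance: 16 nodes, binary digits, three channels with
# 100, 10 and 1 subscribers and equal content sizes.
print(solve_corona_fast(q=[100, 10, 1], s=[1, 1, 1],
                        n_nodes=16, base=2, max_level=4, target=0.25))
```

On this instance the popular channel ends up at a lower level (more pollers) than the two unpopular ones, which are left to their owners alone.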
23
Performance-overhead tradeoff approaches: Corona-Fair. Corona-Fast and Corona-Lite do not consider the actual rate of change of content in a channel. Corona-Fair incorporates the update rate of channels into the performance tradeoff to achieve a fairer distribution of update performance between channels. It defines a modified update performance metric as the ratio of the update detection time to the update interval of the channel.
24
Performance-overhead tradeoff approaches: Corona-Fair. Minimizes the average of the ratio metric while bounding the load on content servers. Notation: $M$ = number of channels; $N$ = number of nodes; $b$ = base of the structured overlay; $T$ = performance target; $l_i$ = polling level of channel i; $q_i$ = number of clients for channel i; $s_i$ = content size for channel i; $u_i$ = update interval for channel i.
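A hedged way to write the Corona-Fair problem in the notation above, taking the ratio metric to be a channel's expected detection time divided by its update interval $u_i$. The detection-time model $\frac{\tau}{2}\cdot\frac{b^{l_i}}{N}$ (with $\tau$ the polling interval) and the size-weighted load bound are assumptions of this sketch.

```latex
\begin{aligned}
\min_{l_1,\dots,l_M}\quad & \sum_{i=1}^{M} q_i \,\frac{1}{u_i}\cdot\frac{\tau}{2}\cdot\frac{b^{l_i}}{N} \\
\text{subject to}\quad & \sum_{i=1}^{M} s_i \,\frac{N}{b^{l_i}} \;\le\; \sum_{i=1}^{M} q_i\, s_i
\end{aligned}
```

Compared with an unweighted detection-time objective, a channel that updates often (small $u_i$) contributes more to the objective and is therefore pushed toward a lower polling level.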
25
Performance-overhead tradeoff approaches Comments about the approaches?
26
Decentralized Optimization: Honeycomb. Corona determines the optimal polling levels using the Honeycomb optimization toolkit, which provides numerical algorithms and decentralized mechanisms for solving optimization problems of this kind: minimizing a sum of per-channel costs over the polling levels, subject to a bound on a sum of per-channel constraint terms. Honeycomb finds an approximate solution in O(M log M log N) time (using Lagrange multipliers).
27
Decentralized Optimization: Tradeoff clusters. Solving the optimization problem using only the limited data available locally can produce highly inaccurate solutions, while collecting the tradeoff factors for all the channels at each node is expensive and impractical. Honeycomb combines channels with similar tradeoff factors into a tradeoff cluster. The nodes periodically exchange clusters with contacts in the routing table and aggregate the clusters received from the contacts. The overhead of cluster aggregation is kept low by limiting the number of clusters to a constant.
28
System Management: Each channel in Corona has a unique id and one or more owner nodes managing it. The primary owner of a channel is the Corona node whose id is numerically closest to the channel's id. Corona adds the F closest neighbors of the primary owner as additional owners to tolerate failures.
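Owner selection can be sketched as follows. This is an illustration under stated assumptions, not Corona's code: ids are plain integers in a circular space of size `space`, closeness is the shorter distance around the ring, ties are broken toward the smaller id, and the function names and sample ids are hypothetical.

```python
def ring_distance(a: int, b: int, space: int) -> int:
    """Shorter distance between two ids on a circular id space."""
    d = abs(a - b) % space
    return min(d, space - d)


def owners(channel_id: int, node_ids, f: int, space: int):
    """Primary owner (node closest to the channel id) followed by the
    f nodes closest to that primary owner."""
    primary = min(node_ids,
                  key=lambda n: (ring_distance(n, channel_id, space), n))
    neighbors = sorted((n for n in node_ids if n != primary),
                       key=lambda n: (ring_distance(n, primary, space), n))
    return [primary] + neighbors[:f]


# Hypothetical 8-bit id space with six nodes; F = 2 backup owners.
nodes = [12, 40, 97, 130, 200, 251]
print(owners(channel_id=5, node_ids=nodes, f=2, space=256))
```

Node 251 is picked as a backup owner because the id space wraps around: it is only 17 ids away from the primary owner 12.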
29
System Management: Owners take responsibility for managing subscriptions, polling, and updates for a channel (problems?). They also keep track of channel-specific factors that affect the performance tradeoffs: the number of subscribers, the content size, and the update rate. All nodes run a periodic protocol with three phases: optimization, maintenance, and aggregation.
30
System Management: Changing Polling Levels (Maintenance phase). Corona nodes operate independently and decide locally whether to increase or decrease polling levels. Initially only the owner nodes poll for the channels. When a level i node lowers the level to i-1 or raises the level from i+1 back to i, it instructs its contact in row i-1 of its routing table to start or stop polling for that channel. When a node is instructed to begin polling, it waits for a random interval of time between 0 and the polling interval before the first poll.
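The random initial wait spreads the polls of different nodes across the polling interval. A minimal sketch, with hypothetical function names and a uniform distribution assumed for the random wait:

```python
import random


def first_poll_delay(poll_interval: float, rng=random) -> float:
    """Random wait before the first poll, between 0 and the polling interval."""
    return rng.uniform(0.0, poll_interval)


def poll_schedule(poll_interval: float, polls: int, rng=random):
    """Times (relative to the start instruction) of the first `polls` polls:
    a random offset, then one poll per polling interval."""
    start = first_poll_delay(poll_interval, rng)
    return [start + k * poll_interval for k in range(polls)]


# Example: 30-minute polling interval expressed in seconds.
rng = random.Random(2006)  # seeded only to make the example repeatable
for node in ("node-a", "node-b", "node-c"):
    print(node, [round(t) for t in poll_schedule(1800.0, 3, rng)])
```

Because each node draws its own offset, nodes that start polling the same channel at the same moment do not all hit the content server together.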
31
System Management: Updating Tradeoff Factors (Aggregation phase). Owners monitor the number of subscribers and send out fresh estimates along with the maintenance message. Descendant nodes propagate these estimates to all the nodes in the wedge. The update interval and feed size only change during updates and are therefore sent along with update messages. Tradeoff clusters are also sent by contacts in the routing table in response to maintenance messages.
32
System Management: Failure resilience. Inherited from the underlying overlay. When new nodes join the system or nodes fail, Corona ensures the transfer of subscription state to new owners. Simultaneous failure of more than F adjacent nodes might cause a system failure, but clients can easily renew subscriptions.
33
System Management Comments on system management?
34
Update Dissemination: Version numbers are used to identify new content: either a timestamp provided by the content server or a number assigned by the primary owner. Corona nodes share updates as deltas between old and new content; bandwidth can be saved through data encoding. When a delta is generated by a node, it shares the update with all other nodes in the channel's polling wedge. Why?
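Sharing a delta instead of the full content can be illustrated with Python's standard `difflib`. This is only a sketch of the idea: the deck does not say which diff format Corona uses, and the function name, the line-based unified diff, and the sample feed content are assumptions.

```python
import difflib


def make_delta(old: str, new: str, old_version: int, new_version: int) -> str:
    """Line-based delta between two versions of a channel's content.

    Returns an empty string when the content is unchanged.
    """
    lines = difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile=f"v{old_version}", tofile=f"v{new_version}",
        lineterm="",
    )
    return "\n".join(lines)


old = "headline A\nheadline B"
new = "headline A\nheadline B\nheadline C"

delta = make_delta(old, new, 7, 8)
print(delta)
print(len(delta) < len(new) or "delta not smaller for tiny inputs")
```

For a long feed with one new item, the delta carries little more than the added lines plus the version labels, which is what makes sharing it across the wedge cheap.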
35
User Interface: Corona employs instant messaging (IM) as its user interface. Subscribe/unsubscribe messages: a subscribe or unsubscribe message is routed to all the owner nodes of the channel, which update their subscription state. When an update is detected by Corona, the current primary owner sends a message with the delta to all the subscribers through the IM system. Comments on the user interface?
36
Just to compare: FeedTree. Also uses Pastry. Group communication (Scribe): the subscribers join the overlay. Full deployment is push-based: the publishers join the overlay, and updates are sent using overlay multicast. Partial deployment: the publishers are collectively polled by groups of overlay nodes, which produce the updates and distribute them to the subscribers. Advantages, disadvantages?
37
Evaluation: Large-scale simulations. Wide-area experiments on PlanetLab. Performance is compared to that of legacy RSS. Comments?
38
Simulation: Real-life RSS traces are used. The tradeoff parameters are extrapolated to a larger scale: 1024 nodes, 100,000 channels, 5,000,000 subscribers. Polling interval: 30 minutes (how was that selected?). All three schemes are checked and compared to RSS.
39
Simulation Results [figures: Network Load on Content Servers; Average Update Detection Time]
40
Simulation Results [figures: Number of Pollers per Channel; Update Detection Time per Channel]
41
Simulation Results: There are roughly two levels of polling according to this simulation. Most of the channels are polled by ~100 nodes. Is Honeycomb really necessary for that?
42
Simulation Results, Corona-Fair [figure: Update Detection Time]
43
Simulation Summary What didn’t they check?
44
Deployment: A set of 60 PlanetLab nodes. The Corona-Lite scheme is used. 7500 RSS feeds from www.syndic8.com. 150,000 subscriptions. Polling interval: 30 minutes.
45
Deployment Results [figures: Average Update Detection Time; Total Polling Load on Servers]