Published by Neil Flynn. Modified over 8 years ago.
1
FeedTree: Sharing Web micronews with peer-to-peer event notification
Dan Sandler, Alan Mislove, Ansley Post, Peter Druschel
Presented by: Anupama Atmakur, Pooja Adudodla
2
Overview
Introduction
Background
Issues with RSS
RSS Bandwidth
Enhancement to RSS
Design of FeedTree
FeedTree Implementation
Conclusions and Future Work
CS-791/891 Web Syndication Formats, ODU, Spring 2008
3
Early Stage of the Web
HTML pages were static.
News websites updated their content once or twice a day.
Users visited each individual site for updated content.
4
Current Trend of the Web
Explosion of micronews: highly focused chunks of content.
Information is updated frequently and irregularly, and is scattered over multiple sites.
5
How RSS Saved the Web
RSS feeds are a popular way to handle this information flow.
A feed carries the publisher's latest stories in an XML-based format.
By subscribing to the URL of an RSS feed, the user instructs the reader application to fetch the data at regular intervals.
6
Special reader software periodically collects the latest information.
It's like email for web news.
Img: http://apadiv20.phhp.ufl.edu/newgif/rss2007.gif http://www.cs.rice.edu/~dsandler/talks/rss-iptps05.pdf
7
Issue with RSS Usage
RSS readers are very widely used, which has a serious impact on bandwidth usage for RSS providers.
Popular providers of RSS feeds have begun to eliminate their feeds to reduce the bandwidth stress caused by polling clients.
8
RSS Bandwidth
RSS uses a polling-based retrieval architecture.
Bandwidth consumption scales linearly with the number of subscribers.
9
Polling Architecture
Each RSS reader polls the feed's web server independently.
Img: http://feedtree.net/images/diagrams/syndication-simple.png
10
Unable to Handle the Load
Web servers that provide RSS feeds tend to suffer under heavy traffic load.
The load on these web servers grows for the following reasons:
11
Polling: an RSS application issues repeated HTTP requests for each subscribed feed according to some set schedule.
Superfluity: each feed is limited to the N most recent entries. Every request returns those N entries, regardless of whether any of them are new to the client. The same content is fetched again and again, which wastes bandwidth and loads the servers.
12
Stickiness: a user subscribes to the feed of a popular website and stops reading the content after some time, but the RSS client keeps polling, which leads to unending load.
Twenty-four-hour traffic: an RSS client keeps running on the desktop computer even when the user is not present. RSS readers generate traffic around the clock from all over the world.
13
Real Scenario
The New York Times front page feed alone claims 7,800 subscribers.
The sum of subscribers to all its feeds comes to 24,000.
Feeds from the Times tend to be around 3 KB each, which amounts to about 3.5 GB of data per day with 30-minute polling.
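The 3.5 GB figure can be reproduced with a short calculation. This is only a sketch under stated assumptions: every one of the 24,000 subscribers polls once per interval, every poll returns the full 3 KB feed, and 1 GB is taken as 10^6 KB. The function name `daily_traffic_gb` is made up for illustration.

```python
# Back-of-the-envelope estimate of daily polling traffic.
SUBSCRIBERS = 24_000        # total subscribers across all Times feeds (from the slide)
FEED_SIZE_KB = 3            # approximate size of one feed (from the slide)
POLL_INTERVAL_MIN = 30      # polling interval (from the slide)


def daily_traffic_gb(subscribers, feed_size_kb, poll_interval_min):
    """Return the daily traffic in GB if every poll transfers the whole feed."""
    polls_per_day = (24 * 60) // poll_interval_min   # 48 polls at 30 minutes
    total_kb = subscribers * feed_size_kb * polls_per_day
    return total_kb / 1_000_000                      # assumption: 1 GB = 10^6 KB


estimate = daily_traffic_gb(SUBSCRIBERS, FEED_SIZE_KB, POLL_INTERVAL_MIN)
print(f"{estimate:.3f} GB per day")
```

Because the total is a plain product, doubling the subscriber count doubles the traffic, which is the linear scaling described earlier.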
14
Improving the Polling Process
Enhancing polling:
Avoid transmitting the feed content if it is the same as the content the client already has.
Gzip: the server compresses the feed before returning it to the newsreader, thus decreasing bandwidth usage.
Restrict polling to particular times.
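The first two enhancements correspond to standard HTTP features: conditional requests (an `ETag` validator sent back in `If-None-Match`, answered with `304 Not Modified`) and gzip content encoding. The sketch below simulates them without a network. The `serve_feed` function and the use of a SHA-256 digest as the validator are assumptions made for illustration, not part of any particular server.

```python
import gzip
import hashlib
from typing import Optional, Tuple


def serve_feed(feed_xml: bytes, if_none_match: Optional[str] = None) -> Tuple[int, str, bytes]:
    """Toy server: return (status, etag, body) for one feed request."""
    etag = hashlib.sha256(feed_xml).hexdigest()   # illustrative validator
    if if_none_match == etag:
        return 304, etag, b""                     # unchanged: send no body at all
    return 200, etag, gzip.compress(feed_xml)     # changed: send a compressed body


feed = b"<rss>" + b"<item>story</item>" * 15 + b"</rss>"

# First poll: the client has nothing cached, so it receives the full feed.
status1, etag, body1 = serve_feed(feed)

# Second poll: the client presents the validator it remembered.
status2, _, body2 = serve_feed(feed, if_none_match=etag)

print(status1, len(feed), len(body1))
print(status2, len(body2))
```

Both measures shrink each response, but every client still sends every request, so the number of requests the server handles is unchanged.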
15
Outsourcing Aggregation
End-user applications are built on top of a central server.
The central server provides a remote procedure interface.
Clients poll this server for updated data, and the server takes charge of polling the authoritative RSS feeds on the wider Internet.
16
Outsourcing Aggregator
[Diagram labels: RSS Providers, Central Server, Readers]
Img: http://apadiv20.phhp.ufl.edu/newgif/rss2007.gif http://www.cs.rice.edu/~dsandler/talks/rss-iptps05.pdf
17
Bandwidth Issue
End users now poll the central server instead of each website's main server.
The central server therefore has to carry the heavy traffic load itself.
18
Inherent Dangers
A central RSS aggregator may:
Experience unavailability or outright failure.
Change its service at any time.
Modify, omit, or augment RSS data without the user's knowledge or consent.
19
Solution: Distributed Approach
The publisher's web site distributes new feed content directly to its list of subscribers.
This works for small subscription lists but not for large groups.
It avoids unnecessary fetches but does not offer a solution for necessary fetches.
FeedTree replaces the polling component of news feeds with peer-to-peer multicast.
20
FeedTree: A Bird's-Eye View
Based on a p2p multicast network.
Bandwidth cost is distributed among peers.
Less load on network links close to the source.
Subscribers receive content as soon as it is available.
Img: http://feedtree.net/images/diagrams/syndication-simple.png
21
FeedTree: Technical Details
FeedTree is a p2p overlay network based on Scribe.
Scribe is a group communication and event notification protocol built on the Pastry overlay network.
Pastry provides a self-organizing p2p network of nodes.
22
Why Pastry?
Unstructured networks:
No underlying structure or organization in the network.
Locating a node is problematic: it requires either an exhaustive search of the network or maintenance of a central index of all nodes.
Structured overlay networks:
Nodes are decentralized and self-organizing.
Data can be reached in a logarithmic number of steps.
Choosing a structured overlay network like Pastry is thus attractive.
23
Pastry: Structured Overlay p2p Network
Efficient request routing.
Each Pastry node has a unique 128-bit nodeId.
A Pastry node can route a message with a numeric 128-bit key to the node whose nodeId is closest to the key in O(log N) forwarding steps, where N is the number of live Pastry nodes in the overlay network.
Load balancing.
The routing table maintained in each Pastry node is only O(log N) in size.
Allows application-specific computations; for example, it allows Scribe to multicast data.
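The idea of "the node whose nodeId is closest to the key" can be illustrated with a few lines of Python. This is a minimal sketch, not Pastry's routing algorithm: it looks at a complete list of nodeIds instead of forwarding hop by hop through O(log N) routing tables. It assumes the usual circular view of the 128-bit id space, and `make_id` (a truncated SHA-256 digest) is a hypothetical way of deriving ids.

```python
import hashlib

ID_BITS = 128
ID_SPACE = 1 << ID_BITS


def make_id(name):
    """Hypothetical id derivation: the first 16 bytes of SHA-256 as an integer."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:16], "big")


def circular_distance(a, b):
    """Distance between two ids when the id space wraps around."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


def responsible_node(key, node_ids):
    """Return the nodeId numerically closest to the key (global view, no hops)."""
    return min(node_ids, key=lambda n: circular_distance(n, key))


# Small hand-picked ids make the wrap-around visible.
nodes = [10, 200, ID_SPACE - 5]
print(responsible_node(150, nodes))                   # 200 is nearest to 150
print(responsible_node(2, nodes) == ID_SPACE - 5)     # nearest across the wrap

# A feed URL can be mapped into the same id space as the nodes.
feed_key = make_id("http://example.org/feed.rss")
```

In a multicast layer such as Scribe, the node responsible for a group's key acts as the root of that group's tree, which is why a shared id space for nodes and keys is convenient.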
24
Workflow of FeedTree
Create the RSS document with a timestamp and a sequence number.
Sign the RSS document with the publisher's private key.
Multicast the RSS document to the members of the Scribe group, using Scribe over the Pastry overlay network.
On receipt, the client checks the document's authenticity by verifying its signature and then adds it to the RSS client application.
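The publish and receive steps above can be sketched as follows. Important assumption: to stay within the Python standard library, this sketch swaps in a symmetric HMAC for the publisher's public-key signature, so it only shows where signing and verification sit in the workflow. A real deployment would sign with a private key and verify with the matching public key. The event layout and the names `publish` and `accept` are hypothetical.

```python
import hashlib
import hmac
import json
import time

DEMO_KEY = b"demo-key"   # stand-in for the publisher's key pair (assumption)


def publish(item_xml, seq, key):
    """Build an update with a timestamp and sequence number, then sign it."""
    payload = {"seq": seq, "timestamp": int(time.time()), "body": item_xml}
    encoded = json.dumps(payload, sort_keys=True).encode()
    signature = hmac.new(key, encoded, hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def accept(event, key):
    """Check the signature before the update is handed to the reader."""
    encoded = json.dumps(event["payload"], sort_keys=True).encode()
    expected = hmac.new(key, encoded, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, event["signature"])


event = publish("<item><title>Hello</title></item>", seq=1, key=DEMO_KEY)
genuine_ok = accept(event, DEMO_KEY)

# A peer that alters the body in transit invalidates the signature.
event["payload"]["body"] = "<item><title>Tampered</title></item>"
tampered_ok = accept(event, DEMO_KEY)

print(genuine_ok, tampered_ok)
```

Verification matters here because updates arrive through other peers rather than directly from the publisher.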
25
Implementation of FeedTree
Full FeedTree: implements the full FeedTree architecture.
Incremental FeedTree: publishers or readers that rely on the conventional polling-based RSS retrieval architecture can also be retrofitted to take advantage of the FeedTree architecture.
Img: http://feedtree.net
26
FeedTree Design and Implementation Img: http://www.cs.rice.edu/~dsandler/pub/FeedTree-MSThesis-2007.pdf
27
Bootstrapping the Subscription Process
How does the client application know whether a feed is published through FeedTree?
FeedTree metadata is added to the RSS document.
The metadata can be the IP address or DNS name of a host that is already a member of a FeedTree network.
The client application starts the subscription by making a conventional HTTP request to the publisher.
Using the metadata, it receives further updates through FeedTree.
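A client-side check for this metadata might look like the sketch below. The slide does not say how the metadata is encoded, so the namespace URI `http://example.org/feedtree` and the `bootstrap` element are purely hypothetical placeholders; only the overall flow (fetch the feed over HTTP once, look for a bootstrap host, fall back to polling if there is none) follows the slide.

```python
import xml.etree.ElementTree as ET

FT_NS = "http://example.org/feedtree"   # hypothetical namespace, for illustration only

SAMPLE_FEED = f"""<rss version="2.0" xmlns:ft="{FT_NS}">
  <channel>
    <title>Example feed</title>
    <ft:bootstrap>peer.example.org</ft:bootstrap>
  </channel>
</rss>"""

PLAIN_FEED = """<rss version="2.0">
  <channel><title>Plain feed</title></channel>
</rss>"""


def find_bootstrap_host(rss_text):
    """Return the advertised bootstrap host, or None for a conventional feed."""
    root = ET.fromstring(rss_text)
    node = root.find(f"channel/{{{FT_NS}}}bootstrap")
    if node is None or not node.text:
        return None
    return node.text.strip()


print(find_bootstrap_host(SAMPLE_FEED))   # host to contact when joining the overlay
print(find_bootstrap_host(PLAIN_FEED))    # None: keep polling over HTTP
```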
28
Heartbeat
Time-to-live is the maximum interval between consecutive FeedTree events.
When there is no new data within the time-to-live, the publisher sends a heartbeat through FeedTree.
The purpose of the heartbeat is to let the peers know that an authoritative feed publisher still exists.
29
Benefits
Providers:
Low cost, because the bandwidth is shared by all the participants.
Opportunity to provide differentiated RSS services.
Users:
Receive timely updates and thus better news services.
30
Recovery of Lost Data
Reasons for loss of data: failures of nodes; departures of nodes.
Detection of lost data: a missing sequence number or a missing heartbeat.
Recovery of lost data: the client application polls the RSS publisher to retrieve what it missed.
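Both detection rules fit in a small client-side tracker. This is a sketch under assumptions the slides do not confirm: heartbeats are assumed to repeat the latest sequence number, timestamps are plain seconds passed in by the caller, and the class name `LossDetector` is made up for illustration.

```python
class LossDetector:
    """Track sequence numbers and event times for one subscribed feed."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.last_seq = None
        self.last_seen = None

    def on_event(self, seq, now):
        """Record an update or heartbeat; return the sequence numbers skipped."""
        missing = []
        if self.last_seq is not None and seq > self.last_seq + 1:
            missing = list(range(self.last_seq + 1, seq))
        if self.last_seq is None or seq > self.last_seq:
            self.last_seq = seq
        self.last_seen = now
        return missing

    def heartbeat_overdue(self, now):
        """True if nothing has arrived within the time-to-live."""
        return self.last_seen is not None and now - self.last_seen > self.ttl


detector = LossDetector(ttl_seconds=60)
detector.on_event(1, now=0)
detector.on_event(2, now=10)
gap = detector.on_event(5, now=20)        # updates 3 and 4 never arrived
print(gap)
print(detector.heartbeat_overdue(now=70), detector.heartbeat_overdue(now=81))
```

Either signal, a non-empty gap or an overdue heartbeat, would be the cue for the client to fall back to a conventional HTTP poll of the publisher.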
31
Development Status
The conventional RSS reader application is augmented with an intermediary tool, an HTTP proxy, which serves its HTTP requests.
How it works:
The HTTP proxy joins the FeedTree network.
The HTTP proxy listens for any request the reader application makes for a conventional feed via HTTP.
The proxy serves each request immediately with the latest feed content it has received from the FeedTree network.
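The proxy's core behavior, stripped of all networking, is a cache that FeedTree events fill and HTTP requests read. The sketch below shows only that logic. The class `FeedProxy`, its method names, and the fixed window of items per feed are assumptions for illustration and are not taken from the FeedTree software.

```python
from collections import defaultdict, deque


class FeedProxy:
    """Keep the newest items per feed and answer reader requests from the cache."""

    def __init__(self, max_items=15):
        # Each feed keeps only its most recent items, newest first.
        self.cache = defaultdict(lambda: deque(maxlen=max_items))

    def on_feedtree_event(self, feed_url, item_xml):
        """Called when an update for a subscribed feed arrives via multicast."""
        self.cache[feed_url].appendleft(item_xml)

    def handle_request(self, feed_url):
        """Called when the reader polls; no request goes to the publisher."""
        items = "".join(self.cache[feed_url])
        return f'<rss version="2.0"><channel>{items}</channel></rss>'


proxy = FeedProxy(max_items=2)
url = "http://example.org/feed.rss"
for title in ("a", "b", "c"):
    proxy.on_feedtree_event(url, f"<item><title>{title}</title></item>")

document = proxy.handle_request(url)
print(document)
```

The reader keeps polling on its usual schedule, but the polls terminate at the local proxy, so an unmodified RSS reader can benefit from FeedTree.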
32
Conclusions
FeedTree is a strong alternative to the conventional polling mechanism for the following reasons:
It uses the available bandwidth efficiently without requiring additional hardware or networking capacity.
It reduces load on network links near the server by distributing it evenly among participating nodes.
It can scale to accommodate a large number of users while still maintaining low latency for the arrival of new messages.
33
References
http://www.mpi-sws.mpg.de/~abpost/papers/RSS-IPTPS-draft.pdf
http://www.feedtree.net/
http://freepastry.rice.edu/SCRIBE/default.htm
http://research.microsoft.com/~antr/pastry/default.htm
34
Questions?
How will the adoption of FeedTree affect conventional readers that don't use FeedTree?