Client Behavior and Feed Characteristics of RSS

1 Client Behavior and Feed Characteristics of RSS
Presented by Sukumar Manduva Nageswari Vallabhaneni

2 Why This Presentation
Previously we dealt with the system architecture, event-notification, and content-filtering algorithms used by RSS. What about fundamental aspects such as workload and how clients actually use the system?

3 Topics
Introduction
  Publish-Subscribe Systems
  Experiment at Cornell University
Measurement Methodology
  Passive Logging
  Active Polling
Survey Results
  Feed Characteristics
  Update Characteristics
  Client Behavior

4 Introduction: Pub-Sub Systems
A pub-sub system consists of subscribers, publishers, and an event-delivery infrastructure. The infrastructure matches published events against subscribers' interests. Pub-sub systems fall into two categories, based on how subscribers specify their interest:
  Topic based
  Content based

5 Pub-Sub System
[Diagram: publishers (CNN, BBC, NGC) send events to a notification service, which delivers event notifications to subscribers (S1, S2, S3).]

6 Topic Based Pub-Sub Systems
Also known as subject-based, group-based, or channel-based event filtering. A subscriber subscribes to a particular channel and receives all events published to that channel, e.g. Sports or Stock Market. Topics can be hierarchical, e.g. Sports/basketball or Stock Market/BOA.
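The topic-based model can be sketched in a few lines of Python. This is only an illustration: the `TopicBroker` class, its method names, and the rule that a subscription to a parent topic also covers its children are assumptions, not something the slides specify.

```python
from collections import defaultdict


class TopicBroker:
    """Toy topic-based broker with '/'-separated hierarchical topics (hypothetical)."""

    def __init__(self):
        # Maps a topic string to the set of subscriber names registered on it.
        self._subscribers = defaultdict(set)

    def subscribe(self, subscriber, topic):
        self._subscribers[topic].add(subscriber)

    def publish(self, topic, event):
        """Return {subscriber: event} for every subscriber that should be notified.

        Assumption: subscribing to a parent topic ("Sports") also covers
        its children ("Sports/basketball").
        """
        parts = topic.split("/")
        recipients = set()
        # Check every prefix of the topic path: "Sports", "Sports/basketball", ...
        for depth in range(1, len(parts) + 1):
            prefix = "/".join(parts[:depth])
            recipients |= self._subscribers.get(prefix, set())
        return {name: event for name in sorted(recipients)}


broker = TopicBroker()
broker.subscribe("S1", "Sports")
broker.subscribe("S2", "Sports/basketball")
broker.subscribe("S3", "Stock Market/BOA")

delivered = broker.publish("Sports/basketball", "Final score posted")
print(delivered)
```

Here S1 and S2 are notified and S3 is not, because every event on a channel goes to all of that channel's subscribers without looking at the event's contents.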

7 Content Based Pub-Sub System
Gives subscribers more flexibility and power: subscribers can query over the contents of an event, e.g. "Notify me of news about cricket from cricinfo if the score is greater than 350."
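A content-based subscription is essentially a predicate over event fields. The sketch below encodes the cricket example; the event layout (`source`, `topic`, `score`) and the helper `make_subscription` are hypothetical names chosen for illustration.

```python
def make_subscription(source, topic, min_score):
    """Build a predicate over event contents (all field names are assumed)."""
    def matches(event):
        return (
            event.get("source") == source
            and event.get("topic") == topic
            and event.get("score", 0) > min_score
        )
    return matches


# "Notify me of news about cricket from cricinfo if the score is greater than 350"
cricket_alert = make_subscription("cricinfo", "cricket", 350)

# Made-up events for illustration.
events = [
    {"source": "cricinfo", "topic": "cricket", "score": 362},
    {"source": "cricinfo", "topic": "cricket", "score": 280},
    {"source": "bbc", "topic": "cricket", "score": 400},
]

# Only events that satisfy the whole predicate produce a notification.
notifications = [e for e in events if cricket_alert(e)]
print(notifications)
```

Only the first event passes: the second fails the score test and the third comes from a different source.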

8 Introduction: Experiments at Cornell University
A 45-day trace of client activity, plus active polling of about 100,000 feeds. Analyzed feed characteristics, update characteristics, and client behavior.
[Diagram: a tracer in the Cornell University CS Dept. records RSS requests to, and RSS responses from, feed servers such as CNN, BBC, and NGC.]

9 Measurement Methodology
Passive Logging: tracer software captures TCP packets and reassembles each flow, then logs the RSS requests and responses from the reassembled flow.
  Trace length: 45 days
  Number of clients: 158
  Number of feeds: 667
  Number of requests: 61,935

10 Measurement Methodology
Active Polling: actively polled 99,714 RSS feeds for 84 hours. A snapshot of the feed is gathered each time a poll is done.
  Polling period: 84 hours
  Number of feeds: 99,714
  Number of snapshots: (not given)
  Bytes received: 57 GB

11 Analyzing Study Results
Feed characteristics
  Popularity distribution
  Content size
  Format and version
Update characteristics
  Intervals
  Changes involved in an update
  Correlation between feed size and update
Client behavior
  Polling
  Subscription patterns

12 Feed Characteristics: Feed Popularity
We measure popularity in two ways:
  1. The number of requests received for each RSS feed.
  2. The number of clients who subscribed to each RSS feed.
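Both popularity measures can be derived from a request log. The sketch below uses a made-up log of `(client, feed)` pairs and assumes that a client counts as a subscriber of a feed once it has requested that feed at least once; the slides do not state how subscription was inferred.

```python
from collections import Counter, defaultdict

# Made-up request log: one (client_id, feed) pair per logged RSS request.
requests = [
    ("c1", "feedA"), ("c1", "feedA"), ("c1", "feedA"),
    ("c2", "feedA"),
    ("c2", "feedB"), ("c3", "feedB"), ("c4", "feedB"),
]

# Measure 1: number of requests received for each feed.
requests_per_feed = Counter(feed for _, feed in requests)

# Measure 2: number of distinct clients per feed
# (assumption: any client that requested the feed is a subscriber).
clients_per_feed = defaultdict(set)
for client, feed in requests:
    clients_per_feed[feed].add(client)
subscribers_per_feed = {feed: len(c) for feed, c in clients_per_feed.items()}

# Rank feeds under each measure, most popular first.
by_requests = [feed for feed, _ in requests_per_feed.most_common()]
by_subscribers = sorted(
    subscribers_per_feed, key=subscribers_per_feed.get, reverse=True
)

print(by_requests, by_subscribers)
```

In this sample the two rankings disagree: feedA receives more requests because one client polls it repeatedly, while feedB has more distinct clients. That is why it helps to look at both measures.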

13 Feed Characteristics: Feeds Ranked by Number of Requests

14 Feed Characteristics: Feeds Ranked by Number of Subscribers

15 Feed Characteristics: Feed Format and Version
Format: 98% are RSS feeds and 2% are Atom feeds.
Version: [figure]

16 Feed Characteristics: Feed Size
The feed size is calculated as the average over all snapshots of the feed.
  80% of feeds < 10 KB
  Median = 5.8 KB
  99% of feeds < 100 KB
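The feed-size statistics above can be computed as follows. The snapshot sizes are invented, and treating 1 KB as 1024 bytes is an assumption; the slides do not say which convention was used.

```python
from statistics import mean, median

KB = 1024  # assumption: 1 KB = 1024 bytes

# Made-up snapshot sizes in bytes, keyed by feed.
snapshot_sizes = {
    "feedA": [4000, 4200, 4100],
    "feedB": [11000, 13000],
    "feedC": [150000],
    "feedD": [5800, 5800],
}

# A feed's size is the average over all of its snapshots.
feed_size = {feed: mean(sizes) for feed, sizes in snapshot_sizes.items()}


def fraction_below(sizes, limit):
    """Fraction of feeds whose average size is strictly below `limit` bytes."""
    return sum(1 for s in sizes if s < limit) / len(sizes)


sizes = list(feed_size.values())
median_size = median(sizes)
share_under_10kb = fraction_below(sizes, 10 * KB)
share_under_100kb = fraction_below(sizes, 100 * KB)

print(median_size, share_under_10kb, share_under_100kb)
```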

17 Update Characteristics
The nature of RSS updates can be determined from the hourly snapshots gathered through polling. An update is counted as valid only if a valid snapshot precedes it.
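The validity rule can be sketched as a pass over consecutive hourly snapshots. The representation is an assumption: each snapshot is a string of feed content, `None` marks an invalid (failed) poll, and a change is detected by simple inequality.

```python
def count_valid_updates(snapshots):
    """Count updates between consecutive hourly snapshots.

    `snapshots` is a list ordered by hour; None marks an invalid poll
    (hypothetical encoding). A change is counted only when both the
    current snapshot and the one right before it are valid.
    Returns (updates, comparable_intervals).
    """
    updates = 0
    comparable = 0
    for prev, curr in zip(snapshots, snapshots[1:]):
        if prev is None or curr is None:
            # Without a valid preceding snapshot the change cannot be
            # attributed to this one-hour interval, so skip it.
            continue
        comparable += 1
        if curr != prev:
            updates += 1
    return updates, comparable


hourly = ["v1", "v1", None, "v2", "v3", "v3"]
updates, comparable = count_valid_updates(hourly)
print(updates, comparable)
```

In the sample, the change from v1 to v2 is not counted because an invalid snapshot sits between them; only the v2 to v3 change is a valid update.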

18 Update Characteristics
[Timeline diagram: snapshots taken at 1-hour intervals, starting from an initial snapshot; each snapshot is marked valid or invalid, and each interval is marked "no change" or "feed change".]

19 Update Characteristics
Update Rate:

20 Update Characteristics
Update Size:

21 Issues with Polling
Constant polling by clients imposes a significant bandwidth load on RSS servers. RSS 2.0 supports the ttl, skipDays, and skipHours elements, which let a feed tell clients how often and when to poll. Sending clients only the data that actually changed would save 93.2% of bandwidth, because the average content change is only 6.8%.
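The 93.2% figure follows directly from the 6.8% average content change. A small worked example, using the 5.8 KB median feed size from slide 16 purely as an illustrative input:

```python
def delta_savings(avg_change_fraction):
    """Fraction of bandwidth saved if only the changed content is sent."""
    return 1.0 - avg_change_fraction


AVG_CHANGE = 0.068      # 6.8% average content change (from the slide)
MEDIAN_FEED_KB = 5.8    # median feed size, used here only as an example input

savings = delta_savings(AVG_CHANGE)
full_transfer_kb = MEDIAN_FEED_KB                 # whole feed re-sent
delta_transfer_kb = MEDIAN_FEED_KB * AVG_CHANGE   # only the changed part sent

print(f"savings: {savings:.1%}")
print(f"full: {full_transfer_kb:.2f} KB, delta: {delta_transfer_kb:.2f} KB")
```

This ignores any per-response overhead (headers, delta encoding metadata), so it is an upper bound on the saving rather than a measured result.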

22 Update Characteristics: Correlations between Feed Size & Update Rate

23 Update Characteristics: Correlations between Feed Size & Update Size

24 Client Behavior: Polling Frequency
Auto clients: poll at a fixed rate (default 60 minutes).
Manual clients: poll whenever the user needs to.
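One way to tell the two kinds of client apart in a trace is to look at how regular the gaps between requests are. The heuristic below is purely illustrative and is not taken from the study: the function name, the 10% tolerance, and the use of the coefficient of variation are all assumptions.

```python
from statistics import mean, pstdev


def classify_client(request_times_min, tolerance=0.1):
    """Label a client 'auto', 'manual', or 'unknown' (hypothetical heuristic).

    `request_times_min` holds request timestamps in minutes, ascending.
    A client is called 'auto' when its inter-request gaps are nearly
    constant, i.e. their standard deviation is at most `tolerance`
    times their mean.
    """
    gaps = [b - a for a, b in zip(request_times_min, request_times_min[1:])]
    if len(gaps) < 2:
        return "unknown"  # too few requests to judge regularity
    avg = mean(gaps)
    if avg > 0 and pstdev(gaps) / avg <= tolerance:
        return "auto"
    return "manual"


print(classify_client([0, 60, 120, 181, 240]))  # gaps close to 60 minutes
print(classify_client([0, 5, 300, 310, 900]))   # irregular gaps
```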

25 Client Behavior Subscriptions:

26 Conclusion
We discussed the factors to consider when constructing a pub-sub system in the future, and how the architecture can influence performance by saving bandwidth and reducing workload.

