1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers. Shetal Shah, IIT Bombay; Krithi Ramamritham, IIT Bombay; Prashant Shenoy, UMass.

Presentation transcript:

1 Resilient and Coherence Preserving Dissemination of Dynamic Data Using Cooperating Peers Shetal Shah, IIT Bombay Krithi Ramamritham, IIT Bombay Prashant Shenoy, UMass

2 Streaming (Dynamic) Data Rapid and unpredictable changes. Used in on-line decision making/monitoring. Data gathered by (wireless sensor) networks: sensors that monitor light, humidity, pressure, and heat; network traffic passing via switches; sports scores, stock prices. Query bw_police: SELECT source_domain, number_of_packets FROM network_state WHEN bw_used_exceeds_allocation
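A minimal sketch of how a continuous query like bw_police could be evaluated over a stream of network_state tuples. The field names bw_used and bw_allocated, and the tuple layout, are illustrative assumptions; the slide only gives the query text.

```python
# Sketch of the bw_police continuous query over a stream of network_state
# tuples. Field names beyond those in the slide's SELECT clause are
# illustrative assumptions.

def bw_police(network_state_stream):
    """Emit (source_domain, number_of_packets) WHEN bw_used exceeds allocation."""
    for tup in network_state_stream:
        if tup["bw_used"] > tup["bw_allocated"]:
            yield (tup["source_domain"], tup["number_of_packets"])

stream = [
    {"source_domain": "a.example", "number_of_packets": 10, "bw_used": 5,  "bw_allocated": 8},
    {"source_domain": "b.example", "number_of_packets": 90, "bw_used": 12, "bw_allocated": 8},
]
print(list(bw_police(stream)))  # -> [('b.example', 90)]
```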

3 Coherency Requirement (c) Clients specify the bound c on the tolerable imprecision of the requested data: the requirement is met while the client's view U(t) stays within c of the source value S(t). Example: #packets sent, max incoherency = 100. Metric: fidelity, the % of time when the coherency requirement is met. Goal: to achieve high fidelity and resiliency. (Figure: Source S(t) → Repository P(t) → Client U(t).)
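A minimal sketch of the fidelity metric, assuming uniformly sampled source and client traces; function and variable names are illustrative, not from the talk.

```python
# Sketch: fidelity as the percentage of observations at which the client's
# view is within the coherency bound c of the source value.

def fidelity(source_values, client_values, c):
    """% of time instants at which |U(t) - S(t)| <= c."""
    assert len(source_values) == len(client_values)
    in_sync = sum(1 for s, u in zip(source_values, client_values)
                  if abs(s - u) <= c)
    return 100.0 * in_sync / len(source_values)

# Example with c = 100 (#packets): the client copy is stale at one instant.
S = [0, 120, 260, 390, 540]
U = [0, 120, 150, 390, 540]   # off by 110 > c at t = 2
print(fidelity(S, U, c=100))  # -> 80.0
```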

4 Point to Point Data Dissemination Pull: Client pulls data from the source. Push: Source pushes data of interest to the client.

5 Generic Architecture (Figure: data sources (servers, sensors) connected through the network to end-hosts (wired hosts, mobile hosts).)

6 Generic Architecture (Figure: as before, with proxies/caches interposed between the network and the end-hosts.)

7 Where should the queries execute ? At clients can’t optimize across clients, links At source (where changes take place) Advantages Minimum number of refresh messages, high fidelity Main challenge Scalability Multiple sources hard to handle At Data Aggregators -- DAs/proxies -- placed at edge of network Advantages Allows scalability through consolidation, Multiple data sources Main challenge Need mechanisms for maintaining data consistency at DAs

8 The Basic Problem… Executing CQs at the edge of the network -- at Data Aggregators -- will improve scalability and reduce overheads, but poses challenges for consistency maintenance. Computation vs. data dissemination: moving query processing on streaming/changing data from the sources to the execution points (edge servers -- proxies/caches/repositories).

9 Focus of this talk To create a resilient and efficient content distribution network (CDN) for streaming/dynamic data. Existing CDNs handle static data and dynamic web pages: Akamai, Dynamai. (Figure: sources → cooperating repositories → clients.)

10 Overview Architecture of our CDN. Techniques for data dissemination. Techniques to build the overlay network. Resilient data dissemination -- suitable replication of disseminated data. (Figure: sources → cooperating repositories → clients.)

11 Cooperative Repository Architecture Source(s), repositories (and clients). Each repository specifies its coherency requirement. The source pushes specific changes to selected repositories. Repositories cooperate with each other and with the source.

12 Dissemination Graph: Example Data set: p, q, r, s. Max # push connections: 2. (Figure: dissemination graph over the source, repositories A, B, C, D, and a client, with edges labeled by per-item coherency requirements such as p:0.2, q:0.2, r:0.2, p:0.4, r:0.3, q:0.3.)

13 Overview Architecture of our CDN. Techniques for data dissemination. Techniques to build the overlay network. Resilient data dissemination -- suitable replication of disseminated data. (Figure: sources → cooperating repositories → clients.)

14 Data Dissemination Different users have different coherency requirements for the same data item. The coherency requirement at a repository should be at least as stringent as that of its dependents. Repositories disseminate only changes of interest. (Figure: dissemination graph over the source, repositories A, B, C, D, and a client, annotated with requirements p:0.2, q:0.2, r:0.2, p:0.4, r:0.3, q:0.4, q:0.3.)

15 Data Dissemination A repository P sends a change of interest to a dependent Q if the new value differs from the last value sent to Q by at least Q's coherency requirement, i.e., |v - last_sent(Q)| >= c_Q.

16 Data dissemination -- must be done with care (Figure: Source → Repository P → Repository Q.) Should prevent missed updates!

17 Dissemination Algorithms Source Based (Centralized) Repository Based (Distributed)

18 Source Based Dissemination Algorithm For each data item, the source maintains the unique coherency requirements of the repositories and the last update sent for each such coherency. For every change, the source finds the maximum coherency for which it must be disseminated, tags the change with that coherency, and disseminates (changed data, tag).
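A minimal sketch of the source-side bookkeeping described above, under simplifying assumptions: numeric data values, one data item, and illustrative class and method names. The tag is the maximum (least stringent) coherency for which the change must go out; downstream repositories use it to decide how far to forward.

```python
# Sketch of the source-based dissemination algorithm for one data item.
# The source keeps the set of unique coherency requirements and, for each,
# the last value disseminated at that requirement.

class Source:
    def __init__(self, coherencies):
        # last value sent per unique coherency requirement (None = never sent)
        self.last_sent = {c: None for c in coherencies}

    def on_change(self, value):
        """Return the coherency tag for this change, or None if nothing is sent."""
        due = [c for c, last in self.last_sent.items()
               if last is None or abs(value - last) >= c]
        if not due:
            return None
        tag = max(due)           # least stringent requirement that must see it
        for c in due:
            self.last_sent[c] = value
        return tag               # disseminate (value, tag)

src = Source([0.2, 0.4])
print(src.on_change(100.0))   # first change reaches everyone -> tag 0.4
print(src.on_change(100.3))   # exceeds 0.2 but not 0.4 -> tag 0.2
print(src.on_change(100.35))  # within both bounds -> None, no message
```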

19 Source Based Dissemination Algorithm (Figure: example run across Source → Repository P → Repository Q.)

20 Repository Based Dissemination Algorithm A repository P sends a change of interest to a dependent Q if the new value differs from the last value P sent to Q by at least Q's coherency requirement c_Q.

21 Repository Based Dissemination Algorithm (Figure: example run across Source → Repository P → Repository Q.)
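A minimal sketch of the repository-side check, with illustrative names: each repository remembers the last value it forwarded to each dependent and forwards a new value only when the difference reaches that dependent's coherency requirement.

```python
# Sketch of the repository-based dissemination check for one data item.

class Repository:
    def __init__(self):
        self.dependents = {}   # name -> (c_Q, last value sent to Q)

    def add_dependent(self, name, c):
        self.dependents[name] = (c, None)

    def on_update(self, value):
        """Return the dependents to which this value is forwarded."""
        pushed = []
        for name, (c, last) in self.dependents.items():
            if last is None or abs(value - last) >= c:
                self.dependents[name] = (c, value)
                pushed.append(name)
        return pushed

p = Repository()
p.add_dependent("Q", 0.3)
print(p.on_update(10.0))   # -> ['Q']  (first value always goes out)
print(p.on_update(10.1))   # -> []    (change of 0.1 < 0.3)
print(p.on_update(10.4))   # -> ['Q'] (0.4 >= 0.3 relative to last sent 10.0)
```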

22 Dissemination Algorithms – number of checks by the source The repository based algorithm requires fewer checks at the source.

23 Dissemination Algorithms – number of messages The source based algorithm requires fewer messages.

24 Overview Architecture of our CDN. Techniques for data dissemination. Techniques to build the overlay network. Resilient data dissemination -- suitable replication of disseminated data. (Figure: sources → cooperating repositories → clients.)

25 Logical Structure of the CDN Repositories want many data items, each with different coherency requirements. Issue: which repository should serve what to whom? (Figure: source and repositories A, B, C, D annotated with requirements p:0.2, q:0.2, r:0.2, q:0.4, p:0.4, r:0.3, q:0.3.)

26 Constructing the Layout Network Insert repositories one by one. Check level by level, starting from the source. Each level has a load controller. The load controller tries to find data providers for the new repository (Q).

27 Selecting Data Providers Repositories with low preference factor are considered as potential data providers. The most preferred repository with a needed data item is made the provider of that data item. The most preferred repository is made to provide the remaining data items.

28 Preference Factor Resource availability factor: can repository P be the provider for one more dependent? Data availability factor: # data items that P can provide for the new repository Q. Computational delay factor: # dependents P provides for. Communication delay factor: network delay between the two repositories.
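A sketch of how the four factors above could be combined into a single score. The slide only names the factors; the weights, the linear combination, and all field names are illustrative assumptions, not the talk's actual formula.

```python
# Sketch: combine the four preference factors into one score.
# Lower score = more preferred data provider (illustrative convention).

def preference_factor(p, new_repo, w=(1.0, 1.0, 1.0)):
    # Resource availability: hard gate, P must be able to take one more dependent.
    if p["num_dependents"] >= p["max_push_connections"]:
        return float("inf")
    # Data availability: the more needed items P has, the more preferred it is.
    available = len(set(p["items"]) & set(new_repo["needed_items"]))
    data_factor = -available
    # Computational delay: more existing dependents -> less preferred.
    comp_factor = p["num_dependents"]
    # Communication delay between the two repositories.
    comm_factor = p["delay_to"][new_repo["name"]]
    return w[0] * data_factor + w[1] * comp_factor + w[2] * comm_factor

providers = [
    {"name": "A", "items": {"p", "q"}, "num_dependents": 2,
     "max_push_connections": 4, "delay_to": {"Q": 10}},
    {"name": "B", "items": {"q", "r"}, "num_dependents": 1,
     "max_push_connections": 4, "delay_to": {"Q": 8}},
]
new_repo = {"name": "Q", "needed_items": {"q", "r"}}
best = min(providers, key=lambda p: preference_factor(p, new_repo))
print(best["name"])  # -> 'B' (serves both needed items, fewer dependents, lower delay)
```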

29 Experimental Methodology Physical network: 4 servers, 600 routers, 100 repositories. Communication delay: ms. Computation delay: 3-5 ms. Real stock traces; time duration of observations: 10,000 s. Tight coherency range: 0.01 to 0.05; loose coherency range: 0.5 to 0.99.

30 Loss of fidelity for different coherency requirements The less stringent the coherency requirement, the better the fidelity. For little/no cooperation, loss in fidelity is high. Too much cooperation? (Figure: loss in fidelity (%) vs. degree of cooperation; T% of the data items have stringent coherency requirements.)

31 Controlled cooperation

32 Controlled cooperation is essential (Figure: results without controlled cooperation vs. with controlled cooperation, plotted over the degree of cooperation and the max degree of cooperation.)

33 But … Loss in fidelity increases for a large # of data items. Repositories with stringent coherency requirements should be closer to the source.

34 If parents are not chosen judiciously It may result in: uneven distribution of load on repositories, an increase in the number of messages in the system, and an increase in the loss in fidelity! (Figure: source and repositories A, B, C, D with requirements p:0.2, q:0.2, r:0.2, p:0.4, r:0.3, q:0.3.)

35 Data Item at a Time Algorithm A dissemination tree for each data item; the source serving the data item is the root. Repositories with more stringent coherency requirements are placed closer to the root. When multiple data items are considered, the repositories form a peer-to-peer network. (Figure: dynamic data dissemination tree for q, with repositories A and C at coherencies 0.2 and 0.3.)

36 DiTA Repository N needs data item x. If the source has available push connections, or the source is the only node in the dissemination tree for x: N is made a child of the source. Else: N is inserted in the most suitable subtree, where N's ancestors have more stringent coherency requirements and N is closest to the root.

37 Most Suitable Subtree? l: smallest level in the subtree with a coherency requirement less stringent than N's. d: communication delay from the root of the subtree to N. The subtree with the smallest (l × d) is the most suitable. Essentially, minimize communication delays!
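A loose sketch of DiTA insertion for one data item's tree, under simplifying assumptions: every node knows its coherency requirement and pairwise delays are given in a table; tree restructuring when a more stringent repository arrives later is omitted. All names are illustrative.

```python
# Sketch of DiTA insertion for a single data item's dissemination tree.

class Node:
    def __init__(self, name, c, max_children=2):
        self.name, self.c = name, c
        self.children = []
        self.max_children = max_children

def eligible_level(root, new_c, level=1):
    """Smallest level in the subtree whose coherency requirement is less
    stringent (numerically larger) than new_c; None if there is none."""
    if root.c > new_c:
        return level
    levels = [eligible_level(ch, new_c, level + 1) for ch in root.children]
    levels = [l for l in levels if l is not None]
    return min(levels) if levels else None

def insert(source, node, delay):
    """delay[(a, b)]: communication delay between repositories a and b."""
    if len(source.children) < source.max_children:
        source.children.append(node)     # free push connection at this node
        return
    # Otherwise pick the child subtree minimizing l * d ("most suitable").
    def score(sub):
        l = eligible_level(sub, node.c)
        d = delay[(sub.name, node.name)]
        return (l if l is not None else float("inf")) * d
    best = min(source.children, key=score)
    insert(best, node, delay)            # recurse into the chosen subtree

src = Node("source", c=float("inf"), max_children=1)
a = Node("A", c=0.2)
b = Node("B", c=0.3)
delay = {("A", "B"): 5}
insert(src, a, delay)   # source has a free connection -> A under source
insert(src, b, delay)   # source full -> B goes under A (A is more stringent)
print([ch.name for ch in a.children])  # -> ['B']
```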

38 Example Initially the network consists of only the source. A and B request service of q with coherency requirement 0.2. C then requests service of q with coherency requirement 0.1. (Figure: the resulting dissemination tree for q.)

39 Experimental Evaluation

40 Overview Architecture of our CDN. Techniques for data dissemination. Techniques to build the overlay network. Resilient data dissemination -- suitable replication of disseminated data. (Figure: sources → cooperating repositories → clients.)

41 Handling Failures in the Network Need to detect permanent/transient failures in the network and to recover from them. Resiliency is obtained by adding redundancy. Without redundancy, failures → loss in fidelity. Adding redundancy can increase cost → possible loss of fidelity! Handle failures such that the cost of adding resiliency is low!

42 Passive/Active Failure Handling Passive failure detection: the parent sends I'm-alive messages at the end of every time interval. What should the time interval be? Active failure handling: always be prepared for failures. For example: 2 repositories can serve the same data item at the same coherency to a child. This means lots of work → greater loss in fidelity. Need to be clever.
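A minimal sketch of passive detection with periodic I'm-alive messages. The slide leaves the interval question open; declaring failure after two missed heartbeats (2 × T) is an illustrative choice, and the class and method names are assumptions.

```python
# Sketch: passive failure detection from periodic "I'm alive" heartbeats.

class FailureDetector:
    def __init__(self, interval, misses_allowed=2):
        # Tolerate a bounded number of missed heartbeats before suspecting
        # failure; the threshold 2*T is illustrative.
        self.timeout = interval * misses_allowed
        self.last_heartbeat = None

    def heartbeat(self, now):
        self.last_heartbeat = now

    def parent_alive(self, now):
        return (self.last_heartbeat is not None
                and now - self.last_heartbeat <= self.timeout)

fd = FailureDetector(interval=1.0)
fd.heartbeat(now=0.0)
print(fd.parent_alive(now=1.5))  # -> True  (within 2*T)
print(fd.parent_alive(now=3.0))  # -> False (no heartbeat for > 2*T)
```

A shorter interval detects failures faster but costs more messages, which is exactly the tension the slide raises.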

43 Middle Path A backup parent B is found for each data item that the repository needs. Let repository R want data item x with coherency c. At what coherency should B serve R? B serves R with coherency k × c. (Figure: P serves R at coherency c; backup B serves R at coherency k × c.)

44 If a parent fails Detection: the child gets two consecutive updates from the backup parent with no update from the parent in between. Recovery: the backup parent is asked to serve at coherency c till we get an update from the parent. (Figure: P serves R at coherency c; backup B serves R at coherency k × c.)
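A minimal sketch of the detection rule above: since the backup serves at the looser coherency k × c, any parent update should normally arrive between two backup updates; two consecutive backup updates with no parent update in between suggest the parent has failed. Names are illustrative.

```python
# Sketch: child-side detection of a failed parent via backup updates.

class Child:
    def __init__(self):
        self.backup_updates_since_parent = 0
        self.parent_suspected = False

    def update_from_parent(self, value):
        self.backup_updates_since_parent = 0
        self.parent_suspected = False     # parent is (still) alive

    def update_from_backup(self, value):
        self.backup_updates_since_parent += 1
        if self.backup_updates_since_parent >= 2:
            self.parent_suspected = True  # ask B to serve at coherency c

r = Child()
r.update_from_parent(10.0)
r.update_from_backup(10.5)
print(r.parent_suspected)   # -> False (only one backup update so far)
r.update_from_backup(11.0)
print(r.parent_suspected)   # -> True  (two in a row, no parent update between)
```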

45 Adding Resiliency to DiTA A sibling A of P is chosen as the backup parent of R. If P fails, A serves R with coherency c: the change is local. If P has no siblings, a sibling of the nearest ancestor is chosen. Else the source is made the backup parent. (Figure: P serves R at coherency c; sibling A is the backup, serving at k × c.)

46 Markov Analysis for k Assumptions: the data changes as a random walk along the line; the probability of an increase is the same as that of a decrease; no assumptions are made about the unit of change or the time taken for a change. Expected # misses for any k: <= 2k^2 - 2. For k = 2, expected # misses <= 6.

47 Failure and Recovery Modelling Failures and recovery are modeled based on trends observed in practice (analysis of link failures in an IP backbone by G. Iannaccone et al., Internet Measurement Workshop 2002). Recovery: 10% take > 20 min; 40% take between 1 min and 20 min; 50% take < 1 min. (Figure: trend for time between failures.)

48 In the Presence of Failures, Varying Recovery Times Addition of resiliency does improve fidelity.

49 In the Presence of Failures, Varying Data Items (Increasing Load) Fidelity improves with the addition of resiliency even for a large number of data items.

50 In the Absence of Failures (Increasing Load) Often, fidelity improves with the addition of resiliency, even in the absence of failures!

51 Ongoing Work Mapping clients to repositories: locality, coherency requirements, load, mobility. Reorganization of the CDN: snapshot based, dynamic reorganization.