Collaborative Content Delivery. Werner Vogels, Robbert van Renesse, Ken Birman, Dept. of Computer Science, Cornell University. A peer-to-peer solution for web-based publish/subscribe.

Presentation transcript:

Collaborative Content Delivery
A peer-to-peer solution for web-based publish/subscribe
Werner Vogels, Robbert van Renesse, Ken Birman
Dept. of Computer Science, Cornell University
.: DRAFT :.

© Copyright 2002 Werner Vogels

Presentation duality …
- The case for Collaborative Content Delivery
vs.
- The innovative technology used to build the system
- Spectacularly scalable technology
- Secure, reliable, robust & fast
- A solution to many distributed management problems

Late night reading
- N.T.J. Bailey, Epidemic Theory of Infectious Diseases and its Applications, Second Edition, Hafner Press, 1975

The Problem
- Access to real-time information at syndicated news sites is highly inefficient
- An estimated 70%-80% of the bandwidth is wasted on redundant transport, both at the consumer and at the publisher
- Consumers frequently return to the website to receive timely updates

Isn’t this solved already?
- RSS – channels provide summaries for processing by bots
- But the mechanism remains “pull”
- HTTP – Delta should reduce bandwidth cost
- News feeds from major vendors
- “Push” is the right model for frequently changing data with timely delivery
- Proprietary formats and high fees
- Summary as a cheap alternative
- Still high bandwidth cost at the publisher
- Hybrid “push/pull” by organizations exploiting distributed content delivery

Scale is a major obstacle
- No coordinated action by syndication sites to provide a shared information push infrastructure
- The one-to-many technologies currently in use are inherently not scalable
- No technology is available that can deliver data from thousands of publishers to millions of subscribers in real time

We can do better
- Current push solutions fail to exploit the collaborative power of the Internet
- Ideally, a publisher injects one update into the world and all interested subscribers receive it
- In this model all consumers collaborate to route the information to the right subscribers
- The information arrives at all desktops within tens of seconds after publishing

Peer-to-Peer Solution
- P2P is the only approach to a cost-effective, scalable solution
- Subscribers weave an ad-hoc infrastructure for subscription-based routing
- Scalable, autonomous & decentralized management
- High level of robustness and reliability in message delivery
- Authentication of publishers

Emerging technologies
- Astrolabe, CAN, Chord, and Pastry are emerging research technologies
- Astrolabe is the furthest along in:
- Scalability
- Security integration
- Manageability
- Firewall, proxy and NAT support
- A complete technology that we are now using to develop applications

Astrolabe/Mariner
- A system for ultra-scalable, distributed state management
- Robust, through the use of epidemic techniques
- Scalable, through the use of information aggregation and fusion
- Secure, through certificates
- Flexible, through secure mobile code
- Simulated, emulated, tested and deployed

Astrolabe: Robust and Scalable Technology for Distributed System Monitoring, Management and Data Mining

Distributed Systems Management
- Is extremely important in the deployment of large systems
- Scalable management of applications and systems is still a major quest
- Management technology needs to be integrated into applications
- The management subsystem is often more complex than the application itself

Astrolabe
- Information/state management system
- Monitors the dynamically changing state of sets of distributed resources
- Reports summaries to its consumers
- Uses information hierarchies to organize the data
- Uses aggregation techniques to continuously compute the summary nodes in the system

Current use of Mariner
- Monitor and control applications, systems and infrastructure
- Resource discovery
- Collaboration management
- Coordination of distributed tasks
- Edge-caching control
- CDN dynamic management

Intuitively
- You can see Mariner as a large database with information about the global system
- None of this information resides on a single server
- Each principal has a row in the virtual database, which it is allowed to update with (attribute, value) pairs
- A principal can directly access only the rows of other nodes in its zone and of the intermediate nodes on its path through the hierarchy to the root

Mariner in a single zone
[Table: columns Name, Load, Weblogic?, SMTP?, Word Version, …; rows swift, falcon, cardinal]
- The lowest level in the hierarchies can be nodes, or finer grained if the application requires it
- The security key for the zone is needed to add a new column; the user key is needed to update a row
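The single-zone table above can be pictured with a small Python sketch. This is only an illustration under stated assumptions: the class `ZoneTable` and its methods are hypothetical names, the load value is made up, and a plain name comparison stands in for the zone and user security keys mentioned on the slide.

```python
class ZoneTable:
    """One zone viewed as a table: one row per principal, one column per attribute."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.rows = {}  # principal name -> {attribute: value}

    def add_row(self, principal):
        self.rows[principal] = {}

    def update(self, caller, principal, attribute, value):
        # Assumption: a name check stands in for the user key that
        # authorizes a principal to update its own row.
        if caller != principal:
            raise PermissionError(f"{caller} may not update the row of {principal}")
        if attribute not in self.columns:
            raise KeyError(f"unknown column {attribute!r}")
        self.rows[principal][attribute] = value

    def read(self, principal):
        # Return a copy so that readers cannot modify the stored row.
        return dict(self.rows[principal])


zone = ZoneTable(["Load", "Weblogic?", "SMTP?"])
for host in ("swift", "falcon", "cardinal"):
    zone.add_row(host)
zone.update("swift", "swift", "Load", 2.0)  # illustrative value
```

Any node in the zone can read the row of `swift`, but only `swift` itself can change it.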

Scalability through Hierarchy
- Leaves are organized into zones
- Each leaf has a self-managed attribute list
- The base zone is the collection of the individual attribute lists of its leaves
- Each intermediate zone is the collection of attribute lists constructed by aggregating the information in its child zones
- Each list has some basic attributes that Mariner uses to manage itself, such as contact lists, timestamps, etc.

Simple Hierarchy
[Table, San Francisco zone: columns Name, Load, Weblogic?, SMTP?, Word Version, …; rows swift, falcon, cardinal]
[Table, New Jersey zone: columns Name, Load, Weblogic?, SMTP?, …; rows gazelle, zebra, gnu]
[Table, parent zone: columns Name, Avg Load, WL contact, SMTP contact; rows SF, NJ, Paris]

Information Aggregation
- Aggregation functions are programmable
- Subset of SQL
- Code is embedded in aggregation function certificates (AFC)
- Signed certificate is installed into an attribute list
- Used to construct (new) attributes in zones of the hierarchy
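To make the idea of aggregation concrete, here is a rough Python sketch of how the rows of a child zone might be summarized into one row of its parent zone, using the column names from the Simple Hierarchy slide. The slides say the real aggregation functions are written in a subset of SQL and shipped in signed certificates; the function `aggregate_zone`, the rule for choosing a contact (least-loaded node that runs the service), and all values below are assumptions made for illustration.

```python
def aggregate_zone(zone_name, child_rows):
    """Summarize the rows of one zone into a single row for the parent zone."""
    loads = [row["Load"] for row in child_rows]

    def contact_for(attribute):
        # Assumed rule: the least-loaded node that offers the service.
        candidates = [row for row in child_rows if row.get(attribute)]
        if not candidates:
            return None
        return min(candidates, key=lambda row: row["Load"])["Name"]

    return {
        "Name": zone_name,
        "Avg Load": sum(loads) / len(loads),
        "WL contact": contact_for("Weblogic?"),
        "SMTP contact": contact_for("SMTP?"),
    }


# Illustrative values only; the transcript does not preserve the real ones.
san_francisco = [
    {"Name": "swift", "Load": 2.0, "Weblogic?": False, "SMTP?": True},
    {"Name": "falcon", "Load": 1.5, "Weblogic?": True, "SMTP?": False},
    {"Name": "cardinal", "Load": 4.0, "Weblogic?": True, "SMTP?": True},
]
sf_row = aggregate_zone("SF", san_francisco)
```

The parent zone stores only `sf_row`, so its size depends on the number of child zones rather than on the number of leaves below them.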

Epidemic Dissemination
- Each Astrolabe instance maintains all the zones on its path to the root
- No centralized servers for intermediate zones
- Consequently each instance has a copy of the root zone
- Replication is achieved through gossip techniques
- Guarantees eventual consistency

AFC propagation
1. Output of the AFC includes a copy of itself – results in a copy of the AFC in the parent zone, so it reaches the root and the leaves of other zones
2. Adoption – check the ancestors' lists to find new AFCs
- Spreads through the system in the order of tens of seconds
- Certificates have an expiration date; unless refreshed, aggregation eventually halts

I’ll skip
- Aggregation function details
- Mobile code details
- Eventual consistency
- Certificates
- Authentication
- Firewalls & NATs

Robustness through Gossip
- Use of epidemic techniques to disseminate data and AFCs
- Pure peer-to-peer communication
- Fully autonomous progress
- Actions based on probability theory
- Robustness improves with scale
- Fixed low overhead, independent of scale
- Control as well as data transport

Gossip
- Conceptually: each zone periodically picks another zone at random and exchanges the state of those zones
- Slightly more complex because there are virtual zones …
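The conceptual exchange can be simulated in a few lines of Python. This is a simplified sketch, not the Astrolabe protocol: it ignores the zone hierarchy and virtual zones, assumes every participant can reach every other, and resolves conflicts by keeping the entry with the higher timestamp. The names `merge`, `gossip_round` and `rounds_until_consistent` are hypothetical.

```python
import random


def merge(a, b):
    """Union of two states, keeping the entry with the higher timestamp per key."""
    merged = dict(a)
    for key, (stamp, value) in b.items():
        if key not in merged or merged[key][0] < stamp:
            merged[key] = (stamp, value)
    return merged


def gossip_round(states, rng):
    """Every participant picks one random peer and both adopt the merged state."""
    names = list(states)
    for name in names:
        peer = rng.choice([n for n in names if n != name])
        combined = merge(states[name], states[peer])
        states[name] = combined
        states[peer] = dict(combined)


def rounds_until_consistent(states, rng, max_rounds=200):
    """Gossip until all participants hold the same state; None if that never happens."""
    for round_no in range(1, max_rounds + 1):
        gossip_round(states, rng)
        first = next(iter(states.values()))
        if all(state == first for state in states.values()):
            return round_no
    return None


rng = random.Random(42)  # fixed seed so the run is repeatable
# Each participant starts out knowing only its own (timestamp, value) entry.
states = {f"node{i}": {f"node{i}": (1, f"load-{i}")} for i in range(16)}
rounds = rounds_until_consistent(states, rng)
```

No participant coordinates the exchange, yet after a small number of rounds every participant holds all sixteen entries, which is the eventual consistency the slides refer to.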

Gossip target selection
[Figure: example hierarchy, columns from top level to bottom: Asia, Europe, USA | Cornell, MIT, USCD, U-Wash | Node1, Node2, Node3, Node4 | System, Inventory, Monitor]
1. Each instance updates the issued attribute and evaluates the dependent AFCs
2. An agent (instance) will gossip on behalf of those zones for which it is a contact, at a rate that depends on configuration
3. At each level, pick a child at random from the contact list and exchange state
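Step 3 might look roughly like the following Python sketch. It assumes, for simplicity, that the agent is a contact for every zone on its path to the root, and that its view maps each level to the child zones at that level and their contact agents. The function name, the data layout and the contact names (`a1`, `c1`, and so on) are hypothetical.

```python
import random


def pick_gossip_targets(path, own_zones, rng):
    """For each level on the path to the root, pick a random sibling zone
    and then a random contact agent inside that zone."""
    targets = []
    for level, children in path.items():
        siblings = [zone for zone in children if zone not in own_zones]
        if not siblings:
            continue  # nothing to gossip with at this level
        zone = rng.choice(siblings)
        targets.append((level, zone, rng.choice(children[zone])))
    return targets


# Hypothetical view held by an agent that lives in /USA/Cornell.
path = {
    "/": {"Asia": ["a1"], "Europe": ["e1", "e2"], "USA": ["u1", "u2"]},
    "/USA": {"Cornell": ["c1", "c2"], "MIT": ["m1"], "U-Wash": ["w1", "w2"]},
}
targets = pick_gossip_targets(
    path, own_zones={"USA", "Cornell"}, rng=random.Random(7)
)
```

Each tuple in `targets` names the level, the sibling zone chosen at that level, and the contact agent with which state would be exchanged.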

Membership
- Failure detection
  - If no update is seen for an agent within time T_fail, remove it from the system
- Integration
  - After partitions, crashes, etc., renegade trees can be formed
  - Use of broadcast, multicast and hints to discover other agents
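The failure-detection rule amounts to a timeout on the last update seen from each agent. A minimal Python sketch, assuming timestamps in seconds and a hypothetical function name `remove_failed`:

```python
def remove_failed(last_update, now, t_fail):
    """Keep only the agents whose most recent update is at most t_fail old."""
    return {
        agent: seen
        for agent, seen in last_update.items()
        if now - seen <= t_fail
    }


# Illustrative timestamps: falcon has been silent for 70 seconds.
last_update = {"swift": 100.0, "falcon": 40.0, "cardinal": 95.0}
alive = remove_failed(last_update, now=110.0, t_fail=30.0)
```

Because updates arrive through gossip rather than through a direct heartbeat, T_fail has to be chosen large enough to cover the time an update needs to spread.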

Subscription routing
- At the leaves, the subscribers store subscription information
- Aggregation functions combine the subscriptions of participants into subscriptions for the zone
- Publishers use zone.send(subscription, data), which is forwarded if the zone has children that match the subscription
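The routing rule can be sketched in Python as follows. Only the call shape `zone.send(subscription, data)` comes from the slide; the `Zone` class, the use of a plain set union as the aggregation function, and the recursive on-demand computation of a zone's subscriptions are assumptions (in the system described, aggregates are maintained by AFCs and spread by gossip).

```python
class Zone:
    """A zone in the hierarchy; a zone without children is a subscriber leaf."""

    def __init__(self, name, children=(), subscriptions=()):
        self.name = name
        self.children = list(children)
        self.own = set(subscriptions)  # subscriptions stored at a leaf
        self.delivered = []            # data received by a leaf

    def subscriptions(self):
        # Assumed aggregation: the union of everything subscribed to below.
        subs = set(self.own)
        for child in self.children:
            subs |= child.subscriptions()
        return subs

    def send(self, subscription, data):
        if subscription not in self.subscriptions():
            return  # nobody below this zone is interested
        if not self.children:
            self.delivered.append(data)
            return
        for child in self.children:
            child.send(subscription, data)


sf = Zone("SF", children=[
    Zone("swift", subscriptions={"sports"}),
    Zone("falcon", subscriptions={"finance"}),
])
nj = Zone("NJ", children=[Zone("gazelle", subscriptions={"finance"})])
root = Zone("root", children=[sf, nj])
root.send("sports", "match report")
```

The update is never forwarded into the NJ zone, because the aggregated subscriptions of that zone do not include `sports`.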

Routing infrastructure
- Each zone dynamically selects 2-3 routing nodes using AFCs, based on various load factors
- These nodes receive news items for their children in their zone
- Forwarding is based on the individual subscription information
- Redundancy is used to achieve robustness and reliability
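One way to picture the selection of routing nodes is the Python sketch below. The slide says only that AFCs choose 2-3 nodes using load factors; picking the least-loaded nodes from a single load number, the function name `select_routing_nodes` and the values are assumptions.

```python
def select_routing_nodes(loads, count=3):
    """Return the names of the `count` least-loaded nodes, ties broken by name."""
    ranked = sorted(loads, key=lambda name: (loads[name], name))
    return ranked[:count]


# Illustrative load values.
loads = {"swift": 2.0, "falcon": 1.5, "cardinal": 4.0, "gazelle": 0.5}
routers = select_routing_nodes(loads, count=3)
```

Selecting more than one routing node per zone gives the redundancy mentioned above: a news item can still reach the zone if one of the selected nodes fails.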

Summary