Epidemic Techniques Chiu Wah So (Kelvin)

Database Replication
Why do we replicate a database?
– Low latency
– High availability
Strong (sequential) consistency on a replicated database can be achieved in the following ways, but they are not scalable:
– One primary database
– Quorum system (contact over half of the replicas)

Database Replication
To scale and still have high availability, we need a weaker consistency model.
– Eventual consistency: if all updates stop, then all replicas eventually converge to identical values.
The two papers discuss how to use epidemic techniques to achieve eventual consistency at scale.

Epidemic Techniques for Replicated Databases
Epidemic Algorithms for Replicated Database Maintenance
– Looks at different epidemic algorithms that reduce the bandwidth needed to maintain a replicated database
Astrolabe
– A scalable and robust information management system

Motivation of the first paper
The Clearinghouse service maintains translations from names to machine addresses (like DNS).
Problem: with direct mail and anti-entropy, maintaining consistency between highly replicated servers generates too much traffic. Some key links are overloaded.
The paper looks at techniques to reduce bandwidth: rumor spreading and spatial distributions.

Direct mail
Direct mail: each server sends every update to all other servers.
Advantages:
– Easy to implement
– Good enough for a small, static set of servers
Disadvantages:
– Does not scale (O(n) messages for each update)
– Updates may get lost

Anti-entropy
Each server picks a random server and resolves differences with it. There are 3 ways to resolve differences: push, pull, and push-pull.
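The three ways of resolving differences can be sketched as follows. This is a minimal illustration, not the Clearinghouse implementation: it assumes each replica is a dictionary mapping a key to a (value, timestamp) pair and that the newer timestamp wins. The function names are hypothetical.

```python
# Each replica is modeled as {key: (value, timestamp)}; the newer timestamp wins.
# This representation is an assumption made for illustration only.

def push(src, dst):
    """Copy entries from src into dst when src holds a newer version."""
    for key, (value, ts) in src.items():
        if key not in dst or dst[key][1] < ts:
            dst[key] = (value, ts)

def pull(src, dst):
    """The initiator (src) fetches newer entries from dst."""
    push(dst, src)

def push_pull(a, b):
    """Both sides end up with the newest version of every key."""
    push(a, b)
    push(b, a)

a = {"x": ("old", 1), "y": ("new", 5)}
b = {"x": ("new", 3)}
push_pull(a, b)
```

After the push-pull exchange both replicas hold the same contents, which is the property anti-entropy relies on.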

Anti-entropy Example

Push

Pull

Push-Pull

Anti-entropy Average time
Updates converge in O(log(n)) steps on average.
[Two cases compared: pull and push-pull; push]
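As a reference point, the analysis in Demers et al. tracks p_i, the probability that a site is still susceptible after the i-th anti-entropy cycle, with n the number of sites. The recurrences below are taken from that analysis and are offered as a sketch of why pull (and push-pull) finishes faster than push once few susceptible sites remain; they may not match what the slide showed.

```latex
\begin{align*}
\text{Pull:} \quad & p_{i+1} = p_i^{2} \\
\text{Push:} \quad & p_{i+1} = p_i \left(1 - \frac{1}{n}\right)^{n(1 - p_i)}
  \approx p_i \, e^{-1} \quad \text{(for small } p_i \text{ and large } n\text{)}
\end{align*}
```

With pull the susceptible fraction is squared every cycle, while with push it only shrinks by a constant factor per cycle.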

Anti-entropy (2)
Sending the whole database across the network for comparison is very expensive. Some techniques for reducing the bandwidth used by comparison:
– Compute checksums and compare them
– Exchange lists of recent updates, apply them, and then compare checksums
– Exchange updates in reverse chronological order until the checksums agree
Still too much bandwidth...
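The checksum and recent-update-list optimizations might look roughly like this. It is a sketch under assumptions: replicas are dictionaries of key to (value, timestamp), SHA-256 stands in for whatever checksum the real system used, and `window` is a hypothetical parameter defining what counts as "recent".

```python
import hashlib

def checksum(db):
    """Digest of a replica {key: (value, timestamp)}, independent of insertion order."""
    h = hashlib.sha256()
    for key in sorted(db):
        h.update(repr((key, db[key])).encode())
    return h.hexdigest()

def recent_updates(db, now, window):
    """Entries whose timestamp lies within `window` of `now`."""
    return {k: v for k, v in db.items() if now - v[1] <= window}

def merge(dst, updates):
    """Apply updates to dst, keeping the newer timestamp for each key."""
    for key, (value, ts) in updates.items():
        if key not in dst or dst[key][1] < ts:
            dst[key] = (value, ts)

def cheap_sync(a, b, now, window):
    """Return True if exchanging recent updates alone made a and b agree."""
    if checksum(a) != checksum(b):
        ra = recent_updates(a, now, window)
        rb = recent_updates(b, now, window)
        merge(a, rb)
        merge(b, ra)
    return checksum(a) == checksum(b)

a = {"x": ("v1", 1), "y": ("v2", 9)}
b = {"x": ("v1", 1)}
agreed = cheap_sync(a, b, now=10, window=5)
```

When `cheap_sync` returns False, the replicas still differ in older entries and a more expensive comparison is needed.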

Rumor spreading
Main idea: send out updates randomly instead of comparing whole databases.
Three states: susceptible, infective, and removed.
– Initially all servers are susceptible.
– Once a server has a rumor, it becomes infective and repeatedly picks a random server to send the rumor to.
– With probability 1/k, the server loses interest in spreading the rumor and becomes removed.
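A small simulation can make the three states concrete. The sketch below assumes the feedback-with-coin variant from the Demers et al. paper, where an infective server flips the 1/k coin only after contacting a server that already knew the rumor; the slide does not say which variant it means, and the function name and round structure are illustrative choices.

```python
import random

def spread_rumor(n, k, rng):
    """Simulate push rumor spreading among n >= 2 servers.

    Returns the number of servers that never received the rumor.
    """
    state = ["susceptible"] * n
    state[0] = "infective"            # server 0 receives the update first
    infective = [0]
    while infective:
        next_round = []
        for s in infective:
            target = rng.randrange(n - 1)
            if target >= s:
                target += 1           # uniform choice among servers other than s
            already_knew = state[target] != "susceptible"
            if not already_knew:
                state[target] = "infective"
                next_round.append(target)
            if already_knew and rng.random() < 1.0 / k:
                state[s] = "removed"  # loses interest in the rumor
            else:
                next_round.append(s)
        infective = next_round
    return state.count("susceptible")

missed = spread_rumor(1000, 2, random.Random(0))
```

The epidemic ends when no infective servers remain; `missed` counts the servers a later anti-entropy pass would have to repair.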

Rumor spreading (2)
But not every server may receive the rumor.
– A server has some probability of remaining susceptible after the epidemic finishes.
Run anti-entropy infrequently to make sure every server gets the update.
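For the feedback-with-coin variant, the Demers et al. paper gives the residue s (the fraction left susceptible) as the solution of s = e^(-(k+1)(1-s)). That formula comes from the paper rather than from the slide text, and the fixed-point iteration below is just one simple way to evaluate it; the function name is hypothetical.

```python
import math

def residue(k, iterations=200):
    """Solve s = exp(-(k + 1) * (1 - s)) for the non-trivial root by iteration."""
    s = 0.0                      # start away from the trivial root s = 1
    for _ in range(iterations):
        s = math.exp(-(k + 1) * (1.0 - s))
    return s

residue_k1 = residue(1)          # roughly 0.20
residue_k2 = residue(2)          # roughly 0.06
```

Even with k = 2 a few percent of servers can miss an update, which is why an occasional anti-entropy pass is kept as a backup.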

Three goals in rumor spreading
– Low residue: the probability of remaining susceptible when the epidemic finishes
– Low traffic: the total traffic sent per site
– Low delay: the average time, and the time for the last site, between the injection of an update and its arrival

Variations in rumor spreading
Many variations are possible:
– Blind with coin vs. feedback with counter
– Push vs. pull
– Increase only the smaller of the two counters
– Connection limit
– Hunting

Feedback Counter vs Blind Prob.
[Two result panels: Feedback and Counter; Blind and Probability]

Deletion and Death Certificates
Simple solution: replace a deleted item with a death certificate and keep the certificate for a fixed threshold of time.
Second solution: dormant death certificates. Use two time thresholds, with some servers keeping the certificate longer. Each certificate carries 2 different timestamps: the original timestamp and a reactivation timestamp.
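The simple death-certificate scheme can be sketched as below. It assumes replicas are dictionaries of key to (value, timestamp) and represents a death certificate as a None value with the deletion timestamp; both choices, and all names, are illustrative. Dormant certificates are not modeled.

```python
TOMBSTONE = None  # assumption: a death certificate is stored as (None, timestamp)

def apply_update(db, key, value, ts):
    """Apply an update unless a newer entry or death certificate already exists."""
    current = db.get(key)
    if current is None or current[1] < ts:
        db[key] = (value, ts)

def delete(db, key, ts):
    """Deleting writes a death certificate instead of removing the key."""
    apply_update(db, key, TOMBSTONE, ts)

def expire_certificates(db, now, threshold):
    """Discard death certificates older than the fixed threshold."""
    expired = [k for k, (v, ts) in db.items()
               if v is TOMBSTONE and now - ts > threshold]
    for key in expired:
        del db[key]

db = {}
apply_update(db, "printer", "10.0.0.7", ts=1)
delete(db, "printer", ts=5)
apply_update(db, "printer", "10.0.0.7", ts=1)   # stale copy arrives later
state_after_stale_copy = db["printer"]
expire_certificates(db, now=100, threshold=30)
```

The certificate keeps the stale copy from resurrecting the deleted item. Once it expires, nothing stops an even later stale copy, which is the weakness the dormant variant addresses.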

Motivation of Spatial Distributions
The network is not uniform. Certain key links in the network are overloaded.
– Transatlantic links carry about 80 conversations, but the average number of conversations per link is 6.
Therefore, we should favor nearby neighbors.

Spatial Distributions
Each server s sorts the list of sites by distance from s.
It selects anti-entropy exchange partners from the sorted list according to a function f(i), where i is the index in the sorted list.
We can use f(i) = i^(-a), where a is a parameter for tuning the spatial distribution.
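Partner selection with f(i) = i^(-a) can be sketched as a weighted random choice. The sketch assumes the index i starts at 1 for the nearest site and that f(i) is used as an unnormalized weight; the function name and the three-site list are hypothetical.

```python
import random

def pick_partner(sites_by_distance, a, rng):
    """Pick a partner; the i-th nearest site (i starting at 1) has weight i**(-a)."""
    weights = [i ** (-a) for i in range(1, len(sites_by_distance) + 1)]
    return rng.choices(sites_by_distance, weights=weights, k=1)[0]

rng = random.Random(1)
sites = ["near", "mid", "far"]            # already sorted by distance from s
picks = [pick_partner(sites, 2.0, rng) for _ in range(1000)]
```

A larger a concentrates traffic on nearby sites, while a = 0 falls back to uniform selection.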

Spatial Distribution

Next Paper: Astrolabe
The first paper shows how to use rumor spreading and spatial distributions to reduce bandwidth.
But storage grows as O(n) and the total bandwidth taken up by gossip grows as O(n^2).
We need a more scalable solution.

Astrolabe Scalable and Robust information management system. Monitors the dynamically changing state of a collection of distributed resources. Reports summaries of this information to users.

Four design goals
– Scalability: scales through its zone hierarchy. Information is summarized before it is exchanged.
– Flexibility: new aggregation functions are easy to install in the form of SQL aggregation queries.
– Robustness: uses a randomized peer-to-peer approach to exchange information.
– Security: uses signed certificates.

Examples: uses of Astrolabe
– Peer-to-Peer Caching of Large Objects
– Peer-to-Peer Data Diffusion
– Publish-Subscribe
– Synchronization

Structure of Astrolabe
The structure of Astrolabe's zones can be viewed as a tree. The leaves of this tree are hosts. Each host runs an Astrolabe agent.

Astrolabe Detail
– Each agent is a virtual database.
– Each agent has a path name (for example, /USA/Cornell/pc3).
– Each agent holds information, called a MIB, for all of its ancestor zones (for example, /, /USA, and /USA/Cornell).
– Each ancestor MIB is generated by aggregation for scalability, instead of having O(n) entries.
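The relationship between an agent's path name and the zones it holds MIBs for can be shown with a few lines of code. The helper name is hypothetical and this is only a sketch of the naming scheme, not Astrolabe code.

```python
def ancestor_zones(agent_path):
    """List the ancestor zones of an agent path, from the root downward."""
    parts = [p for p in agent_path.split("/") if p]
    zones = ["/"]
    for depth in range(1, len(parts)):
        zones.append("/" + "/".join(parts[:depth]))
    return zones

zones = ancestor_zones("/USA/Cornell/pc3")
```

The number of ancestor zones grows with the depth of the tree rather than with the number of hosts.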

Astrolabe Detail (2)
Each zone can be viewed as a relational table of the attributes of its child zones.
How do we gather or generate the information in a zone's relational table? Two ways:
– If the agent is in the zone, use aggregation to construct the MIB for the zone.
– Otherwise, gossip for the information.
Therefore, the MIBs of internal zones have to be small in order to scale.

Aggregation
Aggregation Function Certificates (AFCs) contain information on how to collect and aggregate the attributes of child zone MIBs into entries for internal zone MIBs.
– Programmed in an SQL-like language
– Propagate in two ways: by being copied to the parent (like other normal attributes), and by agents looking for new AFCs from their ancestor zones
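To illustrate what an aggregation function does, the sketch below folds several child-zone rows into a single parent-zone row. Astrolabe expresses this in its SQL-like language; Python is used here only as a stand-in, and the attribute names and load values are made up for the example.

```python
def summarize(child_rows):
    """Build one parent-zone row from the rows of its child zones."""
    loads = [row["load"] for row in child_rows]
    smtp_hosts = [row for row in child_rows if row["smtp"]]
    return {
        "avg_load": sum(loads) / len(loads),
        # least-loaded child that runs SMTP, or None if there is none
        "smtp_contact": (min(smtp_hosts, key=lambda r: r["load"])["name"]
                         if smtp_hosts else None),
    }

# Hypothetical attribute values, for illustration only.
children = [
    {"name": "swift", "load": 2.0, "smtp": True},
    {"name": "falcon", "load": 1.0, "smtp": False},
    {"name": "cardinal", "load": 3.0, "smtp": True},
]
parent_row = summarize(children)
```

The parent row has a fixed size no matter how many children it summarizes, which keeps internal-zone MIBs small.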

Aggregation (2) Here are the SQL aggregation functions that are provided by Astrolabe.

Gossip
Each zone has a small set of addresses of representative agents.
Representative agents are chosen using an aggregation function, for example based on load and longevity.
An agent gossips on behalf of those zones for which it is a representative.

Gossip (2)
Periodically, the agent picks one of the child zones and talks to one of its contact agents (anti-entropy).
It then sends the entries for all the child zones at that level, and does the same for each higher level in the tree up to the root.
The two agents then compare their entries and keep the newer ones.
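The final compare-and-keep-newer step can be sketched as a merge of two zone tables. The sketch assumes each row carries a "time" attribute used as its version; the row contents and names are hypothetical.

```python
def merge_rows(mine, theirs):
    """For every row name, keep whichever copy has the larger timestamp."""
    merged = dict(mine)
    for name, row in theirs.items():
        if name not in merged or merged[name]["time"] < row["time"]:
            merged[name] = row
    return merged

# Hypothetical views of the same zone held by two agents.
swift_view = {
    "swift": {"time": 2011, "load": 2.0},
    "cardinal": {"time": 2004, "load": 4.5},
}
cardinal_view = {
    "swift": {"time": 2003, "load": 0.7},
    "cardinal": {"time": 2201, "load": 3.5},
}
swift_after = merge_rows(swift_view, cardinal_view)
cardinal_after = merge_rows(cardinal_view, swift_view)
```

Each agent is the freshest source for its own row, so after one exchange both views agree.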

Example of gossip (taken from Ken's slides)
[Figure: swift.cs.cornell.edu and cardinal.cs.cornell.edu each hold a table with columns Name, Time, Load, Weblogic?, SMTP?, Word Version and rows swift, falcon, cardinal.]

Example of gossip (2)
[Figure: swift.cs.cornell.edu and cardinal.cs.cornell.edu each hold a table with columns Name, Time, Load, Weblogic?, SMTP?, Word Version and rows swift, falcon, cardinal. Labels: swift, cardinal.]

Example of gossip (3)
[Figure: swift.cs.cornell.edu and cardinal.cs.cornell.edu each hold a table with columns Name, Time, Load, Weblogic?, SMTP?, Word Version and rows swift, falcon, cardinal.]

[Figure: two leaf-zone tables with columns Name, Load, Weblogic?, SMTP?, Word Version, one with rows swift, falcon, cardinal and one with rows gazelle, zebra, gnu, plus a summary table with columns Name, Avg Load, WL contact, SMTP contact and rows SF, NJ, Paris. Labels: San Francisco, New Jersey.]
An SQL query “summarizes” the data.
The dynamically changing query output is visible system-wide.

Membership
When an agent has not seen an update for a zone from a particular representative for some time Tfail, it removes that zone's MIB.
To connect different pieces of the tree and add new machines:
– IP multicast
– Broadcast
– Relatives
Administrators are responsible for configuring the system by assigning zone names.
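The Tfail rule amounts to a timeout on each stored MIB. A minimal sketch follows, assuming each MIB records the time it was last refreshed under a hypothetical "last_update" field; the real agent tracks this per representative.

```python
def drop_stale(mibs, now, t_fail):
    """Keep only the MIBs refreshed within the last t_fail time units."""
    return {name: row for name, row in mibs.items()
            if now - row["last_update"] <= t_fail}

mibs = {
    "/USA/Cornell": {"last_update": 95},
    "/USA/MIT": {"last_update": 40},     # silent for too long
}
live = drop_stale(mibs, now=100, t_fail=30)
```

A failed zone simply ages out of every agent's view, with no explicit leave message needed.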

Communication
Through HTTP and UDP (messages need to be fragmented into more than one UDP packet).
If there is a firewall:
– Use an ALG (application-level gateway) in the core Internet or an Astrolabe agent in the core Internet.
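Fragmenting a gossip message across several UDP packets can be sketched as below. The (index, total, chunk) framing is an assumption made for illustration and is not Astrolabe's wire format; the sketch also ignores duplicate fragments and does no socket I/O.

```python
def fragment(message: bytes, max_payload: int):
    """Split a message into (index, total, chunk) fragments of bounded size."""
    chunks = [message[i:i + max_payload]
              for i in range(0, len(message), max_payload)]
    return [(index, len(chunks), chunk) for index, chunk in enumerate(chunks)]

def reassemble(fragments):
    """Rebuild the message, or return None if any fragment is missing."""
    ordered = sorted(fragments)          # UDP may deliver fragments out of order
    if not ordered or len(ordered) != ordered[0][1]:
        return None
    return b"".join(chunk for _, _, chunk in ordered)

message = b"x" * 2500
frags = fragment(message, 1000)
rebuilt = reassemble(list(reversed(frags)))
```

Because gossip repeats periodically, a receiver can drop an incomplete message and wait for the next round instead of requesting retransmission.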

Security
Each zone is a management unit. Children have a way to override policy enforced by parents.
Each zone has 2 key pairs: CA keys and zone keys.
Certificate types:
– Zone certificate
– MIB certificate
– Aggregation function certificate
– Client certificate

Related work
– Directory Services (Clearinghouse, Bayou, Globe)
– Network Monitoring
– Event Notification
– Sensor Networks
– Peer-to-peer routing

Measurement on expected # rounds

Measurement on expected # rounds (2)

Measurement on Latency
[Figure panels: Real; Simulation]

Conclusion ??