Replication Store it in multiple places.... Literature Colouris, Dollimore, Kindberg, 2000 –Gets deep into the details of reliable communication, byzantine.

Slides:



Advertisements
Similar presentations
High Availability Deep Dive What’s New in vSphere 5 David Lane, Virtualization Engineer High Point Solutions.
Advertisements

Replication. Topics r Why Replication? r System Model r Consistency Models r One approach to consistency management and dealing with failures.
Copyright 2004 Koren & Krishna ECE655/DataRepl.1 Fall 2006 UNIVERSITY OF MASSACHUSETTS Dept. of Electrical & Computer Engineering Fault Tolerant Computing.
The SMART Way to Migrate Replicated Stateful Services Jacob R. Lorch, Atul Adya, Bill Bolosky, Ronnie Chaiken, John Douceur, Jon Howell Microsoft Research.
DISTRIBUTED SYSTEMS II REPLICATION CNT. II Prof Philippas Tsigas Distributed Computing and Systems Research Group.
June 23rd, 2009Inflectra Proprietary InformationPage: 1 SpiraTest/Plan/Team Deployment Considerations How to deploy for high-availability and strategies.
Business Continuity and DR, A Practical Implementation Mich Talebzadeh, Consultant, Deutsche Bank
Slide 1 Client / Server Paradigm. Slide 2 Outline: Client / Server Paradigm Client / Server Model of Interaction Server Design Issues C/ S Points of Interaction.
Database Replication techniques: a Three Parameter Classification Authors : Database Replication techniques: a Three Parameter Classification Authors :
CMPT 431 Dr. Alexandra Fedorova Lecture XII: Replication.
Replication Libby Rasnick Christopher Newport University CPSC 550 Spring 2003.
Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.
Lesson 1: Configuring Network Load Balancing
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) Lecture 18: Replication Control All slides © IG.
11 SERVER CLUSTERING Chapter 6. Chapter 6: SERVER CLUSTERING2 OVERVIEW  List the types of server clusters.  Determine which type of cluster to use for.
(ITI310) By Eng. BASSEM ALSAID SESSIONS 8: Network Load Balancing (NLB)
Microsoft Load Balancing and Clustering. Outline Introduction Load balancing Clustering.
Installing and Setting up mongoDB replica set PREPARED BY SUDHEER KONDLA SOLUTIONS ARCHITECT.
Business Continuity and Disaster Recovery Chapter 8 Part 2 Pages 914 to 945.
Module 12: Designing High Availability in Windows Server ® 2008.
A BigData Tour – HDFS, Ceph and MapReduce These slides are possible thanks to these sources – Jonathan Drusi - SCInet Toronto – Hadoop Tutorial, Amir Payberah.
CH2 System models.
From Coulouris, Dollimore, Kindberg and Blair Distributed Systems: Concepts and Design Edition 5, © Addison-Wesley 2012 Slides for Chapter 18: Replication.
Copyright © George Coulouris, Jean Dollimore, Tim Kindberg This material is made available for private study and for direct.
Computer Emergency Notification System (CENS)
Practical Byzantine Fault Tolerance
Byzantine fault-tolerance COMP 413 Fall Overview Models –Synchronous vs. asynchronous systems –Byzantine failure model Secure storage with self-certifying.
A Brief Documentation.  Provides basic information about connection, server, and client.
Paxos: Agreement for Replicated State Machines Brad Karp UCL Computer Science CS GZ03 / M st, 23 rd October, 2008.
Eduardo Gutarra Velez. Outline Distributed Filesystems Motivation Google Filesystem Architecture The Metadata Consistency Model File Mutation.
11 CLUSTERING AND AVAILABILITY Chapter 11. Chapter 11: CLUSTERING AND AVAILABILITY2 OVERVIEW  Describe the clustering capabilities of Microsoft Windows.
Fault Tolerant Services
Chap 7: Consistency and Replication
HADOOP DISTRIBUTED FILE SYSTEM HDFS Reliability Based on “The Hadoop Distributed File System” K. Shvachko et al., MSST 2010 Michael Tsitrin 26/05/13.
Replication (1). Topics r Why Replication? r System Model r Consistency Models r One approach to consistency management and dealing with failures.
Chapter 4 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University Building Dependable Distributed Systems.
MongoDB First Light. Mongo DB Basics Mongo is a document based NoSQL. –A document is just a JSON object. –A collection is just a (large) set of documents.
CHAPTER 7 CLUSTERING SERVERS. CLUSTERING TYPES There are 2 types of clustering ; Server clusters Network Load Balancing (NLB) The difference between the.
Spring 2003CS 4611 Replication Outline Failure Models Mirroring Quorums.
Chapter 7: Consistency & Replication IV - REPLICATION MANAGEMENT By Jyothsna Natarajan Instructor: Prof. Yanqing Zhang Course: Advanced Operating Systems.
Exercises for Chapter 2: System models From Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 4, © Pearson Education 2005.
EEC 688/788 Secure and Dependable Computing Lecture 9 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
Distributed databases A brief introduction with emphasis on NoSQL databases Distributed databases1.
Fail-Stop Processors UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau One paper: Byzantine.
PERFORMANCE MANAGEMENT IMPROVING PERFORMANCE TECHNIQUES Network management system 1.
Testing the Zambeel Aztera Chris Brew FermilabCD/CSS/SCS Caveat: This is very much a work in progress. The results presented are from jobs run in the last.
Cloud Computing and Architecuture
Bentley Systems, Incorporated
High Availability Linux (HA Linux)
MongoDB Distributed Write and Read
Network Load Balancing
Cluster Communications
Unit OS10: Fault Tolerance
Software Architecture in Practice
Replication Middleware for Cloud Based Storage Service
Outline Announcements Fault Tolerance.
SpiraTest/Plan/Team Deployment Considerations
INFO 344 Web Tools And Development
Active replication for fault tolerance
EEC 688/788 Secure and Dependable Computing
EEC 688/788 Secure and Dependable Computing
Lecture 21: Replication Control
Slides for Chapter 15: Replication
Slides for Chapter 1 Characterization of Distributed Systems
Slides for Chapter 18: Replication
The SMART Way to Migrate Replicated Stateful Services
EEC 688/788 Secure and Dependable Computing
Lecture 21: Replication Control
Network management system
Container technology, Microservices, and DevOps
Presentation transcript:

Replication Store it in multiple places...

Literature Colouris, Dollimore, Kindberg, 2000 –Gets deep into the details of reliable communication, byzantine failures, etc. –We approach it from the architectural level Terminology Overall protocol Bærbak Christensen2

Replication Replication: Maintenance of copies of data at multiple computers As always – not only one architectural quality is affected but several –Availability increase: Our primary concern in RSA If one man is down, another man can take over –Performance increase: More men can pull more load Bærbak Christensen3

Replication But –Cost: Two nodes cost more than one –Reliability: More complex algorithms increase probability of error (fail- over, ping-echo, voting, …) –Data consistency What if A reads from node X while B writes to node Y ? CAP theorem –Consistency: All see same data –Availability: Every request is served –Partition Tolerance: Operational despite arbitrary failures RDB goes for C while NoSQL goes for A Bærbak Christensen4

Discussion Redundancy: In engineering, redundancy is the duplication of critical components or functions of a system with the intention of increasing reliability of the system, usually in the form of a backup or fail-safe. [Wikipedia] What is the difference from replication? Bærbak Christensen5

Availability Calculations If a system of n replicated servers in which each server has a probability, p, of failing, then the system has total probability p n of failing Ex –p = 5% (0.05) (72 minutes every 24h) –n = 3 –Overall failure rate: – = = 0,125 per mille (10 seconds every 24h) Bærbak Christensen6 That is: 1 - p n availability

Passive Replication Master/slave Primary/backup Bærbak Christensen7

Architecture –One node is primary executes operations (notably writes/updates) sends copies to slaves (in case of write/update) –Primary failure One slave is promoted to become primary Bærbak Christensen8 What is the FE?

Passive Replication Liabilities –Additional complexity in algorithms to keep slaves up- to-date, ensuring consistency, etc... –It takes time for the the slaves to note that the primary is gone and promote one as new primary MongoDB – up to one minute Meanwhile the clients are waiting – slow reponse –Primary takes all requests No obvious performance gain from balancing load –Storage requirements N nodes require N times storage EcoSense: for every 1TB extra diskspace we pay for 4 TB Bærbak Christensen9

Example: MongoDB MongoDB –Built to run on commodity hardware That is, the stuff that you buy in a shop, not necessarily the stuff the IT department has in the server room –My supplier (accessed Sept 2015) »Commodity 1 TB disk450 kr »Server 1 TB disk4.500 kr Commodity disks, however, have higher failure rates –But – avoid failures by using replication Three is a recommendation Two + arbiter is another option –Arbiter does not store anything but participate in electing a new primary Bærbak Christensen10

Example: MongoDB Highly configurable –Write concern Unacknowledgedhappy go lucky Acknowledgedprimary has received write request Journaledprimary has journaled the write request Replica Acknowlegedn replicas have received write request –Read concerns Writes go to primary, but all reads may go to slaves –Boosts performance –Sacrify consistency, you may get old data! –MongoDB is eventual consistent Bærbak Christensen11

MongoDB Selection Bærbak Christensen12 Log file Shell

Active Replication Fully redundant primaries Bærbak Christensen13

Architecture –Clients multicast requests to all nodes –All nodes process identically but independently and reply –Front receives answers May do one of serveral things: Use first one, compare, vote … Bærbak Christensen14

Active Replication Benefits –No performance penalty for failures (no promotion of a slave to become primary, all are primary) –(Potentially) Simpler servers Less need for hand-shaking Still need state resynchronization code in case a node has failed and need to get up-to-date Liabilities –No better performance than if using only one node –Complexity in front-end Multicast, voting,... –Consistency amongst nodes... Bærbak Christensen15

Mongo’s and Docker Hints for Exercises Bærbak Christensen16

Single DB Single DB Mongo is very easy –docker pull mongo:latest –docker run --name mymongo --v () mongo [params] Recommend to host /data/db on the host, -v –For staging, consider --smallfiles --noprealloc For the server using mongo –docker run --link mymongo:mongo [myimage] … Start script must copy linked env variable into own… Bærbak Christensen17

Replica Sets Caused me quite some headaches –Linking cannot be recursive! –Quite a few guides out there, however The internal host names, known by the replica set must be identical to the ones a client connects to the replica set with –I.e. if you configure the set with docker internal IPs Which will make the replica set work internally… –All connections are refused cause client uses the host’s name/ip. Bærbak Christensen18

Solutions Use three VMs –Each has own IP, each has mongo container, expose the port –The production setup (and costs 3x fee) Use single container with three instances –See the testing tutorial on replica sets Bind on different ports: –Use linking from the server using db Use three containers –Use ‘host’ networking instead of docker network Bind to different ports: –Use host’s IP to connect from server using db Bærbak Christensen19