Download presentation
Presentation is loading. Please wait.
Published byErin Clark Modified over 8 years ago
1
Replication Store it in multiple places...
2
Literature Colouris, Dollimore, Kindberg, 2000 –Gets deep into the details of reliable communication, byzantine failures, etc. –We approach it from the architectural level Terminology Overall protocol CS@AUHenrik Bærbak Christensen2
3
Replication Replication: Maintenance of copies of data at multiple computers As always – not only one architectural quality is affected but several –Availability increase: Our primary concern in RSA If one man is down, another man can take over –Performance increase: More men can pull more load CS@AUHenrik Bærbak Christensen3
4
Replication But –Cost: Two nodes cost more than one –Reliability: More complex algorithms increase probability of error (fail- over, ping-echo, voting, …) –Data consistency What if A reads from node X while B writes to node Y ? CAP theorem –Consistency: All see same data –Availability: Every request is served –Partition Tolerance: Operational despite arbitrary failures RDB goes for C while NoSQL goes for A CS@AUHenrik Bærbak Christensen4
5
Discussion Redundancy: In engineering, redundancy is the duplication of critical components or functions of a system with the intention of increasing reliability of the system, usually in the form of a backup or fail-safe. [Wikipedia] What is the difference from replication? CS@AUHenrik Bærbak Christensen5
6
Availability Calculations If a system of n replicated servers in which each server has a probability, p, of failing, then the system has total probability p n of failing Ex –p = 5% (0.05) (72 minutes every 24h) –n = 3 –Overall failure rate: –0.05 3 = 0.000125 = 0,125 per mille (10 seconds every 24h) CS@AUHenrik Bærbak Christensen6 That is: 1 - p n availability
7
Passive Replication Master/slave Primary/backup CS@AUHenrik Bærbak Christensen7
8
Architecture –One node is primary executes operations (notably writes/updates) sends copies to slaves (in case of write/update) –Primary failure One slave is promoted to become primary CS@AUHenrik Bærbak Christensen8 What is the FE?
9
Passive Replication Liabilities –Additional complexity in algorithms to keep slaves up- to-date, ensuring consistency, etc... –It takes time for the the slaves to note that the primary is gone and promote one as new primary MongoDB – up to one minute Meanwhile the clients are waiting – slow reponse –Primary takes all requests No obvious performance gain from balancing load –Storage requirements N nodes require N times storage EcoSense: for every 1TB extra diskspace we pay for 4 TB CS@AUHenrik Bærbak Christensen9
10
Example: MongoDB MongoDB –Built to run on commodity hardware That is, the stuff that you buy in a shop, not necessarily the stuff the IT department has in the server room –My supplier (accessed Sept 2015) »Commodity 1 TB disk450 kr »Server 1 TB disk4.500 kr Commodity disks, however, have higher failure rates –But – avoid failures by using replication Three is a recommendation Two + arbiter is another option –Arbiter does not store anything but participate in electing a new primary CS@AUHenrik Bærbak Christensen10
11
Example: MongoDB Highly configurable –Write concern Unacknowledgedhappy go lucky Acknowledgedprimary has received write request Journaledprimary has journaled the write request Replica Acknowlegedn replicas have received write request –Read concerns Writes go to primary, but all reads may go to slaves –Boosts performance –Sacrify consistency, you may get old data! –MongoDB is eventual consistent CS@AUHenrik Bærbak Christensen11
12
MongoDB Selection CS@AUHenrik Bærbak Christensen12 Log file Shell
13
Active Replication Fully redundant primaries CS@AUHenrik Bærbak Christensen13
14
Architecture –Clients multicast requests to all nodes –All nodes process identically but independently and reply –Front receives answers May do one of serveral things: Use first one, compare, vote … CS@AUHenrik Bærbak Christensen14
15
Active Replication Benefits –No performance penalty for failures (no promotion of a slave to become primary, all are primary) –(Potentially) Simpler servers Less need for hand-shaking Still need state resynchronization code in case a node has failed and need to get up-to-date Liabilities –No better performance than if using only one node –Complexity in front-end Multicast, voting,... –Consistency amongst nodes... CS@AUHenrik Bærbak Christensen15
16
Mongo’s and Docker Hints for Exercises CS@AUHenrik Bærbak Christensen16
17
Single DB Single DB Mongo is very easy –docker pull mongo:latest –docker run --name mymongo --v () mongo [params] Recommend to host /data/db on the host, -v –For staging, consider --smallfiles --noprealloc For the server using mongo –docker run --link mymongo:mongo [myimage] … Start script must copy linked env variable into own… CS@AUHenrik Bærbak Christensen17
18
Replica Sets Caused me quite some headaches –Linking cannot be recursive! –Quite a few guides out there, however The internal host names, known by the replica set must be identical to the ones a client connects to the replica set with –I.e. if you configure the set with docker internal IPs Which will make the replica set work internally… –All connections are refused cause client uses the host’s name/ip. CS@AUHenrik Bærbak Christensen18
19
Solutions Use three VMs –Each has own IP, each has mongo container, expose the 27017 port –The production setup (and costs 3x fee) Use single container with three instances –See the testing tutorial on replica sets Bind on different ports: 27017..27019 –Use linking from the server using db Use three containers –Use ‘host’ networking instead of docker network Bind to different ports: 27017..27019 –Use host’s IP to connect from server using db CS@AUHenrik Bærbak Christensen19
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.