Mainak Ghosh, Wenting Wang, Gopalakrishna Holla, Indranil Gupta
NoSQL databases: predicted to become a $3.4B industry by 2018
Problem: changing database- or table-level configuration parameters
o Primary/shard key – MongoDB, Cassandra
o Ring size – Cassandra
Challenge: a change affects a lot of data at once
Motivation:
o Initially the DB is configured based on guesses; the sysadmin needs to experiment with different parameters seamlessly and efficiently
o Later, as the workload changes and the business/use case evolves, parameters must change again – but in a live database
Existing solution: create a new schema, then export and re-import the data
Why is this bad?
o Massive unavailability of data: every second of outage costs $1.1K at Amazon and $1.5K at Google
o A reconfiguration change caused an outage at Foursquare
o A manual primary-key change at Google took 2 years and involved 2 dozen teams
We need a solution that is automated, seamless, and efficient
Fast completion time: minimize the amount of data transfer required
Highly available: CRUD operations should be answered as much as possible, with little impact on latency
Network-aware: reconfiguration should adapt to underlying network latencies
Master-slave replication
Range-based partitioning
Flexibility in data assignment
Examples: MongoDB, RethinkDB, CouchDB, etc.
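These assumptions can be made concrete with a minimal sketch: under range-based partitioning, split points divide the shard-key space into chunks, and any server may host any chunk (the "flexibility in data assignment"). The boundary values and server names below are hypothetical.

```python
# Sketch of range-based partitioning with flexible chunk placement.
import bisect

boundaries = [100, 200, 300]            # split points on the shard key
chunk_hosts = ["S1", "S2", "S3", "S1"]  # chunk i -> hosting server (any mapping works)

def route(key):
    # bisect_right finds which key range the key falls into
    return chunk_hosts[bisect.bisect_right(boundaries, key)]

print(route(150))  # S2
```

Because the chunk-to-server mapping is an arbitrary table rather than a hash, a reconfiguration can move chunks anywhere; this is the flexibility Morphus's placement algorithms exploit.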
[Figure: old vs. new chunk arrangement. Servers S1–S3 each hold chunks keyed by the old shard key (K_o); after reconfiguration they hold chunks keyed by the new shard key (K_n).]
[Figure: per-server overlap counts between the old and new chunk arrangements on S1–S3.]
Lemma 1: The greedy algorithm is optimal in total network transfer volume
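Lemma 1's greedy rule can be sketched in a few lines: each new chunk goes to the server that already holds the largest share of its data, so only the remainder has to cross the network. The overlap matrix below mirrors the counts on the slide; this is an illustration, not Morphus's actual code.

```python
# Greedy placement: overlap[i][j] = amount of new chunk j's data already
# resident on server i. Keeping the biggest local share per chunk minimizes
# the total volume transferred over the network (Lemma 1).
def greedy_placement(overlap):
    placement = {}
    for j in range(len(overlap[0])):          # for each new chunk
        best = max(range(len(overlap)), key=lambda i: overlap[i][j])
        placement[j] = best                   # host it where most data already is
    return placement

overlap = [[1, 1, 1],   # rows: servers S1..S3
           [2, 1, 0],   # cols: new chunks C1..C3
           [0, 1, 2]]
print(greedy_placement(overlap))  # {0: 1, 1: 0, 2: 2}
```

Note that greedy only minimizes transfer volume; nothing stops it from piling many chunks onto one server, which motivates the balanced variant on the next slides.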
[Figure: old vs. new chunk arrangement, repeated: servers S1–S3 with old-key (K_o) and new-key (K_n) chunks.]
[Figure: overlap matrix for S1–S3 comparing the Greedy and the Hungarian assignments.]
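Greedy can assign several chunk groups to one server and none to another. The Hungarian algorithm instead finds a balanced assignment, exactly one chunk group per server, that maximizes the data already in place. As an illustration only (not Morphus's implementation), the O(n!) brute force below computes the same optimum for small n; a real implementation would use the O(n³) Hungarian method.

```python
# Balanced optimal assignment via brute force over all server<->group pairings.
from itertools import permutations

def hungarian_placement(overlap):
    n = len(overlap)
    best, best_score = None, -1
    for perm in permutations(range(n)):       # perm[i] = chunk group for server i
        score = sum(overlap[i][perm[i]] for i in range(n))
        if score > best_score:
            best, best_score = perm, score
    # invert: map each chunk group to the server that hosts it
    return {best[i]: i for i in range(n)}

overlap = [[1, 2, 3],   # rows: servers S1..S3
           [2, 1, 0],   # cols: new chunk groups
           [0, 0, 0]]
print(hungarian_placement(overlap))  # {2: 0, 0: 1, 1: 2}
```

On this matrix greedy would place two chunk groups on S1 and none on S3; the balanced assignment gives every server exactly one group while still keeping as much data local as possible.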
[Figure: query routing before reconfiguration. The front end, backed by config servers, routes `select * from table where user_id = 20` to replica set RS1 using the shard key user_id; a query on product_id cannot be routed to a single replica set.]
[Figure: the client issues db.collection.changeShardKey(product_id:1) to the front end; replica sets RS0–RS2 each consist of a primary and a secondary.]
[Figure: on receiving db.collection.changeShardKey(product_id:1), the front end (1) runs the placement algorithm and (2) generates a placement plan, while writes such as `update table set price = 20 where user_id = 20` continue to be served.]
[Figure: data migration proceeds in iterations (Iteration 1, Iteration 2) on the secondaries while the primaries keep serving traffic.]
[Figure: after reconfiguration, the reconfigured secondaries take over as primaries; a query such as `update table set price = 20 where product_id = 20` is now routed by the new shard key.]
Chunk-based scheme: assigns a socket per chunk per source–destination pair of communicating servers
Problems: inter-rack latency and unequal data sizes
The WFS strategy is 30% better than the naïve chunk-based scheme and 9% better than Orchestra [Chowdhury et al.]
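The slide does not spell out the WFS (weighted fair sharing) weighting, so the sketch below assumes one plausible form: each source–destination flow gets parallel sockets in proportion to data size × link latency, so large transfers over slow inter-rack links are not starved. All server names and numbers are illustrative.

```python
# Hedged sketch of a WFS-style socket allocation (the exact Morphus weighting
# is an assumption here: weight = data_size * latency).
def wfs_sockets(flows, total_sockets):
    # flows: (src, dst) -> (data size in MB, link latency in ms)
    weights = {f: size * lat for f, (size, lat) in flows.items()}
    total = sum(weights.values())
    return {f: max(1, round(total_sockets * w / total))  # every flow gets >= 1 socket
            for f, w in weights.items()}

flows = {  # illustrative numbers
    ("S1", "S2"): (400, 2.0),
    ("S1", "S3"): (100, 1.0),
    ("S2", "S3"): (50, 1.0),
}
print(wfs_sockets(flows, 16))  # {('S1','S2'): 13, ('S1','S3'): 2, ('S2','S3'): 1}
```

Compared with the naïve chunk-based scheme, which gives every flow the same number of sockets, this weighting finishes the slowest flow sooner and so reduces the straggler-dominated completion time.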
Morphus chooses slaves for reconfiguration during the first (Isolation) phase
In a geo-distributed setting, a naïve choice can lead to bulk transfers over the wide-area network
Solution: localize bulk transfers by choosing replicas in the same datacenter
o Morphus extracts the datacenter information from each replica's metadata
2x-3x improvement observed in experiments
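The datacenter-aware choice amounts to a simple filter, sketched below. The metadata shape (`host`, `dc` fields) is a hypothetical stand-in for whatever replica metadata the system actually exposes.

```python
# Prefer a replica in the same datacenter as the data source, so the bulk
# transfer stays on the local network; fall back to any replica otherwise.
def pick_replica(replicas, source_dc):
    local = [r for r in replicas if r["dc"] == source_dc]
    return (local or replicas)[0]

replicas = [{"host": "rs0-a", "dc": "us-east"},
            {"host": "rs0-b", "dc": "us-west"}]
print(pick_replica(replicas, "us-west")["host"])  # rs0-b
```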
Dataset: Amazon Reviews [SNAP]
Cluster: Emulab d710 nodes with a 100 Mbps LAN switch, and Google Cloud (n1-standard-4 VMs) with a 1 Gbps network
Workload: custom generator similar to YCSB; implements Uniform, Zipfian, and Latest distributions for key access
Morphus: implemented on top of MongoDB
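The three key-access distributions can be sketched as below; a YCSB-like generator would draw keys this way. The function and its parameters are illustrative, not the actual workload generator.

```python
# Toy key-access generator for the Uniform, Latest, and Zipfian distributions.
import random

def next_key(n, dist, last_inserted=0, zipf_s=1.0):
    if dist == "uniform":
        return random.randrange(n)            # every key equally likely
    if dist == "latest":                      # skewed toward recent inserts
        return max(0, last_inserted - int(random.expovariate(1.0)))
    # zipfian: P(k) proportional to 1 / (k+1)^s, a few keys dominate
    weights = [1.0 / (k + 1) ** zipf_s for k in range(n)]
    return random.choices(range(n), weights=weights)[0]

print(sorted(next_key(10, "zipfian") for _ in range(5)))
```

The distribution matters for availability: under "Latest", reads and writes concentrate on keys that are most likely to be mid-migration, which is why its success rates in the table below are the lowest.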
Access Distribution   Read Success Rate (%)   Write Success Rate (%)
Read Only             99.9                    –
Uniform               99.9                    98.5
Latest                97.2                    96.8
Zipf                  99.9                    98.3
Access Distribution   Read Success Rate (%)   Write Success Rate (%)
Read Only             99.9                    –
Uniform               99.9                    98.5
Latest                97.2                    96.8
Zipf                  99.9                    98.3
Morphus has a small impact on data availability
Hungarian performs well in both scenarios and should be preferred over the Greedy and Random schemes
Sub-linear increase in reconfiguration time as data and cluster size increase
Online schema change [Rae et al.]: resultant availabilities are smaller
Live data migration: attempted in databases [Albatross, ShuttleDB] and VMs [Bradford et al.]; a similar approach to ours
o Albatross and ShuttleDB ship the whole state to a set of empty servers
Reactive reconfiguration proposed by Squall [Elmore et al.]: fully available, but takes longer
Morphus is a system that allows live reconfiguration of a sharded NoSQL database
Morphus is implemented on top of MongoDB
Morphus uses network-efficient algorithms for placing chunks while changing the shard key
Morphus minimally affects data availability during reconfiguration
Morphus mitigates stragglers during data migration by optimizing for the underlying network latencies
Reconfiguration time grows super-linearly with data size; a significant portion of it is spent in data migration
Increasing the number of replicas improves Morphus's performance
Chunk-based scheme: assigns a socket per chunk per source–destination pair of communicating servers
What if the topology is as shown? Each shape represents a node in a single replica set; the circle is the front end. Intra-rack latency is 1 ms while inter-rack latency is 2 ms
Questions?
Growing quickly
o A $3.4B industry by 2018
Fast reads and writes (much faster than MySQL and relational databases)
Many companies use them to run critical infrastructure
o Google, Facebook, Yahoo, and many others
Many open-source NoSQL databases
o MongoDB, Cassandra, Riak, etc.
Initially the DB is configured based on guesses
The sysadmin needs to experiment with different parameters
o Seamlessly and efficiently
Later, as the workload changes and the business/use case evolves:
o Change parameters, but do it on a live database