Published by Maude Blankenship. Modified over 8 years ago.
Distributed databases
A brief introduction with emphasis on NoSQL databases
Distribution models

Sharding
  - Distribute data across different servers (called “partitions”)
  - Each server acts as the single source for a subset of the data

Replication
  - Copy data across multiple servers
  - Each piece of data can be found in multiple places

Master-slave replication
  - One server (the master) has the authoritative copy of the data
  - Slaves synchronize with the master
  - Slaves are read-only
  - Fewer update conflicts

Peer-to-peer replication
  - Peers synchronize their copies of the data
  - All peers can be read from and written to
  - No single point of failure
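
The sharding and master-slave ideas above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not how any particular database implements it: the `Shard` class, the `shard_for` function, and the choice of a SHA-256 hash to pick a shard are all assumptions made for the example.

```python
import hashlib


class Shard:
    """One server holding a subset of the data (hypothetical in-memory stand-in)."""

    def __init__(self, name):
        self.name = name
        self.master = {}        # authoritative copy, the only one that accepts writes
        self.slaves = [{}, {}]  # read-only copies of the master

    def write(self, key, value):
        # All writes go to the master, so slaves never conflict with each other.
        self.master[key] = value

    def synchronize(self):
        # Each slave replaces its state with the master's current state.
        for slave in self.slaves:
            slave.clear()
            slave.update(self.master)

    def read_from_slave(self, key, index=0):
        return self.slaves[index].get(key)


def shard_for(key, shards):
    # A stable hash, so the same key always maps to the same shard.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


shards = [Shard("s0"), Shard("s1"), Shard("s2")]

owner = shard_for("user:42", shards)
owner.write("user:42", "Ada")

stale = owner.read_from_slave("user:42")  # slaves have not synchronized yet
owner.synchronize()
fresh = owner.read_from_slave("user:42")  # slaves now match the master
```

Reading from a slave before `synchronize()` runs returns nothing for the new key, which is the price of read-only replicas: reads scale out, but they can lag behind the master.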
CAP Theorem

Consistency
  - All nodes see the same data at the same time
  - Synchronization between data centers takes some time (even minutes)

Availability
  - Data must be available 24-7, and fast!

Partition tolerance
  - The system should continue to work even if one (or more) partitions fail or become unreachable

CAP theorem
  - You can have at most 2 of the 3 CAP properties
  - With some NoSQL databases you can choose!
  - Partition tolerance is usually mandatory in NoSQL
  - That means we can choose between Consistency and Availability
  - Example: Facebook chose Availability
CAP Theorem

Consistency
  - A read is guaranteed to return the most recent write for a given client

Availability
  - A non-failing node will return a reasonable response within a reasonable amount of time (no error or timeout)

Partition Tolerance
  - The system will continue to function when network partitions occur

CAP theorem
  - You can have at most 2 of the 3 CAP properties
  - With some NoSQL databases you can choose!
  - Partition tolerance is usually mandatory in NoSQL
  - That means we can choose between Consistency and Availability
  - Example: Facebook chose Availability

Source: http://robertgreiner.com/2014/08/cap-theorem-revisited/
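
The choice between Consistency and Availability only shows up once a network partition occurs. The following toy model, assuming just two replicas and a hypothetical `mode` switch ("CP" or "AP"), illustrates the trade-off. The `Replica` class and the `Unavailable` exception are invented for this sketch and do not correspond to any real database API.

```python
class Unavailable(Exception):
    """Raised when a replica refuses to answer in order to stay consistent."""


class Replica:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (hypothetical setting)
        self.data = {}
        self.peer = None
        self.partitioned = False  # True while the link to the peer is down

    def write(self, key, value):
        if self.partitioned:
            if self.mode == "CP":
                raise Unavailable("cannot reach peer, refusing write")
            self.data[key] = value  # AP: accept locally, replicas diverge
            return
        # Healthy network: apply the write on both replicas.
        self.data[key] = value
        self.peer.data[key] = value

    def read(self, key):
        if self.partitioned and self.mode == "CP":
            raise Unavailable("cannot reach peer, refusing read")
        return self.data.get(key)  # AP: may return a stale value


def run(mode):
    a, b = Replica(mode), Replica(mode)
    a.peer, b.peer = b, a
    a.write("x", 1)                        # replicated to both nodes
    a.partitioned = b.partitioned = True   # the network splits
    try:
        a.write("x", 2)                    # only node a could see this write
        return b.read("x")
    except Unavailable:
        return "unavailable"


ap_result = run("AP")  # b still answers, but with the old value
cp_result = run("CP")  # the write is refused, so no stale data is served
```

In AP mode the client always gets an answer, but node `b` returns the old value of `x`. In CP mode the system refuses the operation instead of risking an inconsistent answer. Real systems are more nuanced (for example, a CP system usually keeps serving the majority side of a partition), so this is a simplification.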
Distributed Map-Reduce

Map-Reduce is designed for distribution
  1. Map: Each partition runs Map on its own data
  2. Shuffle: Each Map output is sent to a Reduce processor based on its key
  3. Reduce: Each Reduce processor runs Reduce on the values it received
  4. Collect: The final result is collected
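
The four steps above can be sketched as a single-process word count. This is only an illustration under stated assumptions: the sample data, the number of reducers, and the byte-sum routing rule in `shuffle` are made up for the example, and a real framework would run each phase on separate machines.

```python
from collections import defaultdict

# Each inner list stands in for the data stored on one partition.
partitions = [
    ["sharding", "replication", "sharding"],
    ["replication", "cap"],
    ["sharding", "cap", "cap"],
]
NUM_REDUCERS = 2


def map_phase(words):
    # 1. Map: each partition emits (key, value) pairs from its own data.
    return [(word, 1) for word in words]


def shuffle(mapped_outputs, num_reducers):
    # 2. Shuffle: route every pair to a reducer chosen by its key,
    #    so all values for one key end up on the same reducer.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for output in mapped_outputs:
        for key, value in output:
            reducer = sum(key.encode("utf-8")) % num_reducers
            buckets[reducer][key].append(value)
    return buckets


def reduce_phase(bucket):
    # 3. Reduce: each reducer combines the values for each of its keys.
    return {key: sum(values) for key, values in bucket.items()}


mapped = [map_phase(p) for p in partitions]
shuffled = shuffle(mapped, NUM_REDUCERS)
reduced = [reduce_phase(bucket) for bucket in shuffled]

# 4. Collect: merge the partial results into the final answer.
result = {}
for partial in reduced:
    result.update(partial)
```

Because the shuffle sends each key to exactly one reducer, the partial results never overlap, and the collect step is a plain merge.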