Published by Piers Waters. Modified over 9 years ago.
1
Project Voldemort: What’s New
Alex Feinberg
2
The plan
Introduction
Motivation
Inspiration
Implementation
Present day
New features added in recent months
New features in active development
The roadmap
Wanted Features
Q&A
3
Introduction
Project Voldemort: a scalable, highly available, distributed key/value store
Data Platform team at LinkedIn
  – Data driven features
  – The infrastructure to run them
Original work by Jay Kreps, Bhupesh Bansal
The presenter: hired a month ago to work full time on Voldemort
4
Data Driven Features…
5
Motivation
Data driven features are data intensive in terms of reads, writes and the size of the datasets
Scaling a relational database: if data can’t be federated, the RDBMS becomes a de facto K/V store
SQL
  – Relational algebra is a powerful tool, but not a universal solution
  – Passing strings around is cumbersome, ORMs can be leaky abstractions
6
“The Exploits of a Mom” © XKCD
7
Non-relational Alternatives
Memcached is an excellent in-memory key/value cache
  – Used extensively by high traffic websites, including LinkedIn
  – High throughput, low latency
  – Excellent scalability
Hadoop
  – Used extensively by the Data Platform team
  – High average throughput, but high latency
  – Excellent scalability
Wanted
  – Persistence and replication
  – Low latency
  – No single points of failure
  – Scalable: accommodate more data by adding more machines
8
Inspiration
Amazon’s Dynamo
SOSP paper, late 2007
Key-value store
Consistent hashing, vector clocks
Gossip protocol
Hinted handoff, Merkle trees
9
Consistent Hashing
A key belongs to a partition
A node can hold multiple partitions
There is a tunable replication factor (N)
If N is 3, a key mapped to partition P is written to P-1, P and P+1
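The placement rule on this slide can be sketched in a few lines of Python. This is only an illustration, not Voldemort's routing code: the MD5-based hash, the partition count of 8, and the partition-to-node map are all assumptions made for the example, and the replica rule is hard-coded for N = 3 because that is the only case the slide spells out.

```python
import hashlib

NUM_PARTITIONS = 8  # assumed for illustration

# Hypothetical layout: each of 4 nodes holds multiple partitions.
PARTITION_TO_NODE = {p: "node%d" % (p % 4) for p in range(NUM_PARTITIONS)}


def partition_for(key):
    """Map a key (bytes) to a partition id with a stable hash."""
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS


def replica_partitions(key):
    """Slide rule for N = 3: write to P-1, P and P+1 (wrapping around the ring)."""
    p = partition_for(key)
    return [(p - 1) % NUM_PARTITIONS, p, (p + 1) % NUM_PARTITIONS]


def replica_nodes(key):
    """Nodes that hold the three replica partitions for this key."""
    return [PARTITION_TO_NODE[q] for q in replica_partitions(key)]


print(replica_partitions(b"member:42"), replica_nodes(b"member:42"))
```

Because the key is hashed to a partition rather than directly to a node, moving a partition to another node changes only the map, not the hash of any key.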
10
Vector Clocks
From Leslie Lamport (also author of LaTeX)
Want to determine the order of writes
Total order demands strong consistency
  – Partial ordering: determine “x came before y” relation in most cases
Associate a vector clock with a value
  – Versioned value is a (value, vector clock) tuple
  – Multiple versioned values can exist for a key
  – We can use a vector clock to determine causality
  – If two versioned values aren’t causally related, allow application to reconcile
  – Shopping cart example
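A minimal Python sketch of the comparison described above, using the shopping cart case. The function names (`increment`, `compare`, `merge`) and the dict-based clock representation are assumptions for illustration and are not taken from Voldemort's code; reconciling carts by set union is just one possible application-level policy.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def compare(a, b):
    """Classify clock a relative to b: 'before', 'after', 'equal' or 'concurrent'."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # not causally related: the application must reconcile


def merge(a, b):
    """Element-wise maximum of two clocks."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


# A versioned value is a (value, vector clock) tuple.
base = ({"book"}, increment({}, "A"))
cart_a = (base[0] | {"lamp"}, increment(base[1], "A"))  # written via node A
cart_b = (base[0] | {"pen"}, increment(base[1], "B"))   # written via node B

print(compare(base[1], cart_a[1]))    # before
print(compare(cart_a[1], cart_b[1]))  # concurrent

# Application-level reconciliation: union the carts, then record a new write.
reconciled = (cart_a[0] | cart_b[0], increment(merge(cart_a[1], cart_b[1]), "A"))
```

The reconciled version's clock dominates both conflicting clocks, so later readers see it as a successor of each.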
11
Vector Clocks: Initial State
12
Vector Clocks: Event Occurs
13
Vector Clocks: Multi-cast the Vector Clock
14
Vector Clocks: Node Becomes Partitioned
15
Vector Clocks: Causality Determined
16
Implementation
Customization at all layers
  – Pluggable serialization (JSON, protocol buffers, Thrift) allows keys and values to be structures rather than just strings
Tunable R, W, N parameters
Storage engines
  – No persistent data structure that is good at everything
  – BDB is most popular
  – Read only stores
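The slide lists R, W and N as tunable but does not say how they interact. In Dynamo-style stores the usual rule of thumb is that a read is guaranteed to see the latest successful write when R + W > N, because every read quorum then shares at least one replica with every write quorum. The sketch below checks that rule by brute force; the function name is hypothetical and the code models only set overlap, not Voldemort itself.

```python
from itertools import combinations


def quorums_always_overlap(n, r, w):
    """True if every set of r replicas intersects every set of w replicas out of n."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


print(quorums_always_overlap(3, 2, 2))  # True:  2 + 2 > 3
print(quorums_always_overlap(3, 1, 1))  # False: 1 + 1 <= 3
```

Lower R or W trades this guarantee for latency and availability, which is why the parameters are left tunable per store.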
17
Present day
Production use at LinkedIn
  – Multiple clusters
  – Data Platform usage
  – Other teams’ usage
  – Read only stores for data built out in Hadoop
Production use outside of LinkedIn
  – Gilt Group, KaChing, others
Revision control through git
  – Hosted on GitHub
Active developer community, inside and outside LinkedIn
18
Recently Added: Read Only Stores
Motivation
  – Offline batch/computing
  – Optimize the store for atomic swaps and rollbacks
  – Leverage what Hadoop provides
Implementation
  – Memory mapped files
  – Integration with Hadoop
  – Driver program to initiate fetch and swap in parallel
19
Recently Added: NIO
Non-blocking IO, why?
  – Scalability and the c10k problem
Java’s NIO framework
  – Added in 1.4, greatly improved in 1.5 and 1.6
  – Will use native scalable poll implementation
Tricky to get good performance
Contributed by Kirk True
20
NIO Performance and Scalability
21
Recently Added: Data Compression
Motivation: smaller data size
  – Denormalized data leads to big blobs
  – Less to transfer between client and server
  – More of the data can be stored in main memory
  – Less to transfer from disk to memory
  – Compression/decompression is fast
  – If we’re I/O bound, fewer bytes to express the same data implies better performance
Implementation
Usage
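The transcript does not preserve the implementation or usage details, so the following is only a rough illustration of the motivation: a denormalized, repetitive blob shrinks considerably under a general-purpose compressor. Python's standard `zlib` stands in for whatever codec the store is configured with, and the `pack`/`unpack` helpers and the sample record are hypothetical.

```python
import json
import zlib


def pack(value):
    """Serialize a structure to JSON and compress it before it leaves the client."""
    return zlib.compress(json.dumps(value).encode("utf-8"))


def unpack(blob):
    """Reverse of pack: decompress, then deserialize."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical denormalized record: the same field names repeat in every entry.
profile = {
    "member_id": 42,
    "connections": [
        {"id": i, "title": "Software Engineer", "company": "Example Corp"}
        for i in range(200)
    ],
}

raw_size = len(json.dumps(profile).encode("utf-8"))
packed = pack(profile)
print(raw_size, len(packed))
```

The saving applies on the wire, in the page cache and on disk alike, which is where the slide's I/O-bound argument comes from.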
22
Monitoring and Administration
In place: JMX hooks
  – View statistics (how many queries are made? How long are they taking?)
  – Perform operations (analogous to SNMP traps)
Admin Server
  – Functionality which is needed, but shouldn’t be performed by regular store clients
  – Ability to update and retrieve cluster/store metadata
  – Functionality to efficiently stream keys and values in a partition
Network class loader/server side filtering
23
On The Roadmap
Failure detection
Large value support
Publish/subscribe
Rebalancing
24
On The Roadmap: Rebalancing
Rebalancing: ability to add a server to a cluster while the cluster is still running
Node enters a cluster, “steals” a partition from other nodes (fetches it as a stream using the admin protocol)
Pull-based gossip protocol to let other nodes know that it’s in the cluster
  – Metadata about cluster membership treated as data, conflicts reconciled using vector clocks
While the new node is transferring the partitions, gets sent to it are redirected to the donor node(s)
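The redirect-while-transferring behavior can be sketched as a toy in-memory model. Everything here is hypothetical (the `Node` class, its method names, the toy hash) and it ignores gossip, replication and failures; it shows only that a get for a partition still in transfer is answered by the donor, and by the new node once the stream has completed.

```python
class Node:
    """Toy node: holds partitions in memory and can steal one from a donor."""

    def __init__(self, name, num_partitions=4):
        self.name = name
        self.num_partitions = num_partitions
        self.data = {}         # partition id -> {key: value}
        self.in_transfer = {}  # partition id -> donor Node, while stealing

    def partition_of(self, key):
        return sum(key.encode("utf-8")) % self.num_partitions  # toy hash

    def put(self, key, value):
        self.data.setdefault(self.partition_of(key), {})[key] = value

    def get(self, key):
        p = self.partition_of(key)
        donor = self.in_transfer.get(p)
        if donor is not None:
            return donor.get(key)  # partition not here yet: redirect to donor
        return self.data.get(p, {}).get(key)

    def stream_partition(self, p):
        """Stand-in for the admin protocol's key/value stream."""
        yield from list(self.data.get(p, {}).items())

    def begin_steal(self, p, donor):
        self.in_transfer[p] = donor

    def finish_steal(self, p):
        donor = self.in_transfer[p]
        self.data[p] = dict(donor.stream_partition(p))
        del self.in_transfer[p]
        donor.data.pop(p, None)  # donor gives the partition up


donor = Node("donor")
donor.put("member:42", "profile-blob")
newcomer = Node("newcomer")
p = donor.partition_of("member:42")

newcomer.begin_steal(p, donor)
during = newcomer.get("member:42")  # served by the donor
newcomer.finish_steal(p)
after = newcomer.get("member:42")   # served locally
```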
25
Stability and Infrastructure
Testing “in the cloud”
Distributed systems have to be tested on multi-node clusters
Distributed systems have complex failure scenarios
A storage system, above all, must be stable
Automated testing allows rapid iteration while maintaining confidence in systems’ correctness and stability
EC2-based testing framework
Tests are invoked programmatically
Contributed by Kirk True
Adaptable to other cloud hosting providers
Will run on a regular basis
Regular releases for new features and bug fixes
Trunk stays stable
26
Wanted Features
Clients for other languages
  – Outside of the JVM: Ruby, PHP (popular for web development)
  – On the JVM: JRuby, Scala, Clojure
  – Different languages have different idioms; Java’s idiom is objects with mutable state
Views
  – Inspired by CouchDB
  – Want to change a value for a key without transferring that value back and forth
  – Example: adding to a list, incrementing a counter
  – Fewer collisions/conflicts
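Views were a wanted feature at the time of the talk, so there is no API to show. As a rough sketch of the idea only, the toy store below applies a named transformation on the server side, so the client sends just the operation and its argument instead of reading the whole value, changing it and writing it back. The `ViewStore` class and the `VIEWS` table are hypothetical.

```python
# Hypothetical server-side transformations, keyed by name.
VIEWS = {
    "append": lambda current, arg: (current or []) + [arg],
    "increment": lambda current, arg: (current or 0) + arg,
}


class ViewStore:
    """Toy store that can transform a value in place without shipping it to the client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def apply(self, key, view, arg):
        """Run the named view against the stored value and keep the result."""
        self._data[key] = VIEWS[view](self._data.get(key), arg)


store = ViewStore()
store.apply("followers:42", "append", "alice")
store.apply("followers:42", "append", "bob")
store.apply("page_hits", "increment", 1)
store.apply("page_hits", "increment", 1)
```

Since the read-modify-write happens in one place, two clients appending at the same time are less likely to produce conflicting versions than if each had fetched and rewritten the full list.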
27
Contributions are Welcome
Thriving open source community
  – Fork us on GitHub: http://github.com/voldemort/voldemort
  – Wiki: http://wiki.github.com/voldemort/voldemort
  – Fun projects: http://wiki.github.com/voldemort/voldemort/fun-projects
  – IRC channel: #Voldemort on Freenode (irc.freenode.org)
Want to work on this full time? LinkedIn is hiring!
Just in the Data Platform group
Other technologies: Scala, Hadoop, ZooKeeper, Lucene, Netty
Projects: real time faceted search, distributed graph databases, machine learning, data mining, information retrieval / extraction, NLP
Open source projects: Zoie, Bobo, Sensei-search, decomposer, kamikaze (three more on the way!)
More elsewhere!
Contact me
http://www.linkedin.com/in/alexfeinberg
afeinberg@linkedin.com
28
Questions?