

Project Voldemort: What’s New Alex Feinberg

The plan
• Introduction
• Motivation
• Inspiration
• Implementation
• Present day
• New features from the last few months
• New features in active development
• The roadmap
• Wanted features
• Q&A

Introduction
• Project Voldemort: a scalable, highly available, distributed key/value store
• Data Platform team at LinkedIn
  - Data-driven features
  - The infrastructure to run them
• Original work by Jay Kreps and Bhupesh Bansal
• The presenter: hired a month ago to work full time on Voldemort

Data Driven Features…

Motivation
• Data-driven features are data intensive in terms of reads, writes and the size of the datasets
• Scaling a relational database: if data can’t be federated, the RDBMS becomes a de facto K/V store
• SQL
  - Relational algebra is a powerful tool, but not a universal solution
  - Passing strings around is cumbersome; ORMs can be leaky abstractions

“The Exploits of a Mom” © XKCD

Non-relational Alternatives
• Memcached is an excellent in-memory key/value cache
  - Used extensively by high-traffic websites, including LinkedIn
  - High throughput, low latency
  - Excellent scalability
• Hadoop
  - Used extensively by the Data Platform team
  - High average throughput, but high latency
  - Excellent scalability
• Wanted
  - Persistence and replication
  - Low latency
  - No single points of failure
  - Scalable: accommodate more data by adding more machines

Inspiration
• Amazon’s Dynamo
• SOSP paper, late 2007
• Key-value store
• Consistent hashing, vector clocks
• Gossip protocol
• Hinted handoff, Merkle trees

Consistent Hashing
• A key belongs to a partition
• A node can hold multiple partitions
• There is a tunable replication factor (N)
• If N is 3, a key mapped to partition P is written to P-1, P and P+1
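A minimal Python sketch of the placement rule on this slide, not Voldemort’s actual routing code. The ring size, the round-robin node layout, the use of MD5 and every function name here are assumptions made for illustration; a real routing strategy may pick replicas differently.

```python
import hashlib

NUM_PARTITIONS = 9                        # assumed ring size
NODES = ["node-a", "node-b", "node-c"]    # hypothetical nodes


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a key (bytes) to a partition using a stable hash (MD5 is an arbitrary choice)."""
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def node_for_partition(partition):
    """Assumed layout: partitions are dealt out round-robin, so each node holds several."""
    return NODES[partition % len(NODES)]


def replica_partitions(p, num_partitions=NUM_PARTITIONS):
    """For N = 3, follow the slide: write to P-1, P and P+1, wrapping around the ring."""
    return [(p - 1) % num_partitions, p, (p + 1) % num_partitions]


if __name__ == "__main__":
    p = partition_for(b"member:42")
    for replica in replica_partitions(p):
        print(replica, node_for_partition(replica))
```

With this layout the three neighbouring partitions always sit on three different nodes, which is the property replication needs.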

Vector Clocks
• Build on Leslie Lamport’s logical clocks (Lamport is also the author of LaTeX)
• Want to determine the order of writes
• Total order demands strong consistency
  - Partial ordering: determine the “x came before y” relation in most cases
• Associate a vector clock with a value
  - A versioned value is a (value, vector clock) tuple
  - Multiple versioned values can exist for a key
  - We can use a vector clock to determine causality
  - If two versioned values aren’t causally related, allow the application to reconcile them
  - Shopping cart example
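The comparison described above can be sketched in a few lines of Python. This is an illustrative model only: clocks are plain dictionaries from a node name to a counter, and the function names and the shopping-cart values are invented for the example.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def compare(a, b):
    """Classify clock a relative to b: 'before', 'after', 'equal' or 'concurrent'."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"   # not causally related: the application must reconcile
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"


def merge(a, b):
    """Clock for a reconciled value: the element-wise maximum of both clocks."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


if __name__ == "__main__":
    base = increment({}, "A")                  # cart created via node A
    cart_1 = (["book"], increment(base, "B"))  # one client adds a book via node B
    cart_2 = (["lamp"], increment(base, "C"))  # another adds a lamp via node C
    if compare(cart_1[1], cart_2[1]) == "concurrent":
        # Shopping-cart style reconciliation: keep the union of the items.
        merged = (sorted(set(cart_1[0]) | set(cart_2[0])), merge(cart_1[1], cart_2[1]))
        print(merged)
```

Each versioned value is a (value, clock) tuple, as on the slide; only when `compare` reports `concurrent` does the application have to step in.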

Vector Clocks: Initial State

Vector Clocks: Event Occurs

Vector Clocks: Multi-cast the Vector Clock

Vector Clocks: Node Becomes Partitioned

Vector Clocks: Causality Determined

Implementation
• Customization at all layers
  - Pluggable serialization (JSON, Protocol Buffers, Thrift) allows keys and values to be structures rather than just strings
• Tunable R, W, N parameters
• Storage engines
  - No persistent data structure is good at everything
  - BDB is the most popular
  - Read-only stores
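As in Dynamo, N is the number of replicas, R the number of replicas that must answer a read and W the number that must acknowledge a write. A short Python sketch, with made-up function names, shows why choosing R + W > N makes every read quorum overlap every write quorum; the brute-force check is there only to confirm the arithmetic.

```python
from itertools import combinations


def quorums_overlap(n, r, w):
    """Rule of thumb: any R replicas and any W replicas share a member iff R + W > N."""
    return r + w > n


def overlap_by_enumeration(n, r, w):
    """Check the same property the slow way, over every possible pair of quorums."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


if __name__ == "__main__":
    for n, r, w in [(3, 2, 2), (3, 1, 1), (3, 1, 3)]:
        print(n, r, w, quorums_overlap(n, r, w), overlap_by_enumeration(n, r, w))
```

Lower R or W trades that overlap guarantee for latency and availability, which is why the parameters are tunable per store.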

Present day
• Production use at LinkedIn
  - Multiple clusters
  - Data Platform usage
  - Other teams’ usage
  - Read-only stores for data built out in Hadoop
• Production use outside of LinkedIn
  - Gilt Groupe, KaChing, others
• Revision control through git
  - Hosted on GitHub
• Active developer community, inside and outside LinkedIn

Recently Added: Read Only Stores
• Motivation
  - Offline batch computing
  - Optimize the store for atomic swaps and rollbacks
  - Leverage what Hadoop provides
• Implementation
  - Memory-mapped files
  - Integration with Hadoop
  - Driver program to initiate fetch and swap in parallel

Recently Added: NIO
• Non-blocking IO, why?
  - Scalability and the C10K problem
• Java’s NIO framework
  - Added in 1.4, greatly improved in 1.5 and 1.6
  - Will use the native scalable poll implementation
• Tricky to get good performance
• Contributed by Kirk True

NIO Performance and Scalability

Recently Added: Data Compression
• Motivation: smaller data size
  - Denormalized data leads to big blobs
  - Less to transfer between client and server
  - More of the data can be stored in main memory
  - Less to transfer from disk to memory
  - Compression/decompression is fast
  - If we’re I/O bound, fewer bytes to express the same data implies better performance
• Implementation
• Usage
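To make the motivation concrete, here is a small Python sketch that compresses a denormalized JSON blob before it would be stored or sent. It uses zlib from the standard library purely as a stand-in; the codec, the helper names and the sample record are assumptions, not a description of how Voldemort implements or configures compression.

```python
import json
import zlib


def compress_value(value_bytes):
    """Compress a serialized value before storing or transferring it."""
    return zlib.compress(value_bytes)


def decompress_value(stored_bytes):
    """Reverse compress_value when the value is read back."""
    return zlib.decompress(stored_bytes)


def sample_blob(count=200):
    """Build a repetitive, denormalized blob of the kind the slide describes."""
    records = [
        {"member": i, "headline": "Software Engineer", "skills": ["java", "hadoop"]}
        for i in range(count)
    ]
    return json.dumps(records).encode("utf-8")


if __name__ == "__main__":
    blob = sample_blob()
    stored = compress_value(blob)
    print(len(blob), "bytes raw,", len(stored), "bytes compressed")
    assert decompress_value(stored) == blob
```

Repeated field names and values are exactly what general-purpose compressors remove, so denormalized records tend to shrink well.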

Monitoring and Administration
• In place: JMX hooks
  - View statistics (how many queries are made? How long are they taking?)
  - Perform operations (analogous to SNMP traps)
• Admin Server
  - Functionality which is needed, but shouldn’t be performed by regular store clients
  - Ability to update and retrieve cluster/store metadata
  - Functionality to efficiently stream keys and values in a partition
• Network class loader/server-side filtering

On The Roadmap
• Failure detection
• Large value support
• Publish/subscribe
• Rebalancing

On The Roadmap: Rebalancing
• Rebalancing: ability to add a server to a cluster while the cluster is still running
• Node enters a cluster, “steals” a partition from other nodes (fetches it as a stream using the admin protocol)
• Pull-based gossip protocol to let other nodes know that it’s in the cluster
  - Metadata about cluster membership treated as data, conflicts reconciled using vector clocks
• While the new node is transferring the partitions, gets sent to it are redirected to the donor node(s)
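The redirect behaviour in the last bullet can be modelled with a toy Python class. Everything here is hypothetical: the `Node` class, its method names and the one-step `finish` call are simplifications for illustration (the slide describes streaming the partition over the admin protocol, which this sketch does not attempt).

```python
class Node:
    """Toy cluster node that proxies reads for partitions still being transferred."""

    def __init__(self, name):
        self.name = name
        self.data = {}        # partition id -> {key: value}
        self.migrating = {}   # partition id -> donor Node while a transfer is in flight

    def steal(self, partition, donor):
        """Claim a partition; until finish() is called, gets are redirected to the donor."""
        self.migrating[partition] = donor

    def finish(self, partition):
        """Complete the transfer: take ownership of the data and stop redirecting."""
        donor = self.migrating.pop(partition)
        self.data[partition] = donor.data.pop(partition)

    def get(self, partition, key):
        donor = self.migrating.get(partition)
        if donor is not None:
            return donor.get(partition, key)   # redirect to the donor node
        return self.data.get(partition, {}).get(key)


if __name__ == "__main__":
    donor, newcomer = Node("old"), Node("new")
    donor.data[7] = {"member:42": "profile"}
    newcomer.steal(7, donor)
    print(newcomer.get(7, "member:42"))   # served by the donor during the transfer
    newcomer.finish(7)
    print(newcomer.get(7, "member:42"))   # now served locally
```

Clients can be pointed at the new node as soon as it joins, since a read returns the same value before and after the transfer completes.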

Stability and Infrastructure
• Testing “in the cloud”
  - Distributed systems have to be tested on multi-node clusters
  - Distributed systems have complex failure scenarios
  - A storage system, above all, must be stable
  - Automated testing allows rapid iteration while maintaining confidence in systems’ correctness and stability
• EC2-based testing framework
  - Tests are invoked programmatically
  - Contributed by Kirk True
  - Adaptable to other cloud hosting providers
  - Will run on a regular basis
• Regular releases for new features and bug fixes
• Trunk stays stable

Wanted Features
• Clients for other languages
  - Outside of the JVM: Ruby, PHP (popular for web development)
  - On the JVM: JRuby, Scala, Clojure
  - Different languages have different idioms; Java’s idiom is objects with mutable state
• Views
  - Inspired by CouchDB
  - Want to change a value for a key without transferring that value back and forth
  - Example: adding to a list, incrementing a counter
  - Fewer collisions/conflicts
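The idea behind views can be illustrated with a toy in-memory store in Python. The `Store` class and its `apply` method are invented for this sketch and do not correspond to any existing Voldemort API; they only show the difference between a client-side read-modify-write and sending a small operation to where the value lives.

```python
class Store:
    """Toy key/value store that can apply a transformation next to the data."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value

    def apply(self, key, transform, default=None):
        """Update a value in place; the caller sends the operation, not the value."""
        self._data[key] = transform(self._data.get(key, default))


if __name__ == "__main__":
    store = Store()
    store.put("page:hits", 41)
    store.apply("page:hits", lambda hits: hits + 1)                    # increment a counter
    store.apply("member:42:tags", lambda tags: tags + ["nosql"], default=[])  # add to a list
    print(store.get("page:hits"), store.get("member:42:tags"))
```

Because the read and the write happen in one place, two clients incrementing the same counter are less likely to produce conflicting versions than with separate get and put calls.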

Contributions are Welcome
• Thriving open source community
  - Fork us on GitHub:
  - Wiki:
• Fun projects:
  - IRC channel: #Voldemort on Freenode (irc.freenode.org)
• Want to work on this full time? LinkedIn is hiring!
• Just in the Data Platform group
  - Other technologies: Scala, Hadoop, ZooKeeper, Lucene, Netty
  - Projects: real-time faceted search, distributed graph databases, machine learning, data mining, information retrieval/extraction, NLP
  - Open source projects: Zoie, Bobo, Sensei-search, decomposer, kamikaze (three more on the way!)
• More elsewhere!
• Contact me

Questions?