Cassandra A Decentralized, Structured Storage System Avinash Lakshman and Prashant Malik Facebook Published: April 2010, Volume 44, Issue 2 Communications.

Slides:



Advertisements
Similar presentations
Dynamo: Amazon’s Highly Available Key-value Store
Advertisements

CASSANDRA-A Decentralized Structured Storage System Presented By Sadhana Kuthuru.
Dynamo: Amazon’s Highly Available Key-value Store ID2210-VT13 Slides by Tallat M. Shafaat.
AMAZON’S KEY-VALUE STORE: DYNAMO DeCandia,Hastorun,Jampani, Kakulapati, Lakshman, Pilchin, Sivasubramanian, Vosshall, Vogels: Dynamo: Amazon's highly available.
Map/Reduce in Practice Hadoop, Hbase, MongoDB, Accumulo, and related Map/Reduce- enabled data stores.
Cassandra Structured Storage System over a P2P Network Avinash Lakshman, Prashant Malik.
Dynamo: Amazon's Highly Available Key-value Store Distributed Storage Systems CS presented by: Hussam Abu-Libdeh.
Cloud Storage Yizheng Chen. Outline Cassandra Hadoop/HDFS in Cloud Megastore.
Cassandra Database Project Alireza Haghdoost, Jake Moroshek Computer Science and Engineering University of Minnesota-Twin Cities Nov. 17, 2011 News Presentation:
A Scalable Content-Addressable Network Authors: S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker University of California, Berkeley Presenter:
Overview Distributed vs. decentralized Why distributed databases
Google Bigtable A Distributed Storage System for Structured Data Hadi Salimi, Distributed Systems Laboratory, School of Computer Engineering, Iran University.
A Decentralized Structure Storage Model - Avinash Lakshman & Prashanth Malik - Presented by Srinidhi Katla CASSANDRA.
Wide-area cooperative storage with CFS
Dynamo A presentation that look’s at Amazon’s Dynamo service (based on a research paper published by Amazon.com) as well as related cloud storage implementations.
CS 405G: Introduction to Database Systems 24 NoSQL Reuse some slides of Jennifer Widom Chen Qian University of Kentucky.
Distributed storage for structured data
BigTable CSE 490h, Autumn What is BigTable? z “A BigTable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by.
Inexpensive Scalable Information Access Many Internet applications need to access data for millions of concurrent users Relational DBMS technology cannot.
Amazon’s Dynamo System The material is taken from “Dynamo: Amazon’s Highly Available Key-value Store,” by G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati,
Cloud Storage – A look at Amazon’s Dyanmo A presentation that look’s at Amazon’s Dynamo service (based on a research paper published by Amazon.com) as.
Google Distributed System and Hadoop Lakshmi Thyagarajan.
Cloud Storage: All your data belongs to us! Theo Benson This slide includes images from the Megastore and the Cassandra papers/conference slides.
Distributed Data Stores – Facebook Presented by Ben Gooding University of Arkansas – April 21, 2015.
Bigtable: A Distributed Storage System for Structured Data F. Chang, J. Dean, S. Ghemawat, W.C. Hsieh, D.A. Wallach M. Burrows, T. Chandra, A. Fikes, R.E.
Designing a DHT for low latency and high throughput Robert Vollmann P2P Information Systems.
 Anil Nori Distinguished Engineer Microsoft Corporation.
HBase A column-centered database 1. Overview An Apache project Influenced by Google’s BigTable Built on Hadoop ▫A distributed file system ▫Supports Map-Reduce.
MapReduce – An overview Medha Atre (May 7, 2008) Dept of Computer Science Rensselaer Polytechnic Institute.
Cassandra-A Decentrilized Structured Storage System Azad Kurdistan University Subject : Cassandra - A Decentralized Structured Storage System Professor.
Cloud Computing Cloud Data Serving Systems Keke Chen.
High Throughput Computing on P2P Networks Carlos Pérez Miguel
Apache Cassandra - Distributed Database Management System Presented by Jayesh Kawli.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Amazon’s Dynamo Lecturer.
Cassandra - A Decentralized Structured Storage System
Cassandra – A Decentralized Structured Storage System Lecturer : Prof. Kyungbaek Kim Presenter : I Gde Dharma Nugraha.
Hypertable Doug Judd Zvents, Inc.. hypertable.org Background.
Bigtable: A Distributed Storage System for Structured Data 1.
A Scalable Content-Addressable Network (CAN) Seminar “Peer-to-peer Information Systems” Speaker Vladimir Eske Advisor Dr. Ralf Schenkel November 2003.
MapReduce and GFS. Introduction r To understand Google’s file system let us look at the sort of processing that needs to be done r We will look at MapReduce.
CS 347Lecture 9B1 CS 347: Parallel and Distributed Data Management Notes 13: BigTable, HBASE, Cassandra Hector Garcia-Molina.
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications.
Serverless Network File Systems Overview by Joseph Thompson.
Fast Crash Recovery in RAMCloud. Motivation The role of DRAM has been increasing – Facebook used 150TB of DRAM For 200TB of disk storage However, there.
Paper Survey of DHT Distributed Hash Table. Usages Directory service  Very little amount of information, such as URI, metadata, … Storage  Data, such.
Authors Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana.
Chapter 1 Database Access from Client Applications.
Bigtable: A Distributed Storage System for Structured Data
Department of Computer Science, Johns Hopkins University EN Instructor: Randal Burns 24 September 2013 NoSQL Data Models and Systems.
Cassandra. Outline Introduction Background Use Cases Data Model & Query Language Architecture Conclusion.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Cassandra Architecture.
Big Data Yuan Xue CS 292 Special topics on.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Amazon’s Dynamo Lecturer.
Bigtable A Distributed Storage System for Structured Data.
Cassandra The Fortune Teller
Introduction to DBMS Purpose of Database Systems View of Data
CS 405G: Introduction to Database Systems
and Big Data Storage Systems
Distributed Systems CS 425 / ECE 428 Fall 2012
Cassandra - A Decentralized Structured Storage System
Dynamo: Amazon’s Highly Available Key-value Store
NOSQL.
The NoSQL Column Store used by Facebook
CHAPTER 3 Architectures for Distributed Systems
NOSQL databases and Big Data Storage Systems
Providing Secure Storage on the Internet
Distributed P2P File System
Building a Database on S3
Introduction to DBMS Purpose of Database Systems View of Data
Presentation transcript:

Cassandra A Decentralized, Structured Storage System Avinash Lakshman and Prashant Malik Facebook Published: April 2010, Volume 44, Issue 2 Communications of the ACM Presented by James Owens Old Dominion University For CS795 on 11//2014

About the Authors Avinash LakshmanPrashant Malik Currently: Hedvig / Quexascale ? – 2013 Notable Works: o Dynamo : amazon's highly available key-value store o Cassandra : a decentralized structured storage system o Cassandra : structured storage system on a p2p network o System and method for providing high availability data Currently: o LimeRoad ? – 2013 Notable works: o Cassandra : a decentralized structured storage system o Cassandra : structured storage system on a p2p network o Asynchronous communication within a server arrangement o Publishing digital content within a defined universe such as an organization in accordance with a digital rights management (DRM) system Rank 4 search results on author name - Google Scholar 11/14/

Significance Provides a platform for data storage and retrieval which supports very high write throughput and tolerates continuous component failure. Integrates strategies from many other technologies, cited over 916 times. [Google Scholar, Nov 2014]

Difficulties o Dense reading. o No Diagrams. o Simultaneously defines a general purpose tool(Cassandra) and specific implementation (Inbox Search) o No clear separation of the above.

Approach o Handling Density: Inbox Search o Big-Picture View of Cassandra Data Model API Read/Write Model o How Cassandra solves Inbox Search Cassandra Internals…

Why was Cassandra Created? o Solution to the Inbox Search problem Consider the Facebook context: o Many simultaneous users o Billions of writes per day o Need for Scalability o Cassandra is used for multiple services within Facebook.

Inbox Search Problem o A user wants to search his or her inbox for messages using one of two strategies Term Search - keyword Interactions - name

What is Cassandra? o A structured data storage system Logical ring of servers Designed to support multiple, continuous component failures No central point of failure Highly Configurable Runs on commodity hardware o *It is not a full relational DBMS

Data Model o Distributed Multidimensional Map Pairs o String Key Typically 16 – 36 B o Object Value

Data Model o Column Types: Column Families Super Column Families o A superset of column families o (The Value Model supports recursion) o A Visual…

Data Model

Data Model

API o insert ( table, key, rowMutation ) o get ( table, key, columnName ) o delete ( table, key, columnName ) o columnName – (Path) May refer to any column, column family, super column family, or a column within a super column

Read/Write Overview o *Read and Write requests are processed by any node. o The searching node determines which particular nodes contain the data. o Writes Issues write to all nodes, waits for a quorum of commits o Reads (Variable) Closest Node – Low integrity All nodes + quorum – High integrity

Inbox Search Problem o A user wants to search his or her inbox for messages using one of two strategies Term Search - keyword Interactions - name

Term Search o Key – user ID Super Column – (Inverted index) o CF - words that make up a message C - Individual Message Identifiers

Search by Interactions o Key – user ID Super Column – (Inverted index) o CF – Recipient ID C -Individual Message Identifiers

Inbox Search Problem o Cassandra provides ‘hooks’ for intelligent caching: e.g. user clicks on ‘inbox’, primes index Production Performance numbers Latency StatSearch InteractionsTerm Search Min7.69 ms7.78 ms Median15.69 ms18.27 ms Max26.13 ms44.41 ms

Inbox Search Solution o The full solution requires understanding of the underlying architecture. o Key points are: Any node can service a query o Global knowledge of data stores Query all relevant data stores Take most-recent, good response o Quorum, flexibility o Time Threshold

Questions

My Question What should I emphasize in the architecture?

Cassandra Architecture General Structure apache-cassandra/figure003.gif

Cassandra Architecture General Structure

Cassandra Architecture Data Partitioning Consistent Hashing Algorithm o Logical Ring of hash values o Each node is given a position on this ring o Each node is responsible for a LEFT range of hashes. o Each data item’s RIGHT neighbor is responsible for storage and replication Recall the nodes communicate about ranges so any one node knows the locations which should contain data for a particular key.

Cassandra Architecture Data Partitioning

Cassandra Architecture Replication Schemes Configurable: o Number of replicas o Replication Policies Non-coordinator replicas are chosen by picking N-1 other nodes on ring. o Rack Unaware Zookeeper (Elected leader) Abstraction o Rack Aware o Datacenter Aware

Replication Coordination content/uploads/2012/03/migration-distribution.png

Cassandra Architecture Domain Awareness Each node has full awareness - handwaving o Cassandra nodes communicate via a Gossip Protocol Based on “Scuttlebutt”

Cassandra Architecture PHI Accrual Failure Detection Failure detection (prediction) via Gossip o A sliding window is used to calculate likelihood of failure, based on gossip: o (Phi, P(Failure) ) o { (1, 0.1) (2, 0.01) (3, 0.001) … }

Cassandra Architecture Failure Response System architecture does not reconfigure o The server will return eventually. o Recall: Scuttlebutt, sliding window and PHI Permanent modification of the ring is an administrator task. Phi represents a level of suspicion a particular node is down or unreachable o (*) This is used for timeout avoidance? (*) Protocols account for failure by design

Cassandra Architecture Request Handling Request arrives at any node: 1.Servicing node looks up the data hosts. 1.All nodes have knowledge of the data partition 2.(*) Request is routed to all data hosts. 3.If requests timeout, failure is returned 4.Identify the response containing data with the youngest timestamp, return it. 5. Schedule update of nodes with older timestamps. – This ensures quorum integrity

Cassandra Architecture Local Data Persistence Typical Write: 1.Write to commit log – Dedicated Disk 2.Update to in-memory structure When in-memory structure crosses threshold (data size, number of data items) Data is sequentially written to commodity disks. Over time these files are merged and reorganized ala BigTable

Cassandra Architecture Local Data Persistence Typical Read: (A key can be in many files) 1.Bloom filter (index of keys in each file) 2.Get Values (reverse chronological order) 1.Values (CF) have Column Indices allowing for direct access of columns. I’ve left out much of the chunking details. Optimizations, included at almost every conceivable point, are our of scope.

Questions

Read Path anding-cassandra-code-base/index.html

Write Path anding-cassandra-code-base/index.html

Questions Additional Information: