1
Cassandra Database Project
Alireza Haghdoost, Jake Moroshek
Computer Science and Engineering, University of Minnesota-Twin Cities
Nov. 17, 2011
News Presentation: Joab Jackson, “New Cassandra Can Pack Two Billion Columns Into a Row”, PCWorld News, January 2011.
2
What Was the Problem?
- Facebook Messages Inbox Search: a feature that enables users to search through their Facebook Inbox
- Millions of messages are sent every day on Facebook
- Messages are stored in different data centers
- How can all of this information be indexed for Inbox Search?
3
What Is Cassandra?
- Distributed storage system; a kind of NoSQL database
  - NoSQL: key-value, schema-less database
- Scales to a very large size across many servers spread across different data centers
  - At that scale, small and large components fail continuously
- No single point of failure
  - Data is replicated at several nodes
4
Cassandra Goals
- High scalability: the ability to scale incrementally
- High performance: the ability to respond quickly
- High availability: the ability to keep data available to users
5
Cassandra Data Model
- Cassandra does not support a full relational data model
- Key-value data model
  - Every row is identified by a unique key
  - Every row can hold a very large number of columns, grouped into column families
  - The new release can pack two billion columns into a row
- Columns are sorted within a row in one of two ways:
  - by name
  - by time (required for Inbox Search)
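The row, column family, and sorted column layout described on this slide can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions: the names `Column`, `ColumnFamily`, and `KeyValueStore` are hypothetical and are not Cassandra's API, and overwriting an existing column name is not handled.

```python
import bisect
from dataclasses import dataclass, field


@dataclass(order=True)
class Column:
    # Only sort_key takes part in comparisons, so bisect keeps columns ordered by it.
    sort_key: object
    name: str = field(compare=False)
    value: str = field(compare=False)
    timestamp: int = field(compare=False)


class ColumnFamily:
    """Hypothetical column family whose columns stay sorted by name or by time."""

    def __init__(self, order="name"):
        if order not in ("name", "time"):
            raise ValueError("order must be 'name' or 'time'")
        self.order = order
        self.columns = []

    def insert(self, name, value, timestamp):
        key = name if self.order == "name" else timestamp
        bisect.insort(self.columns, Column(key, name, value, timestamp))

    def names(self):
        return [c.name for c in self.columns]


class KeyValueStore:
    """Hypothetical store: row key -> column family name -> ColumnFamily."""

    def __init__(self):
        self.rows = {}

    def insert(self, row_key, family, name, value, timestamp, order="name"):
        families = self.rows.setdefault(row_key, {})
        if family not in families:
            # The sort order is fixed when the family is first created.
            families[family] = ColumnFamily(order)
        families[family].insert(name, value, timestamp)

    def get(self, row_key, family):
        return self.rows[row_key][family].names()


store = KeyValueStore()
# Time-ordered family, as an inbox search index would need.
store.insert("user42", "by_time", "msg-b", "hello", 200, order="time")
store.insert("user42", "by_time", "msg-a", "hi", 300, order="time")
store.insert("user42", "by_time", "msg-c", "hey", 100, order="time")
print(store.get("user42", "by_time"))
```

Keeping columns sorted at insert time means that reading the oldest or newest entries of a row is a simple slice, which is the property the slide ties to Inbox Search.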
6
Distribution and Replication
- Data is distributed across the nodes using a consistent hashing function
- High availability is achieved through replication
  - If one storage node fails, the data replicated on other nodes remains available
- Data is actively replicated at N nodes across data centers
- Replication policies:
  - Rack Unaware
  - Rack Aware
  - Datacenter Aware
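A minimal sketch of consistent hashing with N replicas may help make this slide concrete. It assumes the simplest placement, in which a key is stored on the first node clockwise from its hash and on the next N-1 distinct nodes on the ring (roughly the "Rack Unaware" idea). The `Ring` class, the node names, and the use of MD5 for ring positions are illustrative assumptions, not Cassandra's actual implementation.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical consistent-hash ring that places each key on N nodes."""

    def __init__(self, nodes, replicas=3):
        if not nodes:
            raise ValueError("at least one node is required")
        self.replicas = replicas
        self.ring = sorted((ring_position(n), n) for n in nodes)
        self.positions = [pos for pos, _ in self.ring]

    def replicas_for(self, key):
        # First node clockwise from the key's position, then its successors.
        start = bisect.bisect_right(self.positions, ring_position(key)) % len(self.ring)
        count = min(self.replicas, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def read_targets(self, key, failed):
        # Replicas that can still serve the key when some nodes are down.
        return [n for n in self.replicas_for(key) if n not in failed]


nodes = ["node-a", "node-b", "node-c", "node-d"]
ring = Ring(nodes, replicas=3)
owners = ring.replicas_for("inbox:user42")
print("replicas:", owners)
print("after first replica fails:", ring.read_targets("inbox:user42", {owners[0]}))
```

Because each key depends only on its neighbors on the ring, adding or removing a node moves only the keys adjacent to that node, which is what allows the cluster to scale incrementally.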
7
Users of the Cassandra System
- First deployment: 2008 by Facebook, inspired by Google (Bigtable) and Amazon (Dynamo)
- Designed for the message Inbox Search system
- Stores terabytes of indexes across a cluster of 600+ cores and 120+ TB of disk space
- Each node can handle over 5,000 requests per second
- Well-known users: [logos shown on the original slide]
8
References
- Prashant Malik, “Inbox Search”, http://ja-jp.facebook.com/blog.php?post=20387467130
- Joab Jackson, “Apache Cassandra Ready for the Enterprise”, http://www.pcworld.com/businesscenter/article/242111/apache_cassandra_ready_for_the_enterprise.html#tk.mod_rel
- Joab Jackson, “New Cassandra Can Pack Two Billion Columns Into a Row”, http://www.pcworld.com/businesscenter/article/216766/new_cassandra_can_pack_two_billion_columns_into_a_row.html
- Avinash Lakshman and Prashant Malik, “Cassandra: A Decentralized Structured Storage System”, SIGOPS Oper. Syst. Rev. 44, 2 (April 2010), http://doi.acm.org/10.1145/1773912.1773922
9
Thank You