Download presentation
Presentation is loading. Please wait.
1
Shujaat Hussain
9
A single column
10
A single row
17
Consistency –the system is in a consistent state after an operation Availability –the system is “always on”, no downtime Partition tolerance–the system continues to function even when split into disconnected subsets (by a network disruption)
18
MySQL 300ms write 350ms read Cassandra 0.12ms write 15ms read
19
You need a key or keys: Single: key=‘a’ Range: key=‘a’ through ’f’ And columns to retrieve: Slice: cols={bar through kite} By name: key=‘b’ cols={bar, cat, llama} Nothing like SQL “ WHERE col=‘faz’ ”
20
Digg is a social news site that allows people to discover and share content from anywhere on the Internet by submitting stories and links, and voting and commenting on submitted stories and links.
21
Problems Terabytes of data; high transaction rate (reads dominated) Multiple clusters Management nightmare (high effort, error prone) Unsatisfied availability requirements (geographic isolation) Solution Cassandra as primary data store Datacenter and rack-aware replication
22
Twitter is a social networking and microblogging service that enables its users to send and read tweets, text-based posts of up to 140 characters. Terabytes of data, ~1,000,000 ops/s
23
Inbox Search 100 TB 160 nodes 1/2 billion writes per day (2yr old number?)
24
Advantages Massive scalability High availability Lower cost (than competitive solutions at that scale) (usually) predictable elasticity Schema flexibility, sparse & semi-structured data
25
Disadvantages Limited query capabilities (so far) Eventual consistency is not intuitive to program for Makes client applications more complicated No standardizatrion Portability might be an issue Insufficient access control
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.