Cassandra Training: Introduction & Data Modeling
Aims
By the end of today you should know:
- How Cassandra organises data
- How to configure replicas
- How to choose between consistency and availability
- How to efficiently model data for both reads and writes
- That you need to consider Active-Active scenarios
- Who to ask to help you & sign off on your data model
HINT: Ask Neil directly or email harch@expedia.com.
Introduction to Cassandra
Agenda: 100ft
- Quick Introduction
- Data Structures
- Efficient Data Modeling
- Data Modeling Examples
Agenda: Introduction
- Elevator Pitch
- Brewer's Theorem & Tuneable Consistency
- Distributed Hash Table 101
- Write path
- Read path
- TTL, Deletion & Tombstones
- Background Processes
- Data Model in 5mins
- Thrift vs CQL
- Maintaining Consistency
- Scaling Cassandra
Agenda: Advanced Topics
- Data Modelling Key Concepts
- Time Series Modelling
- Wide rows
- Compound Keys
- Code example
- Performance Tuning Levers
- What is DataStax Enterprise?
- Multi DC Support
- Virtual Nodes
- Nodetool
Elevator Pitch: What?
- Write path optimised
- Eventually consistent (ms)
- Distributed Hash Table
- Highly durable
- Tunable consistency
Elevator Pitch: Why?
- Linear horizontal read & write scaling
- Data is important and should always be there
- Often we don't need a consistency guarantee
- Let me choose my tradeoff
Elevator Pitch: How?
- Data partitioned internally across nodes
- Writes only have to hit the commit log
- Data stored read-optimised to minimise read & write work: no indexes to update, no query to plan
- Specify agreement (consistency) per query
Elevator Pitch: What it's Not
- No support for transactions: atomicity and isolation mostly not available
- Not a silver bullet: easy to design a poorly-performing data model
DHT 101
- Each physical node is assigned a token
- Each node owns the range from the previous node's token up to its own token
Cassandra Write Path
- The coordinator sends the update to two nodes (the replicas), starting at the owning node and working clockwise around the ring
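The ownership and clockwise-walk rules on the two slides above can be sketched in a few lines of Java. This is a toy model under stated assumptions, not Cassandra's code: tokens are small `long` values, each node has exactly one token, and `TokenRingSketch`, `ownerOf` and `replicasFor` are hypothetical names made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy token ring: each node holds one token and owns (previous token, own token]. */
final class TokenRingSketch {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(String name, long token) {
        ring.put(token, name);
    }

    /** First node whose token is >= the key's token, wrapping past the highest token. */
    String ownerOf(long keyToken) {
        Map.Entry<Long, String> e = ring.ceilingEntry(keyToken);
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    /** The owner plus the next nodes clockwise, until replicationFactor replicas are found. */
    List<String> replicasFor(long keyToken, int replicationFactor) {
        List<String> replicas = new ArrayList<>();
        if (ring.isEmpty()) return replicas;
        Long token = ring.ceilingKey(keyToken);
        if (token == null) token = ring.firstKey();
        int wanted = Math.min(replicationFactor, ring.size());
        while (replicas.size() < wanted) {
            replicas.add(ring.get(token));
            token = ring.higherKey(token);
            if (token == null) token = ring.firstKey(); // wrap around the ring
        }
        return replicas;
    }

    public static void main(String[] args) {
        TokenRingSketch r = new TokenRingSketch();
        r.addNode("node1", 0);
        r.addNode("node2", 25);
        r.addNode("node3", 50);
        r.addNode("node4", 75);
        System.out.println(r.replicasFor(30, 2)); // [node3, node4]
        System.out.println(r.replicasFor(80, 2)); // [node1, node2]
    }
}
```

A token of 80 is above every node token, so the walk wraps to node1, which is the wrap-around case of "owns the range from the previous token".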
Cassandra Write Path
- A 128-bit hash of the partition key gives the row's token
- Keys are therefore distributed randomly around the ring
- If a replica is unavailable: Hinted Handoff (the coordinator keeps the write as a hint and replays it when the replica returns)
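To make the key-to-token step concrete, here is a minimal sketch that hashes a partition key with MD5 and reads the 16 bytes as a 128-bit integer. It assumes MD5 as the hash (the RandomPartitioner's choice) and is not byte-for-byte what Cassandra computes; `Md5TokenSketch` and `tokenFor` are hypothetical names.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Turns a partition key into a 128-bit token by hashing it. */
final class Md5TokenSketch {

    /** MD5 of the key's UTF-8 bytes, read as a non-negative 128-bit integer. */
    static BigInteger tokenFor(String partitionKey) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(partitionKey.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, digest); // signum 1: treat the bytes as unsigned
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 should be available on every Java platform", e);
        }
    }

    public static void main(String[] args) {
        // Neighbouring keys land at unrelated positions on the ring.
        System.out.println(tokenFor("20121004"));
        System.out.println(tokenFor("20121005"));
    }
}
```

Because the hash scatters similar keys, consecutive day keys do not sit next to each other on the ring, which is why rows cannot be range-scanned in key order under this partitioner.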
Cassandra Write Path: Concepts
- The Snitch: proximity
- Random Partitioner: key -> token
- Replication Factor: how many replicas
- Gossip: discovery protocol
Cassandra Write Path
- SSTables are sequential and immutable
- Data for one row may reside across several SSTables
- SSTables are periodically compacted together
Cassandra Read Path
- Data read command sent to the closest replica, as chosen by the snitch
- Digest commands sent to other replicas, as many as the CL requires
- Read Repair Chance 10%: on that fraction of reads, digest all replicas
Start & Interrogate C*
    vagrant box add dse.box http://htraining.s3.amazonaws.com/dse.box
    mkdir ~/vagrant
    curl http://htraining.s3.amazonaws.com/vagrant-dse.tar.gz > ~/vagrant/dse.tar.gz
    cd ~/vagrant && tar xzvf dse.tar.gz
    cd dse && vagrant up
    vagrant ssh node1
    nodetool ring
Cassandra Read Path: Read Mechanics
- Find candidate SSTables: Bloom filters
- Seek through SSTables
- Memory mapped files
- Check Memtable
- Minimise SSTables per read for best efficiency
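How a Bloom filter lets a read skip SSTables can be shown with a small sketch. It is an illustration only: the bit-array size, the number of hashes and the double-hashing scheme built on `String.hashCode()` are assumptions for the example, not what Cassandra uses, and `BloomFilterSketch` is a hypothetical name.

```java
import java.util.BitSet;

/** Tiny Bloom filter: answers "definitely not here" or "maybe here". */
final class BloomFilterSketch {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    BloomFilterSketch(int size, int hashes) {
        if (size <= 0 || hashes <= 0) {
            throw new IllegalArgumentException("size and hashes must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    /** i-th bit position for a key, derived from two base hashes. */
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1; // second hash, forced odd so it is never zero
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String key) {
        for (int i = 0; i < hashes; i++) bits.set(index(key, i));
    }

    /** false means the key was never added; true means it may have been. */
    boolean mightContain(String key) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(index(key, i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // One filter per SSTable: only SSTables answering "maybe" need a disk seek.
        BloomFilterSketch sstable1 = new BloomFilterSketch(1024, 3);
        BloomFilterSketch sstable2 = new BloomFilterSketch(1024, 3);
        sstable1.add("20121004");
        sstable2.add("20121005");
        System.out.println(sstable1.mightContain("20121004")); // true
        System.out.println(sstable2.mightContain("20121004")); // usually false; true would be a false positive
    }
}
```

The filter never gives a false negative, so skipping an SSTable on "false" is always safe; a false positive only costs one wasted seek.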
Deletion & Tombstones
- Deleted data is marked as removed with a tombstone
- Stops zombie data (deleted data reappearing) in a distributed system
- Tombstones are collected after a few days: configurable (gc_grace_seconds)
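Why a delete has to be written as a tombstone rather than simply removed can be seen from a last-write-wins merge of two replicas. The sketch below is a simplified model: `Cell`, `reconcile` and the rule that a tombstone wins a timestamp tie are assumptions made for the illustration.

```java
/** Last-write-wins reconciliation between two copies of the same column. */
final class TombstoneSketch {

    /** One version of a column; a null value marks a tombstone. */
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        boolean isTombstone() {
            return value == null;
        }
    }

    /** Highest timestamp wins; on a tie the tombstone wins (assumed rule). */
    static Cell reconcile(Cell a, Cell b) {
        if (a.timestamp != b.timestamp) {
            return a.timestamp > b.timestamp ? a : b;
        }
        return a.isTombstone() ? a : b;
    }

    public static void main(String[] args) {
        Cell staleCopy = new Cell("it broke", 1L); // replica that missed the delete
        Cell tombstone = new Cell(null, 2L);       // replica that saw the delete

        // With the tombstone present, the delete wins and the data stays deleted.
        System.out.println(reconcile(staleCopy, tombstone).isTombstone()); // true

        // If the tombstone had been dropped before the stale replica was repaired,
        // only staleCopy would be left to read: the "zombie" the slide warns about.
    }
}
```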
Brewer's Theorem
- Distributed data: only 2 at a time out of
- Consistency
- Availability
- Partition Tolerance
Brewer's Theorem
- CA: normal operation, no partition, consistency and availability provided
Brewer's Theorem
- AP: partition occurs; maintaining two mutable, disconnected state copies breaks consistency, availability is conserved
Brewer's Theorem
- CP: partition occurs; to maintain consistency we need to take one side offline, sacrificing availability
Tuneable Consistency
- Cassandra Consistency Level
- Specify the number of nodes that must agree on a read/write
- Choose consistency or availability: CL.LOCAL_QUORUM, CL.ONE
- Eventual consistency will bring both sides into agreement eventually
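The trade-off behind the consistency level comes down to simple arithmetic on the replication factor. The sketch below spells it out; `ConsistencySketch` and its method names are hypothetical, and the overlap rule (write acks + read replies > RF) is the standard quorum argument rather than anything specific to this deck.

```java
/** Arithmetic behind choosing a consistency level for a given replication factor. */
final class ConsistencySketch {

    /** Smallest majority of rf replicas. */
    static int quorum(int rf) {
        return rf / 2 + 1;
    }

    /** A read is guaranteed to overlap the latest write when acks + replies exceed rf. */
    static boolean readSeesLatestWrite(int writeAcks, int readReplies, int rf) {
        return writeAcks + readReplies > rf;
    }

    /** How many replicas can be down while the operation still succeeds. */
    static int tolerableFailures(int requiredReplicas, int rf) {
        return rf - requiredReplicas;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println(q);                               // 2
        System.out.println(readSeesLatestWrite(q, q, rf));   // true:  QUORUM write + QUORUM read
        System.out.println(readSeesLatestWrite(1, 1, rf));   // false: ONE write + ONE read
        System.out.println(tolerableFailures(q, rf));        // 1 replica may be down at QUORUM
        System.out.println(tolerableFailures(1, rf));        // 2 replicas may be down at ONE
    }
}
```

Reading and writing at ONE keeps the cluster available with two replicas down but gives up the overlap guarantee; QUORUM on both sides keeps the guarantee and tolerates one failure.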
Background Processes
- SSTables compacted periodically
- Size-Tiered Compaction: default, no compaction guarantee
- Leveled Compaction: better chance of tombstone compaction; more continual compaction, 2x I/O; impact on online traffic; use for update-heavy workloads; creates many SSTables
Data Model: Keyspace
- Analogous to Database/Schema
- Segregate applications
- Replication configured at this level
Data Model: Column Family
- Analogous to Table
- Contains many rows
- Caches configurable at this level
Data Model: Row
- Each row has a partition key, which is hashed to place it
- Has many columns: up to 2Bn
- Columns don't have to be defined ahead of time
- Rows in the same CF can have different columns
- Rows are not sorted; model ordering within a row
Data Model: Columns
- Sorted by name before being written to SSTable
- Name and Value are typed
- Values can be type-validated
- Column update is timestamped
- Can have TTL
Data Model: Counter Columns
- Distributed counters
- Can get false counts
Data Model: Super Columns (Don't Use)
- Blob of columns stored inside a single column
- Have to read and write the whole blob
- Memory intensive
- Conflicts resolved for the whole blob: bad
Secondary Indices
- Can define an index on a column
- Cassandra will maintain an inverted index
- Use sparingly
- Low cardinality columns only
- Often better to maintain your own view
Thrift vs CQL
- Thrift: original interface, hash-style syntax
- CQL: SQL-like syntax but highly limited; sent over Thrift, but with plans for its own protocol
Maintaining Consistency
- Consistency Level
- Used on read & write operations
- ONE, TWO, LOCAL_QUORUM, ALL, ANY
- Do you really need a consistency guarantee?
Scaling Cassandra
- Imagine RF=3, Quorum, Nodes=6
- Each query waits synchronously on 2 nodes (a quorum of the 3 replicas)
- Each write will still touch all 3 replicas, though asynchronously
- To scale writes, add more nodes
- To scale reads, add more replicas
Advanced Topics
- Data Modelling
- Wide Rows & Clustering
- Performance
- Solr 4 & Hadoop Integration
Data Modelling
- Concepts that Drive Data Modeling
- Time-series Modeling
- Wide Rows (Composite Columns)
- Compound Keys & CQL3
Data Modelling: Concepts
- Rows in the same CF will live on different nodes
- High cost of multi-get
- De-normalise your data into rows
- Don't put consistent load on a single row: it will heat up the replica nodes
Data Modelling: Concepts
- Writes to a single row are atomic & isolated
- Columns are ordered, so column range slicing is efficient
- Mutating data often needs compaction tuning
Wide Rows
- Efficient reads
- Store how you want to fetch
- Fetching is most efficient over few rows
- Store what you want to fetch in few rows
Time Series
- Use timestamp for column name: ordered
- Range slicing efficient
- Can limit row length by using a date partition key, e.g. 20121004
Composite Columns
- e.g. time1:log_class, time1:log_message, time2:log_class, time2:log_message
Time Series
- Writing to a single row: hotspots
- Use round robin over rows, e.g. 20121004:1, 20121004:2, etc.
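The round-robin row keys on the slide above can be generated with a small helper. This is a sketch under assumptions: the shard count is fixed and chosen up front, shards are numbered from 1 to match the `20121004:1` example, and `RowKeyBucketSketch`, `nextWriteKey` and `readKeys` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Spreads one day's writes over several rows so no single row (and replica set) runs hot. */
final class RowKeyBucketSketch {
    private final int shards;
    private final AtomicLong counter = new AtomicLong();

    RowKeyBucketSketch(int shards) {
        if (shards <= 0) throw new IllegalArgumentException("shards must be positive");
        this.shards = shards;
    }

    /** Row key for the next write: the day plus a shard picked round robin, numbered from 1. */
    String nextWriteKey(String day) {
        long shard = counter.getAndIncrement() % shards + 1;
        return day + ":" + shard;
    }

    /** A read for a whole day has to fetch every shard row and merge the results. */
    List<String> readKeys(String day) {
        List<String> keys = new ArrayList<>();
        for (int shard = 1; shard <= shards; shard++) {
            keys.add(day + ":" + shard);
        }
        return keys;
    }

    public static void main(String[] args) {
        RowKeyBucketSketch buckets = new RowKeyBucketSketch(2);
        System.out.println(buckets.nextWriteKey("20121004")); // 20121004:1
        System.out.println(buckets.nextWriteKey("20121004")); // 20121004:2
        System.out.println(buckets.nextWriteKey("20121004")); // 20121004:1
        System.out.println(buckets.readKeys("20121004"));     // [20121004:1, 20121004:2]
    }
}
```

The cost of the spread is on the read side: a day's data now needs one slice per shard row, so the shard count should stay small.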
Compound Keys
- Compound Key in CQL3
- Partition Key is the row key
- Compound Key = Partition Key + Composite Key
- e.g. partition key = 20121004, composite key = time1
- 20121004 => time1:name, time1:msg, time2:name, time2:msg
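The mapping from a compound key to composite column names can be modelled with a sorted map per partition. The sketch is a simplification: real composite names are typed components rather than strings joined with ":", and `CompoundKeySketch`, `insert`, `row` and `from` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Models one wide row per partition key, with composite column names kept sorted. */
final class CompoundKeySketch {
    // partition key -> (composite column name -> value)
    private final Map<String, TreeMap<String, String>> rows = new HashMap<>();

    /** Stores one field of one logical record as the column "clusteringKey:field". */
    void insert(String partitionKey, String clusteringKey, String field, String value) {
        rows.computeIfAbsent(partitionKey, k -> new TreeMap<>())
            .put(clusteringKey + ":" + field, value);
    }

    /** Every composite column of the partition, in sorted order. */
    SortedMap<String, String> row(String partitionKey) {
        return rows.getOrDefault(partitionKey, new TreeMap<>());
    }

    /** Range slice: all columns from the given clustering key onwards. */
    SortedMap<String, String> from(String partitionKey, String clusteringKey) {
        return rows.getOrDefault(partitionKey, new TreeMap<>()).tailMap(clusteringKey + ":");
    }

    public static void main(String[] args) {
        CompoundKeySketch cf = new CompoundKeySketch();
        cf.insert("20121004", "time1", "name", "error");
        cf.insert("20121004", "time1", "msg", "it broke");
        cf.insert("20121004", "time2", "name", "error");
        cf.insert("20121004", "time2", "msg", "it broke again");
        System.out.println(cf.row("20121004").keySet());           // [time1:msg, time1:name, time2:msg, time2:name]
        System.out.println(cf.from("20121004", "time2").keySet()); // [time2:msg, time2:name]
    }
}
```

Because the columns are sorted by the clustering part first, all fields of one record sit together and a range over the clustering key is a contiguous slice of the row.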
Working with CQL
    cqlsh -3 192.168.33.21
    CREATE KEYSPACE my_app_data
      WITH strategy_class = SimpleStrategy
      AND strategy_options:replication_factor = 2;
    DESCRIBE KEYSPACE my_app_data;
Compound Keys
    USE my_app_data;
    CREATE COLUMNFAMILY logs (
      day text,          -- partition key
      log_id timeuuid,   -- clustering column
      log_class text,
      log_message text,
      primary key (day, log_id)
    );
    DESCRIBE columnfamilies;
Compound Keys
    INSERT INTO logs (day, log_id, log_class, log_message)
      VALUES ('20130604', '2013-06-04 10:05:00', 'error', 'it broke')
      USING CONSISTENCY ONE;
    INSERT INTO logs (day, log_id, log_class, log_message)
      VALUES ('20130604', '2013-06-04 11:05:00', 'error', 'it broke again')
      USING CONSISTENCY QUORUM;
Compound Keys
    SELECT * FROM logs USING CONSISTENCY ONE
      WHERE day='20130604';
    SELECT * FROM logs USING CONSISTENCY QUORUM
      WHERE day='20130604' AND log_id > '2013-06-04 11:00:00';
- Setting CL and range querying columns, losing consistency
- TRY WITH CL.TWO:
    vagrant suspend node2
Compound Keys
- See the raw Cassandra data
    cassandra-cli -h 192.168.33.21
    use my_app_data;
    list logs;
Code Example: Clients
- Hector
- Solid Java client
- In use in production
- Round robin
- Node discovery
Code Example: Clients
- Astyanax
- Netflix open source library
- Simpler APIs
Code Example
- Example: Storing Payment Methods
- https://github.com/neilbeveridge/example-compoundkeys
Code Example: Requirements
- Store 1-10 payment methods
- Use a single row
Code Example: Non-CQL
- Define a composite column class
    public static final class Composite {
        private @Component(ordinal = 0) String paymentUuid;
        private @Component(ordinal = 1) String field;
        // constructor and getters (used on the following slides) are not shown on the slide
    }
Code Example: Writing Data
    UUID paymentUUID = TimeUUIDUtils.getUniqueTimeUUIDinMillis();
    String sPaymentUUID = paymentUUID.toString();
    batch.withRow(PAYMENTS_CF, userId)
        .putColumn(new Composite(sPaymentUUID, "pvtoken"), paymentInfo.pvToken, null)
        .putColumn(new Composite(sPaymentUUID, "name"), paymentInfo.name, null)
        .putColumn(new Composite(sPaymentUUID, "number"), paymentInfo.number, null)
Code Example: Reading Data
- Need some logic to handle record boundaries
    // handle the payment info boundary
    if (lastSeen != null && !column.getName().getPaymentUuid().equals(lastSeen)) {
        payments.add(payment);
        payment = new PaymentInfo();
        payment.paymentUUID = UUID.fromString(column.getName().paymentUuid);
    }
    lastSeen = column.getName().getPaymentUuid();
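The record-boundary logic can be tried out without a cluster. The sketch below is a self-contained stand-in, not the code from the linked repository: `Column` replaces the Astyanax column type, records are plain field maps rather than `PaymentInfo` objects, and the input is assumed to arrive sorted by payment UUID, as columns in one row would.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Rebuilds logical records from a sorted run of (paymentUuid, field) composite columns. */
final class RecordBoundarySketch {

    /** Stand-in for one composite column and its value. */
    static final class Column {
        final String paymentUuid;
        final String field;
        final String value;

        Column(String paymentUuid, String field, String value) {
            this.paymentUuid = paymentUuid;
            this.field = field;
            this.value = value;
        }
    }

    /** Starts a new record whenever the paymentUuid changes; input must be sorted by paymentUuid. */
    static List<Map<String, String>> group(List<Column> sortedColumns) {
        List<Map<String, String>> payments = new ArrayList<>();
        Map<String, String> current = null;
        String lastSeen = null;
        for (Column c : sortedColumns) {
            if (current == null || !c.paymentUuid.equals(lastSeen)) {
                current = new LinkedHashMap<>();
                payments.add(current); // added when opened, so the last record is never lost
            }
            current.put(c.field, c.value);
            lastSeen = c.paymentUuid;
        }
        return payments;
    }

    public static void main(String[] args) {
        List<Column> columns = Arrays.asList(
            new Column("uuid-1", "name", "Alice"),
            new Column("uuid-1", "number", "4111"),
            new Column("uuid-2", "name", "Bob"));
        System.out.println(group(columns)); // [{name=Alice, number=4111}, {name=Bob}]
    }
}
```

Adding each record to the result as soon as it is opened avoids the easy mistake with the boundary check shown on the slide, where the final record must be added separately after the loop ends.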
Code Example: A Bit Messy
Code Example: CQL3
- Need to define a schema
- Cassandra needs it to split up the row for us
Code Example: Schema
    create table paymentinfo_cql (
      user text,
      paymentid timeuuid,
      name text,
      number text,
      pvtoken text,
      primary key (user, paymentid)
    );
Code Example: Inserting Data
    insert into paymentinfo_cql (
      user, paymentid, name, number, pvtoken
    ) values (
      '%1$s','%2$s','%3$s','%4$s','%5$s'
    )
Code Example: Reading Data
    select * from paymentinfo_cql where user='%s'
Multi Datacentre Support
- Cassandra RF=2 (availability), Solr RF=1 (offline search)
- RFs set per keyspace and per logical datacentre
Multi Datacentre Support
- Both DCs participate in the same ring
- Cassandra walks clockwise as normal to fulfil RFs
Performance Tuning Levers: Memory Mapped Files
- SSTables are memory mapped
- Visible as high virtual memory consumption
- Reads are fastest when the working set fits in free RAM
Performance Tuning Levers: Row Cache
- Saves locating SSTables, seeking, reconciliation
- Off-heap: IPC marshaling penalty
- Whole row in memory
- Good for small numbers of hot rows (Gaussian dist.)
Performance Tuning Levers: Key Cache
- Saves seeking through SSTables
- Beneficial for large SSTables (tiered compaction)
- On-heap
Performance Tuning Levers
- Cache hit-rates exposed over JMX
Performance Tuning Levers
- Take care using memory that might be stolen from the read path (VirtMem)
DataStax Enterprise: Solr 4.0 Integration
- Near-realtime indexing
- Columns are available to Solr to index
- Indexes maintained in original file format
- Supports distributed search
- Use Cassandra API or Solr API
DataStax Enterprise: Hadoop Integration
- DataStax implements HDFS on top of Cassandra: CFS
- Use H* or C* API
- No ETL
- Map operations are sent to replicas
- Reduce back to the task owner
Virtual Nodes: Problem #1: Adding New Nodes
Virtual Nodes
- Wish to add a node
- Ring already loaded
- Minimise streaming caused by moves
- Could put it in between 2 existing nodes: only helps a small range (this sucks)
Virtual Nodes
- Double the size of the ring
- Minimise streaming caused by moves
- Don't want to have to buy 2 x servers each time (also sucks)
Virtual Nodes
- Choose to rebalance the ring
- Load already warranted expansion
- Now adding streaming load
Virtual Nodes: Problem #2: Replacing Failed Nodes
Virtual Nodes
- Node fails
- Remaining replica heats up
Virtual Nodes
- Bootstrap another
- Now node 20 starts streaming => FIRE!
Virtual Nodes: The Solution
Virtual Nodes
- Slice each node into 256 token ranges
Virtual Nodes
- Randomly distribute tokens to other nodes
Virtual Nodes
- Each colour represents a node
- Each node owns an even, random distribution of the ring
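The effect of giving every node many random tokens can be checked with a small simulation. This is a sketch under assumptions: the ring is a plain integer range rather than Cassandra's token space, tokens are drawn with `java.util.Random`, and `VnodeSketch`, `assign` and `ownership` are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

/** Assigns many random tokens per node and measures how much of the ring each node owns. */
final class VnodeSketch {

    /** Gives each node tokensPerNode distinct random tokens; needs nodes * tokensPerNode <= ringSize. */
    static TreeMap<Integer, String> assign(List<String> nodes, int tokensPerNode, int ringSize, long seed) {
        Random rnd = new Random(seed);
        TreeMap<Integer, String> ring = new TreeMap<>();
        for (String node : nodes) {
            int placed = 0;
            while (placed < tokensPerNode) {
                int token = rnd.nextInt(ringSize);
                if (ring.putIfAbsent(token, node) == null) placed++; // skip tokens already taken
            }
        }
        return ring;
    }

    /** Ring positions owned per node; each token owns the range (previous token, token]. */
    static Map<String, Integer> ownership(TreeMap<Integer, String> ring, int ringSize) {
        Map<String, Integer> owned = new HashMap<>();
        if (ring.isEmpty()) return owned;
        int prev = ring.lastKey() - ringSize; // wrap-around predecessor of the first token
        for (Map.Entry<Integer, String> e : ring.entrySet()) {
            owned.merge(e.getValue(), e.getKey() - prev, Integer::sum);
            prev = e.getKey();
        }
        return owned;
    }

    public static void main(String[] args) {
        int ringSize = 1 << 20;
        List<String> nodes = Arrays.asList("node1", "node2", "node3", "node4");
        TreeMap<Integer, String> ring = assign(nodes, 256, ringSize, 42L);
        // With 256 tokens each, the shares should come out close to a quarter of the ring.
        ownership(ring, ringSize).forEach((node, share) ->
            System.out.println(node + " owns " + (100.0 * share / ringSize) + "%"));
    }
}
```

Because each node's ranges are scattered around the ring, a replacement node has neighbours on every other node, which is what lets it stream from all of them at once.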
Virtual Nodes: Replacing a node
- Can stream from every node
Nodetool & Opscenter
- Do stuff with your deployment
- watch "nodetool ring": useful overview of the ring (tokens, health)
- Opscenter
Questions
htraining.s3.amazonaws.com/cassandra-training.pptx