Jean Armel Luce Orange France Thursday, June Cassandra Meetup Nice
Summary
- Short description of PnS
- Why did we choose C*?
- Some key features of C*
- After the migration …
- Analytics with Hadoop/Hive over Cassandra
- Some conclusions about the project PnS3.0

Short description of PnS3
PnS – Short description
PnS means Profiles and Syndication: a highly available service for collecting and serving live data about Orange customers.
End users of PnS are:
- Orange customers (logged in to the Portal)
- Sellers in Orange shops
- Some services in Orange (advertisements, …)

PnS – The Big Picture
- End users: millions of HTTP requests (REST or SOAP)
- PnS: web service to get or set data stored by PnS, with post-processing of each item: postProcessing(data1), postProcessing(data2), postProcessing(data3), …, postProcessing(datax)
- Data providers: thousands of files (CSV or XML), scheduled data injection
- Database: fast and highly available; DB queries, R/W operations

PnS – Architecture
Until 2012, data were stored in 2 different backends:
- MySQL cluster (for volatile data)
- PostgreSQL "cluster" (sharding and replication)
and web services (reads and writes) for batch updates.
2-DC architecture for high availability: Bagnolet and Sophia Antipolis.

Timeline – Key dates of PnS 3.0
- PNS to 2012
- Study phase: we did a large study of several NoSQL databases (Cassandra, MongoDB, Riak, HBase, Hypertable, …) and chose Cassandra as the single backend for PnS.
- 06/2012, Design phase: we started the design phase of PnS3.0.
- 09/2012, Proof of concept: we started a 1st (small) Cassandra cluster in production for a non-critical application: 1 table, key-value access.
- 04/2013, Production phase: migration of the 1st subset of PnS data from MySQL cluster to Cassandra in production.
- 05/2013 to 12/2013, Complete migration: migration of all other subsets of data from MySQL cluster and PostgreSQL to Cassandra; new nodes added to the cluster (from 8 nodes in each DC to 16 nodes in each DC).
- Analytics: add a 3rd datacenter (04/2014).

PnS – Why did we choose Cassandra?
Cassandra fits our requirements:
- Very high availability (PnS2 = 99.95% availability; we want to improve it)
- Low latency (20 ms < response time of the PnS2 web service < 150 ms; we want to improve it)
- Scalability (higher load and higher volume in the coming years? Unpredictable; better scalability brings new business)
And also:
- Ease of use: Cassandra is easy to administer and operate
- Some features that I like (rack awareness, CL per request, …)
- Cassandra is designed to work naturally in a multi-datacenter architecture

Some key features of C*
Who is Cassandra?
- Cassandra is a NoSQL database, developed in Java.
- Cassandra was created at Facebook in 2007, was open sourced and then incubated at Apache, and is now a Top-Level Project.
- 2 distributions of Cassandra:
  - Community edition: distributed under the Apache License
  - Enterprise distribution: distributed by DataStax

Cassandra: architecture/topology
- The main characteristic of Cassandra: all the nodes play the same role.
- No master, no slave, no configuration server: no SPOF.
- Rows are sharded among the nodes in the cluster.
- Rows are replicated among the nodes in the cluster.
- The replication strategy of a keyspace (SimpleStrategy, NetworkTopologyStrategy) defines how and where its rows are replicated (single datacenter, multi-datacenter, …).

Cassandra: how requests are executed
- The application sends a request to one of the nodes (not always the same one, so that the load is balanced among the nodes). This node is called the coordinator.
- The coordinator routes the query to the datanode(s).
- The datanodes execute the query and return the result to the coordinator.
- The coordinator returns the result to the application.

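The request path above can be sketched in a few lines of Python. This is only a toy model, not Cassandra's implementation: the node names, the class `ToyCluster`, the helper `replicas_for` and the hash-based placement are all assumptions made for illustration.

```python
import hashlib
from itertools import cycle

NODES = ["node1", "node2", "node3", "node4"]  # hypothetical node names
RF = 3  # replication factor assumed for this sketch


def replicas_for(key):
    """Toy placement: RF consecutive nodes, starting from a hash of the key."""
    start = int(hashlib.md5(key.encode()).hexdigest(), 16) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(RF)]


class ToyCluster:
    def __init__(self):
        self.storage = {node: {} for node in NODES}
        # The application rotates over the nodes to balance the load.
        self._coordinators = cycle(NODES)

    def write(self, key, value):
        coordinator = next(self._coordinators)
        for node in replicas_for(key):  # the coordinator routes to the datanodes
            self.storage[node][key] = value
        return coordinator

    def read(self, key):
        coordinator = next(self._coordinators)
        datanode = replicas_for(key)[0]  # one replica answers in this sketch
        return coordinator, self.storage[datanode].get(key)
```

Two consecutive requests go through two different coordinators, while the row itself always lives on the same RF datanodes.
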
Cassandra: what happens when a node crashes? Read query
Case 1: a READ query is executed while a datanode is down. The coordinator has already learned (via Gossip) that the node is down and does not send any request to it; the other replicas answer.

Cassandra: what happens when a node crashes? Read query
Case 2: the coordinator crashes while a READ query is being executed. The application receives an error (or times out), then re-sends the request to another node, which acts as the new coordinator.

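A minimal sketch of the client-side behaviour in case 2, assuming a hypothetical `send(node)` callable that raises when the chosen coordinator is unreachable. `NodeUnavailable`, `execute_with_failover` and `demo_send` are illustrative names, not part of any Cassandra driver; in practice client drivers usually provide this kind of retry themselves.

```python
class NodeUnavailable(Exception):
    """Hypothetical error raised when a coordinator cannot be reached."""


def execute_with_failover(nodes, send):
    """Try each node in turn as coordinator until one returns a result."""
    last_error = None
    for node in nodes:
        try:
            return send(node)
        except (NodeUnavailable, TimeoutError) as error:
            last_error = error  # remember the failure and try the next node
    raise RuntimeError("no coordinator could execute the request") from last_error


def demo_send(node):
    """Stand-in for a real request: node1 is considered crashed."""
    if node == "node1":
        raise NodeUnavailable(node)
    return f"result via {node}"
```

Calling `execute_with_failover(["node1", "node2", "node3"], demo_send)` returns the result obtained through node2. Blind retries like this are safest for reads and for idempotent writes.
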
Cassandra: what happens when a node crashes? Write query
Case 1: a WRITE query is executed while a datanode is down and enough replicas are up:
- The write is executed on all the replicas that are available.
- A Hinted Handoff (HH) is stored in the coordinator, and the write is replayed when the datanode comes back (if it comes back within 3 hours).
3 mechanisms for keeping the nodes consistent:
- Hinted Handoffs (repair when a node comes back into the ring after a failure)
- Read repairs (automatic repair in the background for 10% of read queries)
- Anti-entropy repairs (manual repair of all data)

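The hinted handoff mechanism can be illustrated with a small in-memory model. This is a sketch under simplifying assumptions: `ToyCoordinator` is a hypothetical class, hints are kept in a plain list, and the 3-hour hint window is not modelled.

```python
class ToyCoordinator:
    def __init__(self, replica_names):
        self.replicas = {name: {} for name in replica_names}
        self.up = {name: True for name in replica_names}
        self.hints = []  # (target replica, key, value) waiting to be replayed

    def mark_down(self, name):
        self.up[name] = False

    def write(self, key, value):
        """Write to every live replica; store a hint for each dead one."""
        acks = 0
        for name, store in self.replicas.items():
            if self.up[name]:
                store[key] = value
                acks += 1
            else:
                self.hints.append((name, key, value))
        return acks

    def mark_up(self, name):
        """The node is back: replay its hints, keep the hints of other nodes."""
        self.up[name] = True
        pending = []
        for target, key, value in self.hints:
            if target == name:
                self.replicas[name][key] = value
            else:
                pending.append((target, key, value))
        self.hints = pending
```

With three replicas and one of them down, a write is acknowledged by two replicas, and the third one receives the value only when `mark_up` replays the hint.
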
Eventual consistency with Cassandra
We can specify the consistency level (CL) for each read/update/insert/delete request.
CLs mostly used: LOCAL_ONE, ONE, ANY, LOCAL_QUORUM, QUORUM, ALL, SERIAL
Strong consistency: W + R > RF (W = replicas that acknowledge a write, R = replicas that answer a read, RF = replication factor)
Trade-off:
  Consistency    Weak     Strong
  Availability   Higher   Lower
  Latency        Lower    Higher

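The rule W + R > RF is easy to check by hand; the sketch below just encodes it. The function names `quorum` and `is_strongly_consistent` are illustrative, and W and R are taken as numbers of replicas within a single datacenter.

```python
def quorum(rf):
    """Number of replicas in a quorum for a given replication factor."""
    return rf // 2 + 1


def is_strongly_consistent(w, r, rf):
    """True when every read set overlaps every write set (W + R > RF)."""
    return w + r > rf


RF = 3
print(is_strongly_consistent(quorum(RF), quorum(RF), RF))  # QUORUM writes + QUORUM reads
print(is_strongly_consistent(1, 1, RF))                    # ONE writes + ONE reads
print(is_strongly_consistent(RF, 1, RF))                   # ALL writes + ONE reads
```

With RF = 3, QUORUM/QUORUM gives 2 + 2 > 3 (strong), ONE/ONE gives 1 + 1 = 2 (eventual), and ALL/ONE gives 3 + 1 > 3 (strong, but a single dead replica blocks writes).
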
Cassandra: sharding with virtual nodes (versions 1.2+)
Virtual nodes are available since C* 1.2. With virtual nodes, adding a new node (or many nodes) to the cluster is easy:
- Data are moved from ALL the old nodes to the new node, so each old node has little data to send.
- After the data have moved, the cluster is still well balanced.
- The procedure is fully automated.
Adding a new node to the cluster is a normal operation, done online without interruption of service.
When nodes are added, replicas are also moved between nodes.

Cassandra: adding a node using vnodes
Example: adding a 5th node to a 4-node cluster (diagram: Node 1, Node 2, Node 3, Node 4 and the new Node 5)

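The effect of adding a 5th node can be simulated with a toy token ring. This is a sketch, not Cassandra's partitioner: tokens are drawn from a seeded random generator, keys are hashed with MD5, replication is ignored, and the names `ToyRing` and `simulate_fifth_node` are hypothetical. The figure of 256 vnodes per node is an assumption for the example.

```python
import bisect
import hashlib
import random
from collections import Counter

RING_SIZE = 2**64


def token_of(key):
    """Map a key to a position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class ToyRing:
    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._tokens = []  # sorted vnode tokens
        self._owner = {}   # token -> node name

    def add_node(self, node, vnodes=256):
        for _ in range(vnodes):
            token = self._rng.randrange(RING_SIZE)
            while token in self._owner:  # avoid the (very unlikely) collision
                token = self._rng.randrange(RING_SIZE)
            bisect.insort(self._tokens, token)
            self._owner[token] = node

    def owner(self, key):
        """A key belongs to the first vnode token at or after its position."""
        idx = bisect.bisect_left(self._tokens, token_of(key)) % len(self._tokens)
        return self._owner[self._tokens[idx]]


def simulate_fifth_node(key_count=10_000):
    ring = ToyRing()
    for node in ("node1", "node2", "node3", "node4"):
        ring.add_node(node)
    keys = [f"custid:{i}" for i in range(key_count)]
    before = {k: ring.owner(k) for k in keys}
    ring.add_node("node5")
    after = {k: ring.owner(k) for k in keys}
    moved = [k for k in keys if before[k] != after[k]]
    donors = Counter(before[k] for k in moved)       # who gave data away
    receivers = Counter(after[k] for k in moved)     # who received data
    return len(moved), donors, receivers
```

In this model every moved key ends up on node5, each of the four old nodes gives up a share, and no key moves between two old nodes, which is the behaviour the slide describes.
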
CQL (Cassandra Query Language)
DDL:
- CREATE KEYSPACE, CREATE TABLE, CREATE INDEX
- ALTER KEYSPACE, ALTER TABLE
- DROP KEYSPACE, DROP TABLE, DROP INDEX
DML:
- SELECT
- INSERT/UPDATE (INSERT is equivalent to UPDATE: both are upserts, which improves performance)
- DELETE (delete a ROW, or delete COLUMNS in a ROW)

CQL (Cassandra Query Language)
Example:
  -u cassandra -p cassandra
  Connected to Test Cluster at localhost:9160.
  [cqlsh | Cassandra beta1-SNAPSHOT | CQL spec | Thrift protocol ]
  Use HELP for help.
  cqlsh> CREATE KEYSPACE fr
     ... WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
  cqlsh> use fr;
Notes:
- A keyspace is used instead of a database.
- 'class' sets the replication strategy.
- 'replication_factor' sets the replication factor.
- "use fr" connects to the keyspace.

CQL (Cassandra Query Language)
Example:
  cqlsh:fr> CREATE TABLE customer (
        ... custid int,
        ... firstname text,
        ... lastname text,
        ... PRIMARY KEY (custid) );
  cqlsh:fr> UPDATE customer SET firstname = 'Bill', lastname = 'Azerty' WHERE custid = 1;
  cqlsh:fr> INSERT INTO customer (custid, firstname, lastname) VALUES (2, 'Steve', 'Qwerty');
Note: INSERT is equivalent to UPDATE.

CQL (Cassandra Query Language)
Example: SELECT with a WHERE clause on the primary key
  cqlsh:fr> SELECT firstname, lastname FROM customer WHERE custid = 1;
   firstname | lastname
   Bill      | Azerty
  (1 rows)
  cqlsh:fr> SELECT * FROM customer WHERE custid = 2;
   custid | firstname | lastname
   2      | Steve     | Qwerty
  (1 rows)

CQL (Cassandra Query Language)
Example: SELECT rejected
  cqlsh:fr> SELECT * FROM customer WHERE lastname = 'Azerty';
  Bad Request: No indexed columns present in by-columns clause with Equal operator
- This request requires an index on the column 'lastname': a column used in the WHERE clause must be indexed.
- No operator other than = is accepted in the WHERE clause (for example, != is rejected).

After the migration …
The latency
Comparison before/after migration to Cassandra: the graphs of web service latency are very explicit.
(Graphs: service push mail and service push webxms, with the dates of migration to C* marked.)

The latency
Read and write latencies are now in microseconds in the datanodes, thanks to … and …
This latency will be improved by (tests in progress):
  ALTER TABLE syndic WITH compaction = { 'class' : 'LeveledCompactionStrategy', 'sstable_size_in_mb' : ?? };

The availability
We had a few hardware failures and network outages. No impact on QoS:
- no error returned by the application
- no real impact on latency

The scalability
- Thanks to vnodes (available since Cassandra 1.2), it is easy to scale out.
- With NetworkTopologyStrategy, make sure to distribute the nodes evenly across the racks.

Analytics with Hadoop/Hive over Cassandra
Basic architecture of the Cassandra cluster
Cluster without Hadoop: 2 datacenters, 16 nodes in each DC
- RF (DC1, DC2) = (3, 3)
- Web servers in DC1 send queries to C* nodes in DC1
- Web servers in DC2 send queries to C* nodes in DC2

Architecture of the Cassandra cluster with the datacenter for analytics
Cluster with Hadoop: 3 datacenters, 16 nodes in DC1, 16 nodes in DC2, 4 nodes in DC3
- RF (DC1, DC2, DC3) = (3, 3, 1)
- Because RF = 1 in DC3, we need less storage space in this datacenter
- We favor cheaper disks (SATA) in DC3 rather than SSDs or FusionIO cards

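The storage argument can be made concrete with a little arithmetic. The sketch below assumes data are spread evenly over the nodes of each datacenter and ignores compaction and snapshot overhead; the `Datacenter` class and its method are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datacenter:
    name: str
    nodes: int
    rf: int  # number of full copies of the dataset kept in this datacenter

    def share_per_node(self):
        """Fraction of one dataset copy stored on each node, assuming even spread."""
        return self.rf / self.nodes


CLUSTER = [
    Datacenter("DC1", nodes=16, rf=3),
    Datacenter("DC2", nodes=16, rf=3),
    Datacenter("DC3", nodes=4, rf=1),
]

total_copies = sum(dc.rf for dc in CLUSTER)
for dc in CLUSTER:
    print(dc.name, "copies:", dc.rf, "share per node:", dc.share_per_node())
print("copies in the whole cluster:", total_copies)
```

DC3 holds a single copy of the data instead of three, so its total storage need is one third of that of DC1 or DC2. Each of its 4 nodes still carries a quarter of the dataset, against 3/16 for a node in DC1 or DC2.
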
Some conclusions about the project PnS3
Conclusions
With Cassandra, we have improved our QoS:
- Lower response time
- Higher availability
- More flexibility for the operations teams
We are able to open our service to new opportunities.
There is a large ecosystem around C* (Hadoop, Hive, Pig, Storm, Shark, Titan, …), which offers more capabilities.
The Cassandra community is very active and helps a lot. There are a lot of resources available: mailing lists, blogs, …

Thank you for your attention
Questions

A few answers about hardware / OS version / Java version / Cassandra version / Hadoop version
Hardware:
- 16 nodes in DC1 and DC2 at the end of 2013: 2 CPUs with 6 cores each (Intel® Xeon® 2.00 GHz), 64 GB RAM, FusionIO 320 GB MLC
- 4 nodes in DC3: 2 CPUs with 6 cores each (Intel® Xeon® 2.00 GHz), 24 GB RAM, SATA disks 15K rpm
OS: Ubuntu Precise (12.04 LTS)
Cassandra version :
Hadoop version: CDH 4.5 (with Hive 0.10): Hadoop 2 with MRv1
Hive handler :
Java version: Java 7u45 (CMS GC with option CMSClassUnloadingEnabled)

A few answers about data and requests
Volume: 6 TB at the end of 2013
Data types:
- elementary types: boolean, integer, string, date
- collection types
- complex types: JSON, XML (between 1 and 20 KB)
Requests (requests/sec at the end of 2013):
- 80% get
- 20% set
Consistency level used by PnS for online queries and batch updates:
- LOCAL_ONE (95% of the queries)
- LOCAL_QUORUM (5% of the queries)