1 Not Only SQL

2 Table of Contents
- Background and history
- Used Applications
- What is Cassandra? – Overview
- Replication & Consistency
- Writing, Reading, Querying and Sorting
- APIs & Installation
- World Database in Cassandra
- Using Hector API
- Administration tools

3 Background
Influential Technologies:
- Dynamo – Fully distributed design - infrastructure
- BigTable – Sparse data model

4 Other NoSql databases NoSql Big Data NoSql MongoDB Neo4J HyperGra Memcach Tokyo Ca Redis CouchDB Hypertab Cassandra Riak Voldemort HBase

5 Bigtable / Dynamo
- Bigtable: HBase, Hypertable
- Dynamo: Riak, Voldemort
- Cassandra: Combination of Both

6 CAP Theorem
- Consistency
- Availability
- Partition Tolerance

7 Applications
- Facebook
- Google Code
- Apache
- Digg
- Twitter
- Rackspace
- Others…

8 What Is Cassandra?
- O(1) node lookup
- Key – Value Store
- Column based data store
- Highly Distributed – decentralized (no master/slave)
- Elasticity
- Durable, Fault-tolerant - Replications
- Sparse
- ACID
- NoSQL!

9 Overview – Data Model
- Keyspace: uppermost namespace; typically one per application
- Column: basic unit of storage – name, value and timestamp
- ColumnFamily: associates records of a similar kind; record-level atomicity; indexed
- SuperColumn: columns whose values are columns; array of columns
- SuperColumnFamily: ColumnFamily whose values are only SuperColumns
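
The nesting described on this slide can be pictured as maps inside maps. The following is a minimal in-memory sketch, not Cassandra's actual storage code; every class and method name in it (DataModelSketch, Keyspace, ColumnFamily, Column, put, get) is hypothetical and chosen only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy picture of the data model. All names here are hypothetical.
class DataModelSketch {
    // Column: the basic unit of storage (name, value, timestamp).
    static final class Column {
        final String name;
        final String value;
        final long timestamp;

        Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // ColumnFamily: row key -> columns, kept sorted by column name.
    // Rows are sparse: each row holds only the columns written to it.
    static final class ColumnFamily {
        private final Map<String, TreeMap<String, Column>> rows = new TreeMap<>();

        void put(String rowKey, String name, String value, long timestamp) {
            rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
                .put(name, new Column(name, value, timestamp));
        }

        Column get(String rowKey, String name) {
            TreeMap<String, Column> row = rows.get(rowKey);
            return row == null ? null : row.get(name);
        }
    }

    // Keyspace: the uppermost namespace, holding column families by name.
    static final class Keyspace {
        private final Map<String, ColumnFamily> columnFamilies = new TreeMap<>();

        ColumnFamily columnFamily(String name) {
            return columnFamilies.computeIfAbsent(name, k -> new ColumnFamily());
        }
    }

    public static void main(String[] args) {
        Keyspace world = new Keyspace();
        ColumnFamily cities = world.columnFamily("Cities");
        cities.put("1", "name", "ORANJESTAD", 1L);
        cities.put("1", "population", "33000", 1L);
        System.out.println(cities.get("1", "name").value);
    }
}
```

A SuperColumnFamily would add one more level to this picture: each row would map a supercolumn name to its own sorted map of columns.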

10 Examples
Column - City: ORANJESTAD
    {"id": 1, "name": "ORANJESTAD", "population": 33000, "capital": true}
SuperColumns – Country: Aruba
    {"id": "aa", "name": "Aruba", "fullName": "Aruba",
     "location": "Caribbean, island in the Caribbean Sea, north of Venezuela",
     "coordinates": { "latitudeType": "N", "latitude": 12.5, "longitudeType": "W", "longitude": 69.96667}, ….

11 Replication & Consistency
- Consistency Level is based on the Replication Factor (N), not the number of nodes in the system.
- There are a few options to set:
  - How many replicas must respond to declare success
  - Query all replicas on every read
- Every Column has a value and a timestamp – latest timestamp wins
- Read repair – read one replica and check the checksum/timestamp to verify
- R (number of replicas to read from) + W (number of replicas to write to) > N (replication factor): every read set overlaps the latest write set
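
The R + W > N rule is plain arithmetic, so it can be checked in a few lines. This is a sketch under stated assumptions: the class and method names (QuorumSketch, quorum, overlaps) are hypothetical, and the quorum size is taken to be a strict majority of the N replicas.

```java
// Hypothetical helper names; only the arithmetic comes from the slide.
class QuorumSketch {
    // Quorum for replication factor n: a strict majority of the replicas.
    static int quorum(int n) {
        return n / 2 + 1;
    }

    // With R + W > N, any R replicas read must include at least one
    // of the W replicas that acknowledged the latest write.
    static boolean overlaps(int r, int w, int n) {
        return r + w > n;
    }

    public static void main(String[] args) {
        int n = 3;
        int q = quorum(n);
        System.out.println("quorum for N=3: " + q);
        System.out.println("quorum reads + quorum writes overlap: " + overlaps(q, q, n));
        System.out.println("R=1, W=1 overlap: " + overlaps(1, 1, n));
    }
}
```

With N = 3, quorum reads and quorum writes (2 + 2 > 3) always overlap, while R = 1 and W = 1 do not, so such a read may return a stale value until read repair catches up.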

12 The Ring - Partitioning
- Each NODE has a single, unique TOKEN
- Each NODE claims a RANGE of its neighbors in the ring
- Partitioning – Map from Key Space to Token – Can be random or Order Preserving
- Snitching – Map from Nodes to Physical Location
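
The token ring can be sketched with a sorted map from token to node. This is an illustration only: the names (RingSketch, addNode, token, nodeForToken) are hypothetical, the token space is shrunk to [0, 100), and the hash used as a partitioner is a stand-in rather than Cassandra's real one. It assumes the ring holds at least one node.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of token-ring lookup; not Cassandra's implementation.
class RingSketch {
    // token -> node name, sorted so the ring can be walked in token order.
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addNode(int token, String node) {
        ring.put(token, node);
    }

    // Toy random-style partitioner (assumption): hash the key into [0, 100).
    static int token(String key) {
        return Math.floorMod(key.hashCode(), 100);
    }

    // The owner is the first node whose token is >= the key's token,
    // wrapping around to the lowest token at the end of the ring.
    String nodeForToken(int token) {
        Map.Entry<Integer, String> entry = ring.ceilingEntry(token);
        if (entry == null) {
            entry = ring.firstEntry();
        }
        return entry.getValue();
    }

    String nodeFor(String key) {
        return nodeForToken(token(key));
    }

    public static void main(String[] args) {
        RingSketch ring = new RingSketch();
        ring.addNode(25, "A");
        ring.addNode(50, "B");
        ring.addNode(75, "C");
        System.out.println("token 10 -> " + ring.nodeForToken(10));
        System.out.println("token 80 -> " + ring.nodeForToken(80));
        System.out.println("key 'Aruba' -> " + ring.nodeFor("Aruba"));
    }
}
```

An order-preserving partitioner would replace the hash with a mapping that keeps keys in their natural order, which is what makes key-range queries possible.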

13 Writing
- No Locks
- Append support without read ahead
- Atomicity guarantee for a key (in a ColumnFamily)
- Always Writable!!!
- SSTables – Key/data – SSTable file for each column family
- Fast

14 Reading
- Wait for R responses
- Wait for N – R responses in the background and perform read repair
- Read multiple SSTables
- Slower than writes (but still fast)
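
Because writes only append, a read may find several versions of the same column in different SSTables or replica responses, and the latest timestamp wins. The sketch below shows that reconciliation step in isolation; the names (ReadMergeSketch, Versioned, merge) are hypothetical, and each map stands in for one SSTable or one replica's answer.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of "latest timestamp wins" reconciliation.
class ReadMergeSketch {
    static final class Versioned {
        final String value;
        final long timestamp;

        Versioned(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Merge column versions from several sources (column name -> version),
    // keeping for each column the version with the highest timestamp.
    static Map<String, Versioned> merge(List<Map<String, Versioned>> sources) {
        Map<String, Versioned> result = new HashMap<>();
        for (Map<String, Versioned> source : sources) {
            for (Map.Entry<String, Versioned> e : source.entrySet()) {
                Versioned current = result.get(e.getKey());
                if (current == null || e.getValue().timestamp > current.timestamp) {
                    result.put(e.getKey(), e.getValue());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Versioned> older = new HashMap<>();
        older.put("name", new Versioned("ORANJESTAD", 1L));
        older.put("population", new Versioned("30000", 1L));

        Map<String, Versioned> newer = new HashMap<>();
        newer.put("population", new Versioned("33000", 2L));

        Map<String, Versioned> merged = merge(Arrays.asList(older, newer));
        System.out.println("population = " + merged.get("population").value);
    }
}
```

Read repair follows from the same comparison: a replica whose version lost the comparison can be sent the winning one.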

15 Compare with MySQL (RDBMS)
Compare a 50GB Database:
- MySQL: ~300ms write, ~350ms read
- Cassandra: ~0.12ms write, ~15ms read

16 Queries
- Single column
- Slice
  - Set of names / range of names
  - Simple slice -> columns
  - Super slice -> supercolumns
- Key range
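
Both kinds of slice are easy to picture on a row whose columns are already stored in sorted order. The sketch below models one row as a sorted map; the names (SliceSketch, sliceByRange, sliceByNames) are hypothetical, and treating the range as inclusive on both ends is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch of slice queries over one row (column name -> value).
class SliceSketch {
    // Range of names: every column with start <= name <= end, in stored order.
    static NavigableMap<String, String> sliceByRange(
            NavigableMap<String, String> row, String start, String end) {
        return row.subMap(start, true, end, true);
    }

    // Set of names: only the explicitly requested columns that exist.
    static Map<String, String> sliceByNames(
            NavigableMap<String, String> row, List<String> names) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String name : names) {
            String value = row.get(name);
            if (value != null) {
                out.put(name, value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        row.put("id", "1");
        row.put("name", "ORANJESTAD");
        row.put("population", "33000");
        row.put("capital", "true");

        System.out.println(sliceByRange(row, "id", "name"));
        System.out.println(sliceByNames(row, Arrays.asList("population", "missing")));
    }
}
```

The range slice is cheap only because the sort order was fixed when the columns were written, which is why the order cannot be changed at query time.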

17 Sorting
- Sorting is set on writing
- Sorting is set by the type of the Column/Supercolumn keys
- Sorting/key types: Bytes, UTF8, Ascii, LexicalUUID, TimeUUID

18 Drawbacks
- No joins (for speed)
- Not able to sort at query time
- No real SQL support (although some APIs support a very small portion of it)

19 APIs
- Many APIs for a large number of languages, including C++, Java, Python, PHP, Ruby, Erlang, Haskell, C#, Javascript and more…
- Thrift interface – driver-level interface – hard to use.
- Hector – a Java Cassandra client – simple column-based client – does what Cassandra is intended to do.
- Kundera – JPA-supporting Java client – tries to translate JPA classes and attributes to Cassandra – good on inserts, still hard and problematic with queries.

20 Cassandra Installation
- Install the prerequisite – basically the latest Java SE release
- Extract the Cassandra Zip files to your requested path
- Run bin/cassandra.bat -f
- Cassandra node is up and running

21 World database in Cassandra
World - Keyspace
- Countries – SuperColumn Family
  - CountryDetails – SuperColumn
  - Border – SuperColumns
  - Coordinates – SuperColumn
  - GDP – SuperColumn
  - Language – SuperColumns
- Cities – Column Family

22 Using Hector API - definitions
Creating a Cassandra Cluster:
    Cluster cluster = HFactory.getOrCreateCluster("WorldCluster", "localhost:9160");
Adding a keyspace:
    columnFamilyDefinition.setKeyspaceName(WORLD_KEYSPACE);
Adding a Column:
    BasicColumnFamilyDefinition columnFamilyDefinition = new BasicColumnFamilyDefinition();
    columnFamilyDefinition.setKeyspaceName(WORLD_KEYSPACE);
    columnFamilyDefinition.setName(CITY_CF); // ColumnFamily name
    columnFamilyDefinition.addColumnDefinition(columnDefinition);

23 Using Hector API - definitions
Adding a SuperColumn:
    BasicColumnFamilyDefinition superCfDefinition = new BasicColumnFamilyDefinition();
    superCfDefinition.setKeyspaceName(WORLD_KEYSPACE);
    superCfDefinition.setName(COUNTRY_SUPER);
    superCfDefinition.setColumnType(ColumnType.SUPER);
Adding all definitions to the cluster:
    ColumnFamilyDefinition cfDefStandard = new ThriftCfDef(columnFamilyDefinition);
    ColumnFamilyDefinition cfDefSuper = new ThriftCfDef(superCfDefinition);
    KeyspaceDefinition keyspaceDefinition = HFactory.createKeyspaceDefinition(WORLD_KEYSPACE,
        "org.apache.cassandra.locator.SimpleStrategy", 1, Arrays.asList(cfDefStandard, cfDefSuper));
    cluster.addKeyspace(keyspaceDefinition);

24 Using Hector API - inserting
Creating a Column Template:
    ColumnFamilyTemplate template = new ThriftColumnFamilyTemplate(keyspaceOperator, columnFamilyName, stringSerializer, stringSerializer);
Adding a Row into a Column Family:
    ColumnFamilyUpdater updater = template.createUpdater("a key");
    updater.setString("key", "value");
    try {
        template.update(updater);
    } catch (HectorException e) {
        // do something...
    }

25 Using Hector API - inserting
Creating a Super Column Template:
    SuperCfTemplate template = new ThriftSuperCfTemplate(keyspaceOperator, columnFamilyName, stringSerializer, stringSerializer);
Adding a Row into a SuperColumn Family:
    SuperCfUpdater updater = template.createUpdater("a key");
    HSuperColumn superColumn = updater.addSuperColumn("sc name");
    superColumn.setString("column name", value);
    superColumn.update();
    try {
        template.update(updater);
    } catch (HectorException e) {
        // do something...
    }

26 Using Hector API - reading
Reading all Rows and their columns from a Column Family (using CQL):
    CqlQuery<String, String, String> cqlQuery = new CqlQuery<String, String, String>(factory.getKeyspaceOperator(), stringSerializer, stringSerializer, stringSerializer);
    cqlQuery.setQuery("select * from City");
    QueryResult<CqlRows<String, String, String>> result = cqlQuery.execute();
Reading all columns from a Row in a SuperColumn Family:
    SuperCfTemplate superColumn = HectorFactory.getFactory().getSuperColumnFamilyTemplate("SuperColumnFamily");
    SuperCfResult superRes = superColumn.querySuperColumns("key");
    Collection columnNames = superRes.getSuperColumns();

27 Using Hector API - reading
Reading a SuperColumn from a Row in a SuperColumn Family:
    SuperColumnQuery<String, String, String, String> query = HFactory.createSuperColumnQuery(keyspaceOperator, stringSerializer, stringSerializer, stringSerializer, stringSerializer);
    query.setColumnFamily("SuperColumnFamily");
    query.setKey("key");
    query.setSuperName("SuperColumnName");
    QueryResult<HSuperColumn<String, String, String>> result = query.execute();
    for (HColumn<String, String> col : result.get().getColumns()) {
        String name = col.getName();
        String value = col.getValue();
    }
Every query has options to get part of the rows, by setting a start value and an end value (the rows are sorted on inserting), and part of the columns, by setting the column names explicitly.

28 Administration tools
- Cassandra – node activator
- Nodetool – bootstrapping and monitoring
- Cassandra-cli – Application Console
- Sstable2json - Export
- Json2sstable - Import

