Published by Geoffrey Kennedy. Modified over 9 years ago.
1
Not Only SQL
2
Table of Contents
- Background and history
- Applications
- What is Cassandra? Overview
- Replication & consistency
- Writing, reading, querying and sorting
- APIs & installation
- World database in Cassandra
- Using the Hector API
- Administration tools
3
Background
Influential technologies:
- Dynamo: fully distributed design (infrastructure)
- BigTable: sparse data model
4
Other NoSQL databases
Diagram labels: NoSQL, Big Data
MongoDB, Neo4J, HyperGra, Memcach, Tokyo Ca, Redis, CouchDB, Hypertab, Cassandra, Riak, Voldemort, HBase
5
Bigtable / Dynamo
- Bigtable: HBase, Hypertable
- Dynamo: Riak, Voldemort
- Combination of both: Cassandra
6
CAP Theorem
- Consistency
- Availability
- Partition tolerance
7
Applications
- Facebook
- Google Code
- Apache
- Digg
- Twitter
- Rackspace
- Others…
8
What Is Cassandra?
- O(1) node lookup
- Key-value store
- Column-based data store
- Highly distributed: decentralized (no master/slave)
- Elasticity
- Durable, fault-tolerant: replication
- Sparse
- ACID
- NoSQL!
9
Overview: Data Model
- Keyspace
  - Uppermost namespace
  - Typically one per application
- Column
  - Basic unit of storage: name, value and timestamp
- ColumnFamily
  - Associates records of a similar kind
  - Record-level atomicity
  - Indexed
- SuperColumn
  - Columns whose values are columns
  - Array of columns
- SuperColumnFamily
  - ColumnFamily whose values are only SuperColumns
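The data model above can be pictured as nested sorted maps. The following is a minimal sketch in plain Java, not Cassandra or Hector code; the class and method names (DataModelSketch, ColumnFamily, SuperColumnFamily, put, get) are hypothetical and chosen only for illustration.

```java
import java.util.TreeMap;

/** Illustrative model only; this is not how Cassandra stores data on disk. */
final class DataModelSketch {
    /** Basic unit of storage: name, value and timestamp. */
    static final class Column {
        final String name;
        final String value;
        final long timestamp;

        Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** ColumnFamily: row key -> (column name -> Column), both kept sorted. */
    static final class ColumnFamily {
        private final TreeMap<String, TreeMap<String, Column>> rows = new TreeMap<>();

        void put(String rowKey, Column column) {
            rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column.name, column);
        }

        Column get(String rowKey, String columnName) {
            TreeMap<String, Column> row = rows.get(rowKey);
            return row == null ? null : row.get(columnName);
        }
    }

    /** SuperColumnFamily: row key -> (super column name -> (column name -> Column)). */
    static final class SuperColumnFamily {
        private final TreeMap<String, TreeMap<String, TreeMap<String, Column>>> rows = new TreeMap<>();

        void put(String rowKey, String superColumn, Column column) {
            rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
                .computeIfAbsent(superColumn, k -> new TreeMap<>())
                .put(column.name, column);
        }

        Column get(String rowKey, String superColumn, String columnName) {
            TreeMap<String, TreeMap<String, Column>> row = rows.get(rowKey);
            if (row == null) return null;
            TreeMap<String, Column> columns = row.get(superColumn);
            return columns == null ? null : columns.get(columnName);
        }
    }

    public static void main(String[] args) {
        ColumnFamily cities = new ColumnFamily();
        cities.put("1", new Column("name", "ORANJESTAD", 1L));

        SuperColumnFamily countries = new SuperColumnFamily();
        countries.put("aa", "coordinates", new Column("latitude", "12.5", 1L));

        System.out.println(cities.get("1", "name").value);
        System.out.println(countries.get("aa", "coordinates", "latitude").value);
    }
}
```

A SuperColumnFamily simply adds one more level of nesting, which is why a SuperColumn can be read as "a column whose value is a set of columns".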
10
Examples
Column - City: ORANJESTAD
{"id": 1, "name": "ORANJESTAD", "population": 33000, "capital": true}
SuperColumns - Country: Aruba
{"id": "aa", "name": "Aruba", "fullName": "Aruba", "location": "Caribbean, island in the Caribbean Sea, north of Venezuela", "coordinates": { "latitudeType": "N", "latitude": 12.5, "longitudeType": "W", "longitude": 69.96667}, ….
11
Replication & Consistency
- The consistency level is based on the replication factor (N), not on the number of nodes in the system.
- There are a few options to set:
  - How many replicas must respond to declare success
  - Query all replicas on every read
- Every column has a value and a timestamp: the latest timestamp wins
- Read repair: read one replica and check the checksum/timestamp against the others to verify
- R (number of replicas to read from) + W (number of replicas to write to) > N (replication factor): every read overlaps the most recent write
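The R + W > N rule is plain arithmetic, so it can be checked directly. The sketch below is stand-alone Java with hypothetical names (ConsistencySketch, readOverlapsWrite, quorum). The quorum size of N / 2 + 1 is the usual majority definition and does not appear on the slide.

```java
/** Illustrative arithmetic for the R + W > N rule; not part of any Cassandra API. */
final class ConsistencySketch {
    /**
     * Returns true when any set of r replicas read must share at least one
     * replica with any set of w replicas written, out of n replicas in total.
     */
    static boolean readOverlapsWrite(int r, int w, int n) {
        if (n < 1 || r < 1 || w < 1 || r > n || w > n) {
            throw new IllegalArgumentException("need 1 <= r, w <= n");
        }
        return r + w > n;
    }

    /** Majority of n replicas (assumed definition of a quorum). */
    static int quorum(int n) {
        return n / 2 + 1;
    }

    public static void main(String[] args) {
        int n = 3; // replication factor
        System.out.println(readOverlapsWrite(1, 1, n));                 // false
        System.out.println(readOverlapsWrite(quorum(n), quorum(n), n)); // true
        System.out.println(readOverlapsWrite(1, n, n));                 // true
    }
}
```

With N = 3, reading and writing at quorum (2 + 2 > 3) guarantees overlap, while reading and writing a single replica (1 + 1) does not.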
12
The Ring - Partitioning
- Each node has a single, unique token
- Each node claims the range of the ring between its neighbor's token and its own
- Partitioning: map from key space to token; can be random or order-preserving
- Snitching: map from nodes to physical location
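A token ring lookup can be sketched with a sorted map. This is an illustration under stated assumptions, not Cassandra's implementation: tokens are plain long values, the class name TokenRingSketch is hypothetical, and a key is assumed to belong to the first node whose token is greater than or equal to the key's token, wrapping around to the lowest token at the end of the ring.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative token ring; tokens are simplified to long values. */
final class TokenRingSketch {
    // token -> node name, kept sorted by token
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(long token, String node) {
        if (ring.containsKey(token)) {
            throw new IllegalArgumentException("token already taken: " + token);
        }
        ring.put(token, node);
    }

    /** Finds the node responsible for a key that has already been mapped to a token. */
    String nodeFor(long keyToken) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("ring has no nodes");
        }
        Map.Entry<Long, String> owner = ring.ceilingEntry(keyToken);
        // Past the highest token: wrap around to the first node on the ring.
        return (owner != null ? owner : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        TokenRingSketch ring = new TokenRingSketch();
        ring.addNode(100L, "A");
        ring.addNode(200L, "B");
        ring.addNode(300L, "C");
        System.out.println(ring.nodeFor(150L)); // B
        System.out.println(ring.nodeFor(301L)); // A (wraps around)
    }
}
```

The partitioner's job is the step this sketch leaves out: turning a row key into the token passed to nodeFor, either by hashing (random) or by preserving key order.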
13
Writing
- No locks
- Append support without read ahead
- Atomicity guarantee for a key (in a ColumnFamily)
- Always writable!
- SSTables: key/data, with SSTable files kept per column family
- Fast
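The append-only write path can be illustrated with a toy model. The slide names only SSTables; the commit log and the in-memory table (memtable) used here are part of Cassandra's write path but are not mentioned on the slide. All names (WritePathSketch, flushThreshold and so on) are hypothetical, and real SSTables are files on disk rather than in-memory maps.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

/** Toy write path: append to a log, update memory, flush sorted immutable tables. */
final class WritePathSketch {
    private final List<String> commitLog = new ArrayList<>();                    // append-only
    private TreeMap<String, String> memtable = new TreeMap<>();                  // sorted, in memory
    private final List<SortedMap<String, String>> sstables = new ArrayList<>();  // immutable once flushed
    private final int flushThreshold;

    WritePathSketch(int flushThreshold) {
        if (flushThreshold < 1) {
            throw new IllegalArgumentException("flushThreshold must be >= 1");
        }
        this.flushThreshold = flushThreshold;
    }

    /** A write never reads existing data: it appends and then updates memory. */
    void write(String key, String value) {
        commitLog.add(key + "=" + value);
        memtable.put(key, value);
        if (memtable.size() >= flushThreshold) {
            flush();
        }
    }

    /** Freezes the current memtable as a sorted, read-only table and starts a new one. */
    private void flush() {
        sstables.add(Collections.unmodifiableSortedMap(memtable));
        memtable = new TreeMap<>();
    }

    int commitLogSize() { return commitLog.size(); }
    int memtableSize() { return memtable.size(); }
    int sstableCount() { return sstables.size(); }

    public static void main(String[] args) {
        WritePathSketch cf = new WritePathSketch(2);
        cf.write("b", "1");
        cf.write("a", "2"); // second distinct key triggers a flush
        cf.write("c", "3");
        System.out.println(cf.sstableCount() + " flushed table(s), "
                + cf.memtableSize() + " key(s) still in memory");
    }
}
```

Because a write only appends and never has to look up the previous value, it needs no locks and no disk reads, which is the reason the slide can call writes fast.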
14
Reading
- Wait for R responses
- Wait for the remaining N - R responses in the background and perform read repair
- Read multiple SSTables
- Slower than writes (but still fast)
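Read repair combined with "latest timestamp wins" can be sketched as follows. This is a simplified, synchronous illustration: it contacts every replica in one pass, whereas the slide describes waiting for R responses first and handling the remaining N - R in the background. Replicas are modelled as in-memory maps, and all names (ReadRepairSketch, Versioned, read) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified read repair: pick the newest version, then fix stale replicas. */
final class ReadRepairSketch {
    static final class Versioned {
        final String value;
        final long timestamp;

        Versioned(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Each replica is modelled as a map from key to its stored version. */
    static Versioned read(String key, List<Map<String, Versioned>> replicas) {
        Versioned newest = null;
        for (Map<String, Versioned> replica : replicas) {
            Versioned v = replica.get(key);
            // Latest timestamp wins; on a tie the first version seen is kept.
            if (v != null && (newest == null || v.timestamp > newest.timestamp)) {
                newest = v;
            }
        }
        if (newest != null) {
            for (Map<String, Versioned> replica : replicas) {
                Versioned v = replica.get(key);
                if (v == null || v.timestamp < newest.timestamp) {
                    replica.put(key, newest); // repair a missing or stale copy
                }
            }
        }
        return newest;
    }

    public static void main(String[] args) {
        List<Map<String, Versioned>> replicas = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            replicas.add(new HashMap<String, Versioned>());
        }
        replicas.get(0).put("city:1", new Versioned("ORANJESTAD", 2L));
        replicas.get(1).put("city:1", new Versioned("Oranjestad", 1L)); // stale copy
        // The third replica has no copy at all.

        System.out.println(read("city:1", replicas).value);
        System.out.println(replicas.get(2).get("city:1").value);
    }
}
```

After the read, all three replicas hold the version with timestamp 2, so later reads agree regardless of which replica answers.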
15
Compare with MySQL (RDBMS)
Comparing a 50 GB database:
- MySQL: ~300 ms write, ~350 ms read
- Cassandra: ~0.12 ms write, ~15 ms read
16
Queries
- Single column
- Slice
  - Set of names / range of names
  - Simple slice -> columns
  - Super slice -> supercolumns
- Key range
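The two slice styles, a range of names and an explicit set of names, can be illustrated on a single row held as a sorted map. This is a plain Java sketch with hypothetical names (SliceSketch, rangeSlice, namesSlice); treating both range bounds as inclusive is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Illustrative slices over one row whose columns are sorted by name. */
final class SliceSketch {
    /** Range of names: every column from start to end, both bounds inclusive. */
    static NavigableMap<String, String> rangeSlice(NavigableMap<String, String> row,
                                                   String start, String end) {
        return row.subMap(start, true, end, true);
    }

    /** Set of names: only the listed columns that exist, returned in sorted order. */
    static NavigableMap<String, String> namesSlice(NavigableMap<String, String> row,
                                                   Collection<String> names) {
        NavigableMap<String, String> result = new TreeMap<>();
        for (String name : names) {
            String value = row.get(name);
            if (value != null) {
                result.put(name, value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> city = new TreeMap<>();
        city.put("id", "1");
        city.put("name", "ORANJESTAD");
        city.put("population", "33000");
        city.put("capital", "true");

        System.out.println(rangeSlice(city, "id", "name").keySet());                        // [id, name]
        System.out.println(namesSlice(city, Arrays.asList("population", "capital")).keySet()); // [capital, population]
    }
}
```

A range slice is cheap because the columns are already stored in sorted order; no sorting happens at query time.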
17
Sorting
- Sorting is set on writing
- Sorting is set by the type of the Column/SuperColumn keys
- Sorting/key types: Bytes, UTF8, Ascii, LexicalUUID, TimeUUID
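The idea that the sort order is fixed by the key type when data is written can be sketched with a map whose comparator is chosen at creation time. The example is only a stand-in for the TimeUUID type: it orders version-1 UUIDs by their embedded timestamp using java.util.UUID, and the helper names (ColumnSortingSketch, timeUuid, timeUuidRow) are hypothetical.

```java
import java.util.Comparator;
import java.util.TreeMap;
import java.util.UUID;

/** Illustrative only: the comparator is chosen once, when the row is created. */
final class ColumnSortingSketch {
    /** Builds a version-1 (time-based) UUID from a 60-bit timestamp; demo helper. */
    static UUID timeUuid(long timestamp, long leastSigBits) {
        long timeLow = timestamp & 0xFFFFFFFFL;
        long timeMid = (timestamp >>> 32) & 0xFFFFL;
        long timeHi = (timestamp >>> 48) & 0x0FFFL;
        long mostSigBits = (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHi;
        return new UUID(mostSigBits, leastSigBits);
    }

    /** Row whose column names sort by embedded timestamp (stand-in for TimeUUID). */
    static TreeMap<UUID, String> timeUuidRow() {
        Comparator<UUID> byTime = Comparator.comparingLong(UUID::timestamp);
        // Break timestamp ties with UUID's natural order so distinct names never collide.
        return new TreeMap<>(byTime.thenComparing(Comparator.<UUID>naturalOrder()));
    }

    public static void main(String[] args) {
        TreeMap<UUID, String> row = timeUuidRow();
        // Inserted out of chronological order on purpose.
        row.put(timeUuid(0x100000000L, 1L), "later event");
        row.put(timeUuid(0xFFFFFFFFL, 2L), "earlier event");
        System.out.println(row.values()); // [earlier event, later event]
    }
}
```

Because the order is a property of the stored data, readers get columns back already sorted and cannot ask for a different ordering at query time.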
18
Drawbacks
- No joins (for speed)
- No sorting at query time
- No real SQL support (although some APIs support a very small subset of it)
19
APIs
- Many APIs for a large number of languages, including C++, Java, Python, PHP, Ruby, Erlang, Haskell, C#, JavaScript and more…
- Thrift interface: driver-level interface; hard to use.
- Hector: a Java Cassandra client; a simple column-based client that does what Cassandra is intended to do.
- Kundera: a Java client with JPA support; tries to translate JPA classes and attributes to Cassandra. Good on inserts, still hard and problematic with queries.
20
Cassandra Installation
- Install the prerequisite: basically the latest Java SE release
- Extract the Cassandra zip files to your chosen path
- Run bin/cassandra.bat -f
- The Cassandra node is up and running
21
World database in Cassandra
- World: Keyspace
  - Countries: SuperColumn Family
    - CountryDetails: SuperColumn
    - Border: SuperColumns
    - Coordinates: SuperColumn
    - GDP: SuperColumn
    - Language: SuperColumns
  - Cities: Column Family
22
Using Hector API - definitions
Creating a Cassandra cluster:
    Cluster cluster = HFactory.getOrCreateCluster("WorldCluster", "localhost:9160");
Adding a keyspace:
    columnFamilyDefinition.setKeyspaceName(WORLD_KEYSPACE);
Adding a Column:
    BasicColumnFamilyDefinition columnFamilyDefinition = new BasicColumnFamilyDefinition();
    columnFamilyDefinition.setKeyspaceName(WORLD_KEYSPACE);
    columnFamilyDefinition.setName(CITY_CF); // ColumnFamily name
    columnFamilyDefinition.addColumnDefinition(columnDefinition); // columnDefinition is not shown on the slide
23
Using Hector API - definitions
Adding a SuperColumn:
    BasicColumnFamilyDefinition superCfDefinition = new BasicColumnFamilyDefinition();
    superCfDefinition.setKeyspaceName(WORLD_KEYSPACE);
    superCfDefinition.setName(COUNTRY_SUPER);
    superCfDefinition.setColumnType(ColumnType.SUPER);
Adding all definitions to the cluster:
    ColumnFamilyDefinition cfDefStandard = new ThriftCfDef(columnFamilyDefinition);
    ColumnFamilyDefinition cfDefSuper = new ThriftCfDef(superCfDefinition);
    KeyspaceDefinition keyspaceDefinition = HFactory.createKeyspaceDefinition(
            WORLD_KEYSPACE, "org.apache.cassandra.locator.SimpleStrategy", 1,
            Arrays.asList(cfDefStandard, cfDefSuper));
    cluster.addKeyspace(keyspaceDefinition);
24
Using Hector API - inserting
Creating a Column template:
    ColumnFamilyTemplate<String, String> template = new ThriftColumnFamilyTemplate<String, String>(
            keyspaceOperator, columnFamilyName, stringSerializer, stringSerializer);
Adding a row to a Column Family:
    ColumnFamilyUpdater<String, String> updater = template.createUpdater("a key");
    updater.setString("key", "value");
    try {
        template.update(updater);
    } catch (HectorException e) {
        // do something...
    }
25
Using Hector API - inserting
Creating a SuperColumn template:
    SuperCfTemplate template = new ThriftSuperCfTemplate(
            keyspaceOperator, columnFamilyName, stringSerializer, stringSerializer);
Adding a row to a SuperColumn Family:
    SuperCfUpdater updater = template.createUpdater("a key");
    HSuperColumn superColumn = updater.addSuperColumn("sc name");
    superColumn.setString("column name", value);
    superColumn.update();
    try {
        template.update(updater);
    } catch (HectorException e) {
        // do something...
    }
26
Using Hector API - reading
Reading all rows and their columns from a Column Family (using CQL):
    CqlQuery<String, String, String> cqlQuery = new CqlQuery<String, String, String>(
            factory.getKeyspaceOperator(), stringSerializer, stringSerializer, stringSerializer);
    cqlQuery.setQuery("select * from City");
    QueryResult<CqlRows<String, String, String>> result = cqlQuery.execute();
Reading all columns from a row in a SuperColumn Family:
    SuperCfTemplate superColumn = HectorFactory.getFactory().getSuperColumnFamilyTemplate("SuperColumnFamily");
    SuperCfResult superRes = superColumn.querySuperColumns("key");
    Collection columnNames = superRes.getSuperColumns();
27
Using Hector API - reading
Reading a SuperColumn from a row in a SuperColumn Family:
    SuperColumnQuery<String, String, String, String> query = HFactory.createSuperColumnQuery(
            keyspaceOperator, stringSerializer, stringSerializer, stringSerializer, stringSerializer);
    query.setColumnFamily("SuperColumnFamily");
    query.setKey("key");
    query.setSuperName("SuperColumnName");
    QueryResult<HSuperColumn<String, String, String>> result = query.execute();
    for (HColumn<String, String> col : result.get().getColumns()) {
        String name = col.getName();
        String value = col.getValue();
    }
Every query has options to fetch only part of the rows, by setting a start value and an end value (the rows are sorted on insert), and only part of the columns, by setting the column names explicitly.
28
Administration tools
- cassandra: node activator
- nodetool: bootstrapping and monitoring
- cassandra-cli: application console
- sstable2json: export
- json2sstable: import