Cassandra Installation Guide and Example
  Lecturer: Prof. Kyungbaek Kim
  Presenter: I Gde Dharma Nugraha

Cassandra Environment
  VMware Player or Oracle VirtualBox
  Ubuntu LTS, 64-bit
  Java version 1.7 (Oracle version)
  Cassandra version: cassandra bin.tar.gz

Preparing Java
  Install the Java development kit:
    $ sudo add-apt-repository ppa:webupd8team/java
    $ sudo apt-get update
    $ sudo apt-get install oracle-java7-installer
  To check that the installation succeeded, run:
    $ java -version
  To set the Java environment variables automatically:
    $ sudo apt-get install oracle-java7-set-default

Preparing Cassandra Code
  This exercise uses Cassandra
  Get apache-cassandra bin.tar.gz:
    $ wget cassandra bin.tar.gz
    $ tar xvf apache-cassandra bin.tar.gz
    $ mv apache-cassandra cassandra
  Run Cassandra in single mode:
    $ cd cassandra/bin
    $ ./cassandra -f
  The -f option runs Cassandra in the foreground and logs verbosely to the console.
  Stop Cassandra:
    Press Control-C in the same window as the command above.

Cassandra Single Mode
  Run Cassandra in single mode:
    $ cd cassandra/bin
    $ ./cassandra -f
  (Screenshot)

Cassandra Single Mode
  Check the Cassandra node
  Open a new terminal window and type:
    $ cd cassandra/bin
    $ ./nodetool status
  (Screenshot)

Cassandra Multiple Node Mode
  Preparation
  Create two new VMs.
  Repeat the Java 7 environment preparation steps on each new VM.
  Download apache-cassandra bin.tar.gz and repeat the Cassandra preparation steps on each new VM.
  Give every VM an IP address on the same network so that the VMs can reach each other over the LAN.
  For this exercise:
    Node 1 :
    Node 2 :
    Node 3 :

Cassandra Multiple Node Mode
  Configuration for all VMs
  Enter the conf folder:
    $ cd cassandra/conf
  Edit cassandra.yaml and apply the settings below:
    cluster_name: 'MyCassandraCluster'
    num_tokens: 256
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: " "
    listen_address:
      or
    listen_interface: eth0
    rpc_address:
    broadcast_rpc_address:
    endpoint_snitch: GossipingPropertyFileSnitch

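The same settings can be applied with a small script rather than by hand. What follows is only a minimal sketch, assuming GNU sed (as shipped with Ubuntu) and a cassandra.yaml laid out like the stock file. The function name configure_node and the 192.168.56.x addresses are hypothetical placeholders, not the addresses used in this exercise. rpc_address and broadcast_rpc_address are deliberately left untouched because their values depend on your network.

```shell
#!/bin/sh
# Sketch: apply the multi-node settings to a cassandra.yaml file.
# Usage: configure_node <yaml-file> <listen-ip> <comma-separated-seed-ips>
configure_node() {
  yaml=$1
  ip=$2
  seeds=$3
  sed -i \
    -e "s/^cluster_name:.*/cluster_name: 'MyCassandraCluster'/" \
    -e "s/^\( *- seeds:\).*/\1 \"$seeds\"/" \
    -e "s/^listen_address:.*/listen_address: $ip/" \
    -e "s/^endpoint_snitch:.*/endpoint_snitch: GossipingPropertyFileSnitch/" \
    "$yaml"
}

# Demo on a throwaway file so the real configuration is not touched.
sample=$(mktemp)
cat > "$sample" <<'EOF'
cluster_name: 'Test Cluster'
num_tokens: 256
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "127.0.0.1"
listen_address: localhost
endpoint_snitch: SimpleSnitch
EOF

# Hypothetical addresses: this node is .101, seeds are .101 and .102.
configure_node "$sample" 192.168.56.101 "192.168.56.101,192.168.56.102"
cat "$sample"
```

On a real node the first argument would be the path to cassandra/conf/cassandra.yaml, and the second argument would differ on each VM while the seed list stays the same.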
Cassandra Multiple Node Mode
  Run Cassandra in each VM
  First empty the cassandra/data directory, then start Cassandra:
    $ cd cassandra/data
    $ rm -rf *
    $ cd ../bin
    $ ./cassandra -f

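Running rm -rf * is risky if the preceding cd fails, because the deletion then happens in whatever directory the shell is in. A more defensive variant is sketched below, assuming GNU find. The function name clear_data_dir is hypothetical, and the demo runs on a temporary directory that stands in for cassandra/data.

```shell
#!/bin/sh
# Sketch: empty a Cassandra data directory without depending on the current directory.
clear_data_dir() {
  dir=$1
  # Refuse to run on an empty argument or a path that is not a directory.
  if [ -z "$dir" ] || [ ! -d "$dir" ]; then
    echo "not a directory: '$dir'" >&2
    return 1
  fi
  # Delete everything inside the directory but keep the directory itself.
  find "$dir" -mindepth 1 -delete
}

# Demo on a temporary directory standing in for cassandra/data.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/system"
touch "$demo_dir/system/example.db"
clear_data_dir "$demo_dir"
ls -A "$demo_dir"
```

On a real node the call would be clear_data_dir "$HOME/cassandra/data", assuming Cassandra was unpacked into the home directory.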
Cassandra Multiple Node Mode
  This screenshot shows that all nodes are UP.

Cassandra Multiple Node Mode
  To make sure, run the nodetool command in a new terminal window:
    $ ./nodetool status

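The check can be scripted by counting the nodes that nodetool reports as up. This is a rough sketch, assuming each node line of the nodetool status output begins with a two-letter status/state code such as UN (Up, Normal) or DN (Down, Normal). The function name count_up_nodes is hypothetical, and the sample input is an abbreviated, made-up stand-in for real output.

```shell
#!/bin/sh
# Sketch: count nodes whose status/state code is UN (Up, Normal).
count_up_nodes() {
  awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Hypothetical, abbreviated input standing in for real `nodetool status` output.
sample_status='UN  192.168.56.101
UN  192.168.56.102
DN  192.168.56.103'

up=$(printf '%s\n' "$sample_status" | count_up_nodes)
echo "nodes up: $up"
```

Against a running cluster the function would be fed directly from the tool, for example ./nodetool status | count_up_nodes from the cassandra/bin directory.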
Cassandra Interaction
  To interact with Cassandra, use cqlsh (the CQL shell).
  Type the commands below to run cqlsh:
    $ cd cassandra/bin
    $ ./cqlsh

Practical Example (1)
  Interaction using CQL
  Source :
  Running cqlsh from the installation_tarball/cassandra/bin directory.

Practical Example (1) Cont’d
  Interaction using CQL
  Create a keyspace, then use the new keyspace.

Practical Example (1) Cont’d
  Interaction using CQL
  Create the user table within the demo keyspace.

Practical Example (1) Cont’d
  Interaction using CQL
  Show the schema.

Practical Example (1) Cont’d
  Interaction using CQL
  Insert data.
  Select data.

Practical Example (1) Cont’d
  Interaction using CQL
  Update data and show the result.
  Delete data.

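The CQL steps listed on these slides (create and use a keyspace, create a table, show the schema, then insert, select, update and delete) can be collected into one script. The sketch below is an illustration, not the statements from the original screenshots. The keyspace name demo comes from the slides. The table name users, the column names, the sample row and the replication factor of 1 are assumptions. It also assumes cqlsh is on the PATH and that a Cassandra node is running locally.

```shell
#!/bin/sh
# Sketch: write the CQL steps to a file, then run them with cqlsh if it is available.
cql_file=$(mktemp)
cat > "$cql_file" <<'EOF'
-- Create the keyspace and switch to it (replication_factor 1 is an assumption).
CREATE KEYSPACE demo WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
USE demo;

-- Create the table (table and column names are hypothetical).
CREATE TABLE users (lastname text PRIMARY KEY, firstname text, age int);

-- Show the schema.
DESCRIBE TABLE users;

-- Insert and select data.
INSERT INTO users (lastname, firstname, age) VALUES ('Jones', 'Bob', 35);
SELECT * FROM users WHERE lastname = 'Jones';

-- Update data and show the result.
UPDATE users SET age = 36 WHERE lastname = 'Jones';
SELECT * FROM users WHERE lastname = 'Jones';

-- Delete data.
DELETE FROM users WHERE lastname = 'Jones';
EOF

if command -v cqlsh >/dev/null 2>&1; then
  cqlsh -f "$cql_file"
else
  echo "cqlsh not found on PATH; CQL written to $cql_file"
fi
```

The same statements can be typed one at a time at the cqlsh prompt, which is closer to what the slides show.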
Practical Example (1) Cont’d
  Inside the Cassandra data directory

Practical Example (2)
  Simple Java Application with Cassandra Java Driver
  Preparation
  Download the Cassandra Java driver:
    $ wget http://downloads.datastax.com/java-driver/cassandra-java-driver tar.gz
    $ tar xvf cassandra-java-driver tar.gz
    $ mv cassandra-java-driver cassandrajava
  Download slf4j:
    $ wget
    $ tar xvf slf4j tar.gz
    $ mv slf4j slf4j
  Create a Java file named GettingStarted.java.

Practical Example (2) Cont’d
  Simple Java Application with Cassandra Java Driver
  GettingStarted.java

Practical Example (2) Cont’d
  Simple Java Application with Cassandra Java Driver
  Compile GettingStarted.java:
    $ javac -classpath $HOME_PATH/cassandrajava/cassandra-driver-core jar:. GettingStarted.java
  Run GettingStarted:
    $ java -classpath $HOME_PATH/cassandrajava/*:$HOME_PATH/cassandrajava/lib/*:$HOME_PATH/slf4j/slf4j-nop jar:. GettingStarted
  The Result

Practical Example (3)
  Integrate Hadoop and Cassandra
  Integrating Hadoop and Cassandra provides a tool for big data analytics: Cassandra is the data source and Hadoop is the processor.
  Requirements for integrating Hadoop and Cassandra:
    Isolate Cassandra and Hadoop nodes in separate data centers.
    Disable virtual nodes (vnodes):
      Set num_tokens to 1 in the cassandra.yaml file.
      Uncomment the initial_token property and set it to 1, or to the value of a generated token for a multi-node cluster.
      Start the cluster for the first time.

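The two cassandra.yaml changes that disable vnodes can be scripted. This is a minimal sketch, assuming GNU sed and a file in which initial_token appears as a commented-out line of the form "# initial_token:", as in the stock configuration. The function name disable_vnodes is hypothetical, and the demo runs on a two-line stand-in file rather than a real configuration.

```shell
#!/bin/sh
# Sketch: disable vnodes by setting num_tokens to 1 and uncommenting initial_token.
# Usage: disable_vnodes <yaml-file> <token>
disable_vnodes() {
  yaml=$1
  token=$2
  sed -i \
    -e "s/^num_tokens:.*/num_tokens: 1/" \
    -e "s/^# *initial_token:.*/initial_token: $token/" \
    "$yaml"
}

# Demo on a throwaway file containing only the two relevant lines.
vnode_sample=$(mktemp)
printf 'num_tokens: 256\n# initial_token:\n' > "$vnode_sample"
disable_vnodes "$vnode_sample" 1
cat "$vnode_sample"
```

For a multi-node cluster the second argument would be the generated token for that node instead of 1.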
Practical Example (3) Cont’d
  Integrate Hadoop and Cassandra
  Preparation
  Download the source code and library:
    $ wget t.tar.gz?dl=0
    $ mv CassandraWordCount.tar.gz?dl=0 CassandraWordCount.tar.gz
    $ tar xvf CassandraWordCount.tar.gz
  Create a keyspace named test in Cassandra:
    cqlsh> CREATE KEYSPACE test WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 2 };
    cqlsh> USE test;
    cqlsh> CREATE TABLE documents (id uuid, content text, PRIMARY KEY (id));
  Insert sample data.

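The slides do not show the sample rows, so the following is only a sketch of what inserting sample data into the documents table could look like. The two UUID literals and the sentences are made-up illustration values. It assumes the test keyspace and documents table from the step above already exist, and that cqlsh is on the PATH.

```shell
#!/bin/sh
# Sketch: write sample INSERT statements to a file, then run them with cqlsh if available.
data_file=$(mktemp)
cat > "$data_file" <<'EOF'
USE test;
-- The ids and sentences below are hypothetical sample values.
INSERT INTO documents (id, content) VALUES (62c36092-82a1-3a00-93d1-46196ee77204, 'hello cassandra hello hadoop');
INSERT INTO documents (id, content) VALUES (7db1a490-5878-11e2-bcfd-0800200c9a66, 'cassandra stores the documents');
EOF

if command -v cqlsh >/dev/null 2>&1; then
  cqlsh -f "$data_file"
else
  echo "cqlsh not found on PATH; CQL written to $data_file"
fi
```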
Practical Example (3) Cont’d
  Integrate Hadoop and Cassandra
  Compile:
    $ javac -classpath $HADOOP_HOME/hadoop-core jar:/home/hduser/CassandraWordCount/lib/cassandra-driver-core jar:/home/hduser/cassandra/lib/*:/home/hduser/CassandraWordCount/lib/cassandra-all jar -d bin src/WordCount.java
    $ cd bin
    $ jar -cvf wordcount.jar *.class

Practical Example (3) Cont’d
  Integrate Hadoop and Cassandra
  Run
  Run Hadoop.
  Run Cassandra.
  Run the application with the command:
    $ hadoop jar wordcount.jar WordCount -libjars /home/hduser/CassandraWordCount/lib/cassandra-all jar,/home/hduser/CassandraWordCount/lib/cassandra-driver-core jar,/home/hduser/CassandraWordCount/lib/cassandra-thrift jar,/home/hduser/CassandraWordCount/lib/google-collect-1.0.jar,/home/hduser/CassandraWordCount/lib/jamm jar,/home/hduser/CassandraWordCount/lib/libthrift jar,/home/hduser/CassandraWordCount/lib/metrics-core jar,/home/hduser/CassandraWordCount/lib/netty.jar -Dinput=localhost -Doutput=/user/hduser/out31

Practical Example (3) Cont’d
  WordCount mapper

Practical Example (3) Cont’d
  WordCount reducer

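The mapper and reducer source ships in the downloaded archive and is not reproduced here. As a rough illustration of what the job computes, and not the Hadoop implementation itself, the sketch below performs the same two phases with standard shell tools: splitting text into one word per line (the map step) and grouping identical words to count them (the reduce step). The function name word_count and the input sentences are hypothetical.

```shell
#!/bin/sh
# Sketch: a local word count that mirrors the map and reduce phases.
word_count() {
  # Map: emit one word per line. Reduce: group identical words and count them.
  tr -s '[:space:]' '\n' | grep -v '^$' | sort | uniq -c | awk '{ print $2 "\t" $1 }'
}

# Hypothetical input standing in for the content column of the documents table.
result=$(printf 'hello cassandra\nhello hadoop\n' | word_count)
printf '%s\n' "$result"
```

In the Hadoop job the input lines would come from Cassandra rows instead of standard input, and the counts would be written to the output path given on the command line.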
Practical Example (3) Cont’d
  WordCount run
