Distributed Networks & Systems Lab Distributed Networks and Systems(DNS) Lab, Department of Electronics and Computer Engineering Chonnam National University.


1 Distributed Networks and Systems (DNS) Lab, Department of Electronics and Computer Engineering, Chonnam National University
  Sungmin Hwang

2
- Column-oriented data store
- Distributed
  - Designed for large tables
- Scalable
- NoSQL DB
  - No SQL-based access
  - Not tied to the relational model for storage
- Based on Google's Bigtable
  - HBase itself is built on top of HDFS

3
- Being used by
  - Facebook, Twitter, Yahoo, Netflix, Adobe
- When to use?
  - Compared to an RDBMS, HBase has a very simple and limited API
  - Suitable for large amounts of data
    - Large data
    - Large numbers of clients/requests
  - If the data set is too small, all the records end up on a single node
- Bad for
  - Relational analytics such as join and group by
  - Text-based search access

4
- Tables contain rows
  - Rows are referenced by a unique key (string, long, …)
- Rows are made of columns, which are grouped into column families
- Data is stored in cells
  - Identified by row, column family, and column
- Family definitions are static
- Example table: Movie
    Family    Columns
    info      title, director, date
    content   story
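
The cell addressing described above (row key, then column family, then column) can be pictured as nested maps. The Python sketch below is only a mental model under that assumption, not HBase code; the `put_cell` and `get_cell` helpers are hypothetical names.

```python
from collections import defaultdict

# Hypothetical in-memory model: row key -> (family, column) -> value.
# Column families are fixed when the table is defined; columns are not.
FAMILIES = {"info", "content"}

table = defaultdict(dict)

def put_cell(row_key, column, value):
    """Store a value at (row, family:column); reject unknown families."""
    family, _, qualifier = column.partition(":")
    if family not in FAMILIES:
        raise KeyError(f"unknown column family: {family}")
    table[row_key][(family, qualifier)] = value

def get_cell(row_key, column):
    """Return the value at (row, family:column), or None if absent."""
    family, _, qualifier = column.partition(":")
    return table[row_key].get((family, qualifier))

put_cell("movie-1", "info:title", "AboutTime")
put_cell("movie-1", "content:story", "Time traveler story.")
print(get_cell("movie-1", "info:title"))
```

New columns can be added to an existing family at write time, while writing to a family that was never defined fails, which mirrors the "family definitions are static" rule.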


6
- Region: a range of rows stored together
- Master server: daemon that manages the region servers
- HBase stores its data in HDFS
  - HFile: key-value map
  - WAL (write-ahead log): when data is added, it is also written to the WAL
  - When the in-memory data exceeds its maximum size, it is flushed to an HFile
- HBase uses ZooKeeper for distributed coordination
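
The write path above (log the write, buffer it in memory, flush to a file once the buffer is full) can be sketched in a few lines of Python. This is a toy model, not HBase internals: the class name, the `flush_threshold` parameter, and counting entries instead of bytes are all assumptions made for illustration.

```python
class RegionStoreSketch:
    """Toy model of the write path: WAL append, in-memory buffer, flush."""

    def __init__(self, flush_threshold=3):
        # Hypothetical limit; an entry count stands in for a size in bytes.
        self.flush_threshold = flush_threshold
        self.wal = []        # append-only log of every write
        self.memstore = {}   # in-memory key -> value buffer
        self.hfiles = []     # each flush produces one immutable sorted file

    def put(self, key, value):
        self.wal.append((key, value))   # 1. log the write first
        self.memstore[key] = value      # 2. then buffer it in memory
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the buffer out as a sorted, read-only key-value map.
        self.hfiles.append(tuple(sorted(self.memstore.items())))
        self.memstore = {}

    def get(self, key):
        # Newest data wins: check memory first, then files newest to oldest.
        if key in self.memstore:
            return self.memstore[key]
        for hfile in reversed(self.hfiles):
            for k, v in hfile:
                if k == key:
                    return v
        return None

store = RegionStoreSketch(flush_threshold=2)
store.put("movie-1", "AboutTime")
store.put("movie-2", "Example")      # buffer is full: triggers a flush
store.put("movie-1", "About Time")   # newer value stays in memory
print(len(store.hfiles), store.get("movie-1"))
```

Because the log records every write before it is buffered, the in-memory data could be rebuilt from the log after a crash; that is the purpose of the WAL.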


8
- HBase shell
- Native Java API
- HBql
- RESTful API


10
- Same environment as the previous Hadoop assignment
  - Pseudo-distributed mode
  - Hadoop 1.0.3
  - Java 1.7
  - Ubuntu 12.04 LTS

11
- Download HBase
  - Recent version: 0.98.7
  - http://www.interior-dsgn.com/apache/hbase/hbase-0.98.7/
  - The folder contains 2 builds
    - hbase-0.98.7-hadoop1-bin.tar.gz
    - hbase-0.98.7-hadoop2-bin.tar.gz
  - The hadoop1/hadoop2 suffix indicates the Hadoop version the build targets
  - Because we use Hadoop 1.0.3, we download the hadoop1 build
      $ wget http://www.interior-dsgn.com/apache/hbase/hbase-0.98.7/hbase-0.98.7-hadoop1-bin.tar.gz

12
- Extract
    $ tar zxvf hbase-0.98.7-hadoop1-bin.tar.gz
    $ cd hbase-0.98.7-hadoop1
- Configure the JAVA_HOME directory
    $ vim conf/hbase-env.sh
  - In hbase-env.sh, uncomment the line that sets JAVA_HOME and point it to your Java installation

13
- hbase-site.xml
    $ vim conf/hbase-site.xml
  - Configuration for pseudo-distributed mode
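
A minimal sketch of such a configuration, which is not necessarily the one used in the original slide. It assumes HDFS is reachable at `hdfs://localhost:9000`; that host and port are an assumption and must match the `fs.default.name` value in your Hadoop `core-site.xml`.

```xml
<configuration>
  <!-- Where HBase stores its data. The NameNode address is an assumption;
       use the one from your core-site.xml. -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://localhost:9000/hbase</value>
  </property>
  <!-- true selects distributed mode instead of standalone mode -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
```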

14
- To start HBase
    $ ./bin/start-hbase.sh
- To stop HBase
    $ ./bin/stop-hbase.sh

15
- Both the Master and the Region servers run a web server
  - Master: http://localhost:60010

16
- Both the Master and the Region servers run a web server
  - Region server: http://localhost:60030

17
- Start the shell
    $ ./bin/hbase shell
- Get detailed usage of a command
    hbase> help "command"
- Example
    hbase> help "get"

18
- JRuby IRB (Interactive Ruby shell) + HBase commands
- Quote all names
  - Table and column names
  - Single quotes for text
      hbase> create 'test', 'cf'
  - Double quotes for binary
    - Use the hexadecimal representation of the binary value
- Specifying parameters
  - {'key1' => 'value1', 'key2' => 'value2', …}
  - Example:
      hbase> get 'UserTable', 'userId1', {COLUMN => 'address:str'}

19
- General
  - status, version
- DDL
  - alter, create, describe, disable, drop, enable, exists, list
- DML
  - count, delete, deleteall, get, get_counter, incr, put, scan, truncate
- Cluster administration
  - balancer, close_region, move, split, …

20
- Create a table called 'Movie' with the following schema
  - 2 families
    - 'info' with 3 columns: 'title', 'director', and 'date'
    - 'content' with 1 column: 'story'
    hbase> create 'Movie', {NAME=>'info'}, {NAME=>'content'}
    hbase> put 'Movie', 'movie-1', 'info:title', 'AboutTime'
    hbase> put 'Movie', 'movie-1', 'info:director', 'Richard Curtis'
    hbase> put 'Movie', 'movie-1', 'info:date', '2013'
    hbase> put 'Movie', 'movie-1', 'content:story', 'Time traveler story.'
- Table: Movie
    Family    Columns
    info      title, director, date
    content   story

21
- Select a single row
    hbase> get 'table', 'row_id'
- Select specific columns
    hbase> get 'table', 'row_id', {COLUMN=>['c1', 'c2']}

22
- Select a specific timestamp or time range
    hbase> get 'table', 'row_id', {TIMERANGE=>[ts1, ts2]}
- Modifying the maximum number of versions
- Select more than one version
    hbase> get 'table', 'row_id', {VERSIONS=>3}
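
To make the effect of `VERSIONS` and `TIMERANGE` concrete, the sketch below models one cell as a list of timestamped versions. It is a simplified illustration with hypothetical names and sample data; the half-open interval (start inclusive, end exclusive) follows HBase's time-range convention.

```python
# Hypothetical model of one cell: a list of (timestamp, value) versions.
cell_versions = [(100, "v1"), (200, "v2"), (300, "v3"), (400, "v4")]

def get_versions(versions, max_versions=1, time_range=None):
    """Return up to max_versions (timestamp, value) pairs, newest first.

    time_range is (min_ts, max_ts); min_ts is inclusive, max_ts exclusive.
    """
    newest_first = sorted(versions, key=lambda tv: tv[0], reverse=True)
    if time_range is not None:
        min_ts, max_ts = time_range
        newest_first = [(ts, v) for ts, v in newest_first
                        if min_ts <= ts < max_ts]
    return newest_first[:max_versions]

print(get_versions(cell_versions))                    # newest version only
print(get_versions(cell_versions, max_versions=3))    # three newest versions
print(get_versions(cell_versions, max_versions=3, time_range=(100, 300)))
```

Note that a table only returns as many versions as its column family is configured to keep, which is why the maximum number of versions may need to be changed first.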

23
- Scan an entire table
    hbase> scan 'table_name'
- Limit the number of results
    hbase> scan 'table_name', {LIMIT=>1}
- Scan a range
    hbase> scan 'Movie', {STARTROW=>'startRow', STOPROW=>'stopRow'}
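
Range scans work because rows are kept sorted by key. The following Python sketch shows the idea with a hypothetical in-memory table; the `scan` function and the sample rows are illustrative only. As in HBase, the start row is inclusive and the stop row is exclusive.

```python
from bisect import bisect_left

# Hypothetical table: row key -> columns. Keys are compared lexicographically.
rows = {
    "movie-1": {"info:title": "AboutTime"},
    "movie-2": {"info:title": "Example B"},
    "movie-3": {"info:title": "Example C"},
    "movie-4": {"info:title": "Example D"},
}
sorted_keys = sorted(rows)

def scan(start_row=None, stop_row=None, limit=None):
    """Return (row_key, columns) pairs with start_row <= key < stop_row."""
    lo = 0 if start_row is None else bisect_left(sorted_keys, start_row)
    hi = len(sorted_keys) if stop_row is None else bisect_left(sorted_keys, stop_row)
    selected = sorted_keys[lo:hi]
    if limit is not None:
        selected = selected[:limit]
    return [(key, rows[key]) for key in selected]

print([key for key, _ in scan()])                                          # whole table
print([key for key, _ in scan(limit=1)])                                   # first row only
print([key for key, _ in scan(start_row="movie-2", stop_row="movie-4")])   # key range
```

Because keys sort as strings, "movie-10" would come before "movie-2"; zero-padding numeric parts of a row key avoids that surprise.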

24
- Applying filters to a scan
- Some filters are included in HBase
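
Conceptually, a filter is a predicate that is evaluated for each row during the scan, so that only matching rows are returned. The sketch below illustrates that idea in Python; the helper names and sample rows are hypothetical, and the prefix filter is only loosely modeled on HBase's built-in `PrefixFilter`.

```python
# Hypothetical rows: row key -> {"family:column": value}
rows = {
    "movie-1": {"info:title": "AboutTime", "info:date": "2013"},
    "movie-2": {"info:title": "Example B", "info:date": "2010"},
    "user-1": {"info:name": "Example User"},
}

def prefix_filter(prefix):
    """Keep rows whose key starts with the given prefix."""
    return lambda row_key, columns: row_key.startswith(prefix)

def column_value_filter(column, expected):
    """Keep rows in which the given column equals the expected value."""
    return lambda row_key, columns: columns.get(column) == expected

def scan_with_filters(table, *filters):
    """Return the row keys, in sorted order, that pass every filter."""
    return [key for key in sorted(table)
            if all(f(key, table[key]) for f in filters)]

print(scan_with_filters(rows, prefix_filter("movie-")))
print(scan_with_filters(rows, prefix_filter("movie-"),
                        column_value_filter("info:date", "2013")))
```

In HBase the filtering happens on the region servers, so rows that do not match are never sent to the client.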

25
- Delete a cell by providing the table, row id, and column coordinates
    hbase> delete 'table', 'row_id', 'column'
  - Deletes all versions of that cell
- Delete only the versions before a certain timestamp
    hbase> delete 'table', 'row_id', 'column', timestamp

26
- A table should be disabled before dropping it
    hbase> disable 'table'
    hbase> drop 'table'
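
The disable-then-drop rule amounts to a small state check. The Python sketch below models it with a hypothetical `AdminSketch` class; it illustrates the ordering constraint only and does not talk to HBase.

```python
class AdminSketch:
    """Toy model of the rule that a table must be disabled before dropping."""

    def __init__(self):
        self.enabled = {}  # table name -> True if the table is enabled

    def create(self, name):
        self.enabled[name] = True  # new tables start out enabled

    def disable(self, name):
        self.enabled[name] = False

    def drop(self, name):
        if self.enabled[name]:
            raise RuntimeError(f"table {name!r} is enabled; disable it first")
        del self.enabled[name]

admin = AdminSketch()
admin.create("Movie")
try:
    admin.drop("Movie")          # refused: the table is still enabled
except RuntimeError as err:
    print("refused:", err)
admin.disable("Movie")
admin.drop("Movie")              # succeeds once the table is disabled
print("Movie" in admin.enabled)
```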

27 Thank you

