Published by Stewart Percival Porter. Modified over 9 years ago.
1
Distributed Networks & Systems Lab
Distributed Networks and Systems (DNS) Lab, Department of Electronics and Computer Engineering, Chonnam National University
Sungmin Hwang
2
HBase is a:
- Column-oriented data store
- Distributed system, designed for large tables
- Scalable NoSQL DB
  - No SQL-based access
  - Storage is not tied to the relational model
- System based on Google’s Bigtable
- Layer built on top of HDFS
3
- Used by Facebook, Twitter, Yahoo, Netflix, and Adobe
When to use HBase?
- Compared to an RDBMS, HBase has a very simple and limited API
- Suitable for large workloads
  - Large amounts of data
  - Large numbers of clients/requests
- If the data set is too small, all the records end up on a single node
- Bad for
  - Relational analytics such as join and group by
  - Text-based search access
4
- Tables contain rows
  - Each row is referenced by a unique key (string, long, …)
- Rows are made of columns, which are grouped into column families
- Data is stored in cells
  - A cell is identified by row, column family, and column
- Family definitions are static
Example: the Movie table
  Family     Columns
  info       title, director, date
  content    story
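The addressing scheme above (row, then column family, then column) can be illustrated with a small in-memory model. This is only a sketch: `ToyTable` is a hypothetical class written for illustration, not part of any HBase client API, and it ignores timestamps and storage entirely.

```python
from collections import defaultdict


class ToyTable:
    """Toy in-memory model of one table (hypothetical, not the HBase API)."""

    def __init__(self, name, families):
        self.name = name
        self.families = frozenset(families)  # family definitions are static
        self.rows = defaultdict(dict)        # row key -> {"family:column": value}

    def put(self, row, column, value):
        family, _, qualifier = column.partition(":")
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Columns inside a known family can be added freely.
        self.rows[row][f"{family}:{qualifier}"] = value

    def get(self, row, column=None):
        cells = self.rows.get(row, {})
        return dict(cells) if column is None else cells.get(column)


movie = ToyTable("Movie", ["info", "content"])
movie.put("movie-1", "info:title", "AboutTime")
movie.put("movie-1", "info:director", "Richard Curtis")
movie.put("movie-1", "info:date", "2013")
movie.put("movie-1", "content:story", "Time traveler story.")

print(movie.get("movie-1", "info:title"))  # AboutTime
```

Writing to a family that was not declared when the table was created (for example `cast:lead`) raises an error in this model, which mirrors the point that families are fixed while columns within them are not.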
6
- Region: a range of rows stored together
- Master server: daemon which manages the region servers
- HBase stores its data in HDFS
  - HFile: key-value map
  - WAL (write-ahead log): when data is added, it is also written to the WAL
  - When in-memory data exceeds a maximum size, it is flushed to an HFile
- HBase uses ZooKeeper for distributed coordination
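The write path described above (log the write, buffer it in memory, flush to an HFile once the buffer grows too large) can be sketched as a toy model. `ToyRegionStore` and its `flush_threshold` parameter are hypothetical names; the threshold here counts cells, whereas a real region server measures size in bytes and persists both the WAL and the HFiles in HDFS.

```python
class ToyRegionStore:
    """Toy model of a write path with a WAL, an in-memory buffer, and HFiles."""

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold  # assumption: counted in cells
        self.wal = []        # append-only log of every write
        self.memstore = {}   # in-memory buffer: key -> value
        self.hfiles = []     # immutable, key-sorted maps (newest last)

    def put(self, key, value):
        self.wal.append((key, value))  # 1. record the write in the log
        self.memstore[key] = value     # 2. apply it in memory
        if len(self.memstore) > self.flush_threshold:
            self.flush()               # 3. buffer exceeded its maximum

    def flush(self):
        # Write the buffer out as a sorted key-value map and start a new buffer.
        self.hfiles.append(dict(sorted(self.memstore.items())))
        self.memstore = {}

    def get(self, key):
        # The newest data wins: check memory first, then HFiles newest to oldest.
        if key in self.memstore:
            return self.memstore[key]
        for hfile in reversed(self.hfiles):
            if key in hfile:
                return hfile[key]
        return None


store = ToyRegionStore(flush_threshold=2)
for i in (1, 2, 3):
    store.put(f"row-{i}", f"v{i}")   # the third put triggers a flush
store.put("row-1", "v1-updated")     # lands in the fresh in-memory buffer

print(len(store.hfiles), store.get("row-1"), store.get("row-2"))
```

Because every write is appended to the log before it is acknowledged, the in-memory buffer could be rebuilt from the log after a crash; the sketch keeps the log only to show that ordering.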
8
Ways to access HBase:
- HBase shell
- Native Java API
- HBql
- RESTful API
10
Same environment as the previous Hadoop assignment:
- Pseudo-distributed mode
- Hadoop 1.0.3
- Java 1.7
- Ubuntu 12.04 LTS
11
Download HBase
- Recent version: 0.98.7
  http://www.interior-dsgn.com/apache/hbase/hbase-0.98.7/
- The folder contains two builds:
    hbase-0.98.7-hadoop1-bin.tar.gz
    hbase-0.98.7-hadoop2-bin.tar.gz
- The number after "hadoop" is the Hadoop major version the build targets
- Since we use Hadoop 1.0.3, we download the hadoop1 build:
    $ wget http://www.interior-dsgn.com/apache/hbase/hbase-0.98.7/hbase-0.98.7-hadoop1-bin.tar.gz
12
Extract the archive:
    $ tar zxvf hbase-0.98.7-hadoop1-bin.tar.gz
    $ cd hbase-0.98.7-hadoop1
Configure the JAVA_HOME directory:
    $ vim conf/hbase-env.sh
- In hbase-env.sh, uncomment the line that sets JAVA_HOME and set it to the path of your Java installation
13
hbase-site.xml
    $ vim conf/hbase-site.xml
- Configuration for pseudo-distributed mode
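The configuration shown on this slide did not survive in the transcript. As a minimal sketch, a pseudo-distributed setup typically sets at least `hbase.rootdir` and `hbase.cluster.distributed`. The values below are assumptions: in particular, the HDFS address `hdfs://localhost:9000` must match whatever `fs.default.name` is set to in your Hadoop `core-site.xml`. The script only builds and prints the XML; copy the output into conf/hbase-site.xml if it fits your setup.

```python
import xml.etree.ElementTree as ET

# Assumed values for a single-machine, pseudo-distributed setup.
properties = {
    # Where HBase keeps its data in HDFS (host and port are assumptions).
    "hbase.rootdir": "hdfs://localhost:9000/hbase",
    # Run the HBase daemons as separate processes instead of standalone mode.
    "hbase.cluster.distributed": "true",
}


def build_hbase_site(props):
    """Render a dict of settings in the <configuration>/<property> layout."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = build_hbase_site(properties)
print(xml_text)
```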
14
To start HBase:
    $ ./bin/start-hbase.sh
To stop HBase:
    $ ./bin/stop-hbase.sh
15
- Both the Master and the region servers run a web server
- Master: http://localhost:60010
16
- Both the Master and the region servers run a web server
- Region server: http://localhost:60030
17
Start the HBase shell:
    $ ./bin/hbase shell
Use help "command" to get detailed usage of a command, for example:
    hbase> help "get"
18
- JRuby IRB (Interactive Ruby Shell) plus HBase commands
- Quote all names
  - Table and column names
- Single quotes for text
    hbase> create 'test', 'cf'
- Double quotes for binary
  - Use the hexadecimal representation of the binary value
- Specifying parameters
    {'key1' => 'value1', 'key2' => 'value2', …}
  Example:
    hbase> get 'UserTable', 'userId1', {COLUMN => 'address:str'}
19
Shell command groups:
- General: status, version
- DDL: alter, create, describe, disable, drop, enable, exists, list
- DML: count, delete, deleteall, get, get_counter, incr, put, scan, truncate
- Cluster administration: balancer, close_region, move, split, …
20
Create a table called 'Movie' with the following schema:
- 2 column families
  - 'info' with 3 columns: 'title', 'director', and 'date'
  - 'content' with 1 column: 'story'
    hbase> create 'Movie', {NAME=>'info'}, {NAME=>'content'}
    hbase> put 'Movie', 'movie-1', 'info:title', 'AboutTime'
    hbase> put 'Movie', 'movie-1', 'info:director', 'Richard Curtis'
    hbase> put 'Movie', 'movie-1', 'info:date', '2013'
    hbase> put 'Movie', 'movie-1', 'content:story', 'Time traveler story.'
Movie table
  Family     Columns
  info       title, director, date
  content    story
21
Select a single row:
    hbase> get 'table', 'row_id'
Select specific columns:
    hbase> get 'table', 'row_id', {COLUMN=>['c1', 'c2']}
22
Select a specific timestamp or time range:
    hbase> get 'table', 'row_id', {TIMERANGE=>[ts1, ts2]}
Modifying the maximum number of versions
- Select more than one version:
    hbase> get 'table', 'row_id', {VERSIONS=>3}
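How TIMERANGE and VERSIONS narrow the result can be sketched with a toy cell that keeps every value under its timestamp. `ToyVersionedCell` is a hypothetical class for illustration. Two assumptions are built in: the time range is treated as half-open (start inclusive, end exclusive), which matches how HBase time ranges behave, and the model keeps all versions, whereas a real table retains only as many versions as its column family is configured to keep.

```python
class ToyVersionedCell:
    """Toy model of one cell holding several timestamped versions."""

    def __init__(self):
        self.versions = {}  # timestamp -> value

    def put(self, value, ts):
        self.versions[ts] = value

    def get(self, versions=1, timerange=None):
        """Return up to `versions` (timestamp, value) pairs, newest first."""
        items = sorted(self.versions.items(), reverse=True)
        if timerange is not None:
            start, end = timerange
            items = [(ts, v) for ts, v in items if start <= ts < end]
        return items[:versions]


cell = ToyVersionedCell()
cell.put("draft", ts=100)
cell.put("review", ts=200)
cell.put("final", ts=300)

print(cell.get())                                    # [(300, 'final')]
print(cell.get(versions=3))                          # newest first, all three
print(cell.get(versions=3, timerange=(100, 300)))    # excludes ts=300
```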
23
Scan an entire table:
    hbase> scan 'table_name'
Limit the number of results:
    hbase> scan 'table_name', {LIMIT=>1}
Scan a range of rows:
    hbase> scan 'Movie', {STARTROW=>'startRow', STOPROW=>'stopRow'}
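Range scans are easier to reason about once two details are visible: rows are kept sorted by key, and the start row is included while the stop row is excluded. The sketch below models that with a sorted list of keys; `toy_scan` and the sample titles are hypothetical, and plain ASCII string ordering stands in for HBase's byte-wise ordering.

```python
from bisect import bisect_left


def toy_scan(rows, startrow=None, stoprow=None, limit=None):
    """Return (key, value) pairs with startrow <= key < stoprow, in key order."""
    keys = sorted(rows)  # for ASCII keys this matches byte-wise ordering
    lo = 0 if startrow is None else bisect_left(keys, startrow)
    hi = len(keys) if stoprow is None else bisect_left(keys, stoprow)
    selected = keys[lo:hi]
    if limit is not None:
        selected = selected[:limit]
    return [(key, rows[key]) for key in selected]


# Hypothetical sample data.
rows = {
    "movie-1": "first",
    "movie-2": "second",
    "movie-3": "third",
    "movie-10": "tenth",
}

# Keys sort as strings, so "movie-10" comes before "movie-2".
print(toy_scan(rows, startrow="movie-1", stoprow="movie-2"))
print(toy_scan(rows, limit=1))
```

The first call returns `movie-1` and `movie-10` but not `movie-2`, which shows both the exclusive stop row and why numeric row keys are usually zero-padded.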
24
- Filters can be applied to a scan
- Some filters are included in HBase
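Conceptually, a filter is a predicate evaluated against each cell during the scan, so that only matching cells are returned. HBase ships filters such as PrefixFilter and ValueFilter; the two helper functions below only imitate their effect in plain Python and are hypothetical names, as is the sample data beyond `movie-1`.

```python
def prefix_filter(prefix):
    """Keep cells whose row key starts with `prefix`."""
    return lambda row, column, value: row.startswith(prefix)


def value_equals(expected):
    """Keep cells whose value equals `expected`."""
    return lambda row, column, value: value == expected


def toy_filtered_scan(cells, *filters):
    """Return the cells, in sorted order, that pass every filter."""
    return [cell for cell in sorted(cells) if all(f(*cell) for f in filters)]


# (row, column, value) triples; rows other than movie-1 are made up.
cells = [
    ("movie-1", "info:director", "Richard Curtis"),
    ("movie-1", "info:title", "AboutTime"),
    ("movie-2", "info:director", "Richard Curtis"),
    ("tv-1", "info:director", "Richard Curtis"),
]

matches = toy_filtered_scan(
    cells, prefix_filter("movie-"), value_equals("Richard Curtis")
)
print(matches)
```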
25
Delete a cell by providing the table, row id, and column coordinates:
    hbase> delete 'table', 'row_id', 'column'
- Deletes all versions of that cell
Delete only the versions up to a certain timestamp:
    hbase> delete 'table', 'row_id', 'column', timestamp
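The difference between the two delete forms can be sketched with a toy cell holding several versions. `ToyCell` is a hypothetical class, and it assumes that a delete with a timestamp removes every version at or before that timestamp, which is how the 0.98 shell used here is understood to behave. It also removes data immediately, whereas HBase records a delete marker and physically drops the data later.

```python
class ToyCell:
    """Toy model of one cell with timestamped versions."""

    def __init__(self, versions):
        self.versions = dict(versions)  # timestamp -> value

    def delete(self, ts=None):
        if ts is None:
            # No timestamp: every version of the cell goes away.
            self.versions.clear()
        else:
            # Assumed semantics: drop versions with timestamp <= ts.
            self.versions = {t: v for t, v in self.versions.items() if t > ts}

    def latest(self):
        if not self.versions:
            return None
        return self.versions[max(self.versions)]


cell = ToyCell({100: "draft", 200: "review", 300: "final"})

cell.delete(ts=200)
print(cell.versions)   # {300: 'final'}

cell.delete()
print(cell.latest())   # None
```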
26
- A table must be disabled before it can be dropped
    hbase> disable 'table'
    hbase> drop 'table'
27
Thank you