Hadoop & HBase Development in Java, Using the NetBeans IDE
OUTLINE
Development environment setup: installing the JDK and NetBeans, and a first project
Hadoop development
HBase development: architecture, basic environment setup, basic HBase operations, Put Data To HBase, Scan Data In HBase
Basic HIVE operations
About NetBeans
NetBeans is an open-source software development tool created by Sun Microsystems. It is both an IDE and an extensible development platform, usable for developing programs in Java, C/C++, PHP, HTML5, and more, and its functionality can be extended with plug-in modules.
On the NetBeans Platform, applications are built from a set of modular software components. Each module is a JAR (Java archive) file containing a group of Java classes that implement the public interfaces defined by NetBeans, together with a manifest file that distinguishes the module from other modules. Thanks to this modularity, an application built from modules can be extended simply by adding new ones, and because modules can be developed independently, applications built on the NetBeans Platform can incorporate third-party software easily and efficiently.
http://zh.wikipedia.org/zh-tw/NetBeans
Download NetBeans: https://netbeans.org/downloads/index.html
Download the JDK: http://www.oracle.com/technetwork/java/javase/downloads/index.html
Install the JDK. The JDK is needed both for development and by the NetBeans installer, so install it first!
Install the JDK
JDK installation complete
Install NetBeans
Here you must choose the JDK path: select the directory where the JDK was installed.
Install NetBeans
The NetBeans development environment
A first project
Project name
A first project: run the code and view the output, as in the sketch below.
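A minimal sketch of what such a first program might contain (the package and class names are placeholders; NetBeans generates them from the project name you chose):

package firstproject;

public class FirstProject {

    public static void main(String[] args) {
        // Printed to the NetBeans output window when the project is run (F6)
        System.out.println("Hello, NetBeans!");
    }
}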
Hadoop application development
Hadoop development: IMPORT CLASS

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {
    // code ...
}
Hadoop development: MAP

public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            context.write(word, one); // emit (word, 1) for every token
        }
    }
}
Hadoop development: REDUCE

public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get(); // add up the 1s emitted for this word
        }
        context.write(key, new IntWritable(sum));
    }
}
Hadoop development: MAIN

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "WordCount");
    job.setJarByClass(WordCount.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);
}
Hadoop development
Build the project into a JAR file; the path of the resulting JAR is shown here.
Hadoop development
Upload the JAR to the Hadoop machine and run it with the command below, choosing the input and output locations through its arguments. The job's progress messages are printed while it runs, and the results can then be inspected.
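A sketch of the commands, assuming the JAR was built as WordCount.jar and the input text files already sit in HDFS under /user/hpc/input (both names are placeholders, not taken from the slides):

hpc@hpc-hbase0:~$ hadoop jar WordCount.jar WordCount /user/hpc/input /user/hpc/output
hpc@hpc-hbase0:~$ hadoop fs -cat /user/hpc/output/part-r-00000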
Hadoop development (2): MAIN

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTemperature {
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            System.exit(-1);
        }
        Job job = new Job();
        job.setJarByClass(MaxTemperature.class);
        job.setJobName("Max temperature");
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Hadoop development (2): MAPPER

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final int MISSING = 9999;

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String year = line.substring(15, 19);
        int airTemperature;
        if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
            airTemperature = Integer.parseInt(line.substring(88, 92));
        } else {
            airTemperature = Integer.parseInt(line.substring(87, 92));
        }
        String quality = line.substring(92, 93);
        if (airTemperature != MISSING && quality.matches("[01459]")) {
            context.write(new Text(year), new IntWritable(airTemperature));
        }
    }
}
Hadoop development (2): REDUCER

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MaxTemperatureReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int maxValue = Integer.MIN_VALUE;
        for (IntWritable value : values) {
            maxValue = Math.max(maxValue, value.get()); // keep the running maximum
        }
        context.write(key, new IntWritable(maxValue));
    }
}
Hadoop development (2)
Download the sample file from the following URL:
https://github.com/tomwhite/hadoop-book/blob/master/input/ncdc/all/1901.gz?raw=true
Upload the file to HDFS, run the JAR, and check the results.
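A sketch of the full flow, with assumed HDFS paths and JAR name (Hadoop decompresses the .gz input transparently, so the file does not need to be unpacked first):

hpc@hpc-hbase0:~$ wget -O 1901.gz "https://github.com/tomwhite/hadoop-book/blob/master/input/ncdc/all/1901.gz?raw=true"
hpc@hpc-hbase0:~$ hadoop fs -mkdir /user/hpc/ncdc
hpc@hpc-hbase0:~$ hadoop fs -put 1901.gz /user/hpc/ncdc
hpc@hpc-hbase0:~$ hadoop jar MaxTemperature.jar MaxTemperature /user/hpc/ncdc /user/hpc/ncdc-out
hpc@hpc-hbase0:~$ hadoop fs -cat /user/hpc/ncdc-out/part-r-00000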
HBase application development
Basic architecture: connecting to HBase through ZooKeeper
31
JAVA JAR ZooKeeper Region Servers HDFS HBase Access HBase Data Use ZooKeeper
Basic architecture: connecting to HBase through HIVE and querying it with SQL
33
JAVAJDBC ZooKeeper Region Servers HDFS HBase Access HBase Data Use Hive Hive JobTrackerHadoop
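As a sketch of what the Hive side of this architecture can look like, an external Hive table can be mapped onto an HBase table through the HBase storage handler; the table name and column mapping below refer to the TEST table created later in these slides and are illustrative assumptions:

hive> CREATE EXTERNAL TABLE hbase_test (rowkey STRING, chinese_name STRING)
    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,Name:Chinese")
    > TBLPROPERTIES ("hbase.table.name" = "TEST");
hive> SELECT * FROM hbase_test WHERE rowkey = '1';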
Basic environment setup
This guide uses Java to connect to HBase through ZooKeeper. JAR files used (in the appendix, under /lib):
log4j-1.2.17.jar
zookeeper-3.4.5.jar
hbase-client-0.96.0-hadoop2.jar
hadoop-mapreduce-client-core-2.2.0.jar
hbase-common-0.96.0-hadoop2.jar
…
Basic environment setup
Add the JAR files to the project: copy the JARs into the project directory first, then right-click and choose Add JAR/Folder, open the files, and confirm they appear in the project.
Basic environment setup: IMPORT CLASS

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
Basic HBase operations: connection settings

public static Configuration GetConfig() {
    /* Config */
    Configuration hBaseConfig = HBaseConfiguration.create();
    // IP of the ZooKeeper quorum host
    hBaseConfig.set("hbase.zookeeper.quorum", "172.24.12.160");
    // ZooKeeper client port
    hBaseConfig.set("hbase.zookeeper.property.clientPort", "2181");
    // hbase.master
    hBaseConfig.set("hbase.master", "172.24.12.160:9000");
    return hBaseConfig; // hand the finished configuration back to callers
}
Basic HBase operations: create a table and add column families

public static void main(String[] args) throws IOException {
    /* Config */
    Configuration hBaseConfig = GetConfig();
    HBaseAdmin hBaseAdmin = new HBaseAdmin(hBaseConfig);
    /* Create Table */
    HTableDescriptor tableDescriptor = new HTableDescriptor("TEST");
    /* Add Family */
    tableDescriptor.addFamily(new HColumnDescriptor("Name"));
    tableDescriptor.addFamily(new HColumnDescriptor("Birth"));
    tableDescriptor.addFamily(new HColumnDescriptor("Address"));
    tableDescriptor.addFamily(new HColumnDescriptor("Sex"));
    hBaseAdmin.createTable(tableDescriptor);
}
Basic HBase operations: add columns to a table

public static void main(String[] args) throws IOException {
    /* Config */
    Configuration hBaseConfig = GetConfig();
    HBaseAdmin hBaseAdmin = new HBaseAdmin(hBaseConfig);
    /* addColumn: note that HBaseAdmin.addColumn adds a column family;
       individual column qualifiers are created implicitly when data is written */
    hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Chinese"));
    hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Type"));
    hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Day"));
    hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Home"));
    hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Sex"));
}
Basic HBase operations
HBase Web UI: http://172.24.12.160:60010/master-status
The newly created table can be seen through the web interface.
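The same check can also be done from the HBase shell; a quick sketch:

hpc@hpc-hbase0:~$ hbase shell
hbase(main):001:0> list
hbase(main):002:0> describe 'TEST'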
Basic HBase operations: delete a table

public static void DeleteTable(String TableName) throws IOException {
    HBaseAdmin admin = new HBaseAdmin(GetConfig());
    admin.disableTable(TableName); // a table must be disabled before it can be deleted
    admin.deleteTable(TableName);
    System.out.println("delete table success");
    admin.close();
}
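A minimal use of this helper, assuming the TEST table created earlier:

public static void main(String[] args) throws IOException {
    DeleteTable("TEST"); // disables, then drops, the TEST table
}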
Put Data to HBase
Add the following imports:

import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
Put Data to HBase: put a single value into HBase

HTable table = new HTable(GetConfig(), "TEST"); // the target table
String row = "1"; // row key to write to
/* families and columns to write to */
String family[] = {"Name", "Birth", "Birth", "Address", "Sex"};
String column[] = {"Chinese", "Type", "Day", "Home", "Sex"};
String value = "40"; // the value to write
byte[] brow = Bytes.toBytes(row);
byte[] bfamily = Bytes.toBytes(family[0]);
byte[] bcolumn = Bytes.toBytes(column[0]);
byte[] bvalue = Bytes.toBytes(value);
Put p = new Put(brow);
p.add(bfamily, bcolumn, bvalue); // writes "40" into Name:Chinese of row "1"
table.put(p);
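To confirm the write landed, the row can be fetched from the HBase shell; a quick sketch (table TEST and row '1' match the code above, and the Name:Chinese cell should come back with value 40):

hbase(main):001:0> get 'TEST', '1'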
Scan Data In HBase: scan the entire contents of a table

// additionally requires imports from org.apache.hadoop.hbase.client
// (HTablePool, Scan, ResultScanner, Result) and org.apache.hadoop.hbase.KeyValue
HTablePool pool = new HTablePool(GetConfig(), 1000);
HTable table = (HTable) pool.getTable(tableName); // the table to scan
Scan scan = new Scan();
ResultScanner rs = table.getScanner(scan);
for (Result r : rs) {
    System.out.println(new String(r.getRow(), "UTF-8")); // the row key
    for (KeyValue keyValue : r.raw()) { // walk every cell in the row
        System.out.println(new String(keyValue.getFamily(), "UTF-8")); // family name
        System.out.println(new String(keyValue.getValue(), "UTF-8")); // cell value
    }
}
Scan Data In HBase: fetch a specific ROWKEY

HTablePool pool = new HTablePool(configuration, 1000);
HTable table = (HTable) pool.getTable(tableName); // the table to query
Get get = new Get(Rowkey.getBytes()); // look up a single row by its row key
Result r = table.get(get);
System.out.println(new String(r.getRow(), "UTF-8")); // the row key
for (KeyValue keyValue : r.raw()) { // walk every cell in the row
    System.out.println(new String(keyValue.getFamily(), "UTF-8")); // family name
    System.out.println(new String(keyValue.getValue(), "UTF-8")); // cell value
}
Scan Data In HBase: find rows where a given column matches a keyword

HTablePool pool = new HTablePool(configuration, 1000);
HTable table = (HTable) pool.getTable(tableName); // the table to scan
Filter filter = new SingleColumnValueFilter(
        Bytes.toBytes(col),        // the column family to test
        Bytes.toBytes("Chinese"),  // the column (qualifier) to test
        CompareOp.EQUAL,
        Bytes.toBytes(keyword));   // the keyword to compare cell values against
Scan sc = new Scan();
sc.setFilter(filter);
ResultScanner rs = table.getScanner(sc);
for (Result r : rs) {
    System.out.println(new String(r.getRow(), "UTF-8")); // the row key
    for (KeyValue keyValue : r.raw()) { // walk every cell in the row
        System.out.println(new String(keyValue.getFamily(), "UTF-8")); // family name
        System.out.println(new String(keyValue.getValue(), "UTF-8")); // cell value
    }
}
Getting started with HIVE
Enter the command below to drop into the HIVE shell:

hpc@hpc-hbase0:~$ hive
hive>
Basic HIVE operations

Creating Hive tables:
hive> CREATE TABLE pokes (foo INT, bar STRING);

Browsing through tables:
hive> SHOW TABLES;

Dropping tables:
hive> DROP TABLE pokes;

INSERT:
hive> INSERT OVERWRITE TABLE tablename [PARTITION (partcol1=val1, partcol2=val2)] select_statement FROM from_statement;

For example:
hive> INSERT OVERWRITE TABLE test_insert SELECT * FROM test_table;
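A freshly created table such as pokes is empty, so the SELECT examples on the next slides would return nothing; a common way to load sample data, as in the Hive getting-started guide (the kv1.txt path ships with the Hive distribution and is an assumption here):

hive> LOAD DATA LOCAL INPATH './examples/files/kv1.txt' OVERWRITE INTO TABLE pokes;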
Basic HIVE operations: SELECT
hive> SELECT * FROM tablename LIMIT 20;
Basic HIVE operations: SELECT + WHERE
hive> SELECT * FROM tablename WHERE key = 'r1';