Hadoop & HBase Development in Java, Using the NetBeans IDE
OUTLINE
Development environment setup: installing the JDK and NetBeans, and a first project
Hadoop development
HBase development: architecture, basic environment setup, basic HBase operations, Put Data To HBase, Scan Data In HBase
Basic HIVE operations
About NetBeans
NetBeans is an open-source software development tool created by Sun Microsystems. It is both an IDE and an extensible development platform, usable for developing programs in Java, C/C++, PHP, HTML5, and more, and its functionality can be extended with plug-in modules.
On the NetBeans Platform, applications are built from a set of modular software components. Each module is a JAR (Java archive) file containing a group of Java classes that implement the public interfaces defined by NetBeans, together with a manifest file that distinguishes the module from other modules. Thanks to this modularity, an application built from modules can be extended simply by adding new ones, and because modules can be developed independently, applications built on the NetBeans Platform can incorporate third-party software easily and efficiently.
http://zh.wikipedia.org/zh-tw/NetBeans
Download NetBeans: https://netbeans.org/downloads/index.html
Download the JDK: http://www.oracle.com/technetwork/java/javase/downloads/index.html
Install the JDK. The JDK is needed both for development and by the NetBeans installer, so install it first!
Install the JDK
JDK installation complete
Install NetBeans
Here you must choose the JDK path: select the directory where the JDK was installed.
Install NetBeans
The NetBeans development environment
A first project
Project name
A first project: run the code and view the output, as in the sketch below.
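A minimal sketch of what such a first program might contain (the package and class names are placeholders; NetBeans generates them from the project name you chose):

package firstproject;

public class FirstProject {

    public static void main(String[] args) {
        // Printed to the NetBeans output window when the project is run (F6)
        System.out.println("Hello, NetBeans!");
    }
}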
Hadoop application development
Hadoop development: IMPORT CLASS

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {
    // code ...
}
Hadoop development: MAP

public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            context.write(word, one); // emit (word, 1) for every token
        }
    }
}
Hadoop development: REDUCE

public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get(); // add up the 1s emitted for this word
        }
        context.write(key, new IntWritable(sum));
    }
}
Hadoop development: MAIN

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "WordCount");
    job.setJarByClass(WordCount.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);
}
Hadoop development
Build the project into a JAR file; the path of the resulting JAR is shown here.
Hadoop development
Upload the JAR to the Hadoop machine and run it with the command below, choosing the input and output locations through its arguments. The job's progress messages are printed while it runs, and the results can then be inspected.
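A sketch of the commands, assuming the JAR was built as WordCount.jar and the input text files already sit in HDFS under /user/hpc/input (both names are placeholders, not taken from the slides):

hpc@hpc-hbase0:~$ hadoop jar WordCount.jar WordCount /user/hpc/input /user/hpc/output
hpc@hpc-hbase0:~$ hadoop fs -cat /user/hpc/output/part-r-00000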
Hadoop development (2): MAIN

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MaxTemperature {
    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: MaxTemperature <input path> <output path>");
            System.exit(-1);
        }
        Job job = new Job();
        job.setJarByClass(MaxTemperature.class);
        job.setJobName("Max temperature");
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.setMapperClass(MaxTemperatureMapper.class);
        job.setReducerClass(MaxTemperatureReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
Hadoop development (2): MAPPER

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private static final int MISSING = 9999;

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String year = line.substring(15, 19);
        int airTemperature;
        if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
            airTemperature = Integer.parseInt(line.substring(88, 92));
        } else {
            airTemperature = Integer.parseInt(line.substring(87, 92));
        }
        String quality = line.substring(92, 93);
        if (airTemperature != MISSING && quality.matches("[01459]")) {
            context.write(new Text(year), new IntWritable(airTemperature));
        }
    }
}
Hadoop development (2): REDUCER

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class MaxTemperatureReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int maxValue = Integer.MIN_VALUE;
        for (IntWritable value : values) {
            maxValue = Math.max(maxValue, value.get()); // keep the running maximum
        }
        context.write(key, new IntWritable(maxValue));
    }
}
Hadoop development (2)
Download the sample file from the following URL:
https://github.com/tomwhite/hadoop-book/blob/master/input/ncdc/all/1901.gz?raw=true
Upload the file to HDFS, run the JAR, and check the results.
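A sketch of the full flow, with assumed HDFS paths and JAR name (Hadoop decompresses the .gz input transparently, so the file does not need to be unpacked first):

hpc@hpc-hbase0:~$ wget -O 1901.gz "https://github.com/tomwhite/hadoop-book/blob/master/input/ncdc/all/1901.gz?raw=true"
hpc@hpc-hbase0:~$ hadoop fs -mkdir /user/hpc/ncdc
hpc@hpc-hbase0:~$ hadoop fs -put 1901.gz /user/hpc/ncdc
hpc@hpc-hbase0:~$ hadoop jar MaxTemperature.jar MaxTemperature /user/hpc/ncdc /user/hpc/ncdc-out
hpc@hpc-hbase0:~$ hadoop fs -cat /user/hpc/ncdc-out/part-r-00000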
HBase application development
Basic architecture: connecting to HBase through ZooKeeper
31
JAVA JAR ZooKeeper Region Servers HDFS HBase Access HBase Data Use ZooKeeper
Basic architecture: connecting to HBase through HIVE and querying it with SQL
33
JAVAJDBC ZooKeeper Region Servers HDFS HBase Access HBase Data Use Hive Hive JobTrackerHadoop
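As a sketch of what the Hive side of this architecture can look like, an external Hive table can be mapped onto an HBase table through the HBase storage handler; the table name and column mapping below refer to the TEST table created later in these slides and are illustrative assumptions:

hive> CREATE EXTERNAL TABLE hbase_test (rowkey STRING, chinese_name STRING)
    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,Name:Chinese")
    > TBLPROPERTIES ("hbase.table.name" = "TEST");
hive> SELECT * FROM hbase_test WHERE rowkey = '1';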
Basic environment setup
This guide uses Java to connect to HBase through ZooKeeper. JAR files used (in the appendix, under /lib):
log4j-1.2.17.jar
zookeeper-3.4.5.jar
hbase-client-0.96.0-hadoop2.jar
hadoop-mapreduce-client-core-2.2.0.jar
hbase-common-0.96.0-hadoop2.jar
…
Basic environment setup
Add the JAR files to the project: copy the JARs into the project directory first, then right-click and choose Add JAR/Folder, open the files, and confirm they appear in the project.
Basic environment setup: IMPORT CLASS

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
Basic HBase operations: connection settings

public static Configuration GetConfig() {
    /* Config */
    Configuration hBaseConfig = HBaseConfiguration.create();
    // IP of the ZooKeeper quorum host
    hBaseConfig.set("hbase.zookeeper.quorum", "172.24.12.160");
    // ZooKeeper client port
    hBaseConfig.set("hbase.zookeeper.property.clientPort", "2181");
    // hbase.master
    hBaseConfig.set("hbase.master", "172.24.12.160:9000");
    return hBaseConfig; // hand the finished configuration back to callers
}
Basic HBase operations: create a table and add column families

public static void main(String[] args) throws IOException {
    /* Config */
    Configuration hBaseConfig = GetConfig();
    HBaseAdmin hBaseAdmin = new HBaseAdmin(hBaseConfig);
    /* Create Table */
    HTableDescriptor tableDescriptor = new HTableDescriptor("TEST");
    /* Add Family */
    tableDescriptor.addFamily(new HColumnDescriptor("Name"));
    tableDescriptor.addFamily(new HColumnDescriptor("Birth"));
    tableDescriptor.addFamily(new HColumnDescriptor("Address"));
    tableDescriptor.addFamily(new HColumnDescriptor("Sex"));
    hBaseAdmin.createTable(tableDescriptor);
}
Basic HBase operations: add columns to a table

public static void main(String[] args) throws IOException {
    /* Config */
    Configuration hBaseConfig = GetConfig();
    HBaseAdmin hBaseAdmin = new HBaseAdmin(hBaseConfig);
    /* addColumn: note that HBaseAdmin.addColumn adds a column family;
       individual column qualifiers are created implicitly when data is written */
    hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Chinese"));
    hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Type"));
    hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Day"));
    hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Home"));
    hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Sex"));
}
Basic HBase operations
HBase Web UI: http://172.24.12.160:60010/master-status
The newly created table can be seen through the web interface.
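The same check can also be done from the HBase shell; a quick sketch:

hpc@hpc-hbase0:~$ hbase shell
hbase(main):001:0> list
hbase(main):002:0> describe 'TEST'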
Basic HBase operations: delete a table

public static void DeleteTable(String TableName) throws IOException {
    HBaseAdmin admin = new HBaseAdmin(GetConfig());
    admin.disableTable(TableName); // a table must be disabled before it can be deleted
    admin.deleteTable(TableName);
    System.out.println("delete table success");
    admin.close();
}
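A minimal use of this helper, assuming the TEST table created earlier:

public static void main(String[] args) throws IOException {
    DeleteTable("TEST"); // disables, then drops, the TEST table
}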
Put Data to HBase
Add the following imports:

import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
Put Data to HBase: put a single value into HBase

HTable table = new HTable(GetConfig(), "TEST"); // the target table
String row = "1"; // row key to write to
/* families and columns to write to */
String family[] = {"Name", "Birth", "Birth", "Address", "Sex"};
String column[] = {"Chinese", "Type", "Day", "Home", "Sex"};
String value = "40"; // the value to write
byte[] brow = Bytes.toBytes(row);
byte[] bfamily = Bytes.toBytes(family[0]);
byte[] bcolumn = Bytes.toBytes(column[0]);
byte[] bvalue = Bytes.toBytes(value);
Put p = new Put(brow);
p.add(bfamily, bcolumn, bvalue); // writes "40" into Name:Chinese of row "1"
table.put(p);
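To confirm the write landed, the row can be fetched from the HBase shell; a quick sketch (table TEST and row '1' match the code above, and the Name:Chinese cell should come back with value 40):

hbase(main):001:0> get 'TEST', '1'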
Scan Data In HBase: scan the entire contents of a table

// additionally requires imports from org.apache.hadoop.hbase.client
// (HTablePool, Scan, ResultScanner, Result) and org.apache.hadoop.hbase.KeyValue
HTablePool pool = new HTablePool(GetConfig(), 1000);
HTable table = (HTable) pool.getTable(tableName); // the table to scan
Scan scan = new Scan();
ResultScanner rs = table.getScanner(scan);
for (Result r : rs) {
    System.out.println(new String(r.getRow(), "UTF-8")); // the row key
    for (KeyValue keyValue : r.raw()) { // walk every cell in the row
        System.out.println(new String(keyValue.getFamily(), "UTF-8")); // family name
        System.out.println(new String(keyValue.getValue(), "UTF-8")); // cell value
    }
}
Scan Data In HBase: fetch a specific ROWKEY

HTablePool pool = new HTablePool(configuration, 1000);
HTable table = (HTable) pool.getTable(tableName); // the table to query
Get get = new Get(Rowkey.getBytes()); // look up a single row by its row key
Result r = table.get(get);
System.out.println(new String(r.getRow(), "UTF-8")); // the row key
for (KeyValue keyValue : r.raw()) { // walk every cell in the row
    System.out.println(new String(keyValue.getFamily(), "UTF-8")); // family name
    System.out.println(new String(keyValue.getValue(), "UTF-8")); // cell value
}
Scan Data In HBase: find rows where a given column matches a keyword

HTablePool pool = new HTablePool(configuration, 1000);
HTable table = (HTable) pool.getTable(tableName); // the table to scan
Filter filter = new SingleColumnValueFilter(
        Bytes.toBytes(col),        // the column family to test
        Bytes.toBytes("Chinese"),  // the column (qualifier) to test
        CompareOp.EQUAL,
        Bytes.toBytes(keyword));   // the keyword to compare cell values against
Scan sc = new Scan();
sc.setFilter(filter);
ResultScanner rs = table.getScanner(sc);
for (Result r : rs) {
    System.out.println(new String(r.getRow(), "UTF-8")); // the row key
    for (KeyValue keyValue : r.raw()) { // walk every cell in the row
        System.out.println(new String(keyValue.getFamily(), "UTF-8")); // family name
        System.out.println(new String(keyValue.getValue(), "UTF-8")); // cell value
    }
}
Getting started with HIVE
Enter the command below to drop into the HIVE shell:

hpc@hpc-hbase0:~$ hive
hive>
Basic HIVE operations

Creating Hive tables:
hive> CREATE TABLE pokes (foo INT, bar STRING);

Browsing through tables:
hive> SHOW TABLES;

Dropping tables:
hive> DROP TABLE pokes;

INSERT:
hive> INSERT OVERWRITE TABLE tablename [PARTITION (partcol1=val1, partcol2=val2)] select_statement FROM from_statement;

For example:
hive> INSERT OVERWRITE TABLE test_insert SELECT * FROM test_table;
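A freshly created table such as pokes is empty, so the SELECT examples on the next slides would return nothing; a common way to load sample data, as in the Hive getting-started guide (the kv1.txt path ships with the Hive distribution and is an assumption here):

hive> LOAD DATA LOCAL INPATH './examples/files/kv1.txt' OVERWRITE INTO TABLE pokes;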
Basic HIVE operations: SELECT
hive> SELECT * FROM tablename LIMIT 20;
Basic HIVE operations: SELECT + WHERE
hive> SELECT * FROM tablename WHERE key = 'r1';