Hadoop&Hbase Developed Using JAVA USE NETBEANS IDE.


1 Hadoop&Hbase Developed Using JAVA USE NETBEANS IDE

2 OUTLINE
- Development environment installation
- JDK
- NetBeans
- First project
- Hadoop development
- HBase development
- Architecture
- Basic environment setup
- HBase basic operations
- Put Data To HBase
- Scan Data In HBase
- HIVE basic operations

3 NetBeans Introduction
- NetBeans is an open-source software development tool created by Sun Microsystems. It is a development framework and an extensible development platform that can be used to develop programs in Java, C/C++, PHP, HTML5, and more; as a platform, its functionality can be extended through plug-in modules.
- On the NetBeans Platform, applications are built from a set of modular software components. Each module is a JAR (Java archive) file containing a set of Java classes that implement the public interfaces defined by NetBeans, together with a manifest file that distinguishes it from other modules. Thanks to this modularity, an application built from modules can be extended simply by adding new modules; and because modules can be developed independently, applications built on the NetBeans Platform can take advantage of third-party software and be extended easily and efficiently.
(Source: http://zh.wikipedia.org/zh-tw/NetBeans)

4 Download NetBeans
- https://netbeans.org/downloads/index.html

5 Download JDK
- http://www.oracle.com/technetwork/java/javase/downloads/index.html

6 Install JDK: the JDK is needed both for development and by the NetBeans installer, so install it first!

7 Install JDK

8 JDK installation complete

9 Install NetBeans

10 Choose the JDK path here: select the directory where the JDK was installed.

11 Install NetBeans

12

13 The NetBeans development environment

14 First project

15

16 Project name

17 First project: code, run, and output of the result
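A minimal sketch of a typical first program for this step (the package and class names are hypothetical and should match whatever was entered on the project-name slide):

    package firstproject;   // hypothetical package name

    public class FirstProject {
        public static void main(String[] args) {
            // the text appears in the Output window at the bottom of the IDE
            System.out.println("Hello World!");
        }
    }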

18 Hadoop Program Development

19 IMPORT CLASS

    import java.io.IOException;        // thrown by the map/reduce methods below
    import java.util.StringTokenizer;  // used by the Map class below
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.conf.*;
    import org.apache.hadoop.io.*;
    import org.apache.hadoop.mapreduce.*;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

    public class WordCount {
        // code ...
    }

20 Hadoop Program Development: MAP

    public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                context.write(word, one);
            }
        }
    }

21 Hadoop Program Development: Reduce

    public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

22 Hadoop Program Development: main

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "WordCount");
        job.setJarByClass(WordCount.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }

23 Hadoop Program Development: build the project into a JAR file; the path of the generated JAR is shown here.

24 Hadoop Program Development
- Upload the JAR to the Hadoop host.
- Run the JAR with the following command, passing the input and output locations as parameters.
- The job's progress messages can then be seen as it runs.
- View the execution results.
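A minimal sketch of the typical commands for these steps (the JAR name, driver class name, and HDFS paths are hypothetical; adjust them to your build and cluster, and note the output directory must not already exist):

    hadoop fs -mkdir -p /user/hpc/wordcount/input
    hadoop fs -put input.txt /user/hpc/wordcount/input        # upload sample input to HDFS
    hadoop jar WordCount.jar WordCount /user/hpc/wordcount/input /user/hpc/wordcount/output
    hadoop fs -cat /user/hpc/wordcount/output/part-r-00000    # view the result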

25 Hadoop Program Development (2): Main

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class MaxTemperature {
        public static void main(String[] args) throws Exception {
            if (args.length != 2) {
                System.err.println("Usage: MaxTemperature <input path> <output path>");
                System.exit(-1);
            }
            Job job = new Job();
            job.setJarByClass(MaxTemperature.class);
            job.setJobName("Max temperature");
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            job.setMapperClass(MaxTemperatureMapper.class);
            job.setReducerClass(MaxTemperatureReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

26 Hadoop Program Development (2): Mapper

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class MaxTemperatureMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final int MISSING = 9999;

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            String year = line.substring(15, 19);
            int airTemperature;
            if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
                airTemperature = Integer.parseInt(line.substring(88, 92));
            } else {
                airTemperature = Integer.parseInt(line.substring(87, 92));
            }
            String quality = line.substring(92, 93);
            if (airTemperature != MISSING && quality.matches("[01459]")) {
                context.write(new Text(year), new IntWritable(airTemperature));
            }
        }
    }

27 Hadoop Program Development (2): Reducer

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class MaxTemperatureReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int maxValue = Integer.MIN_VALUE;
            for (IntWritable value : values) {
                maxValue = Math.max(maxValue, value.get());
            }
            context.write(key, new IntWritable(maxValue));
        }
    }

28 Hadoop Program Development (2)
- Download the sample file from the following URL:
  https://github.com/tomwhite/hadoop-book/blob/master/input/ncdc/all/1901.gz?raw=true
- Upload the file to HDFS.
- Run the JAR file.
- View the results (a sketch of typical commands follows).
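A minimal sketch of these steps (the HDFS paths and JAR name are hypothetical; adjust as needed):

    hadoop fs -mkdir -p /user/hpc/ncdc
    hadoop fs -put 1901.gz /user/hpc/ncdc                     # TextInputFormat reads the .gz transparently
    hadoop jar MaxTemperature.jar MaxTemperature /user/hpc/ncdc /user/hpc/ncdc-output
    hadoop fs -cat /user/hpc/ncdc-output/part-r-00000         # one line per year with its maximum temperature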

29 HBase Program Development

30 Basic architecture: connecting to HBase through ZooKeeper

31 [Architecture diagram: a Java application (JAR), ZooKeeper, the Region Servers, and HDFS; HBase data is accessed through ZooKeeper.]

32 Basic architecture: connecting to HBase through Hive and querying with SQL

33 [Architecture diagram: a Java application (JDBC), Hive, the Hadoop JobTracker, ZooKeeper, the Region Servers, and HDFS; HBase data is accessed through Hive.]

34 Basic environment setup
- This section uses Java to connect to HBase through ZooKeeper.
- Required JAR files (in the appendix, under /lib):
  - log4j-1.2.17.jar
  - zookeeper-3.4.5.jar
  - hbase-client-0.96.0-hadoop2.jar
  - hadoop-mapreduce-client-core-2.2.0.jar
  - hbase-common-0.96.0-hadoop2.jar
  - ...

35 Basic environment setup
- Add the JAR files to the project: copy the JARs into the project directory first, then right-click and choose Add JAR, open the files, and confirm that they appear in the project.

36 Basic environment setup: IMPORT CLASS

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

37 HBase Basic Operations: connection settings

    public static Configuration GetConfig() {
        /* Config */
        Configuration hBaseConfig = HBaseConfiguration.create();
        // IP of the ZooKeeper quorum host
        hBaseConfig.set("hbase.zookeeper.quorum", "172.24.12.160");
        // ZooKeeper client port
        hBaseConfig.set("hbase.zookeeper.property.clientPort", "2181");
        // hbase.master
        hBaseConfig.set("hbase.master", "172.24.12.160:9000");
        return hBaseConfig;   // the method must return the configuration it builds
    }

38 HBase Basic Operations: create a table and add column families

    public static void main(String[] args) throws IOException {
        /* Config */
        Configuration hBaseConfig = GetConfig();
        HBaseAdmin hBaseAdmin = new HBaseAdmin(hBaseConfig);
        /* Create Table */
        HTableDescriptor tableDescriptor = new HTableDescriptor("TEST");
        /* Add Family */
        tableDescriptor.addFamily(new HColumnDescriptor("Name"));
        tableDescriptor.addFamily(new HColumnDescriptor("Birth"));
        tableDescriptor.addFamily(new HColumnDescriptor("Address"));
        tableDescriptor.addFamily(new HColumnDescriptor("Sex"));
        hBaseAdmin.createTable(tableDescriptor);
    }

39 HBase Basic Operations: add columns to the table

    public static void main(String[] args) throws IOException {
        /* Config */
        Configuration hBaseConfig = GetConfig();
        HBaseAdmin hBaseAdmin = new HBaseAdmin(hBaseConfig);
        /* addColumn */
        hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Chinese"));
        hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Type"));
        hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Day"));
        hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Home"));
        hBaseAdmin.addColumn("TEST", new HColumnDescriptor("Sex"));
    }

40 HBase Basic Operations
- HBase Web UI: http://172.24.12.160:60010/master-status
- The newly created table can be seen through the web interface.
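The same check can also be made from the HBase shell; a minimal sketch, assuming the hbase command is available on the HBase host:

    hpc@hpc-hbase0:~$ hbase shell
    hbase(main):001:0> list              # lists all tables, including TEST
    hbase(main):002:0> describe 'TEST'   # shows the column families added above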

41 HBase Basic Operations: delete a table

    public static void DeleteTable(String TableName) throws IOException {
        HBaseAdmin admin = new HBaseAdmin(GetConfig());
        admin.disableTable(TableName);   // a table must be disabled before it can be deleted
        admin.deleteTable(TableName);
        System.out.println("delete table success");
        admin.close();
    }

42 Put Data to HBase
- Add the following imports:

    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

43 Put Data to HBase
- Put one value into HBase:

    HTable table = new HTable(GetConfig(), "TEST");   // target table name
    String row = "1";                                  // row key to write
    /* families and columns to write */
    String family[] = {"Name", "Birth", "Birth", "Address", "Sex"};
    String column[] = {"Chinese", "Type", "Day", "Home", "Sex"};
    String value = "40";                               // value to write
    byte[] brow = Bytes.toBytes(row);
    byte[] bfamily = Bytes.toBytes(family[0]);
    byte[] bcolumn = Bytes.toBytes(column[0]);
    byte[] bvalue = Bytes.toBytes(value);
    Put p = new Put(brow);
    p.add(bfamily, bcolumn, bvalue);
    table.put(p);
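The scan examples on the following slides use a few more classes; a sketch of the extra imports they need, assuming the hbase-client 0.96 API listed earlier:

    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.HTablePool;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.filter.Filter;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;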

44 Scan Data In HBase
- Scan the entire contents of a table:

    HTablePool pool = new HTablePool(GetConfig(), 1000);
    HTable table = (HTable) pool.getTable(tableName);   // tableName is the table to scan
    Scan scan = new Scan();
    ResultScanner rs = table.getScanner(scan);
    for (Result r : rs) {                                // iterate over every row
        out.println(new String(r.getRow(), "UTF-8"));    // row key
        for (KeyValue keyValue : r.raw()) {
            out.println(new String(keyValue.getFamily(), "UTF-8"));  // family name
            out.println(new String(keyValue.getValue(), "UTF-8"));   // cell value
        }
    }

45 Scan Data In HBase
- Look up a specific ROWKEY:

    HTablePool pool = new HTablePool(configuration, 1000);
    HTable table = (HTable) pool.getTable(tableName);    // tableName is the table to query
    Get scan = new Get(Rowkey.getBytes());               // query by ROWKEY
    Result r = table.get(scan);
    out.println(new String(r.getRow(), "UTF-8"));        // row key
    for (KeyValue keyValue : r.raw()) {                  // iterate over the row's cells
        out.println(new String(keyValue.getFamily(), "UTF-8"));  // family name
        out.println(new String(keyValue.getValue(), "UTF-8"));   // cell value
    }

46 Scan Data In HBase
- Query a specific column by keyword:

    HTablePool pool = new HTablePool(configuration, 1000);
    HTable table = (HTable) pool.getTable(tableName);    // tableName is the table to query
    Filter filter = new SingleColumnValueFilter(
            Bytes.toBytes(col),                          // column family
            Bytes.toBytes("Chinese"),                    // column qualifier
            CompareOp.EQUAL,
            Bytes.toBytes(keyword));                     // keyword to match
    Scan sc = new Scan();
    sc.setFilter(filter);
    ResultScanner rs = table.getScanner(sc);
    for (Result r : rs) {                                // iterate over matching rows
        out.println(new String(r.getRow(), "UTF-8"));    // row key
        for (KeyValue keyValue : r.raw()) {
            out.println(new String(keyValue.getFamily(), "UTF-8"));  // family name
            out.println(new String(keyValue.getValue(), "UTF-8"));   // cell value
        }
    }

47 Getting started with HIVE
- Enter the command below to open the HIVE shell:

    hpc@hpc-hbase0:~$ hive
    hive>

48 HIVE Basic Operations
- Creating Hive tables:
    hive> CREATE TABLE pokes (foo INT, bar STRING);
- Browsing through tables:
    hive> SHOW TABLES;
- Dropping tables:
    hive> DROP TABLE pokes;
- INSERT:
    hive> INSERT OVERWRITE TABLE tablename [PARTITION (partcol1=val1, partcol2=val2)] select_statement FROM from_statement;
    hive> insert overwrite table test_insert select * from test_table;

49 HIVE Basic Operations
- SELECT:
    hive> SELECT * FROM tablename LIMIT 20;

50 HIVE Basic Operations
- SELECT + WHERE:
    hive> SELECT * FROM tablename WHERE key = 'r1';
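To query the HBase TEST table created earlier through Hive, as in the architecture on slide 33, a Hive table is usually mapped onto it with the HBase storage handler. A minimal sketch, assuming the hive-hbase-handler JAR and the HBase configuration are on Hive's classpath (the Hive table name and the mapped column are illustrative):

    hive> CREATE EXTERNAL TABLE hbase_test (rowkey STRING, chinese_name STRING)
        > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
        > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,Name:Chinese")
        > TBLPROPERTIES ("hbase.table.name" = "TEST");
    hive> SELECT * FROM hbase_test LIMIT 20;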

