Published by Scott Turner. Modified over 8 years ago.
1
DataNode Disk Space Configuration (software container hosts)
2
DataNode Disk Space Configuration (1)
$ ssh dna1
$ df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
rootfs       19G  5.6G    12G   32%  /
none         19G  5.6G    12G   32%  /
::
$ hdfs dfsadmin -report
::
Name: 172.17.10.20:50010 (dna1)
Hostname: dna1
Rack: /17/10
Decommission Status : Normal
Configured Capacity: 19674116096 (18.32 GB)
DFS Used: 61440 (60 KB)
Non DFS Used: 6972833792 (6.49 GB)
DFS Remaining: 12701220864 (11.83 GB)
DFS Used%: 0.00%
DFS Remaining%: 64.56%
::
3
DataNode Disk Space Configuration (2)
$ sudo nano /opt/conf/A/hdfs-site.xml
::
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>8589934592</value>
</property>
$ stophdfs a
$ starthdfs a
$ ssh nna hdfs dfsadmin -report
::
Name: 172.17.10.20:50010 (dna1)
Hostname: dna1
Rack: /17/10
Decommission Status : Normal
Configured Capacity: 11084181504 (10.32 GB)
DFS Used: 32768 (32 KB)
Non DFS Used: 6974406656 (6.50 GB)
DFS Remaining: 4109742080 (3.83 GB)
DFS Used%: 0.00%
DFS Remaining%: 37.08%
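The drop in Configured Capacity between the two reports equals the reserved amount. The following is a minimal Python sketch of that arithmetic, using only the byte counts shown in the reports. The function names are illustrative, and it assumes one data volume per DataNode, since dfs.datanode.du.reserved applies per volume.

```python
def configured_capacity(raw_bytes, reserved_bytes):
    """Capacity left for HDFS after subtracting the per-volume reserve."""
    return max(raw_bytes - reserved_bytes, 0)


def dfs_remaining(capacity, dfs_used, non_dfs_used):
    """Remaining space as implied by the other three figures in the report."""
    return max(capacity - dfs_used - non_dfs_used, 0)


RAW_CAPACITY = 19_674_116_096   # Configured Capacity before the change
RESERVED = 8_589_934_592        # dfs.datanode.du.reserved, i.e. 8 * 1024**3

capacity = configured_capacity(RAW_CAPACITY, RESERVED)
remaining = dfs_remaining(capacity, dfs_used=32_768, non_dfs_used=6_974_406_656)

print(capacity)                     # 11084181504
print(round(capacity / 2**30, 2))   # 10.32 (GB as printed by dfsadmin)
print(remaining)                    # 4109742080
```

Both results agree with the second report: 10.32 GB configured and 3.83 GB remaining.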
4
Managing YARN Compute Resources (software container hosts)
5
Managing YARN Compute Resources
$ sudo nano /opt/conf/A/yarn-site.xml
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>1</value>
</property>
$ curl http://rma:8088/ws/v1/cluster/metrics
{"clusterMetrics":{"appsSubmitted":0,"appsCompleted":0,
"appsPending":0,"appsRunning":0,"appsFailed":0,"appsKilled":0,
"reservedMB":0,"availableMB":2048,"allocatedMB":0,"reservedVirtualCores":0,
"availableVirtualCores":2,"allocatedVirtualCores":0,"containersAllocated":0,
"containersReserved":0,"containersPending":0,"totalMB":2048,"totalVirtualCores":2,
"totalNodes":2,"lostNodes":0,"unhealthyNodes":0,"decommissionedNodes":0,
"rebootedNodes":0,"activeNodes":2}}
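The cluster totals in the metrics response are the per-node settings multiplied by the number of active Node Managers. A small Python sketch of that check follows. It works on a subset of the response shown above rather than calling the REST endpoint, and it assumes every node uses the same yarn-site.xml values; the helper name is illustrative.

```python
import json

# Subset of the /ws/v1/cluster/metrics response shown above.
RESPONSE = (
    '{"clusterMetrics":{"availableMB":2048,"allocatedMB":0,'
    '"totalMB":2048,"totalVirtualCores":2,"totalNodes":2,"activeNodes":2}}'
)

NODE_MEMORY_MB = 1024   # yarn.nodemanager.resource.memory-mb
NODE_VCORES = 1         # yarn.nodemanager.resource.cpu-vcores


def expected_totals(active_nodes, memory_mb, vcores):
    """Cluster totals if every active node advertises the same resources."""
    return {
        "totalMB": active_nodes * memory_mb,
        "totalVirtualCores": active_nodes * vcores,
    }


metrics = json.loads(RESPONSE)["clusterMetrics"]
expected = expected_totals(metrics["activeNodes"], NODE_MEMORY_MB, NODE_VCORES)
matches = all(metrics[key] == value for key, value in expected.items())

print(expected)   # {'totalMB': 2048, 'totalVirtualCores': 2}
print(matches)    # True
```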
6
Managing YARN Compute Resources: Setting the MapReduce Memory Requirement
$ cat /opt/conf/A/mapred-site.xml
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>512</value>
</property>
7
Adding a Node Manager (software container hosts)
8
Adding a Node Manager
$ sudo nano /opt/hosts-0.2
::
172.17.10.52 nma3 # node manager
::
$ dkcreate a
::
Warning: Permanently added 'nma3,172.17.10.52' (ECDSA) to the list of known hosts.
nma3 created
::
$ dkstart a.yarn
rma Running
nma1 Running
nma2 Running
nma3 starting
java version "1.7.0_79"
Scala compiler version 2.11.5 -- Copyright 2002-2013, LAMP/EPFL
$ startyarn a
9
Adding a Node Manager
$ ssh rma yarn node -list -all
15/08/19 00:29:26 INFO client.RMProxy: Connecting to ResourceManager at rma/172.17.10.30:8032
Total Nodes:3
Node-Id      Node-State  Node-Http-Address  Number-of-Running-Containers
nma1:55970   RUNNING     nma1:8042          0
nma2:44551   RUNNING     nma2:8042          0
nma3:35229   RUNNING     nma3:8042          0
$ curl http://rma:8088/ws/v1/cluster/metrics
{"clusterMetrics":{"appsSubmitted":0,"appsCompleted":0,
"appsPending":0,"appsRunning":0,"appsFailed":0,"appsKilled":0,
"reservedMB":0,"availableMB":3072,"allocatedMB":0,"reservedVirtualCores":0,
"availableVirtualCores":3,"allocatedVirtualCores":0,"containersAllocated":0,
"containersReserved":0,"containersPending":0,"totalMB":3072,"totalVirtualCores":3,
"totalNodes":3,"lostNodes":0,"unhealthyNodes":0,"decommissionedNodes":0,
"rebootedNodes":0,"activeNodes":3}}
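When the node list has to be checked from a script rather than by eye, the text output can be reduced to a host-to-state mapping. The sketch below parses the output captured above. The parsing rule (four whitespace-separated fields, a host:port Node-Id, a numeric container count) is an assumption inferred from this sample, not a documented format guarantee.

```python
NODE_LIST_OUTPUT = """\
15/08/19 00:29:26 INFO client.RMProxy: Connecting to ResourceManager at rma/172.17.10.30:8032
Total Nodes:3
Node-Id Node-State Node-Http-Address Number-of-Running-Containers
nma1:55970 RUNNING nma1:8042 0
nma2:44551 RUNNING nma2:8042 0
nma3:35229 RUNNING nma3:8042 0
"""


def parse_node_list(text):
    """Map each Node Manager host to its state and running container count."""
    nodes = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four fields; log lines and the header do not match.
        if len(fields) == 4 and ":" in fields[0] and fields[3].isdigit():
            host = fields[0].split(":")[0]
            nodes[host] = {"state": fields[1], "containers": int(fields[3])}
    return nodes


nodes = parse_node_list(NODE_LIST_OUTPUT)
running = sorted(host for host, info in nodes.items() if info["state"] == "RUNNING")
print(running)   # ['nma1', 'nma2', 'nma3']
```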
10
Configuring YARN Distributed Computing: Node Manager Whitelist (software container hosts)
11
Configuring YARN Distributed Computing: Node Manager Whitelist
$ sudo nano /opt/conf/A/yarn.allow
nma1
nma2
$ sudo nano /opt/conf/A/yarn-site.xml
::
<property>
  <name>yarn.resourcemanager.nodes.include-path</name>
  <value>/opt/conf/A/yarn.allow</value>
</property>
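The effect of the include file can be sketched in a few lines of Python. This is an illustration of the admission rule, not the ResourceManager's code: it assumes entries are whitespace-separated, that text after `#` is a comment, and that an empty include list admits every node. The function names are hypothetical.

```python
def parse_include_file(text):
    """Collect host names from an include file (assumed format, see above)."""
    hosts = set()
    for line in text.splitlines():
        entry = line.split("#", 1)[0]   # drop a trailing comment, if any
        hosts.update(entry.split())     # one or more hosts per line
    return hosts


def is_admitted(host, include_hosts):
    """A node is admitted if the list is empty or the host is listed."""
    return not include_hosts or host in include_hosts


include_hosts = parse_include_file("nma1\nnma2\n")
for host in ("nma1", "nma2", "nma3"):
    print(host, is_admitted(host, include_hosts))
```

With the two-entry file above, nma1 and nma2 are admitted and nma3 is not.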
12
Enabling the Node Manager Whitelist
$ stopyarn a
$ startyarn a
$ ssh rma yarn node -list -all
15/08/19 00:45:53 INFO client.RMProxy: Connecting to ResourceManager at rma/172.17.10.30:8032
Total Nodes:2
Node-Id      Node-State  Node-Http-Address  Number-of-Running-Containers
nma2:34726   RUNNING     nma2:8042          0
nma1:48948   RUNNING     nma1:8042          0
13
Modifying the Node Manager Whitelist
$ sudo nano /opt/conf/A/yarn.allow
nma1
nma2
nma3
$ ssh nma3 yarn-daemon.sh start nodemanager
$ yarn rmadmin -refreshNodes
$ yarn node -list -all
15/08/19 00:52:35 INFO client.RMProxy: Connecting to ResourceManager at rma/172.17.10.30:8032
15/08/19 00:52:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Total Nodes:3
Node-Id      Node-State  Node-Http-Address  Number-of-Running-Containers
nma3:37007   RUNNING     nma3:8042          0
nma2:34726   RUNNING     nma2:8042          0
nma1:48948   RUNNING     nma1:8042          0
14
Pig Analysis Tool (software container hosts)
15
Downloading the Data
$ ssh ds01@cla01
$ wget http://tobala.net/x/download/hypermarket.csv
--2015-08-19 02:44:02-- http://tobala.net/x/download/hypermarket.csv
Resolving tobala.net (tobala.net)... 69.89.27.215
Connecting to tobala.net (tobala.net)|69.89.27.215|:80... connected.
HTTP request sent, awaiting response... 200 OK
::
2015-08-19 02:44:11 (67.6 KB/s) - ‘hypermarket.csv’ saved [15944/15944]
$ hdfs dfs -put hypermarket.csv hypermarket.csv
16
Starting Pig
$ pig
grunt> clear;
grunt> a = load '/data/hypermarket.csv' using PigStorage(',');
2015-08-19 03:03:35,233 [main] INFO org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
grunt> dump a;
grunt> b = foreach a generate $0,$1,$2,$3,$4,$5;
grunt> dump b;
grunt> c = order b by $0 ASC;
grunt> dump c;
grunt> d = filter c by $0 != '會員編號';
grunt> store d into 'customer.csv' using PigStorage(',');
17
Running customer.pig
$ nano customer.pig
a = load 'hypermarket.csv' using PigStorage(',');
b = foreach a generate $0,$1,$2,$3,$4,$5;
c = order b by $0 ASC;
-- drop the header row; '會員編號' is the "member ID" column label
d = filter c by $0 != '會員編號';
store d into 'customer.csv' using PigStorage(',');
$ pig -f customer.pig
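For readers less familiar with Pig Latin, the same four steps (load, project the first six columns, sort by the first column, drop the header row) can be mirrored in plain Python. This is only a sketch of the data flow on in-memory text: the sample rows are made up, and fields are compared as strings, which may not match Pig's ordering for every data type.

```python
HEADER_ID = "會員編號"   # header label of the first column ("member ID")


def customers(csv_text):
    """Mirror the steps of customer.pig on a small in-memory CSV."""
    # a = load ... using PigStorage(','): split each line on commas
    rows = [line.split(",") for line in csv_text.splitlines() if line]
    # b = foreach a generate $0..$5: keep the first six fields
    projected = [row[:6] for row in rows]
    # c = order b by $0 ASC
    ordered = sorted(projected, key=lambda row: row[0])
    # d = filter c by $0 != header label
    return [row for row in ordered if row[0] != HEADER_ID]


# Hypothetical sample; the real hypermarket.csv columns are not shown in the slides.
SAMPLE = (
    "會員編號,name,c2,c3,c4,c5,c6\n"
    "B002,Lin,1,2,3,4,5\n"
    "A001,Chen,1,2,3,4,5\n"
)

for row in customers(SAMPLE):
    print(",".join(row))
```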
18
HDFS Balancer (software container hosts)
Reference article:
1. Hadoop HDFS Balancer Explained http://www.swiss-scalability.com/2013/08/hadoop-hdfs-balancer-explained.html
19
Uploading a File to HDFS
$ ssh cla01
$ dd if=/dev/zero of=foo.bar bs=1M count=1500
1500+0 records in
1500+0 records out
1572864000 bytes (1.6 GB) copied, 178.525 s, 8.8 MB/s
$ hdfs dfs -put foo.bar /tmp
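The figures in the dd output can be cross-checked with a little arithmetic: 1500 records of 1 MiB give the reported byte count, and dividing by the elapsed time gives the reported throughput. The sketch below also estimates how many HDFS blocks the file occupies, assuming a 128 MiB block size; the cluster's actual dfs.blocksize is not shown in the slides.

```python
MIB = 1024 ** 2


def dd_bytes(block_size, records):
    """Total bytes written by dd for the given block size and record count."""
    return block_size * records


def block_count(file_bytes, block_size=128 * MIB):
    """HDFS blocks needed for a file (block size here is an assumption)."""
    return -(-file_bytes // block_size)   # ceiling division


size = dd_bytes(1 * MIB, 1500)
rate_mb_per_s = size / 178.525 / 1e6   # dd reports decimal megabytes

print(size)                      # 1572864000
print(round(rate_mb_per_s, 1))   # 8.8
print(block_count(size))         # 12
```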
20
Setting the HDFS Balancer Bandwidth
$ sudo nano /opt/A/hdfs-site.xml
::
<property>
  <name>dfs.datanode.balance.bandwidthPerSec</name>
  <value>52428800</value>
</property>
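The value is given in bytes per second, so 52428800 corresponds to 50 MiB/s per DataNode. A short Python sketch of the conversion, plus a lower bound on how long a transfer would take at that cap, is given here; the helper names are illustrative, and real balancing runs take longer because of scheduling and iteration overhead.

```python
MIB = 1024 ** 2


def mib_per_sec_to_bytes(mib_per_sec):
    """Convert a rate in MiB/s to the byte value expected by the property."""
    return int(mib_per_sec * MIB)


def min_transfer_seconds(bytes_to_move, bandwidth_bytes_per_sec):
    """Best-case transfer time when a single DataNode is the bottleneck."""
    return bytes_to_move / bandwidth_bytes_per_sec


bandwidth = mib_per_sec_to_bytes(50)
print(bandwidth)                                   # 52428800
print(min_transfer_seconds(1024 * MIB, bandwidth)) # 20.48 seconds for 1 GiB
```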
21
Starting the HDFS Balancer (1)
$ hdfs dfsadmin -report
Configured Capacity: 5554536448 (5.17 GB)
::
Name: 172.18.1.20:50010 (dna1)
::
Configured Capacity: 566042624 (539.82 MB)
DFS Used: 528461824 (503.98 MB)
Non DFS Used: 0 (0 B)
DFS Remaining: 37580800 (35.84 MB)
DFS Used%: 93.36%
DFS Remaining%: 6.64%
::
Name: 172.17.10.30:50010 (dna2)
::
Configured Capacity: 2494246912 (2.32 GB)
DFS Used: 281980935 (268.92 MB)
Non DFS Used: 0 (0 B)
DFS Remaining: 2212265977 (2.06 GB)
DFS Used%: 11.31%
DFS Remaining%: 88.69%
::
* The DFS Used% of dna1 above is 93.36%, which means its storage space is almost used up.
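The percentages in the report follow directly from the byte counts. A minimal Python sketch that recomputes them for dna1 and dna2, using only the numbers shown above:

```python
def percent(part_bytes, capacity_bytes):
    """Share of the configured capacity, rounded as in the dfsadmin report."""
    return round(100 * part_bytes / capacity_bytes, 2)


# (configured capacity, DFS used, DFS remaining) in bytes, from the report
REPORT = {
    "dna1": (566_042_624, 528_461_824, 37_580_800),
    "dna2": (2_494_246_912, 281_980_935, 2_212_265_977),
}

usage = {
    name: (percent(used, capacity), percent(remaining, capacity))
    for name, (capacity, used, remaining) in REPORT.items()
}
print(usage)   # {'dna1': (93.36, 6.64), 'dna2': (11.31, 88.69)}
```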
22
Starting the HDFS Balancer (2)
$ hdfs balancer -threshold 30
(-threshold: percentage of disk capacity)
::
15/07/10 20:17:38 INFO balancer.Balancer: 1 over-utilized: [172.18.1.20:50010:DISK]
15/07/10 20:17:38 INFO balancer.Balancer: 0 underutilized: []
15/07/10 20:17:38 INFO balancer.Balancer: Need to move 237.59 MB to make the cluster balanced.
15/07/10 20:17:38 INFO balancer.Balancer: Decided to move 161.95 MB bytes from 172.18.1.20:50010:DISK to 172.17.10.30:50010:DISK
15/07/10 20:17:38 INFO balancer.Balancer: Will move 161.95 MB in this iteration
2015/7/10 PM 08:17:39  0  0 B  237.59 MB  161.95 MB
::
15/07/10 20:17:48 INFO balancer.Balancer: 1 over-utilized: [172.18.1.20:50010:DISK]
15/07/10 20:17:48 INFO balancer.Balancer: 0 underutilized: []
15/07/10 20:17:48 INFO balancer.Balancer: Need to move 237.59 MB to make the cluster balanced.
15/07/10 20:17:48 INFO balancer.Balancer: Decided to move 161.95 MB bytes from 172.18.1.20:50010:DISK to 172.17.10.30:50010:DISK
::
* 172.18.1.20 is the IP address of dna1.
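The "over-utilized" and "underutilized" counts in the log come from comparing each DataNode's utilization with the cluster average plus or minus the threshold. The Python sketch below illustrates that classification. The cluster average of 20% is an assumed value for illustration only, because the report excerpt does not include the cluster-wide DFS Used figure.

```python
def classify(utilization, cluster_average, threshold):
    """Place a DataNode in one of the four balancer categories (percent values)."""
    if utilization > cluster_average + threshold:
        return "over-utilized"
    if utilization > cluster_average:
        return "above-average"
    if utilization >= cluster_average - threshold:
        return "below-average"
    return "under-utilized"


THRESHOLD = 30.0          # from "hdfs balancer -threshold 30"
CLUSTER_AVERAGE = 20.0    # ASSUMPTION: not shown in the report excerpt

DFS_USED_PERCENT = {"dna1": 93.36, "dna2": 11.31}

categories = {
    name: classify(used, CLUSTER_AVERAGE, THRESHOLD)
    for name, used in DFS_USED_PERCENT.items()
}
print(categories)   # {'dna1': 'over-utilized', 'dna2': 'below-average'}
```

Under this assumed average the result is consistent with the log: one over-utilized node (dna1) and no under-utilized nodes.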
23
HDFS Federation (software container hosts)
Reference article:
1. HDFS Federation Configuration http://datadotz.com/hdfs-federation/
24
Creating the Second HDFS Distributed File System
Create all container hosts for the second HDFS:
$ sudo nano /opt/hosts-0.2
$ dkcreate b
::
clb1 created : dsb01 dsb02
nnb created : dsb01 dsb02
dnb1 created
dnb2 created
rmb created
nmb1 created
nmb2 created
25
Starting All Container Hosts of the Second HDFS
$ dkstart b.hdfs
nnb starting
java version "1.7.0_79"
Scala compiler version 2.11.5 -- Copyright 2002-2013, LAMP/EPFL
dnb1 starting
java version "1.7.0_79"
Scala compiler version 2.11.5 -- Copyright 2002-2013, LAMP/EPFL
dnb2 starting
java version "1.7.0_79"
Scala compiler version 2.11.5 -- Copyright 2002-2013, LAMP/EPFL
$ formathdfs b myring
format (yes/no) yes
nnb format ok
nnb clean sn
dnb1 clean dn
dnb2 clean dn
26
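Configuring the Second HDFS Distributed File System
$ sudo nano /opt/conf/B/hdfs-site.xml
::
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://nnb:8020</value>
</property>
<property>
  <name>fs.default.name</name>
  <value>hdfs://nnb:8020</value>
</property>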
27
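Configuring the Second HDFS Distributed File System
$ sudo nano /opt/conf/B/hdfs-site.xml
::
<property>
  <name>dfs.nameservices</name>
  <value>hdfs1,hdfs2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hdfs1</name>
  <value>nna:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hdfs2</name>
  <value>nnb:8020</value>
</property>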
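The federation settings follow a regular pattern: one dfs.nameservices entry listing every nameservice, plus one dfs.namenode.rpc-address.<nameservice> entry per NameNode. The Python sketch below generates those properties from a nameservice-to-host mapping, which can help keep the A and B clusters' files identical. The function names are illustrative and the output is unindented XML.

```python
import xml.etree.ElementTree as ET


def federation_properties(namenodes, port=8020):
    """Build the federation properties from {nameservice: namenode host}."""
    props = {"dfs.nameservices": ",".join(namenodes)}
    for nameservice, host in namenodes.items():
        props[f"dfs.namenode.rpc-address.{nameservice}"] = f"{host}:{port}"
    return props


def to_configuration_xml(props):
    """Render the properties as a Hadoop <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


props = federation_properties({"hdfs1": "nna", "hdfs2": "nnb"})
xml_text = to_configuration_xml(props)
print(xml_text)
```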
28
Starting the HDFS Distributed File System
$ starthdfs b
starting namenode, logging to /tmp/hadoop-bigred-namenode-nnb.out
starting secondarynamenode, logging to /tmp/hadoop-bigred-secondarynamenode-nnb.out
starting datanode, logging to /tmp/hadoop-bigred-datanode-dnb1.out
starting datanode, logging to /tmp/hadoop-bigred-datanode-dnb2.out
$ ssh nnb
$ hdfs dfsadmin -report
::
-------------------------------------------------
Live datanodes (2):
Name: 172.17.20.30:50010 (dnb1)
Hostname: dnb1
::
Name: 172.17.20.31:50010 (dnb2)
Hostname: dnb2
::
29
HDFS Federation (software container hosts)
30
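Configuring the First HDFS Distributed File System
$ sudo nano /opt/conf/A/hdfs-site.xml
::
<property>
  <name>dfs.nameservices</name>
  <value>hdfs1,hdfs2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hdfs1</name>
  <value>nna:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hdfs2</name>
  <value>nnb:8020</value>
</property>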
31
Reformatting the First HDFS
Stop the NameNode:
$ hadoop-daemon.sh stop namenode
stopping namenode
$ hdfs namenode -format -clusterID myring
::
15/07/10 16:40:34 INFO namenode.NNConf: XAttrs enabled? true
15/07/10 16:40:34 INFO namenode.NNConf: Maximum size of an xattr: 16384
Re-format filesystem in Storage Directory /home/pi/nn ? (Y or N) y
::
15/07/10 16:41:34 INFO util.ExitUtil: Exiting with status 0
15/07/10 16:41:34 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at nna/172.18.1.10
************************************************************/
32
Clearing the DataNode Data Directories
$ ssh dna1
pi@dna1's password:
$ hadoop-daemon.sh stop datanode
stopping datanode
$ rm -r dn
$ exit
$ ssh dna2
pi@dna2's password:
$ hadoop-daemon.sh stop datanode
stopping datanode
$ rm -r dn
$ exit
$ ssh dna3
pi@dna3's password:
$ hadoop-daemon.sh stop datanode
stopping datanode
$ rm -r dn
$ exit
33
Starting the First HDFS Distributed File System
$ hadoop-daemon.sh start namenode
starting namenode, logging to /tmp/hadoop-pi-namenode-nna.out
$ ssh dna1
pi@dna1's password:
$ hadoop-daemon.sh start datanode
starting datanode, logging to /tmp/hadoop-pi-datanode-dna1.out
$ exit
$ ssh dna2
pi@dna2's password:
$ hadoop-daemon.sh start datanode
starting datanode, logging to /tmp/hadoop-pi-datanode-dna2.out
$ exit
$ ssh dna3
pi@dna3's password:
$ hadoop-daemon.sh start datanode
starting datanode, logging to /tmp/hadoop-pi-datanode-dna3.out
$ exit
34
HDFS Federation (software container hosts)
Reference article:
1. HDFS Federation Configuration http://datadotz.com/hdfs-federation/
35
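Configuring the Hadoop Client of the First HDFS
$ sudo nano /opt/conf/A/core-site.xml
::
<property>
  <name>fs.default.name</name>
  <value>viewfs://myring/</value>
</property>
<property>
  <name>fs.viewfs.mounttable.myring.link./hdfs1</name>
  <value>hdfs://nna:8020</value>
</property>
<property>
  <name>fs.viewfs.mounttable.myring.link./hdfs2</name>
  <value>hdfs://nnb:8020</value>
</property>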
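The mount table maps each top-level directory of the viewfs://myring/ namespace to a NameNode. The Python sketch below illustrates how a client-side path would be resolved against the two links configured above. It is a simplified model written for this walkthrough, not Hadoop's ViewFs implementation, and the error raised for unmounted paths is an illustrative choice.

```python
MOUNT_TABLE = {
    "/hdfs1": "hdfs://nna:8020",   # fs.viewfs.mounttable.myring.link./hdfs1
    "/hdfs2": "hdfs://nnb:8020",   # fs.viewfs.mounttable.myring.link./hdfs2
}


def resolve(path, mounts=MOUNT_TABLE):
    """Translate a viewfs path into the URI on the backing NameNode."""
    # Try longer mount points first so nested links would take precedence.
    for mount_point in sorted(mounts, key=len, reverse=True):
        if path == mount_point or path.startswith(mount_point + "/"):
            remainder = path[len(mount_point):] or "/"
            return mounts[mount_point] + remainder
    raise FileNotFoundError(f"{path} is not under any mount point")


print(resolve("/hdfs1/abc"))   # hdfs://nna:8020/abc
print(resolve("/hdfs2"))       # hdfs://nnb:8020/
```

In this model a path such as /tmp has no target, which mirrors the fact that only /hdfs1 and /hdfs2 appear when listing the root of the federated namespace.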
36
Using the Federated HDFS Distributed File System
$ ssh cl01
$ hdfs dfs -ls /
Found 2 items
-r-xr-xr-x   - bigred bigred          0 2015-07-11 01:38 /hdfs1
-r-xr-xr-x   - bigred bigred          0 2015-07-11 01:38 /hdfs2
$ hdfs dfs -ls -d /hdfs1
drwxr-xr-x   - bigred supergroup          0 1970-01-01 08:00 /hdfs1
$ hdfs dfs -mkdir /hdfs1/abc
$ hdfs dfs -mkdir /hdfs2/xyz
mkdir: Permission denied: user=bigred, access=WRITE, inode="/":pi:supergroup:drwxr-xr-x
37
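Configuring the Hadoop Client of the Second HDFS
$ sudo nano /opt/pi/hadoop-2.6.0/etc/hadoop/core-site.xml
::
<property>
  <name>fs.default.name</name>
  <value>viewfs://myring/</value>
</property>
<property>
  <name>fs.viewfs.mounttable.myring.link./hdfs1</name>
  <value>hdfs://nna:8020</value>
</property>
<property>
  <name>fs.viewfs.mounttable.myring.link./hdfs2</name>
  <value>hdfs://nnb:8020</value>
</property>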
38
Using the Federated HDFS Distributed File System
$ ssh cla01
$ hdfs dfs -ls /
Found 2 items
-r-xr-xr-x   - bigred bigred          0 2015-07-11 01:38 /hdfs1
-r-xr-xr-x   - bigred bigred          0 2015-07-11 01:38 /hdfs2
$ hdfs dfs -ls /hdfs1
Found 1 items
drwxr-xr-x   - bigred supergroup          0 2015-07-11 01:41 /hdfs1/abc
$ hdfs dfs -mkdir /hdfs2/xyz