
1 Working with Hadoop

2 Requirements
– Virtual machine software: VMware or VirtualBox
– Virtual machine image: download from Cloudera (founded by leaders in the field, including the father of Hadoop)

3 Start the Virtual Machine

4 Inside the virtual machine
– CentOS 6.4
– JDK
– Hadoop 2.5.0
– Eclipse 4.2.6 (Juno)

5 Basics of HDFS (routine)
With the terminal:
– hadoop
– hadoop version
– hadoop jar
– hadoop fs …
– hadoop fs -ls: list files in HDFS
– hadoop fs -put / -get / -mkdir / -rmdir ...
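A typical sequence of these commands might look like the sketch below. It is only an illustration: the `run` wrapper, the `hdfs_routine` function, the directory /user/cloudera/demo and the file notes.txt are all made up here. By default the commands are printed rather than executed; set RUN=1 inside the VM to run them for real.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical helper): prints the command unless RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Routine HDFS session: check the install, create a directory,
# upload a local file, list the directory, download the file again.
hdfs_routine() {
    dir=$1    # HDFS directory (hypothetical example path)
    file=$2   # local file name in the current directory (hypothetical)
    run hadoop version
    run hadoop fs -mkdir -p "$dir"
    run hadoop fs -put "$file" "$dir"
    run hadoop fs -ls "$dir"
    run hadoop fs -get "$dir/$file" "copy-of-$file"
}

hdfs_routine /user/cloudera/demo notes.txt
```

The download target is renamed (copy-of-notes.txt) so that -get does not collide with the local file that was just uploaded.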

6 Copy files from Windows to the VM
WinSCP (see demo at bin\scp_ssh\winscp575)
– Protocol: scp
– Hostname: the VM's address (get it from ifconfig in the terminal)
– Username/password: cloudera/cloudera

7 Copy files from the VM (CentOS) to HDFS: hadoop fs -put localfiles /user/cloudera

8 Copy files from Windows to HDFS via HUE services

9 Using web server – port 8888 (File manager)

10 Hadoop administration: http://hostname:50070/dfshealth.html#tab-overview

11 WordCount example in Hadoop
#1: Via the guidelines on the Cloudera website
#2: Directly in Eclipse (preferred)

12 WordCount on the Cloudera website
Tutorial: http://www.cloudera.com/content/cloudera/en/documentation/hadoop-tutorial/CDH5/Hadoop-Tutorial/ht_wordcount1.html
Source code downloaded from: http://tiny.cloudera.com/hadoopTutorialSample
Source code details and explanations: http://www.cloudera.com/content/cloudera/en/documentation/hadoop-tutorial/CDH5/Hadoop-Tutorial/ht_wordcount1_source.html

13 WordCount on the Cloudera website
Create the directories in HDFS:
$ hadoop fs -mkdir /user/cloudera
$ hadoop fs -chown cloudera /user/cloudera
$ hadoop fs -mkdir /user/cloudera/wordcount /user/cloudera/wordcount/input
Create the sample text:
– Option 1: directly in CentOS
$ echo "Hadoop is an elephant" > file0
$ echo "Hadoop is as yellow as can be" > file1
$ echo "Oh what a yellow fellow is Hadoop" > file2
and then copy the files to HDFS:
$ hadoop fs -put file* /user/cloudera/wordcount/input
– Option 2: create the files in Windows and copy them to HDFS via HUE
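Before running the job, it can help to know what counts to expect. The sketch below is not part of the Cloudera tutorial: it recreates the three sample files in a temporary directory and counts words locally with standard Unix tools, assuming whitespace tokenization and case-sensitive matching. The function name `word_counts` is made up for illustration.

```shell
#!/bin/sh
# Local word count: prints "word<TAB>count", one word per line.
# Reads the files given as arguments, or stdin if none are given.
word_counts() {
    cat "$@" |
        tr -s '[:space:]' '\n' |      # put each word on its own line
        LC_ALL=C sort |               # group identical words (bytewise order)
        uniq -c |                     # count each group
        awk '{ print $2 "\t" $1 }'    # reorder to: word, count
}

# Recreate the tutorial's sample input in a scratch directory.
workdir=$(mktemp -d)
echo "Hadoop is an elephant" > "$workdir/file0"
echo "Hadoop is as yellow as can be" > "$workdir/file1"
echo "Oh what a yellow fellow is Hadoop" > "$workdir/file2"

word_counts "$workdir"/file*

rm -rf "$workdir"
```

For these punctuation-free sentences, "Hadoop" and "is" each appear 3 times, "as" and "yellow" twice, and every other word once. The MapReduce output should agree as long as the job also splits on whitespace and does not fold case.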

14 WordCount on the Cloudera website: compilation error

15 WordCount example in Hadoop
#1: Via the guidelines on the Cloudera website
#2: Directly in Eclipse (preferred)

16 WordCount in the Eclipse environment
http://kishorer.in/2014/10/22/running-a-wordcount-mapreduce-example-in-hadoop-2-4-1-single-node-cluster-in-ubuntu-14-04-64-bit/
https://www.youtube.com/watch?v=hJsaChh2Yhk
(Some parts are different for the Cloudera VM)

20 Update the source code (from the website)

21 Adding JAR files to the project

22 /usr/lib/hadoop; /usr/lib/hadoop/lib; /usr/lib/hadoop-mapreduce; /usr/lib/hadoop-mapreduce/lib
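Outside Eclipse, the same four directories can be turned into a Java classpath. This is a sketch rather than a step from the slides: `join_classpath` is a hypothetical helper, and the `javac` line in the comment assumes a source file named WordCount.java and an existing build directory.

```shell
#!/bin/sh
# Build a colon-separated classpath with one "dir/*" wildcard entry
# per directory. The wildcard is expanded by java/javac, not the shell.
join_classpath() {
    cp=""
    for dir in "$@"; do
        cp="${cp:+$cp:}$dir/*"    # add ":" only between entries
    done
    printf '%s\n' "$cp"
}

CP=$(join_classpath \
    /usr/lib/hadoop \
    /usr/lib/hadoop/lib \
    /usr/lib/hadoop-mapreduce \
    /usr/lib/hadoop-mapreduce/lib)

echo "$CP"

# Inside the VM, a compile step could then look like (assumed file names):
#   javac -cp "$CP" -d build WordCount.java
```

Keeping "$CP" quoted matters: unquoted, the shell would expand each `*` itself and pass a space-separated file list instead of a classpath.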

23 Run configuration: Run → Run Configurations

24 File → Export →

26 Update properties in the jar file

27 Prepare for the run: make the HDFS directory

28 Copy the sample input to HDFS (via HUE)

29 Run the example (in the .jar folder). Make sure to remove the output folder before each run.
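The run step, including the cleanup of the old output folder, could be scripted roughly as follows. The jar name wordcount.jar, the main class WordCount, and the HDFS paths are assumptions, not values taken from the slides; if the exported jar already names its main class, the class argument may need to be dropped. Commands are printed by default and only executed when RUN=1 is set.

```shell
#!/bin/sh
# Dry-run wrapper (hypothetical helper): prints the command unless RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Remove any previous output directory, then submit the job.
# A MapReduce job normally refuses to start if its output directory exists.
run_wordcount() {
    jar=$1      # exported jar file (assumed name)
    main=$2     # main class inside the jar (assumed name)
    input=$3    # HDFS input directory
    output=$4   # HDFS output directory, deleted before the run
    run hadoop fs -rm -r -f "$output"
    run hadoop jar "$jar" "$main" "$input" "$output"
}

run_wordcount wordcount.jar WordCount \
    /user/cloudera/wordcount/input \
    /user/cloudera/wordcount/output
```

The -f flag keeps the removal quiet on the first run, when there is no output directory to delete yet.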

30 View the result

31 Other sources: a very nice example at https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html

