Published by John McCormick. Modified over 9 years ago.

Hadoop Demo
Presented by: Imranul Hoque

Topics
- Hadoop running modes
  - Stand alone
  - Pseudo distributed
  - Cluster
- Running MapReduce jobs
- Status/logs
- Sample MapReduce code

Required Software
- Hadoop (release 0.18.3)
  - http://apache.osuosl.org/hadoop/core/hadoop-0.18.3/hadoop-0.18.3.tar.gz
- Java Development Kit (jdk 1.6.0_01)
  - http://java.sun.com/javase/downloads/index.jsp
- Ant (ant 1.7.1)
  - http://apache.inetbridge.net/ant/binaries/apache-ant-1.7.1-bin.tar.gz

Setup
- NameNode: sherpa01
- JobTracker: sherpa02
- DataNode/TaskTracker: sherpa05, sherpa06

Assumptions
- ssh must be installed and sshd must be running
- Shared home directory (NFS) across all nodes in the cluster (makes life easier)

Steps
- Install JDK, ant
- Passphraseless ssh
- Compiling Hadoop
- Setting up config parameters
- Starting up Hadoop
- Running jobs
- Job status

Passphraseless ssh
On Source:
1. Generate a private-public key pair (~/.ssh/id_dsa and ~/.ssh/id_dsa.pub)
2. Send the public key to Destination
On Destination:
3. Add the public key to the authorized key list, ~/.ssh/authorized_keys

Passphraseless ssh (2)
With the home directory shared over NFS by sherpa01, sherpa02, sherpa05, and sherpa06:
1. ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
2. cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys (four times, once per node)
3. Modify the hostname in each authorized_keys entry so that there is one entry per node
Add "StrictHostKeyChecking no" to /etc/ssh/ssh_config to turn off the host-key prompt.
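
Step 2 above can be wrapped in a small helper. This is a minimal sketch, not part of the original demo: add_authorized_key is a hypothetical function name, and it appends the key only if the exact line is not already present. The trailing comment field of a public key line is not used for authentication, so on a shared home directory a single entry is likely sufficient; the four-entry variant from the slide would need the hostname edits done by hand.

```shell
# Hypothetical helper: append a public key to an authorized_keys file,
# skipping the append if the identical line is already there.
add_authorized_key() {
  pubkey_file=$1
  auth_file=$2

  # sshd typically rejects keys when these are group- or world-writable.
  mkdir -p "$(dirname "$auth_file")"
  chmod 700 "$(dirname "$auth_file")"
  touch "$auth_file"
  chmod 600 "$auth_file"

  # -x: match whole lines, -F: fixed strings, -f: read the pattern from a file
  if ! grep -qxFf "$pubkey_file" "$auth_file"; then
    cat "$pubkey_file" >> "$auth_file"
  fi
}

# Usage on the cluster (commented out so that sourcing this file changes nothing):
# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
# add_authorized_key ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys
```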

Setting the PATH
- JAVA_HOME=/usr/java/jdk1.6.0_01
- ANT_HOME=~/ant
- PATH=/usr/java/jdk1.6.0_01/bin:$PATH
- PATH=~/ant/bin:$PATH
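
As a sketch, the same settings could go into a shell startup file such as ~/.bashrc (the choice of file is an assumption). The variables are exported so that child processes like ant and the Hadoop scripts can see them, and the PATH entries reuse the two variables rather than repeating the directories.

```shell
# Install locations taken from the slide; adjust to the local layout.
export JAVA_HOME=/usr/java/jdk1.6.0_01
export ANT_HOME="$HOME/ant"

# Prepend both bin directories, keeping the existing PATH after them.
export PATH="$ANT_HOME/bin:$JAVA_HOME/bin:$PATH"
```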

Installing and Configuring Hadoop
- Extract
- Build (ant)
- Modify conf/hadoop-env.sh:
  - export JAVA_HOME=/usr/java/jdk1.6.0_01
- Inform Hadoop of the Masters and Slaves
  - conf/masters
  - conf/slaves
- Modify conf/hadoop-site.xml
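
The slides do not show the contents of these files, so the following is only a sketch of what they might look like for the four-node setup. write_cluster_conf is a hypothetical helper, the host placed in conf/masters is a guess, and ports 9000 and 9001 are placeholders. fs.default.name and mapred.job.tracker are the property names used by Hadoop releases of this generation.

```shell
# Hypothetical helper: write minimal cluster config files into a conf directory.
write_cluster_conf() {
  conf_dir=$1
  mkdir -p "$conf_dir"

  # Assumption: sherpa01 is the host listed as master.
  printf '%s\n' sherpa01 > "$conf_dir/masters"

  # DataNode/TaskTracker hosts from the Setup slide, one per line.
  printf '%s\n' sherpa05 sherpa06 > "$conf_dir/slaves"

  # Ports 9000 and 9001 are placeholders, not values from the slides.
  cat > "$conf_dir/hadoop-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://sherpa01:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>sherpa02:9001</value>
  </property>
</configuration>
EOF
}

# Usage, from the Hadoop directory (commented out so nothing is overwritten):
# write_cluster_conf conf
```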

Rack Awareness
- Set topology.script.file.name to conf/fakedns.sh
- In fakedns.sh:
  - echo /rack_id
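
A slightly fuller sketch of such a topology script is shown here. Hadoop calls the script with one or more node addresses as arguments and expects one rack path back per argument. The rack names and the host-to-rack mapping below are invented for illustration, and rack_for and resolve_racks are hypothetical names. Hadoop may pass IP addresses rather than hostnames, so a real script would need to match whichever form it receives.

```shell
# Hypothetical mapping from node name to rack path.
rack_for() {
  case "$1" in
    sherpa05*) echo /rack1 ;;
    sherpa06*) echo /rack2 ;;
    *)         echo /default-rack ;;  # fallback for unknown nodes
  esac
}

# Print one rack path per argument, in the same order as the arguments.
resolve_racks() {
  for node in "$@"; do
    rack_for "$node"
  done
}

# In an actual fakedns.sh, the last line would be:
# resolve_racks "$@"
```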

Starting Hadoop
- Format NameNode FS (sherpa01):
  - bin/hadoop namenode -format
- From NameNode (sherpa01):
  - bin/start-dfs.sh
- From JobTracker (sherpa02):
  - bin/start-mapred.sh
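
With passphraseless ssh in place, the same sequence could be driven from a single machine. This is a sketch under assumptions: HADOOP_DIR (the install path relative to the shared home directory) and the function names are hypothetical, and the SSH variable exists only so the commands can be previewed by setting SSH=echo.

```shell
# Assumption: Hadoop is unpacked at the same path on every node (shared NFS home).
HADOOP_DIR=${HADOOP_DIR:-hadoop-0.18.3}
SSH=${SSH:-ssh}

# Run once, before the first start; formatting creates a new, empty HDFS namespace.
format_namenode() {
  "$SSH" sherpa01 "cd $HADOOP_DIR && bin/hadoop namenode -format"
}

# Start HDFS from the NameNode host, then MapReduce from the JobTracker host.
start_cluster() {
  "$SSH" sherpa01 "cd $HADOOP_DIR && bin/start-dfs.sh" &&
  "$SSH" sherpa02 "cd $HADOOP_DIR && bin/start-mapred.sh"
}

# Preview without touching the cluster:
# SSH=echo; start_cluster
```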

Running MapReduce
- Copy data to HDFS
  - bin/hadoop dfs -copyFromLocal ~/data gutenberg
- Run MapReduce
  - bin/hadoop jar hadoop-0.18.3-examples.jar wordcount -r 6 gutenberg gutenberg-output
- Some HDFS commands
  - copyToLocal, cat, cp, rm, du, ls, etc.
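
The copy and run steps can be chained so that the job starts only if the upload succeeded. A minimal sketch, assuming it is run from the Hadoop directory: run_wordcount is a hypothetical name, and the HADOOP variable is there so the commands can be previewed with HADOOP=echo.

```shell
HADOOP=${HADOOP:-bin/hadoop}

# Upload a local directory, run the wordcount example with 6 reducers,
# then list the output directory in HDFS.
run_wordcount() {
  local_dir=$1
  input=$2
  output=$3

  "$HADOOP" dfs -copyFromLocal "$local_dir" "$input" &&
  "$HADOOP" jar hadoop-0.18.3-examples.jar wordcount -r 6 "$input" "$output" &&
  "$HADOOP" dfs -ls "$output"
}

# Usage with the names from the slide:
# run_wordcount ~/data gutenberg gutenberg-output
```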

Job/Node Status
- NameNode:
  - http://sherpa01.cs.uiuc.edu:50001
- JobTracker:
  - http://sherpa02.cs.uiuc.edu:50002
- Also look at the logs:
  - logs/
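
For the logs, a quick first pass is to search for error-level entries. The sketch below assumes the daemons write *.log files into logs/ and that each line carries its level as a separate word (timestamp, level, class, message); both are assumptions about the logging configuration, and scan_logs is a hypothetical name.

```shell
# Print file name, line number, and text of every ERROR or FATAL entry.
scan_logs() {
  log_dir=${1:-logs}
  # -H: always show the file name, -n: show line numbers, -E: extended regex
  grep -HnE ' (ERROR|FATAL) ' "$log_dir"/*.log
}

# Usage, from the Hadoop directory:
# scan_logs logs
```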

WordCount.java
- src/examples/org/apache/hadoop/examples/WordCount.java
  - Map function
  - Reduce function
  - Driver function
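
The roles of the three parts can be mirrored, very loosely, with standard Unix tools. This is an analogy in shell, not the Java code shipped with Hadoop, and all function names are hypothetical. The map step emits a (word, 1) pair per word, sort stands in for the shuffle that groups equal keys, the reduce step sums the counts for each word, and the driver wires the stages together.

```shell
# Map: split input on whitespace and emit "word<TAB>1" for every word.
map_words() {
  tr -s '[:space:]' '\n' | awk 'NF { print $0 "\t1" }'
}

# Reduce: input is sorted by word; sum the counts of each run of equal words.
reduce_counts() {
  awk -F '\t' '
    NR > 1 && $1 != prev { print prev "\t" sum; sum = 0 }
    { prev = $1; sum += $2 }
    END { if (NR > 0) print prev "\t" sum }
  '
}

# Driver: connect map, shuffle (sort), and reduce into one pipeline.
wordcount_local() {
  map_words | LC_ALL=C sort | reduce_counts
}

# Usage:
# cat ~/data/*.txt | wordcount_local
```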

Shutdown
- From NameNode (sherpa01):
  - bin/stop-dfs.sh
- From JobTracker (sherpa02):
  - bin/stop-mapred.sh

Conclusion
- For more details:
  - http://hadoop.apache.org/core/
  - http://wiki.apache.org/hadoop/