Three modes of Hadoop.


1 Three modes of Hadoop

2 Three modes of Hadoop
Standalone mode
There are no daemons running and everything runs in a single JVM. Standalone mode is suitable for running MapReduce programs during development, since it is easy to test and debug them.
Pseudo-distributed mode
The Hadoop daemons run on the local machine, thus simulating a cluster on a small scale.
Fully distributed mode
The Hadoop daemons run on a cluster of machines.

3 Configurations
To run Hadoop in a particular mode, you need to do two things: set the appropriate properties, and start the Hadoop daemons.

4 Standalone mode
In standalone mode, there is no further action to take, since the default properties are set for standalone mode and there are no daemons to run.

5 Pseudo-distributed mode
In pseudo-distributed mode, the configuration files should be created with the contents shown on the following slides and placed in the etc/hadoop directory. The configuration files are:
core-site.xml: common configuration
hdfs-site.xml: HDFS configuration
mapred-site.xml: MapReduce configuration
yarn-site.xml: YARN configuration

6 Pseudo-distributed mode: core-site.xml
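A minimal sketch of what core-site.xml typically contains for a single-node setup. It assumes the Hadoop 2.x property name fs.defaultFS and a NameNode on localhost at the default port. The target directory ./etc/hadoop and the HADOOP_CONF_DIR override are also assumptions; adjust them to your installation.

```shell
# Assumption: run from the Hadoop install directory, or set HADOOP_CONF_DIR.
# Note: this overwrites any existing core-site.xml in that directory.
conf_dir="${HADOOP_CONF_DIR:-./etc/hadoop}"
mkdir -p "$conf_dir"

# Point the default filesystem at an HDFS NameNode running on localhost.
cat > "$conf_dir/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost/</value>
  </property>
</configuration>
EOF
```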

7 Pseudo-distributed mode: hdfs-site.xml
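A minimal sketch of hdfs-site.xml for a single-node setup, assuming the standard dfs.replication property. With only one DataNode, HDFS cannot satisfy the default replication factor of 3, so it is lowered to 1. The ./etc/hadoop path is an assumption.

```shell
# Assumption: run from the Hadoop install directory, or set HADOOP_CONF_DIR.
# Note: this overwrites any existing hdfs-site.xml in that directory.
conf_dir="${HADOOP_CONF_DIR:-./etc/hadoop}"
mkdir -p "$conf_dir"

# Keep a single replica of each block, since there is only one DataNode.
cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
```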

8 Pseudo-distributed mode: mapred-site.xml
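A minimal sketch of mapred-site.xml, assuming Hadoop 2.x, where MapReduce jobs are submitted to YARN by setting mapreduce.framework.name. The ./etc/hadoop path is an assumption.

```shell
# Assumption: run from the Hadoop install directory, or set HADOOP_CONF_DIR.
# Note: this overwrites any existing mapred-site.xml in that directory.
conf_dir="${HADOOP_CONF_DIR:-./etc/hadoop}"
mkdir -p "$conf_dir"

# Run MapReduce jobs on YARN rather than with the in-process local runner.
cat > "$conf_dir/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
```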

9 Pseudo-distributed mode: yarn-site.xml
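A minimal sketch of yarn-site.xml for a single-node setup, assuming Hadoop 2.x property names: the ResourceManager runs on localhost, and the NodeManager enables the shuffle auxiliary service that MapReduce needs. The ./etc/hadoop path is an assumption.

```shell
# Assumption: run from the Hadoop install directory, or set HADOOP_CONF_DIR.
# Note: this overwrites any existing yarn-site.xml in that directory.
conf_dir="${HADOOP_CONF_DIR:-./etc/hadoop}"
mkdir -p "$conf_dir"

# ResourceManager on localhost; NodeManager serves MapReduce shuffle data.
cat > "$conf_dir/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>localhost</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF
```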

10 Pseudo-distributed mode
Hadoop doesn't actually distinguish between pseudo-distributed and fully distributed modes. Pseudo-distributed mode is just a special case of fully distributed mode in which the only host is localhost. The start scripts use SSH to launch the daemons on each host, so we need to make sure that we can SSH to localhost without having to enter a password.

11 Pseudo-distributed mode
Install SSH:
$ sudo apt-get install ssh
Generate an SSH key pair with an empty passphrase:
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Append the public key to the authorized_keys file:
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Test that you can SSH to localhost:
$ ssh localhost
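The passwordless login can also be checked non-interactively. The sketch below uses OpenSSH's BatchMode option, which makes ssh fail rather than prompt for a password. The function name check_passwordless_ssh is hypothetical. Note that BatchMode also fails if the host key for localhost has not been accepted yet, so run `ssh localhost` interactively once first.

```shell
# check_passwordless_ssh: succeeds only if ssh to the given host (default:
# localhost) works without any prompt. -n keeps ssh from reading stdin.
check_passwordless_ssh() {
  ssh -n -o BatchMode=yes -o ConnectTimeout=5 "${1:-localhost}" true 2>/dev/null
}

if check_passwordless_ssh localhost; then
  echo "passwordless SSH to localhost works"
else
  echo "passwordless SSH to localhost failed; check ~/.ssh/authorized_keys and that sshd is running"
fi
```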

12 Pseudo-distributed mode: Format namenode
Before HDFS can be used for the first time, the filesystem must be formatted. This is done by running the following command:
$ hdfs namenode -format
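Formatting creates a new, empty filesystem, so running it again on an existing installation discards the NameNode's metadata. The sketch below is one possible guard, not part of Hadoop itself: it checks for the current/VERSION file that a formatted NameNode directory contains. The NAME_DIR variable and the default path are assumptions based on the default hadoop.tmp.dir; if you set dfs.namenode.name.dir, use that directory instead.

```shell
# Assumption: NameNode metadata lives under the default hadoop.tmp.dir
# (/tmp/hadoop-<user>); override with NAME_DIR if configured differently.
name_dir="${NAME_DIR:-/tmp/hadoop-${USER:-$(id -un)}/dfs/name}"

# namenode_is_formatted: true if the directory holds NameNode metadata.
namenode_is_formatted() {
  [ -f "$1/current/VERSION" ]
}

# Only report what to do; the format command itself is left to the user.
if namenode_is_formatted "$name_dir"; then
  echo "NameNode already formatted at $name_dir; skipping format"
else
  echo "No existing metadata at $name_dir; run: hdfs namenode -format"
fi
```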

13 Pseudo-distributed mode
Starting the daemons
Start the HDFS daemons:
$ sbin/start-dfs.sh
Start the YARN daemons:
$ sbin/start-yarn.sh
Start the MapReduce job history server:
$ sbin/mr-jobhistory-daemon.sh start historyserver
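To confirm that the daemons came up, the JDK's jps tool lists running Java processes as "pid ClassName" lines. The sketch below compares such output against the daemon names expected in a Hadoop 2.x pseudo-distributed setup. The helper missing_daemons is hypothetical, and the sample output is simulated so that the example runs without a Hadoop installation.

```shell
# Daemons expected after the three start commands (Hadoop 2.x names).
expected="NameNode DataNode SecondaryNameNode ResourceManager NodeManager JobHistoryServer"

# missing_daemons: reads jps-style output on stdin and prints the name of
# each expected daemon that does not appear in it, one per line.
missing_daemons() {
  running=$(awk '{print $2}')
  for d in $expected; do
    # -x matches whole lines, so NameNode does not match SecondaryNameNode.
    printf '%s\n' "$running" | grep -qx "$d" || printf '%s\n' "$d"
  done
}

# Real usage would be: jps | missing_daemons
# Simulated jps output in which the job history server was not started.
sample='4201 NameNode
4312 DataNode
4498 SecondaryNameNode
4650 ResourceManager
4761 NodeManager
5020 Jps'
missing=$(printf '%s\n' "$sample" | missing_daemons)
echo "missing: ${missing:-none}"
```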

14 Pseudo-distributed mode
Stopping the daemons
Stop the MapReduce job history server:
$ sbin/mr-jobhistory-daemon.sh stop historyserver
Stop the YARN daemons:
$ sbin/stop-yarn.sh
Stop the HDFS daemons:
$ sbin/stop-dfs.sh

15 Fully Distributed Mode
Here are some detailed tutorials on installing Hadoop in fully distributed mode. Together with the explanation of pseudo-distributed mode above, these materials can help you configure Hadoop in fully distributed mode.
Video tutorial:
Written tutorial:

16 Reference
Tom White, Hadoop: The Definitive Guide, O'Reilly Media.

