1
Hadoop Installation Fully Distributed Mode
Qianwen Ye
2
Before We Start
1. Create a few VM instances (Ubuntu is suggested)
2. Set proper security group constraints
3. Allow passphraseless SSH connections between them
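The transcript does not show how the passphraseless connection was set up. One common way can be sketched as below; the worker hostnames (slave1 to slave3) and the key path are assumptions, and ssh-copy-id only works if you already have some way to log in to each worker (on AWS that is typically the instance key pair).

```shell
# Hypothetical worker hostnames and key location; replace with your own.
WORKERS="${WORKERS:-slave1 slave2 slave3}"
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_rsa}"

setup_passphraseless_ssh() {
    # Create a key pair with an empty passphrase unless one already exists.
    [ -f "$KEY_FILE" ] || ssh-keygen -t rsa -N "" -f "$KEY_FILE"
    # Authorize the key locally, since the master also connects to itself.
    cat "$KEY_FILE.pub" >> "$HOME/.ssh/authorized_keys"
    chmod 600 "$HOME/.ssh/authorized_keys"
    # Authorize the same key on every worker.
    for host in $WORKERS; do
        ssh-copy-id -i "$KEY_FILE.pub" "$host"
    done
}
```

Run setup_passphraseless_ssh on the master node once the workers are reachable.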
3
Security Group Snapshot
Inbound Outbound
4
What I Have: 4 Ubuntu VMs on AWS
Passphraseless SSH connections are already set up
5
Overview
Change /etc/hosts file (optional)
Java installation
Hadoop environment configuration
6
Change Hosts File
In each VM's terminal, add the following content:
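The hosts entries were shown as an image and are not in the transcript. A minimal sketch for four nodes follows; the IP addresses and the hostnames master and slave1 to slave3 are placeholders, and the lines are written to a scratch file unless HOSTS_FILE is set.

```shell
# Scratch file by default so the sketch can be tried without root.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

# Hypothetical private IPs and hostnames; use the addresses of your own VMs.
cat >> "$HOSTS_FILE" <<'EOF'
10.0.0.10 master
10.0.0.11 slave1
10.0.0.12 slave2
10.0.0.13 slave3
EOF

cat "$HOSTS_FILE"
```

To apply the entries for real, append the same four lines to /etc/hosts with root privileges on every VM.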
7
Change Hosts File
The VMs can then connect to each other with the following command:
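The command itself is missing from the transcript; it was presumably a plain ssh to a hostname. As a sketch, the loop below tries every node by name and reports which ones answer without a password prompt. The hostnames are placeholders.

```shell
# Hypothetical hostnames taken from the hosts file of your cluster.
NODES="${NODES:-master slave1 slave2 slave3}"

check_nodes() {
    for host in $NODES; do
        # BatchMode makes ssh fail instead of asking for a password.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" hostname; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
        fi
    done
}
```

Calling check_nodes from any VM should print one "ok" line per node when both the hosts file and the SSH keys are in place.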
8
Install Java on each VM: install Java
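The install command is not preserved in the transcript. On Ubuntu, one plausible version uses apt; the package name openjdk-8-jdk is an assumption and depends on the Ubuntu release and the Hadoop version in use.

```shell
install_java() {
    # Refresh the package index, then install a JDK (assumed package name).
    sudo apt-get update
    sudo apt-get install -y openjdk-8-jdk
    # Confirm that the java binary is now on the PATH.
    java -version
}
```

Run install_java on every VM, master and workers alike.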
9
Install Java on each VM: configure JAVA_HOME
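The exact JAVA_HOME value from the slide is not available. One way to find it, sketched here, is to resolve the java binary and strip the trailing bin/java (or jre/bin/java) component; the helper name java_home_from_binary is made up for this example.

```shell
# Print the JDK directory for a resolved path to the java executable.
java_home_from_binary() {
    printf '%s\n' "$1" | sed -e 's:/jre/bin/java$::' -e 's:/bin/java$::'
}

# Only export JAVA_HOME when a java binary is actually installed.
if command -v java >/dev/null 2>&1; then
    JAVA_HOME="$(java_home_from_binary "$(readlink -f "$(command -v java)")")"
    export JAVA_HOME
    echo "JAVA_HOME=$JAVA_HOME"
fi
```

For example, java_home_from_binary /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java prints /usr/lib/jvm/java-8-openjdk-amd64.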
10
Download Hadoop: Master Node Only
Go to the Hadoop download page
Find the download link for the binary release
11
Download Hadoop: Master Node Only
Download and unzip it
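The release used in the slides is not stated. The sketch below assumes Hadoop 2.7.3 (a 2.x release, which matches the mapred-site.xml.template and slaves files mentioned later) and fetches it from the Apache archive; adjust HADOOP_VERSION to the release you actually want.

```shell
# Assumed version; the transcript does not name one.
HADOOP_VERSION="${HADOOP_VERSION:-2.7.3}"
HADOOP_TARBALL="hadoop-$HADOOP_VERSION.tar.gz"
HADOOP_URL="https://archive.apache.org/dist/hadoop/common/hadoop-$HADOOP_VERSION/$HADOOP_TARBALL"

download_hadoop() {
    # Fetch the binary tarball and unpack it into the home directory.
    wget "$HADOOP_URL"
    tar -xzf "$HADOOP_TARBALL" -C "$HOME"
}

echo "$HADOOP_URL"
```

Calling download_hadoop on the master node leaves the distribution in $HOME/hadoop-2.7.3 under these assumptions.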
12
Configure ~/.bash_profile
For all VMs:
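The profile lines from the slide are not in the transcript. A typical set, given here as an assumption, exports JAVA_HOME and HADOOP_HOME and puts Hadoop's bin and sbin directories on the PATH. Both paths are placeholders, and the lines go to a scratch file unless PROFILE_FILE is set.

```shell
# Scratch file by default; set PROFILE_FILE=~/.bash_profile to apply for real.
PROFILE_FILE="${PROFILE_FILE:-$(mktemp)}"

# The quoted heredoc keeps $HOME, $PATH and $HADOOP_HOME unexpanded in the file.
cat >> "$PROFILE_FILE" <<'EOF'
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=$HOME/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
EOF

cat "$PROFILE_FILE"
```

After editing the real ~/.bash_profile, reload it with `source ~/.bash_profile` or log in again.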
13
Configure Hadoop: Master Node Only
Hadoop’s directory
Files that need to be modified:
core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml
hadoop-env.sh
slaves, masters
14
core-site.xml
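The file contents were shown as an image. A minimal core-site.xml usually just points every node at the NameNode through fs.defaultFS; the hostname master and port 9000 below are assumptions. The file is written to a scratch directory unless CONF_DIR is set.

```shell
# Set CONF_DIR to $HADOOP_HOME/etc/hadoop to write the real file.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Hypothetical NameNode address: host "master", port 9000 -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
</configuration>
EOF

cat "$CONF_DIR/core-site.xml"
```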
15
hdfs-site.xml
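The slide's settings are not recoverable. As a sketch, hdfs-site.xml commonly sets the replication factor and the local directories used by the NameNode and DataNodes; the value 3 and the directory paths below are assumptions.

```shell
# Set CONF_DIR to $HADOOP_HOME/etc/hadoop to write the real file.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Number of copies kept of each block (assumed: one per worker) -->
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <!-- Hypothetical local storage paths -->
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/ubuntu/hadoop_data/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/ubuntu/hadoop_data/datanode</value>
  </property>
</configuration>
EOF

cat "$CONF_DIR/hdfs-site.xml"
```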
16
mapred-site.xml.template
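Hadoop 2.x ships only mapred-site.xml.template, while the framework reads mapred-site.xml, so the usual step is to create the latter. The sketch below writes a minimal version that runs MapReduce on YARN; whether the slide set anything beyond this property is unknown.

```shell
# Set CONF_DIR to $HADOOP_HOME/etc/hadoop to write the real file.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Run MapReduce jobs on YARN instead of the local runner -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

cat "$CONF_DIR/mapred-site.xml"
```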
17
yarn-site.xml
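The original yarn-site.xml is not in the transcript. A minimal sketch names the ResourceManager host and enables the shuffle service that MapReduce needs on each NodeManager; the hostname master is an assumption.

```shell
# Set CONF_DIR to $HADOOP_HOME/etc/hadoop to write the real file.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

cat > "$CONF_DIR/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Hypothetical host running the ResourceManager -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>master</value>
  </property>
  <!-- Auxiliary service used by the MapReduce shuffle phase -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF

cat "$CONF_DIR/yarn-site.xml"
```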
18
hadoop-env.sh
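The edit made on this slide is not shown; the usual change is to hard-code JAVA_HOME, because the daemons are started over SSH and may not inherit the login environment. The sketch assumes the file has a line beginning with `export JAVA_HOME=` and uses a placeholder JDK path. In the scratch directory a one-line stand-in file is created first.

```shell
# Set CONF_DIR to $HADOOP_HOME/etc/hadoop to edit the real file.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
ENV_FILE="$CONF_DIR/hadoop-env.sh"
# Hypothetical JDK location; use the JAVA_HOME of your own VMs.
JAVA_HOME_VALUE="${JAVA_HOME_VALUE:-/usr/lib/jvm/java-8-openjdk-amd64}"

# Stand-in for the shipped file when working in the scratch directory.
[ -f "$ENV_FILE" ] || echo 'export JAVA_HOME=${JAVA_HOME}' > "$ENV_FILE"

# Replace the JAVA_HOME line with an explicit path.
sed "s|^export JAVA_HOME=.*|export JAVA_HOME=$JAVA_HOME_VALUE|" "$ENV_FILE" > "$ENV_FILE.tmp"
mv "$ENV_FILE.tmp" "$ENV_FILE"

grep '^export JAVA_HOME=' "$ENV_FILE"
```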
19
Masters and slaves
Slaves
Master
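Both files are plain lists with one hostname per line; the slaves file names the hosts that run the DataNode and NodeManager daemons. The hostnames below are placeholders, since the slide images are not available.

```shell
# Set CONF_DIR to $HADOOP_HOME/etc/hadoop to write the real files.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# One hostname per line (hypothetical names).
printf '%s\n' master > "$CONF_DIR/masters"
printf '%s\n' slave1 slave2 slave3 > "$CONF_DIR/slaves"

echo "masters:"; cat "$CONF_DIR/masters"
echo "slaves:";  cat "$CONF_DIR/slaves"
```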
20
Send Hadoop to all other nodes
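The copy command is missing from the transcript. One simple approach, sketched here, is to scp the configured Hadoop directory from the master to the same location on every worker. The hostnames and directory are assumptions, and the sketch presumes the same user name and home path on all VMs.

```shell
# Hypothetical worker names and Hadoop location.
WORKERS="${WORKERS:-slave1 slave2 slave3}"
HADOOP_DIR="${HADOOP_DIR:-$HOME/hadoop-2.7.3}"

distribute_hadoop() {
    for host in $WORKERS; do
        # Copy the whole directory, configuration included.
        scp -r "$HADOOP_DIR" "${host}:${HOME}/"
    done
}
```

Run distribute_hadoop on the master after all configuration files have been edited.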
21
Format Namenode and Start Hadoop
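The commands on this slide were not captured. In Hadoop 2.x the sequence is typically the one sketched below, run on the master and assuming Hadoop's bin and sbin directories are on the PATH.

```shell
start_cluster() {
    # Initialize HDFS metadata. Do this once only: reformatting wipes the filesystem.
    hdfs namenode -format
    # Start the HDFS daemons (NameNode, DataNodes), then the YARN daemons.
    start-dfs.sh
    start-yarn.sh
}
```

The matching stop-yarn.sh and stop-dfs.sh scripts shut the cluster down again.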
22
Processes on Master node and Slave node
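The process listings were images. In a typical layout, `jps` on the master shows NameNode, SecondaryNameNode and ResourceManager, and on each worker DataNode and NodeManager; treat these lists as an assumption about this particular cluster. The helper below (a made-up name) reports which expected daemons are absent from a jps listing.

```shell
MASTER_DAEMONS="NameNode SecondaryNameNode ResourceManager"
WORKER_DAEMONS="DataNode NodeManager"

# missing_daemons "<jps output>" "<expected names>"
# Prints every expected daemon that does not appear in the jps output.
missing_daemons() {
    for d in $2; do
        # jps prints "<pid> <name>"; compare the name field exactly.
        printf '%s\n' "$1" |
            awk -v name="$d" '$2 == name { found = 1 } END { exit !found }' ||
            echo "$d"
    done
}

# Sample listing with one daemon deliberately left out.
sample='2101 NameNode
2345 ResourceManager
2400 Jps'
missing_daemons "$sample" "$MASTER_DAEMONS"   # prints SecondaryNameNode
```

On a live node, call it as `missing_daemons "$(jps)" "$MASTER_DAEMONS"` or with "$WORKER_DAEMONS"; no output means all expected daemons are running.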
23
Example: WordCount
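WordCount counts how often each word occurs in the input. The Java code from the slides is not in the transcript, so as a rough local analogue (not the Hadoop program itself) the same map, shuffle and reduce stages can be mimicked with a shell pipeline; the function name wordcount_local is made up for this sketch.

```shell
wordcount_local() {
    tr -s '[:space:]' '\n' |           # "map": emit one word per line
        sed '/^$/d' |                  # drop empty lines
        sort |                         # "shuffle": group identical words
        uniq -c |                      # "reduce": count each group
        awk '{ print $2 "\t" $1 }'     # word <TAB> count
}

printf 'hello world\nhello hadoop\n' | wordcount_local
```

For the two sample lines this prints hadoop 1, hello 2 and world 1, one tab-separated pair per line.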
24
WordCount: Map
25
WordCount: Reduce
26
WordCount: Main
27
Compile WordCount and make jar package
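The build commands are not preserved. One way to do it, assuming the source file is WordCount.java in the current directory and the hadoop command is on the PATH, is to compile against Hadoop's own classpath and package the classes; the jar name wc.jar is a placeholder.

```shell
build_wordcount() {
    mkdir -p classes
    # "hadoop classpath" prints the jars needed to compile against Hadoop.
    javac -classpath "$(hadoop classpath)" -d classes WordCount.java
    # Package everything under classes/ into the jar.
    jar cf wc.jar -C classes .
}
```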
28
Prepare Input
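The input used in the slides is unknown. As a sketch, any local text file can be uploaded to an HDFS directory; the directory name input is an assumption and resolves relative to the user's HDFS home directory.

```shell
# prepare_input <local-file>: upload one local text file to HDFS.
prepare_input() {
    hdfs dfs -mkdir -p input
    hdfs dfs -put "$1" input/
    hdfs dfs -ls input
}
```

For example, `prepare_input sample.txt` uploads a local file named sample.txt.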
29
Execute WordCount Program
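The launch command was not captured. With a jar named wc.jar, a main class WordCount in the default package, and HDFS directories input and output (all assumptions), the job would be submitted roughly like this.

```shell
run_wordcount() {
    # The output directory must not exist yet, or the job is rejected.
    hadoop jar wc.jar WordCount input output
}
```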
30
Check Result
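The result listing is not in the transcript. Assuming the job wrote to an HDFS directory named output, the reducer files can be listed and printed as sketched below.

```shell
show_result() {
    hdfs dfs -ls output
    # The quotes let HDFS, not the local shell, expand the pattern.
    hdfs dfs -cat 'output/part-r-*'
}
```

Each line of the result holds a word and its count separated by a tab.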
31
Thank you!