Hadoop Architecture Mr. Sriram


1 Hadoop Architecture Mr. Sriram

2 Objectives
Hadoop Cluster – A typical usage
Hadoop 2.X Cluster Architecture
Analyze Hadoop 2.X Cluster Architecture - Federation
Analyze Hadoop 2.X Cluster Architecture - High Availability
Hadoop 2.X Resource Management
Run Hadoop in different Cluster modes
Installation & Configuration of Hadoop
Implement basic Hadoop commands on terminal
Prepare Hadoop 2.X configuration files and analyze the parameters in them
Implement different loading techniques

3 Hadoop Architecture
Hadoop Cluster Typical Use
Hadoop 2.X Cluster Architecture
Hadoop 2.X Cluster Architecture – Federation
Hadoop 2.X Cluster Architecture – High Availability
Hadoop 2.X Resource Management
Hadoop Cluster – Facebook
Hadoop Cluster Modes

4 Hadoop Cluster – A Typical Use Case

5 Hadoop 1.0 Cluster

6 Hadoop 2.0 Cluster

7 Hadoop 2.X Cluster Master-Slave Architecture

8 Hadoop 2.X Cluster Architecture

9 Hadoop 2.X Cluster Architecture - Federation

10 Hadoop 2.X Cluster Architecture – High Availability

11 Hadoop 2.X Resource Management

12 Hadoop 2.X Resource Management..

13 Hadoop Cluster - Facebook

14 Hadoop Cluster Modes

15 Hadoop Installation & Configuration
Hadoop FS Shell Commands
Terminal Commands
Hadoop 2.X Configuration Files
-> core-site.xml
-> hdfs-site.xml
-> mapred-site.xml
-> yarn-site.xml
Slaves & Masters
Per-Process Run Time Environment
Hadoop Daemons
Hadoop Web UI Parts

16 Hadoop Installation Pre-requisites
Install Java (version 6 or later)
Use java -version to check for a Java installation
Use which java to locate the Java directory
Hadoop runs on Unix and Windows
Linux is the only supported production platform
Windows and Mac are supported only as development platforms
Windows additionally requires Cygwin to run
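The two Java checks above can be combined into one short script. This is a minimal sketch, assuming a POSIX shell; the function name check_java is illustrative and is not part of Hadoop.

```shell
#!/bin/sh
# Report whether a Java runtime is installed and where it lives.
# check_java is a hypothetical helper name, not a Hadoop tool.
check_java() {
    if command -v java >/dev/null 2>&1; then
        echo "java found at: $(command -v java)"
        # java -version writes to stderr, so redirect it to stdout
        java -version 2>&1 | head -n 1
    else
        echo "java not found: install a JDK before installing Hadoop"
        return 1
    fi
}

# Run the check; "|| true" keeps the script going when Java is absent
check_java || true
```

The script only reports what it finds; comparing the reported version against the "6 or later" requirement is left to the reader.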

17 Hadoop Installation
We can install Hadoop in any of the following ways:
1) Automated method using Cloudera Manager
2) Manual methods described below:
   i. Install from a CDH5 tarball
   ii. Install from RPMs

18 Hadoop Installation from a Tarball
Downloading the tarball
Download a stable Hadoop release from the Cloudera Hadoop release page
Unpacking the tarball
Unpack the Hadoop archive to /home/$USER/ using
$ tar xzf hadoop-x.y.z.tar.gz
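The unpacking step can be wrapped in a small function that checks the archive exists before extracting. This is a sketch only: unpack_hadoop is a hypothetical helper name, and the tarball path and target directory are supplied by the caller.

```shell
#!/bin/sh
# Unpack a Hadoop release tarball into a target directory.
# unpack_hadoop is a hypothetical helper; pass the real tarball path.
unpack_hadoop() {
    tarball=$1
    dest=${2:-"$HOME"}   # default target is the home directory
    if [ ! -f "$tarball" ]; then
        echo "tarball not found: $tarball" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    # x = extract, z = gunzip, f = read from the named file
    tar xzf "$tarball" -C "$dest" || return 1
    # Print the first archive entry, usually the top-level install folder
    tar tzf "$tarball" | head -n 1
}

# Example call, with the placeholder file name used on the slide:
# unpack_hadoop hadoop-x.y.z.tar.gz "/home/$USER"
```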

19 Hadoop Installation from a Tarball
Setting Environment Variables
Edit conf/hadoop-env.sh in the Hadoop folder
Specify the JAVA_HOME variable by adding
export JAVA_HOME=/usr/java/
Edit the .profile file using any text editor and set HADOOP_HOME, JAVA_HOME and the necessary CLASSPATHs
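As an illustration, the .profile additions might look like the lines below. The JAVA_HOME value is the one from the slide; the HADOOP_HOME path is an assumption based on unpacking hadoop-x.y.z.tar.gz into the home directory, and the CLASSPATH entries are omitted because the slides do not list them.

```shell
# Example ~/.profile additions (paths are assumptions; adjust to your system)
export JAVA_HOME=/usr/java/               # value from the slide; point it at your JDK
export HADOOP_HOME="$HOME/hadoop-x.y.z"   # assumed folder created by unpacking the tarball
export PATH="$PATH:$HADOOP_HOME/bin"      # makes the hadoop command available
```

After saving the file, open a new login shell or run ". ~/.profile" so the variables take effect.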

20 Hadoop Installation from RPM
Download the CDH package that matches your Red Hat or CentOS system from the Cloudera one-click install page: archive.cloudera.com/cdh4/one-click-install/redhat
Install the RPM:
$ sudo yum --nogpgcheck localinstall cloudera-cdh-4-0.x86_64.rpm
Optionally add a repository key:
$ sudo rpm --import http://archive.cloudera.com/cdh4/redhat/5/x86_64/cdh/RPM
Install Hadoop in pseudo-distributed mode:
$ sudo yum install hadoop-0.20-conf-pseudo

21 Hadoop Installation from Cloudera Manager
1. Download and run the Cloudera Manager Installer
Download cloudera-manager-installer.bin from the Cloudera Downloads page
2. After downloading cloudera-manager-installer.bin, give it executable permission
$ chmod u+x cloudera-manager-installer.bin
3. Run cloudera-manager-installer.bin
$ sudo ./cloudera-manager-installer.bin
Read the Cloudera Manager Readme and then press Enter to choose Next

22 Hadoop Installation from Cloudera Manager
4. Start the Cloudera Manager Admin Console
Log into Cloudera Manager. The default credentials are:
Username: admin
Password: admin
Use Cloudera Manager for automated CDH installation and configuration
Find the cluster hosts by specifying hostnames and IP address ranges
Click Search
Cloudera Manager identifies the hosts on your cluster so that you can configure them for CDH
Choose the CDH version to install

23 Hadoop FS Shell Commands

24 Terminal Commands

25 Terminal Commands – mkdir, touchz, ls, count
Terminal type: admin terminal # | user terminal $
To make a directory
$ hadoop fs -mkdir /user/cloudera/Monday
To create an empty file
$ hadoop fs -touchz /user/cloudera/Monday/one.txt
To list the files and directories present in an HDFS location
$ hadoop fs -ls /user/cloudera/Monday
To count the number of files and directories in an HDFS location
$ hadoop fs -count /user/cloudera/Monday

26 Terminal Commands - Copy
To copy a file from the local file system (LFS) to HDFS
$ hadoop fs -put /home/cloudera/Desktop/two.txt /user/cloudera/Monday
(or)
$ hadoop fs -copyFromLocal /home/cloudera/Desktop/three.txt /user/cloudera/Monday
To copy a file from HDFS to the LFS
$ hadoop fs -get /user/cloudera/Monday/two.txt /home/cloudera/Desktop/Tuesday
(or)
$ hadoop fs -copyToLocal /user/cloudera/Monday/one.txt /home/cloudera/Desktop/Tuesday

27 Terminal Commands – cat, rm
To print the contents of an HDFS file:
$ hadoop fs -cat /user/cloudera/Monday/two.txt
or
$ hadoop fs -text /user/cloudera/Monday/two.txt
To remove a directory from an HDFS location
$ hadoop fs -rm -r /user/cloudera/Monday
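The individual commands from the last three slides can be chained into one round trip. The sketch below assumes the hadoop command is on the PATH and a cluster is running; hdfs_round_trip is a hypothetical helper name, and both paths come from the caller.

```shell
#!/bin/sh
# Round trip: create an HDFS directory, upload a local file, list it, print it.
# hdfs_round_trip is a hypothetical helper; both paths are supplied by the caller.
hdfs_round_trip() {
    local_file=$1
    hdfs_dir=$2
    name=$(basename "$local_file")
    # Stop at the first failing step so later commands do not run on bad state
    hadoop fs -mkdir "$hdfs_dir" || return 1
    hadoop fs -put "$local_file" "$hdfs_dir" || return 1
    hadoop fs -ls "$hdfs_dir" || return 1
    hadoop fs -cat "$hdfs_dir/$name"
}

# Example call with the paths from the slides (needs a running cluster):
# hdfs_round_trip /home/cloudera/Desktop/two.txt /user/cloudera/Monday
```

On Hadoop 2.x, -mkdir fails when the parent directory is missing or the directory already exists; adding -p avoids both cases.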

28 Hadoop Configuration Files
Each component in Hadoop is configured using an XML file
These XML files are located in the conf subdirectory of the Hadoop folder
The three most important XML files are:
core-site.xml - Core properties
hdfs-site.xml - HDFS properties
mapred-site.xml - MapReduce properties
To run Hadoop in a particular mode, you need to do two things:
Set the appropriate properties in the configuration files
Start the Hadoop daemons
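To show what "setting the appropriate properties" can look like, the sketch below writes minimal single-node (pseudo-distributed) versions of the three files into a scratch directory. The property names fs.defaultFS, dfs.replication and mapreduce.framework.name are the Hadoop 2.x names; the CONF_DIR location and the localhost:8020 address are assumptions for illustration, not values from the slides.

```shell
#!/bin/sh
# Write minimal pseudo-distributed settings into a scratch directory.
# CONF_DIR and the localhost:8020 address are assumptions for illustration.
CONF_DIR=${CONF_DIR:-./conf-example}
mkdir -p "$CONF_DIR"

# core-site.xml: address of the default file system (the NameNode)
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
EOF

# hdfs-site.xml: a single node can hold only one replica of each block
cat > "$CONF_DIR/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

# mapred-site.xml: run MapReduce jobs on YARN (Hadoop 2.x)
cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

echo "wrote example configuration to $CONF_DIR"
```

The files are written to a scratch directory on purpose, so nothing in an existing installation is overwritten; copy them into the real configuration directory only after reviewing the values.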

29 Hadoop 2.x Configuration Files

30 Hadoop 2.x Configuration Files – Apache Hadoop

31 core-site.xml

32 hdfs-site.xml

33 mapred-site.xml

34 yarn-site.xml

35 Slaves & Masters

36 Per-Process Run Time Environment

37 All Properties

38 Running Hadoop
Before Hadoop can be used, a brand-new HDFS installation needs to be formatted
To format the NameNode:
$ hadoop namenode -format
To start the HDFS and MapReduce daemons:
$ start-dfs.sh
$ start-mapred.sh
Or use $ start-all.sh to start all daemons
If you have placed the configuration files outside the default conf directory, start the daemons with the --config option:
$ start-xyz.sh --config path-to-config-directory
To stop the daemons:
$ stop-dfs.sh
$ stop-mapred.sh
Or use $ stop-all.sh to stop all daemons
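After the start scripts have run, the JDK tool jps lists the running Java processes, which gives a quick way to confirm that the daemons came up. The sketch below assumes the Hadoop 1.x daemon set that start-dfs.sh and start-mapred.sh launch (on a YARN cluster the last two names would be ResourceManager and NodeManager); missing_daemons is a hypothetical helper name.

```shell
#!/bin/sh
# Print each expected daemon that is absent from jps-style output on stdin.
# missing_daemons is a hypothetical helper; the names are the Hadoop 1.x
# daemons started by start-dfs.sh and start-mapred.sh.
missing_daemons() {
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # jps prints "<pid> <name>"; the leading space and end-of-line anchor
        # stop NameNode from matching inside SecondaryNameNode
        if ! printf '%s\n' "$jps_output" | grep -q " $daemon\$"; then
            echo "$daemon"
        fi
    done
}

# Typical use on a running node:
# jps | missing_daemons
```

Empty output means every expected daemon was found; any name printed is a daemon to look for in the logs.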

39 Hadoop 2.x Daemons

40 Hadoop Daemons

41 Hadoop Web UI Parts

42 Hadoop Web UI URLs

43 Hadoop Stack

44 Data Loading Techniques and Data Analysis

45 Data Loading using Flume

46 Data Loading using SQOOP

47 Further Reading

48 Further Reading..

49 Thank You !!!!!!!!!!!

