Published by Shon Burns. Modified over 9 years ago.
1
GROUP 7: TOOLS FOR BIG DATA
Sandeep Prasad, Dipojjwal Ray
2
Objectives...
Apache Hadoop
Successful installation of Apache Hadoop v1.0.3 and v1.0.4
Word count with Hadoop MapReduce
Estimating the value of 'Pi' with Hadoop MapReduce
MapReduce and HDFS
3
Apache Hadoop...
High-Availability Distributed Object-Oriented Platform
Open source
Run here as a pseudo-distributed single-node cluster
Began as a part of the Apache Lucene project; now a top-level Apache project
Handles petabytes of data
4
Installation of Hadoop v1.0.3 & v1.0.4...
Release date v1.0.3: May 16, 2012
Release date v1.0.4: October 12, 2012
OS: Ubuntu v12.04
Prerequisites: Sun Java, hduser
Configuration
5
Examples...
WordCount example:
$ /bin/hadoop jar hadoop-1.0.3-examples.jar wordcount file01.txt
(the wordcount example also expects an output directory as a second argument)
Estimation of 'Pi':
$ /bin/hadoop jar hadoop-1.0.3-examples.jar pi (x) (y)
x = number of maps, y = samples per map
Runtime: 2.25 seconds (x = 10, y = 100)
Estimated value: 3.1480000000000
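The 'Pi' job above can be pictured as many small map tasks that each count how many sample points land inside a circle, followed by one reduce step that combines the counts. The following is a minimal Python sketch of that idea, not Hadoop's actual code: the function names (`halton`, `pi_map`, `pi_reduce`, `estimate_pi`) are hypothetical, and the use of a Halton sequence in bases 2 and 3 to generate points is an assumption made for illustration, so the printed value need not match the figure reported on the slide.

```python
import math


def halton(index, base):
    """Return the index-th element of the Halton sequence in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def pi_map(map_id, samples_per_map):
    """Map step: count points in this map's slice inside/outside the circle."""
    inside = 0
    start = map_id * samples_per_map  # each map works on its own index range
    for i in range(start + 1, start + samples_per_map + 1):
        # Shift the unit square so the circle of radius 0.5 is centred at 0.
        x = halton(i, 2) - 0.5
        y = halton(i, 3) - 0.5
        if x * x + y * y <= 0.25:
            inside += 1
    return inside, samples_per_map - inside


def pi_reduce(partials):
    """Reduce step: sum the per-map counts and scale the ratio by 4."""
    total_inside = sum(p[0] for p in partials)
    total = sum(p[0] + p[1] for p in partials)
    return 4.0 * total_inside / total


def estimate_pi(num_maps, samples_per_map):
    """Run all map steps sequentially, then the single reduce step."""
    partials = [pi_map(m, samples_per_map) for m in range(num_maps)]
    return pi_reduce(partials)


if __name__ == "__main__":
    est = estimate_pi(10, 100)  # x = 10 maps, y = 100 samples per map
    print(est, abs(est - math.pi))
```

Because each map task only needs its own index range, the map steps are independent of each other; this is what lets a cluster run them in parallel, while the sketch simply runs them in a loop.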
6
MapReduce & HDFS...
Divide-and-conquer approach
The Map() and Reduce() functions have their roots in functional programming
JobTracker and TaskTracker (MapReduce daemons)
NameNode and DataNode (HDFS daemons)
Hadoop Distributed File System
Java framework
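The functional-programming roots of Map() and Reduce() are easiest to see on word count. The sketch below is plain Python that runs in a single process, not Hadoop API code; the helper names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) are hypothetical. The grouping step stands in for the shuffle that the framework performs between the map and reduce phases.

```python
from collections import defaultdict
from functools import reduce


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """Reduce step: fold the list of ones for a word into a single total."""
    return word, reduce(lambda a, b: a + b, counts, 0)


def word_count(lines):
    """Chain map, shuffle and reduce over a list of input lines."""
    mapped = [pair for line in lines for pair in map_words(line)]
    grouped = shuffle(mapped)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["Hello World Bye World", "Hello Hadoop Goodbye Hadoop"]
    print(word_count(sample))
```

Each call to `map_words` depends only on its own line and each call to `reduce_counts` only on its own key, which is the divide-and-conquer property that allows the work to be spread across many nodes.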
7
References...
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster
http://lintool.github.io/Cloud9/
Data-Intensive Text Processing with MapReduce, book by Jimmy Lin and Chris Dyer
http://hadoop.apache.org/releases.html
http://www.apache.org/dyn/closer.cgi/hadoop/co
8
THANK YOU
9
Hadoop: framework written in Java
HDFS: highly fault-tolerant distributed file system
The JobTracker web UI provides general job statistics for the Hadoop cluster, lists of running, completed and failed jobs, and a job history log file
The TaskTracker web UI shows running and non-running tasks