By: Joel Dominic and Carroll Wongchote 4/18/2012.


1 By: Joel Dominic and Carroll Wongchote 4/18/2012

2
- Cloud Computing
- Hadoop
- Fault Tolerance
- Mishaps
- Solutions
- Techniques
- Results

3 The Cloud: many computers working together to solve a problem

4 [Diagram labels: Big Problem, Smaller Problem]

5
- Software framework for distributed computing
- Written in Java
- Two components: HDFS and MapReduce
- Apache software project
- Mimics Google File System and Google MapReduce
- Used for processing large amounts of text data
  - e.g. logs, web pages, etc.

6 Hadoop Distributed File System (source: http://hadoop.apache.org/core/docs/current/hdfs_design.html)

7
- Built on 2 functional programming paradigms
  - map
  - reduce
- map
  - map +2 [1, 2, 3, 4, 5, 6] → [(1+2), (2+2), (3+2), (4+2), (5+2), (6+2)] = [3, 4, 5, 6, 7, 8]
- reduce
  - reduce + [3, 4, 5, 6, 7, 8] → (3 + 4 + 5 + 6 + 7 + 8) = 33
  - reduce * [3, 4, 5, 6, 7, 8] → (3 * 4 * 5 * 6 * 7 * 8) = 20160
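The map and reduce steps on this slide can be reproduced in a few lines of Python. This is only an illustration of the two functional primitives using the standard library; it is not Hadoop code, and the variable names are chosen here for the example.

```python
from functools import reduce
import operator

numbers = [1, 2, 3, 4, 5, 6]

# map: apply "+2" to every element independently of the others
plus_two = list(map(lambda x: x + 2, numbers))

# reduce: fold the mapped list into a single value with a binary operator
total = reduce(operator.add, plus_two)
product = reduce(operator.mul, plus_two)

print(plus_two)  # [3, 4, 5, 6, 7, 8]
print(total)     # 33
print(product)   # 20160
```

Because each map call touches only one element, the map step can be spread across many machines; only the reduce step has to bring results together.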

8 [Diagram: Objects → Mappers → Result → Reducer → Final Result]
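The mapper-to-reducer flow in this diagram can be sketched as a single-process word count. This is a simplified model and not the Hadoop API: the function names (`mapper`, `shuffle`, `reducer`, `word_count`) are hypothetical, and the grouping step that Hadoop performs between the two phases is written out by hand here.

```python
from collections import defaultdict

def mapper(line):
    """Emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group values by key, the step that sits between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(word, counts):
    """Collapse all the values collected for one key into a single result."""
    return word, sum(counts)

def word_count(lines):
    mapped = [pair for line in lines for pair in mapper(line)]
    return dict(reducer(word, counts) for word, counts in shuffle(mapped).items())

counts = word_count(["the cloud", "The big problem", "the smaller problem"])
print(counts["the"], counts["problem"])  # 3 2
```

In a real cluster each mapper would run on a different node against its own slice of the input, and the reducers would receive the grouped pairs over the network.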

9
- Facebook
  - “A 1100-machine cluster with 8800 cores and about 12 PB raw storage.”
  - “A 300-machine cluster with 2400 cores and about 3 PB raw storage.”
- Yahoo!
  - “More than 100,000 CPUs in >40,000 computers running Hadoop”
  - “Our biggest cluster: 4500 nodes (2*4cpu boxes w 4*1TB disk & 16GB RAM)”

10
- What is fault tolerance?
- Examples of fault-tolerant systems
  - Brake system in cars
  - Columns on a patio

11
- Hadoop was built with fault tolerance in mind
- Failures happen
- Don’t worry about failures; just replicate data or processes
- Hadoop works at the application layer to handle failures
Image source: http://en.wikipedia.org/wiki/File:Computer_abstraction_layers.svg
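The idea of surviving failures through replication can be sketched with a toy block-placement model. This is an assumption-laden illustration: the round-robin placement below is not HDFS's actual placement policy, the node and block names are hypothetical, and the default of three replicas mirrors HDFS's default replication factor.

```python
def place_blocks(blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Assumes replication <= len(nodes); real HDFS placement is rack-aware.
    """
    placement = {}
    for i, block in enumerate(blocks):
        placement[block] = {nodes[(i + r) % len(nodes)] for r in range(replication)}
    return placement

def readable_blocks(placement, failed):
    """Return the blocks that still have at least one replica on a live node."""
    down = set(failed)
    return {block for block, holders in placement.items() if holders - down}

nodes = ["slave1", "slave2", "slave3", "slave4"]        # hypothetical names
blocks = ["blk_0", "blk_1", "blk_2", "blk_3", "blk_4"]  # hypothetical names

placement = place_blocks(blocks, nodes)
survivors = readable_blocks(placement, failed=["slave2", "slave3"])
print(survivors == set(blocks))  # True: every block survives two node failures
```

With a single replica per block, losing one node would make some blocks unreadable, which is why replication rather than failure prevention carries the fault tolerance.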

12

13
- Topology
- Machine Specifications
- Methods
- Physical computers
- Virtualized computers
- All in the same room
- Manually installing the software (OS, Hadoop, etc.) on each physical machine

14
- 4 Virtual Machines
  - 3GHz single-core processors, 512MB RAM, 8GB HDD
- 7 Physical Machines
  - Dell (2)
    - 3GHz dual-core processor, 2GB RAM, 160GB HDD
    - 3.4GHz single-core processor, 1GB RAM, 120GB HDD
  - Lenovo (5)
    - 2.4GHz dual-core processor, 2GB RAM, 250GB HDD
- Running Ubuntu 10.04 LTS
- Sun Java 6 JDK
- Hadoop 0.20

15 [Diagram labels: Slave Node, Master Node]

16

17
- Campus blocking ports
- MapReduce WARN: Attempt failure
- MapReduce WARN: Connection failure
- MapReduce job not completing
- Virtualization
  - Copying machines
  - Connecting to the network

18

19
- Campus blocking ports
  - Moved from campus network to private network
- MapReduce WARN: Attempt failure
- MapReduce WARN: Connection failure
- MapReduce job not completing
  - Both solved by editing the /etc/hosts file
  - /etc/hosts maps hostnames to IP addresses on the local machine
- Virtualization
  - Solved with determination
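Since the fix came down to hostname resolution, a quick check that every node's name resolves can catch a bad /etc/hosts before a job is submitted. Each line of /etc/hosts pairs an IP address with one or more hostnames. The sketch below is not from the slides: the function name and the node names `master` and `slave1` are hypothetical, and the `resolve` parameter exists only so the check can be exercised without a real network.

```python
import socket

def unresolved_hosts(hostnames, resolve=socket.gethostbyname):
    """Return the hostnames that the local resolver cannot map to an address."""
    missing = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing

# Hypothetical usage on a cluster node:
#   unresolved_hosts(["master", "slave1"])
# An empty list means every name resolved; anything returned needs an
# entry in /etc/hosts (or DNS) on that machine.
```

Running the same check on every node matters because each machine consults its own /etc/hosts.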

20
- Downloaded 164 books from gutenberg.org
  - ~200MB of text data
- Ran a word count on the books with all nodes active
  - Control group
- Ran the same program with different times and percentages of failures
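The slides do not say how nodes were chosen or taken down for the failure runs. One possible way to pick a reproducible set of nodes for a given failure percentage is sketched below; the function name, the node names, and the use of a fixed random seed are all assumptions made for this example.

```python
import random

def pick_failures(nodes, percent, seed=None):
    """Choose which nodes to take down for a run with `percent`% failures."""
    count = round(len(nodes) * percent / 100)
    # A seeded generator makes the same run repeatable later.
    return sorted(random.Random(seed).sample(nodes, count))

slaves = [f"slave{i}" for i in range(1, 11)]  # hypothetical node names
failed = pick_failures(slaves, percent=30, seed=1)
print(len(failed))  # 3
```

Timing each word-count run against the all-nodes-active control then shows how much the failures cost.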

21

22

23

24

25
- Increase in networking skills
- Strong Unix skills
- Basic scripting
- Network troubleshooting
- Virtualization experience
- Installing operating systems (~30+)
- Understanding of Hadoop and fault tolerance
- Programming routers

26
- Cloud Computing
- Hadoop
- Fault Tolerance
- Mishaps
- Solutions
- Techniques
- Results

27
- http://wiki.apache.org/hadoop/PoweredBy
- http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
- http://www.gutenberg.org/
- http://hadoop.apache.org/
- http://stackoverflow.com/
- http://mail-archives.apache.org/
- http://www.ubuntu.com/

