Sky Agile Horizons Hadoop at Sky. What is Hadoop? - Reliable, Scalable, Distributed Where did it come from? - Community + Yahoo! Where is it now? - Apache.

Presentation transcript:

1 Sky Agile Horizons Hadoop at Sky

2 Overview
- What is Hadoop? Reliable, Scalable, Distributed
- Where did it come from? Community + Yahoo!
- Where is it now? Apache Software Foundation
- Why is it called “Hadoop”?

3 Who is using it?
To name just a few…

4 Is it “production” ready?
This screengrab is from one of the Hadoop clusters at Facebook (May 2010)

5 So, what does it give you?

6 Just two things...
Distributed Filesystem (HDFS)
- Name Node
- Data Node(s)
Distributed Processing Infrastructure
- Job Tracker
- Task Tracker(s)

7 HDFS - Overview
Blocks
- 64MB chunks (configurable)
WORM (Write once, read many)
- NO EDITS
- NO APPENDS
Replication
- 3 copies
- direct
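To make the block and replication figures on this slide concrete, here is a minimal arithmetic sketch in Python. The function name `hdfs_footprint` and its parameters are hypothetical, chosen only for illustration; the defaults (64MB blocks, 3 copies) are the ones the slide quotes. It assumes the final block of a file only occupies as much space as the data it holds.

```python
MB = 1024 * 1024

def hdfs_footprint(file_size_bytes, block_size_bytes=64 * MB, replication=3):
    """Return (block_count, replica_count, raw_bytes) for a single file.

    Illustrative arithmetic only; this is not a Hadoop API.
    """
    if file_size_bytes < 0 or block_size_bytes <= 0 or replication < 1:
        raise ValueError("sizes must be non-negative and replication >= 1")
    # Ceiling division: a partial final block still counts as one block.
    block_count = -(-file_size_bytes // block_size_bytes)
    # Every block is stored `replication` times across the Data Nodes.
    replica_count = block_count * replication
    # Assumption: the last block is not padded, so raw usage tracks file size.
    raw_bytes = file_size_bytes * replication
    return block_count, replica_count, raw_bytes

blocks, replicas, raw = hdfs_footprint(200 * MB)
print(blocks, replicas, raw // MB)  # prints: 4 12 600
```

A 200MB file therefore becomes 4 blocks, 12 stored block replicas, and roughly 600MB of raw disk usage across the cluster.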

8 HDFS - Read

9 HDFS - Write

10 Distributed Processing
Slots
- X mapper slots, Y reducer slots (per node)
Jobs
- Queued
- Prioritised
Tasks
- Data-aware
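The idea of "data-aware" tasks and per-node slots can be sketched as a toy scheduler in Python. This is not the Job Tracker's actual algorithm; `assign_tasks`, the node names and the block names are all hypothetical. It assumes a simple greedy rule: run a task on a node that already holds the block if that node has a free slot, otherwise on any node with a free slot, otherwise leave the task queued.

```python
def assign_tasks(block_locations, free_slots):
    """Greedy, data-aware placement of one map task per block.

    block_locations: dict of block id -> list of nodes holding a replica
    free_slots:      dict of node -> number of free mapper slots
    Returns dict of block id -> (node, is_local), or None if the task must wait.
    """
    slots = dict(free_slots)  # copy so the caller's dict is not modified
    assignment = {}
    for block, replicas in block_locations.items():
        # Prefer nodes that already store the block (no network copy needed).
        local = [node for node in replicas if slots.get(node, 0) > 0]
        candidates = local or [node for node, free in slots.items() if free > 0]
        if not candidates:
            assignment[block] = None  # no free slot anywhere: stays queued
            continue
        node = candidates[0]
        slots[node] -= 1
        assignment[block] = (node, node in replicas)
    return assignment

# Hypothetical cluster state, for illustration only.
locations = {
    "blk_1": ["node-b", "node-a"],
    "blk_2": ["node-a"],
    "blk_3": ["node-a", "node-c"],
    "blk_4": ["node-b"],
}
slots = {"node-a": 1, "node-b": 1, "node-c": 0, "node-d": 1}
for block, placement in assign_tasks(locations, slots).items():
    print(block, placement)
```

In this example the first two tasks run where their data lives, the third falls back to a remote node with a spare slot, and the fourth waits in the queue.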

11 Distributed Processing

12 Implementation
Two modes of operation

13 Building upon the basics

14 Sub-projects
- Map/Reduce – divide & conquer
- Pig – SQL-like “Pig Latin”
- HBase – column-based database
- Hive – data-warehousing (SQL-like queries)
- Mahout – distributed algorithms

15 Map/Reduce
Java-based
- Key,Value input, Key,Value output(s)
Intended for low-level / bespoke work
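The Key,Value in, Key,Value out shape of a Map/Reduce job can be illustrated with a small in-memory word count. Hadoop's own API is Java-based, as the slide says; the sketch below is plain Python that only mimics the model, and `map_words`, `reduce_counts` and `run_job` are hypothetical names, not Hadoop functions.

```python
from collections import defaultdict

def map_words(_offset, line):
    """Mapper: (line offset, line text) in, (word, 1) pairs out."""
    for word in line.lower().split():
        yield word, 1

def reduce_counts(word, counts):
    """Reducer: (word, [1, 1, ...]) in, (word, total) out."""
    yield word, sum(counts)

def run_job(records, mapper, reducer):
    """Run map, group by key (the 'shuffle'), then reduce, all in memory."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)
    results = []
    for key in sorted(groups):  # sorted so the output order is deterministic
        results.extend(reducer(key, groups[key]))
    return results

lines = [(0, "Hadoop at Sky"), (1, "hadoop is just the framework")]
print(run_job(lines, map_words, reduce_counts))
# prints: [('at', 1), ('framework', 1), ('hadoop', 2), ('is', 1), ('just', 1), ('sky', 1), ('the', 1)]
```

On a real cluster the map calls and reduce calls would run as separate tasks on different nodes, with the framework handling the grouping step between them.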

16 Hive
- SQL-like syntax, Map/Reduce under the hood
- Client-only software

17 Live Demo

18 Lastly, a word of warning...
- It’s not a magic bullet…
- If the tools you need don’t exist…
- Approach is everything…
- Hadoop is *just* the framework

19 Thank you! Questions?
http://cotdp.com/hadoop.html
- Soft-copy of this presentation
- VM image available to download
- Example code is on GitHub
