This is a free Course Available on Hadoop-Skills.com.


Presentation on theme: "This is a free Course Available on Hadoop-Skills.com."— Presentation transcript:

1 This is a free Course Available on Hadoop-Skills.com

2

3 Facebook, Twitter, and Google generate petabytes of data every day. The Hadron Collider project discards large amounts of data that it will not be able to analyse, hoping that nothing valuable has been thrown away. Interesting facts, but why is Big Data important? Let's understand via an example.

4 Insurance: 3rd Party Survey, Expert Debates

5 Insurance: Optimal Price through Data Warehousing. Repository: Web Activity, Transaction, Competitors' Pricing, Market Trends, Statistics. Data Warehouse -> Run Statistical Algorithms -> Decision Support System.

6

7 Volume, Velocity, Variety

8 Insurance: Optimal Price through Data Warehousing. Repository: Web Activity, Transaction, Competitors' Pricing, Market Trends, Statistics. Data Warehouse -> Run Statistical Algorithms -> Decision Support System.

9

10 Decision Support System -> Digital Nervous System. Data is the fundamental block to each. Business @ the speed of thought: Sense, Interpret, Decide, Act. Organisations behaving like a biological nervous system (Avatar, Skynet).

11 Bank. Repository: Web Activity, Transaction, Competitors' Pricing, Market Trends, Statistics. Optimal Price. Mobile alert with travel insurance.

12 International Data Corporation's (IDC) 6th annual study:
- From 2005 to 2020, the digital universe will grow by a factor of 300, from 130 exabytes to 40,000 exabytes, or 40 trillion gigabytes.
- More than 5,200 gigabytes for every man, woman, and child in 2020.
- From now until 2020, the digital universe will about double every two years.
- 33% of the digital data might be valuable if analysed, compared with 25% today.
From Gartner:
- 4.4 million IT jobs globally to support Big Data by 2015.
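The IDC figures on this slide can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not part of the study: it assumes decimal units (1 exabyte = 10^9 gigabytes), and the variable names are made up for illustration.

```python
import math

# Figures quoted on the slide (IDC study); names are illustrative only.
EB_2005 = 130            # digital universe in 2005, exabytes
EB_2020 = 40_000         # digital universe in 2020, exabytes
GB_PER_EB = 10**9        # assumption: decimal units, 1 EB = 1e9 GB
GB_PER_PERSON = 5_200    # gigabytes per person in 2020

# Growth factor between 2005 and 2020 (the slide rounds this to 300).
growth_factor = EB_2020 / EB_2005

# 40,000 EB expressed in gigabytes (the slide says 40 trillion).
total_gb = EB_2020 * GB_PER_EB

# World population implied by the per-person figure.
implied_population = total_gb / GB_PER_PERSON

# Average doubling time over the 15 years from 2005 to 2020.
years_per_doubling = (2020 - 2005) / math.log2(growth_factor)

print(f"growth factor:      {growth_factor:.1f}")
print(f"total gigabytes:    {total_gb:.2e}")
print(f"implied population: {implied_population:.2e}")
print(f"years per doubling: {years_per_doubling:.2f}")
```

The growth factor comes out slightly above 300, the total is exactly 40 trillion gigabytes, and the average doubling time is a little under two years, so the quoted numbers are consistent with one another.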

13

14 Timeline (1996-2000, 2003-04, 2005-06, 2010, 2013): Big Data problem faced by all search engines; Nutch (Doug and Mike Dreadnaught); Google File System and MapReduce papers; Hadoop spawns off Nutch; Doug joins Cloudera; 0.xx releases of Hadoop; YARN/MapReduce 2/Next Generation Hadoop.

15

16

17 Price advantage: 1. Clusters use commodity hardware, which is cheaper than one expensive server. 2. The software license is free.

18 HDFS (from Google File System): User, file1, Name node, Data nodes. MapReduce (from Google MapReduce): map, reduce.
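To make the map and reduce labels on this slide concrete, here is a minimal single-process sketch of the MapReduce programming model as a word count. It is plain Python and does not use Hadoop's API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real cluster would run the map and reduce steps in parallel on the data nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on an in-memory input."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data is big"]))
```

The point of the split is that each map call depends only on its own line and each reduce call only on its own key, which is what lets a framework distribute the work across many machines.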

19

20

21 HDFS, MapReduce, HBase, Pig, Hive, Sqoop/Flume, Storm, Chukwa, Kafka, Oozie. Labels: Yahoo, Facebook, Structured Stores, Log collection, Message broker.

22

23 Complex algorithm on a small dataset vs. simple algorithm on a large dataset. 1. Complex algorithms need to be correctly sensitive to weak correlations. 2. Complex algorithms are thus difficult to design and code.

24 Data Engineer vs. Data Scientist
Data Engineer. Role: to engineer software solutions. Skills: more programming and technical skills, and the ability to architect technical solutions.
Data Scientist. Role: to solve business problems using data. Skills: strong mathematical skills and an understanding of statistical models.

25 Hadoop distributions:
-> Skeleton version.
-> All the ecosystem members need to be additionally installed.
-> Important ecosystem members included.
-> A few proprietary tools, like Enterprise Manager.
-> Proprietary Hadoop code written in C.
-> Integrated with Hadoop ecosystem members.
-> Based on Apache Hadoop.
-> Supports the .NET framework.
-> Launches Hadoop distribution: Pivotal HD.

26 Thank You!!!

27 Superstar Doug!!! A small fan: me. And the real Hadoop.

