Bleeding edge technology to transform Data into Knowledge HADOOP In pioneer days they used oxen for heavy pulling, and when one ox couldn’t budge a log,

Slides:



Advertisements
Similar presentations
Large Scale Computing Systems
Advertisements

Adding scalability to legacy PHP web applications Overview Mario A. Valdez-Ramirez.
 Need for a new processing platform (BigData)  Origin of Hadoop  What is Hadoop & what it is not ?  Hadoop architecture  Hadoop components (Common/HDFS/MapReduce)
Distributed Database Management Systems
Big Data A big step towards innovation, competition and productivity.
Ch 4. The Evolution of Analytic Scalability
Computers Are Your Future Tenth Edition Chapter 12: Databases & Information Systems Copyright © 2009 Pearson Education, Inc. Publishing as Prentice Hall1.
Bleeding edge technology to transform Data into Knowledge HADOOP In pioneer days they used oxen for heavy pulling, and when one ox couldn’t budge a log,
Cloud Computing 1. Outline  Introduction  Evolution  Cloud architecture  Map reduce operation  Platform 2.
CS525: Special Topics in DBs Large-Scale Data Management Hadoop/MapReduce Computing Paradigm Spring 2013 WPI, Mohamed Eltabakh 1.
Introduction to Apache Hadoop Zibo Wang. Introduction  What is Apache Hadoop?  Apache Hadoop is a software framework which provides open source libraries.
Hadoop/MapReduce Computing Paradigm 1 Shirish Agale.
Introduction to Hadoop and HDFS
Contents HADOOP INTRODUCTION AND CONCEPTUAL OVERVIEW TERMINOLOGY QUICK TOUR OF CLOUDERA MANAGER.
O’Reilly – Hadoop: The Definitive Guide Ch.1 Meet Hadoop May 28 th, 2010 Taewhi Lee.
Hadoop Ali Sharza Khan High Performance Computing 1.
Bleeding edge technology to transform Data into Knowledge HADOOP In pioneer days they used oxen for heavy pulling, and when one ox couldn’t budge a log,
Introduction. Readings r Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 m Note: All figures from this book.
Project Name Program Name Project Scope Title Project Code and Name Insert Project Branding Image Here.
CS525: Big Data Analytics MapReduce Computing Paradigm & Apache Hadoop Open Source Fall 2013 Elke A. Rundensteiner 1.
CISC 849 : Applications in Fintech Namami Shukla Dept of Computer & Information Sciences University of Delaware iCARE : A Framework for Big Data Based.
Hadoop/MapReduce Computing Paradigm 1 CS525: Special Topics in DBs Large-Scale Data Management Presented By Kelly Technologies
{ Tanya Chaturvedi MBA(ISM) Hadoop is a software framework for distributed processing of large datasets across large clusters of computers.
K E Y : DATA SW Service Use Big Data Information Flow SW Tools and Algorithms Transfer Hardware (Storage, Networking, etc.) Big Data Framework Scalable.
Data Centers and Cloud Computing 1. 2 Data Centers 3.
Internet of Things. Creating Our Future Together.
BIG DATA/ Hadoop Interview Questions.
B ig D ata Analysis for Page Ranking using Map/Reduce R.Renuka, R.Vidhya Priya, III B.Sc., IT, The S.F.R.College for Women, Sivakasi.
Big Data Analytics Hadoop is here to Stay!. What is Big Data? Large databases which are hard to dealComplex and Unstructured dataNeed for Parallel ProcessingHigh.
Dell Software Unified Communications Command Suite (UCCS) Provides Flexible, Cross-Platform Management, Reporting and Data Diagnostics MICROSOFT AZURE.
The Derivitec Risk Portal Provides Powerful, Cost-Effective Risk Management Solutions, Powered by Azure, that Deploy in Minutes MICROSOFT AZURE ISV PROFILE:
Hadoop is a platform that enables business to: Process Big Data at reasonable costs Provides fault tolerance (continue operating in the event of failure)
Google. Android What is Android ? -Android is Linux Based OS -Designed for use on cell phones, e-readers, tablet PCs. -Android provides easy access to.
Distributed Systems Architectures Chapter 12. Objectives  To explain the advantages and disadvantages of different distributed systems architectures.
Map reduce Cs 595 Lecture 11.
Organizations Are Embracing New Opportunities
Big Data Enterprise Patterns
Hadoop Aakash Kag What Why How 1.
Connected Maintenance Solution
Barracuda Networks Creates Next-Generation Security Solutions That Enable Customers to Accelerate Their Adoption of Microsoft Azure MICROSOFT AZURE APP.
Partner Logo Veropath Offers a Next-Gen Expense Management SaaS Technology Solution, Built Specifically to Harness Big Data Analytics Capabilities in Azure.
CLUSTER COMPUTING Presented By, Navaneeth.C.Mouly 1AY05IS037
A10 Networks vThunder Leverages the Powerful Microsoft Azure Cloud Platform to Offer Advanced Layer 4-7 Networking, Security on a Global Scale MICROSOFT.
Connected Maintenance Solution
Hadoop.
Couchbase Server is a NoSQL Database with a SQL-Based Query Language
Introduction to MapReduce and Hadoop
Algorithms for Big Data Delivery over the Internet of Things
Veeam Backup Repository
Language Understanding Intelligent Service and Microsoft Azure Enable Rover, PLEX.AI’s Artificial Intelligence-Powered Virtual Insurance Advisor MICROSOFT.
Get Real Value and Insights from Your Data: Biin Solutions Provides Predictive Analytics, IoT, and Business Intelligence with Microsoft Azure Power MICROSOFT.
FACTON Provides Businesses with a Cloud Solution That Elevates Enterprise Product Cost Management to a New Level Using the Power of Microsoft Azure MICROSOFT.
Built on the Powerful Microsoft Azure Platform, iSwarm Helps Businesses Analyze Social Media Conversations, then Connect with Individuals MICROSOFT AZURE.
Automating Profitable Growth™
On-Premises, or Deployed in a Hybrid Environment
Partner Logo Azure Provides a Secure, Scalable Platform for ScheduleMe, an App That Enables Easy Meeting Scheduling with People Outside of Your Company.
Dell Data Protection | Rapid Recovery: Simple, Quick, Configurable, and Affordable Cloud-Based Backup, Retention, and Archiving Powered by Microsoft Azure.
CS110: Discussion about Spark
Ch 4. The Evolution of Analytic Scalability
Adra ACCOUNTS: Transaction Matching Software Powered by the Microsoft Azure Cloud That Helps Optimize the Accounting and Finance Processes MICROSOFT AZURE.
Hadoop Technopoints.
Appcelerator Arrow: Build APIs in Minutes. Connect to Any Data Source
AIMS for BizTalk, Built on the Microsoft Azure Platform, Empowers Enterprises to Automate Insight and Analytics and Boost Value Creation MICROSOFT AZURE.
Bleeding edge technology to transform Data into Knowledge
Harness the competitive advantages of Power BI and obtain business-critical insights with Adastra’s enterprise analytics platform using Microsoft Azure.
Last.Backend is a Continuous Delivery Platform for Developers and Dev Teams, Allowing Them to Manage and Deploy Applications Easier and Faster MICROSOFT.
Zoie Barrett and Brian Lam
Big DATA.
Big-Data Analytics with Azure HDInsight
Copyright © JanBask Training. All rights reserved Get Started with Hadoop Hive HiveQL Languages.
Presentation transcript:

Bleeding edge technology to transform Data into Knowledge HADOOP In pioneer days they used oxen for heavy pulling, and when one ox couldn’t budge a log, they didn’t try to grow a larger ox. - Grace Hoppe

In 2012, an estimated 2.8 Zettabytes (2.8 Trillion GBs) was created in the world. This is enough to fill 80 Billion 32GB iPads Facebook hosts approximately 10 billion photos - taking up 1 petabyte of storage NYSE generates ~1 TB of new trade data per day Current Situation

Popular Saying: “More data usually beats better algorithms” Big Data is here, but companies are unable to properly store and analyze it Organizations IS departments can either: - Give up: Succumb to information-overload paralysis, or - Monetize big data: Attempt to harness new technologies Data - Moving Forward

Yahoo Store in internet Index 43,000 nodes Servers racked with Velcro (MTBF 1000 days) FaceBook Analysis for target adds Over 100 Petabytes Growing at ½ PB/day Early Adopters - Tech

Storage distributed across multiple nodes Nodes are composed of commodity servers Nodes orchestrated to process requests in parallel. Process any kind of data Hadoop Architecture

Hardware Commodity servers Software Unix OS, Hadoop Network Fiber network backend, IP LAN for users Data Any format (structured or not ) Hadoop Infrastructure

Map Reduce

Framework is not suited for transactional environments Long load time; difficulty to edit partial dataset Limitations

Process Big Data at reasonable costs Provides fault tolerance (continue operating in the event of failure) Enables massive parallel processing (MPP) Runs on commodity infrastructure – Cheap to run Available under the GNU GPL (General Public License) – Free to use. Business value

Challenges of adoption Fear of the unknown Lack of skillset Lack of understanding of ROI Open Source (potential security risk)

Data Governance Exploratory Phase Implementation Phase

Data Governance Exploratory Phase Recognizing responsibilities associated with deploying Hadoop Determining key issues and requirements around privacy. Hadoop integration into existing IT infrastructure.

Data Governance Implementation Phase Determining business needs Organization Structure Stewardship Data Risk and Quality Management Information Life Cycle Management Security & Privacy Data Architecture

The Hadoop platform was designed to solve problems where you have a lot of data. Designed to run deep and computationally extensive analytics. Hadoop applies to a bunch of markets and create competitive advantage Knowledge is power Conclusions

Thank you!