The World Moves Fast, and Data is Driving: Big Data and - #GILSV September 2013 | Silicon Valley

Presentation transcript:

The World Moves Fast, and Data is Driving: Big Data and - #GILSV September 2013 | Silicon Valley
Jeff Cotrupe, Global Program Director, Big Data & Analytics (BDA), Stratecast | Frost & Sullivan

Three Key Takeaways
1. An understanding of the structure (or UN-structure) of Big Data, and where you need to look to be sure you’re capturing it all
2. A blueprint for the bases a Big Data, analytics, and business intelligence (BI) solution must cover to ensure that your organization wrings every drop of value out of the data
3. Real-world use cases showing Big Data in action

Business Intelligence (BI): “The ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal.” - IBM researcher Hans Peter Luhn*
*IBM Journal: A Business Intelligence System, October 1958

The Growth Partnership Company: Analyzing this Large, Growing Market
Global Mkt: $22 BILLION

COMPONENT: ONLINE ANALYTICS
Big data core: Platforms, Applications, Systems, Services
COMPONENT: MOBILE COMMERCE MGMT (MCM)
COMPONENT: CUSTOMER EXPERIENCE MGMT (CEM): APM-CSA, QoE, CEA
SOCIAL NETWORK ANALYSIS (c)
Customer Loyalty (C,M)
Retail/wifi analytics (o,m)

STRUCTURED DATA Big Data? MANY Data

UNSTRUCTURED and SEMI-STRUCTURED DATA: Enterprise (39 types, 5 categories)
Communications & Messaging
OTHER/External…
Online/Digital

Networks/Services
OSS/BSS
Operators/CSPs: ENTERPRISE categories, +… (14 types, 3 categories)
OTHER/External including Content Providers…

“Just add servers.”
• Big Data architectural elements
  – Hyperscale computing and high-performance cluster computing for ultra-high-speed processing
  – Reconfigurable, massively parallel architecture
  – Shared-nothing (memory or disk): processes maximum amount of data
  – Data auto-sharding or “sharding”: partition data across DBs, maintain copy of served application’s data
  – Memcached: in-memory key-value store for small chunks of data; e.g., superior Web user experience through faster page-loading
  – Trillions of calculations per second (TeraOPs) vs existing floating point operations per second (FLOPS)
• “Open your eyes” (data-wise)
  – Traditional: sampling/summaries
  – Unlocking the value of Big Data: unfiltered, ALL
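The sharding idea on this slide can be illustrated with a minimal sketch, assuming simple hash-based partitioning. The function names, the shard count, and the sample records are hypothetical and do not come from the presentation; real systems often use consistent hashing or range partitioning instead.

```python
import hashlib
from collections import defaultdict


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash.

    hashlib is used instead of the built-in hash(), which is salted
    per process for strings and so is not stable across runs.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def partition(records: dict, num_shards: int) -> dict:
    """Split a key -> value mapping into one dictionary per shard."""
    shards = defaultdict(dict)
    for key, value in records.items():
        shards[shard_for(key, num_shards)][key] = value
    return dict(shards)


# Hypothetical sample data: 1,000 user records spread over 4 shards.
records = {f"user{i}": {"visits": i} for i in range(1000)}
shards = partition(records, num_shards=4)
```

One caveat on “just add servers”: with plain modulo hashing, changing the number of shards moves most keys to a different shard, which is why consistent hashing is a common alternative.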

“I’ll take one Hadoop, please. Extra analytics.”
• What you get: open source distributed computing framework
  – Open source implementation of Google’s MapReduce data framework
  – Good when data too large for single DB; cost-prohibitive to index data updates; many simultaneous users
• “DB not included” | Add: NoSQL DBs
  – To get information, must run MapReduce job…TIME*
  – NoSQL Wide Column Stores for distributed data storage
    • Ultra-high-speed performance executing highly complex queries over similar data
    • Read only queried attributes (row DBs: all surrounding data)
    • More efficient attribute storage enhances data compression
  – Examples: Apache HBase and Cassandra; Google BigTable;* Cloudata; Cloudera
Invented by open source search advocate Douglas Cutting, and named for one of his son’s childhood toys, Hadoop is operated by the Apache Software Foundation.
* Google: MapReduce > BigTable
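The MapReduce model this slide refers to can be sketched on a single machine as three steps: map, shuffle (group by key), and reduce. This is an illustrative word count in plain Python, not Hadoop's API; the function names and sample documents are assumptions made for the example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document: str):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values collected for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: each string stands in for one input split.
docs = ["big data moves fast", "data is driving", "big data"]
counts = word_count(docs)
```

In a real cluster the map and reduce calls run on different servers and the shuffle moves data over the network, which is part of why a full job takes time even for a simple question.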

“One Hadoop, please” (continued)
• Challenge: distributed data processing | Solution: Apache Hadoop Distributed File System (HDFS)
  – Grid computing approach
  – MapReduce to distribute processing across servers
• Challenge: Java programming | Solutions: Apache Pig and Hive
  – Pig simplifies tasks, accommodates semi-structured data
  – Hive is a DW(H) system for Hadoop
• Challenge: integrating structured data | Solutions: Apache Sqoop and Flume
  – Sqoop does bulk data transfers between RDBs and Hadoop (HDFS or Hive)
  – Flume imports streaming (Web) log data into HDFS
• Other tasks: deployment and ongoing management | Solutions: Apache…
  – Deployment and administration: Ambari and Whirr
  – Workflow management and data sync: Oozie and ZooKeeper
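How a distributed file system such as HDFS spreads one large file across servers can be sketched as follows. This is a simplified illustration, not HDFS code: the round-robin placement, the function name, and the node names are assumptions, and real HDFS placement is decided by the NameNode and is rack-aware. The 128 MB block size and replication factor of 3 are the commonly cited defaults for Hadoop 2.x and later.

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # bytes; assumed default block size
REPLICATION = 3                 # assumed default replication factor


def plan_blocks(file_size, nodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return, for each block of the file, the list of nodes holding a replica.

    Placement here is plain round-robin for illustration only.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    num_blocks = math.ceil(file_size / block_size)
    plan = []
    for block in range(num_blocks):
        replicas = [nodes[(block + r) % len(nodes)] for r in range(replication)]
        plan.append(replicas)
    return plan


# Hypothetical cluster of four data nodes storing a file of about 1 GB.
nodes = ["node1", "node2", "node3", "node4"]
plan = plan_blocks(file_size=1_000_000_000, nodes=nodes)
```

Because every block lives on several nodes, a MapReduce task can usually be scheduled on a server that already holds the data it needs, and the loss of one server does not lose the file.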

Big Data Blueprint for Success
1. Knowledge Management & Baselining
2. Master Data Management (MDM)
3. Data Integration-ETL-ELT
4. Storage & DW(H)
5. Enterprise Search
6. Security & Enterprise Rights Management
7. Analytics & Reporting
8. Collaboration
9. Fast Start & Extensibility Tools

“We all offer real-time analytics.” Not So Fast…

Use Cases? Boundless. Here Are a Few
• Predicting and identifying security threats; fraud detection
• Pricing optimization
• Behavioral analytics
• Predictive customer support
• Device analytics (performance failure/part swapouts)
• Mobile branding-advertising-commerce
• Customer experience
• Customer loyalty programs
• Integrated retail business optimization
• Integrated online/offline business processes
…our Featured Speaker and PANEL

Using Big Data to Make a Powerful Impact
Niloy Sanyal, Software Commercial Strategy Leader, GE Software

Ask the Experts! Panel Discussion | Driving Big Data: Real-World Big Data Issues & Answers
• Southard Jones, Vice President of Product Strategy, Birst
• Matti Aksela, Ph.D., Vice President, Analytics & Technology, Comptel
• Mike Brown, Chief Technology Officer, comScore
• Niloy Sanyal, Software Commercial Strategy Leader, GE Software
• Lucia Gradinariu, Ph.D., Chief Marketing Strategist, Huawei
• Eugene Kolker, Ph.D., Chief Data Officer, Seattle Children’s Hospital
TEXT to