Big Data …Big Opportunities ? ……Big Hype ?

Presentation transcript:

Big Data …Big Opportunities ? ……Big Hype ? (or just a Big Mess ?) Data challenges and IBM views. Dr. Matthew Ganis, IBM Senior Technical Staff Member, CIO Social Media Analytics Chief Architect; Member, IBM Academy of Technology. ganis@us.ibm.com, @mattganis (Twitter)

The term “Big Data” is pervasive, but it still provokes a bit of confusion. So what is it? Big Data has been used to convey all sorts of concepts, including huge quantities of data, social media analytics, next-generation data management capabilities, real-time data and much, much more.

That means we create about 1.8 zettabytes of information every two years.

Extracting insight from an immense volume, variety and velocity of data, in context, beyond what was previously possible.

Information is at the Center of a New Wave of Opportunity… And Organizations Need Deeper Insights
Volume: 44x as much data and content over the coming decade, from 800,000 petabytes in 2009 to 35 zettabytes in 2020.
Variety: 80% of the world’s data is unstructured.
Velocity: data arrives faster and faster.
1 in 3 business leaders frequently make decisions based on information they don’t trust, or don’t have.
1 in 2 business leaders say they don’t have access to the information they need to do their jobs.
83% of CIOs cited “Business intelligence and analytics” as part of their visionary plans to enhance competitiveness.
60% of CEOs need to do a better job capturing and understanding information rapidly in order to make swift business decisions.
Sources: The Guardian, May 2010; IDC Digital Universe, 2010; IBM Institute for Business Value, 2009; IBM CIO Study 2010; TDWI: Next Generation Data Warehouse Platforms, Q4 2009.
Notes: There is more and more data, from more sources and of more types (structured and unstructured), and it arrives faster and faster. Leaders don’t always have access to the right information to make decisions; they need deeper insight, and they need it faster.
Summary: Data is exploding in volume, variety and velocity, and both structured and unstructured information will continue to grow at astronomical rates. This creates a tremendous opportunity for organizations to make timely decisions and achieve business goals. At the same time, organizations are struggling to gain deeper insights from this data. Business leaders continue to make decisions without access to the trusted information they need. CEOs understand that they need to do a better job of capturing and understanding information.
Units: Tera = 10^12 bytes; Peta = 10^15 bytes = 1,000 TB; Exa = 10^18 bytes; Zetta = 10^21 bytes = 1 billion TB.
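As a quick sanity check, the "44x" growth figure can be recomputed from the two volume numbers quoted on this slide (800,000 petabytes in 2009, 35 zettabytes in 2020). The sketch below assumes decimal (power-of-ten) units, as in the unit list on the slide; the helper names are illustrative only.

```python
# Decimal storage units, in bytes, as listed on the slide.
UNITS = {"TB": 10**12, "PB": 10**15, "EB": 10**18, "ZB": 10**21}


def to_bytes(value, unit):
    """Convert a (value, unit) pair to a number of bytes."""
    return value * UNITS[unit]


def growth_factor(start, end):
    """Ratio between two volumes given as (value, unit) pairs."""
    return to_bytes(*end) / to_bytes(*start)


volume_2009 = (800_000, "PB")  # figure quoted on the slide
volume_2020 = (35, "ZB")       # figure quoted on the slide

factor = growth_factor(volume_2009, volume_2020)
print(f"Growth 2009 -> 2020: about {factor:.2f}x")
print(f"1 ZB = {UNITS['ZB'] // UNITS['TB']:,} TB")
```

The ratio comes out just under 44, which is consistent with the rounded headline figure.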

Structured vs. Unstructured: Structured data refers to information with a high degree of organization, such that it fits seamlessly into a relational database and is readily searchable with simple, straightforward queries or other search operations. Unstructured data is essentially the opposite: its lack of structure makes compilation a time- and energy-consuming task.
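The contrast can be illustrated with a small sketch. The table layout, the sample sentence and the regular expression are all assumptions made up for the example; the point is only that structured rows can be queried directly, while free text first needs an extraction step that is specific to how the text happens to be written.

```python
import re
import sqlite3


def query_structured(rows):
    """Structured data: load (region, amount) rows and total them with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    totals = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
    conn.close()
    return totals


def extract_amounts(text):
    """Unstructured data: dollar amounts must first be dug out of free text."""
    matches = re.findall(r"\$([\d,]+(?:\.\d+)?)", text)
    return [float(m.replace(",", "")) for m in matches]


print(query_structured([("East", 100.0), ("West", 50.0), ("East", 25.0)]))
print(extract_amounts("We sold $1,200 in the East and about $50.5 out West."))
```

Note that the extraction step recovers the amounts but not which region each belongs to; that kind of context is what makes unstructured data costly to compile.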

The Challenge: Bring Together a Large Volume and Variety of Data to Find New Insights. Examples: multi-channel customer sentiment and experience analysis; detecting life-threatening conditions at hospitals in time to intervene; predicting weather patterns to plan optimal wind turbine usage and optimize capital expenditure on asset placement; making risk decisions based on real-time transactional data; identifying criminals and threats from disparate video, audio, and data feeds.
Big data is more than simply a matter of size; it is an opportunity to find insights in new and emerging types of data and content, to make your business more agile, and to answer questions that were previously considered beyond your reach. Imagine if you could analyze the 12 TB of tweets being created each day to figure out what people are saying about your products and who the key influencers are within your target demographics. Can you imagine being able to mine this data to identify new market opportunities? What if hospitals could take the thousands of sensor readings collected every hour per patient in ICUs to identify subtle indications that the patient is becoming unwell, days earlier than traditional techniques allow? Imagine if a green energy company could use petabytes of weather data along with massive volumes of operational data to optimize asset location and utilization, making these environmentally friendly energy sources more cost-competitive with traditional sources. Imagine if you could make risk decisions, such as whether or not someone qualifies for a mortgage, in minutes, by analyzing many sources of data, including real-time transactional data, while the client is still on the phone or in the office. Imagine if law enforcement agencies could analyze audio and video feeds in real time, without human intervention, to identify suspicious activity. As these new sources of data continue to grow in volume, variety and velocity, so too does the potential of this data to revolutionize the decision-making processes in every industry.

Where we want to go

Merging the Traditional and Big Data Approaches (Structured vs. Exploratory)
Traditional approach (structured and repeatable analysis): business users determine what question to ask, and IT structures the data to answer that question. Examples: monthly sales reports, profitability analysis, customer surveys.
Big Data approach (iterative and exploratory analysis): IT delivers a platform to enable creative discovery, and business users explore what questions could be asked. Examples: brand sentiment, product strategy, maximum asset utilization.
The Big Data approach complements the traditional approach. In the traditional approach, business users determine what questions to ask and IT structures the data to answer those questions. This is well suited to many common business processes, such as monitoring sales by geography, product or channel; extracting insight from customer surveys; and cost and profitability analyses. In the Big Data approach, IT delivers a platform that consolidates all sources of information and enables creative discovery. Business users then use the platform to explore the data for ideas and questions to ask. Most of the time, the data is raw. On the left, the traditional approach allows an organization to answer questions that will be asked time and time again. On the right, users have the ability to explore their data in a more creative way. Before finding the answer, they must first define the question. Are my customers starting to change their preferences? What is the best way to measure brand image?

Where is all this data coming from ?

Where is all this data coming from ?

The Internet of Things (IoT) is a scenario in which objects, animals or people are provided with unique identifiers and the ability to automatically transfer data over a network without requiring human-to-human or human-to-computer interaction.

Where is all this data coming from ?

Approximately 2.7 billion users on the Internet today

Social Media as Big Data

What are we running ? Who is talking about us ? Male / Female / Student / Professional / Retired / Customers ? What do they “feel” ? Positive/Negative Sentiment / Angry / Annoyed ? Where are they talking ? Who are they influencing ? Who’s listening to them ?

When customers are talking about us or about our products, we want to know where those conversations are happening so we can interact with interested customers and get in front of any issues.

Numerous studies show that word-of-mouth and personal recommendations are seen as far more credible by consumers than newspaper and television advertisements. While such mass advertisements are still necessary because of their powerful reach, these findings show that companies need to increase their focus on more personalized approaches. Clearly, it is incredibly difficult, maybe even impossible, for most companies to deal directly with the countless potential consumers. This is where influencers come in…

What makes someone Influential ? The number of tweets they make ? The number of times people mention them ? The number of followers they have? How often they are retweeted ?
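One way to make these questions concrete is to combine the four signals into a single score. The sketch below is purely illustrative: the weights are hypothetical (the slides do not say how the signals should be weighted), and each signal is simply scaled by the largest value seen among the accounts being compared.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    tweets: int      # how many tweets they make
    mentions: int    # how often other people mention them
    followers: int   # how many followers they have
    retweets: int    # how often they are retweeted


# Hypothetical weights: being mentioned and retweeted counts for more
# than simply posting a lot. Choosing these is the open question.
WEIGHTS = {"tweets": 0.1, "mentions": 0.3, "followers": 0.2, "retweets": 0.4}


def influence_scores(accounts, weights=WEIGHTS):
    """Return {name: score}, scaling each signal to the 0..1 range."""
    if not accounts:
        return {}
    scores = {a.name: 0.0 for a in accounts}
    for signal, weight in weights.items():
        top = max(getattr(a, signal) for a in accounts) or 1  # avoid dividing by zero
        for a in accounts:
            scores[a.name] += weight * getattr(a, signal) / top
    return scores


accounts = [
    Account("loud", tweets=1000, mentions=5, followers=100, retweets=2),
    Account("heard", tweets=50, mentions=50, followers=1000, retweets=200),
]
for name, score in sorted(influence_scores(accounts).items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

With these weights the account that tweets less but is mentioned and retweeted more ranks higher; different weights would give a different answer, which is exactly why the definition of influence matters.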

We were asked to look at why a particular product launch wasn’t performing as expected. We pulled all the “chatter” about it and found:

But there were people talking about it…..

Some things to think about…..

Where is all this data coming from ? Vast amounts of data are, and will be, generated from financial transactions, medical records, mobile phones, social media and the Internet of Things, but there are questions that need to be asked to understand the data’s meaningful use: How will data be managed? How will data be shared? Some thoughts about “data as a service”: establishment of standards, governance and guidelines (e.g., open architectures); creation of industry-specific data exchanges (e.g., healthcare data exchanges, environment data exchanges); creation of cross-industry data exchanges (e.g., healthcare data exchanges seamlessly interacting with environmental data exchanges).

Enterprise Integration
Diagram: Traditional Sources and New Sources feed the Data Warehouse and the Big Data Platform, under Trusted Information & Governance, Data Management and Enterprise Integration. Companies need to govern what comes in, and the insights that come out. Insights from Big Data must be incorporated into the warehouse.
Key points: Trust, governance and privacy: how you use data for the enterprise matters. This isn’t just a technology for an internet company; this is managing large volumes of potentially sensitive data for the enterprise. You must govern what comes in and govern what goes out. Even though “Big Data” means all of the data, it doesn’t necessarily mean you bring in all of the data and expose it to everyone without any sort of governance or quality control. For example, internet tweets or blog posts on an upcoming M&A could be factored into brand sentiment analysis, but what if you are not supposed to factor that data into internal decision making?
Additional comments: Integration is of great importance. IBM has a mature and broad software stack. A key differentiator for IBM is the high degree of integration between these components. The Big Data Platform is no exception, and will integrate with the established components of the IBM IM software stack. With integration come the questions of trust, governance and privacy described above.

Poor data quality Dirty data Missing values Inadequate data size Poor representation in data sampling
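Several of these problems can be checked mechanically before any analysis starts. The following is a minimal sketch, assuming records arrive as a list of flat dictionaries; the field names, the `min_rows` threshold and the helper names are hypothetical.

```python
from collections import Counter


def quality_report(records, required_fields, min_rows=100):
    """Summarize missing values, exact duplicate rows and dataset size."""
    missing = {
        field: sum(1 for r in records if r.get(field) in (None, ""))
        for field in required_fields
    }
    seen, duplicates = set(), 0
    for r in records:
        key = tuple(sorted(r.items(), key=lambda kv: kv[0]))  # values must be hashable
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {
        "rows": len(records),
        "too_small": len(records) < min_rows,  # inadequate data size
        "missing": missing,                    # missing values per field
        "duplicates": duplicates,              # one simple form of dirty data
    }


def group_shares(records, field):
    """Share of records per group, to compare against the known population mix."""
    counts = Counter(r.get(field) for r in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()} if total else {}


records = [
    {"id": 1, "age": 30, "region": "east"},
    {"id": 2, "age": None, "region": "east"},
    {"id": 1, "age": 30, "region": "east"},
    {"id": 3, "age": 41, "region": ""},
]
print(quality_report(records, ["age", "region"], min_rows=10))
print(group_shares(records, "region"))
```

A sample in which one group dominates the shares, relative to the population it is meant to represent, is a hint of poor representation in the sampling.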

How do we link them together ? Data variety: trying to accommodate data that comes from different sources and in a variety of different forms (images, geo data, text, social, numeric, etc.). Is there a common taxonomy or way to organize it ? Is there a “signal” in one source of data that points to another ?

Dealing with huge datasets, or 'Big Data,' that require distributed approaches.

Who is influential ? How do we define influence ?

Thank you for your attention

Where is all this data coming from ?

The Big Data Opportunity: Extracting insight from an immense volume, variety and velocity of data, in context, beyond what was previously possible.
Variety: Manage the complexity of multiple relational and non-relational data types and schemas.
Velocity: Streaming data and large-volume data movement.
Volume: Scale from terabytes to zettabytes (1B TBs).
Massive volume, variety and velocity are the defining characteristics of Big Data. The marketplace is driving the need for new insights. IBM has done extensive research on this, some of which I will summarize with you now. In order to capitalize on this opportunity, enterprises must be able to analyze ALL types of data, relational and non-relational: text, sensor data, audio, video, transactional. Sometimes, getting an edge over your competition can mean identifying a trend, problem or opportunity seconds, or even microseconds, before someone else. More and more of the data being produced today has a very short half-life. Organizations must be able to analyze this data in real time if they are to find insights in it. And, as implied by the term Big Data, organizations are facing massive volumes of data. Organizations that don’t know how to manage this data are overwhelmed by it. But the opportunity is, with the right technology, to analyze ALL the data, to gain a better understanding of your business, your customers and the marketplace.
The most expedient way to describe a Big Data problem is to use the three V’s: Variety, Velocity, and Volume. Let’s take them in reverse order and start with the most obvious “V”, Volume. It is obvious that a Big Data problem will have, well, Big Data volume. This volume can start in the terabytes and quickly move to the hundreds of petabytes. New storage solutions are needed to handle this type of volume. It is not just that classic data warehouse platforms cannot store and access those volumes; it is that the cost and labor associated with storing that data on those platforms make them prohibitively expensive and difficult to deploy. Velocity is obvious as well, because no organization wants its answers slower! We hear demands for new insights or analytics ranging from “we need it in 4 hours, not 4 weeks” to “response must be real time, that is, sub-second response.” However, the answers that businesses are demanding have to be based on increasingly sophisticated analytics, and the rate of response demanded can be an order of magnitude faster. The third “V”, Variety, is the least understood, and could be the most profound. Big Data is derived mostly from sources not analyzed or not used before. Why? Because that data is not derived from classical transaction systems, which lend themselves to structured models. Most, but not all, data in a Big Data platform is unstructured, or has part of it unstructured. IBM offers a unique platform to ingest, store, manipulate, manage, and, most importantly, analyze Big Data to discover fresh insight that drives new business opportunities.

Big Data: why is it possible now ?
Traditional approach: Data to Function. The application server and the database server are separate. Data can be on multiple servers. The analysis program can run on multiple application servers. The network is still in the middle: the data has to go through the network. Flow: user request, then the application server queries the database server, the database server returns the data, the application server processes the data and sends the result.
Big Data approach: Function to Data. The analysis program runs where the data is: on the data nodes. Only the analysis program has to go through the network. The analysis program needs to be MapReduce-aware. Highly scalable: 1000s of nodes, petabytes and more. Flow: user request, then the master node sends the function to the data nodes, each data node queries and processes its own data, and the master node consolidates and sends the result.
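The "function to data" flow can be sketched in a few lines. This is a single-process simulation, not a real cluster: the data nodes are plain Python lists, the function names are made up for the example, and word counting stands in for the analysis program. In an actual MapReduce framework the per-node step would run in parallel on the machines that hold the data.

```python
from collections import Counter
from functools import reduce


def map_on_node(partition):
    """Analysis function shipped to a data node: count words in its local lines."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def consolidate(partials):
    """Master node step: merge the small per-node results into one answer."""
    return reduce(lambda a, b: a + b, partials, Counter())


def run_job(data_nodes):
    """Send the function to every node, then consolidate what comes back."""
    # Only map_on_node and the partial counts would cross the network,
    # never the raw partitions themselves.
    partials = [map_on_node(partition) for partition in data_nodes]
    return consolidate(partials)


data_nodes = [
    ["big data big hype"],      # partition stored on node 1
    ["big mess", "data data"],  # partition stored on node 2
]
print(run_job(data_nodes).most_common(2))
```

The design point is the size of what moves: each node returns a small summary rather than its raw data, which is what lets the approach scale to thousands of nodes.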

What Big Data Is Not: It is not a replacement for your database strategy. It is not a replacement for your warehouse strategy. It is not a solution by itself; it needs jobs/applications to drive value.