© 2014 IBM Corporation Integrated Data Management David Majcher Information Architect 847 530-1411 Looking at Hadoop in the Rearview.

1 © 2014 IBM Corporation Integrated Data Management David Majcher Information Architect dmajcher@us.ibm.com 847 530-1411 Looking at Hadoop in the Rearview Mirror

2 Technologies Lead to Revolution

3 “Olden days” of Gold Mining:
- Miners could spot nuggets/veins of gold
- Opportunity/value of gold is obvious
- Too expensive to dig everywhere; only dig where gold could be “seen”
Today’s Databases / Warehouses:
- High-value data is identified in source systems
- Opportunity/value of data is obvious
- Too expensive to store/analyze everything
Before Big Data, we invested where the value to be extracted was obvious.

4 Today, new-age equipment allows us to dig everywhere and process millions of tons of dirt to extract tiny fragments (1-2 ppm) of gold that are hard to see. The fragments are combined to create gold bars.

5 Gartner is trying to coin the term “Data Reservoir”

6 Paradigm Shifts Enabled by Big Data
Traditional Approach: Start with Hypothesis, Test Against Selected Data
[Diagram labels: Hypothesis, Question, Data, Answer, Analyzed Information]
Big Data Approach: Explore ALL Data, Identify Correlations
[Diagram labels: Data, Exploration, Correlation, All Information, Actionable Insight]
Data leads the way… and sometimes correlations are good enough
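
The "explore all data, identify correlations" approach can be sketched roughly as follows. This is only an illustration: the column names and values are made up, and a plain Pearson coefficient with an arbitrary 0.8 threshold stands in for whatever correlation measure a real platform would use.

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists.
    Assumes neither list is constant (non-zero variance)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def strongest_correlations(columns, threshold=0.8):
    """Scan every pair of columns (no hypothesis chosen up front)
    and keep the pairs whose |r| meets the threshold."""
    found = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            found.append((name_a, name_b, round(r, 3)))
    return sorted(found, key=lambda t: -abs(t[2]))

# Hypothetical data, for illustration only.
columns = {
    "web_visits": [10, 20, 30, 40, 50],
    "purchases": [1, 2, 3, 4, 5],
    "support_calls": [5, 3, 6, 2, 4],
}
pairs = strongest_correlations(columns)
print(pairs)
```

The hypothesis-first approach would instead pick one pair in advance and test only that pair against selected data.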

7 Paradigm Shifts Enabled by Big Data
Traditional Approach: Analyze data AFTER it has been processed and landed in a Warehouse or Data Mart
Big Data Approach: Analyze data IN MOTION as it is generated, in real-time
Leverage data as it is captured
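
A minimal sketch of analyzing data in motion, assuming a hypothetical sensor feed and an arbitrary alert limit: each reading is evaluated as it arrives, using a small rolling window, instead of being landed in a warehouse first and queried later. The function names and values are illustrative and do not come from any particular streaming product.

```python
from collections import deque

def rolling_alerts(readings, window=3, limit=100.0):
    """Yield (index, rolling_mean) whenever the mean of the last
    `window` readings exceeds `limit`. Only the window is held in memory."""
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        recent.append(value)
        if len(recent) == window:
            mean = sum(recent) / window
            if mean > limit:
                yield i, mean

def sensor_feed():
    """Hypothetical stream; a real feed would be unbounded."""
    for value in [90, 95, 100, 110, 120, 80, 70]:
        yield value

alerts = list(rolling_alerts(sensor_feed()))
for index, mean in alerts:
    print(f"reading {index}: rolling mean {mean:.1f} exceeds limit")
```

Because `rolling_alerts` is a generator, an alert is produced as soon as the triggering reading is consumed, without waiting for the feed to end.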

8 Conventional Data and Analytics Architecture
[Diagram labels: Operational Data Sources; External and Partner Data; Information Ingestion and Operational Information; Data Integration; Master Data; MDM; Data at Rest; Decision Management; BI and Predictive Analytics; Navigation and Discovery; Intelligence Analysis; Information Governance, Security and Business Continuity]

10 Hadoop Data Use Cases: Reengineer SQL ETL Framework Mgr Reengineer SQL

11 High Level Solution Component Diagram

13 Data Integration
80%: By most accounts, this is the share of development effort that goes into data integration for a Big Data project.
20%: When all is said and done, this is the share of effort going toward data analysis.

14 Top sources of information used as part of initial big data efforts (typically start with data already being captured)
Big data sources: Respondents with active big data efforts were asked which data sources are currently being collected and analyzed as part of active big data efforts within their organization.
[Chart series: Banking & Fin Mgmt respondents; Global respondents]
Source: The real world use of Big Data, IBM & University of Oxford

15 Big data is governed in zones
[Chart labels: Unstructured; Structured]
Source: “IBM Data Governance”, a commissioned study conducted by Forrester Consulting on behalf of IBM, August 2013
Base: 512 Director or VP level professionals with decision making authority for Big Data technologies

16 Big Data Enhanced Analytic and Reporting Zones
Key tenet: Data should be organized by data usage / workload instead of organizing by data consumers
Zones: Data Ingestion Zone; Shared Analytics Information Zone; Trusted / Guided Analytic Zone; Exploration / Discovery Zone
Spectrum: from Structured, More Governed, More Secure to Agile, Less Governed, Less Secure

17 Workload-Optimized Data Deployment
Even “unstructured data” must eventually have a structure to gain insight … where and when you apply structure depends on the use case
Zones: Data Ingestion Zone; Shared Analytics Information Zone; Trusted / Guided Analytic Zone; Exploration / Discovery Zone
Spectrum: from Structured, More Governed, More Secure to Agile, Less Governed, Less Secure
Zone characteristics:
- “Schema on read” supports high ingest rates
- Mostly non-interactive
- Governed usage and integration
- Small number of concurrent users
- Perhaps large data volumes for a given question
- Modeled and structured data support repeated questions
- Large numbers of concurrent users
- Limited history / data volumes
- Both unstructured and structured (modeled) data to support discovery and model development
- Smaller number of heavy data users
- Complex questions
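
“Schema on read” can be illustrated with a small sketch. It assumes raw JSON lines and a hypothetical in-memory list standing in for the ingestion zone. Ingest stores whatever arrives without validation, which is what keeps ingest rates high, and structure is applied only when a question is asked.

```python
import json

RAW_LANDING = []  # hypothetical stand-in for raw files in the ingestion zone

def ingest(line):
    """Store the raw line untouched: no parsing, no validation."""
    RAW_LANDING.append(line)

def read_with_schema(raw_lines, schema):
    """Apply a schema (field name -> converter) at read time.
    Lines that cannot be parsed or do not fit the schema are skipped."""
    for line in raw_lines:
        try:
            record = json.loads(line)
            yield {field: cast(record[field]) for field, cast in schema.items()}
        except (ValueError, KeyError, TypeError):
            continue

# Made-up input: one well-formed row, one junk line, two partial rows.
ingest('{"user": "a", "amount": "12.5", "ts": "2014-01-01"}')
ingest('not json at all')
ingest('{"user": "b", "amount": "7"}')
ingest('{"user": "c"}')

# The schema is chosen by the question being asked, not at ingest time.
schema = {"user": str, "amount": float}
rows = list(read_with_schema(RAW_LANDING, schema))
print(rows)
```

A different question can read the same raw lines with a different schema, for example `{"user": str, "ts": str}`, without re-ingesting anything.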

18 Big Data Enhanced Analytic and Reporting Zones
Key tenet: Data should be organized by data usage / workload instead of organizing by data consumers
Spectrum: from Structured, More Governed, More Secure to Agile, Less Governed, Less Secure
Shared Analytics Information Zone
Data Ingestion Zone
  Typical Data:
  - Raw data
  - Deep history
  - Structured & Unstructured
  Typical Uses:
  - “Landing Zone” (traditional)
  - Data Deployment and Self-Provisioning
  - Data Survey and Data Quality Assessment
  - Search & Indexing
  - Text Analytics
  - Archive
Trusted / Guided Analytic Zone
  Typical Data:
  - Modeled Data
  - Structured, Semi-structured
  - Consolidated / Harmonized
  - Aggregated
  - “Recent” History
  Typical Uses:
  - “Enterprise Data Warehouse”
  - Operational and Business Reporting
  - Risk Assessment
  - Data Mining
  - Dashboards and KPIs
  - (many more…)
Exploration / Discovery Zone
  Typical Data:
  - Structured
  - Semi-structured
  - Unstructured
  - Raw and Derived
  - Trusted and New Sources
  Typical Uses:
  - Interactive and Proactive Discovery
  - Forecasting
  - Predictive Modeling
  - Deep Analysis / Data Science
  - Rapid Response
* Typical data storage / interaction technologies

19 Enterprise Data Landing Zone: Architecture Overview
[Diagram labels: Sources (ERP, Mainframe, HR, Web, Files); Information Integration; Landing & Discovery Zone; Data Warehouse; Information Governance; Exploration, Search & Visualization; Decision Support; Operational Business Intelligence; Reporting & Performance Management; Modeling, Analytics & Simulation; Discovery, Deep Analytics; Visualize]

20 Queryable Archive: Architecture Overview
[Diagram labels: Sources (ERP, Mainframe, HR, Web, Files); Information Integration; Queryable Archive; Big SQL; Data Warehouse; Information Governance; Exploration, Search & Visualization; Decision Support; Operational Business Intelligence; Reporting & Performance Management; Modeling, Analytics & Simulation; Discovery, Deep Analytics; Visualize]

21 Warehouse Augmentation: Architecture Overview
[Diagram labels: Sources (ERP, Mainframe, HR, Web, Files); Information Integration; Landing Zone; Harmonized Zone; Big SQL; Data Warehouse; Information Governance; Exploration, Search & Visualization; Decision Support; Operational Business Intelligence; Reporting & Performance Management; Modeling, Analytics & Simulation; Discovery, Deep Analytics; Visualize]

22 Big Data & Analytics Architecture
All Data Sources: Streaming Data; Text Data; Applications Data; Time Series; Geo Spatial; Relational; Social Network; Video & Image
Big Data Platform Capabilities: Information Ingest; Real-time Analytics; Warehouse & Data Marts; Analytic Appliances
Advanced Analytics / New Insights:
- Cognitive: Learn Dynamically?
- Prescriptive: Best Outcomes?
- Predictive: What Could Happen?
- Descriptive: What Has Happened?
- Exploration and Discovery: What Do You Have?
New / Enhanced Applications: Automated Process; Case Management; Analytic Applications; Watson; Cloud Services; ISV Solutions; Alerts

23 Questions

24 Demo

25 Thank You (English), Merci (French), Grazie (Italian), Gracias (Spanish), Obrigado (Portuguese), Danke (German), Multumesc (Romanian), Teşekkür ederim (Turkish); also shown in Japanese, Russian, Arabic, Traditional Chinese, Simplified Chinese, Hindi and Korean
