Concepts and Components: Chapters 1, 2, 7, 14, 15

1 Concepts and Components: Chapters 1, 2, 7, 14, 15

2 Agenda
- Background
- Data Warehouse vs. Operational Data Store
- Characteristics of a Data Warehouse
- Improvements in Data Warehousing
- Relationship to Business Intelligence

3 Evolution of Decision Support Technologies
- Business people need information to make plans and decisions and to assess results
- 1960s: Batch reports
- 1970s: DSSs
- 1980s: Info Centers
- 1990s: Early DWs
- 2000s: Business Intelligence
- Issues:
  - Dependency on IT resources
  - Based on OLTP or extracts
  - Functionality often pre-programmed
- "Big Data" Analytics

4 DW vs. Business Intelligence
- Short:
  - DW = populating structures with data
  - BI = using DW data
- Long:
  - DW = body of historical data, separate from the operations of the organization, used to create BI
  - BI = the delivery of timely, accurate, and useful information to decision-makers
- Broad:
  - BI = a broad category of applications, technologies, and organizational processes for gathering, storing, accessing, and analyzing data to help business users make better decisions

5 Need for Decision-Optimized Data Storage
- Business people need information to make plans and decisions and to assess results
  - What were sales volumes by region and product category for the last 3 years?
  - Which of two new medications will result in the better outcomes (higher recovery rate and shorter hospital stay)?
- Data is captured by complex operational systems (OLTPs) optimized to support well-defined transaction requirements
- It is difficult to get the needed information from data grounded in OLTPs
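
As a rough illustration of the first question on this slide, the sketch below answers "sales volumes by region and product category for the last 3 years" against a small, already-integrated table. The table name `sales`, its columns, the sample rows, and the helper `sales_by_region_and_category` are all assumptions made for the example, not part of the presentation.

```python
import sqlite3

# Hypothetical, already-integrated sales table; names and figures are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_year INTEGER, region TEXT, category TEXT, units INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (2021, "East", "Toys", 120),
        (2022, "East", "Toys", 150),
        (2023, "East", "Toys", 90),
        (2023, "West", "Books", 200),
        (2019, "West", "Books", 75),  # falls outside the 3-year window
    ],
)


def sales_by_region_and_category(conn, last_year, years=3):
    """Total units per (region, category) for the `years` ending at `last_year`."""
    first_year = last_year - years + 1
    cur = conn.execute(
        """
        SELECT region, category, SUM(units)
        FROM sales
        WHERE sale_year BETWEEN ? AND ?
        GROUP BY region, category
        ORDER BY region, category
        """,
        (first_year, last_year),
    )
    return cur.fetchall()


result = sales_by_region_and_category(conn, last_year=2023)
print(result)
```

The query is short only because the history already sits in one table keyed by region and category; in a typical OLTP schema the same question would have to be pieced together from several transaction-oriented tables.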

6 Operational vs. Informational Data
Characteristic | Operational (i.e., OLTP) | Informational (i.e., OLAP)
Data Content | Current values | Historical, derived, summarized
Data Structure | Optimized for transactions | Optimized for complex queries
Data Volume | MB/GB of data | GB/TB/PB… of data
Access Frequency | High | Medium to low
Access Type | Read, update, delete | Read-only
Usage | Predictable, repetitive | Ad hoc, random, heuristic
Response Time | Sub-second | Several seconds to minutes
Users | Large number; operational & data workers | Relatively smaller number; data & knowledge workers

7 Data Warehouse
- “… a subject-oriented, integrated, nonvolatile, and time variant collection of data in support of management decisions.”
  - Managing the Data Warehouse, W. H. Inmon, John Wiley & Sons, December 1996.
- “… a copy of transaction data specifically structured for query and analysis.”
  - The Data Warehouse Toolkit, R. Kimball, John Wiley & Sons, February 1996.
- Enterprise data, transformed, integrated, accumulated over time, optimized for decision-making, and accessible via analytical tools

8 Characteristics of a DW (à la Inmon)
- Subject-Oriented
  - As opposed to business-process oriented
- Integrated
  - Multiple sources, internal and external
  - Critical part of DW implementation
- Time-Variant
  - History, time periods important
- Non-Volatile
  - DW data not changed once stored
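
The time-variant and non-volatile characteristics can be sketched with a toy append-only store: each load adds a dated row, nothing is updated in place, and a question can be asked "as of" any date. The class `AppendOnlyHistory`, its methods, and the sample key and values are hypothetical names chosen for illustration.

```python
from datetime import date


class AppendOnlyHistory:
    """Toy store: rows are appended with an as-of date and never updated in place."""

    def __init__(self):
        self._rows = []  # list of (as_of, key, value) tuples

    def load(self, as_of, key, value):
        # Non-volatile: a change arrives as a new row; earlier rows stay untouched.
        self._rows.append((as_of, key, value))

    def value_as_of(self, key, when):
        """Return the latest value recorded for `key` on or before `when`."""
        candidates = [(d, v) for d, k, v in self._rows if k == key and d <= when]
        if not candidates:
            return None
        return max(candidates, key=lambda pair: pair[0])[1]


history = AppendOnlyHistory()
history.load(date(2023, 1, 1), "customer-42-city", "Boston")
history.load(date(2024, 6, 1), "customer-42-city", "Denver")

print(history.value_as_of("customer-42-city", date(2023, 12, 31)))
print(history.value_as_of("customer-42-city", date(2024, 7, 1)))
```

An operational system would usually overwrite the city in place, which is enough for processing today's transaction but loses the history that time-based analysis depends on.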

9 Subject Orientation
- Data organized based on:
  - How users refer to it
  - Subject areas of interest to users
  - Areas important for tracking success, performance
- Often based on transactions
Graphic Source: http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=22&ved=0CHUQFjALOAo&url=http%3A%2F%2Fwww.csun.edu%2F~hcmgt004%2FDWPresSp05.ppt&ei=U1S9VMuOJsTCgwSes4HADg&usg=AFQjCNGvjoMayU2-79jot8f2zb4J_IbZGQ&bvm=bv.83829542,d.eXY&cad=rja

10 Characteristics of a DW, cont…
- Subject-Oriented
  - Needs are business subject-focused
- Integrated
  - Multiple sources, internal and external
- Time-Variant
  - History, time periods important
- Non-Volatile
  - DW data not changed once stored
- Data Granularity

11 Data Granularity*
- Level of detail stored in database
  - Operational focus
  - Analytical focus
- Examples:
  - Life Insurance Policy vs. Life Insurance Coverage
  - Product Category vs. Product Sales
- High granularity (e.g., transactional grain) is most flexible
* See http://www.kimballgroup.com/2007/07/keep-to-the-grain-in-dimensional-modeling/ for an excellent (and brief) description of granularity
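
A minimal sketch of why transactional grain is the most flexible: detail rows can be rolled up to any coarser level, while totals stored only at the category level cannot be broken back down by product. The sample rows and the helper `roll_up` are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical transaction-grain rows: (product, category, amount).
transactions = [
    ("Puzzle", "Toys", 10.0),
    ("Robot", "Toys", 25.0),
    ("Novel", "Books", 15.0),
    ("Puzzle", "Toys", 10.0),
]


def roll_up(rows, key_index):
    """Sum the amount (position 2) grouped by the column at `key_index`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)


by_product = roll_up(transactions, 0)   # finer summary: one total per product
by_category = roll_up(transactions, 1)  # coarser summary: one total per category

print(by_product)
print(by_category)
```

Had only `by_category` been stored, the Toys total could no longer be split between Puzzle and Robot, so questions at the product level would be unanswerable.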

12 Challenges in Early DW Implementation
1. Improper or infeasible architectures, approaches
2. Insufficient attention to organizational strategy and culture
3. Early information delivery tools too complex for business users
4. Storage technology made it difficult to store much detail or history, and processing was slow

13 Improved Technology
- User-friendly tools for analysis, visualization
  - Excel
  - Tableau
  - Reporting Services, …
- Improved technology for accessing, aggregating, partitioning data
- Advances in processing technology
  - Parallel processing
  - In-memory data warehouses
- Advances in storage technology
  - RAID
  - Solid state

14 Improved Architectures
- Based on
  - Data Marts
  - Conformed dimensions
  - BI-emphasis

15 Data Warehouse vs. Data Marts
- Enterprise Data Warehouse
  - Information about ALL subjects important to the organization
Pros:
- Single, central storage
- Centralized control, architecture
Cons:
- Requires a corporate effort
- Longer (costlier) to implement
- Higher risk of failure

16 Data Warehouse vs. Data Marts, cont…
- Data Marts
  - Subsets of data warehouse that focus on a selected subject area; typically departmental in nature
Pros:
- Faster implementation
- Earlier return on investment
- Less risk of failure
- Gives project team time to learn, grow
Cons:
- Can introduce redundant data
- Can make data mart integration more complex

17 Data Warehouse Architecture: Basic

18 Data Warehouse Architecture: Types

19 BI Architecture
Source: Chaudhuri et al., An Overview of Business Intelligence Technology, Communications of the ACM, 54(8), August 2011, pp. 88-98.

20 BI Architecture, cont…
Source: Oracle Corporation. Information Management and Big Data: A Reference Architecture, Oracle White Paper, February 2013, p. 12.

21 BI Architecture as “LDW”
Source: http://skylandtech.net/2014/09/22/a-modern-data-warehouse-architecture-part-1-add-a-data-lake/

22 Architecture Components
1. Data Sources
2. Data Staging (Movement)
3. Data Storage (Warehouse)
4. Data Analysis/Discovery (Mid-tier)
5. Information Delivery (Front-end Presentation)

23 1. Data Sources
- Identifying required business data from
  - Production
  - Internal, Personal
  - Archived
  - External

24 2. Data Staging
- Extract
  - From source systems
- Transform
  - Cleanse
  - Supplement
  - Convert
  - Combine…
- Load
  - Populate data warehouse/mart tables
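
The extract, transform, and load steps on this slide can be sketched as three small functions. This is a simplified illustration under stated assumptions: the field names (`customer`, `amount`, `region_code`), the region lookup, and the in-memory list standing in for a warehouse table are all hypothetical.

```python
def extract(source_rows):
    """Extract: copy raw rows out of a source system (here, an in-memory list)."""
    return list(source_rows)


def transform(rows, region_names):
    """Transform: cleanse, convert, and supplement each raw row."""
    cleaned = []
    for row in rows:
        name = row["customer"].strip().title()  # cleanse: trim and normalize case
        if not name:
            continue  # cleanse: drop rows with no customer name
        amount = float(row["amount"])  # convert: text to number
        region = region_names.get(row["region_code"], "Unknown")  # supplement
        cleaned.append({"customer": name, "amount": amount, "region": region})
    return cleaned


def load(rows, warehouse_table):
    """Load: append the prepared rows to the target table; return the row count."""
    warehouse_table.extend(rows)
    return len(rows)


# Two hypothetical source systems, combined before the transform step.
source_a = [{"customer": "  ada lovelace ", "amount": "19.50", "region_code": "E"}]
source_b = [
    {"customer": "", "amount": "5", "region_code": "W"},
    {"customer": "alan turing", "amount": "7", "region_code": "X"},
]

warehouse = []
combined = extract(source_a) + extract(source_b)
loaded = load(transform(combined, {"E": "East", "W": "West"}), warehouse)
print(loaded, warehouse)
```

Keeping the three steps separate means a new source only needs its own extract step, while the cleansing and conversion rules stay in one place.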

25 3. Data Storage
- Data Warehouse / Data Mart
  - Relational database for structured data
  - Non-relational (e.g., Hadoop) data store for "loosely-structured" data
- Metadata
  - Relational database
    - Catalog
    - Extended properties
    - Custom tables
  - External products/tools
  - Spreadsheets…

26 4. Data Analysis: Supporting Knowledge Discovery
Forms of Discovery, marked against "Know What Info to Look For" and "Know How to Explore Info":
- Layout-Led Discovery: X, X
- Data-Led Discovery: X
- Model-Led Discovery: (no marks)
Examples:
- Layout-Led Discovery
  - Pre-Designed Reports
- Data-Led Discovery
  - OLAP Analysis
- Model-Led Discovery
  - Data Mining

27 Pre-Defined Reports
- Information pushed to user
- Content and layout pre-determined
- Can be parameter-driven
- Can support some drill-down
- May also include basic report development

28 OLAP
- Online Analytical Processing
  - Providing On-Line Analytical Processing to User Analysts, E. F. Codd, Codd & Date, Inc., 1993.
- Short Definition:
  - Class of applications or tools that support ad hoc analysis of multidimensional data
- Longer Definition:
  - “…technology that enables [users]… to gain insight into data through…fast, consistent, interactive access [to]…information that has been transformed…to reflect the real dimensionality of the enterprise…”
  - OLAP Council (www.olapcouncil.org)
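
The short definition above, ad hoc analysis of multidimensional data, can be illustrated with a toy aggregation over a handful of fact rows. This is a sketch in plain Python, not how an OLAP server is implemented; the fact rows, the dimension names, and the helper `aggregate` are assumptions for the example.

```python
from collections import defaultdict

# Hypothetical fact rows with three dimensions (year, region, product) and one measure.
facts = [
    {"year": 2023, "region": "East", "product": "Toys", "sales": 100},
    {"year": 2023, "region": "West", "product": "Toys", "sales": 80},
    {"year": 2024, "region": "East", "product": "Books", "sales": 50},
    {"year": 2024, "region": "East", "product": "Toys", "sales": 70},
]


def aggregate(rows, dimensions, measure="sales", **filters):
    """Sum `measure` grouped by `dimensions`, keeping only rows that match `filters`."""
    totals = defaultdict(int)
    for row in rows:
        if all(row[dim] == value for dim, value in filters.items()):
            totals[tuple(row[dim] for dim in dimensions)] += row[measure]
    return dict(totals)


# Roll up to a single dimension.
by_year = aggregate(facts, ["year"])

# Slice on one region, then break the result down by year and product.
east_by_year_product = aggregate(facts, ["year", "product"], region="East")

print(by_year)
print(east_by_year_product)
```

The analyst chooses the grouping dimensions and filters at query time, which is the contrast with a pre-defined report whose content and layout are fixed in advance.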

29 Data Mining
- Search for patterns in large amounts of data
  - Making connections/associations with data
  - Predicting future outcomes
- OLAP vs. Data Mining
  - “Report on the past” vs. “Predict the future”
- Part of Knowledge Discovery…

30 5. Information Delivery

31 Examples of Uses of BI/DW
Industry | Use
Retail | Customer Loyalty; Customer Service Effectiveness
Financial | Fraud Detection; Profitability by product/LOB
Airlines | Route Profitability; Customer Profitability
Manufacturing | Cost Reduction Opportunities; Product Shipments
Non Profits | Giving Campaign Effectiveness; Salvation Army Bell Ringer Effectiveness
Government | Manpower Planning; School Academic Performance
Medical | Patient Risk for Disease

32 More Examples…
Industry | Use
Marketing | Churn Reduction; Brand/Product/Company Perception
Sales | Cross-Selling Opportunities
Technology/Consulting | Estimate future engagements

33 Next Time…
- Data Warehouse Design (Dimensional Modeling)

