Download presentation
Presentation is loading. Please wait.
1
Chapter 13 The Data Warehouse
Database Systems: Design, Implementation, and Management, Fifth Edition, Rob and Coronel
2
In this chapter, you will learn:
How operational data and decision support differ What a data warehouse is and how its data are prepared What star schemas are and how they are constructed What steps are required to implement a data warehouse successfully What data mining is and what role it plays in decision support Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
3
The Need for Data Analysis
External and internal forces require tactical and strategic decisions Search for competitive advantage Business environments are dynamic Decision-making cycle time is reduced Different managers require different decision support systems (DSS) Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
4
Decision Support Systems
Is a methodology Extracts information from data Uses information as basis for decision making Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
5
Decision Support Systems
Decision support system (DSS) Arrangement of computerized tools Used to assist managerial decision Extensive data “massaging” to produce information Used at all levels in organization Tailored to focus on specific areas and needs Interactive Provides ad hoc query tools Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
6
DSS Components Figure 13.1 Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
7
Operational vs. Decision Support Data
Operational data Relational, normalized database Optimized to support transactions Real time updates DSS Snapshot of operational data Summarized Large amounts of data Data analyst viewpoint Timespan Granularity Dimensionality Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
8
The DSS Database Requirements
Database schema Support complex (non-normalized) data Extract multidimensional time slices Data extraction and filtering End-user analytical interface Database size Very large databases (VLDBs) Contains redundant and duplicated data Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
9
Data Warehouse Integrated Subject-Oriented Time Variant Non-Volatile
Centralized Holds data retrieved from entire organization Subject-Oriented Optimized to give answers to diverse questions Used by all functional areas Time Variant Flow of data through time Projected data Non-Volatile Data never removed Always growing Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
10
Creating a Data Warehouse
Figure 13.3 Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
11
Data Marts Single-subject data warehouse subset
Decision support to small group Can be test for exploring potential benefits of Data warehouses Address local or departmental problems Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
12
DSS Architectural Styles
Traditional mainframe-based OLTP Managerial information system (MIS) with 3GL First-generation departmental DSS First-generation enterprise data warehouse using RDMS Second-generation data warehouse using MDBMS Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
13
Twelve Data Warehouse Rules
1. Separated from operational environment 2. Data are integrated 3. Contains historical data over long time horizon 4. Snapshot data captured at given time 5. Subject-oriented data 6. Mainly read-only data with periodic batch updates from operational source, no online updates 7. Development life cycle differs from classical one, data driven not process driven Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
14
Twelve Data Warehouse Rules (Con’t.)
8. Contains different levels of data detail Current and old detail Lightly and highly summarized 9. Characterized by read-only transactions to large data sets 10. Environment has system to trace data resources, transformation, and storage 11. Metadata critical components Identify and define data elements Provide the source, transformation, integration, storage, usage, relationships, and history of data elements 12. Contains charge-back mechanism for usage Enforces optimal use of data Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
15
Online Analytical Processing (OLAP)
Advanced data analysis environment Supports decision making, business modeling, and operations research activities Characteristics of OLAP Use multidimensional data analysis techniques Provide advanced database support Provide easy-to-use end-user interfaces Support client/server architecture Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
16
OLAP Client/Server Architecture
Figure 13.6 Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
17
OLAP Server Arrangement
Figure 13.7 Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
18
OLAP Server with Multidimensional Data Store Arrangement
Figure 13.8 Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
19
OLAP Server with Local Mini-Data-Marts
Figure 13.9 Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
20
Relational OLAP (ROLAP)
OLAP functionality Uses relational DB query tools Extensions to RDBMS Multidimensional data schema support Data access language and query performance optimized for multidimensional data Support for very large databases (VLDBs) Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
21
Typical ROLAP Client/Server Architecture
Figure 13.10 Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
22
Multidimensional OLAP (MOLAP)
OLAP functionality to multidimensional databases (MDBMS) Stored data in multidimensional data cube N-dimensional cubes called hypercubes Cube cache memory speeds processing Affected by how the database system handles density of data cube called sparsity Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
23
MOLAP Client/Server Architecture
Figure 13.11 Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
24
Star Schema Data-modeling technique
Maps multidimensional decision support into relational database Yield model for multidimensional data analysis while preserving relational structure of operational DB Four Components: Facts Dimensions Attributes Attribute hierarchies Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
25
Simple Star Schema Figure 13.12
Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
26
Slice and Dice View of Sales
Figure 13.14 Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
27
Star Schema Representation
Facts and dimensions represented by physical tables in data warehouse DB Fact table related to each dimension table (M:1) Fact and dimension tables related by foreign keys Subject to the primary/foreign key constraints Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
28
Star Schema for Sales Figure 13.17
Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
29
Performance-Improving Techniques for Star Schema
Normalization of dimensional tables Multiple fact tables representing different aggregation levels Denormalization of the fact tables Table partitioning and replication Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
30
Data Warehouse Implementation Road Map
Figure 13.21 Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
31
Data Mining Seeks to discover unknown data characteristics
Automatically searches data for anomalies and relationships Data mining tools Analyze data Uncover problems or opportunities Form computer models based on findings Predict business behavior with models Require minimal end-user intervention Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
32
Extraction of Knowledge from Data
Figure 13.22 Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
33
Data Mining Process Figure 13.23
Database Systems: Design, Implementation, & Management, 5th Edition, Rob & Coronel
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.