1
Data Warehouse
2
Definition
Data Warehouse: An integrated and consistent store of subject-oriented data that is obtained from a variety of sources and formatted into a meaningful context to support decision-making in an organization.
3
Need for Data Warehousing
Integrated, company-wide view of high-quality information.
Separation of operational and informational systems and data. Table 14-1.
4
Examples of heterogeneous data
5
Factors Allowing Data Warehousing
Relational DBMS.
Advances in hardware: speed and storage capacity.
End-user computing interfaces and tools.
6
Data Warehouse Architectures
Two-level - Fig
Three-level - Fig
Operational data.
Enterprise data warehouse (EDW) - single source of data for decision making.
Data marts - limited scope; data selected from EDW.
7
Generic data warehouse architecture
8
Three-layer architecture
9
Reasons for the Three-Level Architecture
EDW and data marts have different purposes and data architectures.
Data transformation is complex and is best performed in two steps.
Data marts customize decision support for different groups.
10
Three-Level Data Architecture
Fig
Operational data.
Reconciled data.
Derived data.
11
Three-layer data architecture
12
Data Characteristics
Status vs. Event data. Fig
Transient vs. Periodic data. Fig. 14-6,7.
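The distinctions on this slide can be sketched in a few lines of Python. This is only an illustration under assumed names: the account key, the balances, the dates, and the helper functions are all hypothetical and do not come from the slides. An event (a withdrawal) moves the status (the balance) from one value to another; the transient store keeps only the new status, while the periodic store keeps every status with its date.

```python
from datetime import date

# Transient store: one current row per key; an update overwrites the old value.
transient = {"acct-1": 750}  # hypothetical account and balance

def update_transient(store, key, new_balance):
    store[key] = new_balance  # the previous status is lost

# Periodic store: rows are never overwritten; each new status is appended
# together with the date on which it became valid.
periodic = [("acct-1", date(2024, 1, 1), 750)]

def update_periodic(store, key, as_of, new_balance):
    store.append((key, as_of, new_balance))

def balance_as_of(store, key, as_of):
    """Return the most recent status recorded on or before `as_of`."""
    rows = [r for r in store if r[0] == key and r[1] <= as_of]
    return max(rows, key=lambda r: r[1])[2] if rows else None

# Event: a withdrawal of 50 changes the status from 750 to 700.
update_transient(transient, "acct-1", 700)
update_periodic(periodic, "acct-1", date(2024, 1, 5), 700)
```

Only the periodic store can still answer a question about the past, for example `balance_as_of(periodic, "acct-1", date(2024, 1, 3))`.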
13
Example of DBMS log entry
14
Transient operational data
15
Reconciled Data Characteristics
Detailed
Historical
Normalized
Enterprise-wide
Quality controlled
16
The Data Reconciliation Process
Fig
Capture
  Static - initial load.
  Incremental - ongoing update.
Scrub or data cleansing
  Pattern recognition and other artificial intelligence techniques.
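A minimal sketch of the two capture modes and a scrub step, assuming a source table whose rows carry a last-modified timestamp. The rows, field layout, and function names are hypothetical. The scrub function is a simple rule-based stand-in, not the pattern-recognition or AI techniques the slide mentions.

```python
from datetime import datetime

# Hypothetical source rows: (customer_id, name, last_modified)
SOURCE = [
    (1, "  alice SMITH ", datetime(2024, 1, 1)),
    (2, "Bob  Jones", datetime(2024, 2, 1)),
    (3, "carol white", datetime(2024, 3, 1)),
]

def static_capture(rows):
    """Initial load: take a full snapshot of the source."""
    return list(rows)

def incremental_capture(rows, since):
    """Ongoing update: take only rows modified after the last capture."""
    return [r for r in rows if r[2] > since]

def scrub_name(raw):
    """Rule-based cleansing: collapse whitespace and normalize capitalization."""
    return " ".join(part.capitalize() for part in raw.split())

snapshot = static_capture(SOURCE)
changes = incremental_capture(SOURCE, since=datetime(2024, 1, 15))
clean_names = [scrub_name(name) for _, name, _ in snapshot]
```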
17
Steps in data reconciliation
18
The Data Reconciliation Process
Transform
  Convert the data format from the source to the target system.
Record-Level Functions
  Selection.
  Joining.
  Aggregation (for data marts).
Field-Level Functions
  Single-field transformation, Fig
  Multi-field transformation, Fig
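The record-level and field-level functions listed above can be illustrated as follows. This is a sketch on made-up data: the order rows, the customer lookup, the region code table, and all field names are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical source data.
orders = [
    {"order_id": 1, "cust_id": 10, "region": "EU", "qty": 2, "unit_price": 500},
    {"order_id": 2, "cust_id": 11, "region": "US", "qty": 1, "unit_price": 300},
    {"order_id": 3, "cust_id": 10, "region": "EU", "qty": 3, "unit_price": 200},
    {"order_id": 4, "cust_id": 11, "region": "EU", "qty": 4, "unit_price": 100},
]
customers = {10: "Acme", 11: "Globex"}
REGION_NAMES = {"EU": "Europe", "US": "United States"}

# Record-level: selection keeps only the rows that meet a condition.
selected = [o for o in orders if o["region"] == "EU"]

# Record-level: joining brings in data from a second source.
joined = [{**o, "cust_name": customers[o["cust_id"]]} for o in selected]

# Field-level: a single-field transformation maps one source field to one
# target field (here a code lookup); a multi-field transformation combines
# several source fields into one target field (here qty * unit_price).
transformed = [
    {
        **o,
        "region_name": REGION_NAMES[o["region"]],
        "amount": o["qty"] * o["unit_price"],
    }
    for o in joined
]

# Record-level: aggregation summarizes detail rows, e.g. for a data mart.
totals = defaultdict(int)
for row in transformed:
    totals[row["cust_name"]] += row["amount"]
totals = dict(totals)
```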
19
The Data Reconciliation Process
Load and Index
Refresh Mode
  When the warehouse is first created.
  Static data capture.
Update Mode
  Ongoing update of the warehouse.
  Incremental data capture.
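The two load modes might be sketched like this, with the warehouse modeled as an in-memory dictionary keyed by row id. The function names and sample rows are hypothetical, and the update step here simply inserts or overwrites by key, which is a simplification of what a real warehouse loader would do.

```python
def refresh_load(warehouse, snapshot):
    """Refresh mode: rebuild the target from a full (static) snapshot."""
    warehouse.clear()
    warehouse.update({row["id"]: row for row in snapshot})
    return len(warehouse)

def update_load(warehouse, changes):
    """Update mode: write only the rows captured incrementally."""
    for row in changes:
        warehouse[row["id"]] = row
    return len(changes)

def build_index(warehouse, field):
    """Index step: map each value of `field` to the ids of the rows holding it."""
    index = {}
    for key, row in warehouse.items():
        index.setdefault(row[field], []).append(key)
    return index

warehouse = {}
loaded = refresh_load(warehouse, [{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Rome"}])
applied = update_load(warehouse, [{"id": 3, "city": "Oslo"}])
city_index = build_index(warehouse, "city")
```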
20
Derived Data Characteristics
Type of data
  Detailed, possibly periodic.
  Aggregated.
Distributed to departmental servers.
Implemented in star schema.
21
Star Schema
Also called the dimensional model.
Fact and dimension tables. Fig ,12, 13.
Grain of a fact table - time period for each record.
Multiple fact tables - Fig
Snowflake Schema - Fig
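As a small illustration of fact and dimension tables, the sketch below builds a star schema in an in-memory SQLite database using Python's standard `sqlite3` module. The table names, columns, and sample rows are assumptions made for this example and are not taken from the figures. The grain of the hypothetical `sales` fact table is one row per product, store, and month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per member.
CREATE TABLE product (product_key INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE store   (store_key   INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE period  (period_key  INTEGER PRIMARY KEY, month TEXT);

-- Fact table: numeric measures plus a foreign key to each dimension.
CREATE TABLE sales (
    product_key INTEGER REFERENCES product(product_key),
    store_key   INTEGER REFERENCES store(store_key),
    period_key  INTEGER REFERENCES period(period_key),
    units_sold  INTEGER,
    PRIMARY KEY (product_key, store_key, period_key)
);

INSERT INTO product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO store   VALUES (1, 'Oslo'), (2, 'Rome');
INSERT INTO period  VALUES (1, '2024-01'), (2, '2024-02');
INSERT INTO sales   VALUES (1, 1, 1, 10), (1, 2, 1, 5), (2, 1, 1, 7), (1, 1, 2, 3);
""")

# A typical star-schema query: join the fact table to a dimension
# and aggregate the measure by a dimension attribute.
units_by_city = conn.execute("""
    SELECT s.city, SUM(f.units_sold)
    FROM sales AS f
    JOIN store AS s ON s.store_key = f.store_key
    GROUP BY s.city
    ORDER BY s.city
""").fetchall()
```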
22
Components of a star schema
23
Star schema example
24
Star schema with sample data
25
Star schema with two fact tables
26
Example of snowflake schema
27
Types of Data Marts
Dependent - Populated from the EDW.
Independent - Data taken directly from the operational databases.
28
The User Interface
The role of metadata.
Traditional query and reporting tools.
On-line analytical processing.
  The use of a set of graphical tools that provides users with multidimensional views of their data and allows them to analyze the data using simple windowing techniques.
29
The User Interface
Fig. 14-16. Slicing a cube.
Pivot
  Rotate the view for a particular data point to obtain another perspective.
  E.g. take a value from the units column and obtain by-store values.
Drill-down - Fig
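The three operations above can be sketched on a tiny cube held in a Python dictionary. The dimensions (product, store, month), the unit counts, and the function names are hypothetical, chosen only to make the operations concrete.

```python
from collections import defaultdict

# Hypothetical cube: (product, store, month) -> units sold.
cube = {
    ("Widget", "Oslo", "Jan"): 10,
    ("Widget", "Rome", "Jan"): 5,
    ("Widget", "Oslo", "Feb"): 3,
    ("Gadget", "Oslo", "Jan"): 7,
}

def slice_cube(cube, product):
    """Slice: fix one dimension value and keep the remaining two dimensions."""
    return {(store, month): units
            for (p, store, month), units in cube.items() if p == product}

def pivot(table):
    """Pivot: swap the row and column dimensions of a two-dimensional view."""
    return {(col, row): value for (row, col), value in table.items()}

def total_units(cube, product, month):
    """Summary value for one product and month, across all stores."""
    return sum(units for (p, _, m), units in cube.items()
               if p == product and m == month)

def drill_down(cube, product, month):
    """Drill-down: break that summary value into its by-store components."""
    by_store = defaultdict(int)
    for (p, store, m), units in cube.items():
        if p == product and m == month:
            by_store[store] += units
    return dict(by_store)

widget_slice = slice_cube(cube, "Widget")
rotated = pivot(widget_slice)
summary = total_units(cube, "Widget", "Jan")
detail = drill_down(cube, "Widget", "Jan")
```

Here `summary` is the single units figure a user would start from, and `detail` is the by-store breakdown obtained from it.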
30
Slicing a data cube
31
The User Interface
Data Mining
  Knowledge discovery.
  Search for patterns in the data. Table 14-3, 4.
Data Visualization