Data Warehouse and OLAP



Presentation on theme: "Data Warehouse and OLAP"— Presentation transcript:

1 Data Warehouse and OLAP
Data Mining: Concepts and Techniques by J. Han and M. Kamber
11/19/2018 CSE591: Data Mining by H. Liu

2 What is a data warehouse?
A repository of information collected from multiple sources, stored under a unified schema at a single site
Characteristics:
  Subject-oriented
  Integrated
  Time-variant
  Nonvolatile
A semantically consistent data store for decision support at enterprise level

3 Data warehousing and a multidimensional data model
Data warehousing (DWing): the process of constructing and using a DW
OLTP and OLAP differ in: user and system orientation, data contents, database design, view, access patterns (Table 2.1)
A DW is usually modeled by a multidimensional database structure: a data cube
An example: a data cube of Student Information
  dimensions: nationality, level, status, …
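
The Student Information cube mentioned on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the `ALL` marker, and the `build_cube` helper are hypothetical and do not come from the slides or the textbook. Each cell counts the students matching one combination of dimension values, where `*` means the dimension has been aggregated away.

```python
from collections import defaultdict
from itertools import product

# Hypothetical student records; the values are invented for illustration.
STUDENTS = [
    {"nationality": "US", "level": "MS", "status": "full-time"},
    {"nationality": "US", "level": "PhD", "status": "full-time"},
    {"nationality": "CN", "level": "MS", "status": "part-time"},
    {"nationality": "CN", "level": "MS", "status": "full-time"},
]
DIMENSIONS = ("nationality", "level", "status")
ALL = "*"  # marker meaning "aggregated over this dimension"


def build_cube(records, dimensions):
    """Count records for every combination of concrete value or ALL per dimension."""
    cube = defaultdict(int)
    for rec in records:
        # Each record contributes to 2^n cells: every dimension is either
        # kept at its concrete value or replaced by ALL.
        for mask in product((True, False), repeat=len(dimensions)):
            key = tuple(rec[d] if keep else ALL for d, keep in zip(dimensions, mask))
            cube[key] += 1
    return dict(cube)


cube = build_cube(STUDENTS, DIMENSIONS)
print(cube[("CN", "MS", ALL)])   # students from CN at MS level, any status
print(cube[(ALL, ALL, ALL)])     # apex cell: all students
```

Materializing every cell like this grows exponentially with the number of dimensions, so it is only practical for toy data; it is meant to show what a cube cell is, not how a warehouse stores one.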

4 Schemas for multidimensional databases
OLTP and OLAP on the same databases(?)
  achieving high performance of both systems
Dimensions are the perspectives or entities
Facts are numeric measures
Dimension table and fact table
DW schemas
  Star
  Snowflake
  Fact constellation
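
As a rough sketch of how a fact table and its dimension tables relate in a star schema, the following Python uses frozen dataclasses. All table names, attributes, and rows here are assumptions made for illustration; they are not taken from the slides. The fact rows hold numeric measures plus foreign keys, and each key resolves to one row of a dimension table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimeDim:          # hypothetical dimension table
    time_key: int
    quarter: str
    year: int


@dataclass(frozen=True)
class ItemDim:          # hypothetical dimension table
    item_key: int
    name: str
    category: str


@dataclass(frozen=True)
class SalesFact:        # hypothetical fact table: keys plus numeric measures
    time_key: int
    item_key: int
    dollars_sold: float
    units_sold: int


time_dim = {1: TimeDim(1, "Q1", 2018), 2: TimeDim(2, "Q2", 2018)}
item_dim = {10: ItemDim(10, "laptop", "computer"), 11: ItemDim(11, "phone", "phone")}
facts = [
    SalesFact(1, 10, 1000.0, 1),
    SalesFact(1, 11, 500.0, 1),
    SalesFact(2, 10, 2000.0, 2),
]


def dollars_by(facts, dim_table, key_attr, group_attr):
    """Join each fact to one dimension row and sum dollars_sold per group."""
    totals = {}
    for fact in facts:
        dim_row = dim_table[getattr(fact, key_attr)]
        group = getattr(dim_row, group_attr)
        totals[group] = totals.get(group, 0.0) + fact.dollars_sold
    return totals


print(dollars_by(facts, item_dim, "item_key", "category"))
print(dollars_by(facts, time_dim, "time_key", "quarter"))
```

In a snowflake schema, a dimension such as `ItemDim` would itself be normalized into further tables; in a fact constellation, several fact tables would share the same dimension tables.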

5 Examples for defining schemas
define cube, define dimension …as…in
Star: Fig. 2.4, Example 2.4
Snowflake: Fig. 2.5, Example 2.5
Fact constellation: Fig. 2.6, Example 2.6

6 OLAP operations
Concept hierarchy: one for each dimension
Operations (Fig. 2.10)
  Dice
  Slice
  Pivot
  Roll-up
  Drill-down
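
The five operations listed on this slide can be illustrated on a tiny in-memory table. This is a minimal sketch, assuming a hypothetical sales data set with a city-to-country concept hierarchy; none of the names or values come from the slides or from Fig. 2.10.

```python
from collections import defaultdict

# Hypothetical base data: three dimensions (city, quarter, item), one measure (sales).
SALES = [
    {"city": "Phoenix", "quarter": "Q1", "item": "phone", "sales": 10},
    {"city": "Phoenix", "quarter": "Q2", "item": "laptop", "sales": 20},
    {"city": "Toronto", "quarter": "Q1", "item": "phone", "sales": 5},
    {"city": "Tempe", "quarter": "Q1", "item": "laptop", "sales": 7},
]
# Hypothetical concept hierarchy for the location dimension: city -> country.
CITY_TO_COUNTRY = {"Phoenix": "USA", "Tempe": "USA", "Toronto": "Canada"}


def aggregate(rows, dims):
    """Sum the sales measure grouped by the given dimensions."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row["sales"]
    return dict(totals)


def roll_up(rows, dim, hierarchy, new_dim):
    """Climb one level of a concept hierarchy: replace dim by its parent level."""
    out = []
    for row in rows:
        lifted = {k: v for k, v in row.items() if k != dim}
        lifted[new_dim] = hierarchy[row[dim]]
        out.append(lifted)
    return out


def slice_(rows, dim, value):
    """Slice: fix a single value on one dimension."""
    return [row for row in rows if row[dim] == value]


def dice(rows, conditions):
    """Dice: select a sub-cube by restricting two or more dimensions."""
    return [row for row in rows if all(row[d] in ok for d, ok in conditions.items())]


def pivot(table):
    """Pivot: swap the two axes of a 2-D aggregate keyed by (row, column)."""
    return {(col, row): value for (row, col), value in table.items()}


by_country = aggregate(roll_up(SALES, "city", CITY_TO_COUNTRY, "country"), ("country",))
by_city = aggregate(SALES, ("city",))  # drill-down: back to the finer city level
q1_only = slice_(SALES, "quarter", "Q1")
q1_phones = dice(SALES, {"quarter": {"Q1"}, "item": {"phone"}})
quarter_by_city = pivot(aggregate(SALES, ("city", "quarter")))
```

Drill-down has no function of its own here because it is the reverse of roll-up: the sketch simply aggregates the base rows again at the more detailed level.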

7 Data Warehouse Architecture
A 3-tier data warehouse architecture
  Front-end tools
  OLAP server
  Data warehouse server
Three models:
  Enterprise warehouse
  Data mart
  Virtual warehouse

8 Metadata Repository
When used in DW, metadata are the data that define warehouse objects:
  a directory to help the decision support system analyst locate the contents of the DW
  a guide to the mapping of data from the source to the DW
  a guide to the algorithms used for summarization
A metadata repository contains the DW structure, operational metadata, algorithms used, mapping, data related to system performance, business metadata

9 From DW to DM
DW back-end tools and utilities
  data extraction, cleaning, transformation, load, refresh
The uses of DW
  generate reports and answer predefined queries
  analyze summarized and detailed data
  perform multidimensional analysis
  knowledge discovery and strategic decision making using data mining
Information, analytical processing and DM
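
The back-end steps named on this slide (extraction, cleaning, transformation, load, refresh) can be pictured as a small pipeline. The sketch below is an assumption-laden toy: the two sources, their field names, and the cleaning and transformation rules are all hypothetical, chosen only to show where each step sits.

```python
# Two hypothetical operational sources with inconsistent formatting.
SOURCE_A = [
    {"id": 1, "name": " Alice ", "amount_cents": 1250},
    {"id": 2, "name": "", "amount_cents": 300},   # missing name, dropped by clean()
]
SOURCE_B = [{"id": 3, "name": "bob", "amount_cents": 990}]


def extract(*sources):
    """Extraction: gather rows from several sources (copied, so sources stay untouched)."""
    return [dict(row) for source in sources for row in source]


def clean(rows):
    """Cleaning: trim whitespace and drop rows whose name is empty."""
    cleaned = []
    for row in rows:
        name = row["name"].strip()
        if name:
            cleaned.append({**row, "name": name})
    return cleaned


def transform(rows):
    """Transformation: convert to the (assumed) unified warehouse format."""
    return [
        {"id": r["id"], "name": r["name"].title(), "amount": r["amount_cents"] / 100}
        for r in rows
    ]


def load(warehouse, rows):
    """Load: store rows keyed by id; calling it again with new rows acts as a refresh."""
    for row in rows:
        warehouse[row["id"]] = row
    return warehouse


warehouse = load({}, transform(clean(extract(SOURCE_A, SOURCE_B))))
print(warehouse)
```

Refresh is modeled here as a second call to `load` on the same dictionary, which propagates later source changes into the warehouse; real tools typically handle this incrementally.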

