Published by Tyrone Hood. Modified over 8 years ago.
1
Data Warehousing and OLAP
Outline: models & operations; implementing a warehouse; future directions
2
Hector Garcia Molina: Data Warehousing and OLAP
Warehouse Models & Operators
Data models: relations; stars & snowflakes; cubes
Operators: slice & dice; roll-up, drill-down; pivoting; other
3
Star
4
Star Schema
sale(orderId, date, custId, prodId, storeId, qty, amt)
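As a rough sketch of how the star schema above could be declared, the following uses Python's built-in sqlite3 module. The sale columns come from the slide; the dimension tables product, store, and customer and their non-key columns are assumptions made for illustration, since the transcript does not list them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: names and non-key columns are hypothetical.
    CREATE TABLE product  (prodId  TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE store    (storeId TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE customer (custId  TEXT PRIMARY KEY, name TEXT);

    -- Fact table: columns as listed on the slide.
    CREATE TABLE sale (
        orderId TEXT PRIMARY KEY,
        date    INTEGER,
        custId  TEXT REFERENCES customer(custId),
        prodId  TEXT REFERENCES product(prodId),
        storeId TEXT REFERENCES store(storeId),
        qty     INTEGER,
        amt     INTEGER
    );
""")

# Column 1 of each PRAGMA table_info row is the column name.
fact_columns = [row[1] for row in conn.execute("PRAGMA table_info(sale)")]
print(fact_columns)
```

The fact table sits in the middle and points at each dimension through a foreign key, which is what gives the schema its star shape.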
5
Terms: fact table; dimension tables; measures
6
Dimension Hierarchies
Tables in the figure: store, sType, city, region
Related terms: snowflake schema; constellations
7
Cube
Fact table view and multi-dimensional cube view (dimensions = 2)
8
3-D Cube
Fact table view and multi-dimensional cube view (dimensions = 3), with layers for day 1 and day 2
9
ROLAP vs. MOLAP
ROLAP: Relational On-Line Analytical Processing
MOLAP: Multi-Dimensional On-Line Analytical Processing
10
Aggregates
Add up amounts for day 1.
In SQL: SELECT sum(amt) FROM SALE WHERE date = 1
Result shown on the slide: 81
11
Aggregates
Add up amounts by day.
In SQL: SELECT date, sum(amt) FROM SALE GROUP BY date
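The two aggregate queries can be tried out with Python's sqlite3 module. This is only a sketch: the SALE rows below are illustrative values chosen for the example, not data recovered from the slides, and the table is cut down to four columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALE (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")

# Illustrative rows (assumed, not from the source).
rows = [
    ("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),  ("p1", "s1", 2, 44), ("p1", "s2", 2, 4),
]
conn.executemany("INSERT INTO SALE VALUES (?, ?, ?, ?)", rows)

# Total for a single day: the WHERE clause filters before summing.
day1_total = conn.execute("SELECT sum(amt) FROM SALE WHERE date = 1").fetchone()[0]

# One total per day: GROUP BY produces one output row per distinct date.
by_day = conn.execute(
    "SELECT date, sum(amt) FROM SALE GROUP BY date ORDER BY date"
).fetchall()

print(day1_total)  # 81
print(by_day)      # [(1, 81), (2, 48)]
```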
12
Another Example
Add up amounts by day and product.
In SQL: SELECT date, prodId, sum(amt) FROM SALE GROUP BY date, prodId
Moving from the by-day totals to these by-day-and-product totals is a drill-down; moving back is a rollup.
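One way to see the relationship between drill-down and rollup is that the coarser totals can be computed from the finer ones without going back to the fact table. The sketch below shows this in plain Python; the rows and the helper name group_sum are assumptions for illustration.

```python
from collections import defaultdict

# Illustrative fact rows: (prodId, storeId, date, amt). Values are assumed.
sales = [
    ("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),  ("p1", "s1", 2, 44), ("p1", "s2", 2, 4),
]

def group_sum(rows, key):
    """Sum the last field of each row, grouped by key(row)."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[-1]
    return dict(totals)

# Drill-down level: one total per (date, product).
by_day_product = group_sum(sales, key=lambda r: (r[2], r[0]))

# Rollup: collapse the product dimension out of the finer result.
by_day = defaultdict(int)
for (date, _prod), amt in by_day_product.items():
    by_day[date] += amt
by_day = dict(by_day)

print(by_day_product)  # {(1, 'p1'): 62, (1, 'p2'): 19, (2, 'p1'): 48}
print(by_day)          # {1: 81, 2: 48}
```

This only works directly for aggregates such as sum, count, max, and min; a median cannot be rolled up from per-group medians.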
13
Aggregates
Operators: sum, count, max, min, median, avg
“Having” clause
Using the dimension hierarchy: average by region (within store); maximum by month (within date)
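The sketch below illustrates two of the points above with sqlite3: a HAVING clause that filters groups after aggregation, and an average by region obtained by joining the fact table to a store dimension. The STORE table, its region column, and all row values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SALE  (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER);
    -- Hypothetical dimension table mapping each store to a region.
    CREATE TABLE STORE (storeId TEXT PRIMARY KEY, region TEXT);

    INSERT INTO SALE VALUES
        ('p1','s1',1,12), ('p2','s1',1,11), ('p1','s3',1,50),
        ('p2','s2',1,8),  ('p1','s1',2,44), ('p1','s2',2,4);
    INSERT INTO STORE VALUES ('s1','east'), ('s2','west'), ('s3','west');
""")

# HAVING keeps only the groups whose aggregate passes the test.
busy_days = conn.execute(
    "SELECT date, sum(amt) FROM SALE GROUP BY date HAVING sum(amt) > 50"
).fetchall()

# Aggregating along the hierarchy: store rolls up to region.
avg_by_region = conn.execute("""
    SELECT st.region, avg(s.amt)
    FROM SALE s JOIN STORE st ON s.storeId = st.storeId
    GROUP BY st.region
    ORDER BY st.region
""").fetchall()

print(busy_days)      # [(1, 81)]
print(avg_by_region)
```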
14
Cube Aggregation
Example: computing sums
Figure: cube with layers day 1 and day 2; rollup and drill-down arrows between the cube and its sums (129 ...)
15
Cube Operators
Examples on the cube (layers day 1, day 2; 129 ...): sale(c1,*,*); sale(c2,p2,*); sale(*,*,*)
16
Extended Cube
Cube (layers day 1, day 2) extended with a * entry on each dimension, e.g. sale(*,p2,*)
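The sale(c1,*,*) notation can be read as "sum over every value of each starred dimension". A minimal sketch of that reading follows, assuming a cube keyed by (customer, product, day); the cell values and the function name sale are illustrative, not taken from the slides.

```python
ALL = "*"

# Hypothetical cube cells keyed by (customer, product, day).
cube = {
    ("c1", "p1", 1): 12, ("c1", "p2", 1): 11, ("c3", "p1", 1): 50,
    ("c2", "p2", 1): 8,  ("c1", "p1", 2): 44, ("c2", "p1", 2): 4,
}

def sale(cust, prod, day):
    """Sum every cell that matches the pattern; '*' matches any value."""
    pattern = (cust, prod, day)
    return sum(
        amt
        for cell, amt in cube.items()
        if all(want == ALL or want == have for want, have in zip(pattern, cell))
    )

print(sale("c1", ALL, ALL))   # 67
print(sale("c2", "p2", ALL))  # 8
print(sale(ALL, "p2", ALL))   # 19
print(sale(ALL, ALL, ALL))    # 129
```

An extended cube would store these starred results as extra cells instead of recomputing them on each call.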
17
Aggregation Using Hierarchies
Hierarchy: customer, region, country (customer c1 in Region A; customers c2, c3 in Region B)
Cube layers: day 1, day 2
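Aggregating along a hierarchy amounts to replacing each customer by its region before summing. The sketch below uses the customer-to-region assignment stated on the slide; the cube cells themselves are assumed values for illustration.

```python
from collections import defaultdict

# Customer-to-region mapping as stated on the slide.
region_of = {"c1": "Region A", "c2": "Region B", "c3": "Region B"}

# Hypothetical cube cells keyed by (customer, product, day).
cube = {
    ("c1", "p1", 1): 12, ("c1", "p2", 1): 11, ("c3", "p1", 1): 50,
    ("c2", "p2", 1): 8,  ("c1", "p1", 2): 44, ("c2", "p1", 2): 4,
}

# Roll customers up to regions, keeping the day dimension.
by_region_day = defaultdict(int)
for (cust, _prod, day), amt in cube.items():
    by_region_day[(region_of[cust], day)] += amt
by_region_day = dict(by_region_day)

print(by_region_day)
# {('Region A', 1): 23, ('Region B', 1): 58, ('Region A', 2): 44, ('Region B', 2): 4}
```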
18
Pivoting
Fact table view and multi-dimensional cube view (layers day 1, day 2)
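Pivoting can be pictured as choosing which dimensions become the rows and columns of a cross-tab built from the fact table. The following is a small sketch under that interpretation; the rows and the helper name pivot are assumptions for illustration.

```python
from collections import defaultdict

# Illustrative fact rows: (prodId, storeId, date, amt). Values are assumed.
sales = [
    ("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),  ("p1", "s1", 2, 44), ("p1", "s2", 2, 4),
]

def pivot(rows, row_idx, col_idx, val_idx=-1):
    """Build a cross-tab: table[row_key][col_key] = summed value."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[row_idx]][r[col_idx]] += r[val_idx]
    return {k: dict(v) for k, v in table.items()}

by_product = pivot(sales, row_idx=0, col_idx=2)  # products down, dates across
by_date = pivot(sales, row_idx=2, col_idx=0)     # same data, axes swapped

print(by_product)  # {'p1': {1: 62, 2: 48}, 'p2': {1: 19}}
print(by_date)     # {1: {'p1': 62, 'p2': 19}, 2: {'p1': 48}}
```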
19
Integration
Data cleaning; data loading; derived data
Architecture diagram: Source, Integration, Warehouse, Query & Analysis, Client, Metadata
20
Data Cleaning
Migration (e.g., yen to dollars)
Scrubbing: use domain-specific knowledge (e.g., social security numbers)
Fusion (e.g., mail list, customer merging)
Auditing: discover rules & relationships (like data mining)
Fusion figure: billing DB and service DB; customer1(Joe) and customer2(Joe) become merged_customer(Joe)
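To make the fusion step concrete, here is a deliberately naive sketch that merges customer records from two sources by a normalized name. The record fields, ids, and the matching rule are all assumptions; real fusion usually needs much more careful matching than a lowercase name comparison.

```python
# Hypothetical records from two source databases.
billing = [{"id": "b1", "name": "Joe", "address": "1 Main St"}]
service = [{"id": "s7", "name": "joe ", "phone": "555-0100"}]

def match_key(record):
    """Naive matching rule (assumed): same name after trimming and lowercasing."""
    return record["name"].strip().lower()

merged = {}
for records in (billing, service):
    for rec in records:
        target = merged.setdefault(
            match_key(rec),
            {"name": rec["name"].strip().title(), "source_ids": []},
        )
        target["source_ids"].append(rec["id"])
        # Copy attributes the merged record does not have yet.
        for field, value in rec.items():
            if field not in ("id", "name"):
                target.setdefault(field, value)

print(merged["joe"])
```

Keeping the source ids on the merged record makes it possible to trace each merged customer back to the databases it came from.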
21
Loading Data
Incremental vs. refresh
Off-line vs. on-line
Frequency of loading: at night, 1x a week/month, continuously
Parallel/partitioned load
22
Derived Data
Derived warehouse data: indexes; aggregates; materialized views (next slide)
When to update derived data?
Incremental vs. refresh
23
Materialized Views
Define new warehouse relations using SQL expressions; the resulting relation does not exist at any source.
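SQLite has no built-in materialized views, so the sketch below emulates one by storing a query result as an ordinary table (CREATE TABLE ... AS SELECT). It then shows the stored result going stale after a new sale, and one incremental way to bring it up to date. Table names, rows, and the update rule are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SALE (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER);
    INSERT INTO SALE VALUES ('p1','s1',1,12), ('p2','s1',1,11), ('p1','s1',2,44);

    -- Emulated materialized view: the query result is stored as a table.
    CREATE TABLE SALES_BY_DAY AS
        SELECT date, sum(amt) AS total FROM SALE GROUP BY date;
""")

def read_view():
    return conn.execute("SELECT date, total FROM SALES_BY_DAY ORDER BY date").fetchall()

def recompute():
    return conn.execute("SELECT date, sum(amt) FROM SALE GROUP BY date ORDER BY date").fetchall()

before = read_view()

# A new sale arrives; the stored result does not change by itself.
new_row = ("p2", "s2", 2, 4)
conn.execute("INSERT INTO SALE VALUES (?, ?, ?, ?)", new_row)
stale = read_view()

# Incremental maintenance: apply only the delta to the affected group.
# (This sketch assumes the date already has a row in SALES_BY_DAY.)
conn.execute("UPDATE SALES_BY_DAY SET total = total + ? WHERE date = ?",
             (new_row[3], new_row[2]))
incremental = read_view()

print(before)       # [(1, 23), (2, 44)]
print(stale)        # [(1, 23), (2, 44)]
print(incremental)  # [(1, 23), (2, 48)]
```

A full refresh would instead delete the stored rows and rerun the defining query; the incremental path touches only the groups affected by the new data.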
24
Processing
ROLAP servers vs. MOLAP servers; index structures; what to materialize?; algorithms
Architecture diagram: Source, Integration, Warehouse, Query & Analysis, Client, Metadata
25
ROLAP Server
Relational OLAP server
Components in the figure: tools, utilities, ROLAP server, relational DBMS
Special indices, tuning; schema is “denormalized”
26
MOLAP Server
Multi-Dimensional OLAP server
Components in the figure: M.D. tools, utilities, multi-dimensional server (could also sit on a relational DBMS)
Example Sales cube: Product (milk, soda, eggs, soap), City (A, B), Date (1, 2, 3, 4)
27
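Join
“Combine” SALE, PRODUCT relations
In SQL: SELECT * FROM SALE, PRODUCT WHERE SALE.prodId = PRODUCT.prodId
(The join condition assumes PRODUCT is keyed by prodId; without a condition the query returns every pairing of a SALE row with a PRODUCT row.)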
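The effect of the join condition is easy to check with sqlite3. In this sketch the PRODUCT columns (prodId, name, price) and all row values are assumptions for illustration; listing two tables in FROM with no condition yields the cross product, while matching on the product key yields one output row per sale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SALE    (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER);
    -- Hypothetical dimension table.
    CREATE TABLE PRODUCT (prodId TEXT PRIMARY KEY, name TEXT, price INTEGER);

    INSERT INTO SALE VALUES
        ('p1','s1',1,12), ('p2','s1',1,11), ('p1','s3',1,50),
        ('p2','s2',1,8),  ('p1','s1',2,44), ('p1','s2',2,4);
    INSERT INTO PRODUCT VALUES ('p1','bolt',10), ('p2','nut',5);
""")

# No join condition: every SALE row pairs with every PRODUCT row.
cross_count = conn.execute("SELECT count(*) FROM SALE, PRODUCT").fetchone()[0]

# With the condition, each sale is combined only with its own product.
joined = conn.execute("""
    SELECT SALE.prodId, name, price, storeId, date, amt
    FROM SALE, PRODUCT
    WHERE SALE.prodId = PRODUCT.prodId
    ORDER BY date, storeId, SALE.prodId
""").fetchall()

print(cross_count)  # 12
print(len(joined))  # 6
print(joined[0])    # ('p1', 'bolt', 10, 's1', 1, 12)
```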
28
Join Indexes
Figure: join index
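Since the figure for this slide is lost, the following is only a sketch of the general idea of a join index: for each product, precompute the positions of the sale rows that join with it, so the join can be answered by lookup instead of a scan. The data layout and names are assumptions.

```python
from collections import defaultdict

# Hypothetical dimension table: prodId -> (name, price).
product = {"p1": ("bolt", 10), "p2": ("nut", 5)}

# Hypothetical fact rows: (prodId, storeId, date, amt); list position is the row id.
sale = [
    ("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),  ("p1", "s1", 2, 44), ("p1", "s2", 2, 4),
]

# Build the join index once: product key -> ids of matching sale rows.
join_index = defaultdict(list)
for row_id, (prod_id, _store, _date, _amt) in enumerate(sale):
    if prod_id in product:
        join_index[prod_id].append(row_id)

# Answer "all sales of product p2" by lookup, without scanning SALE.
p2_sales = [sale[row_id] for row_id in join_index["p2"]]

print(dict(join_index))  # {'p1': [0, 2, 4, 5], 'p2': [1, 3]}
print(p2_sales)          # [('p2', 's1', 1, 11), ('p2', 's2', 1, 8)]
```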
29
What to Materialize?
Store in the warehouse results useful for common queries
Example: materialize total sales (figure: cube with layers day 1, day 2; 129 ...)
30
Cube Aggregates Lattice
Lattice nodes: (city, product, date); (city, product), (city, date), (product, date); (city), (product), (date); all
Figure: cube with layers day 1, day 2 (129)
Use a greedy algorithm to decide what to materialize
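The slide does not say which greedy algorithm is meant, so the sketch below shows one common formulation: at each step, materialize the view that saves the most rows scanned across all queries it can answer. The lattice is generated from the three dimensions on the slide; the per-view row counts are invented for the example.

```python
from itertools import combinations

DIMS = ("city", "product", "date")

# Every subset of the dimensions is one lattice node; the empty set is "all".
views = [frozenset(c) for k in range(len(DIMS), -1, -1) for c in combinations(DIMS, k)]

# Hypothetical row counts per view; a real system would estimate these.
size = {
    frozenset(DIMS): 6000,
    frozenset({"city", "product"}): 800,
    frozenset({"city", "date"}): 3000,
    frozenset({"product", "date"}): 6000,
    frozenset({"city"}): 50,
    frozenset({"product"}): 100,
    frozenset({"date"}): 365,
    frozenset(): 1,
}

def query_cost(view, materialized):
    """Rows scanned to answer `view` from its smallest materialized ancestor."""
    return min(size[m] for m in materialized if view <= m)

def benefit(candidate, materialized):
    """Rows saved, summed over every view answerable from `candidate`."""
    return sum(
        max(0, query_cost(v, materialized) - size[candidate])
        for v in views
        if v <= candidate
    )

def greedy_select(k):
    materialized = [frozenset(DIMS)]  # the base view is always stored
    for _ in range(k):
        candidates = [v for v in views if v not in materialized]
        best = max(candidates, key=lambda v: benefit(v, materialized))
        materialized.append(best)
    return materialized[1:]

chosen = greedy_select(2)
print([sorted(v) for v in chosen])  # [['city', 'product'], ['date']]
```

With these sizes the first pick is (city, product), because it is small and answers four of the eight views; the second pick is (date), which the first pick cannot answer.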
31
Dimension Hierarchies
Levels in the figure: all; state; city
32
Dimension Hierarchies
Lattice nodes: (city, product, date); (city, product), (city, date), (product, date); (city), (product), (date); all; (state, product, date); (state, product), (state, date); (state)
Not all arcs shown ...
33
Interesting Hierarchy
Levels in the figure: all; years; quarters; months; weeks; days
Conceptual dimension table
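One reason a time hierarchy with both weeks and months is interesting is that weeks do not nest inside months, so the levels cannot form a single chain. The sketch below checks this for the 2024 calendar with Python's datetime module; using ISO week numbering and that particular year are assumptions of the example.

```python
from collections import defaultdict
from datetime import date, timedelta

months_in_week = defaultdict(set)     # (iso_year, iso_week) -> months it touches
quarters_in_month = defaultdict(set)  # month -> quarters it touches

day = date(2024, 1, 1)
while day.year == 2024:
    iso_year, iso_week, _weekday = day.isocalendar()
    months_in_week[(iso_year, iso_week)].add(day.month)
    quarters_in_month[day.month].add((day.month - 1) // 3 + 1)
    day += timedelta(days=1)

# Weeks that contain days from two different months.
straddling = sorted(w for w, months in months_in_week.items() if len(months) > 1)

print(months_in_week[(2024, 5)])  # {1, 2}: this week runs from Jan 29 to Feb 4
print(len(straddling) > 0)        # True
print(all(len(q) == 1 for q in quarters_in_month.values()))  # True
```

Days therefore roll up along two separate paths, one through weeks and one through months, quarters, and years.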