Published by Moris Oliver. Modified over 9 years ago.
1
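Data Cube Computation
Model dependencies among the aggregates as a lattice of views:
– (product, store, quarter) is the most detailed “view”.
– A coarser view such as (product, store) can be computed from view (product, store, quarter) by summing up all quarterly sales.
Lattice of views, from most to least detailed:
(product, store, quarter)
(product, store) · (product, quarter) · (store, quarter)
(product) · (store) · (quarter)
none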
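The dependency between views can be illustrated with a minimal sketch. The sample rows and the helper names `base_view` and `roll_up` are assumptions made for illustration; the slides do not prescribe an implementation.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, store, quarter, sales).
SALES = [
    ("pen", "s1", "Q1", 10),
    ("pen", "s1", "Q2", 5),
    ("pen", "s2", "Q1", 7),
    ("ink", "s1", "Q1", 3),
]

def base_view(rows):
    """Aggregate raw rows into the most detailed view (product, store, quarter)."""
    view = defaultdict(int)
    for product, store, quarter, amount in rows:
        view[(product, store, quarter)] += amount
    return dict(view)

def roll_up(view, keep):
    """Compute a coarser view by summing out every key position not in `keep`."""
    out = defaultdict(int)
    for key, total in view.items():
        out[tuple(key[i] for i in keep)] += total
    return dict(out)

psq = base_view(SALES)
ps = roll_up(psq, keep=(0, 1))        # (product, store): quarters summed out
p = roll_up(ps, keep=(0,))            # (product), derived from (product, store)
grand_total = roll_up(p, keep=())     # the "none" view
```

Each coarser view is derived from an already aggregated view rather than from the raw rows, which is the dependency the lattice expresses.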
2
Computation Directives
Hash/sort-based methods (Agarwal et al., VLDB’96):
1. Smallest-parent
2. Cache-results
3. Amortize-scans
4. Share-sorts
5. Share-partitions
(Figure: lattice of views over product, store, quarter.)
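As a rough illustration of the smallest-parent directive only, the sketch below assigns each view the smallest view that has exactly one more dimension. The row counts in `SIZES` and the function name `smallest_parent_plan` are hypothetical; the other four directives are not modeled here.

```python
from itertools import combinations

DIMS = ("product", "store", "quarter")

# Hypothetical row counts for each view in the lattice.
SIZES = {
    frozenset(DIMS): 6000,
    frozenset({"product", "store"}): 2000,
    frozenset({"product", "quarter"}): 800,
    frozenset({"store", "quarter"}): 400,
    frozenset({"product"}): 200,
    frozenset({"store"}): 100,
    frozenset({"quarter"}): 4,
    frozenset(): 1,
}

def smallest_parent_plan(dims, sizes):
    """Map each view to the smallest view with exactly one more dimension."""
    plan = {}
    for k in range(len(dims) - 1, -1, -1):
        for combo in combinations(dims, k):
            view = frozenset(combo)
            parents = [view | {d} for d in dims if d not in view]
            plan[view] = min(parents, key=lambda parent: sizes[parent])
    return plan

plan = smallest_parent_plan(DIMS, SIZES)
```

With these assumed sizes, (product) is computed from (product, quarter) rather than from the larger (product, store).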
3
Alternative Array-based Approach
Model data as a sparse multidimensional array:
– Partition the array into chunks (a chunk is a small sub-cube that fits in memory).
– Fast addressing based on (chunk_id, offset).
Compute aggregates in “multi-way” fashion by visiting cube cells in the order that minimizes the number of times each cell is visited, reducing memory access and storage cost.
What is the best traversal order for multi-way aggregation?
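A small sketch of chunked addressing and simultaneous aggregation, under assumed sizes: a 4×4×4 array over dimensions A, B, C, split into chunks with edge length 2. The names `address`, `coordinate`, `load`, and `multiway` are hypothetical. The sketch shows that one pass over the chunks can feed several aggregates at once; it does not attempt to answer the traversal-order question, which concerns how much of each aggregate must stay in memory.

```python
from collections import defaultdict

SHAPE = (4, 4, 4)   # assumed sizes of dimensions A, B, C
CHUNK = 2           # assumed chunk edge length
CHUNKS_PER_DIM = tuple(n // CHUNK for n in SHAPE)

def address(a, b, c):
    """Map a cell coordinate to (chunk_id, offset)."""
    ca, cb, cc = a // CHUNK, b // CHUNK, c // CHUNK
    chunk_id = (ca * CHUNKS_PER_DIM[1] + cb) * CHUNKS_PER_DIM[2] + cc
    offset = ((a % CHUNK) * CHUNK + b % CHUNK) * CHUNK + c % CHUNK
    return chunk_id, offset

def coordinate(chunk_id, offset):
    """Inverse of address(): recover (a, b, c) from (chunk_id, offset)."""
    rest, cc = divmod(chunk_id, CHUNKS_PER_DIM[2])
    ca, cb = divmod(rest, CHUNKS_PER_DIM[1])
    rest, oc = divmod(offset, CHUNK)
    oa, ob = divmod(rest, CHUNK)
    return ca * CHUNK + oa, cb * CHUNK + ob, cc * CHUNK + oc

def load(cells):
    """Store only non-empty cells, grouped by chunk (sparse representation)."""
    chunks = defaultdict(dict)
    for (a, b, c), value in cells.items():
        chunk_id, offset = address(a, b, c)
        chunks[chunk_id][offset] = value
    return chunks

def multiway(chunks):
    """Visit every stored cell once and update AB, AC, and BC together."""
    ab, ac, bc = defaultdict(int), defaultdict(int), defaultdict(int)
    for chunk_id in sorted(chunks):
        for offset, value in chunks[chunk_id].items():
            a, b, c = coordinate(chunk_id, offset)
            ab[(a, b)] += value
            ac[(a, c)] += value
            bc[(b, c)] += value
    return dict(ab), dict(ac), dict(bc)

cells = {(0, 0, 0): 1, (0, 0, 3): 2, (3, 1, 3): 4, (0, 1, 0): 8}
ab, ac, bc = multiway(load(cells))
```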
4
Roadmap
– What is the data warehouse, data mart
– Multi-dimensional data modeling
– Data warehouse design: the star schema, bitmap indexes
– The Data Cube operator: semantics and computation
– Aggregate View Selection
5
Views and Decision Support
OLAP queries are typically aggregate queries.
– Pre-computation is essential for interactive response times.
– The CUBE is in fact a collection of aggregate queries, and pre-computation is especially important: there is a lot of work on what is best to pre-compute given a limited amount of space to store pre-computed results.
Warehouses can be thought of as a collection of asynchronously replicated tables and periodically maintained views.
– This has renewed interest in view maintenance.
6
Materialized Views
A view whose tuples are stored in the database is said to be materialized.
– Provides fast access, like a (very high-level) cache.
– Need to maintain the view as the underlying tables change.
– Ideally, we want incremental view maintenance algorithms.
Close relationship to data warehousing, OLAP, (asynchronously) maintaining distributed databases, checking integrity constraints, and evaluating rules and triggers.
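To make incremental maintenance concrete, here is a minimal sketch for one simple case: a SUM view grouped by (product, store), updated from inserted and deleted base rows instead of being recomputed. The class name `SumView`, its methods, and the sample rows are assumptions for illustration; views with joins or with MIN/MAX need more machinery than this.

```python
from collections import defaultdict

class SumView:
    """Materialized SUM(sales) GROUP BY (product, store), with per-group row counts."""

    def __init__(self):
        self.total = defaultdict(int)
        self.count = defaultdict(int)   # row count tells us when a group disappears

    def apply(self, inserted=(), deleted=()):
        """Fold base-table changes into the view without rescanning the table."""
        for sign, rows in ((1, inserted), (-1, deleted)):
            for product, store, quarter, amount in rows:
                key = (product, store)
                self.total[key] += sign * amount
                self.count[key] += sign
                if self.count[key] == 0:
                    del self.total[key]
                    del self.count[key]

    def rows(self):
        return dict(self.total)

view = SumView()
view.apply(inserted=[("pen", "s1", "Q1", 10), ("pen", "s1", "Q2", 5), ("ink", "s1", "Q1", 3)])
view.apply(deleted=[("ink", "s1", "Q1", 3)], inserted=[("pen", "s2", "Q1", 7)])
```

Keeping a count next to each sum is what lets a deletion remove a group that has become empty, rather than leaving a row with total 0.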
7
Issues in View Materialization
– What algorithm should we use to maintain a materialized view?
– What views should we materialize, and what indexes should we build on the pre-computed results?
– Given a query and a set of materialized views (possibly with some indexes), can we use the materialized views to answer the query?
8
View Selection Problem
– Use some notion of benefit per view.
– Limit: disk space.
– Harinarayan et al., SIGMOD’96: pick views greedily until the space is filled.
(Figure: lattice of views over product, store, quarter.)
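The greedy idea can be sketched as follows. This is a simplified illustration, not the exact algorithm from the cited paper: it assumes the cost of answering a query on a view equals the row count of the smallest materialized view that contains its dimensions, that the most detailed view is always materialized outside the budget, and that candidates are ranked by benefit per unit of space. The sizes, the 500-row budget, and the function names are hypothetical.

```python
from itertools import combinations

DIMS = ("product", "store", "quarter")

# Hypothetical row counts for each view in the lattice.
SIZES = {
    frozenset(DIMS): 6000,
    frozenset({"product", "store"}): 2000,
    frozenset({"product", "quarter"}): 800,
    frozenset({"store", "quarter"}): 400,
    frozenset({"product"}): 200,
    frozenset({"store"}): 100,
    frozenset({"quarter"}): 4,
    frozenset(): 1,
}

def all_views(dims):
    return [frozenset(c) for k in range(len(dims) + 1) for c in combinations(dims, k)]

def greedy_select(dims, sizes, space):
    """Pick views by benefit per unit space until no remaining view fits or helps."""
    top = frozenset(dims)
    views = all_views(dims)
    chosen = [top]                            # assumed: base view is always stored
    cost = {w: sizes[top] for w in views}     # current cost of answering each view
    while True:
        best, best_ratio = None, 0.0
        for v in views:
            if v in chosen or sizes[v] > space:
                continue
            # Savings for every view answerable from v (its subsets, including v).
            benefit = sum(max(0, cost[w] - sizes[v]) for w in views if w <= v)
            ratio = benefit / sizes[v]
            if ratio > best_ratio:
                best, best_ratio = v, ratio
        if best is None:
            return chosen
        chosen.append(best)
        space -= sizes[best]
        for w in views:
            if w <= best:
                cost[w] = min(cost[w], sizes[best])

chosen = greedy_select(DIMS, SIZES, space=500)
```

Under these assumptions the benefit of a candidate is recomputed after every pick, because a view that was attractive at the start may add little once one of its descendants or ancestors has been materialized.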
9
Reality check: too many views!
– 2^n views for n dimensions (no hierarchies).
– Storage/update-time explosion.
– More pre-computation doesn’t mean better performance!
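The 2^n count follows from the fact that each dimension is either kept or aggregated away in a view. A short sketch, with the hypothetical helper `enumerate_views`, makes the growth visible:

```python
from itertools import combinations

def enumerate_views(dims):
    """List every group-by subset of `dims`, from most to least detailed."""
    return [combo for k in range(len(dims), -1, -1) for combo in combinations(dims, k)]

views = enumerate_views(("product", "store", "quarter"))
count_for_10_dims = len(enumerate_views(tuple(range(10))))
```

Three dimensions give 8 views; ten dimensions already give 1024, before any dimension hierarchies are considered.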
10
Sources of Information – Books
1. The Data Warehouse Toolkit. Ralph Kimball. ISBN 0-471-15337-0
2. The Data Warehouse Lifecycle Toolkit. Ralph Kimball, Laura Reeves, Warren Thornthwaite & Margy Ross. ISBN 0-471-25547-5
3. Data Warehouse Design Solutions. Christopher Adamson & Michael Venerable. ISBN 0-471-25195-X
11
Sources of Information – Web Sites
– Technology Guides for Data Warehousing - www.techguide.com
– Ralph Kimball Associates Articles - www.ralphkimball.com/html/articles.html
– Data Warehousing Knowledge Center - www.datawarehousing.org
– The Data Warehousing Information Center - www.dwinfocenter.org
– Documenting data replication and data transformation sites on the Net - www.datawarehousing.com
– DM Review Business Intelligence & Data Warehousing Enabling E-Business - www.dmreview.com/
– The Data Warehousing Institute - www.dw-institute.com
– Intelligent Enterprise Magazine - www.intelligententerprise.com/
– DataWarehousing Forum - www.datawarehousing.com/forum/