Physical Storage Indexes Partitions Materialized views March 2004

Slides:



Advertisements
Similar presentations
Yukon – What is New Rajesh Gala. Yukon – What is new.NET Framework Programming Data Types Exception Handling Batches Databases Database Engine Administration.
Advertisements

Tuning: overview Rewrite SQL (Leccotech)Leccotech Create Index Redefine Main memory structures (SGA in Oracle) Change the Block Size Materialized Views,
Data Warehousing and Decision Support, part 2
1 DynaMat A Dynamic View Management System for Data Warehouses Vicky :: Cao Hui Ping Sherman :: Chow Sze Ming CTH :: Chong Tsz Ho Ronald :: Woo Lok Yan.
Query Optimization CS634 Lecture 12, Mar 12, 2014 Slides based on “Database Management Systems” 3 rd ed, Ramakrishnan and Gehrke.
Advanced Database Systems September 2013 Dr. Fatemeh Ahmadi-Abkenari 1.
March 2010ACS-4904 Ron McFadyen1 Indexes B-tree index Bitmapped index Bitmapped join index A data warehousing DBMS will likely provide these, or variations,
March 2010ACS-4904 Ron McFadyen1 Aggregate management References: Lawrence Corr Aggregate improvement
Advanced Querying OLAP Part 2. Context OLAP systems for supporting decision making. Components: –Dimensions with hierarchies, –Measures, –Aggregation.
Data Warehousing - 3 ISYS 650. Snowflake Schema one or more dimension tables do not join directly to the fact table but must join through other dimension.
Cloud Computing Lecture Column Store – alternative organization for big relational data.
CS 345: Topics in Data Warehousing Tuesday, October 19, 2004.
Database Management 9. course. Execution of queries.
Chapter 6 1 © Prentice Hall, 2002 The Physical Design Stage of SDLC (figures 2.4, 2.5 revisited) Project Identification and Selection Project Initiation.
BI Terminologies.
SQL Jan 20,2014. DBMS Stores data as records, tables etc. Accepts data and stores that data for later use Uses query languages for searching, sorting,
Indexes and Views Unit 7.
Variant Indexes. Specialized Indexes? Data warehouses are large databases with data integrated from many independent sources. Queries are often complex.
Physical Database Design Purpose- translate the logical description of data into the technical specifications for storing and retrieving data Goal - create.
Chapter 5 Index and Clustering
Chapter 8 Views and Indexes 第 8 章 视图与索引. 8.1 Virtual Views  Views:  “virtual relations”. Another class of SQL relations that do not exist physically.
File Processing : Query Processing 2008, Spring Pusan National University Ki-Joune Li.
Manipulating Data Lesson 3. Objectives Queries The SELECT query to retrieve or extract data from one table, how to retrieve or extract data by using.
Database Systems, 8 th Edition SQL Performance Tuning Evaluated from client perspective –Most current relational DBMSs perform automatic query optimization.
1 Introduction to Database Systems, CS420 SQL Views and Indexes.
Databases. What is a Database? A database is an organized collection of information or data. Databases can be paper-based or electronic. Information (text.
9 Copyright © 2006, Oracle. All rights reserved. Summary Management.
SQL IMPLEMENTATION & ADMINISTRATION Indexing & Views.
All DBMSs provide variations of b-trees for indexing B-tree index
Table General Guidelines for Better System Performance
DBMS & TPS Barbara Russell MBA 624.
Module 11: File Structure
Indexes By Adrienne Watt.
Indexing Structures for Files and Physical Database Design
Index An index is a performance-tuning method of allowing faster retrieval of records. An index creates an entry for each value that appears in the indexed.
Parallel Databases.
Database Systems: Design, Implementation, and Management Tenth Edition
Database Performance Tuning &
Methodology – Physical Database Design for Relational Databases
Database Performance Tuning and Query Optimization
Dimensional Model January 14, 2003
Chapter 15 QUERY EXECUTION.
Access Path Selection in a Relational Database Management System
File Processing : Query Processing
Session #, Speaker Name Indexing Chapter 8 11/19/2018.
Database.
Physical Database Design
Databases.
Typically data is extracted from multiple sources
Physical Storage Indexes Partitions Materialized views March 2006
Table General Guidelines for Better System Performance
Physical Storage materialized views indexes partitions ETL CDC.
Retail Sales is used to illustrate a first dimensional model
Physical Storage Indexes Partitions Materialized views March 2005
The Physical Design Stage of SDLC (figures 2.4, 2.5 revisited)
Data warehouse architecture CIF, DM Bus Matrix Star schema
Aggregate Improvement and Lost, shrunken, and collapsed
Query Processing CSD305 Advanced Databases.
Point-in-time balances Physical database Aggregation ETL Architecture
Introduction to Databases
Retail Sales is used to illustrate a first dimensional model
Chapter 8 Advanced SQL.
Chapter 11 Database Performance Tuning and Query Optimization
Database Systems: Design, Implementation, and Management Tenth Edition
Aggregate improvement Lost, shrunken, and collapsed Ralph Kimball
Query Processing.
Manipulating Data Lesson 3.
Evaluation of Relational Operations: Other Techniques
Unit 12 Index in Database 大量資料存取方法之研究 Approaches to Access/Store Large Data 楊維邦 博士 國立東華大學 資訊管理系教授.
Data Warehousing.
Presentation transcript:

Physical Storage Indexes Partitions Materialized views March 2004 91.4904 Ron McFadyen

All DBMSs provide variations of b-trees for indexing B-tree index Indexes All DBMSs provide variations of b-trees for indexing B-tree index Bitmapped index Bitmapped join index A data warehousing DBMS will likely provide these, or variations, on these March 2004 91.4904 Ron McFadyen

Most used access structure in database systems. B-tree structures Most used access structure in database systems. There are b-trees, b+-trees, b*-trees, etc. B-trees and their variations are: balanced very good in environments with mixed reading and writing operations, concurrency, exact searches and range searches provide excellent performance when used to find a few rows in big tables are studied in 91.3902 March 2004 91.4904 Ron McFadyen

Consider the following Customer table Bitmapped index Consider the following Customer table Cust_id gender province phone 22 M Ab (403) 444-1234 44 M Mb (204) 777-6789 77 F Sk (306) 384-8474 88 F Sk (306) 384-6721 99 M Mb (204) 456-1234 Note that Mb appears in rows 2, 5 Note that Sk appears in rows 3, 4; Ab in row 1 Note that M appears in rows 1, 2, 5; F in rows 3, 4 March 2004 91.4904 Ron McFadyen

A bitmapped index comprises a collection of bit arrays. There is one bit array for each distinct value of the indexed attribute. Useful for low-cardinality attributes Since gender has two values there are two bit arrays. Each bit array has the same number of bits as there are rows in the table. If we construct a bitmapped index for gender we would have two bit arrays of 5 bits each: M 1 1 0 0 1 F 0 0 1 1 0 March 2004 91.4904 Ron McFadyen

Bitmapped index If we construct a bitmapped index for Province we would have three bit arrays of 5 bits each: Ab 1 0 0 0 0 Mb 0 1 0 0 1 Sk 0 0 1 1 0 March 2004 91.4904 Ron McFadyen

Select Sum(s.amount) From Sales s Inner Join Customer c On ( … ) Bitmapped index Consider a query Select Sum(s.amount) From Sales s Inner Join Customer c On ( … ) where c.gender =M and c.province = Mb The database query processor could choose to AND the arrays for gender=M and province=Mb The result indicates that rows 2 and 5 of Customer satisfy the query; only these two rows need to be joined to Sales M 1 1 0 0 1 Mb 0 1 0 0 1 Result 0 1 0 0 1 March 2004 91.4904 Ron McFadyen

Bitmapped Join Index In general, a join index is a structure containing index entries (attribute value, row pointers), where the attribute values are in one table, and the row pointers are to related rows in another table Consider Customer Sales Date March 2004 91.4904 Ron McFadyen

Cust_id Cust_id Store_id Date_id Join Index Customer Cust_id gender province phone 22 M Ab (403) 444-1234 44 M Mb (204) 777-6789 77 F Sk (306) 384-8474 88 F Sk (306) 384-6721 99 M Mb (204) 456-1234 Sales Cust_id Store_id Date_id Amount row 1 22 1 90 100 2 44 2 7 150 3 22 2 33 50 4 44 3 55 50 5 99 3 55 25 March 2004 91.4904 Ron McFadyen

Bitmapped Join Index In some database systems, a bitmapped join index can be created which indexes Sales by an attribute of customer: home province. Here is SQL for the index: CREATE BITMAP INDEX cust_sales_bji ON Sales(Customer.province) FROM Sales, Customer WHERE Sales.cust_id = Customer.cust_id; March 2004 91.4904 Ron McFadyen

Bitmapped Join Index There are three province values in Customer. The join index will have three entries where each has a province value and a bitmap: Mb 0 1 0 1 1 Ab 1 0 1 0 0 Sk 0 0 0 0 0 The bitmap index shows that rows 2, 4, 5 of the Sales fact table are rows for customers with province = Mb The bitmap index shows that rows 1, 3 of the Sales fact table are rows for customers with province = Ab The bitmap index shows that no rows of the Sales fact table are rows for customers with province = Sk March 2004 91.4904 Ron McFadyen

Bitmapped Join Index The bitmap join index could be used to evaluate the following query. In this query, the CUSTOMER table will not even be accessed; the query is executed using only the bitmap join index and the sales table. SELECT SUM(Sales.dollar_amount) FROM Sales, Customer WHERE Sales.cust_id = Customer.cust_id AND Customer.province = Mb; The bitmap index will show that rows 2, 4, 5 of the Sales fact table are rows for customers with province = M March 2004 91.4904 Ron McFadyen

Often the attribute is a date. Partitions Physical storage can be organized in partitions where the value of an attribute is used to determine where a row is placed. Often the attribute is a date. In warehousing, dates are often used to restrict a query. Archiving older data is facilitated since one only has to drop the oldest partition. March 2004 91.4904 Ron McFadyen

Materialized Views An MV is a view where the result set is pre-computed and stored in the database Periodically an MV must be re-computed to account for updates to its source tables MVs represent a performance enhancement to a DBMS as the result set is immediately available when the view is referenced in a SQL statement MVs are used in some cases to provide summary or aggregates for data warehousing transparently to the end-user Query optimization is a DBMS feature that may re-write an SQL statement supplied by a user. March 2004 91.4904 Ron McFadyen