Indexes: all DBMSs provide variations of B-trees for indexing

Indexes
All DBMSs provide variations of B-trees for indexing. The index types covered here are the B-tree index, the bitmapped index, and the bitmapped join index. A data warehousing DBMS will likely provide these, or variations on them.
See the excerpt from Data Warehouse Design handed out in class, and:
http://en.wikipedia.org/wiki/Bitmap_index
http://www.dba-oracle.com/oracle_tips_bitmapped_indexes.htm
http://www.oracle.com/technology/pub/articles/sharma_indexes.html
March 2010, ACS-4904, Ron McFadyen

B-tree structures
The most used access structure in database systems. There are B-trees, B+-trees, B*-trees, etc. B-trees and their variations:
- are balanced
- are very good in environments with mixed reading and writing operations, concurrency, exact searches and range searches
- provide excellent performance when used to find a few rows in big tables
- are studied in ACS-3902

Bitmapped index
Consider the following Customer table:
  Cust_id  gender  province  phone
  22       M       Ab        (403) 444-1234
  44       M       Mb        (204) 777-6789
  77       F       Sk        (306) 384-8474
  88       F       Sk        (306) 384-6721
  99       M       Mb        (204) 456-1234
Province: Mb … rows 2, 5; Sk … rows 3, 4; Ab … row 1
Gender: M … rows 1, 2, 5; F … rows 3, 4

Bitmapped index
Suppose that for a relation R the cardinality of attribute A is c, so the values existing for A can be written a_1, a_2, …, a_c.
A bitmap index for R on attribute A then has c bit arrays b_1, b_2, …, b_c, one for each value of A: b_k is the bit array corresponding to value a_k.
Consider b_k: if the ith row of R contains the value a_k for attribute A, then the ith bit of b_k is 1; otherwise the ith bit of b_k is 0.

Bitmapped index
If we construct a bitmapped index for Customer on Gender we would have two bit arrays of 5 bits each:
  m  1 1 0 0 1
  f  0 0 1 1 0

Bitmapped index
If we construct a bitmapped index for Customer on Province we would have three bit arrays of 5 bits each:
  Ab  1 0 0 0 0
  Mb  0 1 0 0 1
  Sk  . . . . .
What values appear in the Sk vector?
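
The definition and the two Customer examples above can be sketched in a few lines of Python. This is only an illustration, not how any particular DBMS stores its bitmaps: the function name `build_bitmap_index` and the list-of-bits representation are assumptions made for the example, and the column values are copied from the Customer table.

```python
def build_bitmap_index(values):
    """Return {value: bit array}, with one bit per row in row order.

    Bit i of the array for value v is 1 when row i holds v, otherwise 0.
    """
    index = {}
    for value in values:
        # One bit array per distinct value, initially all zeros.
        index.setdefault(value, [0] * len(values))
    for row, value in enumerate(values):
        index[value][row] = 1
    return index


# Column values taken from the Customer table, in row order.
gender = ["M", "M", "F", "F", "M"]
province = ["Ab", "Mb", "Sk", "Sk", "Mb"]

gender_index = build_bitmap_index(gender)
province_index = build_bitmap_index(province)

print(gender_index["M"])     # [1, 1, 0, 0, 1]
print(gender_index["F"])     # [0, 0, 1, 1, 0]
print(province_index["Ab"])  # [1, 0, 0, 0, 0]
print(province_index["Mb"])  # [0, 1, 0, 0, 1]
```

The number of bit arrays equals the cardinality of the attribute (two for gender, three for province), and each array has one bit per row of the table.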

Bitmapped index
Consider a query:
  Select c.name, Sum(s.amount)
  From Sales s Inner Join Customer c On ( … )
  Where c.gender = 'M' and c.province = 'Mb'
  Group by c.name
How could the query access plan utilize bitmap indexes?

Bitmapped index
A query tree for
  Select c.name, Sum(s.amount)
  From Sales s Inner Join Customer c On ( … )
  Where c.gender = 'M' and c.province = 'Mb'
  Group by c.name
read from the root down:
  Group and sum (to report name and sum)
    Sort (to prepare for grouping)
      Join (inner join of left and right subtree)
        Sales relation
        Selection (determine pertinent rows of Customer)
          Customer relation

Bitmapped index
Consider the where clause that selects rows of Customer: c.gender = 'M' and c.province = 'Mb'.
By ANDing the two bit arrays for gender = M and province = Mb, the DBMS knows which rows of Customer to join to Sales:
  M      1 1 0 0 1
  Mb     0 1 0 0 1
  "and"  0 1 0 0 1
In our case, two rows of Customer are involved instead of the whole Customer table.
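
As a rough sketch of the AND step, the two bit arrays from the slide can be combined position by position. The helper names `bitmap_and` and `rows_set` are made up for this illustration; a real DBMS would work on packed machine words rather than Python lists.

```python
def bitmap_and(a, b):
    """Bitwise AND of two equal-length bit arrays."""
    return [x & y for x, y in zip(a, b)]


def rows_set(bitmap):
    """Return the 1-based row numbers whose bit is on."""
    return [i + 1 for i, bit in enumerate(bitmap) if bit]


gender_m = [1, 1, 0, 0, 1]     # bit array for gender = M
province_mb = [0, 1, 0, 0, 1]  # bit array for province = Mb

selected = bitmap_and(gender_m, province_mb)
print(selected)            # [0, 1, 0, 0, 1]
print(rows_set(selected))  # [2, 5]
```

Only Customer rows 2 and 5 need to be fetched and joined to Sales.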

Bit-Sliced Index
Consider a numeric attribute c of a relation R. Suppose n is the number of bits needed in the binary coding of values of c, and suppose R has m tuples.
Let B be a bit matrix of n columns and m rows where b_i,j is 1 if the coding of c in the ith tuple has the jth bit on.
Each column of B is stored separately.

Bit-Sliced Index
Consider the following Customer table:
  Cust_id  age  province  phone
  22       20   Ab        (403) 444-1234
  44       21   Mb        (204) 777-6789
  77       22   Sk        (306) 384-8474
  88       23   Sk        (306) 384-6721
  99       40   Mb        (204) 456-1234
A bit-sliced index on age for Customer (one row of bits per tuple):
  00010100
  00010101
  00010110
  00010111
  00101000
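
The bit matrix for the age column can be reproduced with a short sketch. The names `build_bit_slices` and `row_coding` are hypothetical, and the choice of 8 bits with column 0 as the least significant bit is an assumption made to match the 8-digit codings shown on the slide.

```python
def build_bit_slices(values, n_bits):
    """Return n_bits columns; column j holds bit j of every row.

    Column 0 is the least significant bit. Each column is a separate
    list, mirroring the idea that each column of B is stored separately.
    """
    return [[(v >> j) & 1 for v in values] for j in range(n_bits)]


def row_coding(slices, row):
    """Rebuild the binary coding of one row, most significant bit first."""
    return "".join(str(column[row]) for column in reversed(slices))


ages = [20, 21, 22, 23, 40]  # age column of Customer, in row order
slices = build_bit_slices(ages, 8)

for row in range(len(ages)):
    print(row_coding(slices, row))
# 00010100
# 00010101
# 00010110
# 00010111
# 00101000
```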

Bit-Sliced Index
Consider the following Customer table:
  Cust_id  age  province  phone
  22       20   Ab        (403) 444-1234
  44       21   Mb        (204) 777-6789
  77       22   Sk        (306) 384-8474
  88       23   Sk        (306) 384-6721
  99       40   Mb        (204) 456-1234
A bit-sliced index on age for Customer (one row of bits per tuple):
  00010100
  00010101
  00010110
  00010111
  00101000
Calculate the average age without using the Customer dimension: ___________________________

Bit-Sliced Index
In the notes handed out there is a bit-sliced index on attribute quantity of the Sales relation.
To find the total sales quantity without going to the data (i.e. using the index only), examine the columns one by one and accumulate a sum over the columns B_i, i = 0, 1, 2, …: for column i, count the number of bits on and multiply by 2^i.
To find those sales where the quantity is > 63, examine column B_6 to determine if the bit is on. (This is sufficient when B_6 is the highest-order column, i.e. when quantities are below 128; with more columns, any higher-order bit being on also means the quantity is > 63.)
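
The index-only sum and the "> 63" test can be sketched as follows. The handout's Sales data is not reproduced in this transcript, so the `quantities` list below is hypothetical, as are the function names; the sketch also assumes every quantity fits in 7 bits, so that B_6 is the highest-order column.

```python
def build_bit_slices(values, n_bits):
    """Return n_bits columns; column i holds bit i (0 = least significant)."""
    return [[(v >> i) & 1 for v in values] for i in range(n_bits)]


def total_from_slices(slices):
    """Sum the indexed attribute using only the bit columns.

    For column i: count the bits that are on, then multiply by 2**i.
    """
    return sum(sum(column) * 2 ** i for i, column in enumerate(slices))


def rows_with_bit_on(slices, i):
    """Return the 1-based row numbers whose bit in column i is on."""
    return [row + 1 for row, bit in enumerate(slices[i]) if bit]


quantities = [5, 64, 100, 63, 12]  # hypothetical data, all below 128
slices = build_bit_slices(quantities, 7)

print(total_from_slices(slices))    # 244
print(rows_with_bit_on(slices, 6))  # [2, 3]
```

The same `total_from_slices` call, divided by the number of rows, gives an average without touching the base table.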

Bitmap-Encoded Index
Skip this.

Bitmapped Join Index
In general, a join index is a structure containing index entries (attribute value, row pointers), where the attribute values are in one table and the row pointers are to related rows in another table.
Consider the tables Customer, Sales, and Date.

Join Index
Customer:
  Cust_id  gender  province  phone
  22       M       Ab        (403) 444-1234
  44       M       Mb        (204) 777-6789
  77       F       Sk        (306) 384-8474
  88       F       Sk        (306) 384-6721
  99       M       Mb        (204) 456-1234
Sales:
  row  Cust_id  Store_id  Date_id  Amount
  1    22       1         90       100
  2    44       2         7        150
  3    22       2         33       50
  4    44       3         55       50
  5    99       3         55       25

Bitmapped Join Index
In some database systems, a bitmapped join index can be created which indexes Sales using an attribute of Customer: home province. Here is an SQL statement to create the index:
  CREATE BITMAP INDEX cust_sales_bji
  ON Sales(Customer.province)
  FROM Sales, Customer
  WHERE Sales.cust_id = Customer.cust_id;

Bitmapped Join Index
There are three province values in Customer. The join index will have three entries, where each has a province value and a bitmap:
  Mb  0 1 0 1 1
  Ab  1 0 1 0 0
  Sk  0 0 0 0 0
The bitmap join index shows that:
- rows 2, 4, 5 of the Sales fact table are rows for customers with province = Mb
- rows 1, 3 of the Sales fact table are rows for customers with province = Ab
- no rows of the Sales fact table are rows for customers with province = Sk
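
To see where these three bitmaps come from, the sketch below derives them from the Customer and Sales tables of the Join Index slide. The function name `build_bitmap_join_index` and the dictionary representation are assumptions for illustration only; they say nothing about how Oracle or any other DBMS builds the index internally.

```python
def build_bitmap_join_index(fact_keys, dim_attribute):
    """Return {dimension attribute value: bit array over fact rows}.

    fact_keys     : foreign key of each fact row, in row order
    dim_attribute : mapping from dimension key to the indexed attribute
    """
    # One bitmap per distinct attribute value, even if no fact row matches.
    index = {value: [0] * len(fact_keys) for value in dim_attribute.values()}
    for row, key in enumerate(fact_keys):
        index[dim_attribute[key]][row] = 1
    return index


# Customer: Cust_id -> province
customer_province = {22: "Ab", 44: "Mb", 77: "Sk", 88: "Sk", 99: "Mb"}
# Sales: Cust_id of fact rows 1..5
sales_cust_ids = [22, 44, 22, 44, 99]

join_index = build_bitmap_join_index(sales_cust_ids, customer_province)
print(join_index["Mb"])  # [0, 1, 0, 1, 1]
print(join_index["Ab"])  # [1, 0, 1, 0, 0]
print(join_index["Sk"])  # [0, 0, 0, 0, 0]
```

Each bitmap has one bit per Sales row, not per Customer row, which is what distinguishes a join index from an ordinary bitmap index on Customer.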

Bitmapped Join Index
A bitmap join index could be used to evaluate the following query. In this query, the Customer table does not need to be accessed; the query can be executed using only the bitmap join index and the Sales table.
  SELECT SUM(Sales.dollar_amount)
  FROM Sales, Customer
  WHERE Sales.cust_id = Customer.cust_id
  AND Customer.province = 'Mb';
The bitmap index will show that rows 2, 4, 5 of the Sales fact table are rows for customers with province = Mb.
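
A minimal sketch of that evaluation: the Mb bitmap selects the Sales rows and only their amounts are summed. The function name `sum_selected` is hypothetical, and the amounts are the Amount column of the Sales table from the Join Index slide, used here in place of dollar_amount as an assumption.

```python
def sum_selected(amounts, bitmap):
    """Sum the amounts of the fact rows whose bit is on."""
    return sum(amount for amount, bit in zip(amounts, bitmap) if bit)


sales_amount = [100, 150, 50, 50, 25]  # Amount of Sales rows 1..5
mb_bitmap = [0, 1, 0, 1, 1]            # join index entry for province = Mb

print(sum_selected(sales_amount, mb_bitmap))  # 225
```

No Customer row is read at any point; the join condition has already been resolved when the index was built.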

HOBI
See handout and books.google.ca ….
Note the example hierarchical bitmap index "points" to rows in Auctions and is based on attribute values in the Product dimension.

Bitmapped Join Index
http://www.intelligententerprise.com/000929/feat1.jhtml mentions star joins available in UDB and Oracle.