March 2010ACS-4904 Ron McFadyen1 Indexes B-tree index Bitmapped index Bitmapped join index A data warehousing DBMS will likely provide these, or variations,

Slides:



Advertisements
Similar presentations
Tuning: overview Rewrite SQL (Leccotech)Leccotech Create Index Redefine Main memory structures (SGA in Oracle) Change the Block Size Materialized Views,
Advertisements

Data Warehousing and Decision Support, part 2
Top k Knapsack Joins and Closure Early Results Witold LITWIN & Thomas Schwarz U. Paris Dauphine, France
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Part C Part A:  Index Definition in SQL  Ordered Indices  Index Sequential.
Multidimensional Indexing
Multidimensional Data. Many applications of databases are "geographic" = 2­dimensional data. Others involve large numbers of dimensions. Example: data.
Multidimensional Data Rtrees Bitmap indexes. R-Trees For “regions” (typically rectangles) but can represent points. Supports NN, “where­am­I” queries.
Bitmap Index Buddhika Madduma 22/03/2010 Web and Document Databases - ACS-7102.
BTrees & Bitmap Indexes
15.6 Index-based Algorithms Jindou Jiao 101. Index-based algorithms are especially useful for the selection operator Algorithms for join and other binary.
1 Chapter 10 Query Processing: The Basics. 2 External Sorting Sorting is used in implementing many relational operations Problem: –Relations are typically.
© Ron McFadyen1 Many-to-one-to-many We need information that can only be obtained by accessing two fact tables through a common dimension … drilling across.
IS 4420 Database Fundamentals Chapter 6: Physical Database Design and Performance Leon Chen.
March 2010ACS-4904 Ron McFadyen1 Aggregate management References: Lawrence Corr Aggregate improvement
ACS-4902 Ron McFadyen Chapter 15 Algorithms for Query Processing and Optimization.
Advanced Querying OLAP Part 2. Context OLAP systems for supporting decision making. Components: –Dimensions with hierarchies, –Measures, –Aggregation.
Alternative: Bitmap Indexing Imagine the following query in huge table Find customers living in London, with 2 cars and 3 children occupying a 4 bed house.
File Organizations March 2007R McFadyen ACS In SQL Server 2000 Tree terms root, internal, leaf, subtree parent, child, sibling balanced, unbalanced.
1 Query Processing: The Basics Chapter Topics How does DBMS compute the result of a SQL queries? The most often executed operations: –Sort –Projection,
8-1 Outline  Overview of Physical Database Design  File Structures  Query Optimization  Index Selection  Additional Choices in Physical Database Design.
By N.Gopinath AP/CSE. Two common multi-dimensional schemas are 1. Star schema: Consists of a fact table with a single table for each dimension 2. Snowflake.
Chapter 8 Physical Database Design. McGraw-Hill/Irwin © 2004 The McGraw-Hill Companies, Inc. All rights reserved. Outline Overview of Physical Database.
Lecture 6 Indexing Part 2 Column Stores. Indexes Recap Heap FileBitmapHash FileB+Tree InsertO(1) O( log B n ) DeleteO(P)O(1) O( log B n ) Range Scan O(P)--
AN INTRODUCTION TO EXECUTION PLAN OF QUERIES These slides have been adapted from a presentation originally made by ORACLE. The full set of original slides.
Relational Database Performance CSCI 6442 Copyright 2013, David C. Roberts, all rights reserved.
CS 345: Topics in Data Warehousing Thursday, October 21, 2004.
ICOM 6005 – Database Management Systems Design Dr. Manuel Rodríguez-Martínez Electrical and Computer Engineering Department Lecture 14 – Join Processing.
CS 345: Topics in Data Warehousing Tuesday, October 19, 2004.
Ashwani Roy Understanding Graphical Execution Plans Level 200.
ITCS 6163 Lecture 5. Indexing datacubes Objective: speed queries up. Traditional databases (OLTP): B-Trees Time and space logarithmic to the amount of.
Chapter 6 1 © Prentice Hall, 2002 The Physical Design Stage of SDLC (figures 2.4, 2.5 revisited) Project Identification and Selection Project Initiation.
Data Warehouse and the Star Schema CSCI 242 ©Copyright 2015, David C. Roberts, all rights reserved.
Using Special Operators (LIKE and IN)
Nimesh Shah (nimesh.s) , Amit Bhawnani (amit.b)
Programming in R SQL in R. Running SQL in R In this session I will show you how to: Run basic SQL commands within R.
C-Store: Data Model and Data Organization Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY May 17, 2010.
Prof. Bayer, DWH, Ch.5, SS Chapter 5. Indexing for DWH D1Facts D2.
Indexes and Views Unit 7.
Marwan Al-Namari Hassan Al-Mathami. Indexing What is Indexing? Indexing is a mechanisms. Why we need to use Indexing? We used indexing to speed up access.
Variant Indexes. Specialized Indexes? Data warehouses are large databases with data integrated from many independent sources. Queries are often complex.
Physical Database Design Purpose- translate the logical description of data into the technical specifications for storing and retrieving data Goal - create.
SqlExam1Review.ppt EXAM - 1. SQL stands for -- Structured Query Language Putting a manual database on a computer ensures? Data is more current Data is.
Chapter 8 Physical Database Design. Outline Overview of Physical Database Design Inputs of Physical Database Design File Structures Query Optimization.
Chapter 5 Index and Clustering
Session 1 Module 1: Introduction to Data Integrity
A Guide to SQL, Eighth Edition Chapter Four Single-Table Queries.
Chapter 8 Views and Indexes 第 8 章 视图与索引. 8.1 Virtual Views  Views:  “virtual relations”. Another class of SQL relations that do not exist physically.
File Processing : Query Processing 2008, Spring Pusan National University Ki-Joune Li.
Query Processing and Query Optimization CS 157B Dennis Le Weishan Wang.
Manipulating Data Lesson 3. Objectives Queries The SELECT query to retrieve or extract data from one table, how to retrieve or extract data by using.
Database Systems, 8 th Edition SQL Performance Tuning Evaluated from client perspective –Most current relational DBMSs perform automatic query optimization.
Chapter 10 The Basics of Query Processing. Copyright © 2005 Pearson Addison-Wesley. All rights reserved External Sorting Sorting is used in implementing.
1 Introduction to Database Systems, CS420 SQL Views and Indexes.
IFS180 Intro. to Data Management Chapter 10 - Unions.
All DBMSs provide variations of b-trees for indexing B-tree index
Indexes By Adrienne Watt.
Index An index is a performance-tuning method of allowing faster retrieval of records. An index creates an entry for each value that appears in the indexed.
Assignment 2 Due Thursday Feb 9, 2006
Dimensional Model January 14, 2003
Cse 344 APRIL 23RD – Indexing.
Physical Storage Indexes Partitions Materialized views March 2004
Physical Storage Indexes Partitions Materialized views March 2006
Physical Storage materialized views indexes partitions ETL CDC.
Minidimension Example
Physical Storage Indexes Partitions Materialized views March 2005
Data warehouse architecture CIF, DM Bus Matrix Star schema
Point-in-time balances Physical database Aggregation ETL Architecture
Retail Sales is used to illustrate a first dimensional model
Aggregate improvement Lost, shrunken, and collapsed Ralph Kimball
Table Suitable for Bitmap Index
Presentation transcript:

March 2010ACS-4904 Ron McFadyen1 Indexes B-tree index Bitmapped index Bitmapped join index A data warehousing DBMS will likely provide these, or variations, on these All DBMSs provide variations of b-trees for indexing See

March 2010ACS-4904 Ron McFadyen2 B-tree structures Most used access structure in database systems. There are b-trees, b+-trees, b*-trees, etc. B-trees and their variations are: balanced very good in environments with mixed reading and writing operations, concurrency, exact searches and range searches provide excellent performance when used to find a few rows in big tables are studied in ACS-3902

March 2010ACS-4904 Ron McFadyen3 Bitmapped index Consider the following Customer table Cust_idgenderprovincephone 22MAb(403) MMb(204) FSk(306) FSk(306) MMb(204) Province: Mb … rows 2, 5 Sk … rows 3, 4 Ab … row 1 Gender: M … rows 1, 2, 5 F … rows 3, 4

March 2010ACS-4904 Ron McFadyen4 Bitmapped index Suppose for a relation R the cardinality of attribute A is c and so we can represent the values existing for A as a 1, a 2, … a c Then, if we have a bitmap index for R on attribute A there are c bit arrays, b 1, b 2, … b c, one for each value of attribute A: b i is the bit array corresponding to value a i Consider b k if the i th row of R contains the value a K for attribute A, then the i th bit of b k is 1 otherwise the i th bit of b k is 0

March 2010ACS-4904 Ron McFadyen5 Bitmapped index If we construct a bitmapped index for Customer on Gender we would have two bit arrays of 5 bits each: m f

March 2010ACS-4904 Ron McFadyen6 Bitmapped index If we construct a bitmapped index for Customer on Province we would have three bit arrays of 5 bits each: Ab Mb Sk..... What values appear in this vector?

March 2010ACS-4904 Ron McFadyen7 Bitmapped index Consider a query Select Customer.name, Sum(s.amount) From Sales s Inner Join Customer c On ( … ) where c.gender =M and c.province = Mb Group by Customer.name How could the query optimizer utilize bit map indexes?

March 2010ACS-4904 Ron McFadyen8 Bitmapped index A query tree for Select Customer.name, Sum(s.amount) From Sales s Inner Join Customer c On ( … ) where c.gender =M and c.province = Mb Group by Customer.name Sort (to prepare for grouping) groups and sums ( to report name and sum) Join (inner join of left and right subtree) Selection (determine pertinent rows of Customer) Customer Relation Sales Relation

March 2010ACS-4904 Ron McFadyen9 Bitmapped index Consider the where clause that selects rows of Customer c.gender =M and c.province = Mb By anding the two bit arrays for gender=M and province=Mb, the dbms knows which rows of Customer to join to Sales In our case, two rows of Customer are involved instead of the whole Customer table. Mb M “and” 

March 2010ACS-4904 Ron McFadyen10 Bitmapped Join Index In general, a join index is a structure containing index entries (attribute value, row pointers), where the attribute values are in one table, and the row pointers are to related rows in another table Consider DateSales Customer

March 2010ACS-4904 Ron McFadyen11 Join Index Cust_idgenderprovincephone 22MAb(403) MMb(204) FSk(306) FSk(306) MMb(204) Cust_idStore_idDate_idAmount row Sales Customer

March 2010ACS-4904 Ron McFadyen12 Bitmapped Join Index In some database systems, a bitmapped join index can be created which indexes Sales using an attribute of Customer: home province. Here is an SQL statement to create the index: CREATE BITMAP INDEX cust_sales_bji ON Sales(Customer.province) FROM Sales, Customer WHERE Sales.cust_id = Customer.cust_id;

March 2010ACS-4904 Ron McFadyen13 Bitmapped Join Index There are three province values in Customer. The join index will have three entries where each has a province value and a bitmap: Mb Ab Sk The bitmap join index shows that rows 2, 4, 5 of the Sales fact table are rows for customers with province = Mb The bitmap join index shows that rows 1, 3 of the Sales fact table are rows for customers with province = Ab The bitmap join index shows that no rows of the Sales fact table are rows for customers with province = Sk

March 2010ACS-4904 Ron McFadyen14 Bitmapped Join Index A bitmap join index could be used to evaluate the following query. In this query, the CUSTOMER table will not even be accessed; the query is executed using only the bitmap join index and the Sales table. SELECT SUM(Sales.dollar_amount) FROM Sales, Customer WHERE Sales.cust_id = Customer.cust_id AND Customer.province = Mb; The bitmap index will show that rows 2, 4, 5 of the Sales fact table are rows for customers with province = Mb

March 2010ACS-4904 Ron McFadyen15 Bitmapped Join Index mentions star joins available in UDB and Oracle