Dual Bitmap Index: Space-Time Efficient Bitmap Index for Equality and Membership Queries Niwan Wattanakitrungroj and Sirirut Vanichayobon Information Systems Technology and Applied Research Laboratory Department of Computer Science, Prince of Songkla University
Outline Introduction Variations of Bitmap Index - Simple Bitmap Index - Interval Bitmap Index - Scatter Bitmap Index - Encoded Bitmap Index - Dual Bitmap Index Performance Study Conclusion
Introduction - A data warehouse is a large repository of information accessed through OLAP applications. The majority of requests for information from a data warehouse involve dynamic ad hoc queries. The ability to answer these queries quickly is a critical issue in the data warehouse environment.
Introduction - To speed up query processing: summary tables, indexes, parallel machines.
Introduction: Bitmap Index - Characteristics: simple to represent, uses less space, more CPU-efficient, low-cost Boolean operations.
Introduction: Bitmap Index
Employee Table (cells that did not survive extraction are left blank):
RID  Name    Gender  Education
1    Suda    F       BS
2    Wichai  M
3    John            MS
4    Marry           PhD
5    Somsak
…
Bitmap vectors: one per distinct value (F and M for Gender; BS, MS and PhD for Education), with a 1 at each RID whose record holds that value.
Select Count(*) From Employee Where Gender='F'; Answer: 2
Equality Query: Select Name From Employee Where Gender='M' and Education='MS'; Answer: John
Membership Query: Select Name From Employee Where Education in {MS, PhD}; Answer: John, Marry
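The three queries above can be sketched with one bitmap per distinct value, stored here as a Python integer whose bit i stands for row i. This is only an illustration: the Gender and Education cells missing from the table are filled with assumed values chosen so that the answers on the slide come out, and the helper names (`build_simple_bitmaps`, `names`) are hypothetical.

```python
# Assumed reconstruction of the Employee table. Cells marked "assumed"
# are not in the source; they are chosen to agree with the slide's answers.
employees = [
    ("Suda", "F", "BS"),
    ("Wichai", "M", "BS"),   # Education assumed
    ("John", "M", "MS"),     # Gender assumed (implied by the equality query)
    ("Marry", "F", "PhD"),   # Gender assumed (implied by Count(*) = 2)
    ("Somsak", "M", "BS"),   # Gender and Education assumed
]


def build_simple_bitmaps(rows, column):
    """Return {value: bitmap}; bit i of a bitmap is 1 if row i holds that value."""
    bitmaps = {}
    for rid, row in enumerate(rows):
        value = row[column]
        bitmaps[value] = bitmaps.get(value, 0) | (1 << rid)
    return bitmaps


def names(bitmap, rows):
    """List the names of the rows whose bit is set in the bitmap."""
    return [row[0] for rid, row in enumerate(rows) if (bitmap >> rid) & 1]


gender = build_simple_bitmaps(employees, 1)
education = build_simple_bitmaps(employees, 2)

# Count(*) where Gender = 'F': count the 1 bits of one bitmap.
count_f = bin(gender["F"]).count("1")

# Gender = 'M' and Education = 'MS': AND two bitmaps.
male_ms = names(gender["M"] & education["MS"], employees)

# Education in {MS, PhD}: OR two bitmaps.
ms_or_phd = names(education["MS"] | education["PhD"], employees)
```

With the assumed rows, the three results agree with the answers given on the slide (2; John; John, Marry).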
Variations of Bitmap Index (Related Work): Simple Bitmap Index - Let C be the number of distinct values of the indexed attribute (its cardinality). For C = 15, 15 bitmap vectors are needed. [Figure: bitmap vectors and example query]
Variations of Bitmap Index (Related Work): Interval Bitmap Index - For C = 15, 8 bitmap vectors are needed. [Figure: bitmap vectors and example query]
Variations of Bitmap Index (Related Work): Scatter Bitmap Index - With m = 5, 8 bitmap vectors are needed. [Figure: bitmap vectors and example query]
Variations of Bitmap Index (Related Work): Encoded Bitmap Index - For C = 15, 4 bitmap vectors are needed. [Figure: mapping table, bitmap vectors and example query]
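A minimal sketch of why 4 vectors suffice for C = 15, assuming each value is simply encoded as the binary form of its sequence number (the actual mapping table on the slide is not preserved, so this encoding is an assumption). Vector j stores bit j of every row's code, and an equality query combines each vector or its complement. The function names are hypothetical.

```python
def build_encoded(values, cardinality):
    """Build ceil(log2(C)) bitmap vectors; vector j holds bit j of each row's code."""
    k = max(1, (cardinality - 1).bit_length())  # ceil(log2(C)) for C >= 2
    vectors = [0] * k
    for rid, v in enumerate(values):
        for j in range(k):
            if (v >> j) & 1:
                vectors[j] |= 1 << rid
    return vectors


def encoded_equality(vectors, v, num_rows):
    """Rows equal to v: AND vector j where bit j of v is 1, its complement where it is 0."""
    all_rows = (1 << num_rows) - 1
    result = all_rows
    for j, vec in enumerate(vectors):
        if (v >> j) & 1:
            result &= vec
        else:
            result &= ~vec & all_rows
    return result


values = [3, 14, 0, 3, 7]            # attribute values for rows 0..4 (made-up data)
vectors = build_encoded(values, 15)
hits = encoded_equality(vectors, 3, len(values))
```

Every equality query in this sketch has to read all of the vectors, which is the usual price paid for the small number of vectors.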
Variations of Bitmap Index: Dual Bitmap Index - Encoding scheme of the five bitmap indices. [Figure: encoding scheme of each index with the number of bitmap vectors it needs; the first needs C bitmap vectors, and the expressions for the other four are not recoverable]
Variations of Bitmap Index Dual Bitmap Index
Variations of Bitmap Index: Creation of Dual Bitmap Index (example: C = 15, A = {0,1,2,…,14})
1. Assign an increasing sequence of numbers to each of the distinct values of A (i.e., 0,1,…,C-1).
2. Calculate n, the total number of bitmap vectors created. Here n = 6.
3. Calculate hiC, the highest value of C that can be represented by n bitmap vectors. Here hiC = 15.
4. For each value v on a record of A, set the record's bit to 1 in the bitmap vectors numbered r and s and to 0 in all the others, where r and s are computed from v [expressions for r and s not recoverable] and v is the value of the indexed attribute for the record.
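The creation steps can be sketched as follows. Two things here are assumptions rather than the authors' formulas, which are not preserved in this text: n is taken as the smallest integer with n(n-1)/2 >= C (this agrees with n = 6 and hiC = 15 for C = 15), and the pair (r, s) for sequence number v is taken as the v-th 2-combination of the vector numbers in lexicographic order. Any one-to-one assignment of values to distinct pairs would serve the same purpose.

```python
from itertools import combinations


def num_vectors(cardinality):
    """Smallest n such that n*(n-1)/2 >= cardinality (assumed formula)."""
    n = 2
    while n * (n - 1) // 2 < cardinality:
        n += 1
    return n


def build_dual(values, cardinality):
    """Set exactly two bits per row: one in vector r and one in vector s."""
    n = num_vectors(cardinality)
    # Hypothetical mapping: sequence number v -> v-th pair (r, s) with r < s.
    pairs = list(combinations(range(n), 2))
    vectors = [0] * n
    for rid, v in enumerate(values):
        r, s = pairs[v]
        vectors[r] |= 1 << rid
        vectors[s] |= 1 << rid
    return vectors, pairs


n = num_vectors(15)
hi_c = n * (n - 1) // 2                 # highest C representable by n vectors
vectors, pairs = build_dual([2, 14, 0, 2], 15)   # made-up attribute values
```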
Variations of Bitmap Index: Dual Bitmap Index - Equality and Membership Queries (example query: "A = 2")
1. Find the sequence number of the search value.
2. AND the two bitmap vectors r and s assigned to that value, where r and s are computed from v [expressions for r and s not recoverable] and v is the value of the indexed attribute for the record.
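A self-contained sketch of both query types, under the same assumption as before that each sequence number maps to a distinct pair of vector numbers (the lexicographic pairing used here is hypothetical, not the authors' formula). An equality query is a single AND of the two vectors for the value. A membership query is shown as the OR of the per-value results, which is one straightforward way to evaluate it and not necessarily the evaluation used in the original work.

```python
from itertools import combinations


def build_dual(values, n):
    """Build n bitmap vectors with two bits set per row (assumed pairing)."""
    pairs = list(combinations(range(n), 2))
    vectors = [0] * n
    for rid, v in enumerate(values):
        for i in pairs[v]:
            vectors[i] |= 1 << rid
    return vectors, pairs


def equality(vectors, pairs, v):
    """Rows with A = v: a single AND of the two vectors assigned to v."""
    r, s = pairs[v]
    return vectors[r] & vectors[s]


def membership(vectors, pairs, wanted):
    """Rows with A in wanted: OR together the equality result of each value."""
    result = 0
    for v in wanted:
        result |= equality(vectors, pairs, v)
    return result


def rids(bitmap):
    """Row numbers whose bit is set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


values = [2, 14, 0, 2, 7]                 # made-up attribute values, C = 15
vectors, pairs = build_dual(values, 6)
eq_hits = rids(equality(vectors, pairs, 2))          # query "A = 2"
mem_hits = rids(membership(vectors, pairs, [0, 14])) # query "A in {0, 14}"
```

The AND is exact because two different values never share the same pair: a row can have both bits r and s set only if its own pair is (r, s).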
Performance study
Performance study - Number of bitmap vectors used to represent an attribute with cardinality C (space). [Figure: number of bitmap vectors versus C for the Simple, Interval, Scatter, Dual and Encoded indices]
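The space comparison can be approximated with a small calculation. The counting formulas below are assumptions chosen to agree with the C = 15 examples on the earlier slides (15, 8, 4 and 6 vectors); the expressions from the original slides are not preserved. The Scatter index is left out because its count depends on the parameter m and no formula for it survives in this text.

```python
def simple_vectors(c):
    """One vector per distinct value."""
    return c


def interval_vectors(c):
    """Assumed ceil(C / 2)."""
    return (c + 1) // 2


def encoded_vectors(c):
    """Assumed ceil(log2(C))."""
    return max(1, (c - 1).bit_length())


def dual_vectors(c):
    """Assumed smallest n with n*(n-1)/2 >= C."""
    n = 2
    while n * (n - 1) // 2 < c:
        n += 1
    return n


def vector_counts(c):
    return {
        "Simple": simple_vectors(c),
        "Interval": interval_vectors(c),
        "Dual": dual_vectors(c),
        "Encoded": encoded_vectors(c),
    }


counts_15 = vector_counts(15)
counts_50 = vector_counts(50)
```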
Performance study - Space-time trade-off for the five bitmap indices, C = 50, N = 1,000,000 (data sets from the TPC-H Benchmark). [Figure: space versus query processing time for the Simple, Interval, Scatter, Encoded and Dual indices]
Conclusion - Simple Bitmap Index requires the most space. Encoded Bitmap Index has the worst processing time. Dual Bitmap Index uses less space while maintaining query processing time for equality and membership queries. It achieves this by representing each attribute value with only two bitmap vectors, so an equality query is answered using only the low-cost Boolean AND operation. Dual Bitmap Index has better space-time performance than the other bitmap indexing techniques.
Thank You Question & answer