Dual Bitmap Index: Space-Time Efficient Bitmap

Presentation transcript:

Dual Bitmap Index: Space-Time Efficient Bitmap Index for Equality and Membership Queries Niwan Wattanakitrungroj and Sirirut Vanichayobon Information Systems Technology and Applied Research Laboratory Department of Computer Science, Prince of Songkla University

Outline Introduction Variations of Bitmap Index - Simple Bitmap Index - Interval Bitmap Index - Scatter Bitmap Index - Encoded Bitmap Index - Dual Bitmap Index Performance Study Conclusion

Introduction: A data warehouse is a large repository of information accessed through OLAP applications. Most requests for information from a data warehouse are dynamic, ad hoc queries. The ability to answer these queries quickly is critical in the data warehouse environment.

Introduction: To speed up query processing:
 - Summary tables
 - Indexes
 - Parallel machines

Introduction: Bitmap Index. Characteristics:
 - simple to represent
 - uses less space
 - more CPU-efficient
 - low-cost Boolean operations

Introduction: Bitmap Index
Employee table (columns RID, Name, Gender, Education), RID 1, 2, 3, 4, 5, …: Suda F BS; Wichai M; John MS; Marry PhD; Somsak; …
Bitmap vectors, one per distinct value and indexed by RID: F, M (Gender); BS, MS, PhD (Education).
 - Select Count(*) From Employee Where Gender="F"; Answer: 2
 - Equality Query: Select Name From Employee Where Gender="M" and Education="MS"; Answer: John
 - Membership Query: Select Name From Employee Where Education in {MS, PhD}; Answer: John, Marry
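The three queries on this slide can be sketched with one bit vector per distinct value, stored here as Python integers. This is a minimal illustration, not the authors' implementation: the helper names `build_bitmaps` and `names` are hypothetical, and the Gender and Education cells that are missing from the slide text are assumed values chosen only to agree with the answers shown above.

```python
# Hypothetical five-row Employee table. Cells marked "assumed" are missing
# from the slide text and are chosen only to match the slide's answers.
employees = [
    ("Suda",   "F", "BS"),
    ("Wichai", "M", "BS"),   # education assumed
    ("John",   "M", "MS"),   # gender assumed
    ("Marry",  "F", "PhD"),  # gender assumed
    ("Somsak", "M", "BS"),   # gender and education assumed
]

def build_bitmaps(rows, column):
    """Return {value: int}; bit i is set when row i holds that value."""
    bitmaps = {}
    for rid, row in enumerate(rows):
        bitmaps[row[column]] = bitmaps.get(row[column], 0) | (1 << rid)
    return bitmaps

def names(bitmap):
    """Translate the set bits of a result bitmap back into employee names."""
    return [row[0] for rid, row in enumerate(employees) if (bitmap >> rid) & 1]

gender = build_bitmaps(employees, 1)
education = build_bitmaps(employees, 2)

# Count(*) where Gender = "F": count the set bits of one vector.
count_f = bin(gender["F"]).count("1")
# Equality query on two attributes: AND the two selected vectors.
equality = names(gender["M"] & education["MS"])
# Membership query: OR the vectors of every value in the set.
membership = names(education["MS"] | education["PhD"])
```

Each query touches only the vectors named in its predicate, which is why the Boolean operations stay cheap for low-cardinality attributes.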


Variations of Bitmap Index (Related Work): Simple Bitmap Index. Let C be the number of distinct values of the indexed attribute (its cardinality). With C = 15, the index needs 15 bitmap vectors. [Figure: bitmap vectors and an example query.]

Variations of Bitmap Index (Related Work): Interval Bitmap Index. With C = 15, the index needs 8 bitmap vectors. [Figure: bitmap vectors and an example query.]

Variations of Bitmap Index (Related Work): Scatter Bitmap Index. With m = 5, the index needs 8 bitmap vectors. [Figure: bitmap vectors and an example query.]

Variations of Bitmap Index (Related Work): Encoded Bitmap Index. With C = 15, the index needs 4 bitmap vectors. [Figure: mapping, bitmap vectors, and an example query.]


Variations of Bitmap Index: Dual Bitmap Index. Encoding scheme of the five bitmap indices. [Figure: one encoding per index. The first label reads "Need C bitmap vectors"; the vector-count formulas on the other four labels are not legible.]

Variations of Bitmap Index Dual Bitmap Index

Variations of Bitmap Index: Creation of Dual Bitmap Index (example: C = 15, A = {0, 1, 2, …, 14})
1. Assign an increasing sequence of numbers to each of the distinct values of A (i.e., 0, 1, …, C-1).
2. Calculate n, the total number of bitmap vectors created. In the example, n = 6.
3. Calculate hiC, the highest value of C that can be represented by n bitmap vectors. In the example, hiC = 15.
4. For each value v on the record at position i in A, a bit is set if i = r and s, and cleared otherwise, where r and s are derived from v, the value of the indexed attribute for the record.
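The creation steps can be sketched as follows. The slide's formulas for n, hiC, r and s are not recoverable from the transcript, so this sketch rests on stated assumptions: n is taken as the smallest integer with n(n-1)/2 >= C, hiC as n(n-1)/2 (both agree with the slide's n = 6 and hiC = 15 for C = 15), and each value is mapped to a distinct pair (r, s) in the order produced by `itertools.combinations`. That pair ordering, the function names, and the sample column are hypothetical and may differ from the authors' mapping.

```python
from itertools import combinations

def num_vectors(c):
    """Assumed rule: smallest n with n*(n-1)/2 >= c (n = 6 when c = 15)."""
    n = 2
    while n * (n - 1) // 2 < c:
        n += 1
    return n

def build_dual_index(column, c):
    """Build n bitmap vectors (as ints) over a column of values in 0..c-1."""
    n = num_vectors(c)
    # Assumed mapping: value v -> v-th pair (r, s) with r < s.
    pairs = list(combinations(range(n), 2))[:c]
    vectors = [0] * n
    for rid, v in enumerate(column):
        r, s = pairs[v]
        vectors[r] |= 1 << rid  # the record sets a bit in exactly
        vectors[s] |= 1 << rid  # two of the n vectors
    return vectors, pairs

column = [2, 14, 0, 2, 7]  # hypothetical sample data
vectors, pairs = build_dual_index(column, 15)
hi_c = len(vectors) * (len(vectors) - 1) // 2
```

Because every value owns a distinct pair of vectors, n vectors can distinguish up to n(n-1)/2 values, which is where the space saving over one vector per value comes from.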

Variations of Bitmap Index (Proposed Bitmap Index): Equality and Membership Queries. Example: "A = 2".
1. Find the sequence number of the search value.
2. Apply the query formula to that number, where v is the value of an indexed attribute for any record.
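A minimal sketch of both query types, assuming the same hypothetical value-to-pair mapping as above (pairs enumerated by `itertools.combinations`, six vectors for C = 15). The conclusion slide states that an equality query needs only a Boolean AND of the two vectors for the value. Answering a membership query by OR-ing the per-value results is an assumption of this sketch, not something the slide text confirms.

```python
from itertools import combinations

N_VECTORS = 6  # n for C = 15, as on the slide
# Assumed mapping: sequence number v -> distinct pair (r, s).
PAIRS = list(combinations(range(N_VECTORS), 2))

def build(column):
    """Set each record's bit in the two vectors assigned to its value."""
    vectors = [0] * N_VECTORS
    for rid, v in enumerate(column):
        for k in PAIRS[v]:
            vectors[k] |= 1 << rid
    return vectors

def equality(vectors, v):
    """A = v: AND the two vectors of v. Only records whose pair is
    exactly (r, s) have a bit in both, because pairs are distinct."""
    r, s = PAIRS[v]
    return vectors[r] & vectors[s]

def membership(vectors, values):
    """A in {values}: OR together the equality result of each value."""
    result = 0
    for v in values:
        result |= equality(vectors, v)
    return result

def rids(bitmap):
    """List the record positions whose bit is set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

column = [2, 14, 0, 2, 7, 9]  # hypothetical sample data
vectors = build(column)
eq_rids = rids(equality(vectors, 2))          # records with A = 2
in_rids = rids(membership(vectors, [0, 14]))  # records with A in {0, 14}
```

An equality query reads two vectors regardless of C, whereas a binary-encoded index has to read every one of its vectors.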


Performance study

Performance study: Number of bitmap vectors used to represent an attribute with cardinality C (space). [Figure: curves for Simple, Interval, Scatter, Dual, and Encoded.]
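The space comparison can be approximated numerically. The counting rules below are assumptions, since the slide's formulas did not survive: C for Simple, ceil(C/2) for Interval, ceil(log2 C) for Encoded, and the smallest n with n(n-1)/2 >= C for Dual. They reproduce the counts the slides give for C = 15 (15, 8, 4, and 6). Scatter is left out because its formula cannot be recovered from the transcript.

```python
def dual_vectors(c):
    """Assumed rule: smallest n with n*(n-1)/2 >= c."""
    n = 2
    while n * (n - 1) // 2 < c:
        n += 1
    return n

def vector_counts(c):
    """Number of bitmap vectors per scheme for cardinality c (assumed rules)."""
    return {
        "simple": c,                                 # one vector per value
        "interval": (c + 1) // 2,                    # ceil(c / 2)
        "encoded": max(1, (c - 1).bit_length()),     # ceil(log2(c)) for c >= 2
        "dual": dual_vectors(c),
    }

counts_15 = vector_counts(15)
counts_50 = vector_counts(50)
```

Under these rules the Dual count grows roughly with the square root of 2C, placing it between the Encoded and Interval schemes in space.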


Performance study: Space-time trade-off for the five bitmap indices, C = 50, N = 1,000,000 (data sets from the TPC-H benchmark). [Figure: points for Simple, Interval, Scatter, Encoded, and Dual.]


Conclusion: The Simple Bitmap Index requires the most space. The Encoded Bitmap Index has the worst processing time. The Dual Bitmap Index uses less space while maintaining query processing time for equality and membership queries. It achieves this by representing each attribute value with only two bitmap vectors, so an equality query needs only the low-cost Boolean AND operation. The Dual Bitmap Index has better space-time performance than the other bitmap indexing techniques.

Thank You Question & answer