Bitmap Index Design and Evaluation. Ariel Noy, Data Representation and Retrieval seminar. By: Chee-Yong Chan and Yannis E. Ioannidis.

Presentation transcript:


Introduction. Query performance issues. On-Line Transaction Processing (OLTP): read-write database. Decision Support Systems (DSS): read-mostly environments, with queries of high selectivity factor.

Bitmap in Simple Form. Every attribute value has its own column, and that column is a bitmap. This is the Value-List Index.
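The simple form above can be sketched in a few lines. This is a minimal illustration, not code from the presentation: the helper names `build_value_list_index` and `rows_of` and the sample column are hypothetical, and each bitmap is held in a Python integer whose bit r stands for row r.

```python
from collections import defaultdict

def build_value_list_index(column):
    """Map each distinct value to a bitmap; bit r is 1 iff row r holds that value."""
    index = defaultdict(int)
    for row, value in enumerate(column):
        index[value] |= 1 << row
    return dict(index)

def rows_of(bitmap):
    """List the row numbers whose bit is set in the bitmap."""
    return [row for row in range(bitmap.bit_length()) if (bitmap >> row) & 1]

# Hypothetical sample column with three distinct values.
column = ["red", "blue", "red", "green", "blue", "red"]
index = build_value_list_index(column)

# A disjunction of two equality predicates is a single bitwise OR.
red_or_green = index["red"] | index["green"]
print(rows_of(red_or_green))  # [0, 2, 3, 5]
```

One bitmap per distinct value keeps equality lookups trivial, at the price of storing as many bitmaps as the attribute has values.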

Advantages. Compact size. Efficient hardware support for bitmap operations (AND, OR, XOR, NOT). Fast search. Multiple different bitmap indexes for different kinds of queries.

Selection queries. Queries of the form “A op v”, where A refers to the indexed attribute, v is a value, and op is a comparison operator. Two kinds of predicates: range predicates and equality predicates.

Space-time tradeoff of bitmap indexes for selection queries. Space-optimal bitmap index. Time-optimal bitmap index under a given space constraint. Bitmap index with the optimal space-time tradeoff. Time-optimal bitmap index.

Attribute Value Decomposition.

Bitmap Encoding Scheme. Equality Encoding: bi bits, one for each possible value; all bits are 0 except the bit for vi, which is 1. Range Encoding: the vi rightmost bits are 0, the rest are 1.
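The two encodings can be compared on a single component. The sketch below is illustrative only: the function names are hypothetical, `v` is one component value (a digit) and `b` is the base of that component, and the bits are shown as a string with position b-1 on the left and position 0 on the right.

```python
def equality_encode(v, b):
    """b bits; only the bit at position v is 1."""
    return "".join("1" if j == v else "0" for j in reversed(range(b)))

def range_encode(v, b):
    """Bit j is 1 iff v <= j, so the v rightmost bits are 0 and the rest are 1."""
    return "".join("1" if j >= v else "0" for j in reversed(range(b)))

for v in range(5):
    print(v, equality_encode(v, 5), range_encode(v, 5))
```

In the range encoding the leftmost bit (position b-1) is 1 for every value, so that bitmap carries no information and need not be stored; a range-encoded component of base b can therefore make do with b-1 bitmaps.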

Evaluation Algorithm for Range-Encoded Bitmap Indexes.
RangeEval: O’Neil and Quass.
RangeEval-Opt:
– about 50% fewer bitmap operations
– fewer bitmap scans for range predicate evaluation
– calculates only the requested bitmap
– avoids the intermediate equality predicate evaluation by evaluating each range query in terms of <= only, based on:
  A < v == A <= v-1
  A > v == !(A <= v)
  A >= v == !(A <= v-1)
– works with only one bitmap B vs. working with at least two [Beq and (Blt or Bge)]
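The rewriting of every range predicate into a "<=" test can be sketched on a single-component index. This is an illustration under stated assumptions, not the RangeEval-Opt pseudocode: the names `build_le_bitmaps` and `evaluate` are hypothetical, attribute values are assumed to be integers in 0..cardinality-1, bitmaps are Python integers, and for simplicity the all-ones bitmap for the largest value is stored as well.

```python
def build_le_bitmaps(column, cardinality):
    """le[j] has bit r set iff column[r] <= j (range encoding, one component)."""
    le = [0] * cardinality
    for row, value in enumerate(column):
        for j in range(value, cardinality):
            le[j] |= 1 << row
    return le

def evaluate(le, n_rows, op, v):
    """Evaluate 'A op v' using only bitmaps of the form A <= x."""
    all_ones = (1 << n_rows) - 1

    def at_most(x):
        if x < 0:
            return 0              # nothing is <= a value below the domain
        if x >= len(le):
            return all_ones       # everything is <= a value above the domain
        return le[x]

    if op == "<=":
        return at_most(v)
    if op == "<":                 # A < v   ==  A <= v-1
        return at_most(v - 1)
    if op == ">":                 # A > v   ==  !(A <= v)
        return all_ones & ~at_most(v)
    if op == ">=":                # A >= v  ==  !(A <= v-1)
        return all_ones & ~at_most(v - 1)
    raise ValueError(f"unsupported operator: {op}")

column = [3, 0, 7, 5, 5, 9, 1]    # hypothetical attribute values
le = build_le_bitmaps(column, 10)
print(bin(evaluate(le, len(column), ">", 4)))
```

Each of the four predicates touches one stored bitmap and at most one NOT, which is the point of expressing everything through "<=".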

Example: A <= 864 using a 3-component base-10 index. RangeEval-Opt: 4 operations, 5 scans. RangeEval: 10 operations, 6 scans.
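The operation and scan counts for A <= 864 can be reproduced with a small sketch of a "<=" evaluation over a multi-component range-encoded index. It is written from the description in these slides and is not claimed to match the published RangeEval-Opt pseudocode line for line. The function names and the sample column are hypothetical; `bases` lists the component bases with the least significant component first, and bitmaps are Python integers.

```python
import math

def decompose(value, bases):
    """Split value into digits, least significant first (bases[0] is b_1)."""
    digits = []
    for b in bases:
        digits.append(value % b)
        value //= b
    return digits

def build_range_index(column, bases):
    """index[i][j] has bit r set iff digit i of column[r] is <= j.

    Only j = 0 .. b_i - 2 is stored; the bitmap for j = b_i - 1 is all ones.
    """
    index = [[0] * (b - 1) for b in bases]
    for row, value in enumerate(column):
        for i, digit in enumerate(decompose(value, bases)):
            for j in range(digit, bases[i] - 1):
                index[i][j] |= 1 << row
    return index

def eval_le(index, bases, n_rows, v):
    """Return (bitmap, operations, scans) for the predicate A <= v."""
    all_ones = (1 << n_rows) - 1
    if v < 0:
        return 0, 0, 0
    if v >= math.prod(bases) - 1:
        return all_ones, 0, 0
    digits = decompose(v, bases)
    ops = scans = 0
    # Least significant component: A_1 <= v_1.
    if digits[0] < bases[0] - 1:
        result = index[0][digits[0]]
        scans += 1
    else:
        result = all_ones
    # Each further component: (A_i <= v_i - 1) OR ((A_i <= v_i) AND result).
    for i in range(1, len(bases)):
        d = digits[i]
        if d != bases[i] - 1:
            result &= index[i][d]
            ops += 1
            scans += 1
        if d != 0:
            result |= index[i][d - 1]
            ops += 1
            scans += 1
    return result, ops, scans

bases = [10, 10, 10]
column = [864, 865, 12, 999, 860, 870, 0, 500]   # hypothetical data
index = build_range_index(column, bases)
bitmap, ops, scans = eval_le(index, bases, len(column), 864)
print(ops, scans)  # 4 5
```

For 864 the digits are 4, 6 and 8: one scan for the first component, then an AND and an OR for each of the other two, which gives the 4 operations and 5 scans quoted above.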

Analytical Comparison

Cost Model for Space-Time Tradeoff Analysis. Space(I): the space metric is in terms of the number of bitmaps stored. Time(I): the time metric is in terms of the expected number of bitmap scans for a selection query evaluation.

Comparison of Bitmap Encoding Schemes. Equality encoded: S(I) ~ C, T(I) ~ n*b/2. Range encoded: S(I) ~ C-n, T(I) ~ 2n.

Space Optimal:
– number of bitmaps in the n-component space-optimal index = n(b-2) b~
– space efficiency is a non-decreasing function of the number of components.
– the ultimate optimum is reached when n = log2(C).
Time Optimal:
– the base of the n-component time-optimal index is <2, 2, 2, …, C/2^(n-1)>
– time efficiency is a non-increasing function of the number of components.
– the ultimate optimum is reached when n = 1.
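The opposite pull of space and time can be seen by tabulating a few candidate bases for one attribute. The sketch below rests on two assumptions that are not taken from the slides verbatim: a range-encoded component of base b stores b-1 bitmaps, and a "<=" predicate needs at most one scan for the first component plus two for each further component (consistent with the 5 scans of the 3-component example). The function names and the cardinality C = 1000 are hypothetical.

```python
import math

def range_encoded_space(bases):
    """Bitmaps stored by a range-encoded index: b_i - 1 per component (assumption)."""
    return sum(b - 1 for b in bases)

def max_scans_le(bases):
    """Upper bound on bitmap scans for 'A <= v': 1 + 2 per extra component (assumption)."""
    return 2 * len(bases) - 1

C = 1000  # hypothetical attribute cardinality
candidates = [[1000], [32, 32], [10, 10, 10], [2] * 10]

for bases in candidates:
    # A base is usable only if its components can represent all C values.
    assert math.prod(bases) >= C
    print(len(bases), range_encoded_space(bases), max_scans_le(bases))
```

With these assumptions a single component needs 999 bitmaps but one scan, while ten base-2 components need only 10 bitmaps at the cost of many more scans, which matches the direction of both monotonicity claims above.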

Optimal Space-Time Tradeoff (knee). Based on experiments, guesswork and gut feeling. The knee is a 2-component index. The base of the most time-efficient 2-component space-optimal index is given by:

Time Optimal Bitmap Index Under Space Constraint

Bitmap Index Storage Schemes. Bitmap-Level Storage (BS): each bitmap has its own file. Component-Level Storage (CS): each index component has its own file. Index-Level Storage (IS): all bitmaps together in one file.

Compression of each file. CS has the best Space(I) tradeoff after compression. BS has the best Time(I) tradeoff after compression.