Sec 14.7 Bitmap Indexes Shabana Kazi. Introduction A bitmap index is a special kind of index that stores the bulk of its data as bit arrays (commonly.

Slides:



Advertisements
Similar presentations
E LEMENTARY D ATA STRUCTURES. A RRAYS Sequence of elements of similar type. Elements are selected by integer subscripts. For example, signs: array[37:239]
Advertisements

Multidimensional Data. Many applications of databases are "geographic" = 2­dimensional data. Others involve large numbers of dimensions. Example: data.
Multidimensional Data Rtrees Bitmap indexes. R-Trees For “regions” (typically rectangles) but can represent points. Supports NN, “where­am­I” queries.
Modern Information Retrieval Chapter 8 Indexing and Searching.
Query Evaluation. An SQL query and its RA equiv. Employees (sin INT, ename VARCHAR(20), rating INT, age REAL) Maintenances (sin INT, planeId INT, day.
COMP 451/651 B-Trees Size and Lookup Chapter 1.
Bitmap Index Buddhika Madduma 22/03/2010 Web and Document Databases - ACS-7102.
1 Huffman Codes. 2 Introduction Huffman codes are a very effective technique for compressing data; savings of 20% to 90% are typical, depending on the.
Chapter 1.7 Storing Fractions. Excess Notation, continued… In this notation, "m" indicates the total number of bits. For us (working with 8 bits), it.
BTrees & Bitmap Indexes
Multiple-key indexes Index on one attribute provides pointer to an index on the other. If V is a value of the first attribute, then the index we reach.
Hash Table indexing and Secondary Storage Hashing.
CSCI 3 Chapter 1.8 Data Compression. Chapter 1.8 Data Compression  For the purpose of storing or transferring data, it is often helpful to reduce the.
M68K Assembly Language Programming Bob Britton Chapter 12 IEEE Floating Point Notice: Chapter slides in this series originally provided by B.Britton. This.
COMP 451/651 Multiple-key indexes
BITMAP INDEXES Parin Shah (Id :- 207). Introduction A bitmap index is a special kind of index that stores the bulk of its data as bit arrays (commonly.
Copyright © Cengage Learning. All rights reserved.
Bitmap Indexes.
CS 255: Database System Principles slides: Variable length data and record By:- Arunesh Joshi( 107) Id: Cs257_107_ch13_13.7.
 A binary number is a number that includes only ones and zeroes.  The number could be of any length  The following are all examples of binary numbers.
REMEMBER: A number can be negative or positive as shown in the number line below: Negative Numbers (-) Positive Numbers (+)
CSE Lectures 22 – Huffman codes
HOW TO ADD INTEGERS REMEMBER:
Lecture for Week Spring.  Numbers can be represented in many ways. We are familiar with the decimal system since it is most widely used in everyday.
Noiseless Coding. Introduction Noiseless Coding Compression without distortion Basic Concept Symbols with lower probabilities are represented by the binary.
Copyright © Cengage Learning. All rights reserved. CHAPTER 2 THE LOGIC OF COMPOUND STATEMENTS THE LOGIC OF COMPOUND STATEMENTS.
C-Store: A Column-oriented DBMS Speaker: Zhu Xinjie Supervisor: Ben Kao.
Number Systems Part 2 Numerical Overflow Right and Left Shifts Storage Methods Subtraction Ranges.
Signature files. Signature Files Important alternative to inverted indexes. Given a document, the signature is calculated as follows. - First, each word.
Numbering Systems. CSCE 1062 Outline What is a Numbering System Review of decimal numbering system Binary representation range Hexadecimal numbering system.
Paging Examples Assume a page size of 1K and a 15-bit logical address space. How many pages are in the system?
Basics of Data Compression Paolo Ferragina Dipartimento di Informatica Università di Pisa.
Prof. Yousef B. Mahdy , Assuit University, Egypt File Organization Prof. Yousef B. Mahdy Chapter -4 Data Management in Files.
Introduction n – length of text, m – length of search pattern string Generally suffix tree construction takes O(n) time, O(n) space and searching takes.
Data : The Small Forwarding Table(SFT), In general, The small forwarding table is the compressed version of a trie. Since SFT organizes.
7/11/20081 Today’s Agenda Friday 6 th Nov Revision of certain topics Floating point notation Excess Notation Exam format Interim Report.
Compiler Chapter# 5 Intermediate code generation.
Bitmap Indices for Data Warehouse Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY.
Logic (continuation) Boolean Logic and Bit Operations.
Positional Data Organization and Compression in Web Inverted Indexes Leonidas Akritidis Panayiotis Bozanis Department of Computer & Communication Engineering,
Quiz # 1 Chapters 1,2, & 3.
C-Store: Data Model and Data Organization Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY May 17, 2010.
BITMAP INDEXES Sai Priya Rama Gopal SJSU ID : Class ID: 125.
Word : Let F be a field then the expression of the form a 1, a 2, …, a n where a i  F  i is called a word of length n over the field F. We denote the.
1 Chapter 7 Skip Lists and Hashing Part 2: Hashing.
Chapter 5 Multidimensional Indexes. One dimensional index can be used to support multidimensional query. F1=‘abcd’ F2= 123‘abcd#123’
The task of compression consists of two components, an encoding algorithm that takes a file and generates a “compressed” representation (hopefully with.
Introduction * Binary numbers are represented with a separate sign bit along with the magnitude. * For example, in an 8-bit binary number, the MSB is.
Comp 335 File Structures Data Compression. Why Study Data Compression? Conserves storage space Files can be transmitted faster because there are less.
Index construction: Compression of postings Paolo Ferragina Dipartimento di Informatica Università di Pisa Reading 5.3 and a paper.
D Nagesh Kumar, IIScOptimization Methods: M8L5 1 Advanced Topics in Optimization Evolutionary Algorithms for Optimization and Search.
จัดทำโดย นายชนากานต์ สันติคุณาภรณ์ นายธฤษพงศ์ ศิริบูรณ์ นางสาวศุภาภรณ์ ถ่านคำ.
ECE DIGITAL LOGIC LECTURE 2: DIGITAL COMPUTER AND NUMBER SYSTEMS Assistant Prof. Fareena Saqib Florida Institute of Technology Fall 2016, 01/14/2016.
Multidimensional Access Structures COMP3017 Advanced Databases Dr Nicholas Gibbins –
CS 257: Database System Principles Variable length data and record BY Govind Kalyankar Class Id: 107.
BITMAP INDEXES Barot Rushin (Id :- 108).
Chapter 5. Multidimensional Indexes
Department of Computer Science Georgia State University
Programmable Logic Controller
Module 11: File Structure
Multidimensional Access Structures
Ch. 8 File Structures Sequential files. Text files. Indexed files.
Fractal image compression
Functions Defined on General Sets
Paging Examples Assume a page size of 1K and a 15-bit logical address space. How many pages are in the system?
Student: Ying Hong Course: Database Security Instructor: Dr. Yang
BITMAP INDEXES E0 261 Jayant Haritsa Computer Science and Automation
Copyright © Cengage Learning. All rights reserved.
Everything is an Integer Countable and Uncountable Sets
1.6) Storing Integer: 1.7) storing fraction:
Presentation transcript:

Sec 14.7 Bitmap Indexes Shabana Kazi

Introduction A bitmap index is a special kind of index that stores the bulk of its data as bit arrays (commonly called "bitmaps"). A bitmap index for a field F is a collection of bit-vectors of length n. The vector for value v has 1 in position i if the i th record has v in field F, and it has 0 there if not. It answers most queries by performing bitwise logical operations on these bitmaps.

Example 1 Suppose a file consists of records with two fields, F and G, of type integer and string, respectively. The current file has six records, numbered 1 through 6, with the following values in order: NoFG 130foo 230bar 340baz 450foo 540bar 630baz

Example 1(contd…) A bitmap index for the first field, F, would have three bit-vectors, each of length 6 as shown in the table. In each case, the 1's indicate in which records the corresponding string appears. ValueVector

Example (contd…) A bitmap index for the first field, G, would have three bit-vectors, each of length 6 as shown in the table. In each case, the 1's indicate in which records the corresponding string appears. ValueVector FOO BAR BAZ001001

Motivation for Bitmap Indexes: Bitmap indexes can help answer range queries. Given is the data of a jewelry stores. The attributes considered are age and salary. NoAgeSalary

Motivation for Bitmap Indexes (contd…) A bitmap index for the first field Age, would have seven bit-vectors, each of length 12 as shown in the table. In each case, the 1's indicate in which records the corresponding string appears. AgeVector

Motivation of Bitmap Indexes(contd…) A bitmap index for the second field Salary, would have ten bit- vectors, each of length 12 as shown in the table. In each case, the 1's indicate in which records the corresponding string appears. SalaryVector

Motivation of Bitmap Indexes(contd…) If we want to find the jewelry buyers with an age in the range and a salary in the range The bit-vectors for the age values in this range are found : and , for 45 and 50, respectively. If we take their bitwise OR, we have a new bit-vector with 1 in position i if and only if the i th record has an age in the desired range. This bit-vector is

Motivation of Bitmap Indexes (contd…) Next, we find the bit-vectors for the salaries between 100 and 200 thousand. There are four, corresponding to salaries 100, 110, 120, and 140; their bitwise OR is The last step is to take the bitwise AND of the two bit- vectors we calculated by OR. That is: AND

Motivation for Bitmap Indexes (contd…) We thus find that only the fourth and fifth records, which are (50,100) and (50,120), are in the desired range.

Compressed Bitmaps If there is a bitmap index on field F of a file with n records and there are m different values for field F. The number of bits in all the bit- vectors is mn. In our example, n= 12 and m=7 NoAgeSalary

Compressed Bitmaps (contd..) If the block size is 4096 bytes, then bits can fit in one block. Therefore the number of blocks needed is mn/ If ‘m’ is large, then 1’s in a bit-vector will be rare. The probability that any can bit is 1 is 1/m. If the 1’s are rare, then the bit-vectors can be encoded so that they take less than n bits on average. A common approach is run-length encoding.

Compressed Bitmaps (contd..) Example :If i = 13, then j = 4; that is, we need 4 bits in the binary representation of i. Thus. the encoding for i begins with We follow with i in binary, or Thus, the encoding for 13 is

Compressed Bitmaps (contd..) Example: Let us decode the sequence Starting at the beginning. we find the first 0 at the 4th bit. So j = 4. The next 4 bits are 1101.so we determine that the first integer is 13. We are now left with to decode. Since the first bit is 0: we know the next bit represents the next integer by itself. This integer is 0.

Compressed Bitmaps (contd..) We find the first 0 in the second position, whereupon we conclude that the final two bits represent the last integer, 3. Our entire sequence of run-lengths is thus 13, 0, 3. From these numbers, we can reconstruct the actual bit-vector,

Compressed Bitmaps (contd..) Example: Convert the bit vectors for the three ages, 25 – – – For 25, the run-length sequence is (0,7) The bit vector is For 30, run-length sequence is (7). The bit vector is For 45, the run-length sequence is (1,7), Bit vector is

Thank you