BITMAP INDEXES Sai Priya Rama Gopal SJSU ID : Class ID: 125
Introduction A bitmap index is a special kind of index that stores the bulk of its data as bit arrays (commonly called "bitmaps"). It answers most queries by performing bitwise logical operations on these bitmaps. The bitmap index is designed for cases where the number of distinct values is low; in other words, the values repeat very frequently.
Example Suppose a file consists of records with two fields, F and G, of type integer and string, respectively. The current file has six records, numbered 1 through 6, with the following values in order:
No  F   G
1   30  FOO
2   30  BAR
3   40  BAZ
4   50  FOO
5   40  BAR
6   30  BAZ
Example (contd…) A bitmap index for the first field, F, would have three bit-vectors, each of length 6, as shown in the table. In each case, the 1's indicate in which records the corresponding value appears.
Table 2
Value  Vector
30     110001
40     001010
50     000100
Example (contd…) A bitmap index for the second field, G, would have three bit-vectors, each of length 6, as shown in the table. In each case, the 1's indicate in which records the corresponding string appears.
Table 3
Value  Vector
FOO    100100
BAR    010010
BAZ    001001
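The construction of the two indexes can be sketched in a few lines of Python. This is only an illustrative sketch, not a description of how a database system stores bitmaps: the function name `build_bitmap_index` is hypothetical, and bit-vectors are kept as strings purely for readability.

```python
from collections import defaultdict

# The six records (F, G) from the example, in record-number order 1..6.
records = [(30, "FOO"), (30, "BAR"), (40, "BAZ"),
           (50, "FOO"), (40, "BAR"), (30, "BAZ")]


def build_bitmap_index(values):
    """Map each distinct value to a bit-vector string, one bit per record."""
    n = len(values)
    index = defaultdict(lambda: ["0"] * n)
    for i, v in enumerate(values):
        index[v][i] = "1"  # bit i is set when record i+1 holds value v
    return {v: "".join(bits) for v, bits in index.items()}


index_f = build_bitmap_index([f for f, _ in records])
index_g = build_bitmap_index([g for _, g in records])

for value, vector in index_f.items():
    print(value, vector)
for value, vector in index_g.items():
    print(value, vector)
```

Each field yields one vector per distinct value, and every vector has exactly one bit per record, so the 1's across all vectors of a field never overlap.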
Motivation for Bitmap Indexes Bitmap indexes can help answer range queries. Example: Given is the data of a jewelry store. The attributes considered are age and salary.
Table 4 (columns: No, Age, Salary)
Motivation (contd…) A bitmap index for the first field, Age, would have seven bit-vectors, each of length 12, as shown in the table. In each case, the 1's indicate in which records the corresponding age appears.
Table 5 (columns: Value, Vector)
Motivation (contd…) A bitmap index for the second field, Salary, would have ten bit-vectors, each of length 12, as shown in the table. In each case, the 1's indicate in which records the corresponding salary appears.
Table 5 (columns: Value, Vector)
Motivation (contd…) Suppose we want to find the jewelry buyers with an age in a given range and a salary between 100 and 200 thousand. We first find the bit-vectors for the age values in the age range; in this example there are only two, for 45 and 50. Taking their bitwise OR gives a new bit-vector with 1 in position i if and only if the i-th record has an age in the desired range.
Motivation (contd…) Next, we find the bit-vectors for the salaries between 100 and 200 thousand. There are four, corresponding to salaries 100, 110, 120, and 140, and we take their bitwise OR in the same way.
Motivation (contd…) The last step is to take the bitwise AND of the two bit-vectors we calculated by OR. The result is a bit-vector of length 12 with 1's only in the positions of the records that satisfy both conditions.
Motivation (contd…) We thus find that only the fourth and fifth records, which are (50,100) and (50,120), are in the desired range.
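The OR-then-AND evaluation can be sketched as follows. The twelve (age, salary) pairs and the age bounds 45 to 55 are assumptions: they are chosen only to agree with what the slides state (seven distinct ages, ten distinct salaries, ages 45 and 50 in range, salaries 100, 110, 120 and 140 in range, and records 4 and 5 as the answer). The helper names `build_index` and `range_bitmap` are hypothetical, and each bitmap is held in a Python integer with record 1 as the leftmost bit.

```python
# Assumed (age, salary-in-thousands) pairs for records 1..12; illustrative only.
buyers = [(25, 60), (45, 60), (50, 75), (50, 100), (50, 120), (70, 110),
          (85, 140), (30, 260), (25, 400), (45, 350), (50, 275), (60, 260)]
n = len(buyers)


def build_index(values):
    """Map each distinct value to an int bitmap; record 1 is the leftmost bit."""
    index = {}
    for i, v in enumerate(values):
        index[v] = index.get(v, 0) | (1 << (n - 1 - i))
    return index


def range_bitmap(index, low, high):
    """OR together the bit-vectors of all indexed values in [low, high]."""
    result = 0
    for value, bitmap in index.items():
        if low <= value <= high:
            result |= bitmap
    return result


age_index = build_index([a for a, _ in buyers])
salary_index = build_index([s for _, s in buyers])

age_bits = range_bitmap(age_index, 45, 55)          # assumed age bounds
salary_bits = range_bitmap(salary_index, 100, 200)  # bounds stated in the text
answer = age_bits & salary_bits                     # both conditions must hold

# Translate the set bits of the answer back into record numbers.
matches = [i + 1 for i in range(n) if (answer >> (n - 1 - i)) & 1]

print(format(age_bits, f"0{n}b"))
print(format(salary_bits, f"0{n}b"))
print(format(answer, f"0{n}b"))
print(matches)
```

With the assumed data, the final bit-vector has 1's only in positions 4 and 5, matching the records (50,100) and (50,120) named above. No record is read from the data file until the last step.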
Issues for Bitmap Indexes
– Managing bitmap indexes.
– Memory requirement.
Managing Bitmap Indexes
– Finding bit-vectors.
– Finding records.
– Handling modifications to the data file.
– Record numbers must remain fixed once assigned.
– Changes to the data file require the bitmap index to change as well.
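The three management tasks can be illustrated with a toy in-memory class. This is a sketch under stated assumptions rather than a real storage design: the class `BitmapIndex` and its methods are hypothetical, a Python dict stands in for whatever secondary structure locates a bit-vector by value, and a deleted record simply has its bit cleared so that its record number is never reused.

```python
class BitmapIndex:
    """Toy bitmap index over one field; record numbers start at 1 and are never reused."""

    def __init__(self):
        self.vectors = {}  # value -> list of 0/1 bits, one per record number
        self.size = 0      # how many record numbers have been assigned so far

    def find_vector(self, value):
        """Finding bit-vectors: look up the vector for a value (all zeros if absent)."""
        return self.vectors.get(value, [0] * self.size)

    def find_records(self, value):
        """Finding records: translate set bits back into record numbers."""
        return [i + 1 for i, bit in enumerate(self.find_vector(value)) if bit]

    def insert(self, value):
        """Append a record: every vector grows by one bit; return the new record number."""
        for bits in self.vectors.values():
            bits.append(0)
        self.size += 1
        vector = self.vectors.setdefault(value, [0] * self.size)
        vector[-1] = 1
        return self.size

    def delete(self, record_no, value):
        """Delete a record: clear its bit but keep its position, so numbering stays fixed."""
        self.vectors[value][record_no - 1] = 0


index = BitmapIndex()
for f in [30, 30, 40, 50, 40, 30]:  # field F of the six-record example
    index.insert(f)

index.delete(2, 30)   # record 2 is removed; records 3..6 keep their numbers
index.insert(60)      # a new value gets a new bit-vector and record number 7

print(index.find_records(30))
print(index.find_vector(60))
```

After the deletion and insertion, the value 30 is found in records 1 and 6 only, and the new value 60 has a vector of length 7 with a single 1 in the last position. Both operations touch the index as well as the data file, which is the cost the last bullet refers to.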