Bitmap indexes

Bitmap Indices, Advantages and Disadvantages

bitmap_indices,_advantages_and_disadvantages

Introduction

Consider regular example of student table with ID, NAME, ADDRESS, AGE, GENDER and Semester. This table will be loaded with fresh data at the beginning of academic year for new students and updated for the older student with next semester. Most of the time this table will be static and there will be comparatively less insert/ delete/update on it. At the same time there will lot of ad-hoc retrieval with this table i.e.; there would be different types of query fired on this table to get the detail of the students like list of all male/female students, students residing at particular area, or male/female students in particular semester etc. But what would be the size of this student table in a college database? Very huge, right? In this case user query should be very quick enough to get any kind of query result. Will our traditional file organization methods be faster to give us the results? Or can we think of any better method of storing and retrieving the data?

Yes there is better method to select the records from memory for static tables which is accessed multiple times only to see the table content.

Bitmap Indices

Bitmap indices are very powerful in retrieving the records for above cases. In this method, indexes for columns with less unique values are used in the form of bits. Let us try to understand one by one about this method using example.

In the above example, if we notice the column GENDER, it can have only two values - Male or Female. Compared to the whole STUDENT table, they are not unique value. Similarly, say we have only four semesters for the course, and then we can have only four values - sem1, sem2, sem3 and sem4. These types of columns are called less unique value columns or columns with less cardinality. Even though these columns have less frequent values, they are queried most often.

Bits - as everyone knows it is the smallest unit of data representation. It can have either 0 or 1 as value. Now what if we use these bits to represent these less unique value columns? But how? This technique of storing less frequently used columns in the form of bits is called bitmap indices.

This method is used for very large tables with less unique value columns and is accessed number of time with various retrieval queries. In this method we will have

  1. As many bits as the number of rows in the table for each of less unique value columns. For example, if the STUDENT table has 10K records, then we will have 10K bits - one bit for each row.
  2. Number of bitmap indices created on the column will be equal to number of distinct values in the column. For example, for GENDER column we will have two bitmap indices created - one for male and one for female, and for semester column we will have four bitmap indices created - 1, 2, 3, and 4.
  3. If we have any matching value in the column for the row, then that row bit will have '1', else '0'. That means, for GENDER column we will have 2 bitmap indices - one for male and one for female. The bit value for the male bit map index will be 1, if that row has GENDER as 'M', else '0'.

Imagine in our example of STUDENT table has only four records and has values as below.

STUDENT
Student_idStudent_nameaddressagegendersemester
100JohnAddress120M1
101AlenAddress221F1
102SaraAddress320F2
103ChrisAddress422F4
  1. As per rule 1, we will have four rows in the table and hence we will have bits - one bit for each row.
  2. The GENDER column has only two values - 'M' and 'F'. Hence we will have two bitmap indices - one for 'M' and one for 'F'.
  3. Now the bitmap index for GENDER column is as below. Here Bitmap index 'M' has value '1000' indicating first row has gender as 'M' and rest of the rows do not have gender as 'M' Similarly, bitmap index 'F' indicates first row is not 'F' and rest of all rows are having gender as 'F'.
M1 0 0 0
F0 1 1 1

Similarly bitmap index for Semester can be as follows:

SEMESTER
11 1 0 0
20 0 1 0
30 0 0 0
40 0 0 1

Suppose that we have to find the female students who are in the second semester. Here this query uses two columns to filter the records, where both of them have less unique value columns.

SELECT * FROM STUDENT WHERE GENDER = 'F' AND SEMESTER =2;
Bitmap Index for F:	0 1 1 1 	 
 	AND	 
Bitmap Index for SEM 2:	0 0 1 0 	 
Result:	                0 0 1 0     this implies third row has the result for the query.

Look at the table to check if it is correct. Yes, it has resulted in correct row. So the DBMS goes to the third row in the file and displays the result for the user.

Here fetching the bitmap index and performing the logical 'AND' operation to get the result is comparatively faster. Hence this method of storage is useful for this kind of data.

If we have to delete the record from the table, a temporary delete index will be generated. Then it will perform logical 'AND' operation on filter columns and temporary index to remove the data from the table.

Suppose we have to delete the female student from the 4th semester. Then steps involved in deleting this record is as follows.

DELETE FROM STUDENT WHERE GENDER = 'F' AND SEMESTER =4;
Bitmap Index for F:	0 1 1 1	 
 	AND	 
Bitmap Index for SEM 4:	0 0 0 1	 
 	AND	 
 	                1 0 0 0	         temporary delete index
Result: 	        0 0 0 0 	 this implies no rows, hence record is deleted

When to use Bitmap index

Bitmap indexes are only suitable for static tables and materialized views which are updated at night and rebuilt after batch row loading. If your tables experience multiple DML's per second, BE CAREFUL when implementing bitmap indexes!

  • 1 - 7 distinct key values - Queries against bitmap indexes with a low cardinality are very fast.
  • 8-100 distinct key values - As the number if distinct values increases, performance decreases proportionally.
  • 100 - 10,000 distinct key values - Over 100 distinct values, the bitmap indexes become huge and SQL performance drops off rapidly.
  • Over 10,000 distinct key values - At this point, performance is ten times slower than an index with only 100 distinct values.

You will want a bitmap index when:

  1. Table column is low cardinality - As a ROUGH guide, consider a bitmap for any index with less than 100 distinct values.
  2. The table has LOW DML - You must have low insert/update/delete activity. Updating bitmapped indexes take a lot of resources, and bitmapped indexes are best for largely read-only tables and tables that are batch updated nightly.
  3. Multiple columns - Your SQL queries reference multiple, low cardinality values in there where clause. Oracle cost-based SQL optimizer (CBO) will scream when you have bitmap indexes on .

As a general rule, a cardinality of under 1% makes a good candidate for a bitmap index

Problems when implementing bitmap indexes

Some of the most common problems when implementing bitmap indexes include:

  1. Small table - The CBO may force a full-table scan if your table is small!
  2. Bad stats - Make sure you always analyze the bitmap with dbms_stats right after creation:
    CREATE BITMAP INDEX 
    emp_bitmap_idx
    ON index_demo (gender);
    
    exec dbms_stats.gather_index_stats(OWNNAME=>'SCOTT', INDNAME=>'EMP_BITMAP_IDX');
  3. Test with a hint - To force the use of your new bitmap index, just use a Oracle INDEX hint:
    select /*+ index(emp emp_bitmap_idx) */ 
       count(*)
    from 
       emp, dept
    where 
       emp.deptno = dept.deptno;

Advantages of Bitmap Indices

  • As we have seen already, this method helps in faster retrieval of the records when there are less cardinality columns and those columns are most frequently used in the query. This method is efficient even if we have very big table.
  • This method is more efficient when the columns have least involved in insert/update/delete operations.
  • It allows to combine multiple bitmap indices together to fire the query as we have seen in above examples.

Disadvantages of Bitmap Indices

  • They are not suitable for small tables. In small tables, DBMS will force to use full table scan instead of using bitmap index.
  • When there is multiple insert/update/delete on the table from different users, it may cause deadlock on the tables. It will take time to perform the DML transaction and then to update the bitmap index. Hence when there is multiple DML transaction from different users, it will not be able perform transaction quickly, and causing the deadlock.
  • When there is large number of records, there is an overhead to maintain this bitmap indexes. Each time a new record is entered, we have to modify the bitmap index throughout, which is a tedious and time consuming.

Syntax

In SQL we can create bitmap index by using below query.

CREATE BITMAP INDEX Index_Name
ON Table_Name (Column_Name);

Our example above for creating bitmap index for Gender and Semester can be written as follows:

CREATE BITMAP INDEX IDX_GENDER
ON STUDENT (GENDER);

CREATE BITMAP INDEX IDX_ SEMESTER
ON STUDENT (SEMESTER);


If you like dEexams.com and would like to contribute, you can also write your article here or mail your article to admin@deexams.com . See your article appearing on the dEexams.com main page and help others to learn.


Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.


Python if , elif and else

Python Conditions and If statements

  • 0
Python for beginners

Learning Python Part 1

  • 3
Struct Alignment and Padding

Struct Alignment and Padding in C++ And C

  • 0
Friend function

Friend function C++

  • 0
Pointers

C++ Pointers

  • 0
Structures

C++ Structures

  • 0
Types of Inheritance in C++

Inheritance and access specifiers C++

  • 0
Java date pattern

Java Date Pattern Syntax

  • 0
Java Date and Calendar

Java Date formats

  • 0
JAVA Data Type

Data types in Java

  • 0
Java unreachable code

Unreachable Code Error in Java

  • 0

Post Comment

Comments(0)

WEB TECHNOLOGY

Articles

×

Forgot Password

Please enter your email address below and we will send you information to change your password.