Consider regular example of student table with ID, NAME, ADDRESS, AGE, GENDER and Semester. This table will be loaded with fresh data at the beginning of academic year for new students and updated for the older student with next semester. Most of the time this table will be static and there will be comparatively less insert/ delete/update on it. At the same time there will lot of ad-hoc retrieval with this table i.e.; there would be different types of query fired on this table to get the detail of the students like list of all male/female students, students residing at particular area, or male/female students in particular semester etc. But what would be the size of this student table in a college database? Very huge, right? In this case user query should be very quick enough to get any kind of query result. Will our traditional file organization methods be faster to give us the results? Or can we think of any better method of storing and retrieving the data?
Yes there is better method to select the records from memory for static tables which is accessed multiple times only to see the table content.
Bitmap indices are very powerful in retrieving the records for above cases. In this method, indexes for columns with less unique values are used in the form of bits. Let us try to understand one by one about this method using example.
In the above example, if we notice the column GENDER, it can have only two values - Male or Female. Compared to the whole STUDENT table, they are not unique value. Similarly, say we have only four semesters for the course, and then we can have only four values - sem1, sem2, sem3 and sem4. These types of columns are called less unique value columns or columns with less cardinality. Even though these columns have less frequent values, they are queried most often.
Bits - as everyone knows it is the smallest unit of data representation. It can have either 0 or 1 as value. Now what if we use these bits to represent these less unique value columns? But how? This technique of storing less frequently used columns in the form of bits is called bitmap indices.
This method is used for very large tables with less unique value columns and is accessed number of time with various retrieval queries. In this method we will have
Imagine in our example of STUDENT table has only four records and has values as below.
STUDENTStudent_id | Student_name | address | age | gender | semester |
---|---|---|---|---|---|
100 | John | Address1 | 20 | M | 1 |
101 | Alen | Address2 | 21 | F | 1 |
102 | Sara | Address3 | 20 | F | 2 |
103 | Chris | Address4 | 22 | F | 4 |
M | 1 0 0 0 |
F | 0 1 1 1 |
Similarly bitmap index for Semester can be as follows:
SEMESTER | |
---|---|
1 | 1 1 0 0 |
2 | 0 0 1 0 |
3 | 0 0 0 0 |
4 | 0 0 0 1 |
Suppose that we have to find the female students who are in the second semester. Here this query uses two columns to filter the records, where both of them have less unique value columns.
SELECT * FROM STUDENT WHERE GENDER = 'F' AND SEMESTER =2;
Bitmap Index for F: 0 1 1 1
AND
Bitmap Index for SEM 2: 0 0 1 0
Result: 0 0 1 0 this implies third row has the result for the query.
Look at the table to check if it is correct. Yes, it has resulted in correct row. So the DBMS goes to the third row in the file and displays the result for the user.
Here fetching the bitmap index and performing the logical 'AND' operation to get the result is comparatively faster. Hence this method of storage is useful for this kind of data.
If we have to delete the record from the table, a temporary delete index will be generated. Then it will perform logical 'AND' operation on filter columns and temporary index to remove the data from the table.
Suppose we have to delete the female student from the 4th semester. Then steps involved in deleting this record is as follows.
DELETE FROM STUDENT WHERE GENDER = 'F' AND SEMESTER =4;
Bitmap Index for F: 0 1 1 1 AND Bitmap Index for SEM 4: 0 0 0 1 AND 1 0 0 0 temporary delete index Result: 0 0 0 0 this implies no rows, hence record is deleted
Bitmap indexes are only suitable for static tables and materialized views which are updated at night and rebuilt after batch row loading. If your tables experience multiple DML's per second, BE CAREFUL when implementing bitmap indexes!
You will want a bitmap index when:
As a general rule, a cardinality of under 1% makes a good candidate for a bitmap index
Some of the most common problems when implementing bitmap indexes include:
CREATE BITMAP INDEX emp_bitmap_idx ON index_demo (gender); exec dbms_stats.gather_index_stats(OWNNAME=>'SCOTT', INDNAME=>'EMP_BITMAP_IDX');
select /*+ index(emp emp_bitmap_idx) */ count(*) from emp, dept where emp.deptno = dept.deptno;
In SQL we can create bitmap index by using below query.
CREATE BITMAP INDEX Index_Name ON Table_Name (Column_Name);
Our example above for creating bitmap index for Gender and Semester can be written as follows:
CREATE BITMAP INDEX IDX_GENDER ON STUDENT (GENDER); CREATE BITMAP INDEX IDX_ SEMESTER ON STUDENT (SEMESTER);
If you like dEexams.com and would like to contribute, you can write your article here or mail your article to admin@deexams.com . See your article appearing on the dEexams.com main page and help others to learn.