The IN and EXISTS are processed very very differently.
SELECT * FROM tbl1 WHERE a IN ( SELECT b FROM tbl2 );
is typically processed as:
SELECT * FROM tbl1, ( SELECT DISTINCT b FROM tbl2 ) tbl2 WHERE tbl1.a = tbl2.b;
The subquery is evaluated, distinct'ed, indexed (or hashed or sorted) and then joined to the original table -- typically.
Where as
SELECT * FROM tbl1 WHERE EXISTS ( SELECT NULL FROM tbl2 WHERE b = a )
is processed more like:
FOR x IN ( SELECT * FROM tbl1 ) LOOP IF ( EXISTS ( SELECT NULL FROM tbl2 WHERE b = x.a ) THEN OUTPUT THE RECORD END IF END LOOP;
It always results in a full scan of TBL1 whereas the first query can make use of an index on TBL1(a).
Lets say the result of the subquery
( SELECT b FROM tbl2 )
is "huge" and takes a long time. But the table TBL1 is relatively small and executing
( SELECT NULL FROM tbl2 WHERE b = x.a )
is very very fast (nice index on tbl2(b)), then the EXISTS will be faster as the time to full scan TBL1 and do the index probe into TBL2 could be less than the time to simply full scan TBL2 to build the subquery we need to distinct on.
Lets say the result of the subquery is small -- then IN is typicaly more appropriate.
If both the subquery and the outer table are huge -- either might work as well as the other -- depends on the indexes and other factors.
See also NOT IN vs NOT EXISTS
If you like dEexams.com and would like to contribute, you can write your article here or mail your article to admin@deexams.com . See your article appearing on the dEexams.com main page and help others to learn.