Thursday, April 4, 2013

Why my index is being ignored post hardware upgrade?

Here's a nice case study how a better hardware can make the existing (poor) code perform worse off. Consider the following query:
select /*+ parallel*/ * from t1 where hname = 'RARE VALUE' and id = 2;
The query plan on a slow machine looks like this:
--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |     2 |    44 |     6   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1     |     2 |    44 |     6   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | T1_IDX |     2 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
and on the fast like this:
--------------------------------------------------------------------------------------------------------------
| Id  | Operation            | Name     | Rows  | Bytes | Cost (%CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |          |     2 |    44 |     4   (0)| 00:00:01 |        |      |            |
|   1 |  PX COORDINATOR      |          |       |       |            |          |        |      |            |
|   2 |   PX SEND QC (RANDOM)| :TQ10000 |     2 |    44 |     4   (0)| 00:00:01 |  Q1,00 | P->S | QC (RAND)  |
|   3 |    PX BLOCK ITERATOR |          |     2 |    44 |     4   (0)| 00:00:01 |  Q1,00 | PCWC |            |
|*  4 |     TABLE ACCESS FULL| T1       |     2 |    44 |     4   (0)| 00:00:01 |  Q1,00 | PCWP |            |
--------------------------------------------------------------------------------------------------------------
The difference that matters between the two machines is the number of CPUs. On the old one it is 16 and on the new one 80 (with HT enabled). Therefore the default degree of parallelism on the new machine is 2 (parallel_threads_per_cpu) * 80 (cpu_count) = 160, which reduces the cost of full scan of the table to a value small enough to "beat" an index access, in terms of cost of course. In reality it wasn't as fast as an index access in my case, alas.
In case you want to reproduce, here's a script to create the T1 table:
create table t1 as
   select ceil(dbms_random.value(1, 100000)) as id
         ,case 
             when mod(rownum, 1000) = 0 then cast('RARE VALUE' as varchar2(100))
             when mod(rownum, 3) = 0    then cast('COMMON VALUE 1' as varchar2(100))
             when mod(rownum, 3) = 1    then cast('COMMON VALUE 2' as varchar2(100))
             when mod(rownum, 3) = 2    then cast('COMMON VALUE 3' as varchar2(100))
          end hname
         ,'x' value
   from dual connect by level <= 500000;

<>

create index t1_idx on t1 (id, hname) pctfree 50;
The plans above were created on Oracle 11.2.0.3 on Linux x64