drop table t1 purge;
drop table t2 purge;
create table t1
as
SELECT MOD(rownum, 25) + 1 AS filter,
MOD(rownum, 30) + 1 AS join1,
lpad(rownum, 10) v1,
rpad('x', 100) padding
FROM dual
CONNECT BY LEVEL <= 10000
;
create table t2
as
SELECT MOD(rownum, 50) + 1 as filter,
MOD(rownum, 40) + 1 as join2,
lpad(rownum, 10) v2,
rpad('x', 100) padding
FROM dual
CONNECT BY LEVEL <= 10000
;
DECLARE
l sys.odcivarchar2list := sys.odcivarchar2list('T1', 'T2');
BEGIN
FOR j IN l.first .. l.last
LOOP
dbms_stats.gather_table_stats(ownname => USER, tabname => l(j),
cascade => TRUE,
method_opt => 'for all columns size 1');
END LOOP;
END;
/
Now enable autotrace and run the following query:
set autotrace traceonly explain
SELECT t1.v1,
t2.v2
FROM t1,
t2
WHERE t1.join1 = t2.join2
AND t1.filter = 1
AND t2.filter = 1;
As a result you will see a plan like the one below:
Execution Plan
----------------------------------------------------------
Plan hash value: 2959412835
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |  2000 | 68000 |    61   (2)| 00:00:01 |
|*  1 |  HASH JOIN         |      |  2000 | 68000 |    61   (2)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| T2   |   200 |  3400 |    30   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T1   |   400 |  6800 |    30   (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."JOIN1"="T2"."JOIN2")
   2 - filter("T2"."FILTER"=1)
   3 - filter("T1"."FILTER"=1)
As you can see, the cardinality of the join is 2000 (the HASH JOIN at Id 1). Now let's do the math:
JoinCard = Card(T2) * Card(T1) / Greatest(NDV(T1.Join1), NDV(T2.Join2))
JoinCard = 200 * 400 / Greatest(30, 40) = 2000
Which matches what we can see in the execution plan above.
Now let's re-set up the demo (or tweak the statistics) so that the number of distinct values of the join column in one of the tables is 100 rather than the 30 or 40 of the initial example.
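One way to get there without recreating the data is to override the column statistics directly. A minimal sketch, assuming we set the NDV of T1.JOIN1 to 100 (density is kept consistent at 1/NDV):
BEGIN
dbms_stats.set_column_stats(ownname => USER,
tabname => 'T1',
colname => 'JOIN1',
distcnt => 100,
density => 1 / 100);
END;
/
Now run the query again. In my case the plan looks like this: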
---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |   808 | 27472 |    61   (2)| 00:00:01 |
|*  1 |  HASH JOIN         |      |   808 | 27472 |    61   (2)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| T2   |   200 |  3400 |    30   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T1   |   400 |  6800 |    30   (0)| 00:00:01 |
---------------------------------------------------------------------------
As you can see, the cardinality has changed and is now 808 (I'm using 10.2.0.4 on Linux, by the way). Now let's do the math again:
JoinCard = 200 * 400 / Greatest(40, 100) = 800
Obviously the original formula no longer holds. I've done the same for NDV 1000 and 10000 and got cardinalities of 236 and 200 respectively, where the formula predicts 200 * 400 / 1000 = 80 and 200 * 400 / 10000 = 8.
Apparently Oracle uses some mechanism to dampen the selectivity (that 1 divided by the greatest of the NDVs) as the join cardinality approaches the lesser of the two filtered cardinalities, and, most importantly, it never lets the estimate drop below that lesser cardinality (in our case 200). Effectively, Oracle assumes the data is aligned in such a way that even after filtering you would still have a match for each key from the table with the lower cardinality.
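If you want to see how the optimizer actually arrives at these numbers, a 10053 optimizer trace of a hard parse shows the join selectivity and cardinality calculations. A minimal sketch (the /* ndv_test */ comment is just a hypothetical marker to force a hard parse; on 10g the trace file shows up in user_dump_dest):
ALTER SESSION SET events '10053 trace name context forever, level 1';

SELECT /* ndv_test */ t1.v1,
t2.v2
FROM t1,
t2
WHERE t1.join1 = t2.join2
AND t1.filter = 1
AND t2.filter = 1;

ALTER SESSION SET events '10053 trace name context off';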
So how can this be harmful? To answer this question, let's set up another demo:
create table t1
as
SELECT mod(rownum, 100) AS filter,
rownum AS join1,
lpad(rownum, 10) v1,
rpad('x', 100) padding
FROM dual
CONNECT BY LEVEL <= 1000000;
create table t2
as
SELECT rownum as join2,
lpad(rownum, 10) v2,
rpad('x', 100) padding
FROM dual
CONNECT BY LEVEL <= 30000;
create table t3
pctfree 90 pctused 10
as
SELECT rownum as join3,
lpad(rownum, 10) v3,
rpad('x', 100) padding
FROM dual
CONNECT BY LEVEL <= 100000;
create index t3_idx on t3(join3);
Gather the stats without histograms, the same way as before:
DECLARE
l sys.odcivarchar2list := sys.odcivarchar2list('T1', 'T2', 'T3');
BEGIN
FOR j IN l.first .. l.last
LOOP
dbms_stats.gather_table_stats(ownname => USER, tabname => l(j),
cascade => TRUE,
method_opt => 'for all columns size 1');
END LOOP;
END;
/
I've created three tables and an index on the third one. The last table is created with pctfree 90 to make it physically quite big. I have also arranged the join columns in T1 and T2 in such a way that after filtering T1 there is little overlap between the join columns:
SQL> SELECT t1.v1,
  2  t2.v2
  3  FROM t1,
  4  t2
  5  WHERE t1.join1 = t2.join2
  6  AND t1.filter = 1;

300 rows selected.
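That's because after t1.filter = 1 the surviving JOIN1 values are 1, 101, 201 and so on, and only those not exceeding 30000 (the range of T2.JOIN2) find a match. A quick sanity check, assuming the tables above:
SELECT COUNT(*)
FROM t1
WHERE t1.filter = 1
AND t1.join1 <= 30000; -- T2.JOIN2 covers 1..30000, each value exactly once
which also returns 300.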
Now let's run the following query with autotrace on:
SELECT /* leading(t1 t2 t3) use_hash(t1 t2) use_nl(t3) index(t3)*/
t1.v1,
t2.v2,
t3.v3
FROM t1,
t2,
t3
WHERE t1.join1 = t2.join2
AND t1.filter = 1
AND t3.join3 = t2.join2;
The result would be as follows (I've removed irrelevant bits of output):
----------------------------------------------------------------------------
| Id  | Operation           | Name | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |      |  9874 |   491K|  5532   (1)| 00:01:18 |
|*  1 |  HASH JOIN          |      |  9874 |   491K|  5532   (1)| 00:01:18 |
|*  2 |   HASH JOIN         |      |  9874 |   337K|  2975   (2)| 00:00:42 |
|*  3 |    TABLE ACCESS FULL| T1   |  9874 |   183K|  2888   (2)| 00:00:41 |
|   4 |    TABLE ACCESS FULL| T2   | 30011 |   468K|    86   (2)| 00:00:02 |
|   5 |   TABLE ACCESS FULL | T3   |   100K|  1562K|  2555   (1)| 00:00:36 |
----------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      16617  consistent gets
      14640  physical reads
          0  redo size
      12373  bytes sent via SQL*Net to client
        559  bytes received via SQL*Net from client
         21  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        300  rows processed
As you can see, Oracle thinks joining T1 to T2 will produce almost ten thousand rows when in fact there are only 300. If we do the math we'll see that JoinCard = 30K * 10K / Greatest(30K, 1M) = 300, which is way smaller than the estimated 10K. As a result Oracle chooses a full scan of T3 and a hash join, which amounts to 16.6K consistent gets.
Now let's run the query with a real hint (turn the comment into a hint by putting a plus sign after /*):
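SELECT /*+ leading(t1 t2 t3) use_hash(t1 t2) use_nl(t3) index(t3) */
t1.v1,
t2.v2,
t3.v3
FROM t1,
t2,
t3
WHERE t1.join1 = t2.join2
AND t1.filter = 1
AND t3.join3 = t2.join2;
The output changes accordingly: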
--------------------------------------------------------------------------------------
| Id  | Operation                    | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |        |  9874 |   491K| 22750   (1)| 00:05:19 |
|   1 |  TABLE ACCESS BY INDEX ROWID | T3     |     1 |    16 |     2   (0)| 00:00:01 |
|   2 |   NESTED LOOPS               |        |  9874 |   491K| 22750   (1)| 00:05:19 |
|*  3 |    HASH JOIN                 |        |  9874 |   337K|  2975   (2)| 00:00:42 |
|*  4 |     TABLE ACCESS FULL        | T1     |  9874 |   183K|  2888   (2)| 00:00:41 |
|   5 |     TABLE ACCESS FULL        | T2     | 30011 |   468K|    86   (2)| 00:00:02 |
|*  6 |    INDEX RANGE SCAN          | T3_IDX |     1 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       9549  consistent gets
       8881  physical reads
          0  redo size
      12373  bytes sent via SQL*Net to client
        559  bytes received via SQL*Net from client
         21  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        300  rows processed
We've got 9.5K consistent gets, which in this particular example is nearly half as many.
Things can get even worse (and in my production system they did) if you have more tables and Oracle starts doing full scans on all of them. Therefore, watch out.
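A handy way to catch this kind of misestimate is to compare estimated and actual row counts with the gather_plan_statistics hint and dbms_xplan.display_cursor (available from 10g; run it with autotrace off so the query's own cursor is the last one). A minimal sketch:
SELECT /*+ gather_plan_statistics */
t1.v1,
t2.v2,
t3.v3
FROM t1,
t2,
t3
WHERE t1.join1 = t2.join2
AND t1.filter = 1
AND t3.join3 = t2.join2;

-- E-Rows (estimated) vs A-Rows (actual) makes the 9874 vs 300 gap obvious
SELECT *
FROM TABLE(dbms_xplan.display_cursor(NULL, NULL, 'ALLSTATS LAST'));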