Friday, September 14, 2012

ANSI SQL Bug

I’ve never been a fan of using ANSI SQL syntax in Oracle for different reasons. Here is one more to the list.
Have a look at the following query:
 SELECT dummy 
  FROM dual t1, 
       dual t2, 
       dual t3 
 WHERE t2.dummy = t1.dummy 
   AND t3.dummy = t1.dummy; 
If you run it you will get ORA-00918: column ambiguously defined, cause indeed DUMMY from which table Oracle is supposed to return? Let's rewrite the query using the ANSI SQL:
 SELECT dummy 
  FROM dual t1 
  JOIN dual t2 
    ON (t1.dummy = t2.dummy) 
  JOIN dual t3 
    ON (t1.dummy = t3.dummy);
If you run it on Oracle 10.2.0.4 (or older), you will see the following output:
 SQL> SELECT dummy 
  2    FROM dual t1 
  3    JOIN dual t2 
  4      ON (t1.dummy = t2.dummy) 
  5    JOIN dual t3 
  6      ON (t1.dummy = t3.dummy);

D 
- 
X

SQL>
Oracle does execute it without any errors. So which DUMMY has it actually taken then? Let's have a look at the Unparsed Query section of 10053 trace to find the answer (after some formatting):

SQL:******* UNPARSED QUERY IS ******* 
SELECT "T1"."DUMMY" "QCSJ_C000000000300000",
       "T2"."DUMMY" "QCSJ_C000000000300001" 
  FROM "SYS"."DUAL" "T1",
       "SYS"."DUAL" "T2" 
 WHERE "T1"."DUMMY"="T2"."DUMMY"

Query block (0x2aaaac888538) unchanged 
Query block (0x2aaaac887918) before join elimination: 
SQL:******* UNPARSED QUERY IS ******* 
SELECT "from$_subquery$_003"."QCSJ_C000000000300000_0" "QCSJ_C000000000300000",
       "from$_subquery$_003"."QCSJ_C000000000300001_1" "QCSJ_C000000000300001",
       "T3"."DUMMY" "DUMMY" 
  FROM  (SELECT "T1"."DUMMY" "QCSJ_C000000000300000_0",
                "T2"."DUMMY" "QCSJ_C000000000300001_1"
           FROM "SYS"."DUAL" "T1",
                "SYS"."DUAL" "T2" 
          WHERE "T1"."DUMMY"="T2"."DUMMY") "from$_subquery$_003",
        "SYS"."DUAL" "T3" 
 WHERE "from$_subquery$_003"."QCSJ_C000000000300000_0"="T3"."DUMMY"
Apparently Oracle joins first two tables and combins them into an inline view (first query block above). At this point it must take all the rows from both tables cause it doesn’t know which of them would be used later on for further join(s) or in the result set of columns. Ergo, in case some columns might be (and in my case are) named identically, Oracle gives them unique aliases.
Then it joins the third table (btw that’s why with two tabs the error doesn’t happen) (see the second query block in the snippet). Because this is the last table to be joined and all the previous tables have got unique aliases, Oracle doesn’t give unique aliases to t3 table’s columns. Finally it restricts the result set columns list as requested by user. At this point we've got only one dummy column, from table t3.
This behavior looks like a really nasty bug as it conceals errors in development. I bumped into it (like many others) when migrated to Oracle 11g where the bug is fixed and the ansi query like this starts failing. A quick search on metalink revealed a note ID 835701.1 saying that it's been fixed in 10.2.0.5 and afterwards.