Thursday, September 30, 2010


Recently I ran into a problem where the Streams checkpoint wasn't updating the required_checkpoint_scn and applied_scn columns in dba_capture. This caused a lot of trouble, as RMAN couldn't remove the processed logs from the system (it uses required_checkpoint_scn to determine whether Streams still needs those logs). As a result our DBA raised an SR
with Oracle, which resulted in the advice to reset the aq_tm_processes parameter.

It turns out that starting from 10g, aq_tm_processes sets the number of queue monitor slave processes dedicated to monitoring the PERSISTENT queues in the system rather than the buffered ones. If you set this parameter to 10, then ALL of the slave monitors will watch the persistent queues and NONE of them will monitor the buffered queues. That was the case
in our system.

Oracle advises removing the parameter from init.ora/spfile entirely and bouncing the instance; that lets Oracle manage automatically how many processes watch the persistent and buffered queues. If bouncing the instance isn't an option in your environment, Oracle advises setting it to 1, which can be done dynamically. Note that manually setting the parameter to 0 disables queue monitoring completely and is strongly discouraged (I haven't checked what that could lead to, though).

Friday, August 27, 2010

Data Change Tracking via the Streams

Recently I got an assignment to implement a facility which would capture changes in a table as soon as they appear and propagate them onward through EMS to the subscribers. Here I'd like to describe the way we implemented such a mechanism.

We considered a bunch of candidate solutions such as Change Notification, Change Data Capture, a trigger-based mechanism, etc., but settled on Streams as it offered the best trade-off between speed, maintainability, and resource consumption.

In a nutshell

Capture catches the changes that satisfy the basic table filter rules and pushes them down to Apply, which uses a manually written DML handler. The handler takes an LCR, extracts the necessary information (such as the PK ID), piles it up into a user-defined queue, and simply ignores the LCR without any further processing. For more complicated structures (which was my case), I put the caught changes into an interim cache collection and publish them into the queue using a pre-commit handler (also manually written).

The paragraph above might sound like a mouthful for someone who isn't really into Streams, so see the example below if you're interested.


First of all, the schema under which the Streams code is going to run must be granted the Streams administrator privileges:
  BEGIN
    dbms_streams_auth.grant_admin_privilege(grantee => 'SCHEMA_NAME');
  END;
  /
Secondly, the user must be granted the DBA role explicitly. Once that's done, we can start setting up the test environment.

Now let's create the test table whose changes we're going to catch (the f1 column is the one referenced below):
  CREATE TABLE t1 (
    id      NUMBER(15) CONSTRAINT pk_t1 PRIMARY KEY,
    f1      NUMBER(15) NOT NULL,
    padding VARCHAR2(1000) DEFAULT rpad('x', 500)
  );
I'm going to show how to catch all the unique values of the f1 column in the table above. I will publish all the f1 values into the queue below:
  BEGIN
    dbms_aqadm.create_queue_table(queue_table        => 'ct_t1_tbl',
                                  queue_payload_type => 'SYS.AQ$_JMS_TEXT_MESSAGE',
                                  multiple_consumers => FALSE);
    dbms_aqadm.create_queue(queue_name  => 'ct_t1',
                            queue_table => 'ct_t1_tbl');
    dbms_aqadm.start_queue(queue_name => 'ct_t1');
  END;
  /

The messages from that queue can be picked up from within either Oracle or Java code.

The Queue

The Streams uses a (buffered) queue to pass the changes between capture and apply. Let's create it:
  BEGIN
    dbms_streams_adm.set_up_queue(queue_table => 'STRM_QT',
                                  queue_name  => 'STRM_Q',
                                  queue_user  => 'STREAMS_ADMIN', -- put your user here
                                  comment     => 'The Streams queue');
  END;
  /

The Capture

Before we create the capture, we need to instantiate the table:
  BEGIN
    dbms_capture_adm.prepare_table_instantiation(table_name => 'T1');
  END;
  /
The step above is necessary: Oracle scans the table and jots down the SCN starting from which it can capture the changes, and it also automatically adds a supplemental log group for the PK. To check whether the table is capture-ready, examine the dba_capture_prepared_tables view.

As we're going to capture the f1 column, which isn't part of the PK, we have to manually add a supplemental log group for it so the value appears in the redo log every time DML occurs against the table:

  ALTER TABLE t1 ADD SUPPLEMENTAL LOG GROUP lg_t1_f1 (f1) ALWAYS;

To check whether the column is being logged, examine the user_log_groups and user_log_group_columns views.

Now we can create the capture, having first built the LogMiner dictionary:
  DECLARE
    l_first_scn NUMBER;
  BEGIN
    dbms_capture_adm.build(first_scn => l_first_scn);
    dbms_capture_adm.create_capture(queue_name                => 'STRM_Q',
                                    capture_name              => 'CPT01',
                                    start_scn                 => l_first_scn,
                                    first_scn                 => l_first_scn,
                                    checkpoint_retention_time => 1);
    dbms_streams_adm.add_table_rules(table_name   => 'T1',
                                     streams_type => 'CAPTURE',
                                     streams_name => 'CPT01',
                                     queue_name   => 'STRM_Q',
                                     include_dml  => TRUE,
                                     include_ddl  => FALSE);
  END;
  /

The DML and Pre-Commit Handlers

All the peculiar logic is contained in the following package:
CREATE OR REPLACE PACKAGE strms IS
  PROCEDURE dml_handler(p_lcr IN sys.anydata);
  PROCEDURE precommit_handler(commit_number IN NUMBER);
END strms;
/


CREATE OR REPLACE PACKAGE BODY strms IS

  -- the cache structure for LCRs: maps an F1 value to the operation performed
  TYPE t_cache IS TABLE OF VARCHAR2(1) INDEX BY PLS_INTEGER;
  l_cache t_cache;


  PROCEDURE publish2aq(
    p_entity_id IN NUMBER,
    p_msg       IN VARCHAR2
  ) IS
    l_msg           sys.aq$_jms_text_message;
    l_queue_options dbms_aq.enqueue_options_t;
    l_msg_props     dbms_aq.message_properties_t;
    l_queue_name    VARCHAR2(64) := 'CT_T1'; -- the user-defined queue created above
    msg_id          RAW(16);
  BEGIN
    -- Ensure what is sent will be a JMS message
    l_msg := sys.aq$_jms_text_message.construct();
    l_msg.set_string_property('entity', to_char(p_entity_id));
    l_msg.set_text(p_msg);
    dbms_aq.enqueue(queue_name         => l_queue_name,
                    enqueue_options    => l_queue_options,
                    message_properties => l_msg_props,
                    payload            => l_msg,
                    msgid              => msg_id);
  END publish2aq;


  PROCEDURE dml_handler(p_lcr IN sys.anydata) IS
    lcr         sys.lcr$_row_record;
    rc          PLS_INTEGER;
    res         PLS_INTEGER;
    l_owner     VARCHAR2(30);
    l_object    VARCHAR2(30);
    l_id        NUMBER(15);
    l_operation VARCHAR2(1);
    l_value     sys.anydata;
  BEGIN
    rc := p_lcr.getobject(lcr);
    l_operation := substr(lcr.get_command_type(), 1, 1); -- I, U, or D
    l_owner     := lcr.get_object_owner();
    l_object    := lcr.get_object_name();
    -- if the handler has fired against another object, just return
    IF l_owner != 'STREAMS_ADMIN' OR l_object != 'T1' THEN
      RETURN;
    END IF;
    -- for deletes take the old value of F1, otherwise the new one
    l_value := lcr.get_value(value_type  => CASE l_operation WHEN 'D' THEN 'OLD' ELSE 'NEW' END,
                             column_name => 'F1');
    res := l_value.getnumber(l_id);
    -- Putting it into the cache
    IF NOT l_cache.exists(l_id) THEN
      -- In my case I store the operation performed on the row. When several operations
      -- hit the same row within one transaction, I calculate the resulting operation.
      -- This example isn't supposed to be that complicated, so such code isn't included.
      l_cache(l_id) := l_operation;
    END IF;
  END dml_handler;


  PROCEDURE precommit_handler(commit_number IN NUMBER) IS
    idx PLS_INTEGER;
  BEGIN
    -- If there's nothing to publish, just return
    IF l_cache.count = 0 THEN
      RETURN;
    END IF;
    idx := l_cache.first;
    WHILE idx IS NOT NULL LOOP
      -- may be replaced with any other publishing mechanism
      publish2aq(idx, l_cache(idx));
      idx := l_cache.next(idx);
    END LOOP;
    -- empty the cache before the next transaction
    l_cache.delete;
  END precommit_handler;

END strms;
/

The Apply

Now let's create the apply process and tie it to the handlers:
  BEGIN
    dbms_apply_adm.create_apply(queue_name     => 'STRM_Q',
                                apply_name     => 'APL01',
                                apply_user     => 'STREAMS_ADMIN',
                                apply_captured => TRUE);
    dbms_apply_adm.alter_apply(apply_name        => 'APL01',
                               precommit_handler => 'STREAMS_ADMIN.strms.precommit_handler');
    dbms_apply_adm.set_parameter(apply_name => 'APL01',
                                 parameter  => 'disable_on_error',
                                 value      => 'N');
    FOR i IN 1 .. 4 LOOP
      dbms_apply_adm.set_dml_handler(object_name    => 'T1',
                                     object_type    => 'TABLE',
                                     operation_name => CASE i WHEN 1 THEN 'INSERT'
                                                              WHEN 2 THEN 'UPDATE'
                                                              WHEN 3 THEN 'DELETE'
                                                              ELSE 'LOB_UPDATE' END,
                                     user_procedure => 'STREAMS_ADMIN.strms.dml_handler',
                                     assemble_lobs  => TRUE);
    END LOOP;
  END;
  /
All that's left is to start the Streams:
  BEGIN
    dbms_capture_adm.start_capture(capture_name => 'CPT01');
    dbms_apply_adm.start_apply(apply_name => 'APL01');
  END;
  /

Monday, February 22, 2010

Oracle Range Join Cardinality Latest Changes

In this post I'd like to describe the latest changes in the cardinality calculation for table joins. All the changes were checked on Oracle 11g; the Oracle 10g behaviour I cite throughout the post is taken from the book "Cost-Based Oracle Fundamentals" by Jonathan Lewis.

t2.join1 > t1.join1
Let’s take a glance at the next query:
SELECT t1.v1, t2.v1
  FROM t1, t2
 WHERE t1.filter = 1 -- 25 unique values
   AND t2.join1 > t1.join1 -- 40 and 30 unique values correspondingly
   AND t2.filter = 1 -- 50 unique values
This is an example from J. Lewis' book. In the comments you can see the numbers of unique values in the corresponding columns. Both tables have 10 000 rows, so after filtering t2 has 200 rows and t1 has 400 rows. The t2.join1 column has 40 unique values from 1 to 40, and the t1.join1 column correspondingly has 30 unique values from 1 to 30. According to the book, Oracle 10g uses a fixed 5% selectivity to calculate the cardinality of this query. Here's the execution plan:
| Id  | Operation           | Name | Rows  | Bytes | Cost  | 
|   0 | SELECT STATEMENT    |      |  4000 |   132K|   155 | 
|   1 |  MERGE JOIN         |      |  4000 |   132K|   155 | 
|   2 |   SORT JOIN         |      |   200 |  3400 |    77 | 
|*  3 |    TABLE ACCESS FULL| T2   |   200 |  3400 |    30 | 
|*  4 |   SORT JOIN         |      |   400 |  6800 |    78 | 
|*  5 |    TABLE ACCESS FULL| T1   |   400 |  6800 |    30 | 
Predicate Information (identified by operation id): 
   3 - filter("T2"."FILTER"=1) 
   5 - filter("T1"."FILTER"=1)
As you can see, the cardinality is 4000, which is equal to 200 * 400 * 0.05. Now let’s take a look at the execution plan for the same query executed in Oracle 11g:
| Id  | Operation           | Name | Rows  | Bytes | Cost  | 
|   0 | SELECT STATEMENT    |      | 49000 |  1626K|    82 | 
|   1 |  MERGE JOIN         |      | 49000 |  1626K|    82 | 
|   2 |   SORT JOIN         |      |   200 |  3400 |    38 | 
|*  3 |    TABLE ACCESS FULL| T2   |   200 |  3400 |    28 | 
|*  4 |   SORT JOIN         |      |   400 |  6800 |    39 | 
|*  5 |    TABLE ACCESS FULL| T1   |   400 |  6800 |    28 | 
Predicate Information (identified by operation id): 
   3 - filter("T2"."FILTER"=1) 
   5 - filter("T1"."FILTER"=1)
As you can see, Oracle 11g thinks 49 000 rows will be returned, which is far more than 10g's estimate.
Let me explain how I think Oracle calculates this number, and the cardinality for such queries in general.

Boring Math
Let's take a look at the picture. There are two bold vertical lines which represent the unique values in the columns t2.join1 (left) and t1.join1 (right). Notice that the high_value of t2.join1 is greater than the high_value of t1.join1, that the low_value of t2.join1 is greater than the low_value of t1.join1, and finally that the low_value of t2.join1 is lower than the high_value of t1.join1. The horizontal dashed lines drawn through the high_values and low_values mark out three sectors: A, B, and C. This is the most general arrangement of low_values and high_values I could come up with, as all three sectors are present. If you slide the two lines relative to each other, you can end up with only two adjacent sectors, or even just one. As you can guess, the cardinality of every sector is calculated separately.

Sector A
Logically, every t2.join1 value falling into sector A matches every value of t1.join1:
Cardinality of A = (high_value(t2) – high_value(t1)) * num_rows(t1) * num_rows(t2) / num_distinct(t2)
Of course it’s relevant only if the condition high_value(t2) – high_value(t1) > 0 is met.

Sector B
For simplicity's sake, I've split the calculation of the cardinality below sector A into two different sectors, even though it's not quite right. Essentially, what remains is to count the cardinality for the single segment high_value(t1) – low_value(t2). On the other hand, intersecting that segment with t1's line of values gives two sectors: one with common values (sector B) and one without (sector C). Apparently, for each t2.join1 value we should count some number of t1.join1 values from sector B plus all the t1.join1 values from sector C.
Let's count B's cardinality as if there were no sector C at all. In that case, for every value of t2.join1 we should pick out all the values of t1.join1 that are less than the current t2.join1 value. For instance, if high_value(t1) = 5 and low_value(t2) = 1, then for t2.join1 = 5 the matching t1.join1 values are 4, 3, 2, and 1 (four of them). For t2.join1 = 4 there is one fewer t1.join1 value (three of them), and so forth. Ultimately, we have a descending count of t1 values for t2's values, one fewer on each step. The total number of t1.join1 values can be calculated by the formula for the sum of an arithmetic progression:
Sn = ((An + A1) / 2) * n
Where An is the nth element of the progression and A1 is the first one. In our case A1 always equals 1, while An = high_value(t1) – low_value(t2). Note that here we are interested in the NUMBER of values between the high and low values rather than in the exact values: we need to know how many times every value will be counted during the final cardinality estimation. Correspondingly, n can be calculated as:
n = high_value(t1) – low_value(t2)
Knowing all the stuff above we are able to calculate the cardinality for the sector B:
Cardinality on B = (num_rows(t2) / num_distinct(t2)) * (num_rows(t1) / num_distinct(t1)) * (high_value(t1) – low_value(t2)) * (high_value(t1) – low_value(t2) + 1) / 2
that is, the number of rows per value of t2.join1 in that sector, multiplied by the number of rows per value of t1.join1, multiplied by the number of such "entries", which is the sum of the progression discussed above.
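The progression sum is easy to sanity-check by brute force. A short sketch using the numbers from the example query (high_value(t1) = 30, low_value(t2) = 1; the variable names are mine):

```python
hi_t1, lo_t2 = 30, 1  # column stats from the example query

# Count the matching (t2, t1) value pairs in sector B by brute force:
# for every t2.join1 value in [lo_t2, hi_t1], count the t1.join1 values
# strictly below it.
pairs = sum(1 for v2 in range(lo_t2, hi_t1 + 1)
              for v1 in range(lo_t2, v2))

# The closed-form arithmetic-progression sum from the formula above.
n = hi_t1 - lo_t2
print(pairs, n * (n + 1) // 2)  # both evaluate to 435
```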

Sector C
If that sector exists, for every t2.join1 value from sector B you should also include all the values of t1.join1 which are below the low_value of t2:
Cardinality of C = (low_value(t2) – low_value(t1)) * (high_value(t1) – low_value(t2) + 1) * num_rows(t2) / num_distinct(t2) * num_rows(t1) / num_distinct(t1)

The Final Cardinality
Finally, let’s assemble the cardinality formula:
Cardinality of Join = Cardinality of A + Cardinality of B + Cardinality of C
To save space, let’s derive the selectivity formula dividing the sum by num_rows(t2) * num_rows(t1):
Selectivity of Join = (high_value(t2) – high_value(t1)) / num_distinct(t2) + (1 / num_distinct(t2)) * (1 / num_distinct(t1)) * (high_value(t1) – low_value(t2)) * (high_value(t1) – low_value(t2) + 1) / 2 + (low_value(t2) – low_value(t1)) * (high_value(t1) – low_value(t2) + 1) * (1 / num_distinct(t2)) * (1 / num_distinct(t1))
After beautifying and generalizing we can get the final formula:
Cardinality of join = Filtered cardinality (t1) * Filtered Cardinality (t2) * Selectivity of Join
Selectivity of Join = (A + (B + C) / num_distinct(t1)) / num_distinct(t2)
A = high_value(t2) – high_value(t1), where high_value(t2) > high_value(t1). Otherwise A = 0
B = (least(high_value(t2), high_value(t1)) – greatest(low_value(t1), low_value(t2))) * (least(high_value(t2), high_value(t1)) – greatest(low_value(t1), low_value(t2)) + 1) / 2
C = (greatest(low_value(t1), low_value(t2)) – least(low_value(t1), low_value(t2))) * (least(high_value(t2), high_value(t1)) – greatest(low_value(t1), low_value(t2)) + 1), where low_value(t2) > low_value(t1). Otherwise C = 0
t2 is the table on the left side of the ">" sign, and t1 is the table on the right side correspondingly

The Example
Let’s get back to the example in the beginning of the article and perform manual calculation of cardinality:
A = 40 – 30 = 10
B = (30 – 1) * (30 – 1 + 1) / 2 = 29 * 30 / 2 = 435
C = 0
Selectivity of Join = (10 + 435 / 30) / 40 = 0.6125
Cardinality of Join = 200 * 400 * 0.6125 = 49 000
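The whole estimate can be reproduced in a few lines. A sketch (the function and its parameter names are mine, not anything Oracle exposes):

```python
def range_join_card(card1, card2, nd1, nd2, lo1, hi1, lo2, hi2, inclusive=False):
    """Estimate the cardinality of t2.join1 > t1.join1 from the filtered
    cardinalities, the numbers of distinct values, and the value ranges."""
    a = max(hi2 - hi1, 0)                    # sector A; 0 when hi2 <= hi1
    if inclusive:
        a += 1                               # the ">=" variant adds 1 to A
    span = min(hi2, hi1) - max(lo1, lo2)     # the common part of the two ranges
    b = span * (span + 1) / 2                # the arithmetic-progression sum
    c = (max(lo1, lo2) - min(lo1, lo2)) * (span + 1) if lo2 > lo1 else 0
    # Selectivity of Join = (A + (B + C) / num_distinct(t1)) / num_distinct(t2)
    return card1 * card2 * (a + (b + c) / nd1) / nd2

# the example query: 400 and 200 filtered rows, 30 and 40 distinct values,
# t1.join1 in 1..30, t2.join1 in 1..40
print(range_join_card(400, 200, 30, 40, 1, 30, 1, 40))  # prints 49000.0
```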

t2.join1 >= t1.join1
I can't finish this article without bringing up the ">=" range join. Essentially, the addition of the equality sign adds num_rows(t2) / num_distinct(t2) * num_rows(t1) / num_distinct(t1) rows to the final cardinality. An easy way to remember this is to add 1 to A when the equality sign is present. For instance, let's apply it to the test query:
A = 40 – 30 + 1 = 11
B = … = 435
C = 0
Selectivity of Join = (11 + 435 / 30) / 40 = 0.6375
Cardinality of Join = 200 * 400 * 0.6375 = 51 000
Here’s the execution plan:
| Id  | Operation            | Name | Rows  | Bytes | Cost  | 
|   0 | SELECT STATEMENT     |      | 51000 |  1693K|    83 | 
|   1 |  WINDOW BUFFER       |      | 51000 |  1693K|    83 | 

|   2 |   MERGE JOIN         |      | 51000 |  1693K|    83 | 
|   3 |    SORT JOIN         |      |   200 |  3400 |    38 | 
|*  4 |     TABLE ACCESS FULL| T2   |   200 |  3400 |    28 | 
|*  5 |    SORT JOIN         |      |   400 |  6800 |    39 | 
|*  6 |     TABLE ACCESS FULL| T1   |   400 |  6800 |    28 | 
Predicate Information (identified by operation id): 
   4 - filter("T2"."FILTER"=1) 
   6 - filter("T1"."FILTER"=1)

That is all I wanted to cover in this article. In the next one I am going to cover the between operation.

Wednesday, January 27, 2010

How CBO calculates cardinality (Part 2)

This is the continuation of my previous post which you can read here.

column > (<) const
According to the book the cardinality for such a condition can be calculated with the formula below:
Cardinality = num_rows * (needed_range_length / total_range_length + 1 / num_dist*)
* the 1 / num_dist term is added only when the range is closed (>= or <=)

For example, let's assume we have a condition column > 3 and high_value = 12 in that column. In that case needed_range_length will be high_value – const = 12 – 3 = 9. The total_range_length is calculated as high_value – low_value from user_tab_col_statistics. When we have the ">=" sign, we also have to add 1 / num_distinct.
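As a sketch with made-up statistics (num_rows, num_dist, and low_value below are my hypothetical numbers; only high_value = 12 and const = 3 come from the text):

```python
# hypothetical column statistics; only high_value and const match the text
num_rows, num_dist = 1000, 12
low_value, high_value = 0, 12
const = 3

needed_range = high_value - const     # 12 - 3 = 9
total_range = high_value - low_value  # 12 - 0 = 12

card_open = num_rows * needed_range / total_range                     # column >  3
card_closed = num_rows * (needed_range / total_range + 1 / num_dist)  # column >= 3
print(card_open, round(card_closed))  # prints 750.0 833
```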

The formula above is valid for a constant value between the lowest and highest values. In case we use an out-of-range value for the constant the cardinality can be calculated in two ways depending on a condition which is easier to explain on an example:
  1. const < low_value and column < const or const > high_value and column > const
  2. const < low_value and column > const or const > high_value and column < const
In the first case everything is quite obvious: the cardinality will be equal to num_rows. The second case, on the other hand, is a bit more interesting. Apparently the cardinality of such a query should (from a human point of view) equal 0, but the CBO uses the same "wearing off" formula as it does for the condition column = const where const is out of range. In other words, for the second example the CBO calculates the cardinality as if it were the equality condition (column = const), which I described in the previous post.

To finish the description of inequalities, the only case left to cover is bind variables. Here's the formula for that purpose:
Cardinality = num_rows / num_dist + 1
So it’s the cardinality of condition column = :1 but increased by 1.

column between Am and An
The between condition is an extended inequality condition, but instead of high_value (or low_value, depending on the sign) the CBO uses Am and An to calculate needed_range_length. Here's the basic formula for such a condition:
Cardinality = num_rows * ((An – Am) / (high_value – low_value) + 1 / num_dist)
It's valid when An > Am and both An and Am are within the range between the low and high values from user_tab_col_statistics. But what if either Am or An is out of the range? The answer is simple (at least for the Oracle versions where I tested it). If the range Am–An has common elements with the range low_value – high_value, then the CBO calculates the cardinality using the same formula, having substituted the out-of-range edge (either An or Am) with the corresponding edge value (high_value if An is out of bounds, or low_value if Am is). If the two ranges have nothing in common, Oracle produces the cardinality as if the condition were column = :1 (equal to num_rows / num_dist).
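A sketch of the between estimate with the same hypothetical statistics (the function name and all the numbers are mine):

```python
num_rows, num_dist = 1000, 12   # hypothetical statistics
low_value, high_value = 0, 12

def between_card(am, an):
    """Estimate the cardinality of column between am and an."""
    # edges outside the column range are clamped to low_value/high_value
    am_c, an_c = max(am, low_value), min(an, high_value)
    if an_c < am_c:
        # the ranges have nothing in common: fall back to the equality estimate
        return num_rows / num_dist
    return num_rows * ((an_c - am_c) / (high_value - low_value) + 1 / num_dist)

print(round(between_card(3, 6)))    # both edges within the range
print(round(between_card(3, 20)))   # the upper edge is clamped to high_value
print(round(between_card(20, 30)))  # disjoint ranges: num_rows / num_dist
```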

Now let's take a look at bind variables. What would be the cardinality for the condition column between :1 and :2? According to Lewis, Oracle prior to version 10.2 takes 0.0025 of num_rows and uses the rounded result as the cardinality. Oracle 10g does something similar, but not exactly:
Cardinality = num_rows / (num_dist * num_dist) + 1
Finally, here's the formula for Oracle 11g:
Cardinality = num_rows / num_dist + 1
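The difference between the two releases is easy to put side by side (again with hypothetical statistics; the optimizer would round the final result):

```python
num_rows, num_dist = 10000, 40  # hypothetical statistics

# column between :1 and :2 with bind variables:
card_10g = num_rows / (num_dist * num_dist) + 1  # 10g-style estimate
card_11g = num_rows / num_dist + 1               # 11g-style estimate
print(card_10g, card_11g)  # prints 7.25 251.0
```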

column > const AND column <= const
With constants, both versions (10g and 11g) calculate the cardinality in the same way (at least with the same result): both realize that no rows can be returned and therefore add the filter NULL IS NOT NULL to the plan.

But once you substitute the constant with a bind variable, everything turns upside down. Neither version realizes that no rows at all should be returned. Moreover, 11g calculates the cardinality in its usual way as num_rows / num_dist + 1, whilst 10g does its usual thing and calculates the cardinality as num_rows / (num_dist * num_dist) + 1.