Sunday, November 29, 2009

The Shy Objects

Hi

This post will be a bit different from the previous ones. No big experiments or useful tips, just an issue you will probably never encounter. I stumbled across it by accident while I was writing a script (that will be published in the near future).

I have some experience with SQL Server, where you can find many strange things you would not expect, stuff that seems like the QA team never thought about it.
I never thought I would encounter such an issue with DB2, but then again, maybe I'm overreacting. This is not a bug, just a scenario I would not expect to "work by design".

So what am I talking about? Have you ever tried to create a table with no creator? The obvious question is "why would I?", but this is for the sake of science.

I came across records in my SYSIBM.SYSTABLES table that had a blank creator; that is, I got the records by running:

Select creator,name
from sysibm.systables
where creator = ''

I was astonished to find two such tables, so I tried to create such a table myself. This is the output from my SPUFI session:

---------+---------+---------+---------+---------+---------+----
SET CURRENT SQLID='';
---------+---------+---------+---------+---------+---------+----
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 0
---------+---------+---------+---------+---------+---------+----
SELECT CURRENT SQLID FROM SYSIBM.SYSDUMMY1;
---------+---------+---------+---------+---------+---------+----

---------+---------+---------+---------+---------+---------+----

DSNE610I NUMBER OF ROWS DISPLAYED IS 1
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 100
---------+---------+---------+---------+---------+---------+----
CREATE TABLE MOMI (COL1 INT);
---------+---------+---------+---------+---------+---------+----
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 0
---------+---------+---------+---------+---------+---------+----
SELECT * FROM DB2T.SYSIBM.SYSTABLES
WHERE CREATOR=''
WITH UR;
---------+---------+---------+---------+---------+---------+---------+---------+
NAME      CREATOR  TYPE  DBNAME    TSNAME      DBID    OBID  COLCOUNT
---------+---------+---------+---------+---------+---------+---------+---------+
MOMI               T     DSNDB04   MOMI           4    1011         1
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 100
---------+---------+---------+---------+---------+---------+---------+---------+

But now for the big question: how would I query this table without setting my current SQLID to ''?

Well, this did the trick
SELECT * FROM " ".MOMI;

Regarding what happens when you perform
CREATE TABLE " " (COL1 INT)

I'll let you check it yourself...
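The delimited-identifier trick is not unique to DB2, by the way. Here is a minimal sketch using Python's sqlite3, which, like DB2, accepts quoted identifiers made entirely of blanks (this only illustrates the quoting behavior; the blank-SQLID scenario itself is DB2-specific):

```python
import sqlite3

# Like DB2, sqlite accepts a delimited identifier that is just a blank,
# so a table literally named " " is perfectly legal.
con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE " " (col1 INT)')
con.execute('INSERT INTO " " VALUES (1)')

# Querying it requires the same quoting trick as SELECT * FROM " ".MOMI
rows = con.execute('SELECT col1 FROM " "').fetchall()
print(rows)  # [(1,)]
```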

That’s it for now…

Monday, November 23, 2009

UNION ALL Performance and Behavior

This time I am going to talk about UNION ALL.
I am sure numerous posts have already been dedicated to the UNION vs. UNION ALL issue, but just to make sure everyone knows the difference: UNION ALL returns all rows, while UNION returns only distinct rows (which requires DB2 to sort the result set and remove duplicate rows before it is returned).

What I wanted to do was check how DB2 processes the UNION ALL clause, so I created two simple tables, each with a single column named ID and 10 rows with values between 1 and 10.

I started my tests with this simple query:

Select 'table 1'
From table1
Union ALL
Select 'table 2'
From table2

As expected, the result was composed of 20 rows, 10 for each table, with the 'table 1' rows appearing first.

Can you guess what the following query returns?

Select 'table 1'
From table1
Union ALL
Select 'table 2'
From table2
Fetch first 10 rows only

The result set contains only the 10 records from the first table. To make things a bit more interesting, I stopped the tablespace that contains the second table (using the -STOP DATABASE command). Running the above query again returned the same results; that is, DB2 did not try to access the second table at all.
Obviously, running the same query with UNION instead of UNION ALL returned SQLCODE -904 with reason code 00C90081, which means the tablespace is in stopped mode.
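The stopped-tablespace shortcut is DB2-specific, but the row-limiting side of the experiment is easy to sketch elsewhere. Here is a small illustration with Python's sqlite3, using sqlite's LIMIT in place of FETCH FIRST (table names match the ones used above):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1 (id INT)")
con.execute("CREATE TABLE table2 (id INT)")
con.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in range(1, 11)])
con.executemany("INSERT INTO table2 VALUES (?)", [(i,) for i in range(1, 11)])

# UNION ALL with a row limit: the first subquery alone satisfies the limit,
# so nothing from the second subquery shows up in the result set.
rows = con.execute("""
    SELECT 'table 1' FROM table1
    UNION ALL
    SELECT 'table 2' FROM table2
    LIMIT 10
""").fetchall()
print(len(rows))  # 10 rows, all from the first subquery
```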

And now for the really nice part; I left the second tablespace in stop mode and executed the following query:

Select 'table 2'
From table2
Where :hv=0
Union ALL
Select 'table 1'
From table1

When :hv contains 0, SQLCODE -904 is received, but when :hv contains 1, we get the rows from table1 in the result set. This means DB2 is smart enough to evaluate :hv=0 before it actually accesses the table (I tried this for both integer and character data types).

Now for the unfortunate results, the query

Select 'table 2'
From table2
Where :hv>0
Union ALL
Select 'table 1'
From table1

Always tries to access both tables (and thus returns SQLCODE -904 when the first tablespace is stopped). I don't know whether this behavior is intentional, since this query

Select 'table 2'
From table2
Where 0>0
Union ALL
Select 'table 1'
From table1

Also accesses both tables in order to evaluate 0>0, even though DB2 could tell at parse time that this predicate is always false.

Another interesting issue I came across, which is partly related: when an index is defined on col1, col2 of a two-column table, this query

Select *
From table
Where col1 in (1)
Order by col2

Will use the index and will not need a sort for the order by clause, but this query

Select *
From table
Where col1 in (1,1)
Order by col2

Will get a single-column matching index scan, but will not be able to avoid the sort for the ORDER BY. Asking users to rewrite the IN-list as an equal predicate sounds like a silly request, but since many tools generate SQL on the fly, maybe it would be better to incorporate such logic into the SQL parser itself.

So what have we learned today?
1. By using FETCH FIRST X ROWS we can make sure DB2 executes the second subquery of a UNION ALL only if it has to.
2. To selectively bypass some of the subqueries in a UNION ALL query, we can add predicates of the form :hv = 1 and set the variable's value accordingly. This is analogous to accessing only the relevant partitions of a partitioned table by adding predicates on the partitioning key.
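Point 2 can be sketched as follows (a hypothetical two-table example in Python's sqlite3; note that sqlite merely filters the rows out, while the point of the trick in DB2 is that the flag predicate lets DB2 avoid touching the table whose flag is false):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t_old (id INT)")   # hypothetical table names
con.execute("CREATE TABLE t_new (id INT)")
con.execute("INSERT INTO t_old VALUES (1)")
con.execute("INSERT INTO t_new VALUES (2)")

def fetch(want_old, want_new):
    # Each branch carries a host-variable-style flag predicate; a branch
    # whose flag is 0 contributes no rows to the UNION ALL result.
    return con.execute("""
        SELECT id FROM t_old WHERE ? = 1
        UNION ALL
        SELECT id FROM t_new WHERE ? = 1
    """, (want_old, want_new)).fetchall()

print(fetch(1, 0))  # [(1,)] - only the first branch
print(fetch(1, 1))  # [(1,), (2,)] - both branches
```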

Till next time
Momi

Sunday, November 15, 2009

REORG SHRLEVEL CHANGE – aiming at 24x7 – Part II

In the previous post I started talking about the REORG utility with the SHRLEVEL CHANGE option. I told you about two problems that one would probably encounter when running online REORG with the default parameter values:
1. The default value of MAXRO is 300 seconds, while the default timeout interval (IRLMRWT) is 60 seconds, which means that once DB2 gets to the last log iteration, some concurrent transactions will most likely time out.
2. If we have a long-running reader and a concurrent REORG then, on top of problem number 1, the REORG might time out waiting to DRAIN the tablespace (since the long reader holds a claim), either leaving the tablespace in RO state, preventing other transactions from updating the table, or terminating completely, causing all the work it did to be rolled back (note also that while the REORG waits for the DRAIN, other transactions wait as well).

So what can we do? The REORG utility provides several parameters that let us deal with the problem:
MAXRO       - maximum time for the last log apply iteration
DRAIN       - specifies drain behavior when the last iteration is about to be applied
DRAIN_WAIT  - number of seconds the utility will wait when draining
RETRY       - number of retries if the drain failed
RETRY_DELAY - number of seconds to wait between retries

By setting MAXRO to a low value, we can make sure no user transactions (which try to update the table) are timed out during the last log apply iteration. By setting DRAIN_WAIT we make sure no user transactions (which try to read the table) are timed out during the SWITCH phase (update activity is not allowed either way).

If DB2 is unable to drain the object, the utility will abend and leave the objects in UTRO access, so update operations fail with –904 reason 00C90083.

So this gets us to the million dollar question: How can we be sure that if the REORG utility managed to get the DRAIN for the last log iteration, it will also be able to get the DRAIN for the SWITCH phase, and complete the processing as fast as possible?
Or, put differently: how can we minimize the impact of the REORG utility on other transactions?

Another parameter that takes part in our solution is the DRAIN parameter. This parameter tells the REORG utility which type of DRAIN to ask for before the last log iteration phase. The default value is WRITERS, meaning only write activity is drained. By setting this parameter to ALL, we tell REORG to drain all activity against the tablespace before entering the last log iteration. Once the REORG utility gets the DRAIN, the last log iteration will take no more than MAXRO seconds (or at least it shouldn't), and the SWITCH phase should be practically instantaneous (since no further DRAIN is necessary).

If, on the other hand, we have a long-running UOW, the REORG utility will not get the DRAIN, will never start the last log apply iteration, and will abend, leaving the objects in UTRW status and allowing concurrent user activity.

By setting the DRAIN_WAIT parameter we tell REORG how long to wait for that drain, so we can make sure no concurrent transactions time out because of the DRAIN request.

And now, for some test results…
I wrote two REXX scripts: one performs insert operations in a loop, and the other fetches a single record from the database with RS isolation and then sleeps for a given interval.

In the first test case, I executed both REXX scripts concurrently on a table with row level locking. Both of them were running, and then I tried to execute REORG SHRLEVEL CHANGE with the following parameters:

MAXRO 10
DRAIN ALL
DRAIN_WAIT 10
RETRY 5
RETRY_DELAY 5
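Put together as a complete control statement, the step looked roughly like the sketch below. SHRLEVEL CHANGE requires a mapping table; the name MOMI.MAPTBL here is a hypothetical placeholder, not the one I actually used:

```
REORG TABLESPACE DBMOMI.TSMOMI
      SHRLEVEL CHANGE
      MAPPINGTABLE MOMI.MAPTBL
      MAXRO 10
      DRAIN ALL
      DRAIN_WAIT 10
      RETRY 5
      RETRY_DELAY 5
```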

The REORG utility SYSPRINT looked like this

DSNU394I -DB2T DSNURBXA - SORTBLD PHASE STATISTICS - NUMBER OF KEYS=80 FOR INDE
DSNU391I DSNURPTB - SORTBLD PHASE STATISTICS. NUMBER OF INDEXES = 2
DSNU392I DSNURPTB - SORTBLD PHASE COMPLETE, ELAPSED TIME = 00:00:00
DSNU1122I -DB2T DSNURLOG - JOB REORGJOB PERFORMING REORG
WITH UTILID REOUTIL UNABLE TO DRAIN DBMOMI.TSMOMI.
RETRY 1 OF 5 WILL BE ATTEMPTED IN 5 SECONDS
DSNU1122I -DB2T DSNURLOG - JOB REORGJOB PERFORMING REORG
WITH UTILID REOUTIL UNABLE TO DRAIN DBMOMI.TSMOMI.
RETRY 2 OF 5 WILL BE ATTEMPTED IN 5 SECONDS
DSNU1122I -DB2T DSNURLOG - JOB REORGJOB PERFORMING REORG
WITH UTILID REOUTIL UNABLE TO DRAIN DBMOMI.TSMOMI.
RETRY 3 OF 5 WILL BE ATTEMPTED IN 5 SECONDS
DSNU1122I -DB2T DSNURLOG - JOB REORGJOB PERFORMING REORG
WITH UTILID REOUTIL UNABLE TO DRAIN DBMOMI.TSMOMI.
RETRY 4 OF 5 WILL BE ATTEMPTED IN 5 SECONDS
DSNU1122I -DB2T DSNURLOG - JOB REORGJOB PERFORMING REORG
WITH UTILID REOUTIL UNABLE TO DRAIN DBMOMI.TSMOMI.
RETRY 5 OF 5 WILL BE ATTEMPTED IN 5 SECONDS
DSNU590I -DB2T DSNURDRN - RESOURCE NOT AVAILABLE, REASON=X'00C200EA', ON DBMOMI
DSNT360I -DB2T ***********************************
DSNT361I -DB2T * DISPLAY DATABASE SUMMARY
* GLOBAL CLAIMERS
DSNT360I -DB2T ***********************************
DSNT362I -DB2T DATABASE = DBMOMI STATUS = RW
DBD LENGTH = 197856
DSNT397I -DB2T
NAME     TYPE PART  STATUS            CONNID   CORRID       CLAIMINFO
-------- ---- ----- ----------------- -------- ------------ --------
TSMOMI   TS         RW,UTRW           DB2CALL  DB2RXINS     (WR,C)
TSMOMI   TS         RW,UTRW           DB2CALL  DB2RXINS     (CS,C)
TSMOMI   TS         RW,UTRW           DB2CALL  DB2RXLCK     (CS,C)
******* DISPLAY OF DATABASE DBMOMI ENDED **********************
DSN9022I -DB2T DSNTDDIS 'DISPLAY DATABASE' NORMAL COMPLETION
DSNT500I DSNUGBAC - RESOURCE UNAVAILABLE
REASON 00C200EA
TYPE 00000200
NAME DBMOMI .TSMOMI

The REXX script performing the insert loop kept going without interruption, so this combination of parameters provided a solution to the long-running reader problem.

The following tests were conducted to decide which values one should choose for the above parameters. In these tests, I ran the insert REXX concurrently from 5 jobs, thus creating a greater load on the table.

The table that was used was created with the following SQL:
CREATE TABLE MOMI.MOMITEST (
COL1 INT NOT NULL,
COL2 TIMESTAMP NOT NULL,
COL3 CHAR(40) NOT NULL
) IN DBMOMI.TSMOMI;

The table had 5 indexes: COL1 (clustering), COL1+COL2, COL1+COL2+COL3, COL2+COL3, and COL1+COL3.

After several seconds of insert processing (by the 5 jobs running the insert script), I executed the REORG utility with the following parameters:
MAXRO 1 (this is 1 intentionally, not 10)
DRAIN ALL
DRAIN_WAIT 10
RETRY 5
RETRY_DELAY 5

During the time of this test, the insert rate was (each task performed 200k inserts):

Task ID   Min Inserts/Sec   Avg Inserts/Sec   Max Inserts/Sec
   1            44               194               564
   2            44               194               557
   3            16               194               566
   4            23               194               570
   5            49               194               573

So on average, the table had 973 inserts per second (although in later testing I was able to reach an insert rate of about 1600 records per second).

Here are the results of the REORG utility:

Records in Table   Log Records to Apply   # Log Iterations   Log Phase Elapsed Time
      152369              13141                  2                  00:00:07
      184795              18170                  3                  00:00:23
      273933              27400                  3                  00:00:13
      354644              70771                  4                  00:00:35
      445782              82947                  4                  00:00:43
      531495              87132                  3                  00:00:36
      742060             138428                  5                  00:01:13
     1275055             256657                  5                  00:01:55
     1535205             148244                  4                  00:01:03
     1760847*            232682                  6                  00:01:50
     2001291**           216785                  5                  00:01:23
     2218251**           405000                  5                  00:02:54
     2637665*            320874                  7                  00:01:37
     2965924*            407336                  8                  00:01:52

During the tests, none of the inserting processes timed out.

* - this REORG was executed with MAXRO 1, and received the LONGLOG message
** - this REORG was executed with MAXRO 3, and received the LONGLOG message

The LONGLOG message (DSNU377I) indicates that REORG believes it will not be able to catch up with the log; that is, the rate at which records are being updated is higher than the rate at which REORG can apply them. If such a situation occurs, you can take several actions to fix it (although it might just resolve itself eventually), such as increasing MAXRO.

When it identifies a LONGLOG condition, the REORG utility writes a message to SYSPRINT and continues to execute indefinitely (this is the default behavior, which can be changed with the ALTER UTILITY command or by specifying a different behavior via the LONGLOG parameter of the REORG control statement).

Conclusion: even though the insert rate tested was not very high, it does show that the REORG utility can be executed online without causing other transactions to time out. For higher insert rates, one can increase the value of MAXRO, or use partitioning to reduce the load on any single partition.

I hope this post helped clear things up,
Till next time…

Sunday, November 8, 2009

REORG SHRLEVEL CHANGE – aiming at 24x7 – Part I

From what I have seen at the sites where I worked, there is some confusion regarding the parameters of the REORG utility. Everybody would like to have true 24x7 systems using online REORG SHRLEVEL CHANGE, but to be on the safe side, they use SHRLEVEL REFERENCE instead.

When DB2 performs a REORG with SHRLEVEL CHANGE, it builds the shadow files the same way it does with REORG SHRLEVEL REFERENCE, but it has an additional step in which it needs to apply the log records of the updates that took place during the REORG itself.

Obviously, at some point, DB2 has to restrict update access to the table, in order to make a final synchronization (applying the updates to the shadow files). Additionally, at some point, all access to the tablespace must be blocked in order to switch the shadow files and the actual files (the SWITCH phase).

The phase in which DB2 applies the accumulated log records to the shadow files is called the LOG APPLY phase. This phase is done in several iterations. In each iteration, DB2 applies some log records, while allowing full access to the tablespace.
At some point DB2 decides to go into the last log iteration. At this point the tablespace will be available for read only access and all the remaining log records will be applied.

To decide which iteration will be the last, DB2 uses the MAXRO parameter. This parameter tells DB2 how many seconds the tablespace may be in read-only mode. DB2 applies log records iteratively until it estimates that the remaining log records can be applied in less than MAXRO seconds. At that point DB2 DRAINs all write activity and enters the last log iteration.
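The iteration logic can be modeled with a toy calculation (this is my own sketch, not DB2's actual estimation algorithm): each pass applies the current backlog of log records while new updates accumulate, and the last iteration starts once the backlog fits within MAXRO.

```python
def iterations_until_last_log(backlog, apply_rate, update_rate, maxro, limit=100):
    """Toy model: count catch-up iterations until the remaining log
    can be applied in under MAXRO seconds. Rates are records/second."""
    iterations = 0
    while backlog / apply_rate > maxro:
        elapsed = backlog / apply_rate      # time spent applying this backlog
        backlog = update_rate * elapsed     # records that arrived meanwhile
        iterations += 1
        if iterations > limit:              # backlog is not shrinking: LONGLOG
            return None
    return iterations

print(iterations_until_last_log(100_000, 5_000, 1_000, 10))   # converges quickly
print(iterations_until_last_log(100_000, 5_000, 6_000, 10))   # None: LONGLOG
```

The model also shows the LONGLOG condition from Part II: when the update rate exceeds the apply rate, the backlog never shrinks and the last iteration never starts.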

This gets us to the first problem. The default value of MAXRO is 300 seconds, while the default timeout interval (IRLMRWT) is 60 seconds, which means that once DB2 gets to the last log iteration, some concurrent transactions will most likely time out.

After the last iteration, DB2 must block all access to the tablespace in order to perform the SWITCH phase. To set this restrictive state, DB2 must DRAIN concurrent read activity against the tablespace (write activity was already drained before the last log apply iteration).

This gets us to the second problem. Let's assume we have a long running unit of work that only reads data, running concurrently with short user transactions that update data. When the REORG utility decides to perform the last log iteration, it will drain all write activity (short transactions). Since the user transactions are short, it will be able to drain the tablespace and change it to read only access for the last iteration. At this point user transactions that try to update the tablespace are in lock wait status.

Once the last iteration is finished, DB2 tries to drain all readers and perform the SWITCH phase. Since we have a long reader, the REORG utility will not be able to drain the read activity. The number of seconds the REORG utility waits for the DRAIN is controlled by the DRAIN_WAIT parameter, whose default is IRLMRWT*UTIMOUT (zparms), i.e. 360 seconds with the default values (a 60-second lock wait time multiplied by a utility timeout factor of 6). So while REORG waits for the drain, more user transactions will probably time out.

After DRAIN_WAIT seconds the drain fails and the REORG utility terminates, leaving the tablespace in read-only mode (leading to many resource-unavailable conditions for the short user transactions). At this point we can either terminate the utility to open the tablespace for write activity (but then all the processing it has done is lost, and all the user transactions timed out needlessly), or we can cancel the long-running unit of recovery and restart the utility, which will then finish rapidly. Either way, many user transactions timed out because of our REORG. This, of course, is unacceptable.

In the second part of this post, I will show how I used the parameters of the REORG utility to get past those two problems.

To Be Continued…

Sunday, November 1, 2009

DB2 V7 Restart Identity Column

DB2 V7? is this for real?

Well, as sad as it sounds, some people still use DB2 V7, or DB2 V8 compatibility mode. It's true; I even have a client that uses it.
That client had a problem: a programmer copied a table with an identity column from the production system to the test system using DSN1COPY. Since the production table was bigger, when he tried to insert records in the test system he received SQLCODE -803, because the generated identity value was too low.

Before V8 new-function mode, there is no way to restart an identity column without dropping and recreating the table.
One possible solution in such a case is to copy the data from the production system using UNLOAD/LOAD: in the LOAD step, remove the identity column from the INTO TABLE specification and let DB2 generate new values.
Unfortunately, another table had a foreign key to that identity column, so this solution was not possible.

What I ended up doing was writing a REXX script that performs inserts into the table until the identity column value lines up with the values already in the table. Luckily for me, identity column values are consumed even if the transaction is rolled back, so all I had to do was calculate the number of rows I needed to insert, insert them, and eventually roll back.
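The arithmetic behind that script can be sketched like this (a hypothetical helper, not the actual REXX; the real script reads the table's maximum value and the next generated identity value from DB2):

```python
def inserts_needed(max_value_in_table, next_identity_value, increment=1):
    """How many throwaway inserts realign the identity sequence.
    Identity values stay consumed even after ROLLBACK, so inserting
    this many rows and rolling back leaves the next generated value
    just past the existing data."""
    if next_identity_value > max_value_in_table:
        return 0  # the sequence is already ahead of the data
    return (max_value_in_table - next_identity_value) // increment + 1

print(inserts_needed(1000, 601))  # 400 throwaway inserts, then ROLLBACK
```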

You can find this REXX script here.
The script receives three parameters - the DB2 system name, the table creator and the table name.

Comments are welcome, as always.

Momi