Two posts in one week? Christmas came early this year… :-)
There is a web site called experts exchange where people ask questions and other people answer. I spend some time answering questions there and today someone asked for a solution for the following problem:
He has a number and he needs to convert it to text, for example, convert 10 to ten.
This is required for some check handling application.
The guy was referred to some code written in RPG, but I decided this can be a nice SQL challenge.
So, after a bit of fooling around, this is what I came up with:
First thing to do is to setup two tables that will contain some metadata. You don't have to actually create these tables; you can use a CTE if your version of DB2 supports it.
CREATE TABLE MOMI.FACTORS(START INT, LENGTH INT, CAPTION VARCHAR(20));
INSERT INTO MOMI.FACTORS VALUES(3, 3, '');
INSERT INTO MOMI.FACTORS VALUES(6, 3, 'THOUSAND');
INSERT INTO MOMI.FACTORS VALUES(9, 3, 'MILLION');
INSERT INTO MOMI.FACTORS VALUES(12, 3, 'BILLION');
INSERT INTO MOMI.FACTORS VALUES(15, 3, 'TRILLION');
CREATE TABLE MOMI.NUMS (VAL INT, CAPTION VARCHAR(20));
INSERT INTO MOMI.NUMS VALUES( 0, 'ZERO');
INSERT INTO MOMI.NUMS VALUES( 1, 'ONE');
INSERT INTO MOMI.NUMS VALUES( 2, 'TWO');
INSERT INTO MOMI.NUMS VALUES( 3, 'THREE');
INSERT INTO MOMI.NUMS VALUES( 4, 'FOUR');
INSERT INTO MOMI.NUMS VALUES( 5, 'FIVE');
INSERT INTO MOMI.NUMS VALUES( 6, 'SIX');
INSERT INTO MOMI.NUMS VALUES( 7, 'SEVEN');
INSERT INTO MOMI.NUMS VALUES( 8, 'EIGHT');
INSERT INTO MOMI.NUMS VALUES( 9, 'NINE');
INSERT INTO MOMI.NUMS VALUES( 10, 'TEN');
INSERT INTO MOMI.NUMS VALUES( 11, 'ELEVEN');
INSERT INTO MOMI.NUMS VALUES( 12, 'TWELVE');
INSERT INTO MOMI.NUMS VALUES( 13, 'THIRTEEN');
INSERT INTO MOMI.NUMS VALUES( 14, 'FOURTEEN');
INSERT INTO MOMI.NUMS VALUES( 15, 'FIFTEEN');
INSERT INTO MOMI.NUMS VALUES( 16, 'SIXTEEN');
INSERT INTO MOMI.NUMS VALUES( 17, 'SEVENTEEN');
INSERT INTO MOMI.NUMS VALUES( 18, 'EIGHTEEN');
INSERT INTO MOMI.NUMS VALUES( 19, 'NINETEEN');
INSERT INTO MOMI.NUMS VALUES( 20, 'TWENTY');
INSERT INTO MOMI.NUMS VALUES( 30, 'THIRTY');
INSERT INTO MOMI.NUMS VALUES( 40, 'FORTY');
INSERT INTO MOMI.NUMS VALUES( 50, 'FIFTY');
INSERT INTO MOMI.NUMS VALUES( 60, 'SIXTY');
INSERT INTO MOMI.NUMS VALUES( 70, 'SEVENTY');
INSERT INTO MOMI.NUMS VALUES( 80, 'EIGHTY');
INSERT INTO MOMI.NUMS VALUES( 90, 'NINETY');
And now for the query itself:
SELECT
CASE WHEN T3.VAL > 0 THEN ' '||T3.CAPTION||' HOUNDRED' ELSE '' END||
CASE WHEN T4.VAL IS NOT NULL THEN
CASE WHEN T3.VAL>0 AND T2.VAL=0 THEN ' AND '
WHEN COMPNAME = '' THEN ' AND '
ELSE ' ' END ||T4.CAPTION
WHEN T2.VAL = 0 THEN ' AND '||T1.CAPTION
WHEN T1.VAL > 0 AND T2.VAL >= 2 THEN ' '||T2.CAPTION||
CASE WHEN COMPNAME='' THEN ' AND '
ELSE ' ' END ||
T1.CAPTION END || ' '||T.COMPNAME
FROM (
SELECT
INT(LEFT(RIGHT(REPEAT('0',16-LENGTH(T1.TXT))||T1.TXT,
T2.START),T2.LENGTH)) COMPONENT, T2.START ,
T2.CAPTION COMPNAME
FROM (SELECT '17431103114512' AS TXT FROM SYSIBM.SYSDUMMY1) T1
, MOMI.FACTORS T2
) T
JOIN MOMI.NUMS T1 ON MOD(T.COMPONENT,10) = T1.VAL
JOIN MOMI.NUMS T2 ON MOD(T.COMPONENT/10, 10)*10 = T2.VAL
JOIN MOMI.NUMS T3 ON MOD(T.COMPONENT/100, 10) = T3.VAL
LEFT JOIN MOMI.NUMS T4 ON MOD(T.COMPONENT, 100) = T4.VAL
ORDER BY START DESC
Your first response should be what the @#!#@ ?
But after I'll explain, I hope you will feel differently.
What this query does is dividing the number into groups of 3 digits that represents the amount of each magnitude (ones, thousands, millions, trillions etc…).
Then, each group of 3 digits is translated to a string that represents it and then concatenated to the text that describes the magnitude.
This can obviously be scaled up to any number, but since this is an application that deals with money, I figured trillions should do the job.
The only problem with this query is that the result is not contained in a single line, but in multiple lines.
One can overcome this problem by creating a stored procedure that runs the above query and concatenate all the results. Another way to do it is use the above query in a CTE and join it to itself several times.
The advantages of having this done using a query over some other non-SQL language are:
1) It will work in every DB2 on every platform (I used pretty simple syntax, no special functions)
2) It will probably run faster
3) It is easier to maintain for a DBA
4) It can be executed against many values with no extra effort (just join to a table with the values you wish to convert)
Comments are welcomed as always.
Good Day
Thursday, December 17, 2009
Monday, December 14, 2009
Copy DB2 Statistics – Modeling Production Systems
This post is somewhat related to a previous post about comparing access path selection between test and production systems.
In the previous post I talked about the problem that verifying a package has optimal access paths in the test system, will not guarantee it will have the same access paths in the production system. The post also had a link to a REXX script that compares plan_tables between DB2 systems and reports if any differences exist for a given package.
So how can one minimize the number of times such differences will appear? The best approach is to try modeling production systems in the test systems.
What does modeling production systems means? Trying to have a test system with as similar configuration as possible to the production system.
Most organizations will not spend the same amount of money on computing resources for the test system as they do on the production system for obvious reasons, but that does not mean all hope is lost. DB2 will choose access paths for queries based on its own statistics as well as the hardware resources that are available to it.
According to what I have seen mostly, the DB2 statistics have far more influence on query access path selection than the available hardware resources, thus, by duplicating the statistics from your production system to your test system, you can increase the probability that DB2 will choose the same access path in both systems. This method is very efficient since you do not have to actually copy the data from the production system to the test system, you only copy the statistics.
So what can we copy?
In the Utility Guide and Reference manual, under the RUNSTATS utility documentation you can find a list of columns in the catalog which are used by the optimizer for access path selection. Most of these columns are updateable by SQL statements so there is no problem to update them manually.
I have written a REXX script (DB2CPSTS) which receives the following parameters and produces the appropriate statements that are required to copy the statistics from one system to another:
DB2SYS – DB2 system where the REXX will execute
DB2SRC - DB2 system from which we get the statistics
DB2CPY – Name of database for which we want to copy statistics (like pattern)
TS2CPY – Name of tablespace for which we want to copy statistics (like pattern)
There are some additional parameters which are not implemented yet.
The REXX also access the following DD files which have to be defined in the JCL:
OUTSQL – this file will eventually contain the SQL statements
SCMAP – contains mapping of schema / creator names
OBMAP – contains mapping of object names
DBMAP – contains mapping of database names
SPMAP – contains mapping of tablespaces / indexspaces names
The purpose of the mapping files is to provide a way of altering the objects names between systems, so for example, if your schema contains the letter P in the production system and the letter T in the test system, you can translate the names in order to make sure the appropriate objects are updated. Also, if you have several copies of the same schema in a single system, you can also use this script in order to copy statistics between these schemas.
The format of the mapping rules are:
Source_Pattern Target_Pattern
Each file can contain several mapping rules which are evaluated by their order. The first rule that will match will be used and later rules will not be checked, so for example, having these rules:
A B
B C
Will translate A to B and stop.
The rules
SYSTABLE% MOMTABLE%
SYSTABLES MOMI
Will translate SYSTABLES to MOMTABLES
Both patterns can use % and _ as used with the like predicate.
Not all patterns are legal, I implemented the translation according to what I find logical. In my algorithm translation is done in the following way:
A single non special character must match a single non-special character
The _ character must match either a regular character or a _ character
The % character must match the % character
If a special character is used, the number of characters up to that character must be identical.
Here are some examples for mapping rules that work:
%M %D
SY%BM AY%XD
S_SI% SSSB%
___I% DUPI%
___I% ___B%
SYS% DUP%
SYSTABLES SYSTEMTABS (static translation)
Names that don't match any rule will not be translated.
In addition to this REXX script, I wrote another script (DB2PCKCS) which receives these parameters:
DB2SYS – DB2 system where the REXX will execute
DB2SRC - DB2 system from which we get the statistics
COLLID – collection id of a package
PACK – package name
The REXX also need to have the same DD cards as the DB2CPSTS script. This REXX script will invoke the previous one for all the tablespaces that contains tables upon which the specified package depends (according to SYSIBM.SYSPACKDEP), so if you want to copy all the statistics that are relevant for a specific package, the second REXX will do the trick for you).
I also uploaded a sample job (CPSTSJOB) that executes DB2CPSTS.
You should note that even after you copy the statistics from your production system to your test system, you might get different access paths since access paths also depends on:
• The Processor model
• Number of processors (effects parallelism)
• Buffer pool sizes
• RID pool sizes
• Services levels at each system
• Values of some system parameters such as PARAMDEG, NPGTHRSH, STARJOIN
• Distribution statistics in SYSCOLDIST
Some future improvements that might be implemented into this script and thing to keep in mind:
• The script was tested on DB2 V8 compatibility mode. V9 has new types of statistics such as histograms that are not included in this script.
• The script does not generate commit statements so if a very large number of statements were generated, you might create a long contention on the DB2 catalog when running these statements.
• Column distribution statistics are inserted into SYSCOLDIST and not updated, you might need to delete any previous rows from this table.
• The scripts update both access path statistics and space related statistics. In the future, maybe I will implement a parameter that will govern this behavior.
• In the future, maybe a parameter will be added that will indicate if the SQL statements should be written to file or executed immediately.
Any thoughts / ideas are more than welcomed.
Hope you find this useful.
Take care
Momi
In the previous post I talked about the problem that verifying a package has optimal access paths in the test system, will not guarantee it will have the same access paths in the production system. The post also had a link to a REXX script that compares plan_tables between DB2 systems and reports if any differences exist for a given package.
So how can one minimize the number of times such differences will appear? The best approach is to try modeling production systems in the test systems.
What does modeling production systems means? Trying to have a test system with as similar configuration as possible to the production system.
Most organizations will not spend the same amount of money on computing resources for the test system as they do on the production system for obvious reasons, but that does not mean all hope is lost. DB2 will choose access paths for queries based on its own statistics as well as the hardware resources that are available to it.
According to what I have seen mostly, the DB2 statistics have far more influence on query access path selection than the available hardware resources, thus, by duplicating the statistics from your production system to your test system, you can increase the probability that DB2 will choose the same access path in both systems. This method is very efficient since you do not have to actually copy the data from the production system to the test system, you only copy the statistics.
So what can we copy?
In the Utility Guide and Reference manual, under the RUNSTATS utility documentation you can find a list of columns in the catalog which are used by the optimizer for access path selection. Most of these columns are updateable by SQL statements so there is no problem to update them manually.
I have written a REXX script (DB2CPSTS) which receives the following parameters and produces the appropriate statements that are required to copy the statistics from one system to another:
DB2SYS – DB2 system where the REXX will execute
DB2SRC - DB2 system from which we get the statistics
DB2CPY – Name of database for which we want to copy statistics (like pattern)
TS2CPY – Name of tablespace for which we want to copy statistics (like pattern)
There are some additional parameters which are not implemented yet.
The REXX also access the following DD files which have to be defined in the JCL:
OUTSQL – this file will eventually contain the SQL statements
SCMAP – contains mapping of schema / creator names
OBMAP – contains mapping of object names
DBMAP – contains mapping of database names
SPMAP – contains mapping of tablespaces / indexspaces names
The purpose of the mapping files is to provide a way of altering the objects names between systems, so for example, if your schema contains the letter P in the production system and the letter T in the test system, you can translate the names in order to make sure the appropriate objects are updated. Also, if you have several copies of the same schema in a single system, you can also use this script in order to copy statistics between these schemas.
The format of the mapping rules are:
Source_Pattern Target_Pattern
Each file can contain several mapping rules which are evaluated by their order. The first rule that will match will be used and later rules will not be checked, so for example, having these rules:
A B
B C
Will translate A to B and stop.
The rules
SYSTABLE% MOMTABLE%
SYSTABLES MOMI
Will translate SYSTABLES to MOMTABLES
Both patterns can use % and _ as used with the like predicate.
Not all patterns are legal, I implemented the translation according to what I find logical. In my algorithm translation is done in the following way:
A single non special character must match a single non-special character
The _ character must match either a regular character or a _ character
The % character must match the % character
If a special character is used, the number of characters up to that character must be identical.
Here are some examples for mapping rules that work:
%M %D
SY%BM AY%XD
S_SI% SSSB%
___I% DUPI%
___I% ___B%
SYS% DUP%
SYSTABLES SYSTEMTABS (static translation)
Names that don't match any rule will not be translated.
In addition to this REXX script, I wrote another script (DB2PCKCS) which receives these parameters:
DB2SYS – DB2 system where the REXX will execute
DB2SRC - DB2 system from which we get the statistics
COLLID – collection id of a package
PACK – package name
The REXX also need to have the same DD cards as the DB2CPSTS script. This REXX script will invoke the previous one for all the tablespaces that contains tables upon which the specified package depends (according to SYSIBM.SYSPACKDEP), so if you want to copy all the statistics that are relevant for a specific package, the second REXX will do the trick for you).
I also uploaded a sample job (CPSTSJOB) that executes DB2CPSTS.
You should note that even after you copy the statistics from your production system to your test system, you might get different access paths since access paths also depends on:
• The Processor model
• Number of processors (effects parallelism)
• Buffer pool sizes
• RID pool sizes
• Services levels at each system
• Values of some system parameters such as PARAMDEG, NPGTHRSH, STARJOIN
• Distribution statistics in SYSCOLDIST
Some future improvements that might be implemented into this script and thing to keep in mind:
• The script was tested on DB2 V8 compatibility mode. V9 has new types of statistics such as histograms that are not included in this script.
• The script does not generate commit statements so if a very large number of statements were generated, you might create a long contention on the DB2 catalog when running these statements.
• Column distribution statistics are inserted into SYSCOLDIST and not updated, you might need to delete any previous rows from this table.
• The scripts update both access path statistics and space related statistics. In the future, maybe I will implement a parameter that will govern this behavior.
• In the future, maybe a parameter will be added that will indicate if the SQL statements should be written to file or executed immediately.
Any thoughts / ideas are more than welcomed.
Hope you find this useful.
Take care
Momi
Tuesday, December 8, 2009
Converting RBA to Timestamp
Hi
This time I want to address an issue I see many people ask about in forums – how can I convert a RBA value to Timestamp.
Well, first we should note that DB2 uses two methods to keep log records sequence.
In non Data Sharing systems, each log record is associated with an RBA (Relative Byte Address) which is the 6 byte offset of the log record, from the beginning of the DB2 logs.
In a Data Sharing system, each log record also has an associated LRSN (Log Record Sequence Number) which is a 6 byte value that is derived from the system clock. This value is used to ensure that each log record that affects a data page can be uniquely identified since the RBA may overlap between the Data Sharing members.
Since LRSN is derived from the system clock, converting LRSN values to timestamps is as simple as:
SELECT TIMESTAMP(LRSN || X'0000') FROM SYSIBM.SYSDUMMY1
Since RBA values are not derived values, but actually addresses of log records in the DB2 log, there is no way of converting a RBA value to timestamp.
So what is this post all about? We just said it's impossible…
There are several methods that I know of that can be used in order to convert RBA values to timestamp values (or the other way around). Usually this task is required when one wants to perform point in time recovery.
One must note that all the methods are not guaranteed to give you the exact answer you are looking for, but will allow you to determine a general time frame in which a log record with a specific RBA was written.
Method 1: The DISPLAY LOG command
If you know that you will need at some point to translate between RBA and Timestamp values, or you are a person who likes to be on the safe side, you can just issue a DISPLAY LOG command every X seconds. The output of the DISPLAY LOG command provides you with the current RBA, so by combining this value with the time at which the command is executed provides a nice way of converting RBA to Timestamp or the other way around. Sample output of the DISPLAY LOG command:
DSNJ370I -DB2T DSNJC00A LOG DISPLAY
CURRENT COPY1 LOG = DB2T.LOGCOPY1.D01 IS 55% FULL
CURRENT COPY2 LOG = DB2T.LOGCOPY1.D01 IS 55% FULL
H/W RBA = 0491CE036138
H/O RBA = 0491C4188FFF
FULL LOGS TO OFFLOAD = 0 OF 6
OFFLOAD TASK IS (AVAILABLE)
DSNJ371I -DB2T DB2 RESTARTED 23:06:14 DEC 3, 2009
RESTART RBA 048FA92C7000
CHECKPOINT FREQUENCY 5000 LOGRECORDS
LAST SYSTEM CHECKPOINT TAKEN 15:13:50 DEC 7, 2009
DSN9022I -DB2T DSNJC001 '-DIS LOG' NORMAL COMPLETION
You would like to take the H/W RBA (high written), and not he H/O (high offloaded which means what was the RBA of the last log record that was archived).
Pros:
1) Rather simple
2) One can control the resolution of the RBA to Timestamp conversion by changing the frequency at which this command is executed
3) Always available – if you need to convert RBA to timestamp fast (for example to perform point in time recovery), by using this approach, the data is available (in some table where you store it) and you do not have to waste valuable time on gathering this data.
Cons:
1) Will require writing some kind of a program to extract the information
2) Will require to maintain a table that will store this information
3) Will require monitoring of the reoccurring process that saves the data
Method 2: Use DSNJU004 + DSN1LOGP
The DSNJU004 will print the content of the BSDS dataset. The output of this utility will provide you with the range of RBA values and the range of timestamp values that correspond to log records that are contained in each active and archived log file. By using this output, you will be able to determine which log file / files contains the log records that describes evens that happened in a certain range of RBA / timestamp values.
Once you have this list of log files, you can use the DSN1LOGP utility in order to read these log files. By using the summary report you can get information about certain events that took place (like transactions). The output for these events will contain both RBA and timestamp values.
Pros:
1) Can provide resolution of up to 1 second with
2) No prior preparation required
Cons:
1) Rather difficult to perform
2) Time consuming – output contains massive amounts of data
If you have some kind of log analysis tool, this task can be much simpler, depending on the features of the tool.
Method 3: The REPORT utility
The report utility is one of my favorite utilities. You provide it with an object and it displays information on that object from SYSLGRNX (among other things). This information can provide many insights (which will be discussed in some future post), and it provides a RBA to Timestamp conversion values (and vice versa).
The output looks like this:
DSNU583I -DB2M DSNUPPLR - SYSLGRNX ROWS FROM REPORT RECOVERY FOR TABLESPACE DSNDB06.SYSDBASE
UCDATE UCTIME START RBA STOP RBA START LRSN STOP LRSN
112009 00011932 C596199665E8 C5961A7E620C C596199665E8 C5961A7E620C
112009 00073206 C5961AEA4EA8 C5961C532F30 C5961AEA4EA8 C5961C532F30
112009 02172341 C5961DF26B2C C59628CFC9DA C5961DF26B2C C59628CFC9DA
112009 02412968 C59640071768 C596401675B8 C59640071768 C596401675B8
112009 03161335 C59651D11C0E C596541347FE C59651D11C0E C596541347FE
112009 03252530 C596545F0D87 C59654C906F2 C596545F0D87 C59654C906F2
112009 03281476 C59654E47580 C596584D20DD C59654E47580 C596584D20DD
112009 05023682 C596586D2C77 C596586D83D3 C596586D2C77 C596586D83D3
112009 05180635 C59658701871 C59663DD493D C59658701871 C59663DD493D
112009 05575559 C596640CB6B3 C5966443C714 C596640CB6B3 C5966443C714
So, this output is practically begging us to load it into a table… :-)
Note that if you are running a non Data Sharing system, the LRSN values in the output of the report utility are not real values. Converting them to timestamps using the method mentioned above yields wrong results.
Pros:
1) Easy to execute
2) Provides information for a rather large time frame (depends on how often you run the MODIFY utility)
3) Can be executed at the object level
Cons:
1) Data range is limited since SYSLGRNX gets cleaned as the MODIF utility is executed.
2) Data resolution depends on two zparms: PCLOSEN, PCLOSET (which determines how often DB2 will pseudo-close a file that has not been updated).
3) This method is impractical for a very busy tablespace (if it is constantly being updated, the file will always remain open for updates).
Method 4: SYSIBM.SYSCOPY
The SYSIBM.SYSCOPY system table contains information about utilities that were executed against DB2 objects. Some records, for example, records for the COPY utility (or other records representing inline copies) contain values in both the timestamp and start_rba column.
Combining data from these two columns can help in converting RBA to Timestamp values. Note that for each utility (determined by ictype), a different RBA is kept (with respect to the exact utility phase), so these values do not provide direct conversion (even though this might be a good RBA to recover to, the timestamp column contains the timestamp at which the record was inserted into the table, so obviously, the values will usually not reflect direct conversion).
Pros:
1) Data is readily available with no prior preparation
2) Data is easily accessible through SQL
3) Data may span a long period (depends on the execution frequency of the MODIFY utility)
Cons:
1) No direct conversion
2) Data resolution depends on the frequency of maintenance tasks, thus, may not be sufficient
Method 4: Be a DBA Ninja!
According to Wikipedia, In Japanese history, a ninja is a warrior specially trained in a variety of unorthodox arts of war. I remember I saw a movie that said a Ninja has to work with what is available to him in his surroundings.
What does this have to do with the task at hand?
Sometimes you might find information in places you did not consider looking in, for example, if you are using IBM Data Propagator in order to replicate data, the data propagator has a unit of works table (IBMSNAP_UOW) that contains information on transactions. This table contains for each transaction the RBA (IBMSNAP_COMMITSEQ) and timestamp (IBMSNAP_LOGMARKER) of the commit statement that ended the transaction.
If for example you have a table that has a timestamp column that is always updated to the current timestamp at the time of an update statement (a LAST_UPDATED type of column), you might locate a record with the desired timestamp and then use DSN1PRNT in order to extract the last update RBA from the page header. Finding the page that contains this specific row can be done by printing the index leaf page that point to this record, or simply by creating a lock contention on that record and use some monitoring tool (such as Omegamon) to see which page is locked.
Keep your eyes and ears open to things that happen in your DB2 system, and you might find what you are looking for without too much effort.
Pros:
1) Valuable information may be readily available
2) Effort might be minimal
Cons:
1) You have to be creative… (this can be counted as a pro, depends on your perspective)
Conclusions:
Converting RBA to Timestamp is not possible by calculation, but we can use several sources of information in order to do it, or to get close.
As method 1 shows, it is always better to prepare in advance as this is the only method that gives you the ability to store unlimited history and to truly control the resolution of the information, but even if you are caught with your pants down, there are still other methods you can use, some are as simpler as querying SYSIBM.SYSCOPY while others a bit more complex and involves executing some DB2 utilities.
Good Luck!!
This time I want to address an issue I see many people ask about in forums – how can I convert a RBA value to Timestamp.
Well, first we should note that DB2 uses two methods to keep log records sequence.
In non Data Sharing systems, each log record is associated with an RBA (Relative Byte Address) which is the 6 byte offset of the log record, from the beginning of the DB2 logs.
In a Data Sharing system, each log record also has an associated LRSN (Log Record Sequence Number) which is a 6 byte value that is derived from the system clock. This value is used to ensure that each log record that affects a data page can be uniquely identified since the RBA may overlap between the Data Sharing members.
Since LRSN is derived from the system clock, converting LRSN values to timestamps is as simple as:
SELECT TIMESTAMP(LRSN || X'0000') FROM SYSIBM.SYSDUMMY1
Since RBA values are not derived values, but actually addresses of log records in the DB2 log, there is no way of converting a RBA value to timestamp.
So what is this post all about? We just said it's impossible…
There are several methods that I know of that can be used in order to convert RBA values to timestamp values (or the other way around). Usually this task is required when one wants to perform point in time recovery.
One must note that all the methods are not guaranteed to give you the exact answer you are looking for, but will allow you to determine a general time frame in which a log record with a specific RBA was written.
Method 1: The DISPLAY LOG command
If you know that you will need at some point to translate between RBA and Timestamp values, or you are a person who likes to be on the safe side, you can just issue a DISPLAY LOG command every X seconds. The output of the DISPLAY LOG command provides you with the current RBA, so by combining this value with the time at which the command is executed provides a nice way of converting RBA to Timestamp or the other way around. Sample output of the DISPLAY LOG command:
DSNJ370I -DB2T DSNJC00A LOG DISPLAY
CURRENT COPY1 LOG = DB2T.LOGCOPY1.D01 IS 55% FULL
CURRENT COPY2 LOG = DB2T.LOGCOPY1.D01 IS 55% FULL
H/W RBA = 0491CE036138
H/O RBA = 0491C4188FFF
FULL LOGS TO OFFLOAD = 0 OF 6
OFFLOAD TASK IS (AVAILABLE)
DSNJ371I -DB2T DB2 RESTARTED 23:06:14 DEC 3, 2009
RESTART RBA 048FA92C7000
CHECKPOINT FREQUENCY 5000 LOGRECORDS
LAST SYSTEM CHECKPOINT TAKEN 15:13:50 DEC 7, 2009
DSN9022I -DB2T DSNJC001 '-DIS LOG' NORMAL COMPLETION
You would like to take the H/W RBA (high written), and not he H/O (high offloaded which means what was the RBA of the last log record that was archived).
Pros:
1) Rather simple
2) One can control the resolution of the RBA to Timestamp conversion by changing the frequency at which this command is executed
3) Always available – if you need to convert RBA to timestamp fast (for example to perform point in time recovery), by using this approach, the data is available (in some table where you store it) and you do not have to waste valuable time on gathering this data.
Cons:
1) Will require writing some kind of a program to extract the information
2) Will require to maintain a table that will store this information
3) Will require monitoring of the reoccurring process that saves the data
Method 2: Use DSNJU004 + DSN1LOGP
The DSNJU004 will print the content of the BSDS dataset. The output of this utility will provide you with the range of RBA values and the range of timestamp values that correspond to log records that are contained in each active and archived log file. By using this output, you will be able to determine which log file / files contains the log records that describes evens that happened in a certain range of RBA / timestamp values.
Once you have this list of log files, you can use the DSN1LOGP utility in order to read these log files. By using the summary report you can get information about certain events that took place (like transactions). The output for these events will contain both RBA and timestamp values.
Pros:
1) Can provide resolution of up to 1 second with
2) No prior preparation required
Cons:
1) Rather difficult to perform
2) Time consuming – output contains massive amounts of data
If you have some kind of log analysis tool, this task can be much simpler, depending on the features of the tool.
Method 3: The REPORT utility
The report utility is one of my favorite utilities. You provide it with an object and it displays information on that object from SYSLGRNX (among other things). This information can provide many insights (which will be discussed in some future post), and it provides a RBA to Timestamp conversion values (and vice versa).
The output looks like this:
DSNU583I -DB2M DSNUPPLR - SYSLGRNX ROWS FROM REPORT RECOVERY FOR TABLESPACE DSNDB06.SYSDBASE
UCDATE UCTIME START RBA STOP RBA START LRSN STOP LRSN
112009 00011932 C596199665E8 C5961A7E620C C596199665E8 C5961A7E620C
112009 00073206 C5961AEA4EA8 C5961C532F30 C5961AEA4EA8 C5961C532F30
112009 02172341 C5961DF26B2C C59628CFC9DA C5961DF26B2C C59628CFC9DA
112009 02412968 C59640071768 C596401675B8 C59640071768 C596401675B8
112009 03161335 C59651D11C0E C596541347FE C59651D11C0E C596541347FE
112009 03252530 C596545F0D87 C59654C906F2 C596545F0D87 C59654C906F2
112009 03281476 C59654E47580 C596584D20DD C59654E47580 C596584D20DD
112009 05023682 C596586D2C77 C596586D83D3 C596586D2C77 C596586D83D3
112009 05180635 C59658701871 C59663DD493D C59658701871 C59663DD493D
112009 05575559 C596640CB6B3 C5966443C714 C596640CB6B3 C5966443C714
So, this output is practically begging us to load it into a table… :-)
Note that if you are running a non Data Sharing system, the LRSN values in the output of the report utility are not real values. Converting them to timestamps using the method mentioned above yields wrong results.
Pros:
1) Easy to execute
2) Provides information for a rather large time frame (depends on how often you run the MODIFY utility)
3) Can be executed at the object level
Cons:
1) Data range is limited since SYSLGRNX gets cleaned as the MODIF utility is executed.
2) Data resolution depends on two zparms: PCLOSEN, PCLOSET (which determines how often DB2 will pseudo-close a file that has not been updated).
3) This method is impractical for a very busy tablespace (if it is constantly being updated, the file will always remain open for updates).
Method 4: SYSIBM.SYSCOPY
The SYSIBM.SYSCOPY system table contains information about utilities that were executed against DB2 objects. Some records, for example, records for the COPY utility (or other records representing inline copies) contain values in both the timestamp and start_rba column.
Combining data from these two columns can help in converting RBA to Timestamp values. Note that for each utility (determined by ictype), a different RBA is kept (with respect to the exact utility phase), so these values do not provide direct conversion (even though this might be a good RBA to recover to, the timestamp column contains the timestamp at which the record was inserted into the table, so obviously, the values will usually not reflect direct conversion).
Pros:
1) Data is readily available with no prior preparation
2) Data is easily accessible through SQL
3) Data may span a long period (depends on the execution frequency of the MODIFY utility)
Cons:
1) No direct conversion
2) Data resolution depends on the frequency of maintenance tasks, thus, may not be sufficient
Method 4: Be a DBA Ninja!
According to Wikipedia, In Japanese history, a ninja is a warrior specially trained in a variety of unorthodox arts of war. I remember I saw a movie that said a Ninja has to work with what is available to him in his surroundings.
What does this have to do with the task at hand?
Sometimes you might find information in places you did not consider looking in, for example, if you are using IBM Data Propagator in order to replicate data, the data propagator has a unit of works table (IBMSNAP_UOW) that contains information on transactions. This table contains for each transaction the RBA (IBMSNAP_COMMITSEQ) and timestamp (IBMSNAP_LOGMARKER) of the commit statement that ended the transaction.
If for example you have a table that has a timestamp column that is always updated to the current timestamp at the time of an update statement (a LAST_UPDATED type of column), you might locate a record with the desired timestamp and then use DSN1PRNT in order to extract the last update RBA from the page header. Finding the page that contains this specific row can be done by printing the index leaf page that point to this record, or simply by creating a lock contention on that record and use some monitoring tool (such as Omegamon) to see which page is locked.
Keep your eyes and ears open to things that happen in your DB2 system, and you might find what you are looking for without too much effort.
Pros:
1) Valuable information may be readily available
2) Effort might be minimal
Cons:
1) You have to be creative… (this can be counted as a pro, depends on your perspective)
Conclusions:
Converting RBA to Timestamp is not possible by calculation, but we can use several sources of information in order to do it, or to get close.
As method 1 shows, it is always better to prepare in advance as this is the only method that gives you the ability to store unlimited history and to truly control the resolution of the information, but even if you are caught with your pants down, there are still other methods you can use, some are as simpler as querying SYSIBM.SYSCOPY while others a bit more complex and involves executing some DB2 utilities.
Good Luck!!
Subscribe to:
Posts (Atom)