In today's ever-changing business environment, you need to keep ahead of the competition. From an IT and database perspective, this means that you need to be able to access and present business data in new, insightful ways and you need to be able to deliver these capabilities immediately! The DB2 for i5/OS V6R1 enhancements include several new features that speed up your databases and let you quickly and easily deliver new solutions.
One of the strengths of SQL as a programming language is its set-oriented nature, which allows a tremendous amount of data processing to occur with a single statement. The power of a single SQL statement takes a step forward with the V6R1 enhancements. A great example is the support for Grouping Sets and the CUBE and ROLLUP OLAP expressions that let you aggregate data by different dimensions and different levels with one SQL statement (for more information about this support, read "Multidimensional Grouping Made Easy" on page ProVIP 28). For example, these new features allow a single query to return sales data to the user, and let the user view and store those sales results at a store, region, or country level.
The ability to include an INSERT statement on a FROM clause is another enhancement that lets you process more on a single SQL statement. A number of i5/OS developers have taken advantage of DB2's ability to automatically generate key values with the identity column clause. With identity columns, it's a challenge to access the key value (e.g., order number) that DB2 generates because it requires an additional SQL statement to be executed. The INSERT statement is executed to add the new row followed by an extra SELECT statement with the Identity_Val_Local function to retrieve the generated value. This two-statement solution is also problematic when a block of rows is inserted with a blocked INSERT, because the Identity_Val_Local function returns the generated key value for only the last row inserted.
DB2 for i5/OS addresses these challenges in V6R1 by enhancing the SELECT statement to allow an INSERT statement to be specified on the FROM clause, as Figure 1 shows. With the syntax in Figure 1, the INSERT statement on the FROM clause is executed first, and then the outer SELECT statement returns the values that DB2 generated for the order_id identity column. Notice that the FINAL TABLE keyword is required when referencing an INSERT statement on the FROM clause.
The second INSERT on FROM example in Figure 1 demonstrates how you can use this new support to access the generated identity values for a blocked insert operation. The ORDER BY INPUT SEQUENCE clause lets you get the generated values back for the order_id column in the order in which DB2 generated them. Not only does this new INSERT on FROM syntax solve the issue of retrieving generated values for blocked inserts, but by allowing a single DB2 request to perform multiple database operations, it also lets your application run slightly faster.
FROM clause. The FROM clause has also been improved with the ability to specify a VALUES clause. With this support, you can dynamically declare, populate, and reference a temporary table on a SELECT statement, as the following example shows:
SELECT * FROM ( VALUES(77,'New Department')) AS tmp(c1,c2) UNION ALL SELECT deptnum, deptname FROM org WHERE deptnum <20
This new support is helpful when you want to include data from memory tables or data structures in a result set without the overhead of creating a real temporary table object. The VALUES result set can also be joined with data from other DB2 tables. The VALUES in FROM support definitely gives you more flexibility when you construct queries and reports.
SQL tool set enhancements. SQL developers in i5/OS always benefit from new additions to the toolbox of built-in SQL functions, and V6R1 is no exception. Data encryption continues to be an area of interest, and many companies focus on data privacy, so DB2 V6R1 has added support for the popular AES encryption algorithm with the delivery of the ENCRYPT_AES function. The i5/OS applications that deal with date and time values also benefit from enhancements added to the VARCHAR_FORMAT and TIMESTAMP_FORMAT functions, as well as the new functions MONTHS_BETWEEN, ROUND_TIMESTAMP, and TRUNC_TIMESTAMP.
The SKIP LOCKED DATA clause might be the best addition to the SQL developer's tool set in V6R1. You can use this clause to increase throughput of applications with heavy concurrent activity. One common misconception about DB2 locking is that if you're running a query against a DB2 table that other jobs and users are changing, DB2 automatically skips over any row changes that have not yet been committed. Actually, the opposite occurs. If a row matches the search criteria of the query, DB2 stops and waits to see whether it can acquire the lock on that row. Let's walk through an example to better understand this behavior. In this example, all the SQL requests run with the Cursor Stability (CS) isolation level.
Assume that you're searching the Internet for vacation flights to Honolulu. Two seconds before you submit your search on the travel agency's website, the following update is performed on its flight database, changing the departure time from 4:30 p.m. to 5:25 p.m. as part of a large batch of updates to its flight schedules.
UPDATE flights SET departTime='05:25' WHERE departTime = '4:30' AND flightNum=331 AND destCity='HNL'
Your flight search request issues the following SELECT statement to find a flight that leaves after 5 p.m. for Honolulu.
SELECT * FROM flights WHERE departTime >= '05:00' AND destCity='HNL'
When the SELECT statement processes the row that was just updated, the SELECT statement finds a match because the newly updated departure time is 5:25 p.m., which meets the specified selection criteria.
At this point, DB2 does not know that the row has been locked, so it attempts to acquire a read lock on this row as dictated by the CS isolation level. The UPDATE statement has not yet been committed because it's part of the batch update, so it holds an update lock on the row. Therefore, the update lock conflicts with the requested read lock, so the SELECT statement stops and waits until the row lock is released. If the batch update process in this example is long-running, the SELECT statement can fail with a lock timeout error (the default record wait time is 60 seconds on i5/OS). The fact that the SELECT statement stops and waits for an update lock on the row to be released may come as a surprise to many developers who believe that DB2 will just skip over the locked row and search for other rows in the table meeting the selection criteria. This may be good or bad news depending on your application's requirements.
Some good news in V6R1 is that you have a choice of behaviors. You can continue to run with the default behavior of DB2 waiting for locks, or you can have DB2 skip over locked rows. The skipping behavior is activated with the SKIP LOCKED DATA clause in the following example:
SELECT * FROM flights WHERE departTime >= '05:00' AND destCity='HNL' SKIP LOCKED DATA
With this clause specified, instead of waiting to acquire the row lock and possibly fail with a lock timeout error, the SELECT statement now skips over any rows in which it encounters a lock conflict. You can also specify the SKIP LOCKED DATA clause on UPDATE and DELETE statements. In addition, the clause is honored only with the CS and Read Stability isolation levels.
UPDATE statements. In V6R1, UPDATE statements also have another new feature called extended indicators. Extended indicators let you code a single, generic UPDATE statement instead of coding a different UPDATE statement for each combination of columns that needs to be updated. Figure 2 demonstrates this statement reusability. Notice the WITH EXTENDED INDICATORS clause on the cursor declaration. This clause allows extended indicator variables to be passed on positioned UPDATE operations associated with the referenced cursor (i.e., cur1). Extended indicators are then set to a defined set of special values to tell the UPDATE statement whether the column should be included in the update operation. In this example, the inds2 extended indicator variable is set to a value of -7, which tells DB2 to omit the quantity column from the positioned UPDATE. Thus, only two columns (shipdate and status) are actually changed by the embedded UPDATE statement, even though three columns are referenced on the SET clause. You can also use extended indicators with INSERT statements.
Run SQL Statement. The Run SQL Statement (RUNSQLSTM) command is a popular tool with i5/OS developers who need to execute SQL scripts from a CL program or command line interface. With V6R1, the RUNSQLSTM command can now process and execute SQL statements stored in a stream file (previously, the SQL had to be stored in a source physical file member). This new stream file support can help developers in several different ways. Most important, your SQL statements no longer have to be limited to 80 characters when stored in a stream file. The 80-character limit for statements stored in a source file member made SQL coding difficult and tedious for many i5/OS developers. In addition, a stream file can accommodate larger, more complex SQL scripts because stream files have a size limit of 1 TB, which is much larger than the 16 MB limit of a source file member. Finally, this support enables your i5/OS SQL scripts to be managed by a stream-file-based change management system.
Precompilers. The ILE SQL precompilers have also been enhanced with this source stream file capability in V6R1. Furthermore, IBM continued the momentum of improved SQL and RPG integration from previous releases with additional enhancements to the ILE RPG SQL precompiler. One of the key improvements in V6R1 is the precompiler's ability to scope variables at a procedure level. This new capability lets the SQL precompiler properly process and execute code similar to the RPG code in Figure 3.
In Figure 3, the same variable name (outArray) is used in the two RPG procedures to declare data structures with slightly different attributes. The first procedure declares the outArray data structure with two fields, and the second procedure contains three fields within the outArray declaration. In addition, each procedure contains an SQL FETCH statement that references the duplicated variable name. In prior releases, the SQL RPG precompiler was unable to scope the variable name to the procedure, which resulted in headaches for RPG developers. As a result of this deficiency, the SQL RPG precompiler would either fail with an error flagging the duplicate variables names as illegal (even though it's valid RPG syntax), or unpredictable results would occur at runtime because the SQL RPG precompiler was incorrectly sharing one definition of the duplicated variable across multiple procedures. The V6R1 SQL RPG precompiler eliminates this hurdle by correctly scoping the variables at a procedure level.
IBM plans to provide a PTF in 2008 to make this variable scoping enhancement also available on V5R4. Other V6R1 enhancements to the RPG SQL precompiler include the ability to define variables based on SQLCA variables (SQLSTATE) using LIKE, and improved WDSC integration for source code error resolution.
Not to be overlooked, Cobol programmers will benefit from the Unicode support added to the ILE Cobol SQL precompiler.
System i development shops that use industry-standard data access methods such as JDBC, SQL Call Level Interface (CLI), and ADO.NET will also benefit from V6R1 because, along with other miscellaneous improvements, Java data access improves with the JDBC Version 4.0 enhancements. The DB2 for i5/OS .NET data provider has been enhanced to exploit version 2.0 of ADO.NET, as well as to provide support for distributed transactions and multi-row inserts.
IBM continues to see strong interest from software vendors wanting to port their solutions to i5/OS. These porting activities are driven by vendors that want access to the System i market and by existing System i customers who want the latest applications and tools to support i5/OS. To expedite these porting projects, IBM has enhanced the SQL functionality of DB2 for i5/OS in V6R1.
Both Microsoft SQL Server and MySQL tables include support for a hidden or automatic timestamp column that was difficult to emulate when porting to DB2 for i5/OS before V6R1. When a table in one of these database products includes a timestamp column, the database engine automatically updates that timestamp column each time a row is inserted or updated. This automatic update behavior gives applications an easy mechanism to determine whether a table row that was read earlier has changed before the applications attempt to update that same row in the table. To determine whether the row has changed, applications compare the value of the timestamp column when the row was read with the current value of the timestamp column in that same row of data. This timestamp comparison is central to applications that use an optimistic locking approach, and the comparison can now be easily implemented with the DB2 hidden timestamp column support available in V6R1. Figure 4a shows how a hidden timestamp column is defined and how it performs during typical I/O to the table.
The CREATE TABLE statement displays the new syntax for hiding a column with the IMPLICITLY HIDDEN clause on the ticket_ts column definition. This hidden clause indicates that the column will not be visible on any SQL statements unless it's explicitly referenced by name, as in Figure 4b, which contains the output for the first SELECT statement in Figure 4a (i.e., SELECT * FROM tickets). The ticket_ts column is not part of the result set because it was not explicitly referenced on the SELECT statement.
The FOR EACH ROW ON UPDATE AS ROW CHANGE TIMESTAMP clause signifies that DB2 will change the ticket_ts column value each time a row is inserted or updated in the table. Figure 4c contains the result of the final SELECT statement in Figure 4a, which explicitly includes the ticket_ts column along with the unhidden columns in the Tickets table. Notice that the value of the ticket_ts column has been assigned and changed automatically by DB2, even though the INSERT and UPDATE statements in Figure 4a contained no references to the ticket_ts column.
With the addition of several new data types, table definitions are also more portable to DB2 for i5/OS. The National Character data types (i.e., NCHAR, NVARCHAR, and NCLOB) provide a standard way to define a Unicode column with the UTF-16 encoding. The Decimal Float data type, DECFLOAT, is a new numeric data type that has been added for compatibility with the other DB2 products. DECFLOAT combines the attributes of decimal and float with extended accuracy.
Porting databases and applications that support the other DB2 family members is also easier with support for Full Outer Join syntax and a new feature that lets DB2 for i5/OS ignore unsupported syntax. Quite often, the SQL scripts for other DB2 databases contain low-level configuration syntax (e.g., tablespace options) on the SQL data definition language statements. This syntax prevented the scripts from executing on DB2 for i5/OS until the unsupported syntax was removed from the database scripts. As a result, DB2 for i5/OS in V6R1 will start to ignore the syntax and options that will never be supported on DB2 for i5/OS. The syntax can be ignored because the DB2 for i5/OS database engine and operating system automatically handle many low-level database administration tasks that require manual configuration with the other DB2 products. Figure 5 shows examples of statements with the ignored syntax highlighted. An SQL warning is returned to flag ignored syntax on DB2 for i5/OS. Porting is faster without the need to delete unnecessary syntax from database script files.
When porting applications to DB2 for i5/OS, you will frequently find that SQL CLI is the data access programming interface used by the application. The portability of these SQL CLI applications also increases in V6R1 with the addition of CLI wide-character API support. The CLI wide-character functions occur in applications that support Unicode or Double-byte data.
The SQL Query Engine (SQE), first introduced in V5R2, has been one of IBM's key initiatives for elevating SQL performance on DB2 for i5/OS to new heights. This performance advancement for SQL workloads continues in V6R1 with the elimination of two major roadblocks that prevented the usage of the SQE: functions relying on low-level translation function and National Language Sort Sequences (NLSS). The low-level translation capability is required by commonly used SQL built-in functions (BIFs), such as Upper and Lower. The NLSS support in i5/OS is frequently used in overseas markets in which the application requires that the character data is sorted in a manner that matches the local language instead of the default *HEX ordering, which closely mirrors the sorting of the English alphabet. With these key functions now supported by SQE, the most common items that prevent the usage of SQE in V6R1 are
To enable more use of the SQE, another change that IBM made in V6R1 is a new default value for the Ignore_Derived_Index QAQQINI parameter. This QAQQINI parameter was first added to V5R3 so that the SQE could be used in environments in which SQL statements reference DB2 objects created with DDS. Before V6R1, the default SQE optimizer behavior was to reroute execution of any SQL request to the Classic Query Engine (CQE) anytime a derived logical file was encountered during the query optimization process. You can override this default behavior by setting the Ignore_Derived_Index QAQQINI parameter to *YES. This value lets the SQE query optimizer ignore any keyed logical files that contain select/omit criteria or key field derivations it encounters during optimization. With V6R1, the default value of the Ignore_Derived_Index QAQQINI parameter is now *YES instead of the *NO value used on prior releases.
SQL Query Engine enhancements. Although eliminating barriers to SQE usage is important, the most interesting V6R1 enhancements are the new capabilities that leverage the extensible architecture of the SQE. A prime example is DB2 for i5/OS's first foray into self-learning query optimization. A self-learning query optimizer can analyze a poorly performing query plan, dynamically adjust its internal algorithms based on feedback, and select a better query plan on future executions. In V6R1, the SQE optimizer automatically analyzes poorly performing query plans to examine the I/O characteristics (as well as record retrieval patterns) and compare them with the values used during optimization of the query. If the query optimizer detects significant mismatches, the optimizer alters its assumptions and algorithms during the next execution of that SQL request to generate a better plan. To complement the learning optimizer, the DB2 runtime engine is also equipped with adaptive technologies that let the plan of a currently running query be modified on the fly to improve the efficiency of the query for the remainder of the run.
In addition to the self-learning enhancements, the SQE optimizer can also improve SQL performance with new indexing technologies. First, the SQE query optimizer can dramatically improve the processing speed of the Sum and Average aggregate functions with Encoded Vector Indexes (EVI) whenever an EVI exists over the columns referenced on the function call. The second key advancement in DB2 for i5/OS indexing technology is a feature called SQL derived indexes. In some ways, you can view this as the i5/OS SQL interface catching up with DDS by allowing SQL indexes to be created with expressions and selection criteria, just as i5/OS and OS/400 developers have been doing for years on keyed logical file definitions. Although that is partly true, SQL derived indexes take these capabilities even further with the ability to specify SQL BIFs as part of the key expression, as in the following code:
CREATE INDEX upper_cname ON customers(UPPER(company_name))
The Upper function mentioned in this section will probably be one of the more popular SQL BIFs used in SQL derived indexes. The Upper function is often used by application developers to implement case-insensitive searches, such as the following query:
SELECT customer_ID, customer_phone FROM customers WHERE UPPER(company_name)= 'ACME'
By forcing all company names to be converted to upper case, the query returns all occurrences of ACME, even if the data entry clerks entered that company name differently (e.g., Acme or acme).
Although this type of case-insensitive search had no functional issues on DB2 for i5/OS, performance problems could arise because the UPPER function prevented the query optimizer from using an index to speed up this search. The new SQL-derived index support allows the index in the key expression code to be created and the SQE query optimizer to use this index to quickly retrieve the customer data. It can do so because the key expression on the index exactly matches the column expression in the case-insensitive query.
Some customers have successfully used shared-weight sort sequences to solve the performance problems associated with case-insensitive searches. However, that approach requires more complex configuration when you create your database and applications. In addition, you can't use the sort sequences approach with more complex expressions, such as
WHERE UPPER(CONCAT(FirstName,LastName)) = 'ALBERTYOUNG'
Just about any SQL BIF can be included in the key definition of an SQL derived index. The following code is an example of an SQL derived index containing selection criteria.
CREATE INDEX cust_ix1 ON customers(customer_id) WHERE activefld='A'
This index is known as a sparse index because the index does not contain a key value for every row in the table. Although this sparse SQL index can be created on V6R1, the SQE query optimizer currently lacks the ability to use a sparse index in a query implementation. The SQE query optimizer can choose only SQL derived indexes with key expressions. In the future, IBM plans to enhance the SQE query optimizer with the ability to use sparse indexes.
SQL workloads on i5/OS should also run faster on V6R1 due to the streamlining of the SQL full open code path. Full open is the processing that DB2 has to perform the first and second time an SQL statement is run within a connection (or job) some tests have shown a 10 percent performance improvement. Applications using stored procedure calls will also receive a boost from improved caching of repeated stored procedure calls.
It can be challenging to tune the performance of DB2 for i5/OS when SQL requests are being submitted from remote clients such as browsers and .NET applications. A system administrator can easily detect i5/OS server jobs that are consuming extra resources because of a long-running SQL statement. However, it's quite difficult to trace back to determine which remote client or program submitted the SQL request. Thus, in V6R1 IBM introduces the following special registers that applications can use to ease this problem determination exercise:
CURRENT CLIENT_ACCTNG
CURRENT CLIENT_APPLNAME
CURRENT CLIENT_PROGRAMID
CURRENT CLIENT_USERID
CURRENT CLIENT_WRKSTNNAME
Application developers can build their applications to set these values programmatically with one of these interfaces:
After the application has set these registers, administrators can retrieve the settings by collecting database monitor data or by using the System i Navigator SQL Details for Jobs task by right-clicking the Database object in the navigation tree. Figure 6 shows output from the System i Navigator task. In prior releases, this Navigator task was known as Current SQL for a Job.
Examine the SQL Details for Jobs output in Figure 6 closely and you will notice that IBM has instrumented some of its own tools and drivers to assign values to these new client special registers. The UPDATE statement was run by user KMILL from the System i Navigator Run SQL Scripts interface.
The usability of the SQL Performance Monitors in System i Navigator took a major step forward in V5R4, and that continues with V6R1. The Analyze task for the SQL Monitors makes it easy to identify a set of SQL statements with poor performance. However, it was difficult to share those SQL statements with an administrator or performance analyst, because your sharing options were either a primitive copy-and-paste or telling the administrator or analyst how to run the same analysis report. V6R1 simplifies this operation with a new Save Results task (Figure 7) that lets you save the results directly in a wide variety of output formats. This feature not only lets you save the report data into a shareable format but also gives you the option to launch the PC application associated with this file type (Microsoft Excel in this case).
Improved data readability. Column customization is another V6R1 enhancement that makes System i Navigator output easier to read and analyze. The SQL Performance Monitor analysis reports are very thorough and return every piece of information that could be affecting DB2 performance. The downside of these complete reports is that the numerous columns of data returned can overwhelm you some reports return almost 100 data columns. In addition, the default ordering of the columns in the reports can slow your ability to find the most performance information. For example, some reports require the user to scroll two or three screens to the right to find the SQL statement text. The column customization support in V6R1 solves both of these problems by providing the ability to reorder the columns and remove columns from the output display. This customization is accomplished through the Select Columns window (Figure 8). This column customization is also available on other interfaces, such as the Plan Cache, Show Indexes, and Index Advisor. After an interface has been customized, the reporting tools remember those settings on future reports.
With V6R1, IBM also improved the data that the database monitor collects. One example of data that the monitor collects is the new client special registers (e.g., CURRENT CLIENT_APPLNAME) that I discussed previously. An even bigger enhancement is the monitor's capability to collect detailed data for Query/400 and OPNQRYF requests. With this capability, for the first time, you can use the System i Navigator Visual Explain tool and SQL Performance Monitor analysis reports to analyze and tune the performance of Query/400 and OPNQRYF requests.
The usability of the index analysis tools in System i Navigator is also improved with V6R1. Customers love the realtime index advice that the systemwide Index Advisor in V5R4 provides. However, the interface for creating indexes based on the advice has not been as well received. If the DB2 for i5/OS advisor suggested 10 different indexes that you wanted to create from that interface, you manually had to click each advised index to create the index. The V6R1 Index Advisor interface lets you quickly accomplish this task. Simply highlight a group of advised indexes and select Show SQL to generate all the CREATE INDEX statements with a single operation.
V6R1 also simplifies the index evaluation capabilities of the Show Indexes task when a group of DB2 objects is involved. The Show Indexes task has been a customer favorite since its introduction in V5R3; this tool lets you easily evaluate which indexes and logical files (LFs) are being used by DB2 for i5/OS, which index and LF objects are just wasting disk space, and the system resources used to maintain them. The drawback with this tool is that it could be executed only at a table level. For example, when required to perform index evaluation for all 50 tables (physical files) in a schema (library), you had to launch the Show Indexes task for each table level. Luckily, that's no longer the case with V6R1. You can execute the Show Indexes task at a schema level by right-clicking the Tables object in a schema and selecting this task (Figure 9).
The final System i Navigator enhancement to highlight is the V6R1 modifications to the Health Center. The Health Center has been enhanced to start reporting on DB2 runtime limits, such as the number of Large Object (LOB) locators or CLI handles per connection (or job).
System i Navigator is not the only IBM tool that simplifies the day-to-day work of i5/OS developers and administrators. The IBM Rational Data Architect (Version 7) provides improved data modeling support for DB2 for i5/OS databases with Fixpack 7.0.0.3, including enhancements for DDS-created databases.
System i developers and customers will also benefit from the multiple enhancements that IBM is planning to add to DB2 Web Query in 2008. First, IBM plans to add more flexible pricing terms to accommodate both end users and developers running reports. In addition, IBM is working on new interfaces to make it easier for developers to leverage and integrate DB2 Web Query reports into their applications. These improvements will include report scheduling and automated report distribution.
V6R1 lifts DB2 for i5/OS to new heights especially when you consider that this article offers only the DB2 V6R1 highlights and is not a complete list. From the great additions to SQL that enable rapid solution development to the sizzling DB2 performance gains, the DB2 for i5/OS V6R1 enhancements provide an excellent foundation for delivering business data in a way that lets your company stay ahead of the pack. So run, don't walk, to V6R1.
Kent Milligan is a Senior DB2 for i5/OS Specialist on IBM’s ISV Enablement team for System i. Kent spent the first eight years of his IBM career as a member of the DB2 development group in Rochester. He speaks and writes regularly about relational database topics. You can reach him at kmill@us.ibm.com.