Learn how to prepare for the new DB2 for i5/OS query engine.
With V6R1 DB2 for i5/OS, the transition of SQL workloads from the Classic Query Engine (CQE) to the new SQL Query Engine (SQE) is essentially complete. So far, the SQE functionality has been delivered in six waves:
- V5R2 GA
- Mid-V5R2 (Check APAR II13486)
- V5R3 GA
- Mid-V5R3
- V5R4 GA
- V6R1 GA
The first stage, shipped in V5R2, allowed a limited set of read-only queries to run in SQE. As the code hardened and matured, each new stage opened SQE to an ever-wider scope of SQL statements. V6R1 delivers a number of important enhancements so that almost all classes of SQL queries can now take advantage of this modern and extremely capable query execution workhorse.
The Need for SQE
The SQL Query Engine constitutes a complete redesign of the DB2 for i5/OS query engine. The main goal of this ambitious project was to create a new framework that would easily accommodate new advancements in database optimization technology. The two cornerstones of the new design are query rewrite capability and sophisticated cost-based optimization. A query rewrite can modify the original request into a more efficient equivalent format. The cost-based optimization determines the most efficient data access method for a given query. The redesign takes advantage of object-oriented technology. It uses a graph representation of a query, where each node is an independent and reusable component. You can think of these components as Lego blocks that can be used to construct robust execution plans for requests of any level of complexity. Thus, SQE shines especially in the area of complex query optimization and execution. Here's a list of the most important features of the new query engine:
- The major part of the query optimizer has been moved below the Machine Interface (MI) for more efficient processing.
- A new, separate component called Statistics Manager gathers and maintains database statistics that are vital for the cost-based optimization.
- The Index Advisor continuously generates, collects, and groups index advice based upon the queries being executed, thus allowing the database administrators to fine-tune the performance in real time.
- The SQE Plan Cache is an internal structure that caches query implementation plans (access plans) for queries optimized by the SQE Optimizer. It allows more reuse of existing plans for identical SQL statements, regardless of interface used to submit them.
- A range of powerful optimization and execution strategies minimizes a query's input/output signature and maximizes its performance. For example, the Look-ahead Predicate Generation (LPG) strategy uses the engine's query rewrite capabilities to generate local selection predicates where none were specified in the original request.
- The Query Dispatcher routes the query request to either SQE or CQE, depending on the type of the query.
The concepts covered in this section are illustrated in Figure 1.
Figure 1: The DB2 for i5/OS architecture
V6R1 SQE Under the Hood
In V6R1, most of the remaining limitations that would cause the Query Dispatcher to select CQE for query execution have been eliminated. There are just a couple of remaining query types that continue to be routed to CQE:
- Queries submitted through non-SQL interfaces such as QQQQry API, Query/400, and OPNQRYF
- SQL requests whose FROM clause references files created with the Create Logical File (CRTLF) command (i.e., logical files)
All other SQL requests are now processed through SQE. Let's have a closer look at the most important SQE enhancements delivered in V6R1.
CCSID/NLSS Translation Support
Do you use the *LANGIDSHR sort sequence setting for your jobs to force DB2 to perform case-insensitive string comparisons? Or maybe you use a country-specific sort sequence to sort character data based on your local character set? If you answered yes to either of these questions, then so far you have not been able to take advantage of the SQE capabilities. V6R1 adds robust code page and national language support to the new query engine. So, now you can use sort sequences and scalar functions such as UCASE and LCASE and still have your queries execute through SQE. Consequently, a larger number of applications will now execute a larger set of SQL requests through the new query engine. Therefore, it is imperative to have a basic understanding of SQE technology and to know how to prepare for it. This topic is tackled in the section "How to Prepare for SQE."
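To make this concrete, here is a minimal sketch of the kind of case-insensitive search that previously fell back to CQE and can now run through SQE. The EMPLOYEE table and its columns are made-up names used only for illustration:

-- With a shared-weight sort sequence (e.g., SRTSEQ(*LANGIDSHR)) or an explicit UCASE,
-- this comparison ignores case; in V6R1 such a query can still be optimized by SQE.
SELECT lastname, firstname
FROM employee
WHERE UCASE(lastname) = 'SMITH'
ORDER BY lastname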
Self-Learning and Adapting Optimizer
SQE continues to exploit its modern object-oriented architecture to further improve performance and efficiency. One of the most intriguing areas is its self-learning and self-tuning optimizer. In V6R1, the engine watches the performance signature of the running queries and can automatically rebuild an existing plan if it believes that the new plan would yield better performance. Currently, there are two such scenarios that may result in a query re-plan:
- Cold I/O to Warm I/O Mitigation: Cold I/O occurs when the DB2 runtime needs to read the table's data from disk devices into main storage. Conversely, warm I/O occurs when the data requested by the runtime already resides in main memory. The optimizer assumes a certain amount of paging for the tables referenced in the query. If the optimizer notices that the plan is not paging in data during execution, it concludes that the table must already reside in main memory. In this case, the optimizer will autonomically reoptimize the query under the assumption that the table will normally not incur paging. The newly created plan may not differ from the original, but the optimizer has adjusted its default costing assumption. The warm I/O assumption may make table scans look more attractive to the optimizer, since a table scan would not incur physical I/O when all the rows are essentially in memory already.
- First I/O to All I/O Mitigation: For access plans built with the First I/O optimization goal, the optimizer will attempt reoptimization if it encounters an access plan that continually reads all of the rows returned in the result set. Since this is All I/O behavior rather than First I/O behavior, the optimizer will re-plan the query with the All I/O optimization goal to determine if a better-performing plan can be built. This is equivalent to the SQL programmer changing the statement by adding OPTIMIZE FOR ALL ROWS.
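For comparison, here is a minimal sketch of how a programmer would request the All I/O goal explicitly with the OPTIMIZE FOR ALL ROWS clause, applied to a query of the shape used in the test described next:

-- Explicitly asking the optimizer to favor total throughput over first-row delivery
SELECT part, type
FROM part_dim
WHERE retailprice <= 1600
OPTIMIZE FOR ALL ROWS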
To illustrate these advanced capabilities, I ran a fairly simple test on a V6R1 system. Using an ODBC client application, I executed the following query:
select part, type from part_dim where retailprice <= 1600
The PART_DIM table contains 2 million rows. The search condition defined in the query (retailprice <= 1600) returns 1,193,413 rows, hence the query has low selectivity (it returns a large percentage of all rows). There is a radix tree index built over the PART_DIM table with the following definition:
CREATE INDEX PART_DIM_INX200
ON PART_DIM ( RETAILPRICE ASC , "PART" ASC , "TYPE" ASC ) ;
Note that the leading key in the above index matches the column referenced in the query's selection predicate (retailprice). The default optimization goal for queries submitted through ODBC is First I/O (i.e., a plan that is the fastest for identifying and returning the first 30 rows of the result set). Accordingly, for this query, the optimizer builds an access plan that is best for returning the first batch of rows to the client. The plan uses the PART_DIM_INX200 index to quickly navigate to the first handful of rows. The Visual Explain graph for this initial plan is shown in Figure 2.
Figure 2: The Visual Explain graph for the initial access plan
In Figure 2, the optimization goal reported by the optimizer is, as expected, First I/O.
I coded the ODBC client so that it retrieves all the rows from the result set, not just the first screenful. In other words, the client reads the data until the End Of File (EOF) marker is returned from the server. The fact that each execution of the query results in reaching the EOF is noticed by the DB2 engine, and the optimizer gets notified.
Now, if I execute the same query 10 times in the same job, the First I/O-to-All I/O mitigation logic kicks in and the optimizer reoptimizes the query with the optimization goal changed to All I/O. The new access plan changes the access method for the PART_DIM table from index probe to table scan. The new plan is shown in Figure 3.
Figure 3: Visual Explain for a new plan after First I/O mitigation
The new plan is, as expected, cheaper from the elapsed CPU time point of view and faster at delivering the entire result set back to the client. Visual Explain reports a run time of 1,401 ms for the initial plan and 628 ms for the mitigated (rebuilt) plan.
You may now wonder how a performance analyst would know that one of the mitigation algorithms did in fact kick in and for which queries. In V6R1, the detailed SQL performance monitor has been enhanced to collect these two new optimizer-initiated plan rebuilds. To be specific, in the database monitor traces, column QQC15 (Hard Close Reason Code) in record type 1000 for the operation code (QQC21) 'HC' (Hard Close) is set to 'A' (Refresh error). Additionally, the optimization record type 3006 reports the following:
3006 Reason Codes and Sub Codes

| Mitigation | Reason Code (QQRCOD) | Sub Code (hex(QQC21)) |
| --- | --- | --- |
| Cold I/O to Warm I/O Mitigation | B2 | 2 |
| First I/O to All I/O Mitigation | B2 | 3 |
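Assuming the monitor trace was collected into a file, say MYLIB.DBMONDATA (a placeholder name), a query along these lines lists the optimizer-initiated plan rebuilds; the QQJFLD value can then be used to correlate back to the 1000 records and the statement text:

-- Find optimizer-initiated plan rebuilds reported in the 3006 records
SELECT qqjfld, qqucnt, qqrcod, HEX(qqc21) AS subcode
FROM mylib.dbmondata
WHERE qqrid = 3006
AND qqrcod = 'B2'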
Consult the DB2 for i5/OS Performance and Query Optimization handbook available on the Infocenter Web site for a detailed description of the database monitor and the format of its traces.
Derived Key Index Support
On any platform, good database performance depends on good design. And good design includes a solid understanding of indexes and column statistics: the quantity to build, their structure and complexity, and their maintenance requirements.
DB2 for i5/OS provides two types of indexes: namely, radix trees and encoded vector indexes (EVIs). The optimizer can effectively use both types of indexes to create efficient access plans. However, until V6R1, you could only specify a list of actual key columns in an index definition. In certain cases, this limits the usefulness of indexes. Consider the following SQL statement:
SELECT ExtendedPrice*(1-Discount)*(1+Tax) AS FinalPrice FROM item_fact
WHERE Decimal(ExtendedPrice*(1-Discount)*(1+Tax),15,2) >= 110000
In the example shown above, the WHERE clause contains a search condition based on a derived expression. The optimizer may select an access plan that uses a table scan, where each row in the table is first read and then the expression is evaluated and compared to 110000. Maybe, to eliminate the table scan, you created an index over columns ExtendedPrice, Discount, and Tax. Now the optimizer may choose an index scan access method, where each index entry is read and the retrieved key values are used to evaluate the expression. For a table with many rows, both of these methods can be quite expensive. The sample query is actually very selective: it retrieves only 22 rows out of 6 million. So, the most efficient access method would be an index probe, where an index is used to quickly navigate to the matching key. But didn't I just claim that you cannot create indexes over expressions? Well, in V6R1 you can! Therefore, to speed up the query processing, I created the following derived key index:
CREATE INDEX ITEM_FACT_INX202 ON ITEM_FACT
( EXTENDEDPRICE*(1-DISCOUNT)*(1+TAX) ASC )
Since the index definition matches the expression in the WHERE clause predicate, it is in fact used by the optimizer to implement the index probe access method, as shown in Figure 4.
Figure 4: Derived Key Index used to implement index probe
The advantages of the derived key index become evident when we compare the elapsed CPU runtimes for the table scan access method (7234 ms = 7.234 sec) to the index probe access method (37 ms).
In addition to the local selection, the derived key indexes can now be chosen by the optimizer to implement other operations, such as join, grouping, and ordering. The index definition can contain both key derivations and non-derived key columns. Consider the following sample query:
SELECT Country, SUM(ExtendedPrice * (1 - Discount) * (1 + Tax)) AS FinalPrice FROM ITEM_FACT i, supp_dim s
WHERE i.suppkey = s.suppkey
AND s.continent = 'AMERICA'
AND i.year = 1997 AND (ExtendedPrice * (1 - Discount) * (1 + Tax)) >= 110000
GROUP BY country;
From the ITEM_FACT table access point of view, the query contains the equal local selection predicate i.year = 1997, the non-equal selection predicate (ExtendedPrice * (1 - Discount) * (1 + Tax)) >= 110000, and the join predicate i.suppkey = s.suppkey. According to the indexing best practices described in the "Indexing and statistics strategies for DB2 for i5/OS" white paper, a perfect index that can be used to implement both the local selection and the join should be defined as shown below:
CREATE INDEX ITEM_FACT_INX201 ON ITEM_FACT
( YEAR , SUPPKEY , EXTENDEDPRICE*(1-DISCOUNT)*(1+TAX))
The applicable best practice rule is as follows: Order the columns in an index, starting with equal predicates for local selection, followed by equal join predicates, followed by one non-equal predicate. The Visual Explain image shown in Figure 5 confirms that the perfect index is in fact used by the optimizer in the sample query access plan.
Figure 5: Derived Key Index used in join and local selection implementation
In both examples, I use radix tree indexes, but keep in mind that the new derived key index support is also applicable to EVIs. The optimizer will consider EVIs for local selection and join implementation. By their very nature, EVIs cannot be used to implement ordering (ORDER BY).
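As an illustration, here is a minimal sketch of a derived-key EVI (the index name is made up) that the optimizer could consider for local selection or join processing:

-- A derived-key encoded vector index over the same price expression
CREATE ENCODED VECTOR INDEX ITEM_FACT_EVI202 ON ITEM_FACT
( EXTENDEDPRICE*(1-DISCOUNT)*(1+TAX) ASC )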
When reviewing the V6R1 documentation, you'll notice that the derived index definition also supports the WHERE clause. In this case, a derived index would be similar in its characteristics to the select/omit logical files used by traditional (e.g., RPG, COBOL) applications. In other words, when you include the WHERE clause on the index definition, DB2 for i5/OS builds a sparse index, an index that only references rows that match the WHERE clause condition(s). Be advised that as of V6R1 the SQE optimizer will not take advantage of indexes defined this way. Currently, this type of SQL-created index is used only by the native database access interfaces.
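For illustration only, the sketch below defines such a sparse index (the index name is made up); remember that in V6R1 only the native interfaces, not the SQE optimizer, will use it:

-- A sparse index: only rows for YEAR = 1997 are represented in the index
CREATE INDEX ITEM_FACT_INX203 ON ITEM_FACT
( SUPPKEY ASC )
WHERE YEAR = 1997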
The additional consideration is the cost of maintaining the indexes. The more indexes built over a particular table, the more system resources (roughly linear correlation) required to maintain them. Tables with a large number of indexes also experience slower insert/update/delete performance. So, be prudent in your judgment on when an index is really necessary.
Other Functional and Performance Improvements
The complete list of all the performance and functional enhancements would go beyond the scope of this article and beyond your attention span, so let me just highlight a few that, from my vantage point, are the most important:
- User Defined Table Function (UDTF) support: A UDTF is a user-defined function that returns a table to the SQL statement in which it is invoked. A UDTF reference is valid only in the FROM clause of a SELECT statement. In V6R1, statements that reference UDTFs can be optimized and executed in SQE. The performance of such statements can be significantly improved because SQE can invoke UDTFs that were registered as NOT FENCED in the main thread of execution, whereas CQE always spawns a secondary thread for references to UDTFs (and UDFs, for that matter). Spawning a temporary secondary thread is a fairly costly operation that can be eliminated with the new SQE support (see the sketch after this list).
- Optimization time and full open improvements: The process of creating access plans has been streamlined by using the so-called "one pass tree costing" strategy. As a result, I observed faster optimization times, especially for complex joins. DB2 implements an access plan by creating internal structures that are used by the database runtime to execute a given request. These constructs are jointly referred to as open data path (ODP). You can think of an ODP as a pipe that connects physical data stored in data spaces with your application. The process of creating a new ODP requires a lot of system resources. In V6R1, the ODP creation code path has been scrubbed to make it more efficient. This should result in faster full open times.
- Feedback enhancements: Every time an SQL statement is executed on DB2, the database engine collects statistics that describe the performance signature of a given statement. This data can then be presented to the system administrator through one of several feedback mechanisms. Here's a short list of feedback enhancements:
- In addition to the list of perfect indexes that could be used to implement a given query, the Index Advisor will now also advise the "basic" indexes in case they are missing. An example of a "basic" index would be an index built only over join columns.
- The database monitor adds new filters for Query Governor and TCP/IP ports and also provides enhanced host variable and parameter marker value collection in the 3010 record format.
- Visual Explain adds a new option, Explain while running, that allows you to refresh the Visual Explain diagram while the query is executing so that you can watch its progress. This option will be invaluable for longer-running SQL statements.
- The Plan Cache adds an event monitor that records changes in the plan cache. The event monitor captures access plan entries as they are removed from the plan cache. By combining the event monitor output with a plan cache snapshot, you can gain a comprehensive view of the cache efficiency for an arbitrarily selected period of time.
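To illustrate the UDTF item from the list above, here is a minimal sketch of an SQL table function registered as NOT FENCED, followed by its invocation on a FROM clause. The function name and the column data types are assumptions made for the example; the PART_DIM table and its columns come from the earlier test:

-- Register an SQL table function; NOT FENCED lets SQE run it in the caller's thread
CREATE FUNCTION parts_under_price (p_max_price DECIMAL(15,2))
RETURNS TABLE (part VARCHAR(55), retailprice DECIMAL(15,2))
LANGUAGE SQL
READS SQL DATA
NOT FENCED
RETURN
SELECT part, retailprice
FROM part_dim
WHERE retailprice <= p_max_price;

-- A UDTF reference is valid only in a FROM clause:
SELECT t.part, t.retailprice
FROM TABLE(parts_under_price(1600)) AS t;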
How to Prepare for SQE
Running your SQL workload through SQE will in most cases significantly improve the performance. Generally, change is good, provided you are prepared for it. Before diving into V6R1, I recommend that you evaluate your system's performance, as well as the query and reporting mechanisms. This involves implementing a proper indexing strategy and evaluating the potential for proactive statistics collection. The "Indexing and statistics strategies for DB2 for i5/OS" white paper that I mentioned earlier is an excellent source of information on how SQE works under the covers. It also provides a field-proven methodology for index and statistics management. I also recommend that you collect system performance data (such as Performance Collection or PEX stats) and database monitor traces before you upgrade to the new release. This holds true for any upgrade (not just an upgrade to V6R1). Having the before and after system and database traces will allow you to validate the effectiveness of the indexing strategy and quantify the performance improvements as well as identify the areas that may require additional tuning.
Additional Material
The following publications can be helpful to those who want to learn more about the topics covered in this article:
"Indexing and statistics strategies for DB2 for i5/OS," IBM white paper
"i5/OS DBA: New Derived Key Indexes in DB2 for i5/OS V6R1," IBM Database Magazine
Preparing for and Tuning the SQL Query Engine on DB2 for i5/OS, IBM ITSO Redbook