Learn how to prepare for the new DB2 for i5/OS query engine.
With V6R1 DB2 for i5/OS, the transition of SQL workloads from the Classic Query Engine (CQE) to the new SQL Query Engine (SQE) is essentially complete. So far, the SQE functionality has been delivered in six waves:
- V5R2 GA
- Mid-V5R2 (Check APAR II13486)
- V5R3 GA
- Mid-V5R3
- V5R4 GA
- V6R1 GA
The first stage, shipped in V5R2, allowed a limited set of read-only queries to run in SQE. As the code hardened and matured, each new stage opened SQE to an ever-wider scope of SQL statements. V6R1 delivers a number of important enhancements so that almost all classes of SQL queries can now take advantage of this modern and extremely capable query execution workhorse.
The Need for SQE
The SQL Query Engine constitutes a complete redesign of the DB2 for i5/OS query engine. The main goal of this ambitious project was to create a new framework that would easily accommodate new advancements in database optimization technology. The two cornerstones of the new design are query rewrite capability and sophisticated cost-based optimization. A query rewrite can modify the original request into a more efficient equivalent format. The cost-based optimization determines the most efficient data access method for a given query. The redesign takes advantage of object-oriented technology. It uses a graph representation of a query, where each node is an independent and reusable component. You can think of these components as Lego blocks that can be used to construct robust execution plans for requests of any level of complexity. Thus, SQE shines especially in the area of complex query optimization and execution. Here's a list of the most important features of the new query engine:
- The major part of the query optimizer has been moved below the Machine Interface (MI) for more efficient processing.
- A new, separate component called Statistics Manager gathers and maintains database statistics that are vital for the cost-based optimization.
- The Index Advisor continuously generates, collects, and groups index advice based upon the queries being executed, thus allowing the database administrators to fine-tune the performance in real time.
- The SQE Plan Cache is an internal structure that caches query implementation plans (access plans) for queries optimized by the SQE Optimizer. It allows more reuse of existing plans for identical SQL statements, regardless of interface used to submit them.
- A range of powerful optimization and execution strategies minimizes a query's input/output signature and maximizes its performance. For example, the Look-ahead Predicate Generation (LPG) strategy uses the engine's query rewrite capabilities to generate local selection predicates where none were specified in the original request.
- The Query Dispatcher routes the query request to either SQE or CQE, depending on the type of the query.
The concepts covered in this section are illustrated in Figure 1.
Figure 1: The DB2 for i5/OS architecture
V6R1 SQE Under the Hood
In V6R1, most of the remaining limitations that would cause the Query Dispatcher to select CQE for query execution have been eliminated. There are just a couple of remaining query types that continue to be routed to CQE:
- Queries submitted through non-SQL interfaces such as QQQQry API, Query/400, and OPNQRYF
- SQL requests whose FROM clause references files created with the Create Logical File (CRTLF) command (i.e., logical files)
All other SQL requests are now processed through SQE. Let's have a closer look at the most important SQE enhancements delivered in V6R1.
CCSID/NLSS Translation Support
Do you use the *LANGIDSHR sort sequence setting for your jobs to force DB2 to perform case-insensitive string comparisons? Or maybe you use a country-specific sort sequence to sort character data based on your local character set? If you answered yes to either of these questions, then so far you have not been able to take advantage of the SQE capabilities. V6R1 adds robust code page and national language support to the new query engine. So, now you can use sort sequences and scalar functions such as UCASE and LCASE and still have your queries execute through SQE. Consequently, a larger number of applications will now execute a larger set of SQL requests through the new query engine. Therefore, it is imperative to have a basic understanding of SQE technology and to know how to prepare for it. This topic is tackled in the section "How to Prepare for SQE."
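To make this concrete, here is a minimal sketch of the kind of case-insensitive search that previously fell back to CQE and can now run through SQE. The EMPLOYEE table and its columns are made-up names used only for illustration:

-- With a shared-weight sort sequence (e.g., SRTSEQ(*LANGIDSHR)) or an explicit UCASE,
-- this comparison ignores case; in V6R1 such a query can still be optimized by SQE.
SELECT lastname, firstname
FROM employee
WHERE UCASE(lastname) = 'SMITH'
ORDER BY lastname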
Self-Learning and Adapting Optimizer
SQE continues to exploit its modern object-oriented architecture to further improve performance and efficiency. One of the most intriguing areas is its self-learning and self-tuning optimizer. In V6R1, the engine watches the performance signature of the running queries and can automatically rebuild an existing plan if it believes that the new plan would yield better performance. Currently, there are two such scenarios that may result in a query re-plan:
- Cold I/O to Warm I/O Mitigation: Cold I/O occurs when the DB2 runtime needs to read the table's data from disk devices into main storage. Conversely, warm I/O occurs when the data requested by the runtime already resides in main memory. The optimizer assumes a certain amount of paging for the tables referenced in the query. If the optimizer notices that the plan is not paging in data during execution, it concludes that the table must already reside in main memory. In this case, the optimizer will autonomically reoptimize the query under the assumption that the table will normally not incur paging. The newly created plan may not differ from the original, but the optimizer has adjusted its default costing assumption. The warm I/O assumption may make table scans look more attractive to the optimizer, since a table scan would not incur physical I/O when all the rows are essentially in memory already.
- First I/O to All I/O Mitigation: For access plans built with the First I/O optimization goal, the optimizer will attempt reoptimization if it encounters an access plan that continually reads all of the rows returned in the result set. Since this is All I/O behavior rather than First I/O behavior, the optimizer will re-plan the query with the All I/O optimization goal to determine if a better-performing plan can be built. This is equivalent to the SQL programmer changing the statement by adding OPTIMIZE FOR ALL ROWS.
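For comparison, here is a minimal sketch of how a programmer would request the All I/O goal explicitly with the OPTIMIZE FOR ALL ROWS clause, applied to a query of the shape used in the test described next:

-- Explicitly asking the optimizer to favor total throughput over first-row delivery
SELECT part, type
FROM part_dim
WHERE retailprice <= 1600
OPTIMIZE FOR ALL ROWS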
To illustrate these advanced capabilities, I ran a fairly simple test on a V6R1 system. Using an ODBC client application, I executed the following query:
select part, type from part_dim where retailprice <= 1600
The PART_DIM table contains 2 million rows. The search condition defined in the query (retailprice <= 1600) returns 1,193,413 rows, hence the query has low selectivity (it returns a large percentage of all rows). There is a radix tree index built over the PART_DIM table with the following definition:
CREATE INDEX PART_DIM_INX200
ON PART_DIM ( RETAILPRICE ASC , "PART" ASC , "TYPE" ASC ) ;
Note that the leading key in the above index matches the column referenced in the query's selection predicate (retailprice). The default optimization goal for queries submitted through ODBC is First I/O (i.e., a plan that is the fastest for identifying and returning the first 30 rows of the result set). Accordingly, for this query, the optimizer builds an access plan that is best for returning the first batch of rows to the client. The plan uses the PART_DIM_INX200 index to quickly navigate to the first handful of rows. The Visual Explain graph for this initial plan is shown in Figure 2.
Figure 2: The Visual Explain graph for the initial access plan
In Figure 2, the optimization goal reported by the optimizer is, as expected, First I/O.
I coded the ODBC client so that it retrieves all the rows from the result set, not just the first screenful. In other words, the client reads the data until the End Of File (EOF) marker is returned from the server. The fact that each execution of the query results in reaching the EOF is noticed by the DB2 engine, and the optimizer gets notified.
Now, if I execute the same query 10 times in the same job, the First I/O-to-All I/O mitigation logic kicks in and the optimizer reoptimizes the query with the optimization goal changed to All I/O. The new access plan changes the access method for the PART_DIM table from index probe to table scan. The new plan is shown in Figure 3.
Figure 3: Visual Explain for a new plan after First I/O mitigation
The new plan is, as expected, cheaper from the elapsed CPU time point of view and faster at delivering the entire result set back to the client. Visual Explain reports a run time of 1,401 ms for the initial plan and 628 ms for the mitigated (rebuilt) plan.
You may now wonder how a performance analyst would know that one of the mitigation algorithms did in fact kick in and for which queries. In V6R1, the detailed SQL performance monitor has been enhanced to collect these two new optimizer-initiated plan rebuilds. To be specific, in the database monitor traces, column QQC15 (Hard Close Reason Code) in record type 1000 for the operation code (QQC21) 'HC' (Hard Close) is set to 'A' (Refresh error). Additionally, the optimization record type 3006 reports the following:
3006 Reason Codes and Sub Codes

| Mitigation | Reason Code (QQRCOD) | Sub Code (hex(QQC21)) |
| --- | --- | --- |
| Cold I/O to Warm I/O Mitigation | B2 | 2 |
| First I/O to All I/O Mitigation | B2 | 3 |
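Assuming the monitor trace was collected into a file, say MYLIB.DBMONDATA (a placeholder name), a query along these lines lists the optimizer-initiated plan rebuilds; the QQJFLD value can then be used to correlate back to the 1000 records and the statement text:

-- Find optimizer-initiated plan rebuilds reported in the 3006 records
SELECT qqjfld, qqucnt, qqrcod, HEX(qqc21) AS subcode
FROM mylib.dbmondata
WHERE qqrid = 3006
AND qqrcod = 'B2'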
Consult the DB2 for i5/OS Performance and Query Optimization handbook available on the Infocenter Web site for a detailed description of the database monitor and the format of its traces.
Derived Key Index Support
On any platform, good database performance depends on good design. And good design includes a solid understanding of indexes and column statistics: the quantity to build, their structure and complexity, and their maintenance requirements.
DB2 for i5/OS provides two types of indexes: namely, radix trees and encoded vector indexes (EVIs). The optimizer can effectively use both types of indexes to create efficient access plans. However, until V6R1, you could only specify a list of actual key columns in an index definition. In certain cases, this limits the usefulness of indexes. Consider the following SQL statement:
SELECT ExtendedPrice*(1-Discount)*(1+Tax) AS FinalPrice FROM item_fact
WHERE Decimal(ExtendedPrice*(1-Discount)*(1+Tax),15,2) >= 110000
In the example shown above, the WHERE clause contains a search condition based on a derived expression. The optimizer may select an access plan that uses a table scan, where each row in the table is first read and then the expression is evaluated and compared to 110000. Maybe, to eliminate the table scan, you created an index over columns ExtendedPrice, Discount, and Tax. Now the optimizer may choose an index scan access method, where each index entry is read and the retrieved key values are used to evaluate the expression. For a table with many rows, both of these methods can be quite expensive. The sample query is actually very selective: it retrieves only 22 rows out of 6 million. So, the most efficient access method would be an index probe, where an index is used to quickly navigate to the matching key. But didn't I just claim that you cannot create indexes over expressions? Well, in V6R1 you can! Therefore, to speed up the query processing, I created the following derived key index:
CREATE INDEX ITEM_FACT_INX202 ON ITEM_FACT
( EXTENDEDPRICE*(1-DISCOUNT)*(1+TAX) ASC )
Since the index definition matches the expression in the WHERE clause predicate, it is in fact used by the optimizer to implement the index probe access method, as shown in Figure 4.
Figure 4: Derived Key Index used to implement index probe
The advantages of the derived key index become evident when we compare the elapsed CPU runtimes for the table scan access method (7234 ms = 7.234 sec) to the index probe access method (37 ms).
In addition to the local selection, the derived key indexes can now be chosen by the optimizer to implement other operations, such as join, grouping, and ordering. The index definition can contain both key derivations and non-derived key columns. Consider the following sample query:
SELECT Country, SUM(ExtendedPrice * (1 - Discount) * (1 + Tax)) AS FinalPrice FROM ITEM_FACT i, supp_dim s
WHERE i.suppkey = s.suppkey
AND s.continent = 'AMERICA'
AND i.year = 1997 AND (ExtendedPrice * (1 - Discount) * (1 + Tax)) >= 110000
GROUP BY country;
From the ITEM_FACT table access point of view, the query contains the equal local selection predicate i.year = 1997, the non-equal selection predicate (ExtendedPrice * (1 - Discount) * (1 + Tax)) >= 110000, and the join predicate i.suppkey = s.suppkey. According to the indexing best practices described in the "Indexing and statistics strategies for DB2 for i5/OS" white paper, a perfect index that can be used to implement both the local selection and the join should be defined as shown below:
CREATE INDEX ITEM_FACT_INX201 ON ITEM_FACT
( YEAR , SUPPKEY , EXTENDEDPRICE*(1-DISCOUNT)*(1+TAX))
The applicable best practice rule is as follows: Order the columns in an index, starting with equal predicates for local selection, followed by equal join predicates, followed by one non-equal predicate. The Visual Explain image shown in Figure 5 confirms that the perfect index is in fact used by the optimizer in the sample query access plan.
Figure 5: Derived Key Index used in join and local selection implementation
In both examples, I use radix tree indexes, but keep in mind that the new derived key index support is also applicable to EVIs. The optimizer will consider EVIs for local selection and join implementation. By their very nature, EVIs cannot be used to implement ordering (ORDER BY).
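As an illustration, here is a minimal sketch of a derived-key EVI (the index name is made up) that the optimizer could consider for local selection or join processing:

-- A derived-key encoded vector index over the same price expression
CREATE ENCODED VECTOR INDEX ITEM_FACT_EVI202 ON ITEM_FACT
( EXTENDEDPRICE*(1-DISCOUNT)*(1+TAX) ASC )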
When reviewing the V6R1 documentation, you'll notice that the derived index definition also supports the WHERE clause. In this case, a derived index would be similar in its characteristics to the select/omit logical files used by traditional (e.g., RPG, COBOL) applications. In other words, when you include the WHERE clause on the index definition, DB2 for i5/OS builds a sparse index, an index that only references rows that match the WHERE clause condition(s). Be advised that as of V6R1 the SQE optimizer will not take advantage of indexes defined this way. Currently, this type of SQL-created index is used only by the native database access interfaces.
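For illustration only, the sketch below defines such a sparse index (the index name is made up); remember that in V6R1 only the native interfaces, not the SQE optimizer, will use it:

-- A sparse index: only rows for YEAR = 1997 are represented in the index
CREATE INDEX ITEM_FACT_INX203 ON ITEM_FACT
( SUPPKEY ASC )
WHERE YEAR = 1997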
The additional consideration is the cost of maintaining the indexes. The more indexes built over a particular table, the more system resources (roughly linear correlation) required to maintain them. Tables with a large number of indexes also experience slower insert/update/delete performance. So, be prudent in your judgment on when an index is really necessary.
Other Functional and Performance Improvements
The complete list of all the performance and functional enhancements would go beyond the scope of this article and beyond your attention span, so let me just highlight a few that, from my vantage point, are the most important:
- User Defined Table Function (UDTF) support: A UDTF is a user-defined function that returns a table to the SQL statement in which it is invoked. A UDTF reference is valid only in the FROM clause of a SELECT statement. In V6R1, statements that reference UDTFs can be optimized and executed in SQE. The performance of such statements can be significantly improved because SQE can invoke UDTFs that were registered as NOT FENCED in the main thread of execution, whereas CQE always spawns a secondary thread for references to UDTFs (and UDFs, for that matter). Spawning a temporary secondary thread is a fairly costly operation that can be eliminated with the new SQE support (see the sketch after this list).
- Optimization time and full open improvements: The process of creating access plans has been streamlined by using the so-called "one pass tree costing" strategy. As a result, I observed faster optimization times, especially for complex joins. DB2 implements an access plan by creating internal structures that are used by the database runtime to execute a given request. These constructs are jointly referred to as open data path (ODP). You can think of an ODP as a pipe that connects physical data stored in data spaces with your application. The process of creating a new ODP requires a lot of system resources. In V6R1, the ODP creation code path has been scrubbed to make it more efficient. This should result in faster full open times.
- Feedback enhancements: Every time an SQL statement is executed on DB2, the database engine collects statistics that describe the performance signature of a given statement. This data can then be presented to the system administrator through one of several feedback mechanisms. Here's a short list of feedback enhancements:
- In addition to the list of perfect indexes that could be used to implement a given query, the Index Advisor will now also advise the "basic" indexes in case they are missing. An example of a "basic" index would be an index built only over join columns.
- The database monitor adds new filters for Query Governor and TCP/IP ports and also provides enhanced host variable and parameter marker value collection in the 3010 record format.
- Visual Explain adds a new option, Explain while running, that allows you to refresh the Visual Explain diagram while the query is executing so that you can watch its progress. This option will be invaluable for longer-running SQL statements.
- The Plan Cache adds an event monitor that records changes in the plan cache. The event monitor captures access plan entries as they are removed from the plan cache. By combining the event monitor output with a plan cache snapshot, you can gain a comprehensive view of the cache efficiency for an arbitrarily selected period of time.
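To illustrate the UDTF item from the list above, here is a minimal sketch of an SQL table function registered as NOT FENCED, followed by its invocation on a FROM clause. The function name and the column data types are assumptions made for the example; the PART_DIM table and its columns come from the earlier test:

-- Register an SQL table function; NOT FENCED lets SQE run it in the caller's thread
CREATE FUNCTION parts_under_price (p_max_price DECIMAL(15,2))
RETURNS TABLE (part VARCHAR(55), retailprice DECIMAL(15,2))
LANGUAGE SQL
READS SQL DATA
NOT FENCED
RETURN
SELECT part, retailprice
FROM part_dim
WHERE retailprice <= p_max_price;

-- A UDTF reference is valid only in a FROM clause:
SELECT t.part, t.retailprice
FROM TABLE(parts_under_price(1600)) AS t;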
How to Prepare for SQE
Running your SQL workload through SQE will in most cases significantly improve the performance. Generally, change is good, provided you are prepared for it. Before diving into V6R1, I recommend that you evaluate your system's performance, as well as the query and reporting mechanisms. This involves implementing a proper indexing strategy and evaluating the potential for proactive statistics collection. The "Indexing and statistics strategies for DB2 for i5/OS" white paper that I mentioned earlier is an excellent source of information on how SQE works under the covers. It also provides a field-proven methodology for index and statistics management. I also recommend that you collect system performance data (such as Performance Collection or PEX stats) and database monitor traces before you upgrade to the new release. This holds true for any upgrade (not just an upgrade to V6R1). Having the before and after system and database traces will allow you to validate the effectiveness of the indexing strategy and quantify the performance improvements as well as identify the areas that may require additional tuning.
Additional Material
The following publications can be helpful to those who want to learn more about the topics covered in this article:
"Indexing and statistics strategies for DB2 for i5/OS," IBM white paper
"i5/OS DBA: New Derived Key Indexes in DB2 for i5/OS V6R1," IBM Database Magazine
Preparing for and Tuning the SQL Query Engine on DB2 for i5/OS, IBM ITSO Redbook