Improve your application and system performance by eliminating the usage of QTEMP objects.
After recently returning from five years of working on IBM Watson solutions, I was hoping that IBM i developers had moved away from using QTEMP tables with SQL. Yet the first two customer projects that I worked on featured SQL performance issues caused by improper usage of QTEMP. Based on this recent experience, I thought it would be good to highlight the top three reasons why programmers should not mix SQL with QTEMP in application code.
#1: QTEMP Tables Force Data Copies
QTEMP tables are often used in SQL programs to create a smaller set of data to work on or to break a complex query into smaller, simpler steps. For example, an application that needs to join detail data with summarized data might first write the summary rows to a QTEMP table and then join to that table. While these are reasonable methods from a high-level programming perspective, they cause suboptimal performance because the first step is creating a copy of the data. It takes time and system resources to make a copy of the data in a QTEMP table.
These physical copies of the data can often be avoided by using SQL constructs to accomplish the same goal with logical data sets. Those SQL features include Views, Common Table Expressions, and Derived Tables. In the code below, an SQL View is used to logically summarize the average turnaround time for all the line items in an order. This logical aggregate is then joined to customer and order details to produce a report that can be analyzed for possible improvements to long turnaround times.
CREATE VIEW turnaround_time AS
SELECT orderID, AVG(days_to_receipt) AS avg_turnaround
FROM orders
GROUP BY orderID;

SELECT custName, o.orderID, avg_turnaround
FROM customers, orders o, turnaround_time tt
WHERE custID=ordCustID AND o.orderID=tt.orderID
ORDER BY avg_turnaround DESC;
Using a logical construct gives the Db2 query optimizer a chance to avoid copying the data. Even if the optimizer decides that making a copy of the data is the fastest method, the Db2 engine makes a copy of the data at a low level, which is faster and more efficient than a program populating a QTEMP table.
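The same report can also be sketched with a Common Table Expression, one of the other logical constructs mentioned earlier. This is only an illustration that reuses the table and column names from the view example; it assumes custName and custID belong to customers and ordCustID belongs to orders, which the article does not state.

```sql
-- Sketch: the summary step is a CTE, so no permanent view and no QTEMP copy is needed.
-- Assumption: custName/custID are columns of customers, ordCustID is a column of orders.
WITH turnaround_time AS (
  SELECT orderID, AVG(days_to_receipt) AS avg_turnaround
  FROM orders
  GROUP BY orderID
)
SELECT custName, o.orderID, avg_turnaround
FROM customers
JOIN orders o
  ON custID = ordCustID
JOIN turnaround_time tt
  ON o.orderID = tt.orderID
ORDER BY avg_turnaround DESC;
```

A CTE exists only for the duration of the statement, so it may suit a one-off report, while a view is the better fit when several programs share the same summary.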
#2: QTEMP Tables Handcuff the Query Optimizer
The optimizer depends heavily on the indexes (and keyed logical files) that exist over the tables referenced in an SQL request. Indexes provide the optimizer with statistics about your data (e.g., column selectivity) and provide a way to speed up query execution since they can be used to quickly sort or collate data. With that in mind, compare how many indexes exist over the permanent tables in your database versus the QTEMP tables used by your applications. That’s an easy comparison because QTEMP tables rarely have indexes defined over them. As a result, the query optimizer is limited when it comes to determining the best way to run a query against QTEMP tables.
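For the permanent side of that comparison, one possible starting point is a catalog query that counts the SQL indexes over each table in a schema. The sketch below assumes an application schema named APPLIB, which is a hypothetical name. It counts only indexes created with CREATE INDEX, so keyed logical files would have to be counted separately.

```sql
-- Sketch: number of SQL indexes over each permanent table in one schema.
-- Assumption: 'APPLIB' is a placeholder for your own application schema.
-- Keyed logical files are not included in this count.
SELECT t.TABLE_NAME,
       COUNT(i.INDEX_NAME) AS index_count
FROM QSYS2.SYSTABLES t
LEFT JOIN QSYS2.SYSINDEXES i
  ON  i.TABLE_SCHEMA = t.TABLE_SCHEMA
  AND i.TABLE_NAME   = t.TABLE_NAME
WHERE t.TABLE_SCHEMA = 'APPLIB'
  AND t.TABLE_TYPE IN ('T', 'P')   -- SQL tables and physical files
GROUP BY t.TABLE_NAME
ORDER BY index_count DESC;
```

The LEFT JOIN keeps tables that have no indexes at all, so they show up with a count of zero rather than disappearing from the result.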
#3: QTEMP Tables Inflate the SQL Plan Cache
After the query optimizer decides the fastest method to run your SQL, that information is stored in a data structure known as an access plan. That access plan gets stored in the SQL Plan Cache. To minimize the number of plans in the cache, Db2 tries to share plans for SQL statements running across different jobs and connections on the system. An access plan can be shared only if the SQL being executed exactly matches the original SQL request that was used to create the plan cache entry.
The access plan for the SQL request in the code below will never be shared across jobs. If there are 10 jobs all running this same SQL, each job will have to create its own entry in the SQL Plan Cache. While the name and definition of the temporary table are the same in each job, the object address of the temporary table is different because each job has its own unique instance of the QTEMP library.
SELECT col2 FROM qtemp/tempTab1 WHERE col1>5
If this SQL were changed to reference a permanent table, then there would be one plan cache entry instead of 10 separate plan cache entries. The “extra” QTEMP plan cache entries could cause plans for other SQL statements to be pruned from the cache or cause Db2 to increase the size of the SQL Plan Cache, thereby increasing the temporary storage usage on your server.
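As a rough illustration of that change, suppose the QTEMP table had been populated from a permanent table by a simple filter. The statement could then read the permanent table directly through a derived table. The names baseTab and status are hypothetical, as is the filter itself; the article does not say how tempTab1 is built.

```sql
-- Sketch: the QTEMP step is folded into a derived table over a permanent table.
-- Assumption: tempTab1 was a filtered copy of a permanent table (here called baseTab);
-- baseTab, status, and the value 'A' are placeholders for illustration only.
SELECT col2
FROM (SELECT col1, col2
      FROM baseTab
      WHERE status = 'A') AS filtered
WHERE col1 > 5;
```

Because every job now references the same permanent object with identical statement text, the plan is at least eligible to be shared instead of being rebuilt once per job.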
Making Good Choices
Hopefully, this article challenges you to review the usage of QTEMP in your SQL programs and to replace it with logical SQL alternatives in order to boost performance.
MC Press Online