29
Fri, Nov
0 New Articles

Database Performance by the Numbers Feedback

Typography
  • Smaller Small Medium Big Bigger
  • Default Helvetica Segoe Georgia Times

My article “Database Performance by the Numbers” (in the January 2000 issue of MC) generated a lot of feedback. As I had hoped, many readers downloaded the code and ran the tests themselves and were kind enough to send me their results. This article addresses some of the criticisms raised about the benchmark and also gives a fresh perspective on the problems with AS/400 date and time stamp data types.

Your Test Table Is Too Small

I’ll admit that the WEBTEMP2 table used in the test did not have a lot of records, but that does not matter. I did the tests with 6,000 records, 12,000 records, 50,000 records, and 100,000 records; the records-per-second measurements were the same no matter how the table scaled. I was looking at the overhead of the programming environment and access method. Yes, the object was placed into memory, so I was looking at the overhead of the programming language mechanism and access method, not how fast the AS/400 read data from a disk drive. I did not care about disk I/O; I wanted to know how expensive the language and access method was.

RPG Can Do Calculated Fields Using OPNQRYF

Yes, that is true. However, Open Query File (OPNQRYF) does not support some of the calculations and scalar functions available in SQL. OPNQRYF is also a dead end: IBM will not allow access to views using new SQL features with OPNQRYF beyond what is available in V4R5. Also, in the test looking at the overhead of the access mechanism and programming language, calculated fields are going to be overhead of the database manger, not the programming language and access method. I would, however, like to test OPNQRYF against SQL in the future and welcome ideas from readers on the types of queries and functions they would like to see tested.

You Should Have Used an Access Path in Your Query

OK. But this was not a test of how fast the AS/400 can read data using an access path. This was a test of the overhead of a programming language and the access method used.


You Didn’t Read the REQTS Field in the RPG Program

One reader pointed out that the RPG SQL program was not reading the REQTS field into his result sets like all of the other programs in the benchmark. Yikes, this is true! I will be the first to admit that I am not an RPG programmer, but I should have caught that one. What’s funny is that several people at MC, IBM Rochester, and IBM Toronto looked at the code for this article, and no one caught that difference.

Both RPG programs were tasked to read the fields from the WEBTEMP2 file and load them into an array. Both programs were loading all fields except the REQTS time stamp field, so their numbers were not comparable to those of the other programs. So I modified the programs to retrieve the results: RPG using SQL, 24,377 records per second; RPG using record-level access (RLA), 13,244 records per second.

I found a lot of interesting things when I changed the program to grab the time stamp data into the array. The records-per-second times did fall, but SQL was still almost twice as fast as RLA. So I did additional tests in which I converted the time stamp field into a character representation of a time stamp. Guess what? SQL and RLA ran at almost the same speed. So there seems to be a problem with RLA reading time stamp fields, since it’s half the speed of SQL when doing so.

To explore these performance differences, I made several copies of the WEBTEMP2 table. In one copy, I changed the data type of the REQTS field from time stamp to char(27). In another copy, I changed the REQTS field from a time stamp to a date. Finally, I changed the REQTS field from a date into a char(10) column to hold the date value. I then made multiple copies of my SQL and RLA programs and modified them to read the various table copies, to test the performance of converting date and time stamp values. Figure 1 shows the results.

What’s interesting is that RLA is faster than SQL when the data is not a date or time stamp, and SQL is twice as fast as RLA when the field is a date or time stamp. Also, note that there is significant overhead associated with reading the time stamp and date columns in SQL. An SQL program reading a time stamp as a char(26) can read 50,000 records in 652 milliseconds. The same program reading the data as a time stamp field requires 1,910 milliseconds to do the same operation. It takes 2.9 times as long to read a time stamp as a character representation of a time stamp.

Now look at the record-level access times. Reading a char(26) representation of the time stamp, RLA reads 50,000 records in 487 milliseconds. It takes 7.5 times longer (using RLA) to read the same number of records if the data is stored in a time stamp column. This is “patheticsad,” which means it is both pathetic and sad at the same time.

Why does it take so much longer to read the time stamp and date columns? It has to be something in the way the AS/400 is converting the date and time stamp from the stored representation to the program representation of the data type. You see, the time stamp and date columns store the date portion as a 4-byte integer, called a Scaliger number. This type of date reckoning, called Julian Period, was invented by Joseph Justus Scaliger (1540-
1609). Scaliger named his dating scheme after his father, Julius, who was an Italian physician and scholar. Scaliger’s date system has day number one beginning at noon on January 1, 4713 B.C. This date represents when three major cycles began on the same day: the 28-year solar cycle, the 19-year lunar cycle, and the 15-year Roman tax calendar (called the indiction cycle).

Since the AS/400 uses the Scaliger dating system to store dates in the database, to format a Scaliger date into a string representation of the date requires several division operations. How well you write this conversion can significantly change the speed at which you can manipulate date and time stamp columns. Also affecting this is how well your machine can do integer arithmetic, since the AS/400 most likely stores this data as a 4-byte integer.

Figure 2 shows some simple programs I wrote to test the performance of packed, zoned, binary integer, and floating-point division on my AS/400 in RPG. Notice that there


is a huge difference between the division of floating-point numbers and decimal and binary integer numbers. To further complicate this, I did the test in C using a long, which is a 32- bit integer, and achieved a time of 636 milliseconds on 1,000,000 calculations. Then I tested casting the 32-bit integer into a double-precision number before performing the calculation, and the processor did the calculation in only 524 milliseconds. This points to two things. First, the RPG compiler is generating very slow integer division code. Second, even with the overhead of casting to floating point in the C program, the processor is very efficient for integer operations. With most chips, you would see a small performance penalty, not a gain, for the conversion I did in the C program. This points to the excellent architecture of the PowerPC chip and how well the C compiler exploits it. It also points that IBM has a lot of work to do on the RPG compiler and the way it handles division operations. This test might also be indicative of the problem in converting the time stamp data type in the RLA programs.

The only conclusion I can draw at this point is that I am not working with a processor limitation on the time required to format time stamp and date fields (as evidenced by the fine times turned in by the C programs and Java programs). I am working with non- optimized code generated by the RPG compiler. This needs to be addressed immediately by IBM. The PowerPC chip is an awesome chip; let’s get some of that performance into our RPG programs.

Another observation you might make from this information is that, even if you are not referencing a date or time stamp field in your code, if you access a file that contains these data types in its field collection, you still have to convert the time stamp and date data types once. This is a byproduct of reading the record using the RLA functions. RLA always buffers the entire record when reading, so conversion must be done at least once. You should consider SQL if you need to access large recordsets of information in a table that contains these data types, as SQL is faster at converting the time stamp and date columns; also, if your program is not accessing these columns in its read, you will avoid the overhead entirely.

Keep Those Letters Coming!

Some have complained that this is not a true benchmark, which is most definitely the case. I’m not IBM Rochester, and the folks at Rochester aren’t sending me any money to design the be-all end-all of performance tests. This is a simple set of programs I cooked up to test one aspect of the AS/400 and its operations: the overhead of the programming language and access method. I will be the first to admit that I am not an RPG programmer, nor a Java expert. I am a simple client/server C/VB/SQL/ODBC guy who wants to find the best way to do operations on the AS/400, because this is the question that I am asked by my customers and readers. This is the second in a series of articles on performance in which
I’m going to look for that information, and I’m going to put the code I use on the Web so that you can test it yourself and send in your tips, tricks, kudos, and admonitions. I don’t claim to have all of the answers, but I know that, if we continue to design simple tests and write them in multiple languages and put the code out to the midrange community, we can all find the answers—and have a lot of fun in the process.

REFERENCES AND RELATED MATERIALS

• The Julian Period information page: http://booksrv2.raleigh.ibm.com/cgi-bin/bookmgr/ bookmgr.exe/NOFRAMES/SCIENCE4.25.3


Method Data Type Seconds

RPG RLA time stamp 3.658 RPG RLA char(26) .487 RPG SQL time stamp 1.910 RPG SQL char(26) .652 RPG RLA date 1.777 RPG RLA char(10) .475 RPG SQL date .881 RPG SQL char(10) .648

Figure 1: These are the results of RPG RLA and SQL versus various data types, reading 50,000 records.

Type Compiler Seconds Ops/Second

Zoned RPG DIV 5.912 169,147 Packed RPG DIV 5.038 198,491 Integer RPG DIV 18.056 5,383 Float RPG DIV .406 2,463,052 Zoned RPG EVAL 37.502 26,665 Packed RPG EVAL 36.827 27,153 Integer RPG EVAL 46.453 21,527 Float RPG EVAL .405 2,469,133 Float C .136 7,352,934 Int C .636 1,572,352 Packed C .867 1,153,401 Float Java .374 2,673,794 Int Java .674 1,483,678

Figure 2: These are the results of RPG, C, and Java doing division with various data types 1,000,000 times.


BLOG COMMENTS POWERED BY DISQUS

LATEST COMMENTS

Support MC Press Online

$

Book Reviews

Resource Center

  • SB Profound WC 5536 Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application. You can find Part 1 here. In Part 2 of our free Node.js Webinar Series, Brian May teaches you the different tooling options available for writing code, debugging, and using Git for version control. Brian will briefly discuss the different tools available, and demonstrate his preferred setup for Node development on IBM i or any platform. Attend this webinar to learn:

  • SB Profound WP 5539More than ever, there is a demand for IT to deliver innovation. Your IBM i has been an essential part of your business operations for years. However, your organization may struggle to maintain the current system and implement new projects. The thousands of customers we've worked with and surveyed state that expectations regarding the digital footprint and vision of the company are not aligned with the current IT environment.

  • SB HelpSystems ROBOT Generic IBM announced the E1080 servers using the latest Power10 processor in September 2021. The most powerful processor from IBM to date, Power10 is designed to handle the demands of doing business in today’s high-tech atmosphere, including running cloud applications, supporting big data, and managing AI workloads. But what does Power10 mean for your data center? In this recorded webinar, IBMers Dan Sundt and Dylan Boday join IBM Power Champion Tom Huntington for a discussion on why Power10 technology is the right strategic investment if you run IBM i, AIX, or Linux. In this action-packed hour, Tom will share trends from the IBM i and AIX user communities while Dan and Dylan dive into the tech specs for key hardware, including:

  • Magic MarkTRY the one package that solves all your document design and printing challenges on all your platforms. Produce bar code labels, electronic forms, ad hoc reports, and RFID tags – without programming! MarkMagic is the only document design and print solution that combines report writing, WYSIWYG label and forms design, and conditional printing in one integrated product. Make sure your data survives when catastrophe hits. Request your trial now!  Request Now.

  • SB HelpSystems ROBOT GenericForms of ransomware has been around for over 30 years, and with more and more organizations suffering attacks each year, it continues to endure. What has made ransomware such a durable threat and what is the best way to combat it? In order to prevent ransomware, organizations must first understand how it works.

  • SB HelpSystems ROBOT GenericIT security is a top priority for businesses around the world, but most IBM i pros don’t know where to begin—and most cybersecurity experts don’t know IBM i. In this session, Robin Tatam explores the business impact of lax IBM i security, the top vulnerabilities putting IBM i at risk, and the steps you can take to protect your organization. If you’re looking to avoid unexpected downtime or corrupted data, you don’t want to miss this session.

  • SB HelpSystems ROBOT GenericCan you trust all of your users all of the time? A typical end user receives 16 malicious emails each month, but only 17 percent of these phishing campaigns are reported to IT. Once an attack is underway, most organizations won’t discover the breach until six months later. A staggering amount of damage can occur in that time. Despite these risks, 93 percent of organizations are leaving their IBM i systems vulnerable to cybercrime. In this on-demand webinar, IBM i security experts Robin Tatam and Sandi Moore will reveal:

  • FORTRA Disaster protection is vital to every business. Yet, it often consists of patched together procedures that are prone to error. From automatic backups to data encryption to media management, Robot automates the routine (yet often complex) tasks of iSeries backup and recovery, saving you time and money and making the process safer and more reliable. Automate your backups with the Robot Backup and Recovery Solution. Key features include:

  • FORTRAManaging messages on your IBM i can be more than a full-time job if you have to do it manually. Messages need a response and resources must be monitored—often over multiple systems and across platforms. How can you be sure you won’t miss important system events? Automate your message center with the Robot Message Management Solution. Key features include:

  • FORTRAThe thought of printing, distributing, and storing iSeries reports manually may reduce you to tears. Paper and labor costs associated with report generation can spiral out of control. Mountains of paper threaten to swamp your files. Robot automates report bursting, distribution, bundling, and archiving, and offers secure, selective online report viewing. Manage your reports with the Robot Report Management Solution. Key features include:

  • FORTRAFor over 30 years, Robot has been a leader in systems management for IBM i. With batch job creation and scheduling at its core, the Robot Job Scheduling Solution reduces the opportunity for human error and helps you maintain service levels, automating even the biggest, most complex runbooks. Manage your job schedule with the Robot Job Scheduling Solution. Key features include:

  • LANSA Business users want new applications now. Market and regulatory pressures require faster application updates and delivery into production. Your IBM i developers may be approaching retirement, and you see no sure way to fill their positions with experienced developers. In addition, you may be caught between maintaining your existing applications and the uncertainty of moving to something new.

  • LANSAWhen it comes to creating your business applications, there are hundreds of coding platforms and programming languages to choose from. These options range from very complex traditional programming languages to Low-Code platforms where sometimes no traditional coding experience is needed. Download our whitepaper, The Power of Writing Code in a Low-Code Solution, and:

  • LANSASupply Chain is becoming increasingly complex and unpredictable. From raw materials for manufacturing to food supply chains, the journey from source to production to delivery to consumers is marred with inefficiencies, manual processes, shortages, recalls, counterfeits, and scandals. In this webinar, we discuss how:

  • The MC Resource Centers bring you the widest selection of white papers, trial software, and on-demand webcasts for you to choose from. >> Review the list of White Papers, Trial Software or On-Demand Webcast at the MC Press Resource Center. >> Add the items to yru Cart and complet he checkout process and submit

  • Profound Logic Have you been wondering about Node.js? Our free Node.js Webinar Series takes you from total beginner to creating a fully-functional IBM i Node.js business application.

  • SB Profound WC 5536Join us for this hour-long webcast that will explore:

  • Fortra IT managers hoping to find new IBM i talent are discovering that the pool of experienced RPG programmers and operators or administrators with intimate knowledge of the operating system and the applications that run on it is small. This begs the question: How will you manage the platform that supports such a big part of your business? This guide offers strategies and software suggestions to help you plan IT staffing and resources and smooth the transition after your AS/400 talent retires. Read on to learn: