Performance and Parameters

RPG

When I wrote the RPG xTools, I included a procedure that converts DB2/400 files to comma-separated values (CSV) format. Originally, I used this routine on a number of small files containing roughly several hundred to a few thousand records. It seemed to work quickly enough.

But then, one customer wanted to convert 46 million records. So I ran a small test over approximately 46,000 records to see how long it would take to convert to CSV. It seemed odd, but the tests correctly showed that it would take a few weeks to convert the file.

Of course, this situation was unacceptable. So I went back to the drawing board and did some performance studies and analysis. It turns out there were some issues with redundant logic and a few situations in which I was declaring rather large local variables that were also initialized each time a field in a record was converted.

Reengineering the CSV conversion routine wasn't all that challenging, but it was a lesson in optimization. Sadly, it seems that bad coding (coding for poor performance) is rather easy to do in RPG IV.

It's the Parameters, Stupid

It turns out that things we do in RPG IV to pass parameters to subprocedures can work against us. Certainly, there is some expected, yet subtle overhead in the call performance. But the increased overhead of adding CALLP (CALL with prototype) to a procedure should not make or break a routine.

In fact, back when IBM was trying to sell ILE (called "NPM" during development) to some industry analysts, one insider asked, "What's the benefit of reengineering everything just to try to make the CALL run faster? Why not just make the CALL faster?" IBM's answer was that languages other than RPG III and CL didn't run well in OS/400 and they had to create a single runtime environment for everything. The insider went on to ask, "Will the CALL be faster as a result?" The IBMer responded, "It better be, or somebody will get shot!"

Ironically, based on the V3R1 tests I ran, somebody in Rochester should have been "shot." But I digress....

Today, nearly everybody is using RPG IV, and many are trying to move to subprocedures. But there has been a somewhat quiet debate stirring regarding the performance of subroutines vs. subprocedures. As I mentioned in a recent article, EXSR is nothing more under the covers than a GOTO/branch instruction, whereas a procedure call requires several more instructions. Consequently, there is more overhead in a subprocedure call than in EXSR. No question about it.

But is it reasonable or excessive? It turns out it could be better, as it always can, but it is very acceptable. The problem involves the extremely poor way IBM has implemented return values and parameter passing to subprocedures.

While far faster than program-to-program calls, subprocedure calls can suffer if you pass parameters conveniently rather than efficiently.

I recently ran a series of timestamped tests to study the overhead of calling a procedure with various types of parameter settings. Each variation accomplished the same task, but the results of passing parameters one way as opposed to another were drastic. Let's look at the various styles of parameter passing.

Before we begin, let me say that passing parameters using the standard or default method of "by reference" is very efficient. In fact, it is the second most efficient method, according to my test results.

Constant Parameters

Parameters passed by reference that also include the CONST keyword are becoming more and more popular. When it comes to performance, the CONST keyword is a good performer, but it can also cause problems if used blindly.

A fast-growing practice with character parameters is to include the VARYING keyword along with the CONST keyword. This allows the parameter to accept both fixed-length and variable-length character fields. The compiler converts fixed-length fields (or literals) to variable-length fields automatically when the procedure is called. It does this by copying the data to a compiler-generated temporary variable.

This extra step adds overhead to the procedure call. Why? Because copying the parameter value from a fixed-length field to a temporary variable-length field takes a little time.

Here's an example of the CVTCASE procedure prototype using CONST but not using VARYING:

      **  CONST parm and fixed return value.
     D CvtCase         PR         65535A   
     D  InString                  65535A   Const 

Here is the same prototype with the addition of the VARYING keyword:

      **  Varying const parm and fixed return value.
     D CvtCase         PR         65535A   
     D  InString                  65535A   Const Varying

The benefit of having VARYING in addition to CONST is that the procedure can use the %LEN built-in function to determine the length of the incoming parameter value. Also, most RPG IV opcodes are optimized to work more efficiently with VARYING fields since they only process the data in the field as indicated by the field's current length.
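To make the cost of the temporary copy concrete, here is a minimal sketch of two calls against the VARYING version of the prototype. The field names fixedName, varName, and result are hypothetical, and the comments about which argument gets copied are an assumption based on the behavior described above, not something measured here.

      ** Hypothetical caller of the CONST VARYING prototype.
     D CvtCase         PR         65535A
     D  InString                  65535A   Const Varying
     D fixedName       S             50A
     D varName         S          65535A   Varying
     D result          S             50A
      ** Fixed-length argument: copied to a temporary VARYING field.
     C                   eval      result = CvtCase(fixedName)
      ** Argument already matches the parameter: no conversion needed.
     C                   eval      result = CvtCase(varName)

Both calls compile, and that is the convenience CONST buys. The difference shows up only when the call sits inside a loop that runs millions of times.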

So are CONST and VARYING bad? No, but if VARYING isn't necessary to the success of a procedure, don't use it.

The CONST keyword on the other hand, while offering you greater flexibility for your parameters, can not only speed up a procedure call, but also slow it down. For example, if you define a parameter as 7P2 (7 packed with 2 decimals) and include the CONST keyword, the compiler allows you to pass not only a 7P2 field, but also any numeric value. So if you specify a literal or even a 4-byte binary value, the compiler will convert that value into 7P2 and send it to the subprocedure.

Normally, this is great, but it can slow down the procedure call because everything except a 7P2 value is copied to a temporary result field, and then that copy is passed to the procedure. Again, additional overhead.

Granted, copying numbers isn't as severe an issue as character fields, particularly large character fields, but you get the point.
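The numeric case can be sketched the same way. AddTax and the fields price, qty, and total are hypothetical names invented for this illustration. The comments restate the conversion behavior described above and are not a measurement.

      ** Hypothetical prototype: AddTax is not from the original text.
     D AddTax          PR             7P 2
     D  amount                        7P 2 Const
     D price           S              7P 2
     D qty             S             10I 0
     D total           S              7P 2
      ** Exact 7P2 match: passed without a temporary copy.
     C                   eval      total = AddTax(price)
      ** Integer field and literal: converted to a 7P2 temporary first.
     C                   eval      total = AddTax(qty)
     C                   eval      total = AddTax(19.95)

If a procedure like this is called in a tight loop, declaring the caller's work fields to match the parameter definition avoids the conversion step.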

Regular "By Reference" Parameters

I've already mentioned that passing a parameter by reference with CONST can be a fast way to call a procedure. But this only applies when the value being passed already matches the parameter definition.

Traditional "by reference" parameters offer the second-best performance when calling a procedure because they limit the format of the data being specified for the procedure to the parameter definition. That is, a parameter defined as a 7P2 value is limited to accepting fields that are defined as 7P2. So if you attempt to pass a literal (constant) or a packed field with a different length or different decimal positions, the compiler will give you an error.

Character fields have a similar restriction. When a regular character parameter is defined by reference, you have to pass a value that is the same length as or longer than the parameter definition. The compiler does allow longer values because the procedure may ignore those extra characters, but it does not allow shorter character fields to be passed because the procedure could touch the bytes not passed to it.

Here's an example of a procedure prototype with a parameter passed by reference.

      ** Fixed-length parm passed by reference.
     D CvtCase         PR
     D  InString                  65535A   

Since the compiler will only allow the above prototype to be passed a field that is at least as long as the parameter, we're sort of at a disadvantage if we specify 65535 for the parameter length.

The solution is to use the OPTIONS keyword. Specifying OPTIONS(*VARSIZE) for a parameter removes the size restriction from the parameter. You may pass shorter or longer values. Of course, by doing this, you're telling the compiler that you have a clever way to figure out if the caller sent you data that isn't the same length as the parameter.

One method used by OS/400 and i5/OS APIs and some RPG xTools subprocedures is to pass an additional "parameter length" parameter. This additional parameter tells the procedure the length of the original parameter. Of course, the caller has to provide accurate information in this case; otherwise, you could have a "learning experience." Here's an example:


      ** Fixed-length by reference parm with 2nd "length" parm.
     D CvtCase         PR
     D  InString                  65535A   OPTIONS(*VARSIZE)
     D  nLength                      10I 0 CONST

Note that I used the CONST keyword on the length parameter to allow the caller to specify something useful, such as %SIZE(myField). CONST allows this type of value. Now, it's up to my procedure to interpret the length as well as the input string correctly.
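As a rough sketch of what "interpreting the length correctly" can look like inside the procedure, the body below uppercases only the bytes the caller says it passed. This is a hypothetical implementation written for illustration, not the actual RPG xTools code. The constants lower and upper are assumed names, and the translation covers only unaccented Latin letters.

      ** Hypothetical body for the prototype above (uppercase in place).
     P CvtCase         B                   Export
     D CvtCase         PI
     D  InString                  65535A   OPTIONS(*VARSIZE)
     D  nLength                      10I 0 CONST
     D lower           C                   'abcdefghijklmnopqrstuvwxyz'
     D upper           C                   'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
      ** Touch only the bytes the caller says it passed.
     C                   if        nLength > 0
     C                   eval      %subst(InString:1:nLength) =
     C                             %xlate(lower:upper:
     C                               %subst(InString:1:nLength))
     C                   endif
     C                   return
     P CvtCase         E

Every reference to InString goes through %SUBST with nLength, so the procedure never reads or writes beyond the storage the caller actually owns.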

What About Return Values?

As you know, a procedure may return a value to the caller. The first line of the prototype or procedure interface statement can include a data definition. This definition identifies the kind of data that is sent back to the caller, which allows you to use a procedure just like a function. For example, the following prototype defines a return value for the CVTCASE procedure:

      ** Illustrate return value
     D CvtCase         PR         65535A   
     D  InString                  65535A   OPTIONS(*VARSIZE)
     D  nLength                      10I 0 CONST

In this example, the return data is a 64K value. It doesn't have to match the parameter's attributes, but in this example, it does.

To call this procedure, you would use EVAL or one of the conditional opcodes, such as IF or WHEN, as follows:

     C                   eval      name = CvtCase(name:%size(name))

When you specify a return value, you may also specify the VARYING keyword. This allows the procedure to return only the number of bytes necessary and theoretically return them faster. However, in practice, character return values greater than a few dozen bytes seem to be poor performers.

Here's the bottom line on return values: Use them for an easy user interface, but don't go overboard. If you need to call a procedure 10 million times and it has a 64K return value, consider changing the design to allow the returned data to be sent back via a second parameter instead of a return value.
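One way that redesign might look is sketched here. The OutString parameter and the fields name and newName are assumptions made for this example. The idea is simply that the procedure writes its result into storage the caller supplies, so no 64K return value is involved.

      ** Hypothetical redesign: result comes back in OutString.
     D CvtCase         PR
     D  InString                  65535A   OPTIONS(*VARSIZE)
     D  OutString                 65535A   OPTIONS(*VARSIZE)
     D  nLength                      10I 0 CONST
     D name            S             50A
     D newName         S             50A
     C                   callp     CvtCase(name:newName:%size(name))

The trade-off is that the procedure can no longer be used inside an expression. The caller has to use CALLP and supply a result field of its own.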

Pointers Are Faster Than "By Reference"

Wouldn't it be nice if you could make things run faster? Well, you can. But you have to use pointers. It turns out, again from my own tests, that passing an address of a field to a procedure is faster than passing the field. Here's an example of a prototype that allows this:

     D CvtCase         PR
     D  pInput                         *   
     D  pOutput                        *   
     D  nLen                         10I 0 Const

The tricky part is calling the procedure; you have to pass the address of the parameter's value. You do this with the %ADDR built-in function. For example:

     D p1              S               *
     D p2              S               *
     C                   eval      p1 = %addr(name)
     C                   eval      p2 = %addr(newName)
     C                   callp     CvtCase(p1:p2:%size(name))

In the example above, the addresses of the two pieces of data are retrieved and stored in pointer variables. Then, those pointers are passed to the procedure.

You can avoid the tricky assignment (copying to pointer variables) if you specify that the parameters are also CONST or VALUE, as follows:

     D CvtCase         PR
     D  pInput                         *   Const
     D  pOutput                        *   Const
     D  nLen                         10I 0 Const

Then, when calling CVTCASE, you would specify %ADDR directly on the procedure call:

     C                   callp     CvtCase(%addr(name):
     C                              %addr(newname):%size(name))
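On the receiving side, the procedure has to map those addresses back onto data before it can use them. A minimal sketch, assuming BASED fields are used for the overlay, follows. The names pIn, pOut, inData, outData, lower, and upper are hypothetical, and the body is an illustration, not the RPG xTools implementation.

      ** Hypothetical body for the pointer-based prototype.
     P CvtCase         B                   Export
     D CvtCase         PI
     D  pInput                         *   Const
     D  pOutput                        *   Const
     D  nLen                         10I 0 Const
     D pIn             S               *
     D pOut            S               *
     D inData          S          65535A   Based(pIn)
     D outData         S          65535A   Based(pOut)
     D lower           C                   'abcdefghijklmnopqrstuvwxyz'
     D upper           C                   'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
      ** Overlay the caller's storage, then work on nLen bytes only.
     C                   eval      pIn = pInput
     C                   eval      pOut = pOutput
     C                   if        nLen > 0
     C                   eval      %subst(outData:1:nLen) =
     C                             %xlate(lower:upper:
     C                               %subst(inData:1:nLen))
     C                   endif
     C                   return
     P CvtCase         E

With pointers the compiler cannot check anything for you. If nLen is larger than the caller's fields, the procedure will overwrite whatever storage follows them.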

Again, my own tests showed that this method appears to run the fastest and to substantially reduce the overhead of procedure calls. In fact, the two fastest methods are by reference and pointer.

Make the Right Choice

The overall performance of any routine is going to be based on several factors, not the least of which is database I/O. But by understanding the options available to you and making the right choice of performance or function, you can eliminate some of the overhead introduced by using procedures over subroutines.

Bob Cozzi is a programmer/consultant, writer/author, and software developer of the RPG xTools, a popular add-on subprocedure library for RPG IV. His book The Modern RPG Language has been the most widely used RPG programming book for nearly two decades. He, along with others, speaks at and runs the highly popular RPG World conference for RPG programmers.
