A common task in data processing is to build an array of data that meets complex criteria, and RPG provides many ways to make that task easier.
I think arrays are fantastic. I love how they can be used to represent so many things: the lines in an order, the stores in a city, or the cards in a poker hand.
When SQL Just Isn’t Quite Enough
Nowadays, whenever I need to extract a subset of elements from the database, my natural inclination is to use SQL. The ability to flexibly define tables, columns, relationships, and selection criteria makes it the easiest way to get my data. But SQL isn’t a silver bullet; it has its limitations. The query processor, intelligent as it may be, can only guess at the best way to process the data. As the selection criteria get more complex and the size of the table gets larger, the SQL engine may choose paths that just take too long. And while you may be able to mitigate these issues, it is sometimes just easier to write the logic yourself in RPG. This is why I like to refer to RPG as “assembly language for the database.”
Here’s our example. I need to read through the transaction history file for a given day and build a list of lots. There’s a complex set of tests required to make sure the lot should be included, and once it passes those tests, I add it to the array. But there’s one additional little twist: the lots sometimes get a second transaction that clears out the lot. So what we need to do is simple: If the lot exists, clear it; otherwise, add it. Let’s get to it.
dcl-c C_MAXLOTS 1000;
dcl-s aLots like(THLOT) dim(C_MAXLOTS);
dcl-s #Lots int(5);
dcl-s i int(5);
Above are the basic definition statements for a simple array. I call it “simple” because the element is just a simple field as opposed to a data structure—in this case, defined to be like the THLOT field. I start by defining the size. I always use a constant to define the size of the array because it comes in handy whenever I need to test the index. In my programs, constants are uppercase and start with C_. The next two statements describe the array: aLots is the array itself, while #Lots is the current number of elements in the array (or in other words, the highest element index). Since arrays have a fixed size in RPG, I have to keep track of the number of elements in the array myself. I like the naming convention of using the prefix a for the array and # for the count. It’s easy for me to remember. I also define the index i, which is used as a work variable in the program.
clear #Lots;
setll (iDate) TRANHIST02;
dow (1=1);
  reade (iDate) TRANHIST02;
  if %eof(TRANHIST02);
    leave;
  endif;
This is the top of the loop. I’ve discussed this RPG design pattern in the past: initialize, position, loop forever, read next, and exit on EOF. In this case, the initialization is very simple: Just clear the highest index. This effectively empties the array. Positioning usually involves the SETLL opcode, which in my example just uses the date that was passed in. And then we have the forever loop, DOW (1=1). Two quibbles here: I wish there were just a simple DO opcode that wouldn’t require any forced syntax, and on a more technical note, I wish you could set a breakpoint on the actual DO instruction (if you try to set such a breakpoint, it actually sets the breakpoint on the first line after the DO opcode, which can be a little bit of a problem). Next is the read and exit code, which should be pretty easily understandable: a READE followed by a check of the %EOF BIF.
  if not valid();
    iter;
  endif;
These lines represent any additional validation that you might need. This might include more complex database navigation or other types of operations that don’t lend themselves to a pure SQL syntax. Purists may not like the ITER operation, but I find it to be a very powerful technique that allows me to effectively cancel the processing of a record as soon as I hit any condition that eliminates it. It’s then up to me to write the validation logic in the most efficient manner.
  i = %lookup(THLOT: aLots: 1: #Lots);
Here’s the meat of the program and the focus of the article. The %LOOKUP BIF has a number of variants, and the one shown here allows you to search for a value within just a subsection of a larger array. The way I use it here, I search only the beginning of the array, from position 1 to the highest occupied index. So even though there are 1000 elements in the array, if I’ve added only two entries, then #Lots will contain the value 2 and the %LOOKUP BIF will check only those first two elements.
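As a minimal, self-contained sketch of that four-parameter form: here char(10) stands in for the THLOT type, and the lot values are made up purely for illustration. The third element is populated on purpose to show that anything beyond #Lots is never examined.

```rpgle
**free
ctl-opt dftactgrp(*no);

dcl-c C_MAXLOTS 1000;
dcl-s aLots char(10) dim(C_MAXLOTS);   // assumption: char(10) in place of like(THLOT)
dcl-s #Lots int(5);
dcl-s i int(5);

aLots(1) = 'LOT-A';
aLots(2) = 'LOT-B';
aLots(3) = 'LOT-C';                    // present, but outside the occupied range
#Lots = 2;

// Search only elements 1 through #Lots
i = %lookup('LOT-B': aLots: 1: #Lots); // i = 2
i = %lookup('LOT-C': aLots: 1: #Lots); // i = 0: element 3 is not searched

*inlr = *on;
return;
```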
  if i > 0;
    clear aLots(i);
  else;
    #Lots += 1;
    aLots(#Lots) = THLOT;
  endif;
This is the business logic. If the entry is found, then this is the second transaction for the lot, the one that clears it out, so I clear that entry. Otherwise, I didn’t find the lot, so I increment the highest index and then populate that entry with the new lot number. From that point on, the sub-array that is searched includes the newly added lot.
  if #Lots >= C_MAXLOTS;
    leave;
  endif;
enddo;
This is the end of the loop, including the index check to make sure we don’t get array index errors. If we haven’t completely filled the array, we just jump back to the top of the loop and try to get another record. And that’s it! The next section of the program would then spin through the array and process the lots that have been selected.
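One way that follow-on pass might look is sketched below. It is an illustration, not the original program: it assumes the lot is a character field (so a cleared entry is blank), uses made-up lot values, and substitutes a DSPLY and a hypothetical #Active counter for the real processing. If THLOT were numeric, the test would compare against zero instead of *BLANKS.

```rpgle
**free
ctl-opt dftactgrp(*no);

dcl-c C_MAXLOTS 1000;
dcl-s aLots char(10) dim(C_MAXLOTS);   // assumption: THLOT is character
dcl-s #Lots int(5);
dcl-s #Active int(5);                  // hypothetical count of surviving lots
dcl-s i int(5);

// Sample contents: slot 2 was cleared by a reversing transaction
aLots(1) = 'LOT-A';
aLots(3) = 'LOT-C';
#Lots = 3;

for i = 1 to #Lots;
  if aLots(i) = *blanks;
    iter;                              // skip a lot that was cleared out
  endif;
  #Active += 1;
  dsply aLots(i);                      // placeholder for the real processing
endfor;

*inlr = *on;
return;
```

The same ITER idea from the read loop applies here: a cleared slot is rejected at the top of the loop, so the processing below it only ever sees live lots.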
Comments and Caveats
I know there are some obvious issues with this approach. The most obvious deficiency is that any lots over the maximum number are simply ignored. A more subtle problem has to do with the lots that are reversed; they leave blank spots in the array, and those blank spots use up index positions. A more robust approach would compress out the blank spots either as they occur or when the array is filled. Another option would be to dynamically resize the array when the limit is reached.
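A minimal sketch of the compress-when-full idea follows. It assumes a character lot field (char(10) stands in for THLOT), uses made-up sample data, and introduces a second work index j that is not in the original program. Live entries are slid down over the blanks in a single pass, and the vacated tail is then cleared.

```rpgle
**free
ctl-opt dftactgrp(*no);

dcl-c C_MAXLOTS 1000;
dcl-s aLots char(10) dim(C_MAXLOTS);   // assumption: THLOT is character
dcl-s #Lots int(5);
dcl-s i int(5);
dcl-s j int(5);                        // hypothetical: next slot to fill

// Sample contents: slot 2 was cleared by a reversing transaction
aLots(1) = 'LOT-A';
aLots(3) = 'LOT-C';
aLots(4) = 'LOT-D';
#Lots = 4;

// Pass 1: copy each live entry down to the next free slot
j = 0;
for i = 1 to #Lots;
  if aLots(i) <> *blanks;
    j += 1;
    if j < i;
      aLots(j) = aLots(i);
    endif;
  endif;
endfor;

// Pass 2: clear the slots that are no longer in use
for i = j + 1 to #Lots;
  clear aLots(i);
endfor;
#Lots = j;                             // now LOT-A, LOT-C, LOT-D with #Lots = 3

*inlr = *on;
return;
```

To compress as the blanks occur instead, one option is to copy the last occupied element into the slot being cleared and then decrement #Lots. That avoids the extra pass but changes the order of the lots, so it only fits if the later processing does not depend on that order.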
As a final note, this article outlines a purely native I/O approach. However, there are cases in which you can use SQL to provide a high-level selection and then use the validation routine to perform a finer-grained record selection. In such a hybrid architecture, initialization consists of opening an SQL cursor, while the read step is a FETCH NEXT. One of the most powerful features of RPG as it has evolved is the seamlessness of the integration between native RPG and SQL.
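The hybrid pattern might be sketched roughly as follows. Everything not in the native example is an assumption made for illustration: the table name TRANHIST, the date column THDATE, the cursor name cLots, the work field wLot, the char(10) lot type, the date type of iDate, and the stub body of valid(). The source would need to be compiled as an SQLRPGLE member against a table that actually has those columns.

```rpgle
**free
ctl-opt dftactgrp(*no);

dcl-c C_MAXLOTS 1000;
dcl-s aLots char(10) dim(C_MAXLOTS);   // assumption: THLOT is character
dcl-s #Lots int(5);
dcl-s i int(5);
dcl-s wLot char(10);                   // hypothetical host variable for the fetch
dcl-s iDate date inz(*sys);            // assumption: stands in for the passed-in date

// High-level selection in SQL (TRANHIST and THDATE are hypothetical names)
exec sql declare cLots cursor for
  select THLOT
    from TRANHIST
   where THDATE = :iDate;

// Initialize: empty the array and open the cursor
clear #Lots;
exec sql open cLots;

dow (1=1);
  // Read next: FETCH replaces the READE
  exec sql fetch next from cLots into :wLot;
  if sqlcode = 100 or sqlcode < 0;     // end of data, or an error
    leave;
  endif;

  // Finer-grained selection stays in RPG
  if not valid();
    iter;
  endif;

  // Same add-or-clear logic as the native loop
  i = %lookup(wLot: aLots: 1: #Lots);
  if i > 0;
    clear aLots(i);
  else;
    #Lots += 1;
    aLots(#Lots) = wLot;
  endif;

  if #Lots >= C_MAXLOTS;
    leave;
  endif;
enddo;

exec sql close cLots;
*inlr = *on;
return;

// Stub only: the real tests would go here
dcl-proc valid;
  dcl-pi *n ind end-pi;
  return *on;
end-proc;
```

The shape of the loop is unchanged from the native version; only the positioning and read steps are swapped for OPEN and FETCH, which is what makes the two styles easy to mix.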
Feel the Power
Arrays are very powerful. I hope you can use this article to help you take advantage of them.