Built-in functions (BIFs) are powerful on their own, but together they can perform some fantastic feats!
In a previous article, we touched on how we can use built-in functions (BIFs) in I/O operations, a subtle but potent result of the syntactical enhancements in RPG. Using BIFs (and other expressions) in I/O operations reduces code complexity by removing work variables. Another way to reduce code complexity is just to remove lines of code. This article provides an example of how combining BIFs can do just that.
The Problem with Data
The focus of this article is that old bugaboo: dirty data. No matter what you do, you can get dirty data. It might be an unprintable character in a description or a comma in a number or a dash in a zip code. Users like to put things in their data that make them a little more human readable, but unfortunately human readability doesn't always translate well to computers. When that happens, we find ourselves writing code to strip those characters out.
In the bad old days of RPG, this sort of cleanup involved MOVEA, a couple of indexes, and a lot of looping. It wasn't pretty, and the code was typically messy and prone to error. Let's take a simple example in which we want to remove the following characters: dashes, slashes, and embedded blanks. In this example, we're going to remove them entirely from the string. Another option would be to replace them with a specific character, which would require only a single BIF. However, I want to show you how we can chain two BIFs together, so rather than just replacing the offending characters, we're going to remove them entirely.
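For comparison, here is a minimal sketch of the single-BIF alternative mentioned above, assuming %xlate is the BIF in question and that the compiler supports free-form declarations (dcl-s). The variable names, the sample value, and the choice of '*' as the replacement character are hypothetical and only for illustration.

```rpgle
        // Hypothetical sample value and field names, for illustration only
        dcl-s raw    char(30) inz('ABC-DEF GHI/JKL');
        dcl-s masked char(30);

        // One BIF: each '/' and each '-' becomes '*'.
        // Nothing is removed, so the string keeps its length.
        masked = %xlate('/-' : '**' : raw);
        dsply masked;                 // ABC*DEF GHI*JKL

        *inlr = *on;
```

Replacing keeps every character position intact, which is why removal needs a second step.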
Let's start with the original RPG III code. You've probably done code like this a million times in the past.
E WK1 30 1
E WK2 30 1
C*
C *ENTRY PLIST
C PARM IFIELD 30
C*
C MOVEAIFIELD WK1
C Z-ADD*ZERO X 30
C Z-ADD*ZERO Y 30
C 1 DO 30 X
C WK1,X IFNE *BLANK
C WK1,X ANDNE'/'
C WK1,X ANDNE'-'
C ADD 1 Y
C MOVE WK1,X WK2,Y
C ENDIF
C ENDDO
C*
C MOVEAWK2 IFIELD
C IFIELD DSPLY
C MOVE *ON *INLR
The code is straightforward, although due to the simple opcodes and the columnar nature of RPG III, it's not necessarily obvious. The function of this program (and the programs that follow) is to strip dashes, slashes, and embedded blanks from the input parameter and return the result. For debugging purposes, we display the result before we exit. The code to do this employs an obsolete opcode and some basic array manipulation. The input field IFIELD is moved into the source array WK1 using MOVEA. The program then loops using two indexes: X is the source index, and Y is the target index. As the X index is incremented, each position in the source array is checked for one of the unwanted characters. If the character is not one of those, the target index is incremented and the character is moved into the target array WK2. After the loop is done, the target array is moved back into the input parameter to be passed back to the caller, and finally the cleaned-up string is displayed for debugging.
Note the use of the MOVEA opcode. Since this opcode is unavailable in RPG /free, you need to be a bit more creative to get it to work. Primarily, it involves a data structure and the OVERLAY keyword. This gets around the MOVEA limitation. You can find the V5R4-level code below, although I don't intend to go through it in much detail. In many ways, the code (especially the FOR loop) is just a slightly modernized version of the RPG III code. The one major change would be the data structure named "data"; it allows me to reference the same memory as both a 30-character field named myField and a 30-position array named aField. Other than that, the code is pretty self-explanatory.
d data ds
d myField 1 30
d aField 1 dim(30) overlay(data)
d aWork s 1 dim(30)
d x s 3u 0
d y s 3u 0
d DATASCRUB pr
d iField 30
d DATASCRUB pi
d iField 30
/free
  myField = iField;
  y = 0;
  clear aWork;
  for x = 1 to 30;
    if aField(x) <> ' '
    and aField(x) <> '/'
    and aField(x) <> '-';
      y += 1;
      aWork(y) = aField(x);
    endif;
  endfor;
  aField = aWork;
  iField = myField;
  dsply myField;
  *inlr = *on;
/end-free
You'll note that I specifically said V5R4; that's because I have a prototype for the mainline. At that release, the prototype is required, and its name even has to match the program name. I've always been uncomfortable with the idea that inside program MYPGM I have to have a line of code that has the literal value MYPGM in it; it's an opportunity for yet one more programming mistake. Thankfully, that particular restriction has been relaxed quite a bit in later releases. Combine that with the newer, more powerful BIFs, and the line count goes down dramatically. So let's finish up the article with the lean, mean V7.1 version of the program.
d pi
d iField 30
/free
iField = %scanrpl(' ':'':%xlate('/-':'  ':iField));
dsply iField;
*inlr = *on;
/end-free
This is much more like it. In the D-specs, the prototype is gone, leaving only the procedure interface, which takes the place of the *ENTRY PLIST in older code. As you can see, pretty much all of the D-specs are gone, because we don't need any array manipulation; we use the new BIF %scanrpl to do the lion's share of the work, with some help from the %xlate BIF. Let me take some time to walk you through that one single line of code.
First, we look at the %xlate BIF that takes up the right side of the line. That BIF takes three parameters (it also has an optional fourth parameter that we won't be using in this example). The last parameter is the string to translate, while the first two identify how to translate it. The first parameter is a list of characters that will be translated, and the second parameter defines what to translate each of those characters to. This is a one-to-one relationship: each character in the first string must have a matching position in the second string (be careful with this; if your second parameter doesn't have as many characters as the first, the extra characters in the first parameter are ignored). This is a very flexible design; we sometimes see this BIF used to translate lowercase letters to uppercase using 'abc…xyz' as the first parameter (with the ellipsis representing the other 20 letters) and 'ABC…XYZ' as the second. In this case, though, we're doing something a little different: we're converting the characters we want to remove to spaces, so the second parameter needs one space for each character in the first. The %xlate then returns the original string with all unneeded characters replaced with spaces.
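The positional pairing is easier to see in code. The following is a small sketch, assuming a compiler level that supports free-form declarations (dcl-s and dcl-c); the constant names, field names, and sample value are hypothetical.

```rpgle
        // Hypothetical names and sample data, for illustration only
        dcl-c LOWER 'abcdefghijklmnopqrstuvwxyz';
        dcl-c UPPER 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
        dcl-s sample char(30) inz('abc-def ghi/jkl');
        dcl-s result char(30);

        // Position n of LOWER maps to position n of UPPER
        result = %xlate(LOWER : UPPER : sample);
        dsply result;                 // ABC-DEF GHI/JKL

        // Two characters to translate, so two blanks in the second parameter
        result = %xlate('/-' : '  ' : sample);
        dsply result;                 // abc def ghi jkl

        // Pitfall: with only one blank in the second parameter, the '-'
        // has no partner and, per the rule described above, is ignored.
        // result = %xlate('/-' : ' ' : sample);

        *inlr = *on;
```

Counting the characters in both strings before compiling is a cheap way to avoid that last case.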
Now we see how chaining BIFs works: we take the result of the %xlate BIF and use it as the input parameter to the new %scanrpl BIF. %scanrpl takes the contents of the third parameter and replaces every instance of the first parameter with the second parameter. This can be anything from a single character to a whole phrase; it's a great way to perform substitutions of all kinds, including placeholders for formatted data. But in this case, we do something just a little clever: we replace every single blank with a zero-length string, which effectively removes the blank from the source string. Clever indeed!
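As an illustration of the placeholder substitution mentioned above, here is a hedged sketch. The placeholder tokens (&NAME, &ORD), the field names, and the values are all made up for this example, and free-form declarations are assumed.

```rpgle
        // Hypothetical message pattern and values, for illustration only
        dcl-s pattern varchar(50) inz('Dear &NAME, order &ORD shipped.');
        dcl-s msg     varchar(50);

        // Parameters: what to find : what to put in its place : where to look
        msg = %scanrpl('&NAME' : 'Pat' : pattern);

        // The result of one call can feed the next
        msg = %scanrpl('&ORD' : '1234' : msg);
        dsply msg;                    // Dear Pat, order 1234 shipped.

        *inlr = *on;
```

Unlike %xlate, the search and replacement values here differ in length, so the result can grow or shrink.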
So this single line of code does two things: first, it replaces all dashes and slashes with spaces; then it removes all spaces. This single line does all the work we did with our original array work, and thus "ABC-DEF GHI/JKL" comes out as "ABCDEFGHIJKL". No arrays, no indexes, no work variables (all of which are opportunities for programming mistakes). Perfect!
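To watch the two steps separately, the chained expression can be split apart. This is only a tracing sketch: the work field step1 is hypothetical and is exactly the kind of variable the one-line version avoids. Free-form declarations are assumed.

```rpgle
        // Tracing sketch only; step1 is a hypothetical work field
        dcl-s iField char(30) inz('ABC-DEF GHI/JKL');
        dcl-s step1  char(30);

        // Step 1: dashes and slashes become blanks
        step1 = %xlate('/-' : '  ' : iField);
        dsply step1;                  // ABC DEF GHI JKL

        // Step 2: every blank is replaced with a zero-length string;
        // the fixed-length target is then padded with blanks again
        iField = %scanrpl(' ' : '' : step1);
        dsply iField;                 // ABCDEFGHIJKL

        *inlr = *on;
```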
It's probably a good idea to take a little time to familiarize yourself with any new BIFs (or BIFs that are new to you!) and see if you can incorporate them into your algorithms. And keep coming back here for more BIFs! Happy programming!
MC Press Online