MONITOR takes RPG to the next level of error detection, and this article introduces one of the most useful aspects of this new opcode.
RPG has come a long way since the H1 halt indicator. Back in the day, you issued a halt when a program encountered a completely unexpected condition. These were usually application errors such as a master record not found or a divide by zero, although fatal errors could also include programs not found or files full. The common approach today is to use the program status subroutine (PSSR) to handle unexpected errors, but this article will show you an even cleaner and easier way: the MONITOR opcode.
Using PSSR and INFSR
Using the PSSR is simple enough: you declare a subroutine named *PSSR (the asterisk is required), and every unhandled program-level error triggers a call to that subroutine. Note that I said "unhandled"; an error handled in the code by an error indicator or by an error extender on the operation code (e.g., CALLP(E)) does not transfer control to the PSSR. Any other program error does.
Note that file errors do not automatically go to the *PSSR routine. Instead, you must specify a file-error subroutine (INFSR) for each file on its file specification. To consolidate error processing, you may see code in which the program status subroutine also handles file errors; in that case, the file's errors are routed to it with the following keyword on the file specification: INFSR(*PSSR).
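To make the handled/unhandled distinction concrete, here is a minimal sketch in free-form RPG. The program name OPTSTEP and the variable names are made up for illustration; only the *PSSR name and the (E) extender come from the language itself.

```rpgle
**free
// OPTSTEP is a hypothetical program name used only for this sketch.
dcl-pr optionalStep extpgm('OPTSTEP');
end-pr;

dcl-s total    packed(9:2) inz(100);
dcl-s quantity packed(5:0) inz(0);
dcl-s average  packed(9:2);

// Handled: the (E) extender traps a failed call, so *PSSR is not invoked.
callp(e) optionalStep();
if %error;
  // decide locally what a failed optional step means
endif;

// Unhandled: the assignment has no error extender, so the
// divide by zero passes control to *PSSR.
average = total / quantity;

*inlr = *on;
return;

begsr *pssr;
  dump(a);       // the (A) extender dumps even without DEBUG on the control spec
  *inlr = *on;
  return;        // end the program instead of resuming after the error
endsr;
```

Ending the program from inside the subroutine is only one possible policy; it is used here to keep the sketch short.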
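In free-form RPG, that routing might look like the sketch below. The file name ORDER is borrowed from the example later in this article; the keyed, input-only access is an assumption made purely for illustration.

```rpgle
**free
// Sketch only: ORDER and its access method are assumptions.
// INFSR(*PSSR) sends this file's unhandled errors to the *PSSR subroutine.
dcl-f ORDER disk usage(*input) keyed infsr(*pssr);

read ORDER;      // no (E) extender: an I/O failure here is routed to *PSSR
*inlr = *on;
return;

begsr *pssr;
  dump(a);       // dump regardless of the DEBUG setting
  *inlr = *on;
  return;
endsr;
```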
For me, the biggest weakness in the PSSR concept is its poor handling of nested errors. If an error occurs inside a PSSR, the PSSR is invoked again, recursively. So unless you have code in place to detect the nested failure, your program will go into a hard loop. It doesn't help that this approach also requires non-intuitive and somewhat inconsistent syntax.
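The kind of guard code I mean can be as small as a flag. The fragment below is one possible shape, not a standard idiom; the variable name pssrActive is my own invention.

```rpgle
// Fragment: belongs in a program that already has its other definitions.
dcl-s pssrActive ind inz(*off);   // hypothetical flag: are we already in *PSSR?

begsr *pssr;
  if pssrActive;
    // A second error arrived while handling the first one.
    // Skip all further processing and just get out.
    *inlr = *on;
    return;
  endif;

  pssrActive = *on;
  dump(a);          // if this or anything below fails, the guard above ends the loop
  *inlr = *on;
  return;
endsr;
```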
Enter the Monitor
The MONITOR opcode removes a lot of the ills of the PSSR/INFSR construct. Let's revisit my error code from a previous article:
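// Special monitor logic.
// This will trigger a dump for any internal error.
// Note: DUMP is ignored unless DEBUG is specified on the control
// specification; DUMP(A) dumps regardless.
monitor;
  exsr mainline;
on-error;
  dump;
endmon;
*inlr = *on;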
This is the mainline of every new production program I write. The actual program logic is then coded in the subroutine mainline. The code above is the simplest version; if any error is encountered, the program dumps itself and then returns. If this were a report program in the middle of a jobstream, the job would continue running, but you'd see a dump instead of this report (or in addition to a partial version of the report).
And while this is a very simplistic version, it's actually perfectly fine for situations that don't need a lot of error-handling, for example, an ad hoc report that the user is going to look at immediately anyway. However, I think we can agree that even the simplest report could use a little more robust error-handling. At a minimum, I suggest writing a simple CL program that can be called right after dumping the program to send a message back to the user or to the system operator.
One of the benefits of the monitor concept is that it handles nested errors. If for some reason the CL program bombs (or doesn't exist), you'll get a standard escape message and the job will go into MSGW status. Not perfect, but I think it's a reasonable response to a fatal error in your fatal-error-handling routine, and in any event it's better than an infinite loop.
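Here is roughly what the mainline looks like with the notification call added. ERRNOTIFY is a hypothetical CL program that takes the failing program's name; the name and the parameter are assumptions for this sketch.

```rpgle
**free
// ERRNOTIFY is a hypothetical CL program that messages the user or operator.
dcl-pr notifyError extpgm('ERRNOTIFY');
  failingProgram char(10) const;
end-pr;

// Program status data structure; *PROC supplies this program's name.
dcl-ds programStatus psds;
  programName *proc;
end-ds;

monitor;
  exsr mainline;
on-error;
  dump(a);                     // dump regardless of the DEBUG setting
  notifyError(programName);    // not monitored: a failure here surfaces normally
endmon;
*inlr = *on;

begsr mainline;
  // application logic goes here
endsr;
```

Because the call sits in the on-error block rather than inside the monitored block, a failure in ERRNOTIFY is not caught by this monitor.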
Variation on the Error-Handling Theme
Other variations exist. If your program is already set up with a return-code field, you could always set that field to indicate that an internal error has occurred before returning to the caller. On the other hand, some errors may indeed be so bad that you want to halt the job; you can do that by sending an escape message to the calling program.
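A return-code version might look like the following sketch. The single-character parameter and the values '0' and 'E' are arbitrary choices for illustration; use whatever convention your callers already expect.

```rpgle
**free
// Hypothetical parameter list: the caller passes a one-character return code.
dcl-pi *n;
  returnCode char(1);
end-pi;

returnCode = '0';          // assume success until an error says otherwise

monitor;
  exsr mainline;
on-error;
  dump(a);                 // dump regardless of the DEBUG setting
  returnCode = 'E';        // arbitrary value meaning an internal error occurred
endmon;

*inlr = *on;
return;

begsr mainline;
  // application logic goes here
endsr;
```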
At the far end of the spectrum, you can even have different responses for different errors. The on-error clause lets you name specific status codes, or one of the special values *PROGRAM, *FILE, and *ALL, which makes the monitor work very much like a case statement for errors. Here's an example:
monitor;
  exsr mainline;
on-error 1217;
  callp halt('File ORDER not found - check library list.');
on-error *FILE;
  dump;
  callp halt('File error occurred - check dump.');
on-error;
  dump;
  callp info('Program error occurred - check dump.');
endmon;
*inlr = *on;
In this example, the monitor distinguishes three cases. First, it checks for status 1217, which RPG sets when an explicit OPEN or CLOSE fails; here it is treated as a file-not-found error. In that case the program halts the jobstream, because the likely problem is a bad library list and other programs may fail as well. The call to the external procedure halt sends an escape message that puts the job into a message wait state. Next, it looks for any other file error, as indicated by the special value *FILE in place of a status code. It halts on those too, because they indicate serious environmental issues, but since the cause could be any file error, it dumps the program and directs the user to the dump. Finally, the plain on-error catches everything else, which at this point means program errors. The user is again directed to a dump, but since program errors are most often logic problems, this case triggers an informational message (by calling info rather than halt), which notifies the user but allows the job to continue.
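The halt procedure itself isn't shown in this article. One way to write something like it is with the Send Program Message API (QMHSNDPM), as in the sketch below. The general-purpose message ID CPF9897, the 256-byte text limit, and the choice to target the caller of halt are my assumptions for illustration, not a description of the actual procedure.

```rpgle
**free
ctl-opt nomain;

// Standard API error-code layout. Leaving bytesProvided at zero asks the
// API to signal its own failures as exceptions instead of returning them.
dcl-ds apiError_t qualified template;
  bytesProvided  int(10);
  bytesAvailable int(10);
end-ds;

dcl-pr sendProgramMessage extpgm('QMHSNDPM');
  messageId    char(7)   const;
  messageFile  char(20)  const;
  messageData  char(256) const options(*varsize);
  dataLength   int(10)   const;
  messageType  char(10)  const;
  stackEntry   char(10)  const;
  stackCounter int(10)   const;
  messageKey   char(4);
  errorCode    likeds(apiError_t);
end-pr;

// Hypothetical implementation of halt: send the text as an escape message.
dcl-proc halt export;
  dcl-pi *n;
    text varchar(256) const;
  end-pi;

  dcl-s  messageKey char(4);
  dcl-ds apiError likeds(apiError_t) inz;

  // '*' with a counter of 1 targets the caller of this procedure.
  sendProgramMessage('CPF9897' : 'QCPFMSG   *LIBL' : text : %len(text)
                     : '*ESCAPE' : '*' : 1 : messageKey : apiError);
end-proc;
```

An info procedure could follow the same pattern with a different message type and target, such as an informational message sent to a message queue.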
So that's the short version of the monitor opcode. I hope I'll get a chance to revisit it in more detail in a later article.
MC Press Online