i5/OS Offers Native XML Support in V5R4

One of the hottest topics at conferences this year has been XML in general and, in particular, the native support added to RPG in V5R4. Demands for processing XML continue to grow in the System i arena, and as more and more of you are installing this release, I thought the time might be right to discuss this new capability of our favorite language.

Before we start, I should mention that much of the XML support relies heavily on the compound data structure support added to RPG IV in the V5R2 release. I will be briefly covering the relevant aspects of this support during the course of this article, but for a more comprehensive overview, click here.

Let's start by taking a look at the simple XML document that we will use for our initial explorations. The document contains the details of products produced by our company, including the category, product code, description, selling price, and quantity in stock. The details are grouped by category and also include a description and code for each category. Hopefully, the layout of the document will be obvious from the short extract shown below:
<Products>
  <Category Code="02">
    <Description>Toasters</Description>
    <Product Code="1234">
      <Description>Two slot chrome</Description>
      <MSRP>22.95</MSRP>
      <SellPrice>15.95</SellPrice>
      <QtyOnHand>247</QtyOnHand>
    </Product>
    <Product Code="2345">
      <Description>Four slot matt black</Description>
      <MSRP>35.75</MSRP>
      <SellPrice>23.95</SellPrice>
      <QtyOnHand>247</QtyOnHand>
    </Product>
  </Category>
  <Category Code="14">
    <Description>Coffee Makers</Description>
    <Product Code="9876">
      <Description>10 cup auto start</Description>
...

RPG IV's most powerful XML support, the new XML-INTO opcode, operates by matching the names and hierarchy of the elements in the XML document to a matching data structure (DS) hierarchy in the program. So the first step is to build the required DS.

As you will note, the root of the XML document is an element named Products. So our starting point is to create a DS with this name (see label A in the code below). Note that I have used the keyword "Qualified" in defining this structure; the reason for this will become apparent in a moment. Next in the XML comes the element Category. If Category were a simple element, we could simply list it as a subfield in the DS, but as you can see, it is not. Category is a compound element containing both the category description and the details of all products within that category, so we need a way to nest one DS inside another. Fortunately, V5R2 provided us with this capability in the form of the LIKEDS keyword. So, at label B, we simply define the subfield category as looking like the DS category!

If you are unfamiliar with these new DS capabilities, this may seem strange to you; after all, RPG can't have two things with the same name, can it? This is where the keyword "Qualified" that I mentioned earlier comes into play. By adding this keyword to the definition of the DS, we changed the name of the category field to products.category. That is to say that the name category is qualified by the name of its parent DS. In fact, the use of the keyword "Qualified" is compulsory for any DS that contains the LIKEDS keyword in one of its subfield definitions.

Notice that I also added the keyword DIM(20) to the definition of products.category since Category is a repeated element. This ability to dimension a DS as an array, as opposed to being limited to the old multiple-occurrence data structures (MODS), is another feature of V5R2 and absolutely essential to being able to handle the nested elements contained within the vast majority of XML documents. In this case, our program makes the assumption that there will never be more than 20 categories. We will look later in this article series at how to handle situations where the potential number of elements takes us beyond RPG's current limits.

(A) D products        DS                  Qualified   
(B) D   category                          LikeDS(category) Dim(20)
                                                         
(C) D category        DS                  Qualified
    D   description                 20a                         
    D   code                         2a
(D) D   product                           LikeDS(product) Dim(50) 
                                                             
(E) D product         DS                  Qualified              
    D   description                 40a                
    D   code                         4a                  
    D   mSRP                         7p 2                 
    D   sellPrice                    7p 2
    D   qtyOnHand                    5i 0        

    D XML_Source      S            256a   Varying
    D                                     Inz('/Partner400/XML/Example1.xml')

Looking at the definition of the category DS (C), you can see that we have defined three fields. The first (description) matches the Description element. The second (code) is less obvious, but it's simply the result of the way XML-INTO treats attributes. Attributes of a compound element are considered to be at the same hierarchical level as any elements within it. If that sounds complicated, maybe this will make it a little more obvious. This code...

<Category Code="02">
   <Description>Toasters</Description>
</Category>

...is treated by XML-INTO as being equivalent to this code:

<Category>
   <Code>02</Code>
   <Description>Toasters</Description>
</Category>

The third entry (product) in the structure will be used to represent the repeated element Product and is therefore represented as a nested array DS (D). The actual definition of the DS is shown at label E.

Although it may seem strange to you at first, no actual data will be stored in the category (C) or product (E) data structures. They exist only so that they can be referenced via the LIKEDS keywords. IBM has indicated that in the V6R1 release, a new keyword will be added to the language to allow us to indicate directly that such DSes are to be used only as templates, not for data storage. In the meantime, if you don't want to "waste" the memory they occupy, you can simply add the keyword BASED to their definition. This indicates to the compiler that you will later set a pointer to indicate where in memory this DS actually resides. But you don't actually set the pointer, since you will never (deliberately!) reference the fields in these DSes. When I use this technique, I tend to code it like so:

BASED(this_is_only_a_template)

I do this in the hope that my fellow programmers will understand the intention behind the subsequent definitions.
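
To make that concrete, here is a minimal sketch of how the product DS from label E might look with the BASED keyword added. It assumes that the basing pointer is never declared or set anywhere else in the program (the compiler defines it implicitly), and the keyword sits on a continuation line simply because it will not fit on the same line as Qualified.

```rpgle
     D product         DS                  Qualified
     D                                     Based(this_is_only_a_template)
     D   description                 40a
     D   code                         4a
     D   mSRP                         7p 2
     D   sellPrice                    7p 2
     D   qtyOnHand                    5i 0
```

The LikeDS(product) subfield in the category DS is unaffected; it copies only the layout, so the nested array still gets real storage.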

So how do we reference (for example) the selling price of the first product in the first category? Simple: products.category(1).product(1).sellPrice gives us what we need! Yes, I know! I can hear it now: "Hey, Jon, that's a lot of typing!" There are many responses to such a statement, one of the more polite being "Yes, it is. Get over it!" But rather than respond directly, I will instead pose you a question: If you couldn't reference the field in this way, just how many lines of code would you have to type to be able to reference this field in any other way?

Of course, if you are using a decent editor, such as WDSC, then it really is not a problem, as the code-assist function (Ctrl+Space) can pop up a list of candidate fields whenever you need it. You simply need to select the appropriate field from the list. Who said long field names require more typing? Speaking of WDSC, if you are still having problems understanding exactly how the data in the products DS is arranged, perhaps this screen shot of the WDSC outline view of the program will make things a little clearer.

http://www.mcpressonline.com/articles/images/2002/XML%20Processing%201V5--11210700.png

Figure 1: The WDSC outline view shows how the data in the products DS is arranged.

OK, so we have finally completed the DS required to map the XML document content. All that remains is to code the operations to actually parse the document and fill up the elements. Luckily, this is the easy part. The simple XML-INTO operation shown below (G) does the entire job.

(F) D XML_Source      S            256a   Varying
    D                                     Inz('/Partner400/XML/Example1.xml')

     /Free

(G)   XML-INTO products %XML(XML_Source: 'doc=file case=any');

 

The first operand of the op-code identifies the DS products as the target for the operation. The second operand is the new %XML built-in function (BIF), and it serves two purposes. First, it identifies the XML document via its first parameter (XML_Source), and second, it supplies processing options to the XML parser. In our example, we have specified two options.

The first, doc=file, informs the parser that the first parameter contains the name of the IFS file holding the XML document. As you can see at label F in our example, this is the fully qualified path name of the XML document. If this option is not supplied, the parser assumes that the field identified by the parameter actually contains the complete XML document.
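
For contrast, here is a small sketch of the other case, in which the variable holds the XML document itself rather than a file name. The greeting DS, its message subfield, and the xmlString variable are hypothetical names invented purely for this illustration; doc=string is the default and is coded here only to make the intent visible.

```rpgle
     D greeting        DS                  Qualified
     D   message                     20a

     D xmlString       S            100a   Varying

      /Free
        // The variable contains the document text, not an IFS path
        xmlString = '<Greeting><Message>Hello</Message></Greeting>';

        XML-INTO greeting %XML(xmlString: 'doc=string case=any');
      /End-Free
```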

The second option, case=any, specifies that element names in the document should be converted to uppercase before being compared to the names in the RPG DS. Other options include case=lower and case=upper. The case=upper option provides the best performance as it says that the element names are already in uppercase and therefore need no conversion. However, you will probably only be able to use this option if you control the definition of the XML schema. You can probably guess what case=lower means. Yup, the element names are all in lowercase and should be converted to uppercase. Although in theory this should perform better than case=any, in practice this is not true for reasons that I won't go into here.

That's all there is to it. This one simple little opcode does all of the heavy lifting for us, and once control is returned to our program, all of the data in the XML document has been parsed and placed in our products DS with numeric conversion where appropriate.

Well, actually it is far more likely that the program would have halted with an error message. Why? Because the version of the program as it stands will work only if there are exactly 20 different Category entries and if each of those contains exactly 50 Product elements! Needless to say, such rigid conditions are unlikely to occur very often.

So how do we handle a situation in which the XML document contains fewer than the declared number of elements? The first thing we need to do is to add another entry to the %XML option list. The one that we need in this case is 'allowmissing=yes'. This tells the parser that it is acceptable if the document does not contain the exact number of elements that we identified in the array definition. However, there is a problem with this approach. The "allowmissing" option does not provide any degree of control. There is no way to say that we expect to get between one and 20 Category entries but that, for each Category, the Description element must be present. (Note: This type of error can be avoided by validating the XML document against its schema, but currently that has to be a separate operation outside of the RPG program. For more information on one approach to achieving this, see the Redbook The Ins and Outs of XML and DB2 for i5/OS.)

The result is that once I use this option, the parser will be perfectly happy if almost anything is missing! Luckily, there is a relatively simple way to deal with this: initialize the entire DS to a known value...say, *HIVAL...before we begin the parse. We can then simply test any compulsory fields to ensure that they were correctly populated. We can also use this value as a means of determining when the last entry in the array has been processed.
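
The *HIVAL technique just described might be sketched as follows. This is an illustration rather than production code: the loop counter c is a name I have made up, the DSPLY is merely a stand-in for whatever error reporting your shop uses, and the sketch relies on the parser leaving untouched any subfield for which the document supplies no data.

```rpgle
     D c               S             10i 0

      /Free
        // Fill every byte of the DS with x'FF' before parsing
        products = *HIVAL;

        XML-INTO products
              %XML(XML_Source: 'doc=file case=any allowmissing=yes');

        For c = 1 to %Elem(products.category);
          // A code that is still *HIVAL marks the end of the loaded data
          If products.category(c).code = *HIVAL;
            Leave;
          EndIf;
          // Description is compulsory, so report any category without one
          If products.category(c).description = *HIVAL;
            Dsply ('No description: ' + products.category(c).code);
          EndIf;
        EndFor;
      /End-Free
```

Note that only character subfields are tested. The packed and integer subfields also contain x'FF' until the parser overwrites them, so they should not be referenced for entries that were never loaded.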

Before I close, there are two other aspects of the XML-INTO support I should briefly touch on. The first concerns the handling of numeric fields. In my example, I defined several elements (for example, MSRP) as being numeric. This works beautifully (in fact, you can even specify the H(alf Adjust) extender with the XML-INTO opcode), but be aware that should an error occur during the numeric conversion, the parser will simply terminate and issue an error. Unless you can guarantee that the XML document contains only "good" data, you might want to take an alternative approach and define all numeric elements as character fields in the DS. You can then attempt the numeric conversion (probably using the %DEC BIF) under the control of your own program. This way, you can report any errors encountered but still be able to process the balance of the document. A suggested version of the revised product DS is shown below:

   D product         DS                  Qualified                           
   D   description                 40a                                       
   D   code                         4a                                       
   D   mSRP                        12a
   D   sellPrice                   12a
   D   qtyOnHand                    7a

When setting the size of these character fields, don't forget to allow enough room for the decimal point and any possible sign characters.
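
As a rough sketch of the do-it-yourself conversion, the fragment below wraps %DEC in a MONITOR group so that a bad value is reported without ending the program. The work field msrp is a hypothetical name, c and p are assumed to be integer indexes that already identify a loaded category and product, and the DSPLY again stands in for real error reporting.

```rpgle
     D msrp            S              7p 2

      /Free
        Monitor;
          // Convert the character data using the original precision
          msrp = %Dec(products.category(c).product(p).mSRP: 7: 2);
        On-Error;
          // Conversion failed: report it and carry on with the next field
          Dsply ('Bad MSRP for product ' +
                 products.category(c).product(p).code);
        EndMon;
      /End-Free
```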

The second aspect of XML-INTO that I'd like to discuss here facilitates the simplification of the target DS specification. It is in fact acceptable to ignore the root element in the document (i.e., Products in our example) and to simply load directly into a DS array corresponding to the next level (i.e., Category). Since this is the first repeating element, we can therefore code it as a DS array. This has an additional benefit in that RPG can now actually tell us how many elements were processed. Indeed, if the Category element were the only repeating element in the document (it isn't because Product also repeats), we would not even need to specify the "allowmissing" option. The revised sections of the example, including the Program Status Data Structure (PSDS) that contains the element count, are shown below.

    D progStatus     SDS 
      // When the INTO target is an array, xmlElements will contain  a count
      //   of the number of elements loaded
    D   xmlElements                 20i 0 Overlay(progStatus: 372) 
                                          
      // Note that the products DS is no longer required                   
    D category        DS                  Qualified Dim(20)        
    D   code                         2a                            
    D   description                 20a                            
    D   product                           LikeDS(product) Dim(50)

      // Note modified XML-INTO now targets the category DS 
       XML-INTO category  
             %XML(XML_Source: 'case=any doc=file allowmissing=yes');
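
With the count available, processing the loaded categories needs no sentinel value. The fragment below is only a sketch of that idea: the counter c is a hypothetical name, and the DSPLY is there simply to show each entry being visited.

```rpgle
     D c               S             10i 0

      /Free
        XML-INTO category
              %XML(XML_Source: 'case=any doc=file allowmissing=yes');

        // xmlElements (from the PSDS) holds the number of category
        // elements that the parser actually loaded
        For c = 1 to xmlElements;
          Dsply (category(c).code + ' ' + category(c).description);
        EndFor;
      /End-Free
```

Bear in mind that the count covers only the Category array that is the target of the operation; the nested product arrays within each category still need some other end-of-data test.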

That's all I have time for in this article.

In the next episode in this series, I will discuss how to handle situations in which RPG's current size limits "get in the way." And if we have time, we will also take a brief look at XML-INTO's little brother, XML-SAX.

Jon Paris

Jon Paris's IBM midrange career started when he fell in love with the System/38 while working as a consultant. This love affair ultimately led him to joining IBM.

In 1987, Jon was hired by the IBM Toronto Laboratory to work on the S/36 and S/38 COBOL compilers. Subsequently, Jon became involved with the AS/400 and in particular COBOL/400.

In early 1989, Jon was transferred to the Languages Architecture and Planning Group, with particular responsibility for the COBOL and RPG languages. There, he played a major role in the definition of the new RPG IV language and in promoting its use with IBM Business Partners and users. He was also heavily involved in producing educational and other support materials and services related to other AS/400 programming languages and development tools, such as CODE/400 and VisualAge for RPG.

Jon left IBM in 1998 to focus on developing and delivering education focused on enhancing AS/400 and iSeries application development skills.

Jon is a frequent speaker at user group meetings and conferences around the world, and he holds a number of speaker excellence awards from COMMON.
