
The Atomic Add (ATMCADD) MI Instruction


Learn about the ATMCADD MI instruction and the PowerPC technology behind it.

 

As one of the shared storage synchronization MI instructions, the Atomic Add (ATMCADD) instruction and its siblings, the Atomic And (ATMCAND) and Atomic Or (ATMCOR) MI instructions, were introduced to IBM i at V5R3 to support atomic update operations on 4-byte or 8-byte storage shared between two or more threads. The concept of atomicity of storage operations is discussed in the Atomicity section in the Machine Interface Architecture Introduction.

 

This article explains how the ATMCADD instruction and its siblings work.

User Code Generated for ATMCADD

The ATMCADD MI instruction operates on two signed binary values of the same length. ATMCADD atomically increments the signed binary value addressed by operand 1 (which is a space pointer) by the signed binary value specified by operand 2, and returns the original value addressed by operand 1. ATMCADD is available to user programs only as system built-in functions (SYSBIFs); in other words, it can be used only by ILE programs. Two SYSBIFs are provided for the ATMCADD MI instruction, _ATMCADD4 and _ATMCADD8, which operate on 4-byte signed binary values and 8-byte signed binary values, respectively. The ILE RPG prototypes of these SYSBIFs can be found in mih-stgsync.rpgleinc as follows:

 

     /**
      * @BIF _ATMCADD4 (Atomic Add (ATMCADD))
      * @return Original value of sum_addend
      */
     d atmcadd4        pr            10i 0 extproc('_ATMCADD4')
     d   sum_addend                  10i 0
     d   augend                      10i 0 value

     /**
      * @BIF _ATMCADD8 (Atomic Add (ATMCADD))
      * @return Original value of sum_addend
      */
     d atmcadd8        pr            20i 0 extproc('_ATMCADD8')
     d   sum_addend                  20i 0
     d   augend                      20i 0 value

 

The ILE C prototypes of the _ATMCADD4 and _ATMCADD8 SYSBIFs are the following:

 

# pragma linkage(_ATMCADD4, builtin)
long _ATMCADD4(long*, long);

# pragma linkage(_ATMCADD8, builtin)
long long _ATMCADD8(long long*, long long);
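For readers who want to experiment without an IBM i compiler, the contract of these SYSBIFs (add atomically, return the original value) can be sketched in portable C11. This is only an analogy: atomic_fetch_add_explicit from <stdatomic.h> stands in for _ATMCADD8, the helper name fetch_add8 is hypothetical, and the relaxed memory order is an assumption chosen to mirror an update that is atomic but does not order other storage accesses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical portable stand-in for _ATMCADD8 (not the IBM i builtin):
 * atomically adds `addend` to *sum and returns the value that *sum
 * held BEFORE the addition. */
int64_t fetch_add8(_Atomic int64_t *sum, int64_t addend)
{
    assert(sum != NULL);
    /* Relaxed order: the update itself is atomic, but no ordering is
     * imposed on accesses to other storage locations. */
    return atomic_fetch_add_explicit(sum, addend, memory_order_relaxed);
}
```

A caller that needs the value produced by its own update should compute it from the return value (original plus addend) rather than re-reading the shared variable, because another thread may have changed it in the meantime.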

 

Now let's look at the user code generated for the ATMCADD instruction in a compiled program. Let's take the _ATMCADD8 SYSBIF, for example. Compile an ILE C program containing the following tiny ILE C procedure iadd():

 

# pragma linkage(_ATMCADD8, builtin)
long long _ATMCADD8(long long*, long long);

long long iadd(long long *num, long long diff) {
    _ATMCADD8(num, diff); /* Statement 1 */
    return *num;
}

 

Find the RISC instructions generated for the first statement of procedure iadd() in the System Service Tools (SST) dump of the program containing iadd(). At V5R4, the user code generated for statement 1 of iadd(), which invokes _ATMCADD8, might look like the following:

 

RISC INSTRUCTIONS (iadd)

| Location | Object text | Source statement | Statement numbers | Description |
|---|---|---|---|---|
| 000028 | E9860030 | LD 12,0X30(6) | 1 | At the time when statement 1 is executed, General Purpose Register (GPR) 6 (aka r6) addresses the parameter area of procedure iadd(). The format of the procedure parameter area is described in the documentation of the _NPMPARMLISTADDR SYSBIF in the IBM Knowledge Center. [1] At offset hex 20 from the start of the procedure parameter area is the first parameter of procedure iadd(), the space pointer long long *num. The Load Doubleword (LD) instruction loads the 8-byte contents of the second parameter (diff) of iadd(), at offset hex 30 from the start of the procedure parameter area, into r12. |
| 00002C | E1060026 | LQ 8,0X20(6),6 | | The Load Quadword (LQ) instruction tries to load the 16-byte space pointer num from offset hex 20 from the start of the procedure parameter area into r8 and r9. |
| 000030 | 7923049C | SELRI 3,9,0,41 | | If the previous LQ instruction loads the space pointer successfully, the SELRI instruction copies r9 (the 8-byte address portion of space pointer num) to r3, which acts as operand 1 of _ATMCADD8. Otherwise, the immediate value hex 0000000000000000 is placed in r3. |
| 000034 | 61840000 | ORI 4,12,0 | | Copy r12 (the value of BIN(8) diff) to r4, which acts as operand 2 of _ATMCADD8. |
| 000038 | 4B801DC3 | BLA 0X3801DC0 | | Invoke the implementing Licensed Internal Code (LIC) routine of the _ATMCADD8 SYSBIF. [2] |
| 00003C | 606A0000 | ORI 10,3,0 | | Copy the return value of _ATMCADD8 (stored in r3) to r10. |

 

Notes

[1] Note that _NPMPARMLISTADDR, instead of what appears in the documentation of the SYSBIF, is the correct name of the NPM Procedure Parameter List Address (NPM_PARMLIST_ADDR) SYSBIF. This is important if you want your programs that invoke the SYSBIF to get compiled.

[2] A BLA form Branch (b) instruction branches to the absolute address specified by the target address operand and places the next instruction address (NIA) into the link register so that the called routine can return to the calling code via a Branch Conditional to Link Register (bclr) instruction. The BLA 0X3801DC0 instruction branches to the address (FFFFFFFFFF 801DC0) of the implementing LIC routine of _ATMCADD8 (in LIC module #cfgrbla), which stores the return value in r3 on return.
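The step from the object text 4B801DC3 to the address FFFFFFFFFF 801DC0 can be checked by decoding the instruction word by hand. The sketch below assumes the standard PowerPC I-form layout (6-bit primary opcode 18, 24-bit LI field, AA bit, LK bit); the function name absolute_branch_target is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Decode the target of an absolute I-form PowerPC branch (ba/bla).
 * Word layout, most significant bit first:
 *   6-bit primary opcode (18) | 24-bit LI | AA | LK            */
uint64_t absolute_branch_target(uint32_t insn)
{
    assert((insn >> 26) == 18);   /* primary opcode 18 = branch     */
    assert((insn & 0x2u) != 0);   /* AA = 1: absolute addressing    */

    /* LI || 0b00 is a 26-bit signed value: drop opcode, AA and LK. */
    uint64_t target = insn & 0x03FFFFFCu;

    /* Sign-extend from bit 25 of that value to 64 bits. */
    if (target & 0x02000000u)
        target |= ~(uint64_t)0x03FFFFFF;
    return target;
}
```

For the instruction word 0x4B801DC3 the masked field is hex 3801DC0, and sign extension yields hex FFFFFFFFFF801DC0, the address quoted in note [2]. The low-order bit of the word (LK) is 1, which is what makes the branch save the return address in the link register.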

The PowerPC Load and Reserve Instructions and Store Conditional Instructions

Before we turn to the code of the implementing LIC routines of the ATMCADD MI instruction, let's briefly review the load and reserve instructions and the store conditional instructions. Section 4.2.6, Memory Synchronization Instructions–UISA, of the Programming Environments Manual for 64-bit Microprocessors describes the purpose and working mechanism of these PowerPC memory synchronization instructions as follows.

 

The concept behind the use of the lwarx, ldarx, stwcx., and stdcx. instructions is that a processor may load a semaphore from memory, compute a result based on the value of the semaphore, and conditionally store it back to the same location. If the store was successful, the sequence of instructions from the read of the semaphore to the store that updated the semaphore appear to have been executed atomically (that is, no other processor or mechanism modified the semaphore location between the read and the update), thus providing the equivalent of a real atomic operation.

 

The lwarx instruction must be paired with a stwcx. instruction, and the ldarx instruction with a stdcx. instruction, with the same effective address (EA) specified by both instructions of the pair. The only exception is that an unpaired stwcx. or stdcx. instruction on any (scratch) EA can be used to clear any reservation held by the processor. The conditional store is performed based upon the existence of a reservation established by the preceding lwarx or ldarx instruction. Note that at most one reservation exists at any time on any processor. If the reservation exists when the store is executed, the store is performed and the EQ bit of CR field 0 (CR0) is set. If the reservation does not exist when the store is executed, the target memory location is not modified and the EQ bit of CR0 is cleared.

 

The reservation held by the processor is cleared if any of the following events occurs:

  • The processor holding the reservation executes another load and reserve instruction; this clears the first reservation and establishes a new one.
  • The processor holding the reservation executes a store conditional instruction to any address.
  • Another processor executes any store instruction to the address associated with the reservation.
  • Any mechanism, other than the processor holding the reservation, stores to the address associated with the reservation.

 

Therefore, a sequence of the instructions from a load and reserve instruction to a paired successful store conditional instruction is equivalent to a real atomic storage operation (at the address associated with the reservation established by the load and reserve instruction).
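The reservation rules listed above can be made concrete with a toy model. The sketch below is not PowerPC code and not how the hardware is built; it models one processor's reservation as a flag plus an address, and every name in it (reservation, load_reserve, store_conditional, foreign_store) is hypothetical. As a simplification, it treats a store conditional to an address other than the reserved one as a failed store.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of the single reservation held by one processor. */
typedef struct {
    bool           held;  /* is a reservation currently held?        */
    const int64_t *addr;  /* address associated with the reservation */
} reservation;

/* Load and reserve: a new reservation replaces any earlier one. */
int64_t load_reserve(reservation *r, const int64_t *addr)
{
    assert(r != NULL && addr != NULL);
    r->held = true;
    r->addr = addr;
    return *addr;
}

/* Store conditional: the store happens only if a reservation for
 * `addr` is held. The reservation is cleared in either case. */
bool store_conditional(reservation *r, int64_t *addr, int64_t value)
{
    assert(r != NULL && addr != NULL);
    bool performed = r->held && r->addr == addr;
    if (performed)
        *addr = value;
    r->held = false;
    return performed;
}

/* A store made by another processor or mechanism: it clears the
 * reservation if it hits the reserved address. */
void foreign_store(reservation *r, int64_t *addr, int64_t value)
{
    assert(r != NULL && addr != NULL);
    *addr = value;
    if (r->held && r->addr == addr)
        r->held = false;
}
```

In this model, a load_reserve followed directly by store_conditional succeeds, while the same pair with a foreign_store in between fails and leaves the foreign value in place, which is the situation a retry loop has to handle.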

 

Also note that the lwarx/stwcx. and ldarx/stdcx. instructions require the EA to be aligned to 4-byte and 8-byte boundaries, respectively. As with the majority of Load/Store Indexed PowerPC instructions (e.g., stdx rS,rA,rB), the EA for all four of these instructions is the sum of the content of rA (or zero if rA=0) and the content of rB, aka (rA|0)+(rB). After reading the next section, you will see that this is the reason that the ATMCADD MI instruction requires its first operand to be aligned based on the length of its operands.
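A caller can verify the alignment requirement before handing an address to an atomic update. The following is a generic C sketch for platforms where a data pointer converts to uintptr_t as a flat address (an assumption that does not describe 16-byte IBM i space pointers); the helper name is_operand_aligned is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if `addr` sits on a `len`-byte boundary, where `len`
 * is the operand length of the atomic update (4 or 8 bytes). */
bool is_operand_aligned(const void *addr, size_t len)
{
    assert(len == 4 || len == 8);
    /* len is a power of two, so the low bits must all be zero. */
    return ((uintptr_t)addr & (len - 1)) == 0;
}
```

An address that is a multiple of 4 but not of 8 therefore qualifies for a 4-byte update but not for an 8-byte one.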

Analyze the Implementing LIC Routines of ATMCADD

At V5R4, the implementing LIC routines of _ATMCADD4 and _ATMCADD8 are at addresses FFFFFFFFFF 801DA0 and FFFFFFFFFF 801DC0, respectively. The LIC module containing these routines is called #cfgrbla. At V5R4, the disassembled PowerPC instructions of the implementing LIC routine of _ATMCADD8 are like so:

 

Implementing LIC Routine of _ATMCADD8 (FFFFFFFFFF 801DC0)

| Location | Object Text | Source Statement | Description |
|---|---|---|---|
| 1DC0 | 7CA018A8 | ldarx 5,0,3 | The effective address (EA) is the content of r3, i.e., the address of the first operand of _ATMCADD8 passed by user code. The ldarx instruction loads operand 1 of _ATMCADD8 into r5 and establishes a reservation for use by a Store Doubleword Conditional (stdcx.) instruction. The address of operand 1 of _ATMCADD8 is associated with the reservation. |
| 1DC4 | 7C042A14 | add 0,4,5 | The sum of the value of operand 1 (in r5) and operand 2 (passed by user code via r4) of _ATMCADD8 is placed into r0. |
| 1DC8 | 7C0019AD | stdcx. 0,0,3 | Store the sum to the address of operand 1 of _ATMCADD8 if the reservation established by the previous ldarx instruction at that address has not been cleared, for example because another processor executed a store instruction to the same address. Whether the conditional store is performed or not, any reservation held by the processor is cleared. The EQ bit of CR0 (CR[2]) reflects whether the store is performed: CR[2] is set if the store is performed and cleared if it is not. |
| 1DCC | 40C2FFF4 | bc 6,2,-0xc | If the previous conditional store isn't performed, the Branch Conditional (bc) instruction branches back to the ldarx 5,0,3 instruction to retry the sequence of instructions from ldarx 5,0,3 to stdcx. 0,0,3. |
| 1DD0 | 38650000 | addi 3,5,0 | Upon a successful store, the original value of operand 1 of _ATMCADD8 is copied to r3 as the return value of _ATMCADD8. |
| 1DD4 | 4E800020 | bclr 20,0,0 | The BO field with the value 20 means branch always. The Branch Conditional to Link Register (bclr) instruction (bclr 20,0,0) returns to the user code that invoked the implementing LIC routine of _ATMCADD8 via a bla instruction. |
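The control flow of this routine (load, add, conditional store, retry on failure, return the original value) can be mirrored in portable C11. The sketch below is a model, not the LIC code: a compare-and-swap loop stands in for the ldarx/stdcx. pair, which is a different primitive with similar retry behavior, and the name atmcadd8_model is hypothetical. The addition is done in unsigned arithmetic on the assumption that the result should wrap the way the PowerPC add instruction does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the _ATMCADD8 routine using a C11 compare-and-swap loop. */
int64_t atmcadd8_model(_Atomic int64_t *op1, int64_t op2)
{
    assert(op1 != NULL);

    /* Plays the role of ldarx 5,0,3: read the current value. */
    int64_t original = atomic_load_explicit(op1, memory_order_relaxed);
    int64_t sum;

    do {
        /* add 0,4,5: wrapping sum of operand 1 and operand 2.     */
        sum = (int64_t)((uint64_t)original + (uint64_t)op2);
        /* stdcx. + bc: store only if *op1 still equals `original`;
         * on failure `original` is refreshed and the loop retries. */
    } while (!atomic_compare_exchange_weak_explicit(
                 op1, &original, sum,
                 memory_order_relaxed, memory_order_relaxed));

    /* addi 3,5,0: the original value is the result. */
    return original;
}
```

Calling atmcadd8_model(&v, 5) on a variable holding 10 returns 10 and leaves 15 in the variable.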

 

Storage Synchronization Related Consideration When Using the ATMCADD MI Instruction

As documented by the MI documentation in the IBM Knowledge Center, the ATMCADD instruction and its siblings (ATMCAND and ATMCOR) do not synchronize storage. When sharing more than one variable between multiple threads or processes, it is the programmer's responsibility to ensure proper shared storage synchronization. Please refer to the Storage Synchronization Concepts in the IBM Knowledge Center for details.

 

Within a thread, to enforce the ordering of update operations made by ATMCADD, ATMCAND, and ATMCOR to multiple shared storage locations, you can separate two update operations with a Synchronize Shared Storage Accesses (SYNCSTG) MI instruction, which guarantees that the first update operation completes no later than the second update operation starts. For example, consider the following scenario:

  • An array of counters, expected to be accessed by any thread or MI process within the system, is stored in a user space (*USRSPC) object called CTRARA.
  • An ILE RPG program called ATMC03 accepts an array of counter indices (ctrinx), increments each counter specified by ctrinx atomically, and returns an array of the incremented counter values.

 

In this scenario, the ATMCADD MI instruction can be used to make sure each counter is incremented atomically, and the SYNCSTG MI instruction can be used to enforce the ordering of the update operations to multiple counters. The following is the example source code (atmc03.rpgle) of ILE RPG program ATMC03:

 

     /**
      * @file atmc03.rpgle
      *
      * Atomically increment each of one or more counters shared by
      * multiple threads.
      * @pre Create a hex 1934 space named CTRARA:
      *       CALL PGM(QUSCRTUS) +
      *         PARM('CTRARA    *CURLIB' 'USRARA' X'00000F00' +
      *         X'00' *USE 'Counter Space')
      */

     h dftactgrp(*no)

     d atmc03          pr                  extpgm('ATMC03')
     d   ctrinx                       5u 0 dim(16) options(*varsize)
     d   numinx                       5u 0
     d   incctr                      20i 0 dim(16) options(*varsize)

     /**
      * @BIF _SYNCSTG (Synchronize Shared Storage Accesses (SYNCSTG))
      */
     d syncstg         pr                  extproc('_SYNCSTG')
     d   action                      10u 0 value
     /**
      * @BIF _ATMCADD8 (Atomic Add (ATMCADD))
      */
     d atmcadd8        pr            20i 0 extproc('_ATMCADD8')
     d   sum_addend                  20i 0
     d   augend                      20i 0 value
      *
     d rslvsp_tmpl     ds                  qualified
     d   obj_type                     2a
     d   obj_name                    30a
      * required authorization
     d   auth                         2a   inz(x'0000')
     /**
      * @BIF _RSLVSP2 (Resolve System Pointer (RSLVSP))
      */
     d rslvsp2         pr                  extproc('_RSLVSP2')
     d   obj                           *   procptr
     d   opt                         34a
     /**
      * @BIF _SETSPPFP (Set Space Pointer from Pointer (SETSPPFP))
      */
     d setsppfp        pr              *   extproc('_SETSPPFP')
     d   src_ptr                       *   value procptr

      * System pointer to hex 1934 space object CTRARA
     d spc@            s               *   procptr
     d counter         s             20i 0 based(spp@)
     d                                     dim(16)
     d one             s             20i 0 inz(1)
     d n               s              5u 0

     d atmc03          pi
     d   ctrinx                       5u 0 dim(16) options(*varsize)
     d   numinx                       5u 0
     d   incctr                      20i 0 dim(16) options(*varsize)

      /free
           // Resolve a SYSPTR *USRSPC *LIBL/CTRARA
           rslvsp_tmpl.obj_type = x'1934';
           rslvsp_tmpl.obj_name = 'CTRARA';
           rslvsp2(spc@ : rslvsp_tmpl);
           spp@ = setsppfp(spc@);

           for n = 1 to numinx;
               // Increase @var counter by 1 atomically. ATMCADD returns
               // the original value, so the incremented value handed
               // back to the caller is that value plus 1.
               incctr(n) = atmcadd8(counter(ctrinx(n)) : one) + 1;

               // Enforce ordering of shared storage operations
               syncstg(0);

               // Display the increased value of @var counter
               dsply ctrinx(n) '' counter(ctrinx(n));
           endfor;

           *inlr = *on;
      /end-free
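The pattern of this program, an atomic update followed by an ordering barrier before the next update, can also be sketched in portable C11 for comparison. Everything here is an assumption made for illustration: the global array counters stands in for the CTRARA space, atomic_thread_fence(memory_order_seq_cst) is used as a rough analogue of SYNCSTG rather than an exact equivalent, indices are 0-based, and the function name bump_counters is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_COUNTERS 16

/* Hypothetical shared counter array (stand-in for the CTRARA space). */
_Atomic int64_t counters[NUM_COUNTERS];

/* Atomically increments each counter selected by idx[0..n-1] and
 * stores the value produced by each increment in out[0..n-1]. */
void bump_counters(const uint16_t *idx, size_t n, int64_t *out)
{
    assert(idx != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        assert(idx[i] < NUM_COUNTERS);

        /* Atomic update that by itself orders nothing else. */
        int64_t original = atomic_fetch_add_explicit(
            &counters[idx[i]], 1, memory_order_relaxed);

        /* Barrier: this update is ordered before the next one. */
        atomic_thread_fence(memory_order_seq_cst);

        /* New value comes from the returned original, not a re-read. */
        out[i] = original + 1;
    }
}
```

Deriving each result from the value returned by the atomic add, instead of reading the counter again, keeps the reported value correct even when other threads update the same counter concurrently.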

 

 

Junlei Li

Junlei Li is a programmer from Tianjin, China, with 10 years of experience in software design and programming. Junlei Li began programming under i5/OS (formerly known as AS/400, iSeries) in late 2005. He is familiar with most programming languages available on i5/OS—from special-purpose languages such as OPM/ILE RPG to CL to general-purpose languages such as C, C++, Java; from strong-typed languages to script languages such as QShell and REXX. One of his favorite programming languages on i5/OS is machine interface (MI) instructions, through which one can discover some of the internal behaviors of i5/OS and some of the highlights of i5/OS in terms of operating system design.

 

Junlei Li's Web site is http://i5toolkit.sourceforge.net/, where his open-source project i5/OS Programmer's Toolkit (https://sourceforge.net/projects/i5toolkit/) is documented.
