Explore more options with the Calculate Hash API.
Last month, in Verifying That Nothing Has Changed, we saw how easy it can be to create a hash over a string such as 'Some data to be hashed' by using the Calculate Hash API. We also saw how the calculated hash value is very dependent on what data is being hashed. Something as simple as a trailing blank or a punctuation change can cause significant changes in the derived hash value. Today, we'll look at some options for how we can hash data, and we'll see that agreement between the generator and the verifier of a hash value on how the data is hashed is not nearly as important as agreement on what data is hashed.
Last month's DOHASH program hashed the left-adjusted 25-byte string 'Some data to be hashed ' using one call to the Calculate Hash API and got back the corresponding hash value. For longer strings—for instance, a string representing all of the records in a file—it might be more convenient (and in some cases required due to high-level language constraints) to pass the data to be hashed in pieces (records if you will) using multiple API calls and, when all of the data has been processed, then get back the final hash value.
To demonstrate this piecemeal approach to hashing, we'll continue to hash the string 'Some data to be hashed ' but we'll do it passing only 5 bytes of the string per call to the Calculate Hash API. That is, on the first call we'll hash 'Some ' (note that there's a trailing blank after the 'e' of Some), the second call 'data ', the third call 'to be', and so on. The program to accomplish this is shown below.
     h dftactgrp(*no)
     d CrtAlgCtx       pr                  extproc('Qc3CreateAlgorithmContext')
     d  AlgDsc                     4096a   const options(*varsize)
     d  FmtAlgDsc                     8a   const
     d  AlgCtxTkn                     8a
     d  ErrCde                             likeds(QUSEC)
     d DltAlgCtx       pr                  extproc('Qc3DestroyAlgorithmContext')
     d  AlgCtxTkn                     8a   const
     d  ErrCde                             likeds(QUSEC)
     d HshDta          pr                  extproc('Qc3CalculateHash')
     d  InpDta                     4096a   const options(*varsize)
     d  LenInpDta                    10i 0 const
     d  FmtInpDta                     8a   const
     d  AlgDsc                     4096a   const options(*varsize)
     d  FmtAlgDsc                     8a   const
     d  CryptoPrv                     1a   const
     d  CryptoDev                    10a   const
     d  HshVal                        1a   options(*varsize)
     d  ErrCde                             likeds(QUSEC)
     d Data            s             25a   inz('Some data to be hashed')
     d DataSubset      s              5a
     d HashValue       s             32a
     d X               s             10i 0
      /copy qsysinc/qrpglesrc,qc3cci
      /copy qsysinc/qrpglesrc,qusec
      /free
        // Bytes provided of 0: API errors are returned as escape messages
        QUSBPrv = 0;
        // Hash algorithm 3 selects SHA-256
        QC3Ha = 3;
        // Create the hashing context; the token is returned in QC3ACT
        CrtAlgCtx(QC3D0500 :'ALGD0500' :QC3ACT :QUSEC);
        // Hash the string 5 bytes at a time with the final operation flag off
        QC3FOF = '0';
        X = 1;
        dow X < %size(Data);
          DataSubset = %subst(Data :X :%size(DataSubset));
          HshDta(DataSubset :%size(DataSubset) :'DATA0100'
                 :QC3D0100 :'ALGD0100' :'0' :' '
                 :HashValue :QUSEC);
          X += %size(DataSubset);
        enddo;
        // Final call: no additional data, just return the hash value
        QC3FOF = '1';
        HshDta(' ' :0 :'DATA0100'
               :QC3D0100 :'ALGD0100' :'0' :' '
               :HashValue :QUSEC);
        // Destroy the context now that the hashing operation is complete
        DltAlgCtx(QC3ACT :QUSEC);
        *inlr = *on;
        return;
      /end-free
Reviewing the previous code, you'll find that two new APIs are being used. The first new API, Create Algorithm Context, can be used to create a job-specific context (environment) to enable the sharing of algorithmic parameters and interim values across multiple API calls. The API has four parameters:
- Algorithm description (AlgDsc)—The algorithm and related parameters to be used within this context
- Algorithm description format name (FmtAlgDsc) —The format of the Algorithm description (the first parameter). Format ALGD0500, used in this month's program, indicates that the Algorithm description is related to hashing. Other formats are defined for algorithms such as block cipher, stream cipher, and public keys.
- Algorithm context token (AlgCtxTkn) —The output parameter to receive an 8-byte token that uniquely identifies the context being created. The format of a token is not defined. The token value is simply passed as-is in subsequent calls to other APIs in order to identify the algorithmic environment to be used.
- Error code (ErrCde) —The standard API error code parameter
Notice that when calling the Create Algorithm Context API, the first and second parameters being passed are using the same values as the fourth and fifth parameters used last month when calling the Calculate Hash API. Combined, these two parameters (the data structure QC3D0500 and the constant 'ALGD0500') indicate that the program will be performing SHA-256 hashing. By using these values with the Create Algorithm Context API, we are simply making it possible to do the actual hashing using one or more calls to the Calculate Hash API.
Having created the hashing context, the program then sets the Final operation flag variable (QC3FOF) of data structure QC3D0100, which is provided in member QC3CCI of QSYSINC, to the value '0'. When subsequently calling the Calculate Hash API, a Final operation flag value of '0' indicates that the hashing operation is still in progress, allowing us to continue calling the API with additional data to be included in the hashing operation.
The program then enters a DOW that continues to run as long as some of the string to be hashed has not yet been processed. Within the DOW, the program performs these steps:
- Sets the variable DataSubset to the next 5 bytes of the string 'Some data to be hashed'.
- Calls the Calculate Hash API to hash the currently identified 5 bytes. In addition to the change from last month's program regarding the number of bytes to hash (25 to 5), this call to the API also uses different values for the fourth and fifth parameters. Last month's call to the Calculate Hash API set these parameters, respectively, to the data structure QC3D0500 and format 'ALGD0500' in order to provide the necessary parameters for the hashing algorithm to be used. This month, the hashing algorithm parameters are set when calling the Create Algorithm Context API, so the fourth and fifth parameters are set to the data structure QC3D0100 and format 'ALGD0100', respectively, in order to reference the previously created algorithm context the program wants to work with.
- Identifies the next 5 bytes to be hashed.
- Re-runs the DOW until all data has been hashed.
When all of the string has been hashed, the DOW ends, the variable QC3FOF is set to the value '1' (indicating that the hashing operation is complete), and the Calculate Hash API is called one last time in order to obtain the hash value for the previously hashed data. On this call to the API, the second parameter, LenInpDta, is set to zero, indicating that no additional data is to be hashed as part of the API processing.
The program can, if desired, avoid this sixth call to the Calculate Hash API. If, within the DOW block, the program can determine that the fifth call to the API represents the last data to be used, then the program can set the QC3FOF variable to the value of '1' on the fifth call and obtain the hash value as an output of this fifth call. In the case of the sample program, this would be easy enough to do as the length of the string to be hashed is known in advance.
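As a rough sketch of that variation, the following version of the program sets the Final operation flag inside the DOW and turns it on only for the chunk that reaches the end of the string. The last-chunk test is an illustration and not part of the original program, and it assumes, as in the sample, that the length of the data is known up front.

```rpgle
     h dftactgrp(*no)
     d CrtAlgCtx       pr                  extproc('Qc3CreateAlgorithmContext')
     d  AlgDsc                     4096a   const options(*varsize)
     d  FmtAlgDsc                     8a   const
     d  AlgCtxTkn                     8a
     d  ErrCde                             likeds(QUSEC)
     d DltAlgCtx       pr                  extproc('Qc3DestroyAlgorithmContext')
     d  AlgCtxTkn                     8a   const
     d  ErrCde                             likeds(QUSEC)
     d HshDta          pr                  extproc('Qc3CalculateHash')
     d  InpDta                     4096a   const options(*varsize)
     d  LenInpDta                    10i 0 const
     d  FmtInpDta                     8a   const
     d  AlgDsc                     4096a   const options(*varsize)
     d  FmtAlgDsc                     8a   const
     d  CryptoPrv                     1a   const
     d  CryptoDev                    10a   const
     d  HshVal                        1a   options(*varsize)
     d  ErrCde                             likeds(QUSEC)
     d Data            s             25a   inz('Some data to be hashed')
     d DataSubset      s              5a
     d HashValue       s             32a
     d X               s             10i 0
      /copy qsysinc/qrpglesrc,qc3cci
      /copy qsysinc/qrpglesrc,qusec
      /free
        QUSBPrv = 0;
        QC3Ha = 3;
        CrtAlgCtx(QC3D0500 :'ALGD0500' :QC3ACT :QUSEC);
        X = 1;
        dow X < %size(Data);
          DataSubset = %subst(Data :X :%size(DataSubset));
          // Assumption: the total length is known, so the chunk that
          // reaches the end of Data can be flagged as the final one
          if X + %size(DataSubset) > %size(Data);
            QC3FOF = '1';
          else;
            QC3FOF = '0';
          endif;
          HshDta(DataSubset :%size(DataSubset) :'DATA0100'
                 :QC3D0100 :'ALGD0100' :'0' :' '
                 :HashValue :QUSEC);
          X += %size(DataSubset);
        enddo;
        // HashValue was returned by the last call made within the DOW
        DltAlgCtx(QC3ACT :QUSEC);
        *inlr = *on;
        return;
      /end-free
```

With the 25-byte string and 5-byte chunks, the test is true only when X is 21, so the Calculate Hash API is called five times rather than six.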
In other cases, determining which data is the last to be hashed may not be as straightforward. The current approach, though, is structured in a way that makes it fairly easy to replace the string being hashed (the variable Data) with, for instance, a database file. The changes are to issue a READ to the file prior to entering the DOW, change the DOW conditioning to test for the file not being at %EOF, hash the record buffer previously read, and then READ the next record before re-entering the DOW. The program can then create a hash value over the contents of the entire file without having to determine, in advance, the number of records in the file.
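A minimal sketch of that file-based variation follows. The file name MYFILE, its program-described definition, the 80-byte record length, and the data structure name RcdBuf are placeholders chosen for illustration; substitute the definition of your own file. The API calls themselves are the same ones used in the string-based program.

```rpgle
     h dftactgrp(*no)
      * Assumption: MYFILE is a hypothetical file with 80-byte records
     fMYFILE    if   f   80        disk
     d CrtAlgCtx       pr                  extproc('Qc3CreateAlgorithmContext')
     d  AlgDsc                     4096a   const options(*varsize)
     d  FmtAlgDsc                     8a   const
     d  AlgCtxTkn                     8a
     d  ErrCde                             likeds(QUSEC)
     d DltAlgCtx       pr                  extproc('Qc3DestroyAlgorithmContext')
     d  AlgCtxTkn                     8a   const
     d  ErrCde                             likeds(QUSEC)
     d HshDta          pr                  extproc('Qc3CalculateHash')
     d  InpDta                     4096a   const options(*varsize)
     d  LenInpDta                    10i 0 const
     d  FmtInpDta                     8a   const
     d  AlgDsc                     4096a   const options(*varsize)
     d  FmtAlgDsc                     8a   const
     d  CryptoPrv                     1a   const
     d  CryptoDev                    10a   const
     d  HshVal                        1a   options(*varsize)
     d  ErrCde                             likeds(QUSEC)
     d RcdBuf          ds            80
     d HashValue       s             32a
      /copy qsysinc/qrpglesrc,qc3cci
      /copy qsysinc/qrpglesrc,qusec
      /free
        QUSBPrv = 0;
        QC3Ha = 3;
        CrtAlgCtx(QC3D0500 :'ALGD0500' :QC3ACT :QUSEC);
        QC3FOF = '0';
        // Read the first record before entering the DOW
        read MYFILE RcdBuf;
        dow not %eof(MYFILE);
          // Hash the record just read, then read the next one
          HshDta(RcdBuf :%size(RcdBuf) :'DATA0100'
                 :QC3D0100 :'ALGD0100' :'0' :' '
                 :HashValue :QUSEC);
          read MYFILE RcdBuf;
        enddo;
        // End of file: final call with no data returns the hash value
        QC3FOF = '1';
        HshDta(' ' :0 :'DATA0100'
               :QC3D0100 :'ALGD0100' :'0' :' '
               :HashValue :QUSEC);
        DltAlgCtx(QC3ACT :QUSEC);
        *inlr = *on;
        return;
      /end-free
```

Because the final call passes no data, the loop never needs to know which record is the last one.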
Having completed the hashing operation, the second new API, Destroy Algorithm Context, is called to destroy the context previously created with the Create Algorithm Context API (and subsequently referenced when calling the Calculate Hash API using format ALGD0100). The two parameters passed are the token identifying the context and the standard API error code. The program then ends.
Assuming that the previous program source is stored in member DOHASH2 of source file QRPGLESRC, you can create the program using the command CRTBNDRPG PGM(DOHASH2).
Calling DOHASH2 will then generate the following 32-byte HashValue (shown in hex):
7FCBD451 76B2E190 AFC956AE F507CB35
A9C7F58B 36A68E25 58E9F3F9 736FB213
Hopefully, you will notice that, while this hash value was created using multiple calls to the Calculate Hash API (five calls with data to be hashed and one call to access the hash value), the returned HashValue is the same as the value returned last month when hashing the string with one call. How the data was divided up for processing did not affect the derived hash value.
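One way to check this for yourself is to compute both values in the same program and compare them. The sketch below assumes that last month's single call passed QC3D0500 and 'ALGD0500' as the fourth and fifth parameters, as described earlier, with the remaining parameters unchanged. The variable names OneShotHash and ChunkedHash and the DSPLY messages are made up for this example.

```rpgle
     h dftactgrp(*no)
     d CrtAlgCtx       pr                  extproc('Qc3CreateAlgorithmContext')
     d  AlgDsc                     4096a   const options(*varsize)
     d  FmtAlgDsc                     8a   const
     d  AlgCtxTkn                     8a
     d  ErrCde                             likeds(QUSEC)
     d DltAlgCtx       pr                  extproc('Qc3DestroyAlgorithmContext')
     d  AlgCtxTkn                     8a   const
     d  ErrCde                             likeds(QUSEC)
     d HshDta          pr                  extproc('Qc3CalculateHash')
     d  InpDta                     4096a   const options(*varsize)
     d  LenInpDta                    10i 0 const
     d  FmtInpDta                     8a   const
     d  AlgDsc                     4096a   const options(*varsize)
     d  FmtAlgDsc                     8a   const
     d  CryptoPrv                     1a   const
     d  CryptoDev                    10a   const
     d  HshVal                        1a   options(*varsize)
     d  ErrCde                             likeds(QUSEC)
     d Data            s             25a   inz('Some data to be hashed')
     d DataSubset      s              5a
     d OneShotHash     s             32a
     d ChunkedHash     s             32a
     d X               s             10i 0
      /copy qsysinc/qrpglesrc,qc3cci
      /copy qsysinc/qrpglesrc,qusec
      /free
        QUSBPrv = 0;
        QC3Ha = 3;
        // One call: algorithm parameters passed directly (ALGD0500)
        HshDta(Data :%size(Data) :'DATA0100'
               :QC3D0500 :'ALGD0500' :'0' :' '
               :OneShotHash :QUSEC);
        // Multiple calls: same data, 5 bytes at a time, using a context
        CrtAlgCtx(QC3D0500 :'ALGD0500' :QC3ACT :QUSEC);
        QC3FOF = '0';
        X = 1;
        dow X < %size(Data);
          DataSubset = %subst(Data :X :%size(DataSubset));
          HshDta(DataSubset :%size(DataSubset) :'DATA0100'
                 :QC3D0100 :'ALGD0100' :'0' :' '
                 :ChunkedHash :QUSEC);
          X += %size(DataSubset);
        enddo;
        QC3FOF = '1';
        HshDta(' ' :0 :'DATA0100'
               :QC3D0100 :'ALGD0100' :'0' :' '
               :ChunkedHash :QUSEC);
        DltAlgCtx(QC3ACT :QUSEC);
        // Report whether the two approaches produced the same value
        if OneShotHash = ChunkedHash;
          dsply 'Hash values match';
        else;
          dsply 'Hash values differ';
        endif;
        *inlr = *on;
        return;
      /end-free
```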
As usual, if you have any API questions, send them to me at