Updated Mon, 26 Sep 2022 10:53:31 GMT

# NIST example shows extra hexadecimal characters in Block Contents of SHA512-256

I was recently trying to gain a better understanding of the SHA-512/256 algorithm, and this NIST example uses the string "abc" as the input. The Block Contents show the expected hexadecimal 616263 (abc), but why is it directly followed by 0x80, and why is there a 0x18 at the very end?

Block Contents:
W[ 0] = 6162638000000000
W[ 1] = 0000000000000000
W[ 2] = 0000000000000000
W[ 3] = 0000000000000000
W[ 4] = 0000000000000000
W[ 5] = 0000000000000000
W[ 6] = 0000000000000000
W[ 7] = 0000000000000000
W[ 8] = 0000000000000000
W[ 9] = 0000000000000000
W[10] = 0000000000000000
W[11] = 0000000000000000
W[12] = 0000000000000000
W[13] = 0000000000000000
W[14] = 0000000000000000
W[15] = 0000000000000018


I realize the zeros are padding; it's the presence of 0x80 and 0x18 that I don't understand.

## Solution

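This is the SHA-512 padding (SHA-512/256 pads its input exactly as SHA-512 does), with the length encoded in big-endian. In short:

• append a single 1 bit; for a byte-aligned message this shows up as the byte 0x80 (binary 10000000) immediately after 616263
• fill with zero bits
• then append the message length in bits as a 128-bit big-endian integer at the end. 0x18 is 24, the length in bits (not bytes) of the 3 characters.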
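The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not NIST's reference code: the function name `sha512_pad` is made up for this example, and it assumes the message is a whole number of bytes (the standard itself works at bit granularity).

```python
def sha512_pad(message: bytes) -> bytes:
    """Pad a byte-aligned message the way SHA-512 (and SHA-512/256) does.

    Hypothetical helper for illustration only.
    """
    bit_len = 8 * len(message)
    # Step 1: the single 1 bit, followed by seven 0 bits, is the byte 0x80.
    padded = message + b"\x80"
    # Step 2: zero bytes until exactly 16 bytes remain in a 128-byte block.
    padded += b"\x00" * ((112 - len(padded)) % 128)
    # Step 3: the message length in bits as a 128-bit big-endian integer.
    padded += bit_len.to_bytes(16, "big")
    return padded


block = sha512_pad(b"abc")
# Split the block into sixteen 64-bit big-endian words, as in the NIST dump.
words = [int.from_bytes(block[i:i + 8], "big") for i in range(0, len(block), 8)]
for i, w in enumerate(words):
    print(f"W[{i:2d}] = {w:016x}")
```

Running this prints the same sixteen words as the Block Contents in the question: `W[ 0]` is `6162638000000000`, `W[15]` is `0000000000000018`, and everything in between is zero.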

• More formally:

NIST FIPS 180-4, page 13, defines the padding scheme for SHA-512 as:

Suppose that the length of the message, $$M$$, is $$\ell$$ bits. Append the bit 1 to the end of the message, followed by $$k$$ zero bits, where $$k$$ is the smallest, non-negative solution to the equation $$\ell + 1 + k \equiv 896 \bmod 1024$$. Then append the 128-bit block that is equal to the number $$\ell$$ expressed using a binary representation.
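The smallest non-negative $$k$$ can be computed directly with a modular reduction. The snippet below is only a sketch of that arithmetic; the function name `zero_bits` is an assumption for this example and does not come from the standard.

```python
def zero_bits(message_bits: int) -> int:
    """Smallest k >= 0 with message_bits + 1 + k congruent to 896 mod 1024."""
    # Python's % always returns a non-negative result for a positive modulus,
    # so this yields the minimal non-negative solution directly.
    return (896 - (message_bits + 1)) % 1024


for bits in (0, 24, 895, 896):
    print(bits, zero_bits(bits))
# 0 895
# 24 871
# 895 0
# 896 1023
```

The last case shows what happens when the message leaves no room for the 1 bit and the 128-bit length in the same block: $$k$$ wraps around and the padding spills into a second 1024-bit block.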

• Your case, which fits in one block:

Your message length is $$\ell = 24$$ bits, so

$$24 + 1 + k \equiv 896 \bmod 1024$$, and solving for the minimal $$k$$ gives $$k = 896-24-1 = 871$$. Therefore the padded message is

$$\text{padded message} = \overbrace{M}^{24\text{ bits}} \mathbin\| \overbrace{1}^{1\text{ bit}} \mathbin\| \overbrace{000\cdots 000}^{871\text{ zero bits}} \mathbin\| \overbrace{00000000\cdots00011000}^{128\text{-bit binary encoding of the length } \ell}$$
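As a quick consistency check against the hex dump in the question (a worked restatement of the numbers above, not text from the standard): the four parts add up to exactly one 1024-bit block. At byte granularity, the 1 bit plus the first seven zero bits form the byte 0x80, leaving $$871 - 7 = 864$$ zero bits, i.e. 108 zero bytes.

$$24 + 1 + 871 + 128 = 1024 \text{ bits}, \qquad 3 + 1 + 108 + 16 = 128 \text{ bytes} = 16 \times 8 \text{ bytes}$$

The final 16 bytes hold the length, so they fill $$W[14]$$ and $$W[15]$$, which is why $$W[15]$$ ends in 0x18 = 24.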