 Cryptography

# Extending AES-GCM initialization vector

We all know that applications of crypto primitives like to not think about the management of nonces, and initialization vectors and often prefer to just set them to random values. This sometimes leads to problems, when then IV is too short.

For example, in AES-GCM, the variable part of the IV is only 64 bits. If for each message we choose the IV randomly, we will start getting collisions after a $$2^{32}$$ messages; which is very insecure depending on the protocol.

Now a hacky way to roll our own crypto would be the following:

### AES-GCM with extended-IV

First of all, we will stop using the normal IV part of the AES-GCM construction. Instead, for each message, we will mangle the key like this:

$$K' = \text{KDF}(K || \text{nonce})$$

where $$K$$ is the original key, $$K'$$ is the new key, and $$\text{nonce}$$ is a nonce that is long (say 256 bits) and randomly generated for each encryption; $$\text{KDF}$$ is assumed to be a properly domain-separated PRF that returns a new 256-bit value.

Now we encrypt our message using AES256-GCM with the new key. As mentioned, we set the IV to some kind of constant value. We transmit $$\text{nonce}$$ along with the ciphertext.

I would expect that, because collisions are possible in $$K'$$, this construction has only $$\text{len}(K') = 128$$ bits of security. However, I find it hard to reason about its security. The main question:

Could this scheme be used as an alternative for AES128-GCM, but with randomized nonces (similar to XSalsa20Poly1305)?

I mean hypothetically! I would not actually such a construction like that. I don't think that would make any sense.

Edit: As Poncho shows, this scheme is obviously not nonce-misuse resistant. I badly worded the question. I updated it. ## Solution

Well, supposing everything is properly domain separated, nonces are always generated randomly for encryption queries, and your KDF behaves like a perfectly random function, the first step is to bound the probability of a nonce repeat. A birthday bound tells us that this happens with probability at most $$q^2/2^{256}\,,$$ where $$q$$ is the number of encryption/decryption queries. So let's rule this case out, and move to the case where this never happens.

Next is the event of key collisions. Again, a simple birthday bound gives us a probability of at most $$q^2/2^{256}$$ that there are repeated derived keys.

Now what we have here is, in fact, an instance of multi-user GCM where the same nonce is always used. That is, you have up to $$q$$ independent instances of GCM, and to break the scheme it suffices to break any one of these instances. (Strictly speaking, the multi-user setting gives you more freedom in choosing how many queries per user you can use, whereas here you're limited to 1 (encryption) query per user.)

The multi-user security of GCM was conveniently analyzed by Hoang, Tessaro, and Thiruvengadam in the ideal cipher model, which tells us in Theorem 3.1 \begin{align*} \mathbf{Adv}_{\mathtt{CAU}[H,E]}^{\text{mu-ae}} \le &\frac{d(p+q) + n(q + \sigma + p)}{2^k} + \frac{\sigma(2B + cn + 3)}{2^n} \\& + \frac{2q+1}{2^{2n}} + \frac{\sigma(\sigma + ncd) + 2pq}{2^{k+n}}\,. \end{align*} Here we can set $$n = 128$$, $$k = 256$$, $$d = q$$ ($$d$$ is the number of nonce repetitions across users; since the nonce is fixed, this is equal to the number of queries), $$c$$ is a bound on the differential probability of $$H$$ (for GHASH, it can be set to $$B/2^n$$), $$B$$ is the maximum message size in blocks, $$\sigma$$ is the total number of blocks across queries (which can be bounded by $$qB$$), and $$p \le 2^{n-2}$$ is the number of offline AES evaluations (i.e., bruteforce of the key).

So, roughly speaking, as long as your messages are sufficiently short (to keep the $$O\left(\frac{qB^2}{2^{128}}\right)$$ term down), and the total number of queries remains well below $$2^{128}$$ (to keep the terms $$2q^2/2^{256}$$ and $$O\left(\frac{q^2 + pq}{2^{256}}\right)$$ down), as you suggested, this scheme should remain secure.