I'm writing an AES file encryption program, and I'd like to put in a way to tell whether or not the user has entered the correct password without decrypting the entire file and GCM telling me the tag is invalid.
My process is as follows:
Get the user to enter a password ($p$), generate a salt/nonce/IV ($n$), and use scrypt to derive two keys (the first and second halves of the derived key): $$k_1, k_2 = Scrypt(salt:n, keylen:32, n:2^{16}, r:8, p:1).derive(p)$$
Encrypt the data of a file with $\text{AES-128-GCM}(k_1, \text{nonce} = n)$ and empty associated data.
Write the encrypted data to the file so that the contents of the file are $n || gcmtag || data$
Would it be secure if I instead wrote the following to the file: $$n || gcmtag || k_2 || data$$
That means I can load $n$ from the file, take the password the user entered, derive the keys, and check whether the derived $k_2$ is equal to the $k_2$ loaded from the file.
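The check described above could be sketched roughly as follows. This is a minimal sketch using only Python's standard library; the function names `derive_keys` and `password_matches`, the `cost` argument, and the `maxmem` value are my own assumptions for illustration, and the AES-GCM step is left out.

```python
import hashlib
import hmac
from typing import Tuple


def derive_keys(password: str, salt: bytes, cost: int = 2**16) -> Tuple[bytes, bytes]:
    """Derive 32 bytes with scrypt and split them into k1 (encryption) and k2 (check)."""
    okm = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        n=cost,          # scrypt CPU/memory cost; 2**16 as in the question
        r=8,
        p=1,
        maxmem=2**27,    # assumed limit; the default is too small for n=2**16, r=8
        dklen=32,
    )
    return okm[:16], okm[16:]


def password_matches(password: str, salt: bytes, stored_k2: bytes, cost: int = 2**16) -> bool:
    """Re-derive k2 from the entered password and compare it with the stored copy."""
    _, k2 = derive_keys(password, salt, cost)
    # Constant-time comparison, to avoid leaking how many leading bytes matched.
    return hmac.compare_digest(k2, stored_k2)
```

With this, the program would read $n$ and $k_2$ from the file header, call `password_matches`, and only start decrypting with $k_1$ if it returns `True`.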
Your $k_2$ value functions in effectively the same way as conventional password verification, where you store a salted hash of each user's password. So it allows an adversary to test password guesses, but
Alternative to very strongly consider: instead of encrypting the whole file in one GCM encryption call, split it into chunks to be encrypted separately with some construction that protects against reordering, deletion and truncation. Study these examples:
The main reason is that this way you can encrypt and decrypt very large files with a fixed memory footprint, and still abort decryption as soon as you hit an inauthentic chunk. As a secondary benefit, it also indirectly tackles your problem: if the user enters the wrong password, decryption will fail on the first chunk.
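To make the chunking idea more concrete, here is a minimal sketch of one possible per-chunk nonce layout, in the style of the STREAM construction: a random 7-byte prefix, a 32-bit chunk counter, and a one-byte "last chunk" flag. The layout, the names `chunk_nonce` and `plan_chunks`, and the 64 KiB chunk size are assumptions for illustration, not necessarily what any particular library does. The AEAD call itself is omitted; each yielded pair would be passed to one AES-GCM encryption under $k_1$.

```python
import struct
from typing import Iterator, Tuple

NONCE_PREFIX_LEN = 7      # random per file (assumed layout)
CHUNK_SIZE = 64 * 1024    # plaintext bytes per chunk (arbitrary choice)


def chunk_nonce(prefix: bytes, index: int, is_last: bool) -> bytes:
    """Build a 12-byte nonce: prefix || 32-bit big-endian index || last-chunk flag."""
    if len(prefix) != NONCE_PREFIX_LEN:
        raise ValueError("prefix must be 7 bytes")
    # struct.pack raises struct.error if index does not fit in 32 bits.
    return prefix + struct.pack(">IB", index, 1 if is_last else 0)


def plan_chunks(
    data: bytes, prefix: bytes, chunk_size: int = CHUNK_SIZE
) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (nonce, plaintext_chunk) pairs; each pair would go to one AEAD call."""
    # An empty input still yields one (empty) final chunk, so a file truncated
    # to zero chunks can be told apart from a legitimately empty file.
    n_chunks = max(1, -(-len(data) // chunk_size))
    for i in range(n_chunks):
        chunk = data[i * chunk_size:(i + 1) * chunk_size]
        yield chunk_nonce(prefix, i, i == n_chunks - 1), chunk
```

Because the counter is part of the nonce, a chunk moved to a different position fails authentication, and because only the final chunk carries the flag, a file cut off after an earlier chunk is detected when the decryptor never sees a chunk that verifies as last.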
Potential downsides are: