Assume I have a perfect source of entropy, which is unknown to me, and is used to generate a private key (also unknown to me, but usable).
If I make a signature with this unknown private key on a constant piece of data (e.g., hash('foo')) and then hash that signature, is the original entropy preserved?
Put another way, does an elliptic-curve signature's output probability distribution resemble that of a perfect hash function (i.e., uniform), or do the outputs concentrate according to some other distribution over the space?
Ed25519 is a typical elliptic-curve signature scheme, in a group of large prime order $\ell \approx 2^{252}$ on a curve over the field $\mathbb F_p$ for $p = 2^{255} - 19$. A secret key is a uniform random 32-byte string; a signature is a 64-byte string encoding a point $R$ in the group generated by the standard base point, together with a scalar $s \in \mathbb Z/\ell\mathbb Z$.
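To make these magnitudes concrete, here is a minimal sketch in plain Python comparing the sizes involved, using the standard Ed25519 field prime and group order from RFC 8032:

```python
# Ed25519 parameters (RFC 8032): field prime p and group order l.
p = 2**255 - 19
l = 2**252 + 27742317777372353535851937790883648493

# A secret key is 32 uniformly random bytes: a space of 2^256 keys.
secret_key_space = 2**256
# A signature is (R, s): R is one of ~l group elements, and s is
# determined by R, so there are only about l ~ 2^252 signatures per message.
signature_space = l

print(l.bit_length())                       # 253, i.e. l ~ 2^252
print(signature_space < secret_key_space)   # True: collisions by pigeonhole
```

Since the signature space is smaller than the key space, the map from keys to signatures on a fixed message cannot be injective, which is the pigeonhole observation made below.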
Since for any fixed message $m$ the key-to-signature function is a map from a 32-byte space to a 64-byte space, it can't possibly induce a uniform distribution on the output space. In fact, for each choice of $R$, of which there are $\ell$ possibilities, there is a unique choice of $s$, so there are only about $2^{252}$ possible signatures; obviously, some secret keys must share a common signature on $m$ (but there are so few collisions that it doesn't matter in practice).
Further, the signatures are easily distinguishable from uniform random bit strings: the scalar is always less than $\ell$, and the encoding of $R$ contains an encoding of an integer $x$ such that $(1 + x^2)/(1 + (121665/121666) x^2)$ is a quadratic residue modulo $2^{255} - 19$.
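A sketch of that quadratic-residue condition, using Euler's criterion to test for squares modulo $p = 2^{255} - 19$ (the function names here are illustrative, not from any library):

```python
p = 2**255 - 19

def is_qr(a):
    # Euler's criterion: a^((p-1)/2) mod p is 1 for a quadratic residue
    # (and 0 for a = 0, which we also count as a square).
    return pow(a % p, (p - 1) // 2, p) in (0, 1)

def y_squared(x):
    # y^2 = (1 + x^2) / (1 + (121665/121666) x^2) mod p, rearranged from
    # the curve equation -x^2 + y^2 = 1 + d x^2 y^2 with d = -121665/121666.
    inv121666 = pow(121666, p - 2, p)          # modular inverse via Fermat
    num = (1 + x * x) % p
    den = (1 + (121665 * inv121666) % p * x * x) % p
    return num * pow(den, p - 2, p) % p

# x = 0 gives y^2 = 1, and the points (0, +1) and (0, -1) really are
# on the curve, so the test passes:
print(is_qr(y_squared(0)))   # True
```

For a uniformly random candidate coordinate this test passes only about half the time, and the check $s < \ell$ fails for most random 32-byte scalars, so uniform random 64-byte strings are easy to tell apart from signatures.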
Does the output distribution have approximately the same entropy as the uniform distribution on secret keys? At best it can have about 252 bits of entropy, because there are only about $2^{252}$ possible signatures, but that's pretty close to the 256 bits of entropy possible in a secret key. At worst it ought not to have substantially less than 128 bits of entropy, because if it did, there would be a random algorithm with substantially better probability of success at forgery than, e.g., Pollard's $\rho$ to recover the secret scalar from the public key, at comparable costs.
As it turns out, the scheme spreads signatures over these possibilities more or less uniformly at random: the choice of $R$ is determined by a scalar in $\mathbb Z/\ell\mathbb Z$ chosen by a pseudorandom function of the message, keyed by the secret key. So, to anyone without knowledge of the public key: Yes, the distribution on signatures has about 252 bits of entropy.
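That pseudorandom choice can be sketched with the actual Ed25519 nonce derivation from RFC 8032: the second half of SHA-512 of the secret key acts as a PRF key, and the nonce scalar is SHA-512 of that prefix and the message, reduced mod $\ell$ (a sketch of the nonce step only, not a full signing implementation):

```python
import hashlib

l = 2**252 + 27742317777372353535851937790883648493  # Ed25519 group order

def nonce_scalar(secret_key: bytes, message: bytes) -> int:
    """Derive the Ed25519 nonce r deterministically (RFC 8032 sketch).

    The second half of SHA-512(secret_key) is a secret "prefix";
    r = SHA-512(prefix || message) mod l selects the point R = [r]B.
    """
    h = hashlib.sha512(secret_key).digest()
    prefix = h[32:]
    r = int.from_bytes(hashlib.sha512(prefix + message).digest(), 'little')
    return r % l

sk = b'\x01' * 32   # stand-in for a uniform random 32-byte secret key
# Same key and message always give the same R; a different message gives
# an independent-looking pseudorandom scalar in Z/lZ.
print(nonce_scalar(sk, b'foo') == nonce_scalar(sk, b'foo'))   # True
print(0 <= nonce_scalar(sk, b'bar') < l)                      # True
```

Because $r$ is a PRF output keyed by the secret, anyone who doesn't know the key sees $R$ as essentially uniform over the group.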
This is why you can make a VRF out of it, although you need a little more work, such as an output filter, to get all the properties of a VRF, so that even knowing the public key is not enough to distinguish outputs from uniform random unless you also have the proof of the output.