Cryptography
pseudo-random-function terminology notation probability
Updated Thu, 21 Jul 2022 04:58:58 GMT

Probability conventions in cryptography


I am working on Victor Shoup's tutorial on game-based security proof and want to figure out some notions from the perspective of probability theory.

Consider the following PRF advantage defined on Page 11: $$ \bigl|\,\Pr[s \leftarrow S: A^{F_s}() = 1] - \Pr[f \leftarrow \Gamma_{\ell_1, \ell_2}: A^f() = 1]\,\bigr| $$ where $\{ F_s \}_{s \in S}$ is a family of keyed functions with key space $S$, domain $\{ 0,1 \}^{\ell_1}$ and range $\{ 0,1 \}^{\ell_2}$, and $\Gamma_{\ell_1, \ell_2}$ denotes the set of all functions from $\{ 0,1 \}^{\ell_1}$ to $\{ 0,1 \}^{\ell_2}$.

I have the following questions:

  1. Does the probability notion $\Pr[s \leftarrow S: \cdot]$ (or, $\Pr[\cdot | s \leftarrow S]$ ) has the same meaning as the notion $\Pr_{s \leftarrow S}[\cdot]$ that is also commonly used in crypto contexts (e.g., this)? For me, $\Pr[s \leftarrow S: \cdot]$, $\Pr[\cdot | s \leftarrow S]$, and $\Pr_{s \leftarrow S}[\cdot]$ seem to refer to the same conditional probability measure.

  2. How do we interpret a probability $\Pr[s \leftarrow S: A^{F_s}() = 1]$? Literally, I know that this captures the probability that the distinguisher $A$ outputs $1$ if it is given oracle access to a function $F_s$ keyed by $s \in S$, and the probability is taken over the random choice of $s$. However, from probability theory, what is the probability space $(\Omega, \mathcal{F}, \Pr)$ where the "event" that "the distinguisher $A$ with oracle access to $F_s$ outputs $1$" is defined? Does the sample space $\Omega$ contains all outcomes of $s \leftarrow S$ (so that we can say "the probability is taken over the random choice of $s$")? If so, does this implies that the two probabilities in the PRF advantage come from two different probability spaces?




Solution

I completely understand your thoughts, I also had a similar question here.

I will answer your question based on my understanding, I would like to hear any thoughts or disagreements, because I haven't completely figured out the situtation yet.

Let's consider the Kolmogorov's definition of probability and forget about Random variables (these are used just to help us by converting a probability space to one that can be more efficiently handled).

A Probability Space is just $(, \mathcal{F}, Pr)$ where:

  • $$ : is the set of the events
  • $\mathcal{F}$ : is a -algebra defined on $$, it is the event space.
  • $Pr$ : is a set function $P : \mathcal{F} \rightarrow \mathbb{R}$
  1. How do we interpret a probability $\Pr[s \leftarrow S: A^{F_s}() = 1]$? Literally, I know that this captures the probability that the distinguisher $A$ outputs $1$ if it is given oracle access to a function $F_s$ keyed by $s \in S$, and the probability is taken over the random choice of $s$. However, from probability theory, what is the probability space $(\Omega, \mathcal{F}, \Pr)$ where the "event" that "the distinguisher $A$ with oracle access to $F_s$ outputs $1$" is defined? Does the sample space $\Omega$ contains all outcomes of $s > \leftarrow S$ (so that we can say "the probability is taken over the random choice of $s$")? If so, does this implies that the two probabilities in the PRF advantage comes from two different probability spaces?

There are multiple level of approaches you can take, depending on the time have.

Consider the algorithm $B() = s \leftarrow S; A^{F_s}()$ that outputs whatever $A$ outputs. Now we can rewrite the above as $Pr[B() = 1]$. The $=\{0,1\}$ since $A$ is a distinguisher, see here. The $\mathcal{F} = \{\emptyset, \{1\}, \{0\},\{0,1\}\} = \mathcal{P}()$. Also let $Pr$ be a probability measure. Now I think it is clear.

But let's try to rewrite this. Consider a uniform distribution on $S$. That is $=S$, $\forall s \in S, Pr(s) = 1/|S|$, this is the probability measure, you can check that it satisfies the definition. Also $\mathcal{F}=\{\emptyset, s_1, s_1^c, s_2, s_2^c, \dots, \}$. In my opinion the $\mathcal{F}$ here is an overkill because we are only interested in single events, it is a way to model a simple random algorithm that outputs a single element. Now consider $B(s) = A^{F_s}()$. Now we have two probability spaces and we want to create a conditional probability. So we have

$$Pr(B(s)|s)=\dfrac{Pr(B(s),s)}{Pr(s)}$$

We can again create a -algebra $\mathcal{F}$ and repeat the process as above, but it will require a lot of time.

  1. Does the probability notion $\Pr[s \leftarrow S: \cdot]$ (or, $\Pr[\cdot | s \leftarrow S]$ ) has the same meaning as the notion $\Pr_{s \leftarrow S}[\cdot]$ that is also commonly used in crypto contexts (e.g., [this][2])? For me, $\Pr[s \leftarrow S: \cdot]$ and $\Pr[\cdot | s \leftarrow S]$ seem to be two conditional probabilities, and $\Pr_{s \leftarrow S}[\cdot]$ is like a conditional probability measure.

From my experience I think they refer to the same conditional probability measure. If someone has another view he can leave a comment to discuss it or update the answer.

My opinion: At the end of the day all of this is just notation to express the same thing. You can either consider the higher level approach I gave first, where you consider a random process that has embedded some other random processes, or you can analyze every random process itself and consider their relations. In cryptography we are not really interested in the low level mathematical mechanisms, this is why you find so many different notations. It is about the subject we want to describe and authors will always the higher level approach as soon as it is can be understood by the reader for the purpose of the specific matter.





Comments (5)

  • +0 – Thank you very much for your answer. I edited my first question. In this question, I want to figure out whether these three notations refer to the same probability measure so that we can meaningfully calculate probabilities assigned by this measure, as well as useful bounds (e.g., union bound). However, I am not sure whether these notations are probability measures since $s \leftarrow S$ is not an event ... — Jun 21, 2022 at 09:32  
  • +0 – Hello, I updated my answer. In my opinion and from my experience I think they refer to the same measure, but again I am not a mathematician, I'm a computer scientist. — Jun 21, 2022 at 10:14  
  • +0 – Hi, thanks again for your inspiring answer. I added my understanding to this question and tried to answer the two questions from the perspective of random variables. The answers seem to be reasonable to me. — Jun 21, 2022 at 14:26  
  • +0 – You better create an answer with your understanding. I am reading it now... — Jun 21, 2022 at 14:50  
  • +0 – I think mathematically speaking, you miss something on your explanation. According to the definition, you can only define a probability measure on a -algebra. At some point you define it on the event space $$. — Jun 21, 2022 at 15:05