I am reading the proof from here and found a technical lemma that seems to be incorrect (its proof is short and very vague). I know this is rather specific and the context is problematic, but I couldn't figure it out myself.
In page 67 (when proving a claim that is used for soundness of local decoding of low degree polynomials procedure), the writers use a "fact" A.9 (regarding "Well Distribution Lemma in Curves") that is located in page 156.
The proof is very short and I cannot grasp it.
Moreover, the statement seems to be incorrect by the following example:
Let $S=\{x_1\}$, then $\forall$ curve $C\in P(\langle x_1,\dots,x_k\rangle)$ we have $|C\cap S|\geq 1$ as $C$ is a multiset, so $$\mathbb{E}\left[\frac{|C\cap S|}{|C|}\right]\geq\mathbb{E}\left[\frac{1}{|C|}\right]>\frac{1}{|\mathbb{F}|^m}=\frac{|S|}{|\mathbb{F}|^m}$$ Maybe, I miss something in the statement of the lemma...
I will appreciate any help in understanding this problem.
Notation: Let $P(\langle x_1,\dots,x_k\rangle)$ the set of degree $k$ curves that evaluates to $x_1,\dots,x_k\in\mathbb{F}^m$ at the first $k$ field elements in $\mathbb{F}$ and we will use just $P$ as a shorthand for this set. Let $S$ be any subset of $ \mathbb{F}^m$. Below, we assume that multiplicity is taken into account when set cardinalities are computed and for any curve $C\in P$, $|C|:=|\{C(i) : k+1<i\leq \mathbb{F} \}|$.
Claim 1: $\mathbb{E}_{C}\big[ \frac{|C\cap S|}{|C|} \big]=\frac{|S|}{|\mathbb{F}|^m}$ where $C$ is a random curve in $P$.
Proof: $\mathbb{E}_{C}\big[ \frac{|C\cap S|}{|C|} \big]=\sum_{C'\in P}\Pr_{C}\big[C=C' \big]\cdot\Pr_{i}\big[C'(i) \in S\big]= \Pr_{C,i} \big[C(i)\in S \big]= \frac{|S|}{|F|^m} $
Here $C$ is a random curve in $P$ and $i$ is randomly sampled from $\{k+1,\dots,\mathbb{F} \}$. The last line holds because sampling a random curve from $P$ and then evaluating the curve at a random $i$ gives a random element from $\mathbb{F}^m$. To see why this is true consider the bipartite graph on $P\cup \mathbb{F}^m$ where for each $i\in \{k+1,\dots,\mathbb{F}\}$, each curve $C\in P$ has a neighbor $C(i)\in \mathbb{F}^m$. Its easy to check that this is a bi-regular graph. Thus, the distribution that samples a random $C$ and outputs a random neighbor $C(i)\in \mathbb{F}^m$ , is same as the distribution that randomly outputs an element from $\mathbb{F}^m$. The claim follows.
Remark 1: Claim 1 implies that if $|S|/|\mathbb{F}|^m\leq \delta$ then we must have $\Pr_{c}[\Pr_{i}[~c(i)\in \mathcal{S}] > \sqrt\delta] \leq \sqrt{\delta}$. This I believe is the claim (from the linked thesis) that OP refers to.
Remark 2: What happens if we just sample a random point $i$ from the entire field $\mathbb{F}$ instead of $\{k+1,\dots,\mathbb{F} \}$. Nothing will change much as it will incur a small(if $k\ll |\mathbb{F}|$) additive error of $k/|\mathbb{F}|$. In [1] below this error is there in the argument in page 38.
[1] https://madhu.seas.harvard.edu/papers/1997/arora-journ.pdf
External links referenced by this document: