Theoretical Computer Science
it.information-theory coding-theory st.statistics
Updated Thu, 02 Jun 2022 06:18:38 GMT

Notation of sequences in rate distortion theory

I have been reading whatever sources I could get my hands on today, regarding this problem. Most notes online about rate distortion theory come from the book Elements of Information Theory by Thomas M. Cover and Joy A. Thomas. The book seems well regarded so I assume i am misunderstanding something.

I.e the following slides. I am confused by the notation - which comes from the book. If we look at slides 14 and 16 in the slides deck linked. Firstly on slide 14 they denote a sequence of random variables (i.e a sequence of bits, as far as I understand) as $$X^n$$ $$X^n = (X_{1}, \ldots, X_{n}), \quad X_{n} \sim p(x)$$ and the codeword for that sequence as $$\hat{X}^n$$. Now moving on to slide 16 they seem to mix between $$X^n$$ and $$x^n$$ i.e they mention that the distortion between sequences $$d\left(x^{n}, \hat{x}^{n}\right)=\frac{1}{n} \sum_{i=1}^{n} d\left(x_{i}, \hat{x}_{i}\right)$$

And distortion for a $$\left(2^{n R}, n\right)$$ code: $$D=E d\left(X^{n}, g_{n}\left(f_{n}\left(X^{n}\right)\right)\right)=\sum_{x^{n}} p\left(x^{n}\right) d\left(x^{n}, g_{n}\left(f_{n}\left(x^{n}\right)\right)\right)$$

but as far as i can $$x^n$$ and $$X^n$$, and consequently $$\hat{x}^n$$ and $$\hat{X}^n$$ denote the same thing? What is the idea behind this, if any? It would make more sense to me to keep the capitalization consistent unless I am missing something? This notation is consistent across all sources I could find.

Solution

In information theory notation, capital letters such as $$X$$ denote random variables, and lowercase letters such as $$x$$ mean their possible outcomes (i.e., fixed values). For example you can write the definition of expectation as $$E[X] = \sum p(x) \cdot x$$. Moreover, vector values are denoted using length as superscripts, for example $$X^n$$ means $$(X_1, \ldots, X_n)$$ (similarly for $$x^n$$). Moreover a substring $$(X_i, \ldots, X_j)$$ is often denoted by $$X_i^j$$. These slides seem to be consistent with this convention.