Theoretical Computer Science
ds.algorithms lower-bounds it.information-theory
Updated Tue, 21 Jun 2022 03:25:49 GMT

Showing that interval-sum queries on a binary array cannot be done using linear space and constant time


You are given an $n$-sized binary array.

I want to show that no algorithm can do the following (or to be surprised and find out that such algorithms exist after all):

1) Pre-process the input array using unlimited time, but only $O(n)$ bits of space.

2) Answer queries in constant time, where query $(x,y)$ asks for the number of set bits between index $x$ and index $y$ in the array.

It seems that constant time per query should not allow the algorithm to read enough information to compute the number of set bits.

How can we prove that no such algorithm exists?

A more general question would be,

given that the algorithm is allowed to use $f(n)$ space, what lower bound on the query time can we derive?

Obviously, if we have $f=\Omega(n\log n)$ space we can store all the partial sums and answer queries in $O(1)$ time, but what if $f$ is smaller?
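To make the baseline concrete, here is a minimal sketch of the partial-sums approach mentioned above. The function names are mine, and the counts are stored as plain Python integers, which is why this corresponds to the $\Theta(n\log n)$-bit regime: each of the $n+1$ stored counts can need up to $\log n$ bits.

```python
def build_prefix_sums(bits):
    """prefix[i] = number of set bits in bits[0:i].
    Stores n+1 counts of up to log(n) bits each: Theta(n log n) bits total."""
    prefix = [0]
    for b in bits:
        prefix.append(prefix[-1] + b)
    return prefix

def interval_sum(prefix, x, y):
    """Number of set bits in positions x..y (inclusive), in O(1) time."""
    return prefix[y + 1] - prefix[x]
```

For example, on the array `[1, 0, 1, 1, 0, 1]` the query $(1, 3)$ returns $2$, since positions 1..3 hold the bits `0, 1, 1`.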


You may assume that the size of a memory word is $\Theta(\log n)$ and we can read the $x,y$ indices in constant time.




Solution

I believe that what you are looking for is a compact data structure supporting the rank operation. See...

https://en.m.wikipedia.org/wiki/Succinct_data_structure

Specifically, you can modify Emil's (first) solution to remove the popcount operation and replace it with a lookup table (for the details, see the Wikipedia article). By reducing the size of a block to $(\log n)/2$ bits, the lookup table uses $o(n)$ bits.





Comments (5)

  • +0 – (* facepalm *) Lookup table, obviously. Why didn't I think of that. — Feb 24, 2016 at 20:31  
  • +0 – In view of the OP's comment above, it's worth noting that the structure can be easily implemented as a sliding window so that moving the window by one bit (or even block) also takes constant time. — Feb 24, 2016 at 22:08  
  • +0 – @EmilJeřábek - translating your solution to the sliding window setting isn't straightforward. Instead of counting the number of set bits in the entire part of the array that precedes the current block, we need to count the number of set bits within the sliding window that precedes this block, which doesn't seem doable in constant time. Am I missing anything? — Feb 25, 2016 at 09:17  
  • +0 – No, you don't need to do that. Since you will ultimately subtract the count for $x$ from the count for $y$, it doesn't matter if all the stored counts are off by a fixed amount; in other words, the left edge of the sliding window can be assigned an arbitrary count. Thus, when moving the window, you only need to update the count for the block where the new element resides. From time to time, you need to reduce all the counts so that they do not occupy too much space, but this little work can easily be distributed so that it takes only $O(1)$ time per query. — Feb 25, 2016 at 11:07  
  • +0 – Actually, the nuisance in the last sentence can be elegantly avoided altogether by computing all counts modulo a fixed number greater than $n$. — Feb 25, 2016 at 12:13  

