I wanted to ask about the meaning of this fragment from the Postgres docs:
The storage requirement for a short string (up to 126 bytes) is 1 byte plus the actual string, which includes the space padding in the case of character. Longer strings have 4 bytes of overhead instead of 1.
Let's assume that I have a varchar(255) field. And now, the following statements:
Unsurprisingly, the manual is right. But there is more to it.
For one, size on disk (in any table, even when not actually stored on disk) can be different from size in memory. On disk, the overhead for short varchar values up to 126 bytes is reduced to 1 byte, as stated in the manual. But the overhead in memory is always 4 bytes (once individual values are extracted).
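The 126-byte threshold can be checked directly. The following is a minimal sketch: the table name t_len is made up for illustration, and the payload uses the single-byte character 'x' so that characters and bytes coincide.

```sql
-- Hypothetical scratch table, not part of the original question
CREATE TEMP TABLE t_len (v varchar);

INSERT INTO t_len VALUES
   (repeat('x', 126))   -- longest payload that still fits the 1-byte header
 , (repeat('x', 127));  -- one byte more: needs the 4-byte header

SELECT octet_length(v)   AS payload_bytes
     , pg_column_size(v) AS stored_bytes   -- payload plus header, as stored in the table
FROM   t_len;
```

If the rule from the manual holds, stored_bytes should come out as 127 for the first row (126 + 1) and 131 for the second (127 + 4).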
The same is true for char(n), except that char(n) is blank-padded to n characters and you normally don't want to use it. Its effective size can still vary in multi-byte encodings because n denotes a maximum of characters, not bytes:
strings up to n characters (not bytes) in length.
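To see both effects (blank padding, and characters versus bytes), one can compare byte lengths of char(n) values. This is only a sketch and assumes a database with UTF8 server encoding; other encodings will report different byte counts.

```sql
-- Assumes UTF8 server encoding
SELECT octet_length('a'::char(3))   AS padded_ascii    -- 'a' plus two padding blanks
     , octet_length('äöü'::char(3)) AS three_umlauts;  -- three characters, two bytes each in UTF8
```

In a UTF8 database this should report 3 and 6: the same declared length n = 3, but different sizes in bytes. Note that octet_length() on a char(n) value does not strip the trailing blanks.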
All of them use
"char" (with double-quotes) is a different creature and always occupies a single byte.
Untyped string literals ('foo') have a single byte overhead. Not to be confused with typed values!
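The difference between "char", untyped literals and typed values can be made visible with pg_column_size(). A small sketch, with column aliases chosen only for illustration:

```sql
SELECT pg_column_size('a'::"char")    AS internal_char    -- fixed-size type, no length header
     , pg_column_size('foo')          AS untyped_literal  -- literal of type "unknown"
     , pg_column_size('foo'::text)    AS typed_text       -- typed value with in-memory header
     , pg_column_size('foo'::varchar) AS typed_varchar;   -- same as in the demo query
```

Following the rules above, the expected sizes are 1 byte for "char", 4 bytes for the untyped literal (3 + 1) and 7 bytes for each typed value (3 + 4).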
CREATE TEMP TABLE t (id int, v_small varchar, v_big varchar);

INSERT INTO t VALUES (1, 'foo', '12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890');

SELECT pg_column_size(id)      AS id
     , pg_column_size(v_small) AS v_small
     , pg_column_size(v_big)   AS v_big
     , pg_column_size(t)       AS t
FROM   t
UNION  ALL  -- 2nd row measuring values in RAM
SELECT pg_column_size(1)
     , pg_column_size('foo'::varchar)
     , pg_column_size('12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890'::varchar)
     , pg_column_size(ROW(1, 'foo'::varchar, '12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890'::varchar));

 id | v_small | v_big |  t
----+---------+-------+-----
  4 |       4 |   144 | 176
  4 |       7 |   144 | 176
As you can see:
Storage of integer has no overhead (but it has alignment requirements that can impose padding).
The overhead of the small varchar is still just 1 byte while it has not been extracted from the row, as can be seen from the row size. (That's why it's sometimes a bit faster to select whole rows.)
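The alignment padding mentioned for integer can be observed by changing the column order inside a row. This is a rough sketch, assuming a typical build where bigint is aligned on 8 bytes and integer on 4; the aliases are made up for illustration.

```sql
SELECT pg_column_size(ROW(1::int, 1::bigint)) AS int_then_bigint   -- bigint may need padding after the int
     , pg_column_size(ROW(1::bigint, 1::int)) AS bigint_then_int;  -- int fits right after the bigint
```

Under that assumption the first row should be reported as larger than the second, although both hold exactly the same values.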