I have a very strange misreporting of the size of one specific database.
According to \l+ in psql, the DB size is 292 MB.
Summing the sizes reported by the following statement also gives very close to 292 MB.
SELECT
    table_name,
    pg_size_pretty(total_size) AS total_size
FROM (
    SELECT
        table_name,
        pg_total_relation_size(table_name) AS total_size
    FROM (
        SELECT ('"' || table_schema || '"."' || table_name || '"') AS table_name
        FROM information_schema.tables
    ) AS all_tables
    ORDER BY total_size DESC
) AS pretty_sizes
Yet a pg_dump of this database produces an SQL file of 2.02 GB.
I suspect that both counting methods skip LOBs, since there are two tables in this database that contain TEXT fields with largish content (up to ~4 MB per row).
Is there a way to include LOBs in the size calculation?
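For completeness, if the database really were using large objects, their storage could be measured separately, since they all live in the pg_largeobject system catalog. A minimal sketch (assuming sufficient privileges on that catalog):

-- Total on-disk size of all large objects in the database.
SELECT pg_size_pretty(pg_total_relation_size('pg_largeobject')) AS lob_size;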
EDIT: It gets even stranger:
This query:
SELECT SUM(LENGTH(text_column))
FROM some_table
gives a result of 2,091,245,318 (roughly 2 GB), which is about the size of the dump.
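Note that LENGTH() counts characters, not bytes, so with a multi-byte encoding the actual payload can be larger than this figure suggests. A sketch using octet_length() instead (some_table and text_column being the placeholder names above):

-- Byte length rather than character count.
SELECT pg_size_pretty(SUM(octet_length(text_column))) AS total_bytes
FROM some_table;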
- Your question related to LOBs is answered here: stackoverflow.com/questions/18636850/… – András Váczi, Jun 16, 2014 at 12:33
- Unfortunately this is not the answer (see my edit above). – UloPe, Jun 16, 2014 at 14:08
2 Answers
Note that unless your application is going out of its way to use the large objects interface, it is not storing those columns as LOBs. Normal columns can store up to 1 GB using TOAST (The Oversized-Attribute Storage Technique).
In addition to storing the data out of line, TOAST may store it compressed. Although you don't say so, I assume the dump you are comparing with is uncompressed, which could account for most of the size difference.
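To see how much of a table actually lives in its TOAST relation, something along these lines should work (a sketch using the placeholder name some_table from the question; on reasonably recent versions pg_relation_size() returns NULL when there is no TOAST table):

-- Heap size vs. the size of the table's associated TOAST relation.
SELECT c.relname,
       pg_size_pretty(pg_relation_size(c.oid))           AS heap_size,
       pg_size_pretty(pg_relation_size(c.reltoastrelid)) AS toast_size
FROM pg_class c
WHERE c.relname = 'some_table';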
- Hm, compression might explain it. Is there a way to query if rows are stored compressed? – UloPe, Jun 16, 2014 at 14:12
- @UloPe Not directly. However, TEXT and other (VAR)CHAR types have a default storage strategy of 'EXTENDED', which means they definitely will be compressed if > TOAST_TUPLE_THRESHOLD (2 kB). Read the page I linked about TOAST (a sketch of how to check follows these comments). – harmic, Jun 16, 2014 at 23:49
- Your edit seems to validate my hypothesis. What size is your dump if you compress it using e.g. gzip? – harmic, Jun 16, 2014 at 23:55
- Indeed. Compressing the dump file with gzip reduces its size to about 150 MB. Still, I think it would be nice if there was a way to get the "real" size of a database's content. – UloPe, Jun 17, 2014 at 12:09
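Following up on the compression question in the comments, a sketch for inspecting individual values (placeholder names from the question again): pg_column_size() reports the stored, possibly compressed, size of a datum, while octet_length() reports its raw byte length, so a large gap between the two indicates compression.

-- A stored size much smaller than the raw byte length means
-- the value is kept compressed in TOAST.
SELECT octet_length(text_column)   AS raw_bytes,
       pg_column_size(text_column) AS stored_bytes
FROM some_table
ORDER BY octet_length(text_column) DESC
LIMIT 10;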
An SQL file will always be much bigger than the data stored in your DB. It's a text representation of binary data.
- I'm aware of that. However, in this case we're talking about an order-of-magnitude size difference in a database where the prevalent data type is text (see my edit in the original question). – UloPe, Jun 16, 2014 at 13:58
- I think large objects are compressed in PostgreSQL, which would explain why the text format is that much bigger... – Leo, Jun 16, 2014 at 14:27
- This is definitely not true; it very much depends on the data. I've done a short test on one of my toy databases: it reports 8554 kB in psql, while the text-format dump is 2279 kB. – András Váczi, Jun 16, 2014 at 15:49