
I have a very strange misreporting of the size of one specific database.

According to \l+ in psql, the DB size is 292 MB.
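
As far as I can tell, \l+ takes this figure from pg_database_size, so the same number can be checked directly:

-- Should report the same size \l+ shows for the current database
SELECT pg_size_pretty(pg_database_size(current_database()));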

Summing the sizes reported by the following statement also gives very close to 292 MB.

SELECT
    table_name,
    pg_size_pretty(total_size) AS total_size
FROM (
    SELECT
        table_name,
        pg_total_relation_size(table_name) AS total_size
    FROM (
        SELECT ('"' || table_schema || '"."' || table_name || '"') AS table_name
        FROM information_schema.tables
    ) AS all_tables
    ORDER BY total_size DESC
) AS pretty_sizes

Yet a pg_dump of this database produces an SQL file of 2.02 GB.

I suspect that both counting methods skip LOBs, since there are two tables in this database with TEXT columns holding fairly large content (up to ~4 MB per row).

Is there a way to include LOBs in the size calculation?
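
For reference, data stored through the large-object interface lives in the pg_catalog.pg_largeobject catalog, so its footprint can be measured with:

-- Total on-disk size of all large objects in this database
SELECT pg_size_pretty(pg_total_relation_size('pg_catalog.pg_largeobject'));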

EDIT: It gets even stranger:

This query:

SELECT SUM(LENGTH(text_column))
FROM some_table

Gives a result of 2,091,245,318 (roughly 1.95 GiB, i.e. about 2.09 GB), which is about the size of the dump.
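
Note that LENGTH counts characters, not bytes; a byte-based comparison against the table's on-disk size looks like this (same names as in the query above):

-- octet_length counts bytes; pg_total_relation_size includes TOAST and indexes
SELECT pg_size_pretty(SUM(octet_length(text_column))::bigint) AS raw_bytes,
       pg_size_pretty(pg_total_relation_size('some_table'))   AS on_disk
FROM some_table;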

asked Jun 16, 2014 at 11:03
  • Your question related to LOBs is answered here: stackoverflow.com/questions/18636850/… Commented Jun 16, 2014 at 12:33
  • Unfortunately this is not the answer (see my edit above) Commented Jun 16, 2014 at 14:08

2 Answers


Note that unless your application is going out of its way to use the large-object interface, it is not storing those columns as LOBs. Normal columns can store up to 1 GB per value using TOAST (The Oversized-Attribute Storage Technique).
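
To see how much of a table actually sits in its TOAST table, something along these lines should work (substitute your real table name for some_table):

-- Compare the main heap size with the associated TOAST table size
SELECT c.relname,
       pg_size_pretty(pg_relation_size(c.oid))              AS heap_size,
       pg_size_pretty(COALESCE(pg_relation_size(t.oid), 0)) AS toast_size
FROM pg_class c
LEFT JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE c.relname = 'some_table';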

In addition to being stored out of line, the data may be stored compressed. Although you don't say so, I assume the dump you are comparing against is uncompressed, which could account for most of the size difference.
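
To see whether compression is in play, you can compare the raw byte length of the column with what pg_column_size reports, which is the stored (possibly compressed) size of each value. A sketch using the names from the question:

-- If stored_size is much smaller than raw_size, the values are being compressed
SELECT pg_size_pretty(SUM(octet_length(text_column))::bigint)   AS raw_size,
       pg_size_pretty(SUM(pg_column_size(text_column))::bigint) AS stored_size
FROM some_table;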

answered Jun 16, 2014 at 13:01
  • Hm, compression might explain it. Is there a way to query whether rows are stored compressed? Commented Jun 16, 2014 at 14:12
  • @UloPe Not directly. However, TEXT and other (VAR)CHAR types have a default storage strategy of 'EXTENDED', which means they will definitely be compressed if > TOAST_TUPLE_THRESHOLD (2 kB). Read the page I linked about TOAST. A way to inspect the storage strategy is sketched after these comments. Commented Jun 16, 2014 at 23:49
  • Your edit seems to validate my hypothesis. What size is your dump if you compress it using e.g. gzip? Commented Jun 16, 2014 at 23:55
  • Indeed. Compressing the dump file with gzip reduces its size to about 150 MB. Still, I think it would be nice if there were a way to get the "real" size of a database's content. Commented Jun 17, 2014 at 12:09
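
A sketch of the storage-strategy check mentioned above (hypothetical table name; 'x' in attstorage means EXTENDED, i.e. compressible and movable out of line):

-- List the storage strategy of each user column of the table
SELECT attname, attstorage
FROM pg_attribute
WHERE attrelid = 'some_table'::regclass
  AND attnum > 0
  AND NOT attisdropped;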
-1

An SQL file will always be much bigger than the data stored in your DB. It's a text representation of binary data.

answered Jun 16, 2014 at 13:38
  • I'm aware of that. However, in this case we're talking about an order-of-magnitude size difference in a database where the prevalent data type is text; see my edit in the original question. Commented Jun 16, 2014 at 13:58
  • I think large objects are compressed in PostgreSQL, which would explain why the text format is that much bigger... Commented Jun 16, 2014 at 14:27
  • This is definitely not true. It very much depends on the data. I've done a short test on one of my toy databases: it reports 8554 kB in psql, while the text-format dump is 2279 kB. Commented Jun 16, 2014 at 15:49
