I have a very strange misreporting of the size of one specific database.
According to \l+ in psql, the DB size is 292 MB.
Summing the sizes reported by the following statement also gives very close to 292 MB.
SELECT
    table_name,
    pg_size_pretty(total_size) AS total_size
FROM (
    SELECT
        table_name,
        pg_total_relation_size(table_name) AS total_size
    FROM (
        SELECT ('"' || table_schema || '"."' || table_name || '"') AS table_name
        FROM information_schema.tables
    ) AS all_tables
    ORDER BY total_size DESC
) AS pretty_sizes
Yet a pg_dump of this database produces an SQL file of 2.02 GB.
I suspect that both counting methods skip LOBs, since there are two tables in this database that contain TEXT fields with largish content (up to ~4 MB per row).
Is there a way to include LOBs in the size calculation?
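For completeness, if the database really were using large objects, their storage could be measured separately, since they all live in the pg_largeobject system catalog. A minimal sketch (assuming sufficient privileges on that catalog):

-- Total on-disk size of all large objects in the database.
SELECT pg_size_pretty(pg_total_relation_size('pg_largeobject')) AS lob_size;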
EDIT: It gets even stranger:
This query:
SELECT SUM(LENGTH(text_column))
FROM some_table
gives a result of 2,091,245,318 (roughly 2 GB), which is about the size of the dump.
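Note that LENGTH() counts characters, not bytes, so with a multi-byte encoding the actual payload can be larger than this figure suggests. A sketch using octet_length() instead (some_table and text_column being the placeholder names above):

-- Byte length rather than character count.
SELECT pg_size_pretty(SUM(octet_length(text_column))) AS total_bytes
FROM some_table;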
- Your question related to LOBs is answered here: stackoverflow.com/questions/18636850/… – András Váczi, Jun 16, 2014 at 12:33
- Unfortunately this is not the answer (see my edit above). – UloPe, Jun 16, 2014 at 14:08
2 Answers
Note that unless your application is going out of its way to use the large objects interface, it is not storing those columns as LOBs. Normal columns can store up to 1 GB using TOAST (The Oversized-Attribute Storage Technique).
In addition to storing the data out of line, TOAST may store it compressed. Although you don't say so, I assume the dump you are comparing with is uncompressed, which could account for most of the size difference.
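To see how much of a table actually lives in its TOAST relation, something along these lines should work (a sketch using the placeholder name some_table from the question; on reasonably recent versions pg_relation_size() returns NULL when there is no TOAST table):

-- Heap size vs. the size of the table's associated TOAST relation.
SELECT c.relname,
       pg_size_pretty(pg_relation_size(c.oid))           AS heap_size,
       pg_size_pretty(pg_relation_size(c.reltoastrelid)) AS toast_size
FROM pg_class c
WHERE c.relname = 'some_table';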
- Hm, compression might explain it. Is there a way to query if rows are stored compressed? – UloPe, Jun 16, 2014 at 14:12
- @UloPe Not directly. However, TEXT and other (VAR)CHAR types have a default storage strategy of 'EXTENDED', which means they definitely will be compressed if > TOAST_TUPLE_THRESHOLD (2 kB). Read the page I linked about TOAST (a sketch of how to check follows these comments). – harmic, Jun 16, 2014 at 23:49
- Your edit seems to validate my hypothesis. What size is your dump if you compress it using e.g. gzip? – harmic, Jun 16, 2014 at 23:55
- Indeed. Compressing the dump file with gzip reduces its size to about 150 MB. Still, I think it would be nice if there was a way to get the "real" size of a database's content. – UloPe, Jun 17, 2014 at 12:09
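Following up on the compression question in the comments, a sketch for inspecting individual values (placeholder names from the question again): pg_column_size() reports the stored, possibly compressed, size of a datum, while octet_length() reports its raw byte length, so a large gap between the two indicates compression.

-- A stored size much smaller than the raw byte length means
-- the value is kept compressed in TOAST.
SELECT octet_length(text_column)   AS raw_bytes,
       pg_column_size(text_column) AS stored_bytes
FROM some_table
ORDER BY octet_length(text_column) DESC
LIMIT 10;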
An SQL file will always be much bigger than the data stored in your DB. It's a text representation of binary data.
- I'm aware of that. However, in this case we're talking about an order-of-magnitude size difference in a database where the prevalent data type is text (see my edit in the original question). – UloPe, Jun 16, 2014 at 13:58
- I think large objects are compressed in PostgreSQL, which would explain why the text format is that much bigger... – Leo, Jun 16, 2014 at 14:27
- This is definitely not true; it very much depends on the data. I've done a short test on one of my toy databases: it reports 8554 kB in psql, while the text-format dump is 2279 kB. – András Váczi, Jun 16, 2014 at 15:49