Is it possible to somehow limit the resources pg_dump uses?
The problem is that the whole database is 775 GB and I have two very large tables:
pg_largeobject table – 390 GB. Statistics:
Index scans: 778313505
Index tuples fetched: 1079812292
Tuples inserted: 201022148
Table size: 395 GB
Indexes size: 6900 MB
hive.lob_messages table – 265 GB. Statistics:
Index scans: 194377937
Index tuples fetched: 183894199
Tuples inserted: 16902844
Table size: 8127 MB
Toast table size: 272573947904 bytes (~254 GiB)
Indexes size: 3090 MB
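For reference, figures like the ones above come from queries along these lines (a rough sketch only; the exact queries may differ slightly):

-- size breakdown of hive.lob_messages (heap vs. TOAST vs. indexes)
SELECT pg_size_pretty(pg_relation_size('hive.lob_messages'))   AS heap_size,
       pg_size_pretty(pg_table_size('hive.lob_messages')
                      - pg_relation_size('hive.lob_messages')) AS toast_size_approx,
       pg_size_pretty(pg_indexes_size('hive.lob_messages'))    AS indexes_size;

-- index-scan and tuple counters from the statistics views
SELECT relname, idx_scan, idx_tup_fetch, n_tup_ins
FROM pg_stat_all_tables
WHERE relname IN ('lob_messages', 'pg_largeobject');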
In all cases I am executing this pg_dump command:
pg_dump -U postgres -d hive -Fc -v -f /nfs/hive_dump_10/hive_full_20191016_1945.dmp 2> /nfs/hive_dump_10/hive_full_20191016_1945.log
When I try to run pg_dump to dump all the data, it gets killed with "out of memory". My PostgreSQL server had 20 GB of RAM, and when I started pg_dump it was killed after about ~30 minutes with an out-of-memory error.
With 20 GB of RAM, pg_dump is killed at this line of the *.log file:
pg_dump: reading large objects
Kern.log:
Oct 15 09:00:18 hive-psql kernel: [21504626.583951] Out of memory: Kill process 30663 (pg_dump) score 750 or sacrifice child
Oct 15 09:00:18 hive-psql kernel: [21504626.584000] Killed process 30663 (pg_dump) total-vm:19151692kB, anon-rss:16873728kB, file-rss:0kB, shmem-rss:0kB
Here is the Zabbix monitoring graph showing my three pg_dump attempts with 20 GB of RAM: [screenshot]
To cope with the problem I gave it all the resources I had, an additional +20 GB of RAM (40 GB in total), and this time it was killed after ~3 hours at this *.log line:
pg_dump: dumping contents of table "hive.lob_messages"
And kern.log says:
Oct 16 23:19:01 hive-psql kernel: [15614.408113] Out of memory: Kill process 2693 (pg_dump) score 840 or sacrifice child
Oct 16 23:19:01 hive-psql kernel: [15614.408169] Killed process 2693 (pg_dump) total-vm:38363104kB, anon-rss:34020140kB, file-rss:24kB, shmem-rss:0kB
The Zabbix graph again, with 40 GB in total: [screenshot]
I do not have any more resources to throw at this, so what are my options? My DB grows every day. I think an additional 20 GB (60 GB of RAM in total) would suffice for now, but is that a sane situation? What do I do when my DB grows to ~2 TB and beyond?
One other question: is my hive.lob_messages table okay? The table size and the TOAST table size are quite different.
2 Answers
What the graph shows is pg_dump probably starting at 19:45 with normal memory consumption until 20:00. At 20:00, ~30 GB appear to be suddenly allocated, and not much is released until 23:15, when presumably the out-of-memory condition occurs.
From the comments, the kernel says:
hive-psql kernel: [15614.408169] Killed process 2693 (pg_dump) total-vm:38363104kB, anon-rss:34020140kB, file-rss:24kB, shmem-rss:0kB
The anon-rss of pg_dump, at roughly 34 GB, shows that it is indeed pg_dump that over-allocates. Something is wrong for pg_dump to need that much memory.
Your database surely has many large objects, and some of them may be very large (the theoretical maximum per object is 2 TB), but pg_dump processes large objects by:
- opening a cursor on SELECT oid FROM pg_largeobject_metadata ORDER BY 1 and reading it in chunks of 1000 entries: that should never blow up;
- for each large object, looping on lo_read to fetch the contents into a buffer of 16384 bytes: that should never blow up (see the illustration below).
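To make that pattern concrete, here is a rough psql illustration of the same access pattern (a sketch of the equivalent SQL, not pg_dump's actual C code):

BEGIN;
-- pg_dump opens a cursor over the large-object OIDs...
DECLARE lo_oids CURSOR FOR SELECT oid FROM pg_largeobject_metadata ORDER BY 1;
-- ...and fetches the OIDs 1000 at a time, so the list is never held whole in memory
FETCH 1000 FROM lo_oids;
-- for each OID it then uses lo_open()/lo_read() through libpq,
-- reading the object's contents in 16384-byte pieces
CLOSE lo_oids;
COMMIT;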
The fact that the table itself weighs 390 GB should not be a reason for pg_dump to do large allocations.
But the first time, pg_dump was killed at "retrieving large objects".
Assuming pg_dump displays this with the -v option (by the way, you did not mention what options were passed), it should display "reading large objects" instead.
Thank you Daniel for your message. I will edit my first post to answer your questions. – Gumuliokas, Oct 23, 2019 at 5:11
Postgres pg_dump sacrificed memory-allocation checks to speed up the backup process; pg_dump should be rewritten to take the out-of-memory problem into account. What I did was let pg_dump back up the non-blob fields, then I wrote a program to back up the blob fields separately. I had to write a restore program too.
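Something along these lines captures the idea with stock tools (a minimal sketch of the approach, not my actual program; it assumes pg_dump 10 or later for --no-blobs, and /nfs/lobs is just an example target directory):

# dump everything except large objects
pg_dump -U postgres -d hive -Fc -v --no-blobs -f /nfs/hive_nolobs.dmp

# export each large object to its own file, named after its OID
mkdir -p /nfs/lobs
psql -U postgres -d hive -Atc "SELECT oid FROM pg_largeobject_metadata ORDER BY 1" |
while read loid; do
    psql -U postgres -d hive -c "\lo_export $loid /nfs/lobs/$loid"
done

On restore, pg_restore the dump first and then re-import each file with its original OID (for example with the server-side lo_import(filename, oid) function), so that the OIDs referenced from the application tables still match.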
This answer would be better with a) a reference for the code-change assertion and b) an explanation of how your custom code worked. – Michael Green, Dec 29, 2019 at 11:11
pg_dump client tool?