I am inserting millions of rows into a PostgreSQL 9.5 database and observe constantly growing memory usage. As the tables are not that large and the operations performed (the insertions trigger a PL/Python function) should not be that expensive, I wonder why this happens.
At the moment PostgreSQL is using ~50 GB of the 60 GB available. I would like to understand how PostgreSQL is using those 50 GB, especially as I fear that the process will run out of memory.
[Update] Tonight PostgreSQL ran out of memory and was killed by the OS.
$ pg_top
last pid: 13535; load avg: 1.26, 1.41, 1.42; up 2+02:57:11 19:29:26
3 processes: 1 running, 2 sleeping
CPU states: 12.4% user, 0.0% nice, 0.1% system, 87.4% idle, 0.0% iowait
Memory: 63G used, 319M free, 192M buffers, 28G cached
DB activity: 2 tps, 0 rollbs/s, 0 buffer r/s, 100 hit%, 42 row r/s, 0 row w/s
DB I/O: 0 reads/s, 0 KB/s, 0 writes/s, 0 KB/s
DB disk: 98.0 GB total, 41.9 GB free (57% used)
Swap: 38M used, 1330M free, 12M cached
Re-run SQL for analysis:
PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
8528 postgres 20 0 50G 39G run 18.3H 97.55% 99.35% postgres: postgres my_db ::1(51692) EXECUTE
11453 postgres 20 0 16G 157M sleep 0:06 0.00% 0.00% postgres: postgres my_db ::1(51808) idle
13536 postgres 20 0 16G 17M sleep 0:00 0.00% 0.00% postgres: postgres postgres [local] idle
$ top
top - 21:51:48 up 2 days, 5:19, 4 users, load average: 1.40, 1.31, 1.23
Tasks: 214 total, 2 running, 212 sleeping, 0 stopped, 0 zombie
%Cpu(s): 12.4 us, 0.0 sy, 0.0 ni, 87.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.1 st
KiB Mem : 65969132 total, 341584 free, 40964108 used, 24663440 buff/cache
KiB Swap: 1400828 total, 1361064 free, 39764 used. 17366148 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
8528 postgres 20 0 54.563g 0.043t 4.886g R 99.0 69.3 1236:27 postgres
$ htop
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
8528 postgres 20 0 54.8G 43.8G 5028M R 98.3 69.7 20h43:51 postgres: postgres my_db ::1(51692) EXECUTE
8529 postgres 20 0 54.8G 43.8G 5028M S 0.0 69.7 0:00.04 postgres: postgres my_db ::1(51692) EXECUTE
8530 postgres 20 0 54.8G 43.8G 5028M S 0.0 69.7 0:00.04 postgres: postgres my_db ::1(51692) EXECUTE
8531 postgres 20 0 54.8G 43.8G 5028M S 0.0 69.7 0:00.03 postgres: postgres my_db ::1(51692) EXECUTE
8532 postgres 20 0 54.8G 43.8G 5028M S 0.0 69.7 0:00.04 postgres: postgres my_db ::1(51692) EXECUTE
8533 postgres 20 0 54.8G 43.8G 5028M S 0.0 69.7 0:00.07 postgres: postgres my_db ::1(51692) EXECUTE
8534 postgres 20 0 54.8G 43.8G 5028M S 0.0 69.7 0:00.06 postgres: postgres my_db ::1(51692) EXECUTE
8535 postgres 20 0 54.8G 43.8G 5028M S 0.0 69.7 0:00.06 postgres: postgres my_db ::1(51692) EXECUTE
8270 postgres 20 0 15.5G 5915M 5913M S 0.0 9.2 1:16.71 postgres: checkpointer process
11453 postgres 20 0 15.5G 4990M 4968M S 0.0 7.7 0:33.91 postgres: postgres my_db ::1(51808) idle
8268 postgres 20 0 15.5G 398M 397M S 0.0 0.6 0:42.65 /usr/lib/postgresql/9.5/bin/postgres -D /var/lib/postgresql/9.5/main -c config_file=/etc/postgresql/9.5/main/postgresql.conf
8271 postgres 20 0 15.5G 124M 122M S 0.0 0.2 0:11.12 postgres: writer process
439 root 20 0 68464 34500 30156 S 0.0 0.1 0:05.07 /lib/systemd/systemd-journald
8272 postgres 20 0 15.5G 21232 19488 S 0.0 0.0 1:11.16 postgres: wal writer process
my_db=# -- https://wiki.postgresql.org/wiki/Disk_Usage#General_Table_Size_Information
oid | table_schema | table_name | row_estimate | total_bytes | index_bytes | toast_bytes | table_bytes | total | index | toast | table
-----------+--------------------+-------------------------+--------------+-------------+-------------+-------------+-------------+------------+------------+------------+------------
123037947 | public | my_second_table | 9482 | 233570304 | 36601856 | 107692032 | 89276416 | 223 MB | 35 MB | 103 MB | 85 MB
123037936 | public | my_table | 4.42924e+06 | 4362895360 | 104685568 | 8192 | 4258201600 | 4161 MB | 100 MB | 8192 bytes | 4061 MB
my_db=# SELECT c.relname,
my_db-# pg_size_pretty(count(*) * 8192) as buffered, round(100.0 * count(*) / (SELECT setting FROM pg_settings WHERE name='shared_buffers')::integer,1) AS buffers_percent,
my_db-# round(100.0 * count(*) * 8192 / pg_relation_size(c.oid),1) AS percent_of_relation,
my_db-# round(100.0 * count(*) * 8192 / pg_table_size(c.oid),1) AS percent_of_table
my_db-# FROM pg_class c
my_db-# INNER JOIN pg_buffercache b
my_db-# ON b.relfilenode = c.relfilenode
my_db-# INNER JOIN pg_database d
my_db-# ON (b.reldatabase = d.oid AND d.datname = current_database())
my_db-# GROUP BY c.oid,c.relname
my_db-# ORDER BY 3 DESC
my_db-# LIMIT 10;
relname | buffered | buffers_percent | percent_of_relation | percent_of_table
---------------------------------+------------+-----------------+---------------------+------------------
my_table | 3995 MB | 26.0 | 100.0 | 100.0
my_table_pkey | 98 MB | 0.6 | 100.0 | 100.0
my_second_table | 85 MB | 0.6 | 100.1 | 45.3
pg_toast_123037947 | 73 MB | 0.5 | 100.1 | 100.0
pg_toast_123037947_index | 30 MB | 0.2 | 100.1 | 100.0
my_second_table_parent_id_idx | 22 MB | 0.1 | 100.1 | 100.0
my_second_table_pkey | 13 MB | 0.1 | 100.2 | 100.0
pg_constraint_oid_index | 16 kB | 0.0 | 100.0 | 100.0
sql_languages | 40 kB | 0.0 | 500.0 | 83.3
pg_transform_type_lang_index | 8192 bytes | 0.0 | 100.0 | 100.0
my_db=# SELECT COUNT(*) FROM pg_stat_activity;
count
-------
2
$ sudo pmap -p 8528
8528: postgres: postgres my_db ::1(51692) EXECUTE
000000e0cd2b7000 6168K r-x-- /usr/lib/postgresql/9.5/bin/postgres
000000e0cdabc000 132K r---- /usr/lib/postgresql/9.5/bin/postgres
000000e0cdadd000 48K rw--- /usr/lib/postgresql/9.5/bin/postgres
000000e0cdae9000 316K rw--- [ anon ]
000000e0ce548000 592K rw--- [ anon ]
000000e0ce5dc000 35663940K rw--- [ anon ]
...
$ less postgresql.conf
# ...
max_connections = 20
shared_buffers = 15GB
work_mem = 384MB
maintenance_work_mem = 2GB
fsync = off
synchronous_commit = off
full_page_writes = off
max_wal_size = 8GB
min_wal_size = 4GB
checkpoint_completion_target = 0.9
effective_cache_size = 45GB
Please note that top and htop were run at a later time than pg_top.
3 Answers
A trigger that is defined as AFTER INSERT ... FOR EACH ROW will queue up info on all the inserted rows and then fire the trigger for each one at the end of the statement. So if you insert millions of records with a single statement, that queue will take up a lot of memory.
BEFORE INSERT does not do this: it executes the trigger function immediately before each row is inserted and doesn't queue up anything. If possible, rewrite to a BEFORE trigger.
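For illustration, a minimal sketch of that rewrite, reusing the trigger and function names quoted in the comments on the question (whether this is a drop-in change depends on what the function expects to find in the database when it runs):
-- Recreate the trigger as BEFORE INSERT so no per-row event queue is kept
-- until the end of the statement. The serial id should already be present
-- in the new row, since column defaults are applied before BEFORE ROW
-- triggers fire.
DROP TRIGGER IF EXISTS my_trigger ON my_table;
CREATE TRIGGER my_trigger
    BEFORE INSERT ON my_table
    FOR EACH ROW
    EXECUTE PROCEDURE my_plpython3u_fct();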
- Thanks! At the moment each transaction includes (only) 1000 new rows, but nevertheless this is a legit point and I'll try to rewrite the trigger. However, I need the id column (serial) of the new row inside the trigger function, though there should be a way to get this value... – Brik, Sep 13, 2017 at 20:30
- At only 1000 it really sounds like you have a memory leak in the Python code. I'd redefine the trigger function to be some no-op (and probably use a different language) and see if that makes the problem go away. If so, then you would know where to look (see the sketch below). – jjanes, Sep 13, 2017 at 20:33
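A sketch of that experiment, with a hypothetical plpgsql no-op standing in for the PL/Python function (all names are illustrative):
-- No-op trigger function: if memory usage stays flat with this in place,
-- the leak is in the Python code rather than in the trigger machinery.
CREATE FUNCTION noop_trigger_fct() RETURNS trigger AS $$
BEGIN
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Point the existing trigger at the no-op (same AFTER INSERT definition
-- as before, so only the function body changes).
DROP TRIGGER IF EXISTS my_trigger ON my_table;
CREATE TRIGGER my_trigger
    AFTER INSERT ON my_table
    FOR EACH ROW
    EXECUTE PROCEDURE noop_trigger_fct();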
Which version of PostgreSQL are you running? I've seen one case where PostgreSQL 12.x had a memory leak with work_mem = 128MB but didn't leak any memory with work_mem = 32MB. For that workload the performance showed no visible degradation with only 32 MB of work_mem, so it was a good fix for that case.
I believe this is fully fixed in PostgreSQL 13.x only:
According to section E.2.3.1.4, General Performance, of the release notes, PostgreSQL has the following behavior until version 13: "Previously, hash aggregation was avoided if it was expected to use more than work_mem memory. [...] once hash aggregation had been chosen, the hash table would be kept in memory no matter how large it got — which could be very large if the planner had misestimated". Using a high work_mem value results in hash aggregation being chosen more often, and you can end up going over any configured memory limit as a result. With PostgreSQL 13.0 or greater, memory usage will not go above the work_mem setting; PostgreSQL will instead use temporary files on disk to handle the resource requirements if the planner estimate is really bad. With version 12.x or earlier, the actual memory usage is unlimited if hash aggregation is chosen due to planner misestimation.
And when I wrote above that PostgreSQL leaked memory, I meant that memory usage continued to rise until the OOM Killer killed one of the PostgreSQL processes and the PostgreSQL master did a full restart. It might be that memory usage simply grew so much that it looked like a leak, while in reality, given infinite RAM, it would have released the memory at some point in the future.
Note that even if postgres logically releases memory it has allocated, it may not be returned to the operating system, depending on the malloc()/free() implementation of your execution environment. That may result in multiple PostgreSQL processes going over the limit due to the use of hash aggregation as described above, with the memory never released back to the OS even though PostgreSQL isn't actually using it either. This happens because technically malloc() may use brk() behind the scenes, and releasing the memory back to the OS is only possible in some special cases. Historically this is considered "a feature, not a bug" in UNIX systems, because it can be worked around simply by adding enough swap. The system will then swap out the freed-but-not-reusable parts of RAM, and that's fine because that memory is never actually used again. If the system occasionally swaps out some memory that is actually still in use, it will be swapped back in when needed; if this is rare, it will not cause major performance problems in practice.
See also: my answer to How to limit the memory that is available for PostgresSQL server?
I think it might be some kind of bug in PostgreSQL. We're running PostgreSQL 12 with 3 redundant servers (one master + 2 standby servers) each having 64 GB of RAM.
Originally we configured the system with shared_buffers = 24GB and work_mem = 128MB. The system seemed to eat memory until the OOM Killer finally took over when the system ran out of memory.
I reconfigured the system with shared_buffers = 16GB and work_mem = 32MB, and magically all our problems went away. Note that the system behavior changed completely: instead of running out of memory in around 12-24 hours, the system now has a stable 28-29 GB of disk cache available.
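For reference, the same change could also be applied with ALTER SYSTEM instead of editing postgresql.conf by hand (a sketch; shared_buffers still requires a server restart to take effect):
ALTER SYSTEM SET shared_buffers = '16GB';
ALTER SYSTEM SET work_mem = '32MB';
SELECT pg_reload_conf();  -- picks up work_mem; restart the server for shared_buffers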
According to munin graphs, the system was continuously using more and more shmem, which is documented as "Shared Memory (SYSV SHM segments, tmpfs)". With the original settings above, shmem ran up to 28-31 GB before the OOM Killer took PostgreSQL down, which freed the same 28-31 GB of RAM. Now it's 1.33 MB on all servers, so this clearly does not scale according to the configuration values above.
I don't know the exact steps to reproduce this, and I'm not going to reconfigure our production environment to debug the issue.
We're running PostgreSQL with huge_pages = try, in case that makes a difference.
For details, see my other answer: https://dba.stackexchange.com/a/285423/29183
- If I had to guess, PostgreSQL will cause a huge memory leak if the memory taken by shared_buffers AND ALL work_mem of all clients does not fit in huge pages. We have 30 GB set aside for huge pages, and it seems probable that 24 GB for shared_buffers plus the work_mem of parallel client processes went over the 30 GB limit (a rough way to check this is sketched below). – Mikko Rantalainen, Jul 1, 2020 at 11:30
From a comment on the question: the trigger function populates my_second_table based on rows inserted into my_table. The function imports further Python modules such as numpy. Maybe there is a memory leak somewhere in the Python code?
CREATE TRIGGER my_trigger AFTER INSERT ON my_table
    FOR EACH ROW EXECUTE PROCEDURE my_plpython3u_fct();
CREATE FUNCTION my_plpython3u_fct() RETURNS trigger AS $$
    import my_module as m
    m.process(TD)
$$ LANGUAGE plpython3u;