We execute approximately 100k DDL statements in a single transaction in PostgreSQL. During execution, the Postgres backend's memory usage gradually grows from about 10MB to 2.2GB on a machine with 3GB of RAM, and once it can't acquire any more memory the OOM killer sends it signal 9, which forces Postgres into recovery mode.
BEGIN;
CREATE SCHEMA schema_1;
-- create table stmts - 714
-- alter table add pkey stmts - 714
-- alter table add constraint fkey stmts - 34
-- alter table add unique constraint stmts - 2
-- alter table alter column set default stmts - 9161
-- alter table alter column set not null stmts - 2405
-- alter table add check constraint stmts - 4
-- create unique index stmts - 224
-- create index stmts - 213
CREATE SCHEMA schema_2;
-- same DDL statements as schema_1, up to schema_7
-- ...
-- ...
-- ...
CREATE SCHEMA schema_7;
COMMIT;
Including the CREATE SCHEMA statements, approximately 94,304 DDL statements are executed in total.
As per Transactional DDL in PostgreSQL:
Like several of its commercial competitors, one of the more advanced features of PostgreSQL is its ability to perform transactional DDL via its Write-Ahead Log design. This design supports backing out even large changes to DDL, such as table creation. You can't recover from an add/drop on a database or tablespace, but all other catalog operations are reversible.
We have even imported approximately 35GB of data into PostgreSQL in a single transaction without any problem, so why does the Postgres connection require so much memory when executing thousands of DDL statements in a single transaction?
We can work around it temporarily by adding RAM or allocating swap, but the number of schemas created in a single transaction can grow to 50–60 (approximately 1M DDL statements), which would require 100+ GB of RAM or swap; that isn't feasible right now.
PostgreSQL version: 9.6.10
Is there any reason why executing lots of DDL statements requires more memory while DML statements do not? Don't both handle transactions by writing to the underlying WAL? So why is it different for DDL?
Reason for Single Transaction
We sync the entire database of each customer from the customer premises (SQL Server) to the cloud (PostgreSQL). Customers have different numbers of databases. The process is: the entire data set is exported as CSV from SQL Server and imported into PostgreSQL using temp tables, COPY and ON CONFLICT DO UPDATE. During this process, we treat each customer as a single database in PG, and each individual DB in the customer's SQL Server becomes a schema in that customer's PG DB.
Based on the CSV data, we create the schemas dynamically and import the data into them. As per our application design, the data in PG must be strictly consistent at any point in time, and there must never be partial schemas / tables / data, so we have to do this in a single transaction. We also sync incrementally from the customer to the cloud DB every 3 minutes, so the schema creation can happen either in the first sync or in an incremental sync, but the probability of creating this many schemas in the first sync itself is very high.
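For illustration, a minimal sketch of one table's import step as described above; the schema, table, column and file names are hypothetical, and in practice a client-side \copy (or COPY FROM STDIN) would replace the server-side COPY shown here:

BEGIN;
CREATE SCHEMA IF NOT EXISTS schema_1;
CREATE TABLE IF NOT EXISTS schema_1.customers (
    id   integer PRIMARY KEY,
    name text NOT NULL
);
-- stage the CSV into a temp table, then upsert into the target table
CREATE TEMP TABLE stage_customers (LIKE schema_1.customers) ON COMMIT DROP;
COPY stage_customers FROM '/tmp/customers.csv' WITH (FORMAT csv, HEADER true);
INSERT INTO schema_1.customers (id, name)
SELECT id, name FROM stage_customers
ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name;
COMMIT;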
Update 1
Commenting out the ALTER TABLE ... ALTER COLUMN statements greatly reduced the memory usage: it now takes only 300MB at most. We will have to merge those options into the CREATE TABLE statements themselves, as sketched below.
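A hedged before/after sketch of that change, with hypothetical table and column names: the defaults and NOT NULL constraints are declared inline so that no separate ALTER TABLE has to run inside the transaction.

-- before: plain CREATE TABLE followed by separate ALTER TABLE steps
CREATE TABLE schema_1.orders (id integer, status text, qty integer);
ALTER TABLE schema_1.orders ALTER COLUMN status SET DEFAULT 'new';
ALTER TABLE schema_1.orders ALTER COLUMN status SET NOT NULL;
ALTER TABLE schema_1.orders ALTER COLUMN qty SET DEFAULT 0;

-- after: the same defaults and NOT NULL constraints declared inline
CREATE TABLE schema_1.orders (
    id     integer,
    status text    NOT NULL DEFAULT 'new',
    qty    integer DEFAULT 0
);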
We will raise the underlying problem on the pgsql-hackers mailing list.
2 Answers
This bit of comment in src/backend/utils/cache/relcache.c seems relevant:
/*
* If we Rebuilt a relcache entry during a transaction then its
* possible we did that because the TupDesc changed as the result
* of an ALTER TABLE that ran at less than AccessExclusiveLock.
* It's possible someone copied that TupDesc, in which case the
* copy would point to free'd memory. So if we rebuild an entry
* we keep the TupDesc around until end of transaction, to be safe.
*/
if (remember_tupdesc)
    RememberToFreeTupleDescAtEOX(relation->rd_att);
I don't really understand it: who is this "someone" that might have a pointer? This is private memory, not shared memory. Anyway, it does seem to explain the bloat, as every ALTER TABLE statement in the same transaction leaves behind another copy of the TupDesc for that table. And apparently, even if you bundle multiple actions into one ALTER TABLE, each separate action also leaves behind a copy. But whatever the merits, this does explain a big part of the memory usage.
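If that reading is right, a small reproduction along these lines should show the backend's memory climbing until commit; this is only a sketch, with a throwaway table name:

BEGIN;
CREATE TABLE t (c1 integer, c2 integer);
-- many ALTER TABLE statements against the same table in one transaction;
-- each one rebuilds the relcache entry and keeps another TupDesc until commit
DO $$
BEGIN
    FOR i IN 1..10000 LOOP
        EXECUTE format('ALTER TABLE t ALTER COLUMN c1 SET DEFAULT %s', i);
    END LOOP;
END $$;
-- watch the backend's resident memory (e.g. with top or ps) grow while the loop runs
COMMIT;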
See the pgsql-hackers mailing list for more discussion.
A better idea entirely is to use a SQL Server FDW, which actually has the logic to pull Microsoft SQL Server data into PostgreSQL format (for example, bit gets mapped to bool). From this point, every three minutes:
- you import the foreign schema into last_fetch_schema
- if last_fetch_schema is different from local_schema, you resync the schemas
- you copy all of the data over with an INSERT INTO ... SELECT ... ON CONFLICT DO UPDATE, selecting only the newest data (see the sketch after this list)
- you drop the foreign schema last_fetch_schema
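A hedged sketch of one such cycle, assuming a foreign server named mssql_server has already been defined, that its FDW supports IMPORT FOREIGN SCHEMA, and that the rows carry a row_version column; all object names here are hypothetical:

BEGIN;
-- 1. pull in the current foreign table definitions
CREATE SCHEMA last_fetch_schema;
IMPORT FOREIGN SCHEMA dbo
    FROM SERVER mssql_server INTO last_fetch_schema;

-- 2. compare last_fetch_schema with local_schema here and resync if they differ

-- 3. upsert the data, optionally restricted to the newest rows
INSERT INTO local_schema.foo (id, payload, row_version)
SELECT id, payload, row_version
FROM   last_fetch_schema.foo
WHERE  row_version > (SELECT coalesce(max(row_version), 0) FROM local_schema.foo)
ON CONFLICT (id) DO UPDATE
    SET payload = EXCLUDED.payload,
        row_version = EXCLUDED.row_version;

-- 4. drop the imported definitions until the next cycle
DROP SCHEMA last_fetch_schema CASCADE;
COMMIT;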
What do you gain?
- On first load, you can simply use CREATE TABLE local.foo (LIKE foreign.foo)
- You can easily compare metadata differences
- CSVs lose types and leave you to infer things; the FDW can read the metadata catalog
- Grabbing only the newest data is very simple if the rows are versioned; you don't have to send the entire database anymore
That was a good suggestion, but not everyone's SQL Server is accessible over the internet. Customers have no restrictions on outbound connections, but most of them have difficulties configuring inbound connections (and this is just one issue; there are also other cases, like db / table / column creation / deletion / modification in on-premise patches, etc.). Given our customer volume, this isn't really feasible / scalable for us. – The Coder, Oct 14, 2018 at 18:36
Normally you do this with a VPN. I just have a strong feeling you're barking up the wrong tree, but good luck with it. There are lots of other solutions depending on how much work you want to put into it. =) – Evan Carroll, Oct 14, 2018 at 18:43
(CREATE DATABASE cannot be executed inside a transaction block) or is it executed in a separate process? Related question (possibly a rewording of the previous one): how does the application become aware of a new customer/new database?