How to reuse list of columns in multiple statements

Question 1

For each entity, I have a list of columns that I want returned from SELECT, INSERT and UPDATE statements.

I want to reuse the same list instead of having a copy in each query.

Example:

SELECT col1, col2 FROM entity;
INSERT INTO entity (...) VALUES (...) RETURNING col1, col2;
UPDATE entity SET ... RETURNING col1, col2;

The list is not always plain columns such as col1; it can also contain more complex expressions, e.g. COALESCE(a, b). That is why I am aiming for reuse.

One way I found it can be done is with functions such as this:

CREATE FUNCTION to_entity_columns(
 e entity,
 OUT col1 INTEGER,
 OUT col2 INTEGER
)
AS
$$
SELECT
 e.col1,
 e.col2
$$ LANGUAGE SQL
 IMMUTABLE
 STRICT;

It's possible to do:

SELECT (to_entity_columns(entity)).* FROM entity;
INSERT INTO entity (...) VALUES (...) RETURNING (to_entity_columns(entity)).*;
UPDATE entity SET ... RETURNING (to_entity_columns(entity)).*;

While this approach works, the query time now scales with the number of rows, and the slowdown can be as much as 100x or 1000x: I see queries going from 1 ms to 1 s. The function is IMMUTABLE, but Postgres won't inline it as I had hoped. The reason (see source code here) is that the function returns a RECORD.

I have tried modifying the function, e.g. returning a composite type instead, and removing .* from the function call, but neither makes a difference.
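One way to check whether a particular call gets inlined is to look at the target list in EXPLAIN (VERBOSE) output. The following is a minimal sketch, not the original setup: the table definition (id, a, b) and the function body are assumptions made up for illustration, since the real schema is not shown.

```sql
-- Hypothetical schema; the real entity table is not shown in the question.
CREATE TABLE IF NOT EXISTS entity (
    id integer PRIMARY KEY,
    a  integer,
    b  integer
);

-- Same shape as the function in the question, with an assumed body.
CREATE OR REPLACE FUNCTION to_entity_columns(
    e entity,
    OUT col1 integer,
    OUT col2 integer
)
AS $$
    SELECT e.id, COALESCE(e.a, e.b)
$$ LANGUAGE sql
   IMMUTABLE
   STRICT;

-- The "Output:" line of the scan node lists the expressions that are
-- evaluated per row. If a to_entity_columns(...) call is still visible
-- there, the function body was not inlined.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT (to_entity_columns(entity)).* FROM entity;
```

Running the same EXPLAIN on each variant of the function (composite return type, with or without .*) should show which of them, if any, is inlined on a given PostgreSQL version.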

The question here is twofold:

a) Is there a way to make the functions like the one above work with reasonable performance?

b) Are there any alternatives that would allow simple reuse of the list of columns like shown in the example above?

Question 2

I guess that at least part of the performance problem arises because the function will be evaluated once per result row and result column, as the documentation states:

For example, if myfunc() is a function returning a composite type with columns a, b, and c, then these two queries have the same result:
SELECT (myfunc(x)).* FROM some_table;
SELECT (myfunc(x)).a, (myfunc(x)).b, (myfunc(x)).c FROM some_table;
Tip

PostgreSQL handles column expansion by actually transforming the first form into the second. So, in this example, myfunc() would get invoked three times per row with either syntax. If it's an expensive function you may wish to avoid that, which you can do with a query like:

SELECT m.* FROM some_table, LATERAL myfunc(x) AS m;

Placing the function in a LATERAL FROM item keeps it from being invoked more than once per row. m.* is still expanded into m.a, m.b, m.c, but now those variables are just references to the output of the FROM item. (The LATERAL keyword is optional here, but we show it to clarify that the function is getting x from some_table.)
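The difference described in the quoted documentation can be made visible with a function that reports each invocation. This is only a sketch under assumptions: some_table and myfunc are throwaway objects named after the documentation example, and the function body is invented for the demonstration.

```sql
-- Hypothetical table and function, named after the documentation example.
CREATE TABLE IF NOT EXISTS some_table (x integer);
INSERT INTO some_table VALUES (1);

CREATE OR REPLACE FUNCTION myfunc(
    x integer,
    OUT a integer,
    OUT b integer,
    OUT c integer
)
AS $$
BEGIN
    -- Emits one NOTICE per actual invocation.
    RAISE NOTICE 'myfunc called with x = %', x;
    a := x;
    b := x + 1;
    c := x + 2;
END;
$$ LANGUAGE plpgsql;

-- Expanded to (myfunc(x)).a, (myfunc(x)).b, (myfunc(x)).c:
-- per the documentation, one call per output column.
SELECT (myfunc(x)).* FROM some_table;

-- Function in FROM: evaluated once per row of some_table.
SELECT m.* FROM some_table, LATERAL myfunc(x) AS m;
```

Counting the NOTICE messages printed by each of the two SELECT statements shows how often the function really ran.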

So you could for example rewrite the INSERT as

WITH x(r) AS (
 INSERT INTO entity (...) VALUES (...)
 RETURNING to_entity_columns(entity)
)
SELECT (r).* FROM x;

I cannot say if that will inline the function or not (read the Wiki article and experiment), but at least it will avoid calling the function more often than necessary.
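The same pattern should carry over to UPDATE, and the LATERAL form from the documentation can be used for the plain SELECT. A minimal sketch follows; the table layout, the function body and the sample row are assumptions for illustration only. The outer query reads the composite column r back from the CTE x.

```sql
-- Hypothetical schema and function body, for illustration only.
CREATE TABLE IF NOT EXISTS entity (
    id integer PRIMARY KEY,
    a  integer,
    b  integer
);

CREATE OR REPLACE FUNCTION to_entity_columns(
    e entity,
    OUT col1 integer,
    OUT col2 integer
)
AS $$
    SELECT e.id, COALESCE(e.a, e.b)
$$ LANGUAGE sql
   IMMUTABLE
   STRICT;

INSERT INTO entity (id, a, b) VALUES (1, NULL, 10)
ON CONFLICT (id) DO NOTHING;

-- UPDATE: RETURNING contains a single function call per modified row;
-- the outer query only reads fields of the composite value r.
WITH x(r) AS (
    UPDATE entity SET a = 5 WHERE id = 1
    RETURNING to_entity_columns(entity)
)
SELECT (r).* FROM x;

-- SELECT: the LATERAL form from the documentation, applied to entity.
SELECT c.* FROM entity AS e, LATERAL to_entity_columns(e) AS c;
```

Whether either form is inlined still has to be verified by experiment; the sketch only limits the number of function calls.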

Question 3

Thank you for your suggestions. I've tried these things, and the speedup is noticeable (about 50%) but still marginal overall: instead of a 100x slowdown, I get 50x. I think functions are not the way to go here unless they are reliably inlined.

Laurenz Albe · Answer 1 · 2021-09-08 09:52:10Z

