I have written a PostgreSQL function that returns products in a specific order. Now I would like not only to return the results of the first SELECT query, but also to put them into an array, so I can reuse the IDs inside another SELECT query. I first tried to add an alias to the SELECT query, like SELECT * FROM (SELECT id FROM products) as pr, and use pr inside the NOT IN(pr) expression of the second query, but that doesn't work ...
I will explain it more clearly with an example. This is a simplified version of the function:
CREATE OR REPLACE FUNCTION featured_products(
    valid_to_in timestamp without time zone,
    taxonomy_id_in integer,
    product_limit_in integer)
  RETURNS SETOF integer AS
$BODY$
BEGIN
    RETURN QUERY
    (
        -- #1
        SELECT * FROM (
            SELECT "product"."supplier_id" FROM products AS "product"
        ) AS "featured"
        LIMIT 2
    )
    UNION ALL
    SELECT *
    FROM (
        SELECT "product"."supplier_id" FROM products AS "product"
    ) AS "featured"
    -- the simplified subquery only exposes supplier_id, so filter on that column
    WHERE supplier_id NOT IN (
        -- #2
        SELECT * FROM (
            SELECT "product"."supplier_id" FROM products AS "product"
        ) AS "featured"
        LIMIT 2
    )
    LIMIT product_limit_in;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
I deleted some joins and GROUP BY and ORDER BY statements, so the function is a bit more readable. And I added #1 and #2 inside the code above, so you know what I mean with select query 1 and 2.
As you can see, query #2 should return the same results as query #1. In reality these queries are much bigger, so I just want to replace the second, identical query with an array of IDs. Less code and probably faster.
I don't know how to put the IDs returned from the first query into an array and use that in a NOT IN(<id's>) expression instead of the second query.
Does anyone know how to do this?
1 Answer
It's a textbook case for a CTE, like @Daniel commented.
The example can be simplified some more. And you need to be aware of how LIMIT works in a UNION query.
CREATE OR REPLACE FUNCTION featured_products(valid_to_in timestamp
                                           , taxonomy_id_in integer
                                           , product_limit_in integer)
  RETURNS SETOF integer AS
$func$
BEGIN
   RETURN QUERY
   WITH featured AS (SELECT supplier_id FROM products LIMIT 2)
   SELECT supplier_id
   FROM   featured
   UNION ALL
   (
   SELECT p.supplier_id
   FROM   products p
   LEFT   JOIN featured f USING (supplier_id)
   WHERE  f.supplier_id IS NULL
   LIMIT  product_limit_in
   );  -- parens required - or not?
END
$func$ LANGUAGE plpgsql VOLATILE;
LIMIT can only be applied once in a UNION (ALL) query, unless you enclose the leg of the query in parentheses. You may or may not want to add parentheses.
- The way I have it, a maximum of product_limit_in rows are returned in addition to the "featured" rows from the CTE.
- If you remove the parentheses you get a maximum of product_limit_in rows total - meaning that even "featured" products may be discarded.
Related: Optimize a query on two big tables
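To make the difference between the two placements of LIMIT concrete, here is a minimal sketch. The two CTEs are hypothetical stand-in data (2 "featured" rows, 5 other rows), not the asker's real tables, and only row counts are compared because without ORDER BY it is not defined which rows survive the LIMIT.

```sql
-- Hypothetical stand-in data, assumed for illustration only.
WITH featured(supplier_id) AS (VALUES (1), (2))
   , others(supplier_id)   AS (VALUES (3), (4), (5), (6), (7))
SELECT (SELECT count(*) FROM (
          SELECT supplier_id FROM featured
          UNION ALL
          (SELECT supplier_id FROM others LIMIT 3)  -- LIMIT binds to this leg only
        ) AS a) AS with_parens      -- 2 featured + 3 others = 5
     , (SELECT count(*) FROM (
          SELECT supplier_id FROM featured
          UNION ALL
          SELECT supplier_id FROM others
          LIMIT 3                                   -- LIMIT binds to the combined result
        ) AS b) AS without_parens;  -- 3 rows in total
```

With parentheses the featured rows are always kept; without them they compete with all other rows for the same limit.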
Either way, don't ORDER BY the outer (combined) result before you LIMIT, if you can avoid it. Postgres can optimize the query very efficiently and just stop evaluating once enough rows have been returned (possibly fetching tuples from the top of a matching index). With an outer ORDER BY that would not be possible any more, which can make a huge difference in performance.
I am using LEFT JOIN / IS NULL to exclude featured rows from the second SELECT, which is probably faster than NOT IN and does not carry "surprises" when dealing with NULL values or empty results.
In Postgres (as opposed to some other RDBMS), you can refer to p.supplier_id and f.supplier_id after joining with USING (supplier_id).
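The NULL "surprise" mentioned above can be reproduced with a small sketch. Both CTEs are made-up sample data (assumed for illustration), with one NULL among the featured IDs:

```sql
-- Hypothetical sample data: one featured supplier_id is NULL.
WITH products(supplier_id) AS (VALUES (1), (2), (3))
   , featured(supplier_id) AS (VALUES (1), (NULL::int))
SELECT 'NOT IN' AS method, count(*) AS rows_kept
FROM   products
WHERE  supplier_id NOT IN (SELECT supplier_id FROM featured)
UNION ALL
SELECT 'LEFT JOIN / IS NULL', count(*)
FROM   products p
LEFT   JOIN featured f USING (supplier_id)
WHERE  f.supplier_id IS NULL;
-- NOT IN keeps 0 rows: "2 <> NULL" is NULL, so no row ever passes the filter.
-- LEFT JOIN / IS NULL keeps 2 rows: supplier_id 2 and 3 have no match.
```

A single NULL in the subquery is enough to make NOT IN filter out everything, while the anti-join only removes rows that actually match.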
And yes, the CTE is only evaluated once:
A useful property of WITH queries is that they are evaluated only once per execution of the parent query, even if they are referred to more than once by the parent query or sibling WITH queries.
Bold emphasis mine.
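One way to see the single evaluation for yourself is a small sketch with a volatile function. The CTE name r and its column x are made up for this illustration:

```sql
-- The CTE is referenced twice; if it were evaluated per reference,
-- the two random() values would almost certainly differ.
WITH r AS (SELECT random() AS x)
SELECT a.x = b.x AS same_value   -- true: both references see the same row
FROM   r AS a, r AS b;
```

For comparison, SELECT random() = random(); calls the function twice and almost always yields false.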
[...] WITH x as (...subquery...) ) at the upper level of the UNION query?
[...] SELECT * FROM <name_of_CTE_WITH_query> ? Or does it save the results in some kind of cache? Cause it still takes half a second to execute the function