I'm pretty new to database stuff.
I'm using Supabase to create an application where I keep track of the number of likes on certain items ('clicks'). I want to filter items either by the date the likes were added, or by certain categories the items have. So far I have a function that I can call from JavaScript like:
const { data, error } = await supabase.rpc('rpc_test', { "json_params": {
  "categories": '{"fruits", "test"}',
  "start_date": "2024-04-16 00:22:35.837547+00",
} })
Which should return all items that have a category matching the array I pass in, and the number of clicks that have been created since start_date and before an end_date if provided, or zero if no clicks have been created in that time window. And it so nearly works, but I keep running into errors that I don't know how to fix.
The important tables in my database are:
Items:
id | name |
---|---|
1 | apple |
2 | beet |
Clicks:
item_id | created_at |
---|---|
1 | 2024-04-09 |
2 | 2024-04-09 |
Categories:
id | name |
---|---|
1 | vegetable |
2 | fruit |
Item Categories:
item_id | category_id |
---|---|
1 | 2 |
2 | 1 |
My function query currently looks like this:
create
or replace function public.rpc_test (json_params json default null) returns table (
id bigint,
created_at timestamp with time zone,
name text,
clicks bigint,
categories_arr text[]
) as $$
BEGIN
RETURN QUERY
select
items.id,
items.created_at,
items.name,
click_counts.clicks,
item_id_to_cat_array.categories_arr
from
items
LEFT JOIN (
SELECT item_categories.item_id AS itemid, array_agg(categories.name) AS categories_arr
FROM item_categories
JOIN categories ON categories.id = item_categories.category_id
GROUP BY item_categories.item_id
) item_id_to_cat_array ON items.id = item_id_to_cat_array.itemid
LEFT JOIN (
SELECT item_id as click_item_id, count(c.id) AS clicks
FROM clicks as c
WHERE (json_params->>'start_date' IS NULL OR c.created_at >= (json_params->>'start_date')::date)
AND (json_params->>'end_date' IS NULL OR c.created_at <= (json_params->>'end_date')::date)
GROUP BY c.item_id
) click_counts ON click_item_id = items.id
where
json_params->>'categories' IS NULL OR
(json_params->>'categories')::text[] && item_id_to_cat_array.categories_arr;
END;
$$ language plpgsql;
The only problem with this is that categories_arr never has any data.
At various points I've had iterations of this that have worked for gathering the information I want but without the filtering in place. I've tried doing things with GROUP BY and HAVING instead, and I'm not really sure where to go.
How can I get more information about what is happening in my query?
I would like to see what categories_arr is at every step in the process, but I don't know how to log that information.
1 Answer
It's not simple to monitor steps of an SQL query during execution. SQL is a declarative language. You seem to think procedurally. In an actual PL/pgSQL code block you can output anything with RAISE NOTICE between steps. There are even debuggers, like the one built into pgAdmin - which I hardly ever use.
But your function is just a wrapper around one big SQL query. So that wouldn't give you anything.
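For illustration, a minimal sketch of RAISE NOTICE in a PL/pgSQL block, using the table names from the question. The variable _arr and the hard-coded item ID 1 are made up for the example:

```sql
DO
$do$
DECLARE
   _arr text[];  -- hypothetical variable, for illustration only
BEGIN
   -- collect all category names of one item into an array
   SELECT ARRAY(
      SELECT ca.name
      FROM   item_categories ic
      JOIN   categories ca ON ca.id = ic.category_id
      WHERE  ic.item_id = 1   -- sample item ID
      )
   INTO   _arr;

   -- print the intermediate value before moving on to the next step
   RAISE NOTICE 'categories for item 1: %', _arr;
END
$do$;
```

The message is sent to the client as a notice; whether and where it is displayed depends on the client. For a function that wraps a single query, running each subquery on its own is often the more practical way to inspect intermediate results.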
Assuming a standard many-to-many design like the one in your question (items, categories, item_categories).
After switching to simpler LANGUAGE sql, your function could look like:
CREATE OR REPLACE FUNCTION public.rpc_test (_categories text[], _start_date date = NULL, _end_date date = NULL)
  RETURNS TABLE (id bigint
               , created_at timestamptz
               , name text
               , clicks bigint
               , categories_arr text[])
  LANGUAGE sql AS
$func$
SELECT i.item_id
     , items.created_at
     , items.name
     , click_counts.clicks
     , item_id_to_cat_array.categories_arr
FROM ( -- get distinct item IDs that share any category with input array
   SELECT DISTINCT ic.item_id
   FROM categories ca
   JOIN item_categories ic ON ic.category_id = ca.id
   WHERE ca.name = ANY (_categories)
   ) i
CROSS JOIN LATERAL ( -- get full array of categories only for qualifying items
   SELECT ARRAY( -- array constructor is a bit cheaper
      SELECT ca.name
      FROM item_categories ic
      JOIN categories ca ON ca.id = ic.category_id
      WHERE ic.item_id = i.item_id
      ORDER BY 1 -- optional, to get consistent sort order in output array
      ) AS categories_arr
   ) item_id_to_cat_array
CROSS JOIN LATERAL ( -- get counts only for qualifying items
   SELECT count(*) AS clicks
   FROM clicks c
   WHERE c.item_id = i.item_id
   AND (_start_date IS NULL OR c.created_at >= _start_date)
   AND (_end_date IS NULL OR c.created_at <= _end_date)
   ) click_counts
JOIN items ON items.id = i.item_id
WHERE _categories IS NULL OR _categories && item_id_to_cat_array.categories_arr;
$func$;
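A call from plain SQL might then look like this. It is only a sketch: the category names and dates are sample values, not taken from real data.

```sql
-- Items in category 'fruit' or 'test', counting clicks since 2024-04-16
SELECT * FROM public.rpc_test('{fruit,test}', '2024-04-16');

-- Same with named notation, adding an upper bound
SELECT *
FROM   public.rpc_test(_categories => ARRAY['fruit', 'test']
                     , _start_date => '2024-04-16'
                     , _end_date   => '2024-04-30');
```

Note that CREATE OR REPLACE with a different parameter list does not replace the earlier rpc_test(json) version. It creates a second, overloaded function, so you may want to drop the old one first.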
You had multiple SQL issues. I rewrote the logic to make it work.
- Identify items that match any input category. item_categories.item_id is enough for now.
- Collect the complete array of categories only for those selected few instead of doing that for all (which is expensive).
- Same for clicks.
- Finally join to table items to get item details.
Note that this expects input for _categories, as the query wouldn't be good for returning all items. If you need that option, switch back to LANGUAGE plpgsql and fork the case with a separate query.
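Such a fork could be sketched roughly as follows. This is one possible shape under assumptions, not a tested implementation: the name rpc_test_all is hypothetical, and delegating the filtered case to rpc_test from above is just one way to avoid repeating the query.

```sql
-- Sketch only: rpc_test_all is a hypothetical name
CREATE OR REPLACE FUNCTION public.rpc_test_all (_categories text[] = NULL, _start_date date = NULL, _end_date date = NULL)
  RETURNS TABLE (id bigint
               , created_at timestamptz
               , name text
               , clicks bigint
               , categories_arr text[])
  LANGUAGE plpgsql AS
$func$
BEGIN
   IF _categories IS NULL THEN
      -- no category filter: start from items and return every row
      RETURN QUERY
      SELECT it.id, it.created_at, it.name, cc.clicks, ca_arr.arr
      FROM   items it
      CROSS  JOIN LATERAL (  -- category array per item (empty array if none)
         SELECT ARRAY(
            SELECT ca.name
            FROM   item_categories ic
            JOIN   categories ca ON ca.id = ic.category_id
            WHERE  ic.item_id = it.id
            ORDER  BY 1
            ) AS arr
         ) ca_arr
      CROSS  JOIN LATERAL (  -- click count per item (0 if none in the window)
         SELECT count(*) AS clicks
         FROM   clicks c
         WHERE  c.item_id = it.id
         AND   (_start_date IS NULL OR c.created_at >= _start_date)
         AND   (_end_date   IS NULL OR c.created_at <= _end_date)
         ) cc;
   ELSE
      -- category filter given: reuse the SQL function defined above
      RETURN QUERY
      SELECT * FROM public.rpc_test(_categories, _start_date, _end_date);
   END IF;
END
$func$;
```

All column references inside the body are table-qualified on purpose, so they cannot be confused with the output column names declared in RETURNS TABLE.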
CREATE TABLE scripts, and always disclose your version of Postgres.
"categories": "{fruits, test}". But why pass arguments wrapped into JSON? Pass plain values instead.