Let us suppose that I have a table d_day
containing two columns, id_day
, and date
, and a table fact
that contains two columns, id_day
and fact
. My table fact
is partitioned across id_day
.
The request
SELECT * FROM fact
WHERE "ID_DAY" between <ID_A> and <ID_B>
is pruning partitions properly, and is almost instant but
SELECT * FROM fact
INNER JOIN d_dat USING (id_day)
WHERE date between <DATE_A> and <DATE_B>
is not. I have a UNIQUE constraint over date
in my d_day
column. So, I have a 1:1 relationship between the two columns.
On a broader level, I do not understand how the query planner deals with this demand.
Even something like :
SELECT * FROM fact
WHERE id_day IN (
SELECT DISTINCT(id_day) WHERE date BETWEEN <DATE_A> and <DATE_B>
)
is slow.
Here is a practical example when filtering on DATEJ:
set enable_partitionwise_join = true; set enable_partitionwise_aggregate = true;
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)(
select * from "F_TDOJ_HIST_2" fth
inner join "D_TRAIN" dt using ("ID_TRAIN")
inner join "DL_CAPTURE_DATE" dj using ("ID_JOUR", "JX")
where "JX" = -72
and "ID_ENTNAT" between 4000 and 4999
and "DATEJ" = '2023-12-08'::date);
And here is the example when filtering on ID_JOUR
set enable_partitionwise_join = true; set enable_partitionwise_aggregate = true;
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)(
select * from "F_TDOJ_HIST_2" fth
inner join "D_TRAIN" dt using ("ID_TRAIN")
inner join "DL_CAPTURE_DATE" dj using ("ID_JOUR", "JX")
where "JX" = -72
and "ID_ENTNAT" between 4000 and 4999
and "ID_JOUR" = 4756);
Also, when I play the request using DATEJ on a copy of my database, partitioned across a different ID_JOUR span (30 ID_JOUR in a partition vs 3 ID_JOUR) now, I get this plan, which is much faster, so I know doing something is possible. I just do not understand why it is now slower, what to change, and why :
set enable_partitionwise_join = true; set enable_partitionwise_aggregate = true;
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)(
select * from "F_TDOJ_HIST_2" fth
inner join "D_TRAIN" dt using ("ID_TRAIN")
inner join "DL_CAPTURE_DATE" dj using ("ID_JOUR", "JX")
where "JX" = -72
and "ID_ENTNAT" between 4000 and 4999
and "DATEJ" = '2023-12-08'::date);
-
2Do yourself a favor, and change all those object names to lower case.RonJohn– RonJohn2023年10月19日 21:58:24 +00:00Commented Oct 19, 2023 at 21:58
-
1And remove quoted identifiers!Vérace– Vérace2023年10月20日 00:28:58 +00:00Commented Oct 20, 2023 at 0:28
-
I know this is the usual standardization, but I am undergoing a legacy system. But still, I will change it in the question.Doe Jowns– Doe Jowns2023年10月20日 09:32:24 +00:00Commented Oct 20, 2023 at 9:32
1 Answer 1
The decision which partitions must be scanned is either made during query planning or when the query starts executing. But PostgreSQL cannot know at that time what id_day
values belong to the d_day
rows where date
satisfies the condition. So it cannot exclude any partition.
-
You confirm that there is absolutely no way to filter on "DATE" through some reindexing of my table ? I have a lot of code to change if I must change all my filters from "DATE" to "ID_DAY".Doe Jowns– Doe Jowns2023年10月19日 15:01:43 +00:00Commented Oct 19, 2023 at 15:01
-
Can you think of a theoretical way how this could work? I cannot, short of executing a query on
d_day
first and queryingfact
second, with a list ofid_day
.Laurenz Albe– Laurenz Albe2023年10月19日 15:17:15 +00:00Commented Oct 19, 2023 at 15:17 -
I have a 1:1 relationship between
d_day
andid_day
in my tableday
(1 <--> 01/01/2010, 2 <--> 02/01/2010, ...), so it should be possible to deduce it. Still, I also tried aselect from where id_day in (select distinct id_day from day where d_day = ...)
and it was just as slowDoe Jowns– Doe Jowns2023年10月20日 09:43:28 +00:00Commented Oct 20, 2023 at 9:43 -
Yes, that won't help either. You won't get partition pruning that way.Laurenz Albe– Laurenz Albe2023年10月20日 10:12:10 +00:00Commented Oct 20, 2023 at 10:12
-
But why doesn't it work ?Doe Jowns– Doe Jowns2023年10月20日 12:27:21 +00:00Commented Oct 20, 2023 at 12:27