I have a slow Postgres (v15.3) query and I can't seem to get to the bottom of why. I've considered partitioning, but articles suggest that with a 10GB database partitioning isn't required and it seems like a real pain to set up on a production database.
The tables are 'large' but small in the grand scheme of things compared to other PGSQL databases.
student_tracks has 37m rows, which is one month's worth of data, so it grows at a reasonable rate. However, we'll only retain the last 3 months' worth, so roughly 100m rows on average.
students has 10m rows and, like the above, holds one month's worth of data, capped at 3 months.
The problem is producing monthly reports on these tables (ideally I'd also like to run 3- and 6-month reports, but 1 month is fine for now). The current query is as follows:
select
"students"."id",
"students"."student_type_id",
"students"."user_agent",
"students"."user_agent_is_mobile",
(
select
TO_CHAR(
MAX(student_tracks.enrolled_at) :: DATE,
'Mon yy'
)
from
"student_tracks"
where
student_tracks.student_id = students.id
and "student_tracks"."enrolled_at" between '2023-02-01'
and '2023-05-30'
) as "last_enrolled",
(
select
1
from
"chats"
where
chats.student_id = students.id
and "chats"."created_at" between '2023-02-01'
and '2023-05-30'
limit
1
) as "has_chat",
(
select
1
from
"chat_reports"
inner join "chats" on "chats"."id" = "chat_reports"."chat_id"
where
chats.student_id = students.id
and "chat_reports"."created_at" between '2023-02-01'
and '2023-05-30'
limit
1
) as "has_chat_report"
from
"students"
where
exists (
select
1
from
"student_tracks"
where
student_tracks.student_id = students.id
and "student_tracks"."enrolled_at" between '2023-02-01'
and '2023-05-30'
)
This query takes 352 seconds and returns 7.8m records, running against a read replica that is idle apart from these test queries.
The query plan for the above is as follows:
Gather  (cost=1049084.91..31608500.92 rows=702040 width=215) (actual time=15461.166..352171.314 rows=7132605 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  Buffers: shared hit=80066569 read=434196
  I/O Timings: shared/local read=774055.958
  ->  Parallel Hash Semi Join  (cost=1048084.91..1693374.65 rows=292517 width=175) (actual time=15436.019..22445.654 rows=2377535 loops=3)
        Hash Cond: ((students.id)::text = (student_tracks.student_id)::text)
        Buffers: shared hit=688458 read=354409
        I/O Timings: shared/local read=658645.994
        ->  Parallel Seq Scan on students  (cost=0.00..426288.80 rows=4323980 width=175) (actual time=0.008..3101.324 rows=3517892 loops=3)
              Buffers: shared hit=28393 read=354409
              I/O Timings: shared/local read=658645.994
        ->  Parallel Hash  (cost=894892.34..894892.34 rows=12255405 width=32) (actual time=15361.488..15361.489 rows=9815722 loops=3)
              Buckets: 33554432  Batches: 1  Memory Usage: 2337216kB
              Buffers: shared hit=659943
              ->  Parallel Seq Scan on student_tracks  (cost=0.00..894892.34 rows=12255405 width=32) (actual time=0.019..6883.347 rows=9815722 loops=3)
                    Filter: ((connected_at >= '2023-02-01 00:00:00'::timestamp without time zone) AND (connected_at <= '2023-05-30 00:00:00'::timestamp without time zone))
                    Rows Removed by Filter: 2982354
                    Buffers: shared hit=659943
  SubPlan 2
    ->  Result  (cost=4.69..4.70 rows=1 width=32) (actual time=0.026..0.026 rows=1 loops=7132605)
          Buffers: shared hit=35596058 read=66967
          I/O Timings: shared/local read=86084.242
          InitPlan 1 (returns 1ドル)
            ->  Limit  (cost=0.56..4.69 rows=1 width=8) (actual time=0.025..0.025 rows=1 loops=7132605)
                  Buffers: shared hit=35596058 read=66967
                  I/O Timings: shared/local read=86084.242
                  ->  Index Scan Backward using "~student_tracks-a1b7f7a8" on student_tracks student_tracks_1  (cost=0.56..173.61 rows=42 width=8) (actual time=0.023..0.023 rows=1 loops=7132605)
                        Index Cond: (((student_id)::text = (students.id)::text) AND (connected_at IS NOT NULL) AND (connected_at >= '2023-02-01 00:00:00'::timestamp without time zone) AND (connected_at <= '2023-05-30 00:00:00'::timestamp without time zone))
                        Buffers: shared hit=35596058 read=66967
                        I/O Timings: shared/local read=86084.242
  SubPlan 3
    ->  Limit  (cost=0.43..8.45 rows=1 width=4) (actual time=0.010..0.010 rows=0 loops=7132605)
          Buffers: shared hit=21556886 read=8996
          I/O Timings: shared/local read=10695.264
          ->  Index Scan using "~chats-5ca320c0" on chats  (cost=0.43..8.45 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=7132605)
                Index Cond: (((student_id)::text = (students.id)::text) AND (created_at >= '2023-02-01 00:00:00'::timestamp without time zone) AND (created_at <= '2023-05-30 00:00:00'::timestamp without time zone))
                Buffers: shared hit=21556886 read=8996
                I/O Timings: shared/local read=10695.264
  SubPlan 4
    ->  Limit  (cost=0.85..29.36 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=7132605)
          Buffers: shared hit=22225167 read=3824
          I/O Timings: shared/local read=18630.459
          ->  Nested Loop  (cost=0.85..29.36 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=7132605)
                Buffers: shared hit=22225167 read=3824
                I/O Timings: shared/local read=18630.459
                ->  Index Scan using "~chats-4f9644a3" on chats chats_1  (cost=0.43..12.46 rows=2 width=33) (actual time=0.007..0.007 rows=0 loops=7132605)
                      Index Cond: ((student_id)::text = (students.id)::text)
                      Buffers: shared hit=21581799 read=2191
                      I/O Timings: shared/local read=16709.616
                ->  Index Scan using chat_leads_pkey on chat_leads  (cost=0.42..8.45 rows=1 width=33) (actual time=0.019..0.019 rows=0 loops=184067)
                      Index Cond: ((chat_id)::text = (chats_1.id)::text)
                      Filter: ((created_at >= '2023-02-01 00:00:00'::timestamp without time zone) AND (created_at <= '2023-05-30 00:00:00'::timestamp without time zone))
                      Rows Removed by Filter: 0
                      Buffers: shared hit=643368 read=1633
                      I/O Timings: shared/local read=1920.842
Planning:
  Buffers: shared hit=266
Planning Time: 0.876 ms
Execution Time: 353064.689 ms
I would really welcome any suggestions on how to improve this query. I've tried all sorts of online Postgres query-plan analyzers and just can't get it down to a reasonable timeframe.
Thanks!
- Please add your relevant table definitions and index definitions to your post. – J.D., Jun 5, 2023
- Why is 6 minutes too long to wait for a monthly report? – jjanes, Jun 5, 2023
1 Answer
I would start by converting the subqueries into joins:
SELECT
"students"."id",
"students"."student_type_id",
"students"."user_agent",
"students"."user_agent_is_mobile",
TO_CHAR(
MAX(student_tracks.enrolled_at) :: DATE,
'Mon yy'
) as "last_enrolled",
( COUNT("chats"."id") > 0 ) as "has_chat",
( COUNT("chat_reports"."chat_id") > 0 ) as "has_chat_report"
FROM "students"
JOIN "student_tracks" ON student_tracks.student_id = students.id
LEFT JOIN "chats" ON (chats.student_id = students.id and "chats"."created_at" between '2023-02-01' and '2023-05-30')
LEFT JOIN "chat_reports" ON ("chats"."id" = "chat_reports"."chat_id" and "chat_reports"."created_at" between '2023-02-01' and '2023-05-30' )
WHERE
"student_tracks"."enrolled_at" between '2023-02-01' and '2023-05-30'
GROUP BY
"students"."id",
"students"."student_type_id",
"students"."user_agent",
"students"."user_agent_is_mobile"
;
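One note on the GROUP BY: if "students"."id" is the primary key (which seems likely but isn't confirmed in the question), Postgres's functional-dependency rule lets you group by "students"."id" alone, since the other three columns are then implied. Listing all four is harmless, just more verbose.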
Some obvious indexes would be (see the DDL sketch after this list):
students (id)
student_tracks (student_id)
chats (student_id, id)
chat_reports (chat_id)
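As a sketch of the corresponding DDL (the index names here are made up, and students (id) is almost certainly covered by the primary key already, so only the other three are likely needed):
-- Hypothetical index names; adjust to your own conventions.
-- CONCURRENTLY avoids blocking writes on a production table,
-- but note it cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY IF NOT EXISTS student_tracks_student_id_idx
    ON student_tracks (student_id);
CREATE INDEX CONCURRENTLY IF NOT EXISTS chats_student_id_id_idx
    ON chats (student_id, id);
CREATE INDEX CONCURRENTLY IF NOT EXISTS chat_reports_chat_id_idx
    ON chat_reports (chat_id);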
Try this and, if the result is not yet fast enough, post all your index definitions and an updated query plan.