I have a SELECT query that produces exactly the same plan (same indexes used) under EXPLAIN ANALYZE on every run, yet the execution time varies widely: about 2 seconds on some runs versus about 30 seconds on others.
Why would this be the case?
Query
explain analyze SELECT SUM ((t0.item_cash_staked - t0.item_cash_won))
FROM item t0, product t1
WHERE (((( t0.item_rejection_code_id IS null)
AND (t0.item_created_on > '2019-08-01 17:38:33.613+01'))
AND (t1.customer_id = 123456))
AND (t1.product_id = t0.product_id));
Explain result (slower run)
Aggregate (cost=1984710.64..1984710.65 rows=1 width=32) (actual time=26916.904..26916.905 rows=1 loops=1)
  -> Nested Loop (cost=1.14..1983541.69 rows=233789 width=6) (actual time=4531.244..26765.752 rows=453812 loops=1)
        -> Index Scan using product_idx_01 on product t1 (cost=0.57..172442.67 rows=539262 width=4) (actual time=6.213..4490.454 rows=500133 loops=1)
              Index Cond: (customer_id = 123456)
        -> Index Scan using item_idx_product_id on item t0 (cost=0.57..3.24 rows=12 width=10) (actual time=0.035..0.044 rows=1 loops=500133)
              Index Cond: (product_id = t1.product_id)
              Filter: ((item_rejection_code_id IS NULL) AND (item_created_on > '2019-08-01 17:38:33.613+01'::timestamp with time zone))
              Rows Removed by Filter: 1
Planning time: 0.409 ms
Execution time: 26916.999 ms
(10 rows)
Explain result (faster run)
Aggregate (cost=1984710.64..1984710.65 rows=1 width=32) (actual time=1786.816..1786.816 rows=1 loops=1)
  -> Nested Loop (cost=1.14..1983541.69 rows=233789 width=6) (actual time=289.922..1687.398 rows=453812 loops=1)
        -> Index Scan using product_idx_01 on product t1 (cost=0.57..172442.67 rows=539262 width=4) (actual time=0.013..202.082 rows=500133 loops=1)
              Index Cond: (customer_id = 123456)
        -> Index Scan using item_idx_product_id on item t0 (cost=0.57..3.24 rows=12 width=10) (actual time=0.002..0.003 rows=1 loops=500133)
              Index Cond: (product_id = t1.product_id)
              Filter: ((item_rejection_code_id IS NULL) AND (item_created_on > '2019-08-01 17:38:33.613+01'::timestamp with time zone))
              Rows Removed by Filter: 1
Planning time: 0.275 ms
Execution time: 1786.866 ms
(10 rows)
My understanding is that the first run of a query can be slower than later runs because later runs skip the planning phase, but there have been times when I ran the same query one after the other and it still took approximately 25 seconds. Why would this be?
Is there any way of improving the query so that it performs better?
Any help is much appreciated.
1 Answer
You can create an index that is specifically designed for the query:
CREATE INDEX ON item (product_id, item_created_on)
INCLUDE (item_cash_staked, item_cash_won)
WHERE item_rejection_code_id IS NULL;
VACUUM item;
That should get you an index-only scan.
With PostgreSQL versions older than 11, which do not support INCLUDE, you can add the columns to the index key instead:
CREATE INDEX ON item (product_id, item_created_on, item_cash_staked, item_cash_won)
WHERE item_rejection_code_id IS NULL;
VACUUM item;
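To see whether the planner actually picks the new index, it can help to rerun the question's query with buffer statistics. The following is only a sketch: it uses the table and column names from the question, rewrites the comma join as an equivalent explicit JOIN, and adds the BUFFERS option. In the output, look for an Index Only Scan node on item with a low "Heap Fetches" count. Comparing "shared hit" with "read" between a slow and a fast run may also show whether the slow run had to fetch many more blocks from outside shared buffers.

```sql
-- Same query as in the question, written with an explicit JOIN.
-- BUFFERS adds per-node block counts: "shared hit" means the block was
-- already in shared buffers, "read" means it had to be fetched from
-- the operating system (its cache or the disk).
EXPLAIN (ANALYZE, BUFFERS)
SELECT sum(t0.item_cash_staked - t0.item_cash_won)
FROM item AS t0
JOIN product AS t1 ON t1.product_id = t0.product_id
WHERE t0.item_rejection_code_id IS NULL
  AND t0.item_created_on > '2019-08-01 17:38:33.613+01'
  AND t1.customer_id = 123456;
```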
-
Thanks @laurenz-albe, will this work for Postgres 9.6? I get a syntax error from the INCLUDE part onwards. – rdbmsNoob, May 25, 2021 at 10:08
-
I have added instructions for old PostgreSQL versions. – Laurenz Albe, May 25, 2021 at 10:12
-
Thanks, will give it a try :) – rdbmsNoob, May 25, 2021 at 10:45
-
Even after applying the index, it still decides to use item_idx_product_id. – rdbmsNoob, May 26, 2021 at 8:11
-
Did you VACUUM? Does it use the index if you omit the WHERE condition in CREATE INDEX? – Laurenz Albe, May 26, 2021 at 9:16
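The VACUUM question matters because an index-only scan can skip heap visits only for pages that the visibility map marks as all-visible. A rough way to check this, assuming the table is named item as in the question and is on the current search path, is to compare two counters in the pg_class catalog:

```sql
-- relpages:      estimated number of pages in the table
-- relallvisible: estimated number of pages marked all-visible in the
--                visibility map (both are refreshed by VACUUM and ANALYZE)
SELECT relname, relpages, relallvisible
FROM pg_class
WHERE relname = 'item';
```

If relallvisible is far below relpages, the planner will expect many heap fetches and may keep preferring the existing index; running VACUUM on the table and checking again is a reasonable next step.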