trivial query very slow on primary key on a PostgreSQL DB (instantaneous on MySQL)

Question 1

I have a table data:

with ~400 million rows
with data.id as int4, not null, and set as the primary key
it's on an AWS RDS server with ~128 GB of RAM
there are no rows with id > 1e9

This query:

select count(*) from data where id > 1e9;

which returns 0, consistently takes ~25 s to run. It used to take more than 2 minutes (I may have run analyze data during my investigations, after which the time dropped to 25 s).

On another server, an AWS Aurora Postgres instance with the exact same table, it takes ~3 min 30 s.

Anyway, the same query takes a few ms on a MySQL DB, as it should. I am very new to Postgres, so I am probably missing something obvious.

explain analyze select count(*) from data where id > 1e9;

QUERY PLAN
Finalize Aggregate  (cost=11944927.24..11944927.25 rows=1 width=8) (actual time=23986.851..23988.183 rows=1 loops=1)
  ->  Gather  (cost=11944927.03..11944927.24 rows=2 width=8) (actual time=23986.779..23988.177 rows=3 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Partial Aggregate  (cost=11943927.03..11943927.04 rows=1 width=8) (actual time=23984.078..23984.079 rows=1 loops=3)
              ->  Parallel Index Only Scan using data_pkey on data  (cost=0.57..11804489.05 rows=55775191 width=0) (actual time=23984.074..23984.075 rows=0 loops=3)
                    Filter: ((id)::numeric > '1000000000'::numeric)
                    Rows Removed by Filter: 133840000
                    Heap Fetches: 100863313
Planning Time: 0.078 ms
Execution Time: 23988.219 ms

What am I doing wrong?

Comment

Hi, and welcome to dba.se! You should always include full table definitions with any question: the more detail you provide, the better your chance of getting good answers!

Answer

I assume that you have an index on id. If not, you need one.

Your problem is the 1e9. If you write a numeric constant in scientific notation, it is considered to be of type numeric:

SELECT pg_typeof(1e9);
 pg_typeof 
═══════════
 numeric
(1 row)

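So PostgreSQL has to cast id to type numeric to perform the comparison (the (id)::numeric in your execution plan) and cannot use the index to narrow the search. The plan still reads data_pkey, but it scans the whole index and applies the comparison as a Filter to every entry, which is why it removes about 134 million rows in each of the three parallel processes.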
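
To see how PostgreSQL types a literal before the comparison is even planned, pg_typeof can be run on a few constants. This is only an illustrative sketch; the literals other than 1e9 are examples picked here, not values from the question.

```sql
-- Plain digits that fit in 32 bits are typed integer
SELECT pg_typeof(1000000000);    -- integer

-- Plain digits too large for integer are typed bigint
SELECT pg_typeof(10000000000);   -- bigint

-- A decimal point or an exponent makes the constant numeric
SELECT pg_typeof(1000000000.0);  -- numeric
SELECT pg_typeof(1e9);           -- numeric
```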

Using a constant of type integer should speed up processing:

select count(*) from data where id > 1000000000;
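
If scientific notation is preferred for readability, an explicit cast on the constant should have the same effect, because the cast is then applied to the constant rather than to the column. A minimal sketch, assuming the table and column from the question; running EXPLAIN is a way to check that the comparison now appears as an Index Cond instead of a Filter on (id)::numeric:

```sql
-- Cast the constant, not the column (1e9 fits in a 32-bit integer)
SELECT count(*) FROM data WHERE id > 1e9::integer;

-- Standard-SQL spelling of the same cast
SELECT count(*) FROM data WHERE id > CAST(1e9 AS integer);

-- Inspect the plan; the condition should show up as an Index Cond on id
EXPLAIN SELECT count(*) FROM data WHERE id > 1e9::integer;
```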

Comment

Thanks a lot! Now it takes ~150 ms. I knew there had to be some reason. That may also explain another issue I have with a timestamp without time zone field.

Laurenz Albe · Accepted Answer · 2024-03-01 13:41:08Z

