I am using Postgres with the following query:
select count(*) from image;
The primary key on this table is non-incrementing; it's a unique serial number for the images stored in the table. Our app often attempts to ingest images that have already been recorded in the database, so the primary key/serial number ensures they are only recorded once.
Now we are wondering if we should have gone with an incrementing primary key instead. We have 1,259,369 images in the database and it takes about 7 minutes for the count query to run.
Our app will never delete images from this table - so an incrementing primary key would allow us to check the value of the last ID which would equal the number of rows in the table.
2 Answers 2
Generally, if you don't need an exact count, there is a much faster way:
SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE oid = 'image'::regclass;
See:
Strictly speaking, in a DB with concurrent write access every count is an estimate because the number may be outdated the instant you get it.
But there is something off in your DB. Counting a million rows should not take more than a few seconds in the worst case. (Much less with modern Postgres on modern hardware.)
That your "app will never delete images from this table" makes this even more suspicious because there shouldn't be many dead rows then. (Or are you updating a lot?) A huge amount of dead tuples could slow you down - and call for VACUUM
. Normally, autovacuum
takes care of this. Did you enable it? It's on by default in modern Postgres.
Check for dead tuples:
-
1Thanks for all your comments. Turning on autovacuum seems to have done the trick - a full count now takes about 6 seconds or so. I also learned through the process of tuning Postgres that I am running 32-bit instead of 64-bit, which means I will have to rebuild my Lubuntu install so I can take full advantage of my 16GB of RAM. Thanks for helping out this recent convert from MySQL!andrewniesen– andrewniesen2015年03月23日 03:45:16 +00:00Commented Mar 23, 2015 at 3:45
Using pg_stat_user_tables you can find row count of a PostgreSQL Table:
SELECT
schemaname
,relname
,n_live_tup AS EstimatedCount
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;
Explore related questions
See similar questions with these tags.
explain (analyze, verbose, buffers) select count(*) from image
. And please post the definition of the table (create table image (...)
)