What considerations should be made when deciding on indexes for PostgreSQL databases?

Question 1

My understanding is that you want to decide on what to include in your index based what will trigger the index being used.

The following considerations are what I am aware of, in order of relevance:

(Most common) Any columns used with the following query clauses: WHERE, ORDER BY, JOIN
Index-only scans: getting data from a small selection of columns included in a covering index. In this case, data from an index could be used from a column just included in the SELECT clause, but the query will likely still contain a column in a WHERE clause that should also be indexed. See PostgreSQL's implementation for more details.
Getting a full count: something like SELECT count(*) FROM tbl; (It's not clear to me why an index would be used for this one)
For the checking of Foreign Key constraints: for verifying a record exists before allowing it to be added as a Foreign Key, or for deleting/setting to null/etc. records in tables WHERE colx_withFK = _id if a record with id _id is deleted in the source table, etc. PostgreSQL auto-creates indices for FKs, but other RDBMSs (like SQL Server) might not, so this one might not be something a developer has to actually worry about.

And for all of these considerations, of course, anything you execute often is more important than a less used query.

Question: Did I get something wrong? Is there anything else?

As details can be very different between SQL implementations, answers can focus on PostgreSQL.

Sources: helpful article, SE DBA answer and comments

Question 2

"All RDBMSs using SQL should auto-create indices for FKs so this one shouldn't be something a developer has to actually worry about." - Thank goodness SQL Server doesn't do that behind your back. As with nearly all things DBMS related, "it depends". GROUP BY can also benefit from ordered indexes.

Question 3

"(It's not clear to me why an index would be used for this one)" One line above you mentioned Index-only scans

Question 4

@SergeyA Why would that be faster than a normal table scan?

Question 5

@Akaisteph7 Because the index usually is smaller than a table

Question 6

@SergeyA Idk what you mean by that. It seems in the case of a full count the number of rows to traverse for a table scan should be even less than the number of nodes to traverse for an index scan. I guess maybe you're trying to say its faster to traverse the index even in this case because of how its structured and stored maybe. But how would you know this?

Question 7

There are a few things you forgot:

creating an index on an expression makes PostgreSQL collect statistics for the expression (from v15 on, you can alternatively use extended statistics for that purpose)
unique indexes can implement constraints

There are also partial indexes, where part of the WHERE condition can be implemented as a WHERE clause for the index.

But I think that you won't be able to come up with a useful list that "semi-automatically" enables you to index a database well without further considerations, like

Which queries are executed how often, and how important is their performance?
Is the benefit in query speed worth the performance loss in data modifications?
Shall I add two specific indexes that ideally support two queries, or is it better to add a single index that is not perfect for either query, but supports both somewhat?

Laurenz Albe Laurenz Albe 62k4 gold badges57 silver badges93 bronze badges · Answer 1 · 2023-08-28 07:28:55Z

There are a few things you forgot:

creating an index on an expression makes PostgreSQL collect statistics for the expression (from v15 on, you can alternatively use extended statistics for that purpose)
unique indexes can implement constraints

There are also partial indexes, where part of the WHERE condition can be implemented as a WHERE clause for the index.

But I think that you won't be able to come up with a useful list that "semi-automatically" enables you to index a database well without further considerations, like

Which queries are executed how often, and how important is their performance?
Is the benefit in query speed worth the performance loss in data modifications?
Shall I add two specific indexes that ideally support two queries, or is it better to add a single index that is not perfect for either query, but supports both somewhat?

Stack Exchange Network

What considerations should be made when deciding on indexes for PostgreSQL databases?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

What considerations should be made when deciding on indexes for PostgreSQL databases?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions