Optimizing Postgres query

Question 1

I've got a one to one relation from users to addresses table. Where one user can have one search address and one verified address.

I've got two indexes on addresses table :

Index on state field
Index on user_id

I'm trying to fetch addresses for certain users only, and those addresses whose state is other than manual_verification.

Here is my query:

SELECT users.id 
FROM "users" INNER JOIN addresses 
 ON addresses.user_id = users.id 
 and addresses.type = 'VerifiedAddress' 
WHERE ("users".deleted_at IS NULL) 
 AND (users.id in (11144,10569,21519,783,15671,21726,17787,11665,
 19579,12226,1324,9413,5461,20981,12906) 
 and addresses.state != 'manual_verification')

Explain for the query above : http://explain.depesz.com/s/rTj

It takes 37 ms. Sometimes more depending on the number of users.

I think this is a good query, however our team needs to investigate around this, and I'm looking for some optimization tips. I mean I did the one field select, there is a index on user_id (addresses) and state (addresses).

Is there anything else I can do/try?

Update

I found out that this query works much faster:

 SELECT "addresses"."user_id" 
 FROM "addresses" 
 WHERE "addresses"."type" IN ('VerifiedAddress') 
 AND (user_id in (9681,23824,23760,20098,962,14730,12294,9552,534,
 553,5837,6768,6583,956,24179) and state != 'manual_verification')

Explain for this query: http://explain.depesz.com/s/nHrr

Question 2

Please provide the plan for the query. Preferably here - explain.depesz.com

Question 3

Of course, it uses only 1 of the 2 tables. If there is a foreign key on user_id that references the users(id), which should be, then you are fine. But the second query does not take into account the (users.deleted_at IS NULL) condition.

Question 4

Do you have an index on (deleted_at) ?

Question 5

@ypercube no I don't have index at delete_at, should I have it?

Question 6

Its ok that it doesn't take into account delete at because the address is dependent relation, it will get deleted if the user is deleted

Question 7

I'm trying to fetch addresses for certain users only ...

IN does not scale well with big lists. Compare performance to unnest() / JOIN

SELECT a.*
FROM unnest(ARRAY[9681,23824,23760,20098,962,14730,12294,9552,534,
 553,5837,6768,6583,956,24179]) AS t(user_id)
JOIN addresses a USING (user_id)
WHERE a.type = 'VerifiedAddress'
AND a.state <> 'manual_verification'

I hope state and type are at least enum types. Otherwise you should normalize your database design. Have lookup-tables for state and type and only use a small IDs referencing entries there. Makes the addresses table smaller and everything faster.

Assuming all columns NOT NULL for lack of information.

Partial index

If you need it faster still, and if there are more than a few rows with type <> 'VerifiedAddress' or state = 'manual_verification', a partial index would help some more. The gain grows with the percentage of rows you can exclude from the index.

CREATE INDEX addresses_selection_idx ON addresses (user_id)
WHERE type = 'VerifiedAddress'
AND state <> 'manual_verification';

Be sure to use matching WHERE conditions in your query to allow Postgres to use this index. And weigh the benefit of tailored indexes against their cost (maintenance, extra disk space, small penalty for write operations).

Question 8

hi Erwin, what do you consider to be a big list for in not to scale well?

Question 9

@GandalfStormCrow: Actual numbers depend on the details of your installation and query. In a quick test on Postgres 9.1 with integer categories on a big table (1MM rows) with index , IN was about twice as fast with a hand full of items. unnest() took the lead starting at ~ 160 items. Run a few tests with EXPLAIN ANALYZE to find the characteristics in your installation.

score 7 · Accepted Answer · 2014-02-18 20:48:42Z

I'm trying to fetch addresses for certain users only ...

IN does not scale well with big lists. Compare performance to unnest() / JOIN

SELECT a.*
FROM unnest(ARRAY[9681,23824,23760,20098,962,14730,12294,9552,534,
 553,5837,6768,6583,956,24179]) AS t(user_id)
JOIN addresses a USING (user_id)
WHERE a.type = 'VerifiedAddress'
AND a.state <> 'manual_verification'

I hope state and type are at least enum types. Otherwise you should normalize your database design. Have lookup-tables for state and type and only use a small IDs referencing entries there. Makes the addresses table smaller and everything faster.

Assuming all columns NOT NULL for lack of information.

Partial index

If you need it faster still, and if there are more than a few rows with type <> 'VerifiedAddress' or state = 'manual_verification', a partial index would help some more. The gain grows with the percentage of rows you can exclude from the index.

CREATE INDEX addresses_selection_idx ON addresses (user_id)
WHERE type = 'VerifiedAddress'
AND state <> 'manual_verification';

Be sure to use matching WHERE conditions in your query to allow Postgres to use this index. And weigh the benefit of tailored indexes against their cost (maintenance, extra disk space, small penalty for write operations).

hi Erwin, what do you consider to be a big list for in not to scale well?
@GandalfStormCrow: Actual numbers depend on the details of your installation and query. In a quick test on Postgres 9.1 with integer categories on a big table (1MM rows) with index , IN was about twice as fast with a hand full of items. unnest() took the lead starting at ~ 160 items. Run a few tests with EXPLAIN ANALYZE to find the characteristics in your installation.

Stack Exchange Network

Optimizing Postgres query

1 Answer 1

Partial index

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related