I've got a one to one relation from users to addresses table. Where one user can have one search address and one verified address.
I've got two indexes on addresses table :
- Index on state field
- Index on user_id
I'm trying to fetch addresses for certain users only, and those addresses whose state is other than manual_verification
.
Here is my query:
SELECT users.id
FROM "users" INNER JOIN addresses
ON addresses.user_id = users.id
and addresses.type = 'VerifiedAddress'
WHERE ("users".deleted_at IS NULL)
AND (users.id in (11144,10569,21519,783,15671,21726,17787,11665,
19579,12226,1324,9413,5461,20981,12906)
and addresses.state != 'manual_verification')
Explain for the query above : http://explain.depesz.com/s/rTj
It takes 37 ms. Sometimes more depending on the number of users.
I think this is a good query, however our team needs to investigate around this, and I'm looking for some optimization tips. I mean I did the one field select, there is a index on user_id (addresses) and state (addresses).
Is there anything else I can do/try?
Update
I found out that this query works much faster:
SELECT "addresses"."user_id"
FROM "addresses"
WHERE "addresses"."type" IN ('VerifiedAddress')
AND (user_id in (9681,23824,23760,20098,962,14730,12294,9552,534,
553,5837,6768,6583,956,24179) and state != 'manual_verification')
Explain for this query: http://explain.depesz.com/s/nHrr
1 Answer 1
I'm trying to fetch addresses for certain users only ...
IN
does not scale well with big lists. Compare performance to unnest()
/ JOIN
SELECT a.*
FROM unnest(ARRAY[9681,23824,23760,20098,962,14730,12294,9552,534,
553,5837,6768,6583,956,24179]) AS t(user_id)
JOIN addresses a USING (user_id)
WHERE a.type = 'VerifiedAddress'
AND a.state <> 'manual_verification'
I hope state
and type
are at least enum
types. Otherwise you should normalize your database design. Have lookup-tables for state
and type
and only use a small IDs referencing entries there. Makes the addresses
table smaller and everything faster.
Assuming all columns NOT NULL
for lack of information.
Partial index
If you need it faster still, and if there are more than a few rows with type <> 'VerifiedAddress'
or state = 'manual_verification'
, a partial index would help some more. The gain grows with the percentage of rows you can exclude from the index.
CREATE INDEX addresses_selection_idx ON addresses (user_id)
WHERE type = 'VerifiedAddress'
AND state <> 'manual_verification';
Be sure to use matching WHERE
conditions in your query to allow Postgres to use this index. And weigh the benefit of tailored indexes against their cost (maintenance, extra disk space, small penalty for write operations).
-
hi Erwin, what do you consider to be a big list for in not to scale well?Gandalf StormCrow– Gandalf StormCrow2014年02月19日 01:49:47 +00:00Commented Feb 19, 2014 at 1:49
-
@GandalfStormCrow: Actual numbers depend on the details of your installation and query. In a quick test on Postgres 9.1 with integer categories on a big table (1MM rows) with index ,
IN
was about twice as fast with a hand full of items.unnest()
took the lead starting at ~ 160 items. Run a few tests withEXPLAIN ANALYZE
to find the characteristics in your installation.Erwin Brandstetter– Erwin Brandstetter2014年02月19日 03:37:03 +00:00Commented Feb 19, 2014 at 3:37
Explore related questions
See similar questions with these tags.
user_id
that references theusers(id)
, which should be, then you are fine. But the second query does not take into account the(users.deleted_at IS NULL)
condition.(deleted_at)
?