Postgres newbie here.
I'm wondering whether this query is optimized. I tried to put only the conditions that are 100% necessary in the JOIN ... ON clauses and leave all the dynamic conditions in the WHERE clause. See below.
SELECT *
FROM
myapp_employees
JOIN myapp_users ON
myapp_users.user_id=myapp_employees.user_id
JOIN myapp_contacts_assoc ON
myapp_contacts_assoc.user_id=myapp_users.user_id
JOIN myapp_contacts ON
myapp_contacts.contact_id=myapp_contacts_assoc.contact_id
WHERE
myapp_contacts.value='[email protected]' AND
myapp_contacts.type=(1)::INT2 AND
myapp_contacts.is_primary=(1)::INT2 AND
myapp_contacts.expired_at IS NULL AND
myapp_employees.status=(1)::INT2 AND
myapp_users.status=(1)::INT2
LIMIT 1;
Note: For context, this proc is checking to see if a user is also an employee (elevated privs/different user type).
Anyways, is this the right way to go? Should the JOIN ON contain more conditions, like checking for expired_at IS NULL, for example? Why does or doesn't this make sense?
2 Answers
Logically, it makes no difference at all whether you place conditions in the join clause of an INNER JOIN or the WHERE clause of the same SELECT. The effect is the same.
(Not the case for OUTER JOIN, i.e. LEFT JOIN, RIGHT JOIN, FULL JOIN!)
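To illustrate the outer-join caveat, here is a minimal sketch using two of the tables from the question. The selected columns are chosen only for illustration, and the two queries are not taken from the original post.

```sql
-- Filter in the ON clause: every user is returned;
-- users without an active employee row get NULL for e.status.
SELECT u.user_id, e.status
FROM   myapp_users u
LEFT   JOIN myapp_employees e ON  e.user_id = u.user_id
                              AND e.status = 1;

-- Filter in the WHERE clause: rows where e.status is NULL are discarded,
-- so only users with an active employee row remain
-- (the LEFT JOIN effectively behaves like an INNER JOIN).
SELECT u.user_id, e.status
FROM   myapp_users u
LEFT   JOIN myapp_employees e ON e.user_id = u.user_id
WHERE  e.status = 1;
```

With a plain (INNER) JOIN in place of LEFT JOIN, both forms return the same rows.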
While operating with default settings it also makes no difference for the query plan or performance. Postgres is free to rearrange predicates in JOIN & WHERE clauses in its quest for the best query plan, as long as the number of tables is not greater than the join_collapse_limit (default 8).
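If you want to see how this setting applies to your session, something like the following should work. It is only a sketch for experimenting, not a tuning recommendation.

```sql
-- Show the current value (8 unless it has been changed).
SHOW join_collapse_limit;

-- For the current session only: a value of 1 stops the planner from
-- reordering explicit JOINs, so they are joined in the order written.
SET join_collapse_limit = 1;

-- Return to the configured default.
RESET join_collapse_limit;
```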
For readability and maintainability it makes sense to place conditions that connect tables in the respective JOIN clause and general conditions in the WHERE clause.
Your query looks just fine. I would use table aliases to cut back the noise, though.
Minor detail: int2 '1' or even 1::int2 are more sensible than (1)::INT2. And while comparing to a value of a well-defined numeric data type, a plain numerical constant 1 is good enough, too.
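Putting the two suggestions together (table aliases and plain numeric constants), the query from the question might look like this. The alias names are my own choice, and the plain constants assume the compared columns really are a numeric type such as int2.

```sql
SELECT *
FROM   myapp_employees      e
JOIN   myapp_users          u  ON u.user_id = e.user_id
JOIN   myapp_contacts_assoc ca ON ca.user_id = u.user_id
JOIN   myapp_contacts       c  ON c.contact_id = ca.contact_id
WHERE  c.value = '[email protected]'  -- literal exactly as shown in the question
AND    c.type = 1                     -- plain constant instead of (1)::INT2
AND    c.is_primary = 1
AND    c.expired_at IS NULL
AND    e.status = 1
AND    u.status = 1
LIMIT  1;
```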
A couple of points:
If you're joining on columns that have the same name (user_id in your case), you can use USING (user_id) rather than ON (a.user_id = b.user_id). This also keeps a redundant column out of the output (if you're running SELECT * in production).
1::int2 is problematic. Either status, is_primary and the others are already int2, in which case the literal 1 will automatically be cast to int2 (or the int2 cast to int, as pg sees fit). Or you're storing them as regular ints and casting them down as if that made a difference in computation, which it doesn't; the cast alone makes that a losing proposition.
When possible, all of the columns compared with ::int2 should probably be stored as boolean. Then you can write your WHERE condition more simply, too.
For your type and status, you may want an ENUM type.
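A rough sketch of these points applied to the query from the question. The alias names, the enum type name and its labels are hypothetical, invented here for illustration; the actual column types are not shown in the question.

```sql
-- The join columns share a name, so USING can replace ON.
-- With SELECT *, user_id and contact_id each appear only once in the output.
SELECT *
FROM   myapp_employees      e
JOIN   myapp_users          u  USING (user_id)
JOIN   myapp_contacts_assoc ca USING (user_id)
JOIN   myapp_contacts       c  USING (contact_id)
WHERE  c.value = '[email protected]'  -- literal exactly as shown in the question
AND    c.type = 1
AND    c.is_primary = 1              -- would be just "c.is_primary" if the column were boolean
AND    c.expired_at IS NULL
AND    e.status = 1
AND    u.status = 1
LIMIT  1;

-- Hypothetical enum for a status column; the labels are made up.
CREATE TYPE myapp_status AS ENUM ('inactive', 'active');
```

With an enum column, the filter would read something like u.status = 'active' instead of comparing against a magic number.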