29

Postgres newbie here.

I'm wondering if this query is optimized or not? I tried to JOIN ON only the values that are 100% necessary and leaving all the dynamic conditions in the WHERE clause. See below.

SELECT *
 FROM
 myapp_employees
 JOIN myapp_users ON
 myapp_users.user_id=myapp_employees.user_id
 JOIN myapp_contacts_assoc ON
 myapp_contacts_assoc.user_id=myapp_users.user_id
 JOIN myapp_contacts ON
 myapp_contacts.contact_id=myapp_contacts_assoc.contact_id
 WHERE
 myapp_contacts.value='[email protected]' AND
 myapp_contacts.type=(1)::INT2 AND
 myapp_contacts.is_primary=(1)::INT2 AND
 myapp_contacts.expired_at IS NULL AND
 myapp_employees.status=(1)::INT2 AND
 myapp_users.status=(1)::INT2
 LIMIT 1;

Note: For context, this proc is checking to see if a user is also an employee (elevated privs/different user type).

Anyways, is this the right way to go? Should the JOIN ON contain more statements like checking for expired_at IS NULL, for example? Why or why doesn't this make sense?

Erwin Brandstetter
186k28 gold badges463 silver badges636 bronze badges
asked Nov 22, 2016 at 3:03
3
  • What about your version of Postgres? (SELECT version();) Commented Nov 22, 2016 at 3:50
  • @ErwinBrandstetter I'm running PostgreSQL 9.3.14. Should this be something that I require in each function? Commented Nov 22, 2016 at 3:54
  • 2
    No, we here on dba.SE require of you to declare relevant software versions, because they make a difference for many questions. Commented Nov 22, 2016 at 4:13

2 Answers 2

39

Logically, it makes no difference at all whether you place conditions in the join clause of an INNER JOIN or the WHERE clause of the same SELECT. The effect is the same.

(Not the case for OUTER JOIN - i.e. LEFT JOIN, RIGHT JOIN, FULL JOIN!)

While operating with default settings it also makes no difference for the query plan or performance. Postgres is free to rearrange predicates in JOIN & WHERE clauses in its quest for the best query plan - as long as the number of tables is not greater than the join_collapse_limit (default 8). Details:

For readability and maintainability it makes sense to place conditions that connect tables in the respective JOIN clause and general conditions in the WHERE clause.

Your query looks just fine. I would use table aliases to cut back the noise, though.

Minor detail:

int2 '1' or even 1::int2 are more sensible than (1)::INT2. And while comparing to a value of well defined numeric data type, a plain numerical constant 1 is good enough, too.

answered Nov 22, 2016 at 3:42
5

A couple of points..

  1. If you're joining on a condition by the same name (user_id) in your case, you can use USING (user_id) rather than ON (a.user_id = b.user_id). This also saves a redundant column from potentially being outputted (if you're running SELECT * in production).

  2. 1::int2 is problematic. Either status, and is_primary and others are already int2 in which case the literal 1 will be automatically be casted to int2, or int2 casted to int as pg sees fit. Or, if you're storing them as regular ints, and casting them down as if that made a difference in computation -- which it doesn't, the cast alone makes that a losing proposition.

  3. When possible, all of the ::int2 should probably be stored as boolean. Then you can write your WHERE condition to be simpler too.

  4. For your type and status, you may want an ENUM type.

answered Nov 22, 2016 at 6:42

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.