For context, I'm using an ORM (ActiveRecord for Ruby on Rails), but I have to drop down into raw MySQL to run some rather complex queries. That makes some things harder than they would be elsewhere: it's easier to add variables to the WHERE part of a query than to the JOIN part (I can bind variables for prepared statements in WHERE, but not in JOIN ... ON). I'm having performance issues and trying to tune indexes, and I need to know how MySQL (specifically, the AWS Aurora implementation of it) handles building a query over a join. A (vastly) simplified version of the query might read:
SELECT *
FROM foos
LEFT JOIN foosbars ON foos.id = foosbars.foo_id
INNER JOIN bars ON bars.id = foosbars.bar_id AND bars.deleted_at = false AND bars.publically_visible = true
WHERE bars.baz = ?input
OR foos.secondary_condition = ?second_input
If I remove the deleted_at / publically_visible filtering conditions the performance is good, but with them the performance tanks. I have an index on (deleted_at, publically_visible, source) that isn't getting used (in favor of one with only deleted_at?), and I can only guess that it's because the joins are being assembled before the filters are applied. I really feel that the filter belongs where it is: it's a bound variable that changes from query to query. Unfortunately, my local dev environment's sample data is so vastly different from production's that my local MySQL database generates a completely different execution plan than the AWS instance.
1 Answer
SELECT *
FROM foos
LEFT JOIN bars ON bars.foo_id = foos.id
AND bars.deleted_at = false
AND bars.publically_visible = true
WHERE bars.baz = ?input
First, note that "LEFT" is irrelevant because you are demanding that baz have a particular value. LEFT JOIN is useful only when you want to get NULLs from the 'righthand' table when there is no matching row there.
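To make that concrete, here is a minimal sketch using hypothetical, stripped-down versions of the two tables (the real schema is not shown in the question, so the column types and sample rows are assumptions). It illustrates why a WHERE test on a bars column cancels the effect of LEFT:

```sql
-- Hypothetical, stripped-down tables for illustration only.
CREATE TABLE foos (id INT PRIMARY KEY);
CREATE TABLE bars (
  id     INT PRIMARY KEY,
  foo_id INT NOT NULL,
  baz    VARCHAR(20)
);

INSERT INTO foos (id) VALUES (1), (2);
INSERT INTO bars (id, foo_id, baz) VALUES (10, 1, 'x');  -- foo 2 has no bars row

-- No WHERE clause: LEFT JOIN keeps foo 2, with NULL for bars.baz.
SELECT foos.id, bars.baz
FROM foos
LEFT JOIN bars ON bars.foo_id = foos.id;

-- With the filter, the NULL-extended row for foo 2 is discarded,
-- because NULL = 'x' evaluates to NULL and WHERE keeps only true rows.
SELECT foos.id, bars.baz
FROM foos
LEFT JOIN bars ON bars.foo_id = foos.id
WHERE bars.baz = 'x';

-- A plain JOIN returns the same single row (foo 1).
SELECT foos.id, bars.baz
FROM foos
JOIN bars ON bars.foo_id = foos.id
WHERE bars.baz = 'x';
```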
Once it is a JOIN (aka INNER JOIN), it does not matter whether the conditions are in the ON clause or in the WHERE clause. By convention, ON is used for describing how the tables are related, and WHERE is used for filtering.
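As a sketch of that equivalence, the two forms below should return the same rows for an inner join. The literal 'example' is a placeholder standing in for the bound parameter. The second form may also be the easier one to build from an ORM, since the question notes that binding variables is simpler in WHERE:

```sql
-- Form A: filters written in the ON clause.
SELECT *
FROM foos
JOIN bars ON bars.foo_id = foos.id
         AND bars.deleted_at = false
         AND bars.publically_visible = true
WHERE bars.baz = 'example';

-- Form B: ON only relates the tables; all filtering is in WHERE.
SELECT *
FROM foos
JOIN bars ON bars.foo_id = foos.id
WHERE bars.deleted_at = false
  AND bars.publically_visible = true
  AND bars.baz = 'example';
```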
Also, once it is a JOIN, the optimizer is likely to start with bars, then reach into foos as needed. In that case, this index, with the columns in any order (all three are tested with =), is beneficial for bars:
INDEX(deleted_at, publically_visible, baz)
But if you really do need LEFT, then show us the intended SQL so we can discuss it.
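For the composite index suggested above, a rough sketch of adding it and checking whether the optimizer picks it up might look like this. The index name is made up, and @input is a user variable standing in for the bound parameter:

```sql
-- Index name is hypothetical; the column list is the one suggested above.
ALTER TABLE bars
  ADD INDEX idx_bars_deleted_visible_baz (deleted_at, publically_visible, baz);

-- Stand-in for the bound parameter.
SET @input = 'example';

-- The `key` column of the bars row in the output shows which index was chosen.
EXPLAIN
SELECT *
FROM foos
JOIN bars ON bars.foo_id = foos.id
WHERE bars.deleted_at = false
  AND bars.publically_visible = true
  AND bars.baz = @input;
```

Since the question mentions that the local plan differs from the one on AWS, the EXPLAIN is only informative when run against production-like data.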
Thanks for the help. Your question made me realize that I was overlooking an important part of this question by focusing on the one problem (bad index choice by MySQL). Depending on inputs, I can have one or many filters, and one of them is chained by an 'OR', not an 'AND' --
bars.baz = ?input OR foos.second_condition = ?second_input
– RonLugge, Jul 15, 2022 at 12:15
@RonLugge - OR usually cannot use indexes. AND usually benefits from a "composite" index (such as my recommendation). More information: Index Cookbook
– Rick James, Jul 15, 2022 at 16:07
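One common workaround for the OR limitation, which nobody in this thread proposed, is to split the query into two branches joined by UNION so that each branch can use its own index. The sketch below uses the simplified join from the answer and the column names from the question. It assumes foos.id and bars.id are primary keys, so removing duplicates with UNION matches what the single OR query would return, and it assumes a separate index on foos.secondary_condition exists or could be added:

```sql
-- Stand-ins for the two bound parameters.
SET @input = 'example';
SET @second_input = 'other';

-- Branch 1: rows matched through bars.baz (can use the composite index on bars).
SELECT foos.id AS foo_id, bars.id AS bar_id
FROM foos
JOIN bars ON bars.foo_id = foos.id
WHERE bars.deleted_at = false
  AND bars.publically_visible = true
  AND bars.baz = @input

UNION  -- UNION (not UNION ALL) drops rows that satisfy both branches

-- Branch 2: rows matched through foos.secondary_condition.
SELECT foos.id AS foo_id, bars.id AS bar_id
FROM foos
JOIN bars ON bars.foo_id = foos.id
WHERE bars.deleted_at = false
  AND bars.publically_visible = true
  AND foos.secondary_condition = @second_input;
```

Whether this is actually faster depends on the data, so it is worth comparing the EXPLAIN output of both versions on production-like data.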