0

I am looking to optimize a JOIN between tables that uses a small subset of columns in the second table.

CREATE TABLE `employees` (
 `employee_id` bigint NOT NULL,
 `manager_id` bigint NOT NULL,
 `org_id` bigint NOT NULL,
 `union_id` bigint NOT NULL
 ...
 PRIMARY KEY (employee_id),
 INDEX (union_id)
);
CREATE TABLE `managers` (
 `manager_id` bigint NOT NULL,
 `org_id` bigint NOT NULL,
 `some_condition` boolean NOT NULL,
 PRIMARY KEY (manager_id) 
);

Now I want to optimize two types of queries. Both join the tables together on manager_id and org_id, and optionally apply a filter to the some_condition column.

SELECT employees.* 
FROM employees 
JOIN managers
ON (employees.manager_id = managers.manager_id AND employees.org_id = managers.org_id)
WHERE (employees.union_id = ? AND managers.some_condition);
SELECT employees.* 
 FROM employees 
 JOIN managers
 ON (employees.manager_id = managers.manager_id AND employees.org_id = managers.org_id)
 WHERE (employees.union_id = ?);

Assuming these are very large tables and employees > manager. I am trying to create an index on managers that will speed up the query. Right now the query is slow because for each row it has to read org_id and some_condition. I want to avoid going to disk if possible.

So far I have two indexes that might work:

INDEX `join_index` (`org_id`,`some_condition`)
INDEX `id_join_index` (`manager_id`, `org_id`, `some_condition`)

My main issue is that MySQL does not use either index in the EXPLAIN statement unless I force it to with use index (...).

Which index (if either) will speed up my query, and do I need manager_id in the index to speed up the join if I do not filter on some_condition?

asked Nov 20, 2024 at 21:36
1
  • You need in composite index by (manager_id, org_id) in both tables at least. Maybe (manager_id, org_id, union_id) in employees and (manager_id, org_id, some_condition) in managers (depends on the data statistic, should be tested). Commented Nov 21, 2024 at 4:34

1 Answer 1

0

As said in the comment, you need the composite index for both tables. Also, you do need manager_id in the index even if you're not filtering on some_condition. Without it, MySQL can't efficiently locate the matching rows for the join.

answered Nov 21, 2024 at 10:00

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.