Optimise MySQL SELECT with LEFT JOIN subquery

Question 1

I have a query that combines a LEFT JOIN with a subquery. The dataset is quite big, and the statement takes over 70 seconds to execute.

SELECT
    s.siblings,
    l.id
FROM
    `list` l
    INNER JOIN
    child c ON c.id = l.child_id
    INNER JOIN
    parent p ON p.id = c.parent_id
    LEFT JOIN (
        SELECT COUNT(c.id) AS siblings, c.id, c.parent_id
        FROM child c
        GROUP BY c.id
    ) AS s ON s.parent_id = c.parent_id AND s.id != c.id
WHERE
    l.country = 1
GROUP BY l.id, s.siblings
ORDER BY l.dateadded

This query should return all lists for a country. Each list is specific to a unique baby. For each list I would like to return a count of the number of children that have the same parent.

If I remove the LEFT JOIN subquery the fetch time is 0.1 seconds. Is there a way to make the query more efficient?

Comment 1

This query would result in a syntax error. Please correct it first. Then try to explain what it is supposed to do, because there are two places (one inside the subquery and one in the outer query) where the SELECT list uses columns that are not in the GROUP BY list. This is non-standard SQL and can give unpredictable, non-repeatable (i.e. useless) results.

Comment 2

@ypercubeTM would the syntax error be caused by list, as it is a reserved keyword? I have updated the question to fix this and added an explanation of the statement.

Comment 3

Could you include an EXPLAIN of your query in your question?

Comment 4

How many rows does the child table have, and how many in each country? Does the query return anything other than 1 in the siblings column?

a_vlad · Answer 1 · 2016-03-11 12:50:09Z

The main reason the query is slow is the join against a subquery. A derived table carries none of the indexes of the underlying table, so the join against it generally cannot use them. On top of that, you not only join with the derived table but also group the whole result by one of its columns: GROUP BY l.id, s.siblings.

In this case it could help to:

1. create a temporary table from the subquery (it could also include whatever is needed to return the correct parent_id)
2. create an index on this table
3. use the temporary table in the join
4. drop the temporary table

There are variations on this approach, but it is often faster and puts less load on the server than a complicated set of subqueries and joins.
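The four steps above might be sketched as follows. This is only an illustration under stated assumptions: the names tmp_family_size and idx_parent are hypothetical, the columns are taken from the query in the question, and the temporary table counts children per parent_id (the stated goal of the question) rather than reproducing the original subquery literally.

```sql
-- Assumed schema, inferred from the question:
--   child(id, parent_id), `list`(id, child_id, country, dateadded)
-- tmp_family_size and idx_parent are hypothetical names.

-- 1. Materialise the aggregate once, one row per parent.
CREATE TEMPORARY TABLE tmp_family_size AS
SELECT parent_id, COUNT(*) AS cnt
FROM child
GROUP BY parent_id;

-- 2. Index the column the join will use.
ALTER TABLE tmp_family_size ADD INDEX idx_parent (parent_id);

-- 3. Join against the indexed temporary table instead of a derived table.
--    cnt includes the child itself, so subtract 1 to get siblings.
SELECT l.id, t.cnt - 1 AS siblings
FROM `list` AS l
INNER JOIN child AS c ON c.id = l.child_id
INNER JOIN tmp_family_size AS t ON t.parent_id = c.parent_id
WHERE l.country = 1
ORDER BY l.dateadded;

-- 4. Clean up (the table would also disappear when the session ends).
DROP TEMPORARY TABLE tmp_family_size;
```

Whether this actually beats a single query depends on the data and the MySQL version, so it is worth comparing both with EXPLAIN and real timings.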

ypercubeTM · Answer 2 · 2016-03-11 16:57:51Z

The query has a lot of unnecessary complications. In addition, the GROUP BY c.id in the derived table (assuming that child (id) is the primary key of that table) seems completely redundant: each group then contains exactly one row, so COUNT(c.id) is 1. The siblings column should therefore always be 1 (or NULL where the LEFT JOIN finds no match), which is probably not the wanted result.

This would do what you want (find the number of siblings) in a much simpler way:

SELECT
    s.cnt - 1 AS siblings,
    l.id
FROM
    `list` AS l
    INNER JOIN
    child AS c ON c.id = l.child_id
    INNER JOIN
    ( SELECT c.parent_id, COUNT(*) AS cnt
      FROM child AS c
      GROUP BY c.parent_id
    ) AS s ON s.parent_id = c.parent_id
WHERE
    l.country = 1
ORDER BY l.dateadded ;
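To see why grouping by the primary key always yields a count of 1 while grouping by parent_id yields the family size, here is a small sketch on made-up data. The table child_demo and its four rows are hypothetical and exist only for this illustration.

```sql
-- Hypothetical sample data: three children of parent 10, one child of parent 20.
CREATE TEMPORARY TABLE child_demo (
    id        INT PRIMARY KEY,
    parent_id INT NOT NULL
);
INSERT INTO child_demo (id, parent_id)
VALUES (1, 10), (2, 10), (3, 10), (4, 20);

-- Grouping by the primary key: every group holds exactly one row.
SELECT id, COUNT(id) AS cnt
FROM child_demo
GROUP BY id
ORDER BY id;
-- returns (1, 1), (2, 1), (3, 1), (4, 1)

-- Grouping by parent: one row per family, cnt - 1 is the sibling count.
SELECT parent_id, COUNT(*) AS cnt, COUNT(*) - 1 AS siblings
FROM child_demo
GROUP BY parent_id
ORDER BY parent_id;
-- returns (10, 3, 2), (20, 1, 0)

DROP TEMPORARY TABLE child_demo;
```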

Rick James · Answer 3 · 2016-03-12 04:17:22Z

s.siblings,
...
JOIN ( SELECT ... )
 -->
( SELECT COUNT(*) FROM ... ) AS siblings

That is, get rid of the LEFT JOIN and replace it with a correlated subquery in the SELECT list.
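Spelled out in full, that rewrite might look like the sketch below. The alias sib and the index name idx_child_parent are hypothetical, and the columns are assumed from the query in the question.

```sql
-- Assumed schema, inferred from the question:
--   child(id, parent_id), `list`(id, child_id, country, dateadded)

SELECT
    l.id,
    ( SELECT COUNT(*)
      FROM child AS sib                    -- sib is a hypothetical alias
      WHERE sib.parent_id = c.parent_id    -- same parent as the listed child
        AND sib.id <> c.id                 -- but not the child itself
    ) AS siblings
FROM `list` AS l
INNER JOIN child AS c ON c.id = l.child_id
WHERE l.country = 1
ORDER BY l.dateadded;

-- The subquery runs once per result row, so it probably needs an index on
-- child(parent_id) to be fast. If none exists yet (an assumption; a foreign
-- key on parent_id would already provide one), something like this could help:
-- CREATE INDEX idx_child_parent ON child (parent_id);
```

Unlike the derived-table versions, this one returns 0 rather than NULL for a child with no siblings, and it only counts children for the rows that survive the WHERE clause.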
