With the following query I get the error "Unknown column 'a.id' in 'where clause'". I'm basically trying to add a limit of 5 to what would otherwise have been two left joins. Is it possible to rewrite this query so that it works?
SELECT
    CONCAT_WS(' ', (
        SELECT GROUP_CONCAT(body, ' ') FROM (
            SELECT c.body FROM c WHERE c.id IN (
                SELECT id_c FROM b WHERE b.id_a = a.id
            )
            LIMIT 5
        ) c
    )) AS contents
FROM
    a
Full SQLfiddle at http://sqlfiddle.com/#!9/c1822/3
FIND_IN_SET works (http://sqlfiddle.com/#!9/2d43bb/1), but it is extremely slow with large datasets.
2 Answers
Another way:
SELECT a.id,
       GROUP_CONCAT(c.body,' ') AS contents
FROM a
  LEFT JOIN b
    ON b.id_a = a.id
  LEFT JOIN c
    ON  b.id_c = c.id
    AND c.id <= COALESCE(
          ( SELECT ci.id
            FROM c AS ci
              JOIN b AS bi
                ON bi.id_c = ci.id
            WHERE b.id_a = bi.id_a
            ORDER BY ci.id
            LIMIT 1 OFFSET 4
          ), 10000000000)
GROUP BY a.id ;
A variation:
-- variation 2
SELECT a.id,
       GROUP_CONCAT(c.body,' ') AS contents
FROM a
  LEFT JOIN b
    JOIN c
      ON  b.id_c = c.id
      AND c.id <= COALESCE(
            ( SELECT ci.id
              FROM c AS ci
                JOIN b AS bi
                  ON bi.id_c = ci.id
              WHERE b.id_a = bi.id_a
              ORDER BY ci.id
              LIMIT 1 OFFSET 4
            ), 10000000000)
    ON b.id_a = a.id
GROUP BY a.id ;
And two more, all using the same basic pattern:
-- variation 3
SELECT a.id,
       GROUP_CONCAT(c.body,' ') AS contents
FROM a
  LEFT JOIN b
    ON  b.id_a = a.id
    AND b.id_c <= COALESCE(
          ( SELECT bi.id_c
            FROM b AS bi
            WHERE b.id_a = bi.id_a
            ORDER BY bi.id_c
            LIMIT 1 OFFSET 4
          ), 10000000000)
  LEFT JOIN c
    ON b.id_c = c.id
GROUP BY a.id ;
-- variation 4
SELECT a.id,
       ( SELECT GROUP_CONCAT(c.body,' ')
         FROM b
           LEFT JOIN c
             ON b.id_c = c.id
         WHERE b.id_a = a.id
           AND b.id_c <= COALESCE(
                 ( SELECT bi.id_c
                   FROM b AS bi
                   WHERE b.id_a = bi.id_a
                   ORDER BY bi.id_c
                   LIMIT 1 OFFSET 4
                 ), 10000000000)
       ) AS contents
FROM a ;
Tested in sqlfiddle.
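The pattern shared by the variations is: find the 5th-smallest id_c for the current a row with LIMIT 1 OFFSET 4, use it as a cutoff, and let COALESCE supply a huge fallback when fewer than five rows exist. A minimal sketch of variation 3 on made-up data follows. It is an illustration only, not the fiddle's data: it assumes an in-memory SQLite database driven through Python's sqlite3 module, chosen simply because it is easy to run. One known difference: SQLite treats the second argument of GROUP_CONCAT as the separator, whereas MySQL appends it to each value and separates the results with commas.

```python
import sqlite3

# Hypothetical sample data: a=1 has 7 linked rows, a=2 has 2, a=3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE c (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE b (id_a INTEGER, id_c INTEGER, PRIMARY KEY (id_a, id_c));
""")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO c VALUES (?, ?)",
                 [(i, f"body{i}") for i in range(1, 11)])
conn.executemany("INSERT INTO b VALUES (?, ?)",
                 [(1, i) for i in range(1, 8)] + [(2, 8), (2, 9)])

# Variation 3: the correlated subquery returns the 5th id_c for this id_a,
# or NULL when there are fewer than five, in which case COALESCE lets
# every row through.
QUERY = """
SELECT a.id, GROUP_CONCAT(c.body, ' ') AS contents
FROM a
  LEFT JOIN b
    ON  b.id_a = a.id
    AND b.id_c <= COALESCE(
          ( SELECT bi.id_c
            FROM b AS bi
            WHERE bi.id_a = b.id_a
            ORDER BY bi.id_c
            LIMIT 1 OFFSET 4
          ), 10000000000)
  LEFT JOIN c
    ON c.id = b.id_c
GROUP BY a.id
ORDER BY a.id
"""
rows = conn.execute(QUERY).fetchall()

# Split each concatenated string back into its parts (NULL becomes []).
result = {a_id: (contents.split(" ") if contents else [])
          for a_id, contents in rows}

for a_id, bodies in result.items():
    print(a_id, len(bodies), bodies)
```

With this data the first a row should be capped at five bodies, the second should keep both of its bodies, and the third should have none. The cutoff relies on id_c being orderable and unique per id_a, which the primary key on (id_a, id_c) provides here.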
-
Variation 3 has one join fewer than the first two. It may be a bit more efficient. – ypercubeᵀᴹ, May 5, 2018 at 11:55
-
Yes, variation 3 seems most efficient. In fact it is loads faster. – JJJ, May 5, 2018 at 12:24
I'm not sure why it does not work (there seem to be limitations in MySQL that hide variables that ought to be visible). I would try to rewrite it using JOINs. This is tested on 10.2.14-MariaDB:
SELECT CONCAT_WS(' ',GROUP_CONCAT(x.body,' '))
FROM (
    SELECT b.id_a, c.body, row_number() over (partition by b.id_a) as n
    FROM c
    JOIN b
      ON c.id = b.id_c
    GROUP BY b.id_a, c.body
) x
JOIN a
  ON x.id_a=a.id
WHERE n <= 5;
Here the window function row_number() is used to enumerate the c.body values per b.id_a. That number can then be used to limit how many bodies get concatenated.
I'm not sure why GROUP BY b.id_a, c.body is required when adding row_number(); it looks like a bug in the implementation (I haven't checked, perhaps it is mentioned in the docs).
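To make the row_number() idea concrete, here is a small runnable sketch on made-up data. It is not the query above verbatim, and the differences are assumptions of this sketch: it adds ORDER BY c.id inside the window so the five kept rows are deterministic, omits the inner GROUP BY, and groups by a.id so that each a row gets its own string. It runs on an in-memory SQLite database through Python's sqlite3 module (window functions need SQLite 3.25 or newer); MariaDB 10.2+ and MySQL 8.0+ accept the same window syntax, though their GROUP_CONCAT takes the separator via SEPARATOR rather than as a second argument.

```python
import sqlite3

# Hypothetical sample data: a=1 has 7 linked rows, a=2 has 2, a=3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE c (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE b (id_a INTEGER, id_c INTEGER, PRIMARY KEY (id_a, id_c));
""")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO c VALUES (?, ?)",
                 [(i, f"body{i}") for i in range(1, 11)])
conn.executemany("INSERT INTO b VALUES (?, ?)",
                 [(1, i) for i in range(1, 8)] + [(2, 8), (2, 9)])

# Number the bodies within each id_a, then keep only numbers 1..5.
# The filter sits in the ON clause so that a rows without matches survive.
QUERY = """
SELECT a.id, GROUP_CONCAT(x.body, ' ') AS contents
FROM a
  LEFT JOIN (
      SELECT b.id_a, c.body,
             ROW_NUMBER() OVER (PARTITION BY b.id_a ORDER BY c.id) AS n
      FROM b
        JOIN c
          ON c.id = b.id_c
  ) AS x
    ON  x.id_a = a.id
    AND x.n <= 5
GROUP BY a.id
ORDER BY a.id
"""
rows = conn.execute(QUERY).fetchall()

# Split each concatenated string back into its parts (NULL becomes []).
result = {a_id: (contents.split(" ") if contents else [])
          for a_id, contents in rows}

for a_id, bodies in result.items():
    print(a_id, len(bodies), bodies)
```

Note that the derived table still numbers every joined row before the filter is applied, so this form does not by itself avoid reading all rows of b and c.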
-
We have already set group_concat_max_len, but the query is slow with large datasets. If we LIMIT the join, it selects only a subset and is a lot quicker. If there were 1M left-join rows, it would select all of them and only then apply group_concat up to the limit, which is inefficient. If we replace a.id with an actual number, it works perfectly and runs a lot quicker. But of course MySQL doesn't let me reference a.id in the nested subquery. – JJJ, May 5, 2018 at 9:14
-
The edited answer seems to work, but unfortunately it is far slower than just left joining without a limit. – JJJ, May 5, 2018 at 9:59
-
What indexes are there? – Lennart - Slava Ukraini, May 5, 2018 at 10:13
-
Primary key on a.id, primary key on (b.id_a, b.id_c), and primary key on c.id. – JJJ, May 5, 2018 at 10:16