Multiple GROUP_CONCAT statements within single MySQL Query

Question 1

I have a table where each record has a one-to-many relationship with 3 other tables (with further one-to-one branching), which leads to many rows for each main record, with many columns of duplicate information. In PHP, I take the result set and flatten it into a multidimensional array.

I am weighing the benefits of rewriting the query to let MySQL do the flattening using GROUP_CONCAT statements. I'd end up with one row per main record with 3 fields of concatenated data (files, grades + pages, and categories). I am not using any GROUP BY statements; I'm only using GROUP_CONCAT to flatten.

I've done this before for a single GROUP_CONCAT but am curious whether this is a "normal" use of the technology. I am asking from a design standards and maintainability point of view, and whether there are any gotchas I'm overlooking. Is it personal preference? Performance appears to be about the same.

As I see it from a programming standpoint, the benefits of GROUP_CONCAT are:

no duplicated data to send across the internet
simplified processing in PHP: even though I have to massage the data afterwards using explode(), it seems less obtuse than the code I have to step through, compiling the distinct values of file, grade + page, and category for each record
the query actually appears to better represent what is happening by putting the many joins in context

Downsides:

There are multiple columns being combined within the GROUP_CONCAT output, so complexity is added with delimiters and nested explode() statements needed in PHP to separate out the fields.
If it's not broken... I've been using the code without GROUP_CONCAT for many years. A pain to change, but I get there eventually.

The query below is much simplified. The reason for the nested query is a calculation subquery I've removed.

Query without GROUP_CONCAT

SELECT
 g.gemid,
 g.title,
 gd.filename,
 gd.license,
 gp.grade,
 gp.page,
 gp.page2,
 gc.category,
 mg.topid,
 mg.title AS gradetitle,
 mp.license AS pagelicense,
 mp2.license AS page2license,
 mp.title AS pagetitle,
 mp2.title AS page2title
FROM (
 SELECT DISTINCT
 gems.gemid,
 gems.title,
 gp.sort
 FROM
 gems
 LEFT JOIN gempage gp ON gems.gemid = gp.gemid
 WHERE gp.grade = 1
 ORDER BY gp.sort
 ) g
LEFT JOIN gempage gp ON g.gemid = gp.gemid
LEFT JOIN mgrade mg ON gp.grade = mg.name
LEFT JOIN mpage mp ON gp.page = mp.name AND mg.gradeid = mp.gradeid
LEFT JOIN mpage2 mp2 ON gp.page2 = mp2.name AND mp.pageid = mp2.pageid AND mg.gradeid = mp.gradeid
LEFT JOIN gemcategory gc ON g.gemid = gc.gemid
LEFT JOIN gemdetail gd ON g.gemid = gd.gemid
WHERE gp.grade = 1
ORDER BY gp.sort

Query with GROUP_CONCAT

SELECT
 (SELECT GROUP_CONCAT(CONCAT_WS(":",IFNULL(filename,''), IFNULL(license,''))) FROM gemdetail gd WHERE g.gemid = gd.gemid) as filelist,
 (SELECT GROUP_CONCAT(category ORDER BY sort, gemcategoryid SEPARATOR ', ') FROM gemcategory gc WHERE gc.gemid = g.gemid) as catlist,
 (SELECT DISTINCT GROUP_CONCAT(CONCAT_WS(",", gp.grade, gp.page, IFNULL(gp.page2,''), mg.topid, IFNULL(mg.title,''), IFNULL(mp.license,''), IFNULL(mp.title,''), IFNULL(mp2.license,''), IFNULL(mp2.title,''))) 
 FROM gempage gp 
 LEFT JOIN mgrade mg ON gp.grade = mg.name
 LEFT JOIN mpage mp ON gp.page = mp.name AND mg.gradeid = mp.gradeid
 LEFT JOIN mpage2 mp2 ON gp.page2 = mp2.name AND mp.pageid = mp2.pageid AND mg.gradeid = mp.gradeid
 WHERE g.gemid = gp.gemid AND gp.grade = 1) as gradepage,
 g.gemid,
 g.title
FROM (
 SELECT DISTINCT
 gems.gemid,
 gems.title,
 gp.sort
 FROM
 gems
 LEFT JOIN gempage gp ON gems.gemid = gp.gemid
 WHERE gp.grade = 1
 ORDER BY gp.sort
 ) g
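
One detail in the gradepage subquery above is worth a second look: CONCAT_WS uses "," between fields, and GROUP_CONCAT's default separator between rows is also ",", so the two levels can only be told apart by counting fields. A minimal sketch with distinct delimiters, assuming ':' and '|' never occur in the data (both characters are illustrative choices, not part of the original query), and trimmed to three fields for brevity:

```sql
SELECT
    g.gemid,
    (SELECT GROUP_CONCAT(
                -- ':' separates fields within one gempage row (assumed delimiter)
                CONCAT_WS(':', gp.grade, gp.page, IFNULL(gp.page2, ''))
                ORDER BY gp.sort
                -- '|' separates rows (assumed delimiter)
                SEPARATOR '|')
       FROM gempage gp
      WHERE gp.gemid = g.gemid
        AND gp.grade = 1) AS gradepage
FROM gems g;
```

On the PHP side this would allow a first explode() on '|' for the rows, then a second explode() on ':' for the fields of each row.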

Answer

You can't use a comma (the delimiter) in any of the values.
You may need to set a larger value for the system variable group_concat_max_len; the default is 1024 bytes, and longer results are truncated.
You may need DISTINCT inside GROUP_CONCAT().
FROM ( SELECT ... ORDER BY ) -- The ORDER BY in the derived table will be ignored and should be removed. You may want the ORDER BY on the outer query instead.
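
The points above could be applied roughly as follows. This is only a sketch using the table and column names from the question; the limit of 100000 and the '|' separator are arbitrary illustrative choices, and the categories are ordered alphabetically here rather than by the question's sort columns.

```sql
-- Raise the limit for this session only; the default is 1024 bytes.
-- 100000 is an arbitrary illustrative value, not a recommendation.
SET SESSION group_concat_max_len = 100000;

SELECT
    g.gemid,
    g.title,
    -- DISTINCT goes inside GROUP_CONCAT(); '|' is an assumed separator
    -- that must never occur in a category value.
    (SELECT GROUP_CONCAT(DISTINCT gc.category
                         ORDER BY gc.category
                         SEPARATOR '|')
       FROM gemcategory gc
      WHERE gc.gemid = g.gemid) AS catlist
FROM (
    -- No ORDER BY inside the derived table.
    SELECT DISTINCT gems.gemid, gems.title, gp.sort
      FROM gems
      LEFT JOIN gempage gp ON gems.gemid = gp.gemid
     WHERE gp.grade = 1
) g
ORDER BY g.sort;   -- ordering moved to the outer query
```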

These indexes may speed it up:

g: INDEX(gemid, title)
gd: INDEX(gemid, filename, license)
gc: INDEX(gemid, category)
gp: INDEX(grade, gemid, sort)
gp: INDEX(grade, gemid, page, page2)
mg: INDEX(name, topid, title, gradeid)
mp: INDEX(name, license, title, gradeid, pageid)
mp2: INDEX(name, license, title, pageid)
gems: INDEX(gemid, title)
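
As a sketch, a few of these suggestions could be created like this. The index names (idx_...) are hypothetical, and the statements assume the listed columns are short enough to be indexed in full; long VARCHAR or TEXT columns may need a prefix length. Note that g is the alias of a derived table in the question's query, so that entry presumably corresponds to the gems table.

```sql
-- Index names are hypothetical; the column lists follow the suggestions above.
ALTER TABLE gemdetail   ADD INDEX idx_gd_gemid_file_lic   (gemid, filename, license);
ALTER TABLE gemcategory ADD INDEX idx_gc_gemid_cat        (gemid, category);
ALTER TABLE gempage     ADD INDEX idx_gp_grade_gemid_sort (grade, gemid, sort);
ALTER TABLE gems        ADD INDEX idx_gems_gemid_title    (gemid, title);
```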

Comment 1 (mseifert)

Yes, thank you for mentioning not being able to use the chosen delimiters in the data. I forgot to mention that I have control over the field data. I also appreciate the list of indexes. It hadn't occurred to me that all fields being concatenated would benefit from indexes. Why is this? I thought that only fields which are part of a WHERE, ORDER BY, or GROUP BY would benefit from indexes.

Comment 2 (Rick James)

@mseifert - Search for "covering index".
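
For context, a covering index is one that contains every column a query reads from a table, so MySQL can answer from the index alone without fetching the table rows. One way to check this, sketched under the assumption that an index on gemdetail (gemid, filename, license) already exists, is EXPLAIN:

```sql
-- Assumes an index on gemdetail (gemid, filename, license) exists.
-- 42 is an arbitrary example id.
EXPLAIN
SELECT filename, license
  FROM gemdetail
 WHERE gemid = 42;
-- When the index covers the query, the Extra column of the EXPLAIN output
-- typically includes "Using index".
```

This is why columns that appear only in the SELECT list (or inside GROUP_CONCAT) can still be worth including in an index.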

Comment 3 (mseifert)

Thank you! That was a huge gap in my DB knowledge. Makes lots of sense after reading about it. A bit humbling that I wasn't aware of this really, after 30 years in the industry - on and off and now designing again. Thanks for taking the time to answer thoroughly - really. I may or may not switch to GROUP_CONCAT, but the indexing will help tremendously regardless.

Comment 4 (Rick James)

@mseifert - Yeah, it took me years to digest indexing enough to write this Index Cookbook

Rick James · Accepted Answer · 2022-12-08 06:53:49Z

Comments on the answer, in order:
mseifert, 2022-12-08 21:24:16 +00:00
Rick James, 2022-12-08 23:54:10 +00:00
mseifert, 2022-12-09 00:26:02 +00:00
Rick James, 2022-12-09 01:11:09 +00:00
