Converting multiple comma-separated columns into rows

Question 1

I have an SQL Server database table that holds comma separated values in many columns. For example:

id	Column B	column c
1	a,b,c,	1,2,3,
2	d, ,f,	4,5,6,
3	g,h,i,j,	7, ,9,8,

I want to split all the columns into rows and the output should be this:

id	Column B	column c
1	a	1
1	b	2
1	c	3
2	d	4
2	 	5
2	f	6
3	g	7
3	h
3	i	9
3	j	8

This is only a simplified example of the conversion I need; my actual table has more than 30 comma-separated columns that need to be split.

Question 2

Microsoft SQL Server 2017 (RTM-CU31-GDR) (KB5021126) - 14.0.3460.9 (X64) Jan 25 2023 08:42:43 Copyright (C) 2017 Microsoft Corporation Enterprise Edition: Core-based Licensing (64-bit) on Windows Server 2019 Standard 10.0 <X64> (Build 17763: ) (Hypervisor)

Question 3

Having comma-delimited lists in the database is an anti-pattern, and having multiple such lists that need to be correlated based on their order in the list is a huge anti-pattern.

This is something that should be represented in a different table.

Hopefully the purpose of this query is to do so. A couple of alternatives....

OPENJSON

Since the ordinal argument to STRING_SPLIT is apparently not available to you, you can use OPENJSON as an alternative.

The below will work for SQL Server 2016+ (and you must be on at least that as you do have STRING_SPLIT sans ordinal).

You may need to add code to escape characters in b and c if you find that they contain characters that lead to invalid JSON arrays being constructed.

SELECT id,
       ca.b,
       ca.c,
       [key]
FROM   test t
       -- Turn each list into a JSON array, tag each element with its source column,
       -- then pivot back to one row per array index ([key]).
       CROSS APPLY (SELECT b = MAX(CASE WHEN col = 'b' THEN value END),
                           c = MAX(CASE WHEN col = 'c' THEN value END),
                           [key] = CAST([key] AS int)
                    FROM   (SELECT value,
                                   [key],
                                   'b' AS col
                            FROM   OPENJSON(N'["' + REPLACE(t.b, ',', N'","') + N'"]') AS x
                            UNION ALL
                            SELECT value,
                                   [key],
                                   'c' AS col
                            FROM   OPENJSON(N'["' + REPLACE(t.c, ',', N'","') + N'"]') AS x) vals
                    GROUP  BY [key]) ca
ORDER  BY id,
          [key]
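
The escaping caveat mentioned above could be handled along the following lines. This is only a sketch for a single column, assuming the built-in STRING_ESCAPE function (SQL Server 2016+) is acceptable for your data and that the values themselves never contain commas. The `test` table and column `b` are the ones from the example.

```sql
-- Sketch: escape the list before wrapping it in a JSON array.
-- STRING_ESCAPE(..., 'json') escapes quotes, backslashes and control characters;
-- it leaves commas alone, so the REPLACE on ',' still separates the elements.
SELECT t.id,
       ordinal = CAST(j.[key] AS int),
       j.value
FROM   test t
       CROSS APPLY OPENJSON(N'["' + REPLACE(STRING_ESCAPE(t.b, 'json'), ',', N'","') + N'"]') AS j
ORDER  BY t.id,
          ordinal;
```

The same wrapping would need to be applied to every column passed to OPENJSON in the full query.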

Recursive CTE

If you were to have many such columns to deal with you could even consider a recursive CTE approach. This does add some overhead vs other approaches but it does allow you to do multiple columns per iteration and avoids any need to group or join them by ordinal afterwards.

WITH R AS
(
-- Anchor: locate the first comma in each column.
-- xpos0 = position of the previous comma, xpos = position of the next one.
SELECT id,
       b,
       c,
       bpos0 = 0,
       bpos = CHARINDEX(',', b),
       cpos0 = 0,
       cpos = CHARINDEX(',', c),
       lvl = 1
FROM   test t
UNION ALL
-- Recursive step: advance every column to its next comma.
-- xpos is 0 on the last element and NULL once that list is exhausted.
SELECT id,
       b,
       c,
       bpos0 = R.bpos,
       bpos = CASE WHEN R.bpos > 0 THEN CHARINDEX(',', b, R.bpos + 1) END,
       cpos0 = R.cpos,
       cpos = CASE WHEN R.cpos > 0 THEN CHARINDEX(',', c, R.cpos + 1) END,
       lvl = lvl + 1
FROM   R
WHERE  R.bpos > 0 OR R.cpos > 0
)
SELECT id,
       b = SUBSTRING(b, bpos0 + 1, CASE WHEN bpos = 0 THEN 80000 ELSE bpos - bpos0 - 1 END),
       c = SUBSTRING(c, cpos0 + 1, CASE WHEN cpos = 0 THEN 80000 ELSE cpos - cpos0 - 1 END)
FROM   R
ORDER  BY id, lvl
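
If the goal is indeed to move this data into a different table, the split result could be loaded into a normalized structure. The following is a minimal sketch that reuses the OPENJSON query above; the table name `test_items`, its column names, and the primary key are hypothetical, and it assumes `id` is unique in `test`.

```sql
-- Hypothetical target table: one row per (id, position) instead of one list per column.
CREATE TABLE test_items
(
    id      int          NOT NULL,
    ordinal int          NOT NULL,  -- zero-based position within the original list
    b       varchar(100) NULL,
    c       varchar(100) NULL,
    PRIMARY KEY (id, ordinal)
);

-- Load it with the same split-and-pivot logic as the OPENJSON query.
INSERT INTO test_items (id, ordinal, b, c)
SELECT t.id,
       ca.ordinal,
       ca.b,
       ca.c
FROM   test t
       CROSS APPLY (SELECT ordinal = CAST([key] AS int),
                           b = MAX(CASE WHEN col = 'b' THEN value END),
                           c = MAX(CASE WHEN col = 'c' THEN value END)
                    FROM   (SELECT value, [key], 'b' AS col
                            FROM   OPENJSON(N'["' + REPLACE(t.b, ',', N'","') + N'"]')
                            UNION ALL
                            SELECT value, [key], 'c' AS col
                            FROM   OPENJSON(N'["' + REPLACE(t.c, ',', N'","') + N'"]')) vals
                    GROUP  BY [key]) ca;
```

Once the data is stored this way, the positional correlation between the columns is explicit in `ordinal` and no string splitting is needed at query time.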


Question 4

For a JSON array, the key is the ordinal in the array, and this is used as the grouping value in the pivot (the GROUP BY [key] with conditional MAX).

Question 5

When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys.
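
A quick way to see this behaviour is to run OPENJSON on a literal array. This is a small illustrative sketch; the array contents are made up to mirror the example data.

```sql
-- OPENJSON with the default schema returns one row per array element:
-- [key] holds the zero-based index (as a string), value holds the element.
SELECT [key],
       value
FROM   OPENJSON(N'["d"," ","f"]');
-- key  value
-- 0    d
-- 1    (a single space)
-- 2    f
```

Because [key] is returned as a string, the queries above cast it to int before ordering by it.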

Question 6

You need to decide what the "correct" results are for that example data and make the required changes to the code to fix it. I answered the original question you asked but that doesn't commit me to running a help desk for infinite variations of it

Question 7

I've no time to look at it. Next time make sure the question you ask is the one you actually want answered

Question 8

CREATE TABLE test (id INT, b VARCHAR(100), c VARCHAR(100));
INSERT INTO test VALUES
(1, 'a,b,c', '1,2,3'),
(2, 'd, ,f', '4,5,6'),
(3, 'g,h,i,j', '7, ,9,8');
SELECT * FROM test;

id	b	c
1	a,b,c	1,2,3
2	d, ,f	4,5,6
3	g,h,i,j	7, ,9,8

SELECT test.id, b.value b, c.value c
FROM test
CROSS APPLY STRING_SPLIT(b, ',', 1) b
CROSS APPLY STRING_SPLIT(c, ',', 1) c
WHERE b.ordinal = c.ordinal
ORDER BY test.id, b.ordinal;

id	b	c
1	a	1
1	b	2
1	c	3
2	d	4
2	 	5
2	f	6
3	g	7
3	h
3	i	9
3	j	8


ERROR: Procedure or function STRING_SPLIT has too many arguments specified. A second error also occurs: ordinal is not recognized. In my SQL Server version, the STRING_SPLIT function accepts only 2 parameters, e.g. STRING_SPLIT(sentence, ' '), so how do I deal with this? – jawad riaz

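If so, then your SQL Server version is out of date: the third (enable_ordinal) argument to STRING_SPLIT is only available in SQL Server 2022 and Azure SQL. See @Luuk's comment.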

WITH 
b AS (
 SELECT test.id, b.value, ROW_NUMBER() OVER (PARTITION BY test.id ORDER BY test.id) ordinal
 FROM test
 CROSS APPLY STRING_SPLIT(b, ',') b 
),
c AS (
 SELECT test.id, c.value, ROW_NUMBER() OVER (PARTITION BY test.id ORDER BY test.id) ordinal
 FROM test
 CROSS APPLY STRING_SPLIT(c, ',') c
)
SELECT test.id, b.value b, c.value c
FROM test
JOIN b ON test.id = b.id 
JOIN c ON test.id = c.id 
WHERE b.ordinal = c.ordinal
ORDER BY test.id, b.ordinal;

id	b	c
1	a	1
1	b	2
1	c	3
2	d	4
2	 	5
2	f	6
3	g	7
3	h
3	i	9
3	j	8


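But this query is not deterministic: STRING_SPLIT without the ordinal argument does not guarantee the order of the rows it returns, and ROW_NUMBER() ordered by test.id (which is constant within each partition) numbers them arbitrarily, so the pieces may not be numbered by their position in the string.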

Martin Smith · Accepted Answer · 2023-08-29 10:27:31Z

