I have a SQL Server table that holds comma-separated values in many columns. For example:

id  Column B   Column C
1   a,b,c,     1,2,3,
2   d, ,f,     4,5,6,
3   g,h,i,j,   7, ,9,8,

I want to split all the columns into rows and the output should be this:

id  Column B  Column C
1   a         1
1   b         2
1   c         3
2   d         4
2             5
2   f         6
3   g         7
3   h
3   i         9
3   j         8

This is only a simplified example of the conversion I need; my actual table has more than 30 comma-separated columns that have to be split into rows this way.

Martin Smith
asked Aug 29, 2023 at 7:51
  • Microsoft SQL Server 2017 (RTM-CU31-GDR) (KB5021126) - 14.0.3460.9 (X64) Jan 25 2023 08:42:43 Copyright (C) 2017 Microsoft Corporation Enterprise Edition: Core-based Licensing (64-bit) on Windows Server 2019 Standard 10.0 <X64> (Build 17763: ) (Hypervisor) Commented Aug 30, 2023 at 10:22

2 Answers

Having comma-delimited lists in the database is an anti-pattern, and having multiple such lists that need to be correlated by their position in the list is a huge anti-pattern.

This is something that should be represented in a different table.
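
As a rough illustration of what such a table could look like, here is a minimal sketch. The table name test_item, the column names, and the VARCHAR(100) types are assumptions chosen to mirror the example data, not something taken from the question.

```sql
-- Hypothetical target table: one row per list element
-- instead of one comma-delimited list per column.
CREATE TABLE test_item
(
    id      INT          NOT NULL,  -- id of the original row
    ordinal INT          NOT NULL,  -- position of the element within the original lists
    b       VARCHAR(100) NULL,      -- element taken from the b list at that position
    c       VARCHAR(100) NULL,      -- element taken from the c list at that position
    CONSTRAINT PK_test_item PRIMARY KEY (id, ordinal)
);
```

With more than 30 list columns, each one would become an ordinary nullable column of this table, and the position is stored explicitly rather than implied by the order within a string.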

Hopefully the purpose of this query is to do so. A couple of alternatives....

OPENJSON

As the ordinal argument to STRING_SPLIT is apparently not available to you, you can use OPENJSON as an alternative.

The below will work for SQL Server 2016+ (and you must be on at least that as you do have STRING_SPLIT sans ordinal).

You may need to add code to escape characters in b and c if you find that they contain characters that lead to invalid JSON arrays being constructed.

SELECT id,
       ca.b,
       ca.c,
       [key]
FROM   test t
       CROSS APPLY (SELECT b = MAX(CASE WHEN col = 'b' THEN value END),
                           c = MAX(CASE WHEN col = 'c' THEN value END),
                           [key] = CAST([key] AS INT)
                    FROM   (SELECT value,
                                   [key],
                                   'b' AS col
                            FROM   OPENJSON(N'["' + REPLACE(t.b, ',', N'","') + N'"]') AS x
                            UNION ALL
                            SELECT value,
                                   [key],
                                   'c' AS col
                            FROM   OPENJSON(N'["' + REPLACE(t.c, ',', N'","') + N'"]') AS x) vals
                    GROUP  BY [key]) ca
ORDER  BY id,
          [key]
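
The escaping mentioned above could be handled with the built-in STRING_ESCAPE function (SQL Server 2016+), applied before the commas are replaced. This is a minimal sketch for a single column, assuming the same test(id, b, c) table; STRING_ESCAPE with the 'json' type escapes characters such as double quotes and backslashes but leaves commas alone, so the REPLACE still splits on them.

```sql
-- Sketch: escape JSON-special characters in the list before
-- turning it into a JSON array, then split it with OPENJSON.
-- Assumes the test(id, b, c) table used in the query above.
SELECT t.id,
       ordinal = CAST(x.[key] AS INT),  -- zero-based position in the list
       b       = x.value                -- OPENJSON returns the unescaped element
FROM   test t
       CROSS APPLY OPENJSON(N'["' + REPLACE(STRING_ESCAPE(t.b, 'json'), ',', N'","') + N'"]') AS x
ORDER  BY t.id,
          ordinal;
```

The same wrapping can be applied to each OPENJSON call in the two-column query if the data turns out to need it.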

Recursive CTE

If you were to have many such columns to deal with you could even consider a recursive CTE approach. This does add some overhead vs other approaches but it does allow you to do multiple columns per iteration and avoids any need to group or join them by ordinal afterwards.

WITH R AS
(
    SELECT id,
           b,
           c,
           bpos0 = 0,
           bpos  = CHARINDEX(',', b),
           cpos0 = 0,
           cpos  = CHARINDEX(',', c),
           lvl   = 1
    FROM   test t
    UNION ALL
    SELECT id,
           b,
           c,
           bpos0 = R.bpos,
           bpos  = CASE WHEN R.bpos > 0 THEN CHARINDEX(',', b, R.bpos + 1) END,
           cpos0 = R.cpos,
           cpos  = CASE WHEN R.cpos > 0 THEN CHARINDEX(',', c, R.cpos + 1) END,
           lvl   = lvl + 1
    FROM   R
    WHERE  R.bpos > 0 OR R.cpos > 0
)
SELECT id,
       b = SUBSTRING(b, bpos0 + 1, CASE WHEN bpos = 0 THEN 80000 ELSE bpos - bpos0 - 1 END),
       c = SUBSTRING(c, cpos0 + 1, CASE WHEN cpos = 0 THEN 80000 ELSE cpos - cpos0 - 1 END)
FROM   R
ORDER  BY id, lvl
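
One thing to watch with the recursive approach: by default SQL Server stops a recursive CTE with an error after 100 recursion levels, so lists with more elements than that need a MAXRECURSION query hint. The sketch below is a cut-down, single-column variant of the query above, assuming the same test table; it is only meant to show where the hint goes.

```sql
-- Single-column variant of the recursive split with the
-- default recursion limit of 100 levels lifted.
WITH R AS
(
    SELECT id,
           b,
           bpos0 = 0,                  -- position of the previous comma
           bpos  = CHARINDEX(',', b),  -- position of the next comma (0 if none)
           lvl   = 1
    FROM   test
    UNION ALL
    SELECT id,
           b,
           bpos0 = R.bpos,
           bpos  = CHARINDEX(',', b, R.bpos + 1),
           lvl   = lvl + 1
    FROM   R
    WHERE  R.bpos > 0                  -- stop once the last element was reached
)
SELECT id,
       b = SUBSTRING(b, bpos0 + 1, CASE WHEN bpos = 0 THEN 80000 ELSE bpos - bpos0 - 1 END)
FROM   R
ORDER  BY id, lvl
OPTION (MAXRECURSION 0);               -- 0 removes the limit entirely
```

A specific upper bound such as OPTION (MAXRECURSION 1000) may be a safer choice than 0 if the maximum list length is known.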

db<>fiddle 🎻

answered Aug 29, 2023 at 10:27
  • For a JSON array the key is the ordinal in the array. And this is used as the grouping value in the PIVOT Commented Aug 29, 2023 at 10:51
  • When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys. Commented Aug 29, 2023 at 10:56
  • You need to decide what the "correct" results are for that example data and make the required changes to the code to fix it. I answered the original question you asked but that doesn't commit me to running a help desk for infinite variations of it Commented Aug 30, 2023 at 10:50
  • I've no time to look at it. Next time make sure the question you ask is the one you actually want answered Commented Aug 30, 2023 at 12:21
CREATE TABLE test (id INT, b VARCHAR(100), c VARCHAR(100));
INSERT INTO test VALUES
(1, 'a,b,c', '1,2,3'),
(2, 'd, ,f', '4,5,6'),
(3, 'g,h,i,j', '7, ,9,8');
SELECT * FROM test;
id  b        c
1   a,b,c    1,2,3
2   d, ,f    4,5,6
3   g,h,i,j  7, ,9,8
SELECT test.id, b.value b, c.value c
FROM test
CROSS APPLY STRING_SPLIT(b, ',', 1) b
CROSS APPLY STRING_SPLIT(c, ',', 1) c
WHERE b.ordinal = c.ordinal
ORDER BY test.id, b.ordinal;
id  b  c
1   a  1
1   b  2
1   c  3
2   d  4
2      5
2   f  6
3   g  7
3   h
3   i  9
3   j  8

fiddle


ERROR: Procedure or function STRING_SPLIT has too many arguments specified. I also get an error that ordinal is not recognized. In my SQL Server version, STRING_SPLIT accepts only 2 parameters, e.g. STRING_SPLIT(sentence, ' '), so how do I deal with this? – jawad riaz

If so, then your SQL Server version is not current: the third (enable_ordinal) argument of STRING_SPLIT requires SQL Server 2022 or Azure SQL. See @Luuk's comment.

WITH
b AS (
    SELECT test.id, b.value, ROW_NUMBER() OVER (PARTITION BY test.id ORDER BY test.id) ordinal
    FROM test
    CROSS APPLY STRING_SPLIT(b, ',') b
),
c AS (
    SELECT test.id, c.value, ROW_NUMBER() OVER (PARTITION BY test.id ORDER BY test.id) ordinal
    FROM test
    CROSS APPLY STRING_SPLIT(c, ',') c
)
SELECT test.id, b.value b, c.value c
FROM test
JOIN b ON test.id = b.id
JOIN c ON test.id = c.id
WHERE b.ordinal = c.ordinal
ORDER BY test.id, b.ordinal;
id  b  c
1   a  1
1   b  2
1   c  3
2   d  4
2      5
2   f  6
3   g  7
3   h
3   i  9
3   j  8

fiddle

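But this query is not deterministic: ROW_NUMBER() is ordered only by test.id, which is constant within each partition, and STRING_SPLIT without the ordinal does not guarantee the order of its output rows, so the generated numbers are not guaranteed to match the positions in the list.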

answered Aug 29, 2023 at 8:09