I have a table with the following structure:
set_id | value | set_row_number |
---|---|---|
1 | Foo | 1 |
1 | Bar | 2 |
1 | Baz | 3 |
2 | Foo2 | 1 |
2 | Bar2 | 2 |
2 | Baz2 | 3 |
2 | Qux2 | 4 |
2 | Thud2 | 5 |
3 | Foo3 | 1 |
I would like to take this data and place it into a new table where the number of columns matches the max number of values in one set, the value with set_row_number
of 1
will always be in the column called start_node
and the maximum set_row_number
in the set will always be in the column called end_node
. Any other middle set_row_number
s should fill in from max set_row_number
- 1 starting with 1_from_end_node
, 2_from_end_node
, etc.
The end result should like like:
start_node | 3_from_end_node | 2_from_end_node | 1_from_end_node | end_node |
---|---|---|---|---|
Foo | NULL | NULL | Bar | Baz |
Foo2 | Bar2 | Baz2 | Qux2 | Thud2 |
Foo3 | NULL | NULL | NULL | NULL |
I have already tried to use PIVOT
to accomplish this but I have not been able to get expected results due to the variable length sets and the requirement to have the first in a specific column and the max set_row_number
in another.
I can change the structure of the starting table, if needed. This table was derived from another table that had the hierarchy in a pipe separated format. I used CROSS_APPLY STRING_SPLIT(hierarchy_data, '|')
to achieve my starting point below.
Thanks in advance to anyone able to help. Please let me know if additional clarification is needed.
1 Answer 1
Here's one way to do it statically, which is only useful if you know that there will never be more than 5 nodes for any set_id
:
SELECT * FROM
(
SELECT start_node = src.value, the_rest.value,
col = CASE WHEN rn = 0 THEN 'end_node'
ELSE CONCAT(the_rest.rn, '_from_end_node') END
FROM dbo.[a table] AS src
OUTER APPLY
(
SELECT *, rn = ROW_NUMBER() OVER
(PARTITION BY set_id ORDER BY set_row_number DESC)-1
FROM dbo.[a table] AS rel
WHERE rel.set_id = src.set_id
AND rel.set_row_number > 1
) AS the_rest WHERE src.set_row_number = 1
)
AS x
PIVOT (MAX(value) FOR col IN
(
[3_from_end_node],
[2_from_end_node],
[1_from_end_node],
[end_node]
)) AS p;
Results from db<>fiddle:
start_node | 3_from_end_node | 2_from_end_node | 1_from_end_node | end_node |
---|---|---|---|---|
Foo | null | null | Bar | Baz |
Foo2 | Bar2 | Baz2 | Qux2 | Thud2 |
Foo3 | null | null | null | null |
If you have to plan for any number of nodes (well, up to 255 probably, then you get into query text and parsing/compiling limits), you'll have to use dynamic SQL. Here's one way to do that (similar results in this db<>fiddle):
DECLARE @num tinyint, @cols nvarchar(max);
SELECT @num = MAX(c) - 2 FROM
(SELECT c = COUNT(*) FROM dbo.[a table] GROUP BY set_id) AS x;
;WITH src AS
(
SELECT rn = ROW_NUMBER() OVER (ORDER BY @@SPID)
FROM STRING_SPLIT(REPLICATE('.',@num-1),'.')x
),
f AS
(
SELECT rn, col = QUOTENAME(CONCAT(rn, '_from_end_node'))
FROM src
)
SELECT @cols = (SELECT col + N',' FROM f
ORDER BY rn DESC
FOR XML PATH(''),
TYPE).value(N'.[1]', N'nvarchar(max)')
FROM src;
DECLARE @sql nvarchar(max) = N'SELECT * FROM
(
SELECT start_node = src.value, the_rest.value,
col = CASE WHEN rn = 0 THEN ''end_node''
ELSE CONCAT(the_rest.rn, ''_from_end_node'') END
FROM dbo.[a table] AS src
OUTER APPLY
(
SELECT *, rn = ROW_NUMBER() OVER
(PARTITION BY set_id ORDER BY set_row_number DESC)-1
FROM dbo.[a table] AS rel
WHERE rel.set_id = src.set_id
AND rel.set_row_number > 1
) AS the_rest WHERE src.set_row_number = 1
)
AS x
PIVOT (MAX(value) FOR col IN
(' + @cols + N'[end_node])) AS p;';
EXEC sys.sp_executesql @sql;