
I am working on the database dump of this exact Stack Exchange section. While working on it I have run into an issue that I am currently unable to solve.

In the XML File Posts.xml the contents look like this

[image: one row of Posts.xml, including its "Tags" attribute]

There are of course multiple rows, but that's what one looks like. The dump already includes a Tags.xml file, which makes it even more obvious that the "Tags" attribute in that picture is supposed to be its own table (many-to-many).

So right now I am trying to figure out a way to extract the tags. Here's what I tried:

CREATE TABLE #TestingIdea (
Id int PRIMARY KEY IDENTITY (1,1),
PostId int NULL,
Tag nvarchar (MAX) NULL
)
GO

↑ The table I created to test out my code. I have already filled it with the Tags and PostIds

SELECT T1.PostId,
 S.SplitTag
FROM (
 SELECT T.PostId, 
 cast('<X>'+ REPLACE(T.Tag,'>','</X><X>') + '</X>' as XML) AS NewTag
 FROM #TestingIdea AS T
 ) AS T1
CROSS APPLY (
 SELECT tData.value('.','nvarchar(30)') SplitTag
 FROM T1.NewTag.nodes('X') AS T(tData)
 ) AS S
GO

Yet this code returns this error

XML parsing: line 1, character 37, illegal qualified name character

I googled this error (and searched here too), but the causes other people had (like extra " marks or different character sets) don't apply to my data. So I am kind of stuck. Maybe I missed something extremely obvious in the previous answers I found T_T In any case, I appreciate any help and advice on how to tackle this. It's the last table I have yet to normalize.

Small sample of data from the XML file: https://pastebin.com/AW0Z8Be2

For anyone interested, the program I use to view XML files (so they are much easier to read, as in the picture above) is called FOXE XML Reader (Free XML Editor - First Object).

Paul White
asked Dec 15, 2018 at 19:16
  • So do you need the tags one by one as a result? What is the exact result you need? Do you have sample data to work with? Commented Dec 15, 2018 at 19:22
  • Yes, I need them one by one. This is how the data looks in my database right now: i.gyazo.com/6b7408201f18ebcbf888cc0f8b36cb27.png All I have is the XML file for my data. Commented Dec 15, 2018 at 19:28

1 Answer


Does something like this give you the result set you need?

Table & Data

CREATE TABLE #TestingIdea (
Id int PRIMARY KEY IDENTITY (1,1),
PostId int NULL,
Tag nvarchar (MAX) NULL
)
INSERT INTO #TestingIdea(PostId,Tag)
VALUES(1,'<mysql><innodb><myisam>')
GO

Query

SELECT PostId, RIGHT(value,len(value)-1) as SplitTag
FROM #TestingIdea 
CROSS APPLY string_split(tag,'>')
WHERE value != ''

Result

PostId  SplitTag
1       mysql
1       innodb
1       myisam
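
To see why the RIGHT() call is needed, it may help to look at what STRING_SPLIT returns on its own. The sketch below runs against a string literal rather than the temp table, so it can be tried in isolation. The column aliases RawPiece and SplitTagViaReplace are made up for illustration, and the REPLACE variant is only an alternative that assumes '<' never appears inside a tag name.

```sql
-- Requires SQL Server 2016+ (database compatibility level 130 or higher).
-- Splitting on '>' leaves the leading '<' on every piece, plus one empty
-- piece after the final '>'.
SELECT value                    AS RawPiece,           -- e.g. '<mysql'
       REPLACE(value, N'<', N'') AS SplitTagViaReplace -- e.g. 'mysql'
FROM STRING_SPLIT(N'<mysql><innodb><myisam>', N'>')
WHERE value <> N'';  -- drop the empty trailing piece
```

Note that STRING_SPLIT, as used here, does not document any guarantee about the order of the rows it returns, so the tag position within the original string should not be inferred from the output order.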
answered Dec 15, 2018 at 19:42
  • Indeed it does O_O First of all, thank you! Makes my "complicated" code look like sht lol. Would you be kind enough to explain how that query works? I am obviously still new to the xQuery stuff. I understand that you removed the last '>' on the far right, but I don't understand how the '<' at the beginning got removed. Commented Dec 15, 2018 at 19:51
  • Hey, no problem! I was lucky that you are using SQL Server 2016 or above. As you saw, STRING_SPLIT gets rid of the '>' delimiter, and CROSS APPLY creates a new row for each piece. If I select just the value from the split, without applying RIGHT(), it shows as <mysql, for example. So what I did, and there might be better solutions, was take the rightmost LEN(value) - 1 characters with RIGHT(). That drops the first character of each value, resulting in 'mysql' instead of <mysql. Commented Dec 15, 2018 at 19:55
  • I am assuming this means that versions prior to SQL Server 2016 didn't have the string_split function? Commented Dec 15, 2018 at 20:02
  • Indeed, they had to create a custom function like the one in this link: stackoverflow.com/questions/10914576/t-sql-split-string Commented Dec 15, 2018 at 20:05
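
On versions without STRING_SPLIT, the XML approach from the question can probably be made to work as well. The original attempt fails because the '<' characters from the tag string are still present when the value is cast to XML: '<X><mysql</X>...' opens an element named mysql and then hits another '<' inside the name, which is what "illegal qualified name character" points at. A minimal sketch, assuming the #TestingIdea table from above and that tag names contain no other XML-special characters such as '&':

```sql
SELECT T1.PostId,
       S.SplitTag
FROM (
    SELECT T.PostId,
           -- Strip '<' first, then turn each '>' into an element boundary:
           -- '<mysql><innodb>' becomes '<X>mysql</X><X>innodb</X><X></X>'
           CAST('<X>' + REPLACE(REPLACE(T.Tag, '<', ''), '>', '</X><X>') + '</X>' AS XML) AS NewTag
    FROM #TestingIdea AS T
) AS T1
CROSS APPLY (
    SELECT N.tData.value('.', 'nvarchar(30)') AS SplitTag
    FROM T1.NewTag.nodes('X') AS N(tData)
) AS S
WHERE S.SplitTag <> '';  -- the trailing '>' produces one empty <X></X> node
```

The nvarchar(30) length is carried over from the question; it would need to be widened if any tag is longer than that.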
