I have HTML code stored in the database, and I want to read it as XML.
My code:
http://rextester.com/RMEHO89992
This is an example of the HTML code I have:
<div>
<section>
<h4>
<span> A </span>
</h4>
<ul>
<li>
<span> Ab</span>
AD
<span> AC </span>
</li>
<li>
<span> Ag</span>
<span> AL </span>
</li>
</ul>
<h4>
<span> B </span>
</h4>
<ul>
<li>
<span> Bb</span>
BD
<span> BC </span>
</li>
<li>
<span> Bg</span>
<span> BL </span>
</li>
</ul>
</section>
</div>
and this is an example of the output I need:
Category  Selection  Value
--------- ---------- ------------
A         Ab         AD
A         Ag         AL
B         Bb         BD
B         Bg         BL
I need to get the value inside the <h4> tag as Category, the first <span> tag as Selection, and the rest of the values as a concatenated string.
I've tried the following query:
SELECT
( isnull(t.v.value('(h4/span/span[1]/text())[1]','nvarchar(max)'),'')
+ isnull(t.v.value('(h4/span/text())[1]','nvarchar(max)'),'')
+ isnull(t.v.value('(h4/span/span[2]/text())[2]','nvarchar(max)'),'')
) AS [Category],
( isnull(c.g.value('(span[1]/text())[1]','nvarchar(max)'),'')
+ isnull(c.g.value('(span[1]/span/text())[1]','nvarchar(max)'),'')
+ isnull(c.g.value('(span[1]/text())[2]','nvarchar(max)'),'')
) AS [Selection],
( isnull(c.g.value('(span[2]/text())[1]','nvarchar(max)'),'')
+ isnull(c.g.value('(span[2]/span/text())[1]','nvarchar(max)'),'')
+ isnull(c.g.value('(span[2]/text())[2]','nvarchar(max)'),'')
) AS [Value]
FROM @htmlXML.nodes('div/section') as t(v)
CROSS APPLY t.v.nodes('./ul/li') AS c(g)
and:
SELECT
t.v.value('.','nvarchar(max)')
,
--( isnull(t.v.value('(h4/span/span[1]/text())[1]','nvarchar(max)'),'')+isnull(t.v.value('(h4/span/text())[1]','nvarchar(max)'),'')+isnull(t.v.value('(h4/span/span[2]/text())[2]','nvarchar(max)'),''))AS [Category],
( isnull(c.g.value('(span[1]/text())[1]','nvarchar(max)'),'')+isnull(c.g.value('(span[1]/span/text())[1]','nvarchar(max)'),'')+isnull(c.g.value('(span[1]/text())[2]','nvarchar(max)'),''))AS [Selection]
,
( isnull(c.g.value('(span[2]/text())[1]','nvarchar(max)'),'')+isnull(c.g.value('(span[2]/span/text())[1]','nvarchar(max)'),'')+isnull(c.g.value('(span[2]/text())[2]','nvarchar(max)'),''))AS [Value]
FROM @htmlXML.nodes('div/section/h4/span') as t(v)
CROSS APPLY @htmlXML.nodes('div/section/ul/li') AS c(g)
But it only gets the first category, and doesn't get all the values together.
Category  Selection  Value
--------- ---------- ------------
A         Ab         AC
B         Ab         AC
A         Ag         AL
B         Ag         AL
A         Bb         BC
B         Bb         BC
A         Bg         BL
B         Bg         BL
There can be N categories, and the values might or might not be inside <span> tags.
How can I get all the categories with their corresponding value?
Or, alternatively, how can I get two results like these:
category  h4 number
--------  ---------
A         1
B         2
(1 means the first h4, 2 means the second h4)
ul number  Selection  Value
---------  ---------  ------------
1          Ab         AD
1          Ag         AL
2          Bb         BD
2          Bg         BL
I can't work out how to relate the ul number column to the h4 number column.
2 Answers
This is not exactly elegant but seems to do the job.
DECLARE @X XML = REPLACE(REPLACE(@S, '<h4>', '<foo><h4>'), '</ul>', '</ul></foo>')
SELECT Category  = x.value('../../h4[1]/span[1]', 'varchar(10)'),
       Selection = x.value('descendant-or-self::text()[1]', 'varchar(10)'),
       Value     = REPLACE(
                     REPLACE(
                       REPLACE(
                         LTRIM(
                           RTRIM(
                             REPLACE(
                               REPLACE(
                                 CAST(x.x.query('fn:data(descendant-or-self::text()[fn:position() > 1])') AS VARCHAR(MAX))
                               , char(10), '')
                             , char(13), '')
                           )
                         )
                       , ' ', ' |')
                     , '| ', '')
                   , '|', '')
FROM @X.nodes('div/section/foo/ul/li') x(x)
ORDER BY Category,
         Selection
+----------+-----------+-------+
| Category | Selection | Value |
+----------+-----------+-------+
| A | Ab | AD AC |
| A | Ag | AL |
| B | Bb | BD BC |
| B | Bg | BL |
+----------+-----------+-------+
I'm assuming this is what you want, as the desired results table in the question does not return the "rest of the values as a concatenated string".
I am trying to establish communication between nodes h4 and ul.
You can use the << and >> operators to check whether a node is before or after another node in document order. Combine that with a predicate on position, [1], to get the first occurrence, also in document order.
select H4.X.value('(span/text())[1]', 'varchar(10)') as Section,
UL.X.query('.') as UL
from @X.nodes('/div/section/h4') as H4(X)
cross apply H4.X.nodes('(let $h4 := . (: Save current h4 node :)
return /div/section/ul[$h4 << .])[1]') as UL(X);
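As a rough sketch of how this pairing could be carried through to the Category / Selection / Value shape from the question: one more cross apply shreds each matched ul into its li rows, and the Value expression is borrowed from the other answer's descendant-or-self::text() approach. The inline sample document and the LI alias are assumptions made for illustration, and the Value column is left with its raw whitespace here.

```sql
-- Assumed sample document, modelled on the HTML in the question.
DECLARE @X xml = N'<div><section>
  <h4><span> A </span></h4>
  <ul>
    <li><span> Ab</span> AD <span> AC </span></li>
    <li><span> Ag</span><span> AL </span></li>
  </ul>
  <h4><span> B </span></h4>
  <ul>
    <li><span> Bb</span> BD <span> BC </span></li>
    <li><span> Bg</span><span> BL </span></li>
  </ul>
</section></div>';

SELECT Category  = LTRIM(RTRIM(H4.X.value('(span/text())[1]', 'varchar(10)'))),
       -- First text node under the li is treated as the Selection.
       Selection = LTRIM(RTRIM(LI.X.value('(span/text())[1]', 'varchar(10)'))),
       -- Every text node after the first one, space-separated (not trimmed).
       Value     = CAST(LI.X.query('fn:data(descendant-or-self::text()[fn:position() > 1])') AS varchar(max))
FROM @X.nodes('/div/section/h4') AS H4(X)
-- First ul that follows the current h4 in document order.
CROSS APPLY H4.X.nodes('(let $h4 := . return /div/section/ul[$h4 << .])[1]') AS UL(X)
-- One row per li inside that ul (LI is a hypothetical alias).
CROSS APPLY UL.X.nodes('li') AS LI(X);
```

The Selection expression assumes the first value of each li sits inside a span; if it can be bare text, descendant-or-self::text()[1] from the other answer is the safer choice.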
<< and >> are called Node Order Comparison Operators.
If you have an XML fragment like this:
<N1>1</N1>
<N2>2</N2>
<N3>3</N3>
<N4>4</N4>
<N5>5</N5>
you can get all nodes before the first occurrence of N3 with this query:
select @X.query('/*[. << /N3[1]]');
Result:
<N1>1</N1>
<N2>2</N2>
/* will give you all root nodes. What is enclosed in [] is a predicate. . is the current node and /N3[1] is the first N3 node in document order at the root level. So from all the root nodes you get the ones that precede N3.
Here is almost the same query, only you get the nodes that follow the first N3 node:
select @X.query('/*[. >> /N3[1]]');
<N4>4</N4>
<N5>5</N5>
To only get the first node after the first N3 node, you add the predicate [1]:
select @X.query('/*[. >> /N3[1]][1]');
<N4>4</N4>
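The same operator might also cover the numbering variant from the question (h4 number and ul number). A minimal sketch, assuming the sample document below and assuming every ul follows the h4 it belongs to: counting the h4 siblings that precede a node in document order gives a number that matches between an h4 and its ul. The column names are taken from the question; the `for $n in .` binding is only there to keep the current node available inside the predicate.

```sql
-- Assumed sample document, modelled on the HTML in the question.
DECLARE @X xml = N'<div><section>
  <h4><span> A </span></h4>
  <ul>
    <li><span> Ab</span> AD <span> AC </span></li>
    <li><span> Ag</span><span> AL </span></li>
  </ul>
  <h4><span> B </span></h4>
  <ul>
    <li><span> Bb</span> BD <span> BC </span></li>
    <li><span> Bg</span><span> BL </span></li>
  </ul>
</section></div>';

-- h4 number: h4 siblings before the current h4, plus one for itself.
SELECT Category    = LTRIM(RTRIM(H4.X.value('(span/text())[1]', 'varchar(10)'))),
       [h4 number] = H4.X.value('for $n in . return count(../h4[. << $n]) + 1', 'int')
FROM @X.nodes('/div/section/h4') AS H4(X);

-- ul number: how many h4 siblings come before the current ul.
SELECT [ul number] = UL.X.value('for $n in . return count(../h4[. << $n])', 'int'),
       Selection   = LTRIM(RTRIM(LI.X.value('(span/text())[1]', 'varchar(10)')))
FROM @X.nodes('/div/section/ul') AS UL(X)
CROSS APPLY UL.X.nodes('li') AS LI(X);
```

Joining the two result sets on h4 number = ul number would then relate each category to its list items.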
AD AC for the first row in the third column?