10

Consider the following simple XML:

<xml>
 <customer name="Max">
 <email address="[email protected]" />
 </customer>
 <customer name="Erik">
 <email address="[email protected]" />
 </customer>
 <customer name="Brent">
 <email address="brentcom" />
 </customer>
</xml>

I want to get a list of <Customer> sequences where the address attribute of the <email> item does not contain an @.

So, I want output that looks like:

<customer name="Brent">
 <email address="brentcom" />
</customer>

mcve:

DECLARE @x XML = '<xml>
<customer name="Max"><email address="[email protected]" /></customer>
<customer name="Erik"><email address="[email protected]" /></customer>
<customer name="Brent"><email address="brentcom" /></customer>
</xml>';

This query:

SELECT WithValidEmail = @x.query('/xml/customer/email[contains(@address, "@")]')
 , WithInvalidEmail = @x.query('/xml/customer/email[contains(@address, "@")] = False');

Returns:

╔═══════════════════════════════════════╦══════════════════╗
║ WithValidEmail ║ WithInvalidEmail ║
╠═══════════════════════════════════════╬══════════════════╣
║ <email address="[email protected]" /> ║ ║
║ <email address="[email protected]" /> ║ false ║
╚═══════════════════════════════════════╩══════════════════╝

This query:

SELECT WithInValidEmail = @x.query('/xml/customer/email')
WHERE @x.exist('/xml/customer/email[contains(@address, "@")]') = 0;

Returns:

╔══════════════════╗
║ WithInValidEmail ║
╚══════════════════╝
 (no results)

The WHERE clause in the query above is eliminating the entire set of XML because at least a single sequence exists where the email address contains an "@" sign.

asked Apr 25, 2018 at 14:35

2 Answers 2

11

An easy way to do this is to use the nodes method to get right to the address attribute and check for your @ sign.

The problem with the way you're looking now is that it's only checking that any email address has an @ in it. Parsing the XML nodes out lets you check individual emails for it.

DECLARE @x XML
 = '<xml>
<customer name="Max"><email address="[email protected]" /></customer>
<customer name="Erik"><email address="[email protected]" /></customer>
<customer name="Brent"><email address="brentcom" /></customer>
</xml>';
SELECT x.c.value('@address', 'VARCHAR(100)') AS [email]
FROM @x.nodes('/xml/customer/email') AS x(c)
WHERE x.c.exist('@address[contains(., "@")]') = 0;

If you need to query an actual table with an XML column like this, you'd just CROSS APPLY the nodes method like thusly:

SELECT x.c.value('@address', 'VARCHAR(100)') AS [email]
FROM @x_table AS xt
CROSS APPLY xt.x.nodes('/xml/customer/email') AS x(c)
WHERE x.c.exist('@address[contains(., "@")]') = 0;

If you want to bring all the <customer>...</customer> XML for that "row" back, you can walk the axis back. Just be aware that walking back can make performance a bit woogy for large XML blocks.

SELECT x.c.query('..')
FROM @x_table AS xt
CROSS APPLY xt.x.nodes('/xml/customer/email') AS x(c)
WHERE x.c.exist('@address[contains(., "@")]') = 0;

Another way of doing it is:

SELECT @x.query('/xml/customer[email/@address[not(contains(., "@"))]]') answer

Moving the square brackets to wrap around the email node effectively make that the WHERE clause applied to the customer node. Translating this XQuery to English looks like:

Get me all xml/customer nodes with an email node which has an address attribute which does not contain the @ symbol

wBob
10.4k2 gold badges26 silver badges44 bronze badges
answered Apr 25, 2018 at 14:38
0
4

You were oh so close. You were definitely on the right track with using the .query() function and using the contains XQuery function. What you got wrong was:

  1. Putting the = False outside of the [...] (meaning, it was not part of the contains() expression)
  2. Using the word False instead of the function false()
  3. Not specifying the parent node by adding /.. to the end of the path (so that the result will include the <customer> element and not just the <email> element)

Correcting those three things results in the following XQuery expression that gets you what you are wanting:

'/xml/customer/email[contains(@address, "@") = false()]/..'

Putting that into your original example from the question gives you:

DECLARE @x XML = '<xml>
<customer name="Max"><email address="[email protected]" /></customer>
<customer name="Erik"><email address="[email protected]" /></customer>
<customer name="Brent"><email address="brentcom" /></customer>
</xml>';
SELECT
@x.query('/xml/customer/email[contains(@address, "@")]/..') AS [WithValidEmail],
@x.query('/xml/customer/email[contains(@address, "@")=false()]/..') AS [WithInvalidEmail;

That query returns the following result set of a single row with two XML fields:

WithValidEmail | WithInvalidEmail
<customer name="Max"> | <customer name="Brent">
 <email address="[email protected]" /> | <email address="brentcom" />
</customer> | </customer>
<customer name="Erik"> |
 <email address="[email protected]" /> |
</customer> |

This is probably more efficient than breaking the document out with the .nodes() function since it can parse the XML in a single shot and not need to start and stop the parser per each node.

The other benefit of keeping it within .query() is that you get a single XML document returned. So, if you receive an XML document / value containing multiple nodes worth of stuff, you can maintain the scalar value approach of it being a single entity without having to reconstruct the resulting nodes back into a document again. This also lets you use it in a subquery / CTE without changing the number of expected rows being returned.

Paul White
95.3k30 gold badges439 silver badges689 bronze badges
answered Apr 25, 2018 at 21:44
0

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.