I have a query that needs to return text as valid XML. I also need to restrict the length of the text to a specific number of characters. I adapted a function that performs the conversion and outputs valid XML, as shown below:

ALTER FUNCTION [dbo].[ReplaceForbiddenXMLChars] (@MyField VARCHAR(MAX), @Len INT)
RETURNS VARCHAR(MAX) AS
BEGIN
DECLARE @rtn VARCHAR(MAX) = NULL;
IF @MyField IS NOT NULL 
 SET @rtn = LEFT((SELECT @MyField FOR XML PATH('')),@Len);
RETURN @rtn
END

The function works well most of the time, but I have come across an issue where the truncation can cut an escaped entity in half, resulting in invalid XML. For example:

SELECT dbo.[ReplaceForbiddenXMLChars]( 'Address: Abbey Fruit & Veg Ltd, 1234234 Mulberry Road, Chinatown, BB7 DNBD - Kingsbury Fruit & Veg',100)

Produces: Address: Abbey Fruit &amp; Veg Ltd, 1234234 Mulberry Road, Chinatown, BB7 DNBD - Kingsbury Fruit &am

This is invalid due to the incomplete &am at the end.

I can't think of a way of reducing the size of this field while producing valid XML.

Any help appreciated.

asked Jul 21, 2023 at 9:30
  • Why don't you truncate before converting to XML? Maybe restrict it to 95 characters to avoid an overrun. dbfiddle.uk/cLtXKLTI Commented Jul 23, 2023 at 11:52
  • @Charlieface the problem with that is it's impossible to know how much to truncate the string by before converting. The string may be full of '&' characters, each of which expands to 5 characters (&amp;) in XML. To be safe I would have to reduce the string to 20 chars, which is not ideal. Commented Aug 17, 2023 at 12:14
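
To illustrate the point made in the comment above, here is a minimal sketch comparing raw and escaped lengths for a few made-up sample strings (the sample values and column aliases are assumptions for illustration only). The [text()] alias keeps FOR XML PATH from wrapping the value in an element.

```sql
-- Escaped length depends on how many reserved characters the text contains:
-- '&' becomes '&amp;' (5 characters), '<' becomes '&lt;' (4 characters).
SELECT s.Sample,
       LEN(s.Sample) AS RawLength,
       LEN((SELECT s.Sample AS [text()] FOR XML PATH(''))) AS EscapedLength
FROM (VALUES ('abcd'), ('a&b<'), ('&&&&')) AS s(Sample);
-- 'abcd' -> 4 raw, 4 escaped
-- 'a&b<' -> 4 raw, 11 escaped
-- '&&&&' -> 4 raw, 20 escaped
```

Because the expansion varies per string, a fixed pre-truncation length either wastes space or risks an overrun.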

1 Answer

You can check whether the truncated string is valid XML and, if it is not, remove the last (probably incomplete) entity and check again. Other problems can also make the string invalid XML even when it was created using FOR XML PATH, so the function returns NULL if the second check also fails.

CREATE OR ALTER FUNCTION [dbo].[ReplaceForbiddenXMLChars] (@MyField VARCHAR(MAX), @Len INT)
RETURNS VARCHAR(MAX) AS
BEGIN
    DECLARE @rtn VARCHAR(MAX) = NULL;
    IF @MyField IS NOT NULL
    BEGIN
        -- Escape reserved XML characters, then cut to the requested length
        SET @rtn = LEFT((SELECT @MyField FOR XML PATH('')), @Len);
        -- Check if xml is valid
        IF TRY_CAST(@rtn AS XML) IS NULL
        BEGIN
            -- Guess that it is a broken entity at the end of the string and remove it
            SET @rtn = LEFT(@rtn, LEN(@rtn) - CHARINDEX('&', REVERSE(@rtn)));

            -- Check if xml is valid
            IF TRY_CAST(@rtn AS XML) IS NULL
                -- Something is still wrong
                SET @rtn = NULL;
        END;
    END;
    RETURN @rtn;
END
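
As a quick check, the function can be run against the input from the question. This is only a sketch: it assumes the function above has been created in the current database (complete with its final RETURN @rtn and closing END), and the variable and column aliases are made up for illustration.

```sql
DECLARE @input VARCHAR(MAX) =
    'Address: Abbey Fruit & Veg Ltd, 1234234 Mulberry Road, Chinatown, BB7 DNBD - Kingsbury Fruit & Veg';

SELECT r.Result,
       -- DATALENGTH counts trailing spaces, unlike LEN
       DATALENGTH(r.Result) AS ResultLength,
       -- 1 when the returned string parses as XML, 0 otherwise
       CASE WHEN TRY_CAST(r.Result AS XML) IS NOT NULL THEN 1 ELSE 0 END AS IsValidXml
FROM (SELECT dbo.ReplaceForbiddenXMLChars(@input, 100) AS Result) AS r;
```

With the 100-character limit, the dangling &am is dropped, so the result should end with "Kingsbury Fruit " and stay within the limit.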
answered Jul 21, 2023 at 10:54
  • Thanks, this works well. I did not consider the reverse option. Commented Jul 21, 2023 at 11:46
