I have a query that needs to return text as valid XML. I also need to restrict the length of the text to a specific number of characters. I adapted a function that performs the conversion and outputs valid XML, as shown below:
ALTER FUNCTION [dbo].[ReplaceForbiddenXMLChars] (@MyField VARCHAR(MAX), @Len INT)
RETURNS VARCHAR(MAX) AS
BEGIN
    DECLARE @rtn VARCHAR(MAX) = NULL;
    IF @MyField IS NOT NULL
        SET @rtn = LEFT((SELECT @MyField FOR XML PATH('')), @Len);
    RETURN @rtn;
END
The function works well most of the time, but I have come across an issue where the truncation cuts through an escaped entity, resulting in invalid XML. For example:
SELECT dbo.[ReplaceForbiddenXMLChars]( 'Address: Abbey Fruit & Veg Ltd, 1234234 Mulberry Road, Chinatown, BB7 DNBD - Kingsbury Fruit & Veg',100)
Produces: Address: Abbey Fruit &amp; Veg Ltd, 1234234 Mulberry Road, Chinatown, BB7 DNBD - Kingsbury Fruit &am
This is invalid due to the incomplete &am at the end.
I can't think of a way of reducing the size of this field while producing valid XML.
Any help appreciated.
Why don't you truncate before converting to XML? Maybe restrict it to 95 characters to avoid an overrun. dbfiddle.uk/cLtXKLTI – Charlieface, Jul 23, 2023 at 11:52
@Charlieface the problem with that is that it's impossible to know how much to truncate the string by before converting. The string may be full of '&', each of which expands to 5 characters (&amp;) in XML. To be safe I would have to reduce the string to 20 chars, which is not ideal. – OptimumCoder, Aug 17, 2023 at 12:14
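The worst case described in this comment can be checked directly. The batch below is only a sketch: the variable name @AllAmpersands and the 20-character input are made up for illustration, and it assumes the same SELECT ... FOR XML PATH('') escaping that the function uses.

```sql
-- Hypothetical worst case: an input that consists only of '&' characters
DECLARE @AllAmpersands VARCHAR(MAX) = REPLICATE('&', 20);

-- Each '&' is escaped to the five-character entity '&amp;'
SELECT LEN(@AllAmpersands)                             AS InputLength,
       LEN((SELECT @AllAmpersands FOR XML PATH('')))   AS EscapedLength;
```

With 20 ampersands the escaped length should come out at 100, which is why truncating the raw text first only guarantees a fit if it is cut to a fifth of the target length.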
1 Answer
You can check whether the XML string you created is valid and, if it is not, remove the last (perhaps incomplete) entity and try again. There can be other issues with the string that make the XML invalid even when it is created using FOR XML PATH, so the result is checked a second time and NULL is returned if it is still invalid.
CREATE OR ALTER FUNCTION [dbo].[ReplaceForbiddenXMLChars] (@MyField VARCHAR(MAX), @Len INT)
RETURNS VARCHAR(MAX) AS
BEGIN
    DECLARE @rtn VARCHAR(MAX) = NULL;
    IF @MyField IS NOT NULL
    BEGIN
        SET @rtn = LEFT((SELECT @MyField FOR XML PATH('')), @Len);
        -- Check if xml is valid
        IF TRY_CAST(@rtn AS XML) IS NULL
        BEGIN
            -- Guess that it is a broken entity at the end of the string and remove it
            SET @rtn = LEFT(@rtn, LEN(@rtn) - CHARINDEX('&', REVERSE(@rtn)));
            -- Check if xml is valid
            IF TRY_CAST(@rtn AS XML) IS NULL
                -- Something is still wrong
                SET @rtn = NULL;
        END;
    END;
    RETURN @rtn;
END;
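To see the two steps in isolation, here is a minimal sketch that replays the same logic on the question's example as a standalone batch rather than inside the function. The variable names mirror the function, the XmlCheck column is added only for illustration, and TRY_CAST assumes SQL Server 2012 or later.

```sql
DECLARE @MyField VARCHAR(MAX) = 'Address: Abbey Fruit & Veg Ltd, 1234234 Mulberry Road, Chinatown, BB7 DNBD - Kingsbury Fruit & Veg';
DECLARE @Len INT = 100;
DECLARE @rtn VARCHAR(MAX);

-- Step 1: escape first, then truncate; the cut lands inside the second '&amp;'
SET @rtn = LEFT((SELECT @MyField FOR XML PATH('')), @Len);
SELECT @rtn AS Truncated,
       CASE WHEN TRY_CAST(@rtn AS XML) IS NULL THEN 'invalid' ELSE 'valid' END AS XmlCheck;

-- Step 2: drop everything from the last '&' onwards, then check again
SET @rtn = LEFT(@rtn, LEN(@rtn) - CHARINDEX('&', REVERSE(@rtn)));
SELECT @rtn AS Trimmed,
       CASE WHEN TRY_CAST(@rtn AS XML) IS NULL THEN 'invalid' ELSE 'valid' END AS XmlCheck;
```

On this input the first result should end in &am and be reported as invalid, and the second should end in "Kingsbury Fruit " and be reported as valid. Note that REVERSE plus CHARINDEX finds the last '&' whether or not it starts a broken entity, which is why the second validity check is still needed.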
Thanks, this works well. I did not consider the reverse option. – OptimumCoder, Jul 21, 2023 at 11:46