SQL Server 2016 introduced STRING_SPLIT, which is very fast and an excellent replacement for the homemade implementations people rolled before 2016.

Unfortunately, STRING_SPLIT only supports a single-character separator, which isn't always enough. Does anyone know of a good implementation that allows for using multiple characters in the separator?

Aaron Bertrand
asked Jun 8, 2017 at 12:02

1 Answer

Well, you can always use REPLACE to add a single-character delimiter to the argument before passing it in. You just need to choose a character that is unlikely/impossible to appear in the actual data. In this example, let's say your original data uses three pipes as a delimiter; I chose a Unicode character at random to substitute:

DECLARE 
 @olddelim nvarchar(32) = N'|||', 
 @newdelim nchar(1) = NCHAR(9999); -- pencil (✏)
DECLARE @x nvarchar(max) = N'foo|||bar|||blat|||splunge';
SELECT * FROM STRING_SPLIT(REPLACE(@x, @olddelim, @newdelim), @newdelim);
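
If you want to verify the "won't appear in the data" assumption rather than trust it, one option is to check for the substitute character before splitting. This is only a minimal sketch: the binary collation and the error message are my own choices for illustration, not something STRING_SPLIT requires.

```sql
DECLARE
 @olddelim nvarchar(32) = N'|||',
 @newdelim nchar(1) = NCHAR(9999); -- pencil (✏), same substitute as above
DECLARE @x nvarchar(max) = N'foo|||bar|||blat|||splunge';

-- Assumption: a binary collation is used here so that the search
-- matches the exact code point and nothing that merely sorts as equal.
IF CHARINDEX(@newdelim, @x COLLATE Latin1_General_100_BIN2) > 0
BEGIN
 -- Hypothetical error text; stop the batch instead of splitting incorrectly.
 RAISERROR(N'Substitute delimiter already appears in the input.', 16, 1);
 RETURN;
END

SELECT value FROM STRING_SPLIT(REPLACE(@x, @olddelim, @newdelim), @newdelim);
```

If the check fires, you can either pick a different substitute character or fall back to a splitter that handles multi-character delimiters directly.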

I blogged about this in more detail here:


Addressing a comment:

bad solution. what if original string is like 'abc||pqr|||rst||123' (dynamic and can contain anything). desired o/p is 'abc||pqr' and 'rst||123' but your solution will give 'abc' 'pqr' 'rst' '123'

Okay, let's take your input and see if my solution gets the wrong output.

DECLARE 
 @olddelim nvarchar(32) = N'|||', 
 @newdelim nchar(1) = NCHAR(9999); -- pencil (✏)
DECLARE @x nvarchar(max) = N'abc||pqr|||rst||123';
SELECT * FROM STRING_SPLIT(REPLACE(@x, @olddelim, @newdelim), @newdelim);

Result is:

abc||pqr
rst||123

And not this, which you must have assumed (but didn't test):

abc
pqr
rst
123

If your data is in a table, you could create a view so that you don't have to factor that expression into all of your queries.
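
As a rough sketch of what such a view could look like: the table, column, and view names below are hypothetical, and the delimiter and substitute character are inlined because a view can't use variables. Like any use of STRING_SPLIT, this assumes the database is at compatibility level 130 or higher.

```sql
-- Hypothetical source table, for illustration only.
CREATE TABLE dbo.SourceStrings
(
 id int IDENTITY(1,1) PRIMARY KEY,
 val nvarchar(max) NOT NULL
);
GO

-- Hypothetical view that hides the REPLACE + STRING_SPLIT expression.
CREATE VIEW dbo.SourceStringParts
AS
 SELECT s.id, part = ss.value
 FROM dbo.SourceStrings AS s
 CROSS APPLY STRING_SPLIT(REPLACE(s.val, N'|||', NCHAR(9999)), NCHAR(9999)) AS ss;
GO

INSERT dbo.SourceStrings (val) VALUES (N'foo|||bar|||blat|||splunge');

-- Queries now read from the view without repeating the expression.
SELECT id, part FROM dbo.SourceStringParts;
```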


If that doesn't work because you might have a pencil (✏) in your data, and you can't find a single character among the 1,111,998 available Unicode characters that won't be in your data set, you'll have to skip STRING_SPLIT(), since it is hard-coded to accept a single-character delimiter (per the documentation, separator "Is a single character expression").

Alternatives have been answered here dozens of times before, many before STRING_SPLIT() existed. Those methods still work.
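
To give a flavor of what those look like, here is a deliberately simple sketch that walks the string with CHARINDEX and handles a delimiter of any length. It is not one of the tuned implementations from those answers, and a loop like this will generally be slower than a set-based splitter; the variable and table names are mine.

```sql
DECLARE @x nvarchar(max) = N'abc||pqr|||rst||123';
DECLARE @delim nvarchar(32) = N'|||';

-- Collects the pieces in the order they are found.
DECLARE @parts table (ordinal int IDENTITY(1,1), value nvarchar(max));

-- DATALENGTH / 2 gives the character count of an nvarchar value and,
-- unlike LEN(), does not ignore trailing spaces.
DECLARE @dlen bigint = DATALENGTH(@delim) / 2;
DECLARE @xlen bigint = DATALENGTH(@x) / 2;
DECLARE @start bigint = 1;
DECLARE @pos bigint = CHARINDEX(@delim, @x);

WHILE @pos > 0
BEGIN
 -- Everything between the previous delimiter and this one.
 INSERT @parts (value) VALUES (SUBSTRING(@x, @start, @pos - @start));
 SET @start = @pos + @dlen;
 SET @pos = CHARINDEX(@delim, @x, @start);
END

-- Whatever follows the last delimiter (or the whole string if none was found).
INSERT @parts (value) VALUES (SUBSTRING(@x, @start, @xlen - @start + 1));

SELECT ordinal, value FROM @parts ORDER BY ordinal;
```

With the input from the comment above, this should return abc||pqr and rst||123, without needing any substitute character.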

I go over many alternatives, and also discuss the limitations in STRING_SPLIT(), in this series (I also discuss why you might consider not doing this in T-SQL using any method at all):

answered Jun 8, 2017 at 12:23
  • Sometimes you just want to (mis)use string_split() ¯\_(ツ)_/¯ Commented Jan 22, 2019 at 14:36
  • @JitendraPancholi You are technically correct, but you're being unfair here because: 1) the limitation is with STRING_SPLIT, not this particular work-around, 2) this question is about STRING_SPLIT, not working with multi-character delimiters in general, 3) in practice, it's safe to assume that certain characters won't be there, else it's just bad data, 4) NCHAR(31) (record separator) should be safe since that is its purpose, or NCHAR(0) with , @newdelim COLLATE Latin1_General_100_BIN2); because if U+0000 (null) is in any string data, then there are bigger problems! Commented Jan 22, 2019 at 16:50
  • @SolomonRutzky NCHAR(31) is unit separator; record separator is NCHAR(30); both are fine though. Commented Aug 1, 2021 at 20:03
