236

How do I split a string? There doesn't seem to be a built-in function for this.

Mateen Ulhaq
asked Sep 15, 2009 at 12:42
2
  • 9
    Please see Splitting Strings Commented Sep 15, 2009 at 12:44
  • On what delimiter do you want to split it? Commented Feb 1, 2024 at 7:18

24 Answers

189

Use the gmatch() function to capture strings which contain at least one character of anything other than the desired separator. The separator is any whitespace (%s in Lua) by default:

function mysplit(inputstr, sep)
 if sep == nil then
 sep = "%s"
 end
 local t = {}
 for str in string.gmatch(inputstr, "([^"..sep.."]+)") do
 table.insert(t, str)
 end
 return t
end
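
A quick check of how this approach behaves, as a minimal sketch: split_chars is a hypothetical name for a helper that mirrors the function above, and the sample strings are made up. Because the pattern uses +, consecutive separators are collapsed, so empty fields are dropped.

```lua
-- Hypothetical helper mirroring the gmatch approach above.
function split_chars(input, sep)
  sep = sep or "%s"
  local parts = {}
  -- capture maximal runs of characters that are not in the separator set
  for piece in string.gmatch(input, "([^" .. sep .. "]+)") do
    parts[#parts + 1] = piece
  end
  return parts
end

print(table.concat(split_chars("one two  three"), "|"))  --> one|two|three
print(table.concat(split_chars("foo,,bar", ","), "|"))   --> foo|bar
```

If empty fields must be preserved, a different pattern or a string.find loop is needed.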
Mateen Ulhaq
answered Sep 30, 2011 at 19:26

9 Comments

This worked. It's just for single character delimiters. To split by strings, such as XML tags, change the match pattern to "(.-)("..sep..")" instead. Note: If the string ends with sep, the last match will fail. Append a newline or any character to the end of the input string to fix this.
Correction to my previous comment: The fix for the last match is done by appending the delimiter to the end of the input string.
As others have pointed out, you can simplify this by using table.insert(t,str) instead of t[i] = str and then you don't need i=1 or i = i +1
Doesn't work if string contains empty values, eg. 'foo,,bar'. You get {'foo','bar'} instead of {'foo', '', 'bar'}
That's right. The next version will work in that case: function split(inputstr, sep) sep=sep or '%s' local t={} for field,s in string.gmatch(inputstr, "([^"..sep.."]*)("..sep.."?)") do table.insert(t,field) if s=="" then return t end end end
45

If you are splitting a string in Lua, you should try the string.gmatch() or string.sub() methods. Use the string.sub() method if you know the index you wish to split the string at, or use the string.gmatch() if you will parse the string to find the location to split the string at.

Example using string.gmatch() from Lua 5.1 Reference Manual:

 t = {}
 s = "from=world, to=Lua"
 for k, v in string.gmatch(s, "(%w+)=(%w+)") do
 t[k] = v
 end
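
When the split point has to be found first, string.find and string.sub can be combined. The following is only a sketch: split_once is a hypothetical helper name, and passing true as the fourth argument to string.find makes the delimiter a literal string rather than a pattern.

```lua
-- Hypothetical helper: split s at the first occurrence of delim.
function split_once(s, delim)
  local from, to = string.find(s, delim, 1, true)  -- plain (non-pattern) search
  if not from then
    return s, nil  -- delimiter not present: nothing to split
  end
  return string.sub(s, 1, from - 1), string.sub(s, to + 1)
end

local key, rest = split_once("from=world, to=Lua", "=")
print(key)   --> from
print(rest)  --> world, to=Lua
```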
answered Sep 15, 2009 at 15:59

1 Comment

I "borrowed" an implementation from that lua-users page thanks anyway
40

If you just want to iterate over the tokens, this is pretty neat:

line = "one, two and 3!"
for token in string.gmatch(line, "[^%s]+") do
 print(token)
end

Output:

one,
two
and
3!

Short explanation: the "[^%s]+" pattern matches every non-empty run of characters between whitespace characters.
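
Note that the tokens above keep their punctuation ("one,", "3!"). If that is not wanted, a different character class can be matched instead. A small sketch, assuming only letters and digits count as part of a token; words is a hypothetical helper name:

```lua
-- Hypothetical helper: collect runs of alphanumeric characters.
function words(line)
  local out = {}
  for w in string.gmatch(line, "%w+") do  -- %w matches alphanumeric characters
    out[#out + 1] = w
  end
  return out
end

print(table.concat(words("one, two and 3!"), " "))  --> one two and 3
```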

ggorlen
answered Sep 12, 2010 at 3:52

1 Comment

The pattern %S is equal to the one you mentioned, as %S is the negation of %s, like %D is the negation of %d. Additionally, %w is equal to [A-Za-z0-9]; it does not include the underscore (other characters might be supported depending on your locale).
22

Just as string.gmatch will find patterns in a string, this function will find the things between patterns:

function string:split(pat)
 pat = pat or '%s+'
 local st, g = 1, self:gmatch("()("..pat..")")
 local function getter(segs, seps, sep, cap1, ...)
 st = sep and seps + #sep
 return self:sub(segs, (seps or 0) - 1), cap1 or sep, ...
 end
 return function() if st then return getter(st, g()) end end
end

By default it returns whatever is separated by whitespace.

answered Oct 30, 2009 at 1:37

1 Comment

+1. Note to any other Lua beginners: this returns an iterator, and 'between patterns' includes the beginning and end of the string. (As a newbie I had to try it to figure these things out.)
15

Here is the function:

function split(pString, pPattern)
 local Table = {} -- NOTE: use {n = 0} in Lua-5.0
 local fpat = "(.-)" .. pPattern
 local last_end = 1
 local s, e, cap = pString:find(fpat, 1)
 while s do
 if s ~= 1 or cap ~= "" then
 table.insert(Table,cap)
 end
 last_end = e+1
 s, e, cap = pString:find(fpat, last_end)
 end
 if last_end <= #pString then
 cap = pString:sub(last_end)
 table.insert(Table, cap)
 end
 return Table
end

Call it like:

list=split(string_to_split,pattern_to_match)

e.g.:

list=split("1:2:3:4",":")


For more go here:
http://lua-users.org/wiki/SplitJoin

Andrew White
answered Oct 16, 2009 at 18:36

Comments

11

A lot of these answers only accept single-character separators, or don't deal with edge cases well (e.g. empty separators), so I thought I would provide a more definitive solution.

Here are two functions, gsplit and split, adapted from the code in the Scribunto MediaWiki extension, which is used on wikis like Wikipedia. The code is licenced under the GPL v2. I have changed the variable names and added comments to make the code a bit easier to understand, and I have also changed the code to use regular Lua string patterns instead of Scribunto's patterns for Unicode strings. The original code has test cases here.

-- gsplit: iterate over substrings in a string separated by a pattern
-- 
-- Parameters:
-- text (string) - the string to iterate over
-- pattern (string) - the separator pattern
-- plain (boolean) - if true (or truthy), pattern is interpreted as a plain
-- string, not a Lua pattern
-- 
-- Returns: iterator
--
-- Usage:
-- for substr in gsplit(text, pattern, plain) do
-- doSomething(substr)
-- end
local function gsplit(text, pattern, plain)
 local splitStart, length = 1, #text
 return function ()
 if splitStart then
 local sepStart, sepEnd = string.find(text, pattern, splitStart, plain)
 local ret
 if not sepStart then
 ret = string.sub(text, splitStart)
 splitStart = nil
 elseif sepEnd < sepStart then
 -- Empty separator!
 ret = string.sub(text, splitStart, sepStart)
 if sepStart < length then
 splitStart = sepStart + 1
 else
 splitStart = nil
 end
 else
 ret = sepStart > splitStart and string.sub(text, splitStart, sepStart - 1) or ''
 splitStart = sepEnd + 1
 end
 return ret
 end
 end
end
-- split: split a string into substrings separated by a pattern.
-- 
-- Parameters:
-- text (string) - the string to iterate over
-- pattern (string) - the separator pattern
-- plain (boolean) - if true (or truthy), pattern is interpreted as a plain
-- string, not a Lua pattern
-- 
-- Returns: table (a sequence table containing the substrings)
local function split(text, pattern, plain)
 local ret = {}
 for match in gsplit(text, pattern, plain) do
 table.insert(ret, match)
 end
 return ret
end

Some examples of the split function in use:

local function printSequence(t)
 print((table.unpack or unpack)(t)) -- table.unpack in Lua 5.2+, unpack in Lua 5.1
end
printSequence(split('foo, bar,baz', ',%s*')) -- foo bar baz
printSequence(split('foo, bar,baz', ',%s*', true)) -- foo, bar,baz
printSequence(split('foo', '')) -- f o o
answered Apr 24, 2017 at 7:23

2 Comments

the line ret = sepStart > splitStart and string.sub(text, splitStart, sepStart - 1) or '' can be simplified to just ret = string.sub(text, splitStart, sepStart - 1)?
@pynexj I think you are correct about this. I thought this would break if sepStart is zero, as string.sub(text, splitStart, -1) will return the substring from splitStart to the end of the string, not the empty string. However, string.find will never return zero, so this case will never occur. If I have some time I will port over the unit tests to make sure everything works, and make the change.
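
The behaviour discussed in these two comments can be checked directly. A short sketch with a made-up sample string: string.sub returns the empty string when the end index is smaller than the start index, while an end index of -1 means "up to the end of the string".

```lua
local text = "a,b"

-- end index smaller than start index: empty string
print(string.sub(text, 1, 0) == "")       --> true

-- end index -1 counts from the end of the string
print(string.sub(text, 3, -1))            --> b

-- string.find reports 1-based positions, so a match never starts at 0
print((string.find(text, ",", 1, true)))  --> 2
```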
10

Because there is more than one way to skin a cat, here's my approach:

Code:

#!/usr/bin/env lua
local content = [=[
Lorem ipsum dolor sit amet, consectetur adipisicing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna 
aliqua. Ut enim ad minim veniam, quis nostrud exercitation 
ullamco laboris nisi ut aliquip ex ea commodo consequat.
]=]
local function split(str, sep)
 local result = {}
 local regex = ("([^%s]+)"):format(sep)
 for each in str:gmatch(regex) do
 table.insert(result, each)
 end
 return result
end
local lines = split(content, "\n")
for _,line in ipairs(lines) do
 print(line)
end

Output:

Lorem ipsum dolor sit amet, consectetur adipisicing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat.

Explanation:

The gmatch function works as an iterator: it fetches every substring that matches the pattern stored in regex. The pattern captures the longest run of characters that are not the separator.

answered Aug 22, 2014 at 14:38

Comments

7

I like this short solution

function split(s, delimiter)
 local result = {};
 for match in (s..delimiter):gmatch("(.-)"..delimiter) do
 table.insert(result, match);
 end
 return result;
end
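
Because the delimiter is spliced into a Lua pattern, characters such as . or % change its meaning. One way around that, sketched here with hypothetical helper names (escape_pattern, split_plain), is to escape punctuation with % before building the pattern; Lua patterns allow any non-alphanumeric character to be escaped this way. The sketch assumes a non-empty delimiter.

```lua
-- Hypothetical helper: prefix every punctuation character with %.
function escape_pattern(text)
  -- the extra parentheses drop gsub's second return value (the match count)
  return (string.gsub(text, "%p", "%%%0"))
end

-- Hypothetical variant of the short split that treats delimiter literally.
function split_plain(s, delimiter)
  local result = {}
  local pattern = "(.-)" .. escape_pattern(delimiter)
  -- appending the delimiter lets the last field match as well
  for match in string.gmatch(s .. delimiter, pattern) do
    result[#result + 1] = match
  end
  return result
end

print(table.concat(split_plain("a.b..c", "."), "|"))  --> a|b||c
```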
answered Nov 20, 2013 at 15:46

2 Comments

This is my favorite, since it's so short and simple. I don't quite understand what happens, could someone explain to me?
This fails when using dot as delimiter (or potentially any other pattern magic character)
7

A way not seen in the other answers:

function str_split(str, sep)
 if sep == nil then
 sep = '%s'
 end 
 local res = {}
 local func = function(w)
 table.insert(res, w)
 end 
 string.gsub(str, '[^'..sep..']+', func)
 return res 
end
answered Aug 23, 2018 at 11:45

1 Comment

@MattSephton i use a func instead of loop
6

You can use this method:

function string:split(delimiter)
 local result = { }
 local from = 1
 local delim_from, delim_to = string.find( self, delimiter, from )
 while delim_from do
 table.insert( result, string.sub( self, from , delim_from-1 ) )
 from = delim_to + 1
 delim_from, delim_to = string.find( self, delimiter, from )
 end
 table.insert( result, string.sub( self, from ) )
 return result
end
local result = string.split(string_to_split, pattern) -- or string_to_split:split(pattern)
Jason Plank
answered Feb 17, 2011 at 16:58

Comments

6

You could use the Penlight library. It has a function that splits a string on a delimiter and returns a list.

It implements many of the functions that we need while programming but that are missing from Lua.

Here is the sample for using it.

> 
> stringx = require "pl.stringx"
> 
> str = "welcome to the world of lua"
> 
> arr = stringx.split(str, " ")
> 
> arr
{welcome,to,the,world,of,lua}
> 
answered Jul 1, 2019 at 12:31

Comments

5

Simply splitting on a delimiter

local str = 'one,two'
local regxEverythingExceptComma = '([^,]+)'
for x in string.gmatch(str, regxEverythingExceptComma) do
 print(x)
end
answered Apr 27, 2016 at 7:49

Comments

3

I used the above examples to craft my own function. But the missing piece for me was automatically escaping magic characters.

Here is my contribution:

function split(text, delim)
 -- returns an array of fields based on text and delimiter (one character only)
 local result = {}
 local magic = "().%+-*?[]^$"
 if delim == nil then
 delim = "%s"
 elseif string.find(magic, delim, 1, true) then
 -- escape magic
 delim = "%"..delim
 end
 local pattern = "[^"..delim.."]+"
 for w in string.gmatch(text, pattern) do
 table.insert(result, w)
 end
 return result
end
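
To illustrate why the escaping matters: in a Lua pattern, characters such as . have a special meaning unless they are escaped with % or the search is done in plain mode. A minimal sketch with a made-up sample string:

```lua
local s = "a.b"

-- "." as a pattern matches any character, so the first match is at position 1
print((string.find(s, ".")))           --> 1

-- escaped with %, it matches only a literal dot
print((string.find(s, "%.")))          --> 2

-- plain mode (fourth argument true) disables pattern matching altogether
print((string.find(s, ".", 1, true)))  --> 2
```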
answered Oct 23, 2015 at 23:32

2 Comments

This was my big issue too. This works great with magic characters, nice one
might be stupid, but what are 'magic characters'?
3

Super late to this question, but in case anyone wants a version that lets you limit the number of splits:

-- Split a string into a table using a delimiter and a limit
string.split = function(str, pat, limit)
 local t = {}
 local fpat = "(.-)" .. pat
 local last_end = 1
 local s, e, cap = str:find(fpat, 1)
 while s do
 if s ~= 1 or cap ~= "" then
 table.insert(t, cap)
 end
 last_end = e+1
 s, e, cap = str:find(fpat, last_end)
 if limit ~= nil and limit <= #t then
 break
 end
 end
 if last_end <= #str then
 cap = str:sub(last_end)
 table.insert(t, cap)
 end
 return t
end
answered Feb 11, 2020 at 15:13

Comments

1

For those coming from exercise 10.1 of the "Programming in Lua" book: it seems clear that we cannot use notions explained later in the book (iterators) and that the function should accept a separator longer than a single character.

split() uses a trick: the pattern matches what is not wanted (the separator), and an empty string yields an empty table. The result of plainSplit() is closer to what split returns in other languages.

magic = "([%%%.%(%)%+%*%?%[%]%^%$])"
function split(str, sep, plain)
 if plain then sep = string.gsub(sep, magic, "%%%1") end
 
 local N = '\255'
 str = N..str..N
 str = string.gsub(str, sep, N..N)
 local result = {}
 for word in string.gmatch(str, N.."(.-)"..N) do
 if word ~= "" then
 table.insert(result, word)
 end
 end
 return result
end
function plainSplit(str, sep)
 sep = string.gsub(sep, magic, "%%%1")
 local result = {}
 local start = 0
 repeat
 start = start + 1
 local from, to = string.find(str, sep, start)
 from = from and from-1
 
 local word = string.sub(str, start, from, true)
 table.insert(result, word)
 start = to
 until start == nil
 return result
end
function tableToString(t)
 local ret = "{"
 for _, word in ipairs(t) do
 ret = ret .. '"' .. word .. '", '
 end
 ret = string.sub(ret, 1, -3)
 ret = ret .. "}"
 return #ret > 1 and ret or "{}"
end
function runSplit(func, title, str, sep, plain)
 print("\n" .. title)
 print("str: '"..str.."'")
 print("sep: '"..sep.."'")
 local t = func(str, sep, plain)
 print("-- t = " .. tableToString(t))
end
print("\n\n\n=== Pattern split ===")
runSplit(split, "Exercice 10.1", "a whole new world", " ")
runSplit(split, "With trailing seperator", " a whole new world ", " ")
runSplit(split, "A word seperator", "a whole new world", " whole ")
runSplit(split, "Pattern seperator", "a1whole2new3world", "%d")
runSplit(split, "Magic characters as plain seperator", "a$.%whole$.%new$.%world", "$.%", true)
runSplit(split, "Control seperator", "a\0whole\1new\2world", "%c")
runSplit(split, "ISO Time", "2020-07-10T15:00:00.000", "[T:%-%.]")
runSplit(split, " === [Fails] with \\255 ===", "a\255whole\0new\0world", "\0", true)
runSplit(split, "How does your function handle empty string?", "", " ")
print("\n\n\n=== Plain split ===")
runSplit(plainSplit, "Exercice 10.1", "a whole new world", " ")
runSplit(plainSplit, "With trailing seperator", " a whole new world ", " ")
runSplit(plainSplit, "A word seperator", "a whole new world", " whole ")
runSplit(plainSplit, "Magic characters as plain seperator", "a$.%whole$.%new$.%world", "$.%")
runSplit(plainSplit, "How does your function handle empty string?", "", " ")

output

=== Pattern split ===
Exercice 10.1
str: 'a whole new world'
sep: ' '
-- t = {"a", "whole", "new", "world"}
With trailing seperator
str: ' a whole new world '
sep: ' '
-- t = {"a", "whole", "new", "world"}
A word seperator
str: 'a whole new world'
sep: ' whole '
-- t = {"a", "new world"}
Pattern seperator
str: 'a1whole2new3world'
sep: '%d'
-- t = {"a", "whole", "new", "world"}
Magic characters as plain seperator
str: 'a$.%whole$.%new$.%world'
sep: '$.%'
-- t = {"a", "whole", "new", "world"}
Control seperator
str: 'awholenewworld'
sep: '%c'
-- t = {"a", "whole", "new", "world"}
ISO Time
str: '2020-07-10T15:00:00.000'
sep: '[T:%-%.]'
-- t = {"2020", "07", "10", "15", "00", "00", "000"}
 === [Fails] with \255 ===
str: 'a�wholenewworld'
sep: ''
-- t = {"a"}
How does your function handle empty string?
str: ''
sep: ' '
-- t = {}
=== Plain split ===
Exercice 10.1
str: 'a whole new world'
sep: ' '
-- t = {"a", "whole", "new", "world"}
With trailing seperator
str: ' a whole new world '
sep: ' '
-- t = {"", "", "a", "", "whole", "", "", "new", "world", "", ""}
A word seperator
str: 'a whole new world'
sep: ' whole '
-- t = {"a", "new world"}
Magic characters as plain seperator
str: 'a$.%whole$.%new$.%world'
sep: '$.%'
-- t = {"a", "whole", "new", "world"}
How does your function handle empty string?
str: ''
sep: ' '
-- t = {""}
answered Aug 19, 2022 at 0:54

Comments

0

I found that many of the other answers had edge cases that failed (e.g. when the given string contains #, { or } characters, or when the delimiter is a character like % that requires escaping). Here is the implementation that I went with instead:

local function newsplit(delimiter, str)
 assert(type(delimiter) == "string")
 assert(#delimiter > 0, "Must provide non empty delimiter")
 -- Add escape characters if delimiter requires it
 delimiter = delimiter:gsub("[%(%)%.%%%+%-%*%?%[%]%^%$]", "%%%0")
 local start_index = 1
 local result = {}
 while true do
 local delimiter_start, delimiter_end = str:find(delimiter, start_index)
 if delimiter_start == nil then
 table.insert(result, str:sub(start_index))
 break
 end
 table.insert(result, str:sub(start_index, delimiter_start - 1))
 -- resume after the whole delimiter, which may be longer than one character
 start_index = delimiter_end + 1
 end
 return result
end
answered Jul 7, 2022 at 19:59

Comments

0

There's an example (unexpandTabs) at the end of the Replacements section of Programming in Lua, 4th Ed., Chapter 10, that uses the SOH character (\1) to mark tab columns for later processing. I thought that was a neat idea, so I adapted it to the "match everything except a delimiter character" ideas that many of the answers here use. By preprocessing the input string to replace all matches with \1, we can support arbitrary delimiter patterns, which is something only some answers do, e.g. @norman-ramsey's excellent answer.

I also included an exclude_empty parameter with default behavior just for fun.

Obviously this will produce bad output if the input string contains \1, but that seems extremely unlikely in any case outside of specialized protocol exchanges.

function string:split(pat, exclude_empty)
 pat = pat or "%s+"
 self = self:gsub(pat, "\1")
 local res = {}
 for match in self:gmatch("([^\1]" .. (exclude_empty and "+" or "*") .. ")") do
 res[#res + 1] = match
 end
 return res
end
answered Aug 28, 2023 at 1:17

Comments

0

Cleanest/simplest solution yet? For splitting on whitespace, that is.

local function split(argstr)
 local args = {}
 for v in string.gmatch(argstr, "%S+") do
 table.insert(args, v)
 end
 return args
end
answered Dec 8, 2023 at 19:39

Comments

0

To split a string into two strings at a given position:

str1 = "helloworld"
str2 = ""
index = 5
str1, str2 = string.sub(str1, 1, index), string.sub(str1, index+1, -1)
print (str1, str2) -- hello world
answered Dec 12, 2023 at 7:52

Comments

0

This function extends string with a Python-like split function.

-- string functions utilities
string.split=function(self,sep,limit)
 if sep==nil then
 sep=" "
 end
 local _table = {}
 local _string = ''
 local x = 0 -- separation counter
 for i=1,#self do
 local char=string.sub(self,i,i) -- get character 'i' in string
 if limit==nil then
 -- unlimited separations
 if char == sep then
 -- separation found, insert string in table and 'reset' string to store next word
 table.insert(_table,_string)
 _string=""
 else
 _string = _string .. char -- store no separation character to store later
 if i==#self then
 -- last character in string, just add its remain to table
 table.insert(_table,_string)
 end
 end 
 elseif type(limit)=="number" then
 -- limited separations
 if char == sep then
 -- separation character found
 x=x+1 -- increment separator count
 if x<=limit then
 -- while separator count <= limit, add _string to _table, and reset _string to next word
 table.insert(_table,_string)
 _string="" 
 else
 -- separator counter limit pass, now just concat chars to the last string
 _string = _string .. sep
 end
 else
 -- not a separator char, concat char to string and insert in table if last char
 _string = _string .. char
 if i==#self then
 table.insert(_table,_string)
 end
 end
 end
 end
 -- return splitted table
 return _table
end

usage:

msg='my favorite string'
msg=msg:split(' ',1)

This will result in the table {'my','favorite string'}, as expected!
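
The same limit behaviour can also be sketched with string.find in plain mode instead of a character-by-character loop. This is an alternative under stated assumptions, not the code above: split_limit is a hypothetical name, the separator is treated as a literal string, and it must be non-empty.

```lua
-- Hypothetical alternative: split on a literal separator, at most `limit` times.
function split_limit(s, sep, limit)
  assert(sep ~= "", "separator must be non-empty")
  local parts, from = {}, 1
  -- stop searching once `limit` splits have been made (nil means no limit)
  while not limit or #parts < limit do
    local a, b = string.find(s, sep, from, true)
    if not a then break end
    parts[#parts + 1] = string.sub(s, from, a - 1)
    from = b + 1
  end
  parts[#parts + 1] = string.sub(s, from)  -- remainder of the string
  return parts
end

print(table.concat(split_limit("my favorite string", " ", 1), "|"))  --> my|favorite string
print(table.concat(split_limit("a,b,,c", ","), "|"))                 --> a|b||c
```

Unlike the loop above, this sketch also returns a trailing empty field when the string ends with the separator.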

answered Mar 19, 2024 at 12:07

Comments

0

I would say that the best answer I know of right now is to treat Penlight as Lua's standard library and use https://lunarmodules.github.io/Penlight/libraries/pl.stringx.html#split

answered Jan 28 at 8:51

1 Comment

This answer from 6 years earlier is more complete and includes an example of how to use the library. I'd delete this and defer to that, since I don't think anything new is being added here in an already very cluttered thread.
0

My two cents (it returns an array of tokens):

local function split_string (target, separator, plain)
    local rettbl = {}
    local s_pos = 1
    -- e_pos and n_pos are the start and end of the next separator match
    local e_pos, n_pos = string.find(target, separator, s_pos, plain)
    while e_pos do
        table.insert(rettbl, target:sub(s_pos, e_pos - 1))
        s_pos = n_pos + 1
        e_pos, n_pos = string.find(target, separator, s_pos, plain)
    end
    -- whatever follows the last separator is the final token
    table.insert(rettbl, target:sub(s_pos))
    return rettbl
end

You can try it with--

local my_table = split_string('one, two, three,four , five',
 '%s*,%s*', false)
for key, val in ipairs(my_table) do
 print('/' .. val .. '/')
end

which will print

/one/
/two/
/three/
/four/
/five/

--madmurphy

answered Aug 14 at 16:29

Comments

-1

Here is a routine that works in Lua 4.0, returning a table t of the substrings in inputstr delimited by sep:

function string_split(inputstr, sep)
 local inputstr = inputstr .. sep
 local idx, inc, t = 0, 1, {}
 local idx_prev, substr
 repeat 
 idx_prev = idx
 inputstr = strsub(inputstr, idx + 1, -1) -- chop off the beginning of the string containing the match last found by strfind (or initially, nothing); keep the rest (or initially, all)
 idx = strfind(inputstr, sep) -- find the 0-based r_index of the first occurrence of separator 
 if idx == nil then break end -- quit if nothing's found
 substr = strsub(inputstr, 0, idx) -- extract the substring occurring before the separator (i.e., data field before the next delimiter)
 substr = gsub(substr, "[%c" .. sep .. " ]", "") -- eliminate control characters, separator and spaces
 t[inc] = substr -- store the substring (i.e., data field)
 inc = inc + 1 -- iterate to next
 until idx == nil
 return t
end

This simple test

inputstr = "the brown lazy fox jumped over the fat grey hen ... or something."
sep = " " 
t = {}
t = string_split(inputstr,sep)
for i=1,15 do
 print(i, t[i])
end

Yields:

--> t[1]=the
--> t[2]=brown
--> t[3]=lazy
--> t[4]=fox
--> t[5]=jumped
--> t[6]=over
--> t[7]=the
--> t[8]=fat
--> t[9]=grey
--> t[10]=hen
--> t[11]=...
--> t[12]=or
--> t[13]=something.
answered May 16, 2022 at 21:28

Comments

-3

Depending on the use case, this could be useful. It cuts all text either side of the flags:

b = "This is a string used for testing"
--Removes unwanted text
c = (b:match("a([^/]+)used"))
print (c)

Output:

string
answered Aug 12, 2019 at 22:36

1 Comment

I dont see any relation to the question.
