How do I split a string that contains pipe symbols (|)?
I want to split it into an array.
I tried
echo "12:23:11" | awk '{split($0,a,":"); print a[3] a[2] a[1]}'
which works fine. If my string is like "12|23|11",
how do I split it into an array?
12 Answers
Have you tried:
echo "12|23|11" | awk '{split($0,a,"|"); print a[3],a[2],a[1]}'
@Mohamed Saligh, if you're on Solaris, you need to use /usr/xpg4/bin/awk, given the string length. – Dimitre Radoulov, Nov 4, 2011 at 13:54
'is not working for me'. Especially with colons between the echoed values and split set up to split on '|'??? Typo? Good luck to all. – shellter, Nov 4, 2011 at 23:17
Better with some syntax explanation. – Alston, Aug 18, 2015 at 11:42
This will not work in GNU awk, because the third argument to split is a regular expression, and | is a special symbol, which needs to be escaped. Use split($0, a, "\|") – WhiteWind, Apr 19, 2017 at 4:03
@WhiteWind: another way to "ensure" that | is seen as a char and not a special symbol is to put it between []: i.e., split($0, a, "[|]"). I like this better than '\|' in some cases, especially as some variants of regexp (perl vs grep vs ... others?) can have "|" interpreted literally and "\|" seen as the regex separator, instead of the opposite... ymmv – Olivier Dulac, Dec 5, 2019 at 9:16
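The escaping question raised in these comments can be checked directly. The following is a minimal sketch, assuming a POSIX-style awk on the PATH; the function name reverse_fields is made up for illustration. It uses the bracket expression "[|]", which matches a literal pipe no matter how a given awk treats a bare "|":

```shell
# reverse_fields is a hypothetical helper name, not part of awk.
# "[|]" is a bracket expression, so the pipe is always taken literally.
reverse_fields() {
    printf '%s\n' "$1" | awk '{ split($0, a, "[|]"); print a[3], a[2], a[1] }'
}

reverse_fields '12|23|11'
```

If a plain "|" separator misbehaves on some awk, swapping in "[|]" is a low-risk thing to try first.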
To split a string into an array in awk we use the function split():
awk '{split($0, array, ":")}'
#           \/  \___/  \_/
#           |     |     |
#       string    |     delimiter
#                 |
#               array to store the pieces
If no separator is given, it uses the FS, which defaults to the space:
$ awk '{split($0, array); print array[2]}' <<< "a:b c:d e"
c:d
We can give a separator, for example ":":
$ awk '{split($0, array, ":"); print array[2]}' <<< "a:b c:d e"
b c
Which is equivalent to setting it through the FS:
$ awk -F: '{split($0, array); print array[2]}' <<< "a:b c:d e"
b c
In GNU Awk you can also provide the separator as a regexp:
$ awk '{split($0, array, ":*"); print array[2]}' <<< "a:::b c::d e"  # note multiple :
b c
And even see what the delimiter was on every step by using its fourth parameter:
$ awk '{split($0, array, ":*", sep); print array[2]; print sep[1]}' <<< "a:::b c::d e"
b c
:::
Let's quote the man page of GNU awk:
split(string, array [, fieldsep [, seps ] ])
Divide string into pieces separated by fieldsep and store the pieces in array and the separator strings in the seps array. The first piece is stored in array[1], the second piece in array[2], and so forth. The string value of the third argument, fieldsep, is a regexp describing where to split string (much as FS can be a regexp describing where to split input records). If fieldsep is omitted, the value of FS is used. split() returns the number of elements created. seps is a gawk extension, with seps[i] being the separator string between array[i] and array[i+1]. If fieldsep is a single space, then any leading whitespace goes into seps[0] and any trailing whitespace goes into seps[n], where n is the return value of split() (i.e., the number of elements in array).
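The single-space case mentioned in the quote is easy to trip over, so here is a small sketch of the difference, assuming a POSIX-style awk. The helper name count_pieces is hypothetical. A separator of " " splits on runs of whitespace and ignores leading and trailing whitespace, while the regexp "[ ]" treats every individual space as a separator and therefore yields more pieces, some of them empty:

```shell
# count_pieces is a hypothetical helper: it prints how many pieces
# split() produces for a string ($1) and a separator ($2).
count_pieces() {
    awk -v s="$1" -v sep="$2" 'BEGIN { print split(s, parts, sep) }'
}

count_pieces '  a  b ' ' '     # single space: runs of whitespace, edges ignored
count_pieces '  a  b ' '[ ]'   # regexp: each individual space is a separator
```

The first call reports two pieces ("a" and "b"); the second reports a larger count because empty pieces between adjacent spaces are kept.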
Please be more specific! What do you mean by "it doesn't work"? Post the exact output (or error message), your OS and awk version:
% awk -F\| '{
for (i = 0; ++i <= NF;)
print i, $i
}' <<<'12|23|11'
1 12
2 23
3 11
Or, using split:
% awk '{
n = split($0, t, "|")
for (i = 0; ++i <= n;)
print i, t[i]
}' <<<'12|23|11'
1 12
2 23
3 11
Edit: on Solaris you'll need to use the POSIX awk (/usr/xpg4/bin/awk) in order to process 4000 fields correctly.
for (i = 0 or for (i = 1? – PiotrNycz, Sep 17, 2015 at 14:46
i = 0, because I use ++i after (not i++). – Dimitre Radoulov, Sep 17, 2015 at 15:44
Ok - I did not notice this. I strongly believe for (i = 1; i <= n; ++i) would be more readable... – PiotrNycz, Sep 17, 2015 at 15:48
@PiotrNycz @Dimitre: why not for (i = 0; i++ < n; ) { ... } - this way it combines the best of the 1-based indexing and 0-based indexing (with the freebie bonus of no longer needing the 3rd argument in the for (...;...;...) { } statement) – RARE Kpop Manifesto, Jun 4, 2025 at 21:22
@rare-kpop-manifesto if this is not just a strange sense of humor, then: we have the for (init; condition; increment) construct. Each of the 3 parts has its well-defined place in this for loop. What is the possible benefit of placing two parts in one slot and leaving one slot empty? The only reason for that is to make the code less readable, thus maybe avoiding being fired, because the code will be hard for someone else to maintain? – PiotrNycz, Jun 7, 2025 at 9:20
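For comparison, the conventional loop form suggested in the comments can be sketched as follows. This assumes a POSIX-style awk; the helper name list_pieces is made up, and "[|]" is used as the separator so the pipe is unambiguously literal:

```shell
# list_pieces is a hypothetical helper name.
list_pieces() {
    printf '%s\n' "$1" | awk '{
        n = split($0, t, "[|]")      # n is the number of pieces
        for (i = 1; i <= n; ++i)     # init; condition; increment
            print i, t[i]
    }'
}

list_pieces '12|23|11'
```

Because awk arrays filled by split() start at index 1, starting the loop at 1 keeps the index and the printed position identical.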
I do not like the echo "..." | awk ... solution, as it makes unnecessary fork and exec system calls.
I prefer Dimitre's solution with a little twist:
awk -F\| '{print $3 $2 $1}' <<<'12|23|11'
Or a slightly shorter version:
awk -F\| '$0=$3 $2 $1' <<<'12|23|11'
In this case the output record is assembled by the assignment; the result is a non-empty string, which is a true condition, so it gets printed.
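One caveat of the pattern-only form is that it prints nothing when the rebuilt record is empty. The sketch below contrasts it with an explicit print, assuming a POSIX-style awk; both function names are hypothetical:

```shell
# Both function names are hypothetical helpers.
# Pattern-only form: prints only when the rebuilt record is non-empty.
join_reversed_terse() {
    printf '%s\n' "$1" | awk -F'|' '$0 = $3 $2 $1'
}

# Explicit form: always prints, even when every field is empty.
join_reversed_explicit() {
    printf '%s\n' "$1" | awk -F'|' '{ print $3 $2 $1 }'
}

join_reversed_terse '12|23|11'
join_reversed_explicit '12|23|11'
```

For an input such as '||' the terse form emits no line at all, while the explicit form emits an empty line, so the explicit form is the safer choice when the line count matters.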
In this specific case the stdin redirection can be avoided by setting an awk internal variable:
awk -v T='12|23|11' 'BEGIN{split(T,a,"|");print a[3] a[2] a[1]}'
I used ksh for quite a while, but in bash this can be done with built-in string manipulation. In the first case the original string is split at the separator. In the second case it is assumed that the string always contains digit pairs separated by a one-character separator.
T='12|23|11';echo -n ${T##*|};T=${T%|*};echo ${T#*|}${T%|*}
T='12|23|11';echo ${T:6}${T:3:2}${T:0:2}
The result in all cases is
112312
I think the end result was supposed to be the awk array variable references, regardless of the print output example given. But you missed a really easy bash case to provide your end result. T='12:23:11';echo ${T//:} – Daniel Liston, Sep 14, 2018 at 0:17
@DanielListon You are right! Thanks! I did not know that the trailing / can be left out in this bash expression... – TrueY, Sep 17, 2018 at 13:42
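For shells that lack bash-only expansions such as ${T:6} or ${T//:}, a similar result can be sketched with IFS word splitting into the positional parameters. This is an alternative technique, not the one used in the answer above, and the function name split_pipe is hypothetical:

```shell
# split_pipe is a hypothetical helper name.
# The body runs in a subshell ( ... ), so the IFS and set -f changes
# do not leak into the calling shell.
split_pipe() (
    IFS='|'      # split on the pipe only
    set -f       # disable globbing while the string is expanded unquoted
    set -- $1    # the pieces become $1, $2, $3, ...
    printf '%s%s%s\n' "$3" "$2" "$1"
)

split_pipe '12|23|11'
```

The positional parameters act as the array here, which keeps the sketch within plain POSIX sh.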
Actually awk has a feature called the 'Input Field Separator Variable' (FS). This is how to use it. It's not really an array, but it uses the internal $ variables. For splitting a simple string it is easier.
echo "12|23|11" | awk 'BEGIN {FS="|";} { print $1, $2, $3 }'
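As a small extension of the same idea, the output separator can be set next to FS. This sketch assumes a POSIX-style awk; OFS is awk's standard output field separator variable, and the function name reorder_fields is made up:

```shell
# reorder_fields is a hypothetical helper name.
# FS controls how the input line is split into $1, $2, ...;
# OFS is placed between comma-separated arguments of print.
reorder_fields() {
    printf '%s\n' "$1" | awk 'BEGIN { FS = "|"; OFS = ":" } { print $3, $2, $1 }'
}

reorder_fields '12|23|11'
```

Note that OFS only takes effect where print arguments are separated by commas; juxtaposed fields such as $3 $2 $1 are simply concatenated.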
Joke? :)
How about echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'
This is my output:
p2> echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'
112312
so I guess it's working after all.
Is that because of the length of the string? My string length is 4000. Any ideas? – Mohamed Saligh, Nov 4, 2011 at 13:19
I know this is kind of an old question, but I thought someone might like my trick, especially since this solution is not limited to a specific number of items.
# Convert to an array
_ITEMS=($(echo "12|23|11" | tr '|' '\n'))
# Output array items
for _ITEM in "${_ITEMS[@]}"; do
echo "Item: ${_ITEM}"
done
The output will be:
Item: 12
Item: 23
Item: 11
echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'
should work.
echo "12|23|11" | awk '{split($0,a,"|"); print a[3] a[2] a[1]}'
code
awk -F"|" '{split($0,a); print a[1],a[2],a[3]}' <<< '12|23|11'
output
12 23 11
Your answer could be improved by adding more information on what the code does and how it helps the OP. – Tyler2P, Apr 24, 2022 at 11:19
The challenge: parse strings split on spaces and store the pieces in variables.
Solution: the simplest choice is to convert the list of strings into an array and then assign its elements to variables by index. Here's an example of how to convert and access the array.
Example: parse disk space statistics on each line:
sudo df -k | awk 'NR>1' | while read -r line; do
#convert into array:
array=($line)
#variables:
filesystem="${array[0]}"
size="${array[1]}"
capacity="${array[4]}"
mountpoint="${array[5]}"
echo "filesystem:$filesystem|size:$size|usage:$capacity|mountpoint:$mountpoint"
done
#output:
filesystem:/dev/dsk/c0t0d0s1|size:4000|usage:40%|mountpoint:/
filesystem:/dev/dsk/c0t0d0s2|size:5000|usage:50%|mountpoint:/usr
filesystem:/proc|size:0|usage:0%|mountpoint:/proc
filesystem:mnttab|size:0|usage:0%|mountpoint:/etc/mnttab
filesystem:fd|size:1000|usage:10%|mountpoint:/dev/fd
filesystem:swap|size:9000|usage:9%|mountpoint:/var/run
filesystem:swap|size:1500|usage:15%|mountpoint:/tmp
filesystem:/dev/dsk/c0t0d0s3|size:8000|usage:80%|mountpoint:/export
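The same parsing can also be sketched without an intermediate array by letting read assign the columns directly. This is an alternative to the approach above, not a correction of it; the function name parse_df and the sample input are made up, and the column layout is assumed to match df -k output whose mount points contain no spaces:

```shell
# parse_df is a hypothetical helper; it reads df -k style text on stdin.
parse_df() {
    # Skip the header line, then let read split each line on whitespace.
    awk 'NR > 1' | while read -r filesystem size used avail capacity mountpoint; do
        echo "filesystem:$filesystem|size:$size|usage:$capacity|mountpoint:$mountpoint"
    done
}

# Made-up sample input standing in for real df -k output.
printf '%s\n' \
    'Filesystem 1K-blocks Used Available Use% Mounted' \
    '/dev/sda1 4000 1600 2400 40% /' | parse_df
```

Naming the columns in the read statement avoids the numeric indexes, at the cost of having to list the unused columns (used, avail) as placeholders.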
awk -F'[|]' '{print $1"\t"$2"\t"$3}' <<<'12|23|11'
[...] OFS, stick commas in between them, making print see them as separate arguments.
[...] (echo "12:23:11" | sed "s/.*://") deletes everything up to and including the last ":", keeping only the "11". It works to get the last number, but would need to be modified (in a difficult-to-read way) to get the 2nd number, etc. awk (and awk's split) is much more elegant and readable.
[...] input="12:23:11" and then output=$(echo -n ":${input}" | tr ':' '\n' | tac -b | tr '\n' ':'); output="${output#:}"; echo "${output}"
output now contains 11:23:12