I need to split a command like:
r' "C:\Program Files (x86)\myeditor" "$FILEPATH" -n$LINENO "c:\Program Files" -f$FILENAME -aArg2'
into:
['"C:\\Program Files (x86)\\myeditor"',
'"$FILEPATH"',
'-n$LINENO',
'"c:\\Program Files"',
'-f$FILENAME',
'-aArg2']
That is, I want to split by spaces, but avoid splitting the elements in double-quotes.
I have this code:
import re
s = r' "C:\Program Files (x86)\myeditor" "$FILEPATH" -n$LINENO "c:\Program Files" -f$FILENAME -aArg2'
start = end = 0
split = []
for elem in re.findall('".*?"', s):
    end = s.find(elem)
    split.append(s[start:end])
    split.append(elem)
    start = end + len(elem)
split.extend(s[start:].split())
split = [elem.strip() for elem in split]
split = list(filter(None, split))
It works, but I'm wondering if there's a more elegant, shorter, or more readable way to do this in Python 3.
2 Answers
The best way to do what you want with the standard library would be shlex.split():
>>> import shlex
>>> s = r' "C:\Program Files (x86)\myeditor" "$FILEPATH" -n$LINENO "c:\Program Files" -f$FILENAME -aArg2'
>>> shlex.split(s)
['C:\\Program Files (x86)\\myeditor', '$FILEPATH', '-n$LINENO', 'c:\\Program Files', '-f$FILENAME', '-aArg2']
Note that the quotes are not retained.
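If the quotes do need to be kept, as in the output the question asks for, one option worth trying is the non-POSIX mode of shlex.split(). The sketch below assumes the input contains only balanced double quotes and no escaped quotes inside them; the variable name tokens is just for illustration.

```python
import shlex

s = r' "C:\Program Files (x86)\myeditor" "$FILEPATH" -n$LINENO "c:\Program Files" -f$FILENAME -aArg2'

# posix=False switches shlex to its non-POSIX rules, under which the
# quote characters remain part of each quoted token.
tokens = shlex.split(s, posix=False)

for token in tokens:
    print(token)
```

An unbalanced quote makes shlex.split() raise ValueError, so input coming from users may need a try/except around the call.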
You could use a different regex:
import re
s = r' "C:\Program Files (x86)\myeditor" "$FILEPATH" -n$LINENO "c:\Program Files" -f$FILENAME -aArg2'
pattern = re.compile(r"((\"[^\"]+\")|(-[^\s]+))")
for m in re.finditer(pattern, s):
    print(m.group(0))
This regex will match either an item enclosed by double quotes (") or an item prepended with a dash (-).
However, this might be harder to read and grasp. I'm also not sure whether it is considered Pythonic, as it's the Perl way of doing things, so take this with a grain of salt.
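Because the pattern above only picks up quoted items and items starting with a dash, a bare argument without a dash would be dropped silently. A slightly more general variant, offered as a sketch rather than a drop-in replacement, matches either a double-quoted run or any run of non-whitespace characters. The name TOKEN is made up for this example, and the pattern assumes that quotes are balanced and never appear in the middle of an unquoted word.

```python
import re

s = r' "C:\Program Files (x86)\myeditor" "$FILEPATH" -n$LINENO "c:\Program Files" -f$FILENAME -aArg2'

# Alternation order matters: try a double-quoted item first, then fall
# back to any run of non-whitespace characters.
TOKEN = re.compile(r'"[^"]*"|\S+')

tokens = TOKEN.findall(s)

for token in tokens:
    print(token)
```

Since findall() returns the whole match when the pattern has no capturing groups, there is no need to loop over match objects and call group(0).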
shlex + copying double quotes is a more Pythonic and sensible approach, as it follows the line of thinking "use the standard library if it does the job".