This question is not about how to write a properly escaped string literal. I couldn't find any related question that isn't about how to escape variables for direct consumption within a script or by other programs.
My goal is to enable a script to generate other scripts. This is because the tasks in the generated scripts will run anywhere from 0 to n times on another machine, and the data from which they are generated may change before they're run (again), so doing the operations directly, over a network will not work.
Given a known variable that may contain special characters such as single quotes, I need to write that out as a fully escaped string literal, e.g. a variable foo containing bar'baz should appear in the generated script as:
qux='bar'\''baz'
which would be written by appending "qux=$foo_esc" to the other lines of script. I did it using Perl like this:
foo_esc="'`perl -pe 's/('\'')/\1\\\\\1\1/g' <<<"$foo"`'"
but this seems like overkill.
I have had no success in doing it with bash alone. I have tried many variations of these:
foo_esc="'${foo//\'/\'\\\'\'}'"
foo_esc="'${foo//\'/'\\''}'"
but either extra slashes appear in the output (when I do echo "$foo_esc"), or they cause a syntax error (expecting further input if done from the shell).
7 Answers
Bash has a parameter expansion option for exactly this case:
${parameter@Q}
The expansion is a string that is the value of parameter quoted in a format that can be reused as input.
So in this case:
foo_esc="${foo@Q}"
This is supported in Bash 4.4 and up. There are several options for other forms of expansion as well, and for specifically generating complete assignment statements (@A).
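As a rough sketch of how this fits the question's goal (bash 4.4 or later is assumed, and the names qux and generated_line are only illustrative), the quoted value can be appended to a generated assignment and then read back with eval to check that it round-trips:

```shell
#!/usr/bin/env bash
# Sketch only: requires bash 4.4+ for the ${parameter@Q} expansion.
foo="bar'baz"

# Quote the value so that it can be reused as shell input.
foo_esc=${foo@Q}

# Build one line of the generated script (qux is a hypothetical name).
generated_line="qux=$foo_esc"

# Evaluating the generated line in the same bash should recreate the value.
eval "$generated_line"
printf '%s\n' "$generated_line"
```

The round trip here happens in the same shell that produced the quoting, which is the case this operator is documented for.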
-
Neat, but only have 4.2 which gives bad substitution. – Walf, commented Jul 18, 2017 at 6:26
-
The Z shell equivalent is "${foo:q}". – JdeBP, commented Jul 18, 2017 at 12:04
-
@JdeBP that Z shell equivalent doesn't work. Any other ideas for zsh? – Steven Shaw, commented Jun 12, 2019 at 5:05
-
I found the answer: "${(@qq)foo}" – Steven Shaw, commented Jun 12, 2019 at 11:23
-
Note that ${var@Q} was actually copied from mksh, and is not among the safest to use. See my answer for details. – Stéphane Chazelas, commented Jul 26, 2020 at 17:37
TL;DR: skip to the conclusion.
While several shells/tools have builtin quoting operators, some of which have already been mentioned in a few answers, I'd like to stress here that many are unsafe to use depending on:
- what is being quoted
- the context in which the quoted string is used
- the locale in which the quoted output is generated
- the locale in which that generated quoted output is later used
Several things to consider:
- In some contexts, it's important the empty string be represented as '' or "". For instance, if it's to be used in sh -c "cmd $quoted_output" it matters if we want what was quoted to be passed as one argument to cmd. In sh -c "var=$quoted_output; ...", it doesn't matter whether the empty string is represented as '', "" or as the empty string.
  The $var:q operator of zsh represents the empty string as the empty string, not '', "" nor $''.
  The ${var@Q} operator of bash (itself copied from mksh, which behaves differently in this regard) represents an empty $var as '', but an unset $var as the empty string:
  $ empty_var= bash -c 'printf "<%s>\n" "${empty_var@Q}" "${unset_var@Q}"'
  <''>
  <>
  $ empty_var= mksh -c 'printf "<%s>\n" "${empty_var@Q}" "${unset_var@Q}"'
  <''>
  <''>
  $ empty_var= zsh -c 'printf "<%s>\n" "${empty_var:q}" "${unset_var:q}"'
  <>
  <>
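Why the representation of the empty string matters for arguments can be seen with a small sketch. count_args is a hypothetical helper (not part of any shell) that counts how many arguments a fragment of generated code yields once sh parses it:

```shell
# count_args: hypothetical helper that reports how many words the
# generated code fragment in $1 produces when parsed by sh.
count_args() {
    sh -c "set -- $1; echo \$#"
}

quoted_as_pair="''"    # empty string represented as ''
quoted_as_nothing=""   # empty string represented as nothing at all

with_quotes=$(count_args "$quoted_as_pair")
without_quotes=$(count_args "$quoted_as_nothing")

printf 'with quotes: %s argument(s)\n' "$with_quotes"
printf 'without quotes: %s argument(s)\n' "$without_quotes"
```

With '' the command still receives one (empty) argument; with nothing at all, the argument silently disappears.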
- Some of those quoting operators will use a combination of '...', \, "..." or $'...'. The syntax of the latter varies between shells and between versions of a given shell. So for those operators that do use it or can use it depending on the input, it's important that the result be used in the same shell (and same version thereof). That applies at least to:
  - the printf %q of GNU printf, bash, ksh93, zsh
  - zsh's $var:q, ${(q)var}, ${(q+)var}, ${(qqqq)var}
  - mksh's ${var@Q}
  - bash's ${var@Q}
  - the typeset/declare/export -p output of ksh93, mksh, zsh, bash (not for scalar variables in older versions)
  - the alias/set output of bash, ksh93, mksh, zsh
  - the xtrace output of ksh93, mksh, zsh
  In any case $'...' is not (yet [1]) a standard sh quoting operator, and beware that non-Bourne-like shells such as rc, es, akanga, fish have completely different quoting syntax. There is simply no way to quote a string in a way that is compatible with every shell in existence (though see this other Q&A for some ways to work around it).
- Some shells decode their input as characters before interpreting the code in it, some don't, and some do it sometimes, and sometimes not.
  Some shells (like bash) also make their syntax conditional on the locale. For instance, token delimiters in the syntax are the characters considered as blanks in the locale in yash and bash (though in bash, that only works for single-byte ones). Some shells also rely on the locale's character classification to decide what characters are valid in a variable name. So for instance Stéphane=1 could be interpreted as an assignment in one locale, or as the invocation of the Stéphane=1 command in another.
  The sequence of bytes 0xa3 0x5c represents the £\ string in the ISO-8859-1 (aka latin1) character set, the α character in BIG5, or an invalid sequence of bytes in UTF-8. \ happens to be a special character in the shell syntax, including within "..." and $'...'. ` is also a (dangerous) character whose encoding can be found in the encoding of other characters in some locales.
  Byte 0xa0 is the non-breaking-space character in a great number of single-byte character sets and that character is considered as blank in some locales on some systems, and as such as a token delimiter in the syntax of bash or yash there. That byte is also found in the UTF-8 encoding of thousands of characters including many alphabetical ones (like à, encoded as 0xc3 0xa0).
  I'm not aware of any charset in use in any locale of any ASCII-based systems that have characters whose encoding contains the encoding of ' though.
  Some shell quoting operators output $'\u00e9' or $'\u[e9]' for the é character for instance. And that in turn, when used, depending on the shell, and the locale at the time of interpreting or running the code that uses it, will be expanded to its UTF-8 encoding or in the locale's encoding (with variation in behaviour if the locale doesn't have that character).
  So, it's not only important that the resulting string be used in the same shell and shell version, but also that it be used in the same locale (at least for those shells that do some character encoding/decoding). And even then, several shells (including bash) have or have had bugs in that regard.
  Any quoting operator that uses $'...', "...", or backslash for quoting or that leaves some non-ASCII characters unquoted is potentially unsafe.
  Or in other words, only the ones that use '...' are safe in that regard. That leaves:
  - zsh's ${(qq)var} operator
  - the alias output of dash/bash, bosh (at least current versions)
  - the export -p of dash/bosh (at least current versions)
  - the set output of dash (at least current versions)
  Though of those only the first is documented and committed to always use single quotes (though note the caveat about rcquotes below).
  bash 5.3 will apparently introduce the %#q format directive to quote with single quotes (incompatible with the %#q of ast-open/ksh93's printf which is used for CSV quoting), but as currently written in the development version, it quotes ' itself as \' and not ''\''' like in zsh's ${(qq)var} so is potentially unsafe if the quoted text is appended to something that ends in a byte >= 0x80.
  Also note that yash can't cope with data that can't be decoded in the locale's charset, so there's no way to pass arbitrary data to that shell (at least in the current version).
  Ironically, the output of the locale utility has the problem (as it's required to use "..." to output implied settings), and it's typically intended to be used to input code in a locale that is different from that where locale was invoked (to restore the locale).
- The NUL character (0 byte) cannot occur in an environment variable or in arguments of a command that is executed by way of the execve() system call (that's a limitation of that system call that takes those env and arguments strings as C-style NUL-delimited strings). Except in zsh, NUL cannot be found in shell variables or builtin arguments or more generally shell code either.
  A 0 byte however can be read and written alright from/to a file or pipe or any I/O mechanism.
  In zsh it can be stored in a variable, read and written, passed as argument to builtins like in any modern programming language (such as python or perl).
  But bear in mind that if you quote a NUL with any method that leaves it as-is (as opposed to $'\0', $'\x0', $'\u0000', $'\C@' for instance), regardless of how it is quoted, the result cannot be passed in an argument or env var to an executed command, and no other shell will be able to make use of that NUL character.
  That's possibly to bear in mind if you take external input in zsh, as in IFS= read -r var. If a NUL byte is included in that line read from stdin, $var and ${(qq)var} will contain it which may restrict what you can do with it.
  That's one case where using the $'...' form of quoting can be preferable (if the other caveats associated with that form of quoting (see above) can be addressed).
- If the resulting quoted text is to be used in shell code located inside backticks, beware that there's an extra layer of backslash interpretation. Always use $(...) in place of `...`.
- Some characters are only special in some context. For instance = is special in the words that precede the command name (as in a=1 cmd arg), but not after [2] (as in cmd a=1), though there are some special cases in some shells for commands like export, readonly... ~ is special in some contexts and not others.
  Not all quoting operators will quote those.
  Some characters are special in some shells but not in others, or only when some option is enabled...
  Even digits are special in some contexts. For instance sh -c "echo ${quoted_text}>file" would not output the quoted text in file, if 2 was not quoted as '2' for instance.
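The digit case can be reproduced with a short sketch. The variable names and the temporary directory are only illustrative; the point is that a bare 2 directly in front of > is parsed as a file descriptor number rather than as an argument:

```shell
# Sketch: a bare digit immediately before ">" is parsed as a file descriptor.
tmpdir=$(mktemp -d)

unquoted=2       # a value that some quoting operator left unquoted
quoted="'2'"     # the same value wrapped in single quotes

# Parsed as "echo 2>file": stderr is redirected, nothing is written to the file.
sh -c "echo ${unquoted}>\"\$1\"" sh "$tmpdir/unquoted" >/dev/null
# Parsed as "echo '2' >file": the text 2 ends up in the file.
sh -c "echo ${quoted}>\"\$1\"" sh "$tmpdir/quoted"

unquoted_result=$(cat "$tmpdir/unquoted")
quoted_result=$(cat "$tmpdir/quoted")
rm -rf "$tmpdir"
```

Only the single-quoted form survives being placed directly before a redirection.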
- Both bash and zsh can perform csh-style history expansion where the ! and ^ characters by default are treated specially. That's enabled by default when interactive (where it's "useful"), and can be disabled with set +o banghist in zsh, or by setting $histchars in bash or $HISTCHARS in zsh to the empty string.
  Those $histchars/$HISTCHARS variables can also be used to change the default ! and ^ history characters to something else.
  zsh's quoting operators honour the value of $HISTCHARS and the banghist option, while bash's always escape only ! and ^ regardless of the value of $histchars.
  That means quoting that uses "..." or \ instead of '...' or $'...' (inside which history expansion is not performed) cannot be used for reinput in bash if history expansion is enabled and $histchars has a different value from the default; and in zsh can only be used if the same history configuration is applied.
- In zsh, the rcquotes option affects how single-quoted strings are interpreted (and generated by its quoting operators). When enabled, a single quote can be represented in a single-quoted string with '' like in the rc shell. For instance, "foo'bar" can also be written 'foo''bar'.
  So it's important that the quoted string generated when rcquotes is enabled be only interpreted by zsh instances that also have rcquotes enabled.
  A ${(qq)var} produced by a zsh with or without rcquotes should be safe to use in zsh -o rcquotes, but note that in zsh -o rcquotes, concatenating single quoted strings would result in a single quote being inserted between them.
  $ quoted_text="'*'"
  $ zsh -o rcquotes -c "echo $quoted_text$quoted_text"
  *'*
  same as:
  $ rc -c "echo $quoted_text$quoted_text"
  *'*
  You can work around it by inserting "" in between the two:
  $ zsh -o rcquotes -c "echo $quoted_text\"\"$quoted_text"
  **
  While in rc and derivatives (where "..." is not a quoting operator, '...' being the only kind of quotes, hence the need to be able to insert ' within them), you'd use ^:
  $ rc -c "echo $quoted_text^$quoted_text"
  **
In conclusion
The only quoting method that is safe (if we limit to Bourne-like shells and disregard yash and `...` or rogue locales, and assume the data doesn't contain NUL characters) is single quoting of everything (even the empty string, even characters you'd imagine never to be a problem), and represent the single quote character itself as \' or "'" outside of the single-quotes, as was the initial intent in your question.
To do that you can use:
- zsh's ${(qq)var} operator (or "${(qq@)array}" for an array), assuming the rcquotes option is not enabled.
- a function like:
  shquote() {
    LC_ALL=C awk -v q="'" '
      BEGIN{
        for (i=1; i<ARGC; i++) {
          gsub(q, q "\\" q q, ARGV[i])
          printf "%s ", q ARGV[i] q
        }
        print ""
      }' "$@"
  }
  or
  shquote() {
    perl -le "print join ' ', map {q(') . s/'/'\\\\''/gr . q(')} @ARGV" -- "$@"
  }
- ksh93/zsh/bash/mksh:
  quoted_text=\'${1//\'/\'\\\'\'}\'
  (don't double-quote the expansion and don't use it outside of scalar variable assignments, or you'll run into compatibility problems between different versions of bash (see description of the compat41 option in the manual))
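To tie this back to the question's goal of generating scripts, here is a minimal sketch built on the last form above. It assumes bash; the sample value, the name qux and the temporary file are hypothetical and only serve the illustration:

```shell
#!/usr/bin/env bash
# Sketch: shquote wraps its single argument in single quotes.
shquote() {
    local quoted
    # Left unquoted on purpose, and used only in a scalar assignment.
    quoted=\'${1//\'/\'\\\'\'}\'
    printf '%s\n' "$quoted"
}

# A value containing a quote plus text that must never be executed.
foo="bar'baz \$(not run) \`nor this\`"
script=$(mktemp)

# Write the generated script: one assignment plus a command that uses it.
{
    printf 'qux=%s\n' "$(shquote "$foo")"
    printf '%s\n' 'printf "%s\n" "$qux"'
} > "$script"

# Running it with plain sh should print the original value unchanged.
result=$(sh "$script")
rm -f "$script"
```

Because the generated text only relies on '...' and \', the consuming shell does not have to be the bash that produced it.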
All other builtin quoting operators found in various shells such as the q, qqq, qqqq, q+, q- parameter expansion flags and :q modifier of zsh, the %q of various printf implementations, the @Q of recent versions of mksh and bash are all potentially unsafe, how much so depending on context, and are better avoided for strings meant to be interpreted by a POSIX-like shell interpreter.
[1] The POSIX specification of $'...' was initially targeted for Issue 8 of the Single UNIX Specification, expected to be released in 2021 at the earliest, but it looks like it's not going to make it (consensus on a resolution was not reached in time). So, we'll probably have to wait at least another decade before $'...' is added to the standard.
[2] Except when the -k (keyword) option of the Bourne shell and some of its derivatives is enabled.
-
In perl, I use $q = "\x27"; to avoid the double-quote/single-quote/backslash construct you used. – jrw32982, commented Jul 26, 2020 at 2:04
-
@jrw32982supportsMonica, I try and avoid specifying characters by the value of their encoding in a specific charset wherever possible as otherwise that fails when you port your script to systems/locales that use a different charset. EBCDIC systems are rare nowadays, but who can tell whether ASCII will live forever? – Stéphane Chazelas, commented Jul 26, 2020 at 7:46
Bash provides a printf builtin with a %q format specifier, which performs shell escaping for you, even in older (<4.0) versions of Bash:
printf '[%q]\n' "Ne'er do well"
# Prints [Ne\'er\ do\ well]
printf '[%q]\n' 'Sneaky injection $( whoami ) `ls /root`'
# Prints [Sneaky\ injection\ \$\(\ whoami\ \)\ \`ls\ /root\`]
This trick can also be used to return arrays of data from a function:
function getData()
{
    printf '%q ' "He'll say hi" 'or `whoami`' 'and then $( byebye )'
}
declare -a DATA="( $( getData ) )"
printf 'DATA: [%q]\n' "${DATA[@]}"
# Prints:
# DATA: [He\'ll\ say\ hi]
# DATA: [or\ \`whoami\`]
# DATA: [and\ then\ \$\(\ byebye\ \)]
Note that the Bash printf builtin is different from the printf utility which comes bundled with most Unix-like operating systems. If, for some reason, the printf command invokes the utility instead of the builtin, you can always execute builtin printf instead.
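A small round-trip check, sketched under the assumption that the escaped text is read back by the same bash that produced it (for input containing a newline, bash may emit the $'...' form, which other shells do not necessarily understand). The variable names are only illustrative:

```shell
#!/usr/bin/env bash
# Sketch: round-trip a value through bash's builtin printf %q.
original=$'two\nlines and a \'quote\''

# Force the builtin in case printf resolves to an external utility.
escaped=$(builtin printf '%q' "$original")

# Re-read the escaped text as shell code in the same bash.
eval "restored=$escaped"

printf 'escaped form: %s\n' "$escaped"
```

Comparing "$restored" with "$original" afterwards is a cheap way to check that nothing was lost along the way.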
-
I'm not sure how that helps if what I'd need printed would be 'Ne'\''er do well', etc., i.e. quotes included in the output. – Walf, commented Jul 30, 2018 at 3:15
-
@Walf I think you're not understanding that the two forms are equivalent, and both are perfectly as safe as each other. E.g. [[ 'Ne'\''er do well' == Ne\'er\ do\ well ]] && echo 'equivalent!' will echo equivalent! – Dejay Clayton, commented Jul 30, 2018 at 17:39
-
I did miss that :P however I prefer the quoted form as it's easier to read in a syntax-highlighting viewer/editor. – Walf, commented Jul 31, 2018 at 23:43
-
@Walf it seems like your approach is pretty dangerous, considering that in your example Perl, passing a value like 'hello' results in the incorrect value ''\''hello'', which has an unnecessary leading empty string (the first two single quotes), and an inappropriate trailing single quote. – Dejay Clayton, commented Aug 1, 2018 at 3:02
-
@Walf, for clarification, $'escape-these-chars' is the ANSI-C quoting feature of Bash that causes all characters within the specified string to be escaped. Thus, to easily create a string literal that contains a newline within the filename (e.g. $'first-line\nsecond-line'), use \n within this construct. – Dejay Clayton, commented Aug 6, 2018 at 19:56
Current solution at bottom.
I guess I didn't RTFM. It can be done like so:
q_mid=\'\\\'\'
foo_esc="'${foo//\'/$q_mid}'"
Then echo "$foo_esc" gives the expected 'bar'\''baz'
How I'm actually using it is with a function:
function esc_var {
    local mid_q=\'\\\'\'
    printf '%s' "'${1//\'/$mid_q}'"
}
...
foo_esc="`esc_var "$foo"`"
Edit: this appears to be a bug in older versions of bash, because I now get the expected output from this:
foo_esc="'${foo//\'/\'\\\'\'}'"
Modifying this to use the printf built-in from Dejay's solution:
function esc_vars {
    printf ' %q' "$@" | cut -b 2-
}
To heed Stéphane's warnings about incompatibilities between different versions of bash, regarding single quotes inside double-quoted expansions, the bullet-proof function becomes:
esc_vars() {
    local i v=()
    while [ "$#" -gt 0 ]; do
        i=${#v[@]}
        v[i]=\'${1//\'/\'\\\'\'}\'
        shift
    done
    printf '%s' "${v[*]}"
}
which uses an array so as to quote all arguments separately, which it outputs in the end separated with the first character of IFS (or space if IFS is unset, or nothing if IFS is set to the empty string).
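As a usage sketch (bash assumed, default IFS assumed, and the sample arguments are made up for illustration), the output of esc_vars can be spliced into a generated command line and read back to confirm that each argument survives separately, including an empty one:

```shell
#!/usr/bin/env bash
# Copy of the array-based esc_vars above, so that the sketch is self-contained.
esc_vars() {
    local i v=()
    while [ "$#" -gt 0 ]; do
        i=${#v[@]}
        v[i]=\'${1//\'/\'\\\'\'}\'
        shift
    done
    printf '%s' "${v[*]}"
}

# Build a generated command line from several awkward arguments.
generated="set -- $(esc_vars "He'll say hi" 'two  spaces' '')"

# Re-reading it should give back exactly three arguments.
eval "$generated"
arg_count=$#
first_arg=$1
third_arg=$3
```

The empty third argument is kept because it is emitted as '' rather than as nothing.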
-
The function should be esc_var and not esc_vars. I know it's a small edit, and of course I would edit it myself, but I can't do single character edits. – benathon, commented Jun 24, 2020 at 21:30
-
@portforwardpodcast The plural on the latter version is deliberate because the "$@" is used to expand and escape all arguments passed, unlike the former which only escapes the first argument and drops any others. Your comment did prompt me to check it and it was not separating them properly, so thanks. – Walf, commented Jun 25, 2020 at 4:34
There are several solutions to quote a var value:
- alias
  In most shells (where alias is available), except csh, tcsh and probably other csh-like shells:
  $ alias qux=bar\'baz
  $ alias qux
  qux='bar'\''baz'
  Yes, this works in many sh-like shells like dash or ash.
- set
  Also in most shells (again, not csh):
  $ qux=bar\'baz
  $ set | grep '^qux='
  qux='bar'\''baz'
- typeset
  In some shells (ksh, bash and zsh at least):
  $ qux=bar\'baz
  $ typeset -p qux
  typeset qux='bar'\''baz'   # this is zsh, quoting style may be different for other shells.
- export
  First do:
  export qux=bar\'baz
  Then use:
  ksh: export -p | grep 'qux='
  bash: export -p | grep 'qux='
  zsh: export -p qux
- quote
  bash: echo "${qux@Q}"
  zsh: echo "${(qq)qux}"   # from one to four q's may be used.
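The alias variant can be wrapped up roughly as follows. This is only a sketch: it assumes a shell whose alias listing uses single quotes (bash prefixes the listing with "alias ", dash does not), and the names value, listing and assignment are illustrative:

```shell
# Sketch: borrow the shell's own alias listing to get a single-quoted value.
value="bar'baz"

# Define a throwaway alias (qux is a hypothetical name) holding the value.
alias qux="$value"

# bash prints "alias qux='...'", dash prints "qux='...'"; drop the prefix.
listing=$(alias qux)
assignment=${listing#alias }

# The remaining text has the shape of an assignment, so eval can read it back.
eval "$assignment"
unalias qux

printf '%s\n' "$assignment"
```

How the listing is quoted is up to each shell, so the result is best checked in the shell that will actually generate the script.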
-
The alias approach is clever and seems like it is specified by POSIX. For maximum portability I think this is the way to go. I believe the suggestions involving grep with export or set may break on variables containing embedded newlines. – jw013, commented Jan 2, 2020 at 19:34
If you don't want to maintain your own function, you can use shell-quote:
$> shell-quote "single'quote" 'double"quote'
'single'\''quote' 'double"quote'
shell-quote lets you pass arbitrary strings through the shell so that they won't be changed by the shell. This lets you process commands or files with embedded white space or shell globbing characters safely. Here are a few examples.
https://linux.die.net/man/1/shell-quote
On Debian/Ubuntu, it is part of libstring-shellquote-perl.
In PHP, you can use the escapeshellarg function, that transforms a general string into a bash argument string.
-
Perhaps you could add an example on how to apply it to the OP's code (your answer ended up in my "low quality review" queue because it was so short ...) – AdminBee, commented Jul 29, 2020 at 10:09
-
This seems irrelevant to the question at hand, which does not mention PHP at all. – commented Jul 29, 2020 at 10:32
-
I can imagine using php to "enable a script to generate other scripts", if the OP is OK with using PHP as the primary generator-script. The question is tagged bash, and probably assumes a shell as the generator, but mentions perl, which seems to me to open the door to using other scripting languages. – commented Jul 29, 2020 at 13:26