Using extended Unicode characters is, no doubt, useful for many users.
Simpler shells (busybox ash, dash) and ksh fail on:
tést() { echo 34; }
tést
But bash, mksh, lksh, and zsh seem to allow it.
I am aware that valid POSIX function names use this definition of a Name, which corresponds to this regex:
[a-zA-Z_][a-zA-Z0-9_]*
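As a minimal sketch, the regex above can be checked in plain POSIX sh with a `case` pattern. The helper name `is_posix_name` is hypothetical (it is not part of any shell), and the function body runs in a subshell with `LC_ALL=C` on the assumption that this makes the `a-z` style ranges match ASCII letters only:

```shell
# Hypothetical helper: succeed only if "$1" matches [a-zA-Z_][a-zA-Z0-9_]*
# The body is a subshell so that the LC_ALL assignment does not leak out.
is_posix_name() (
    # Assumption: the C locale restricts the ranges below to ASCII.
    LC_ALL=C
    case $1 in
        # Reject: empty string, bad first character, or any bad character.
        '' | [!a-zA-Z_]* | *[!a-zA-Z0-9_]*) exit 1 ;;
    esac
    exit 0
)
```

With this, `is_posix_name test` succeeds while `is_posix_name tést` fails, which mirrors what the stricter shells enforce for function names.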
However, in the first link it is also said:
An implementation may allow other characters in a function name as an extension.
The questions are:
- Is this accepted and documented?
- Where?
- For which shells (if any)?
Related questions:
Its possible use special characters in a shell function name?
I am not interested in using meta-characters (>) in function names.
Upstart and bash function names containing "-"
I do not believe that an operator (subtraction "-") should be part of a name.
2 Answers
Since the POSIX documentation allows it as an extension, nothing prevents an implementation from behaving this way.
A simple check (run in zsh):
$ for shell in /bin/*sh 'busybox sh'; do
printf '[%s]\n' $shell
$=shell -c 'á() { :; }'
done
[/bin/ash]
/bin/ash: 1: Syntax error: Bad function name
[/bin/bash]
[/bin/dash]
/bin/dash: 1: Syntax error: Bad function name
[/bin/ksh]
[/bin/lksh]
[/bin/mksh]
[/bin/pdksh]
[/bin/posh]
/bin/posh: á: invalid function name
[/bin/yash]
[/bin/zsh]
[busybox sh]
sh: syntax error: bad function name
shows that bash, zsh, yash, ksh93 (which ksh is linked to on my system), pdksh and its derivatives allow multibyte characters in function names.
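The loop above relies on zsh's `$=shell` word splitting. A rough POSIX sh sketch of the same probe might look like the following; the helper name `probe_funcname` and the list of shells are assumptions for illustration, and shells that are not installed are simply skipped:

```shell
# Hypothetical helper: report whether shell "$1" accepts "$2" as a function name.
# The name is spliced into the script text, so only pass names you trust.
probe_funcname() {
    if "$1" -c "$2() { :; }" 2>/dev/null; then
        printf '%s: accepted\n' "$1"
    else
        printf '%s: rejected\n' "$1"
    fi
}

# Try every shell from an assumed list that is actually present in PATH.
for sh in bash dash ksh mksh yash zsh; do
    if command -v "$sh" >/dev/null 2>&1; then
        probe_funcname "$sh" 'á'
    fi
done
```

The result depends on the installed shell versions and on the current locale, so it is best read as a snapshot of one system rather than a general statement.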
yash was designed to support multibyte characters from the beginning, so it is no surprise that it works.
The other documentation you can refer to is that of ksh93:
A blank is a tab or a space. An identifier is a sequence of letters, digits, or underscores starting with a letter or underscore. Identifiers are used as components of variable names. A vname is a sequence of one or more identifiers separated by a . and optionally preceded by a .. Vnames are used as function and variable names. A word is a sequence of characters from the character set defined by the current locale, excluding non-quoted metacharacters.
So switching to the C locale:
$ export LC_ALL=C
$ á() { echo 1; }
ksh: á: invalid function name
makes it fail.
- posh isn't worth including in such a list. It depends on Linux-specific bugs in libc and will not work on other platforms. – schily, Jun 1, 2018 at 8:27
- I cannot reproduce your claims about ksh93 using a self-compiled ksh93 built from the original sources. While ksh88 seems to accept non-7-bit-ASCII letters in function names, only the ksh93 binary from Ubuntu seems to accept them. – schily, Jun 1, 2018 at 8:30
- @schily The ksh I used in this test is the Debian binary (so it may be the same as the one on Ubuntu). – cuonglm, Jun 1, 2018 at 8:59
Note that functions share the same namespace as other commands including commands in the file system, which on most systems have no limitation on the characters or even bytes they may contain in their path.
So while most shells restrict the characters of their functions, there's no real good reason why they would do that. That means in those shells, there are commands you can't replace with a function.
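A small sketch of the shared namespace, using only POSIX features: a function named after an external utility takes precedence over the PATH lookup, and `command` bypasses the function. The choice of `uname` and the variable names `shadowed` and `real` are just for illustration:

```shell
# Define a function with the same name as an external utility.
uname() { echo "function uname"; }

# A plain invocation now finds the function first.
shadowed=$(uname)

# 'command' skips shell functions and runs the real utility.
real=$(command uname)

# Remove the function so later calls reach the utility again.
unset -f uname

printf 'function gave: %s\nutility gave: %s\n' "$shadowed" "$real"
```

This only works because `uname` happens to be a name every shell accepts for a function; a command whose name contains characters the shell rejects cannot be shadowed this way.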
zsh and rc allow anything in their function names, including names containing / and the empty string. zsh even allows NUL bytes.
$ zsh
$ $'\0'() echo nul
$ ^@
nul
$ ""() uname
$ ''
Linux
$ /bin/ls() echo test
$ /bin/ls
test
A simple command in the shell is a list of arguments, and the first argument is used to derive the command to execute. So it's only logical that those arguments and function names share the same possible values, and in zsh arguments to builtins and functions can be any byte sequence.
There's no security issue here, as the functions you (the script author) define are the ones you invoke.
Where there may be security issues is when the parsing is affected by the environment, for instance with shells where the set of valid function names is affected by the locale.
- One may play games in bash too, starting with function /bin/sh { echo "$0: $FUNCNAME: Permission denied"; return 126; }, and potentially useful things too with functions named --, //, @ or % etc. – mr.spuratic, Nov 27, 2015 at 17:44
- But don't shells tend to bypass a hash-table lookup when / is found in a name? And a function isn't just an executable name, it's code. I would think a simple implementation could encounter a lot of parse problems if its stored function names included metacharacters. – mikeserv, Nov 27, 2015 at 17:46
- Yes, I am aware of the inability of bash to contain nulls in vars, which could reasonably be extended to function names. I do not have a specific example, but I do feel that these games of allowing almost anything for names are more of a potential security breach than an "easy way to work". I hope I am wrong. – user79743, Nov 27, 2015 at 19:01
alias to be a tad more lenient. And so you can write the function with some proper, buttoned-down name, and then just define a more stylishly named alias to call the function. In dash there is also some stuff you can do with $PATH and %func.