Marvelous Bash has no built-in namespace feature sufficient to scope functions or variables to a specific file or module, as you would find in PHP or JavaScript.
Conflicts in the global scope, which, given how common shells operate, usually cannot be resolved in any frankly adequate way, are still my main concern. I have stumbled on them multiple times; they are rather rare, but they do happen, and sometimes only xtrace or even strace helps.
All in all, the main rationale is to make it safer, more systematic, and still convenient enough in production environments, without nesting functions inside functions just as a workaround.
Let's assume:
- Both libraries and the main script may source other libraries;
- Re-sourcing of the same library anywhere is prohibited - there is a logic that prevents it;
- Scripts in this case are to be executed (not sourced).
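The guard against re-sourcing is assumed here but not shown. A minimal sketch of one way it might look, where the library name Demo and the marker variable Demo__isSourced are hypothetical and not taken from the actual libraries:

```shell
#! /usr/bin/env bash

# Write a throwaway library with a guard at its top (illustration only).
_libPath="$( mktemp; )" || exit 99;

cat > "$_libPath" <<'LIB'
# Guard: "Demo__isSourced" is a hypothetical "private" marker variable.
if [[ -n "${Demo__isSourced+x}" ]];
then
    printf $'Library \'Demo\' is already sourced.\n' >&2;

    return 1;
fi

declare -r Demo__isSourced='';

Demo_Hello() { printf 'Hello from Demo\n'; };
LIB

# Source the library twice and record both statuses.
_firstStatus=0;  \. "$_libPath" || _firstStatus=$?;
_secondStatus=0; \. "$_libPath" 2> '/dev/null' || _secondStatus=$?;

rm -f -- "$_libPath";

printf 'first: %s, second: %s\n' "$_firstStatus" "$_secondStatus";
# Prints: first: 0, second: 1
```

The guard lives in the library file itself, so it works no matter which script or library attempts the second sourcing.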
There are conventions that are more or less portable or non-POSIX, like the possibly Bashism-like [1] Google code-style version with :: (double-colon) for "packages" ("libraries", I assume). Some use . (dot) instead of :: for library "exports", like JSON.bash.
Double-colon (::) is indeed a popular convention, but it works for function names only. It is also known to cause some auto-completion errors (not very relevant in this case) and may mean different things (e.g. 1, 2, 3). These issues, and the general support of underscore (_) in shells by default, made me choose the latter over the former, as seen below.
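The function-only limitation of :: is quick to check in Bash. A minimal sketch, where Lib::Func and Lib::var are made-up names used only for illustration:

```shell
#! /usr/bin/env bash

# "Lib::Func" and "Lib::var" are hypothetical names for illustration.
Lib::Func() { printf 'function: ok\n'; };
Lib::Func;

# The same name shape is rejected for a variable (the subshell contains the error).
if ! ( declare 'Lib::var=1'; ) 2> '/dev/null';
then
    printf 'variable: not a valid identifier\n';
fi
```

An underscore-separated name such as Lib_Func and Lib_var is accepted in both positions, which is what allows one scheme for functions and variables alike.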
For instance, here is a crude test of which characters from the ASCII 1-127 set are allowed in function and variable identifiers, both as a single character and right after a leading _:
File: :/names.sh:
#! /usr/bin/env bash
_Main()
{
    export LC_ALL=C;
    printf '%s\n\n' "$BASH_VERSION";
    declare -A funcNames=();
    declare -A varNames=();
    # Try every character in ASCII 1-127 as a name, optionally prefixed with "$1".
    _T() {
        # Print the character with the given decimal code.
        _N() { printf "\x$( printf '%x' "$1"; )"; };
        declare a; for a in {1..127};
        do
            declare n="${1}$( _N "$a"; )";
            # Valid function name: it can be defined, called, and unset.
            if [[ "$( { eval "${n}() { printf 1; };" && "$n" && unset -f "$n"; } 2> '/dev/null'; )" == '1' ]]; then funcNames["$n"]=''; fi;
            # Valid variable name: "declare" accepts it.
            if ( declare -- "$n"; ) 2> '/dev/null'; then varNames["$n"]=''; fi;
        done
    }
    _T;
    _T _;
    printf 'Function names (%s total): ' "${#funcNames[@]}";
    for n in "${!funcNames[@]}"; do printf '%s, ' "$( declare -p n | sed 's/^.*=//'; )"; done;
    printf '\n\nVariable names (%s total): ' "${#varNames[@]}";
    printf $'\'%s\', ' "${!varNames[@]}"; printf '\n';
    printf '\n';
}
_Main "$@";
$ ./names.sh | fold -sw 80;
5.2.15(1)-release
Function names (216 total): $'\037', $'\036', $'\035', $'\034', $'\E', $'\032',
$'\031', $'\030', $'\027', $'\026', $'\025', $'\024', $'\023', $'\022',
$'\021', $'\020', $'\017', $'\016', $'\r', $'\f', $'\v', $'\b', $'\a', $'\006',
$'\005', $'\004', $'\003', $'\002', $'\001', "?", ", ":", "9", "8", "7", "6",
"5", "4", "3", "2", "1", "0", "/", ".", ",", "+", "*", "_", "^", "]", "[", "Z",
"Y", "X", "W", "V", "U", "T", "S", "R", "Q", "P", "O", "N", "M", "L", "K", "J",
"I", "H", "G", "F", "E", "D", "C", "B", "A", "@", $'\177', "~", "z", "y", "x",
"w", "v", "u", "t", "s", "r", "q", "p", "o", "n", "m", "l", "k", "j", "i", "h",
"g", "f", "e", "d", "c", "b", "a", "_@", "_A", "_B", "_C", "_D", "_E", "_F",
"_G", "_H", "_I", "_J", "_K", "_L", "_M", "_N", "_O", "_P", "_Q", "_R", "_S",
"_T", "_U", "_V", "_W", "_X", "_Y", "_Z", "_]", "_^", "__", "_a", "_b", "_c",
"_d", "_e", "_f", "_g", "_h", "_i", "_j", "_k", "_l", "_m", "_n", "_o", "_p",
"_q", "_r", "_s", "_t", "_u", "_v", "_w", "_x", "_y", "_z", "_{", "_}", "_~",
$'_\177', $'_\001', $'_\002', $'_\003', $'_\004', $'_\005', $'_\006', $'_\a',
$'_\b', $'_\v', $'_\f', $'_\r', $'_\016', $'_\017', $'_\020', $'_\021',
$'_\022', $'_\023', $'_\024', $'_\025', $'_\026', $'_\027', $'_\030', $'_\031',
$'_\032', $'_\E', $'_\034', $'_\035', $'_\036', $'_\037', "_!", "_#", "_%",
"_*", "_+", "_,", "_-", "_.", "_/", "_0", "_1", "_2", "_3", "_4", "_5", "_6",
"_7", "_8", "_9", "_:", "_?",
Variable names (117 total): '_', 'Z', 'Y', 'X', 'W', 'V', 'U', 'T', 'S', 'R',
'Q', 'P', 'O', 'N', 'M', 'L', 'K', 'J', 'I', 'H', 'G', 'F', 'E', 'D', 'C', 'B',
'A', 'z', 'y', 'x', 'w', 'v', 'u', 't', 's', 'r', 'q', 'p', 'o', 'n', 'm', 'l',
'k', 'j', 'i', 'h', 'g', 'f', 'e', 'd', 'c', 'b', 'a', '_A', '_B', '_C', '_D',
'_E', '_F', '_G', '_H', '_I', '_J', '_K', '_L', '_M', '_N', '_O', '_P', '_Q',
'_R', '_S', '_T', '_U', '_V', '_W', '_X', '_Y', '_Z', '__', '_a', '_b', '_c',
'_d', '_e', '_f', '_g', '_h', '_i', '_j', '_k', '_l', '_m', '_n', '_o', '_p',
'_q', '_r', '_s', '_t', '_u', '_v', '_w', '_x', '_y', '_z', '_0', '_1', '_2',
'_3', '_4', '_5', '_6', '_7', '_8', '_9', '_=',
Question
None of the above, however, takes into account function names that match variable names, or, in Google's guide, the whole rule for variable names:
"Same as for function names."
Hence this question, and my constant re-consideration of approaches.
The following style has been working flawlessly for years, in all the cases I have ever worked with or am aware of.
Yet I am still curious whether this code style matches any standard you know, and what you think about it.
Do you know any common, modern enough environment (i.e. CentOS 7 and up) where it would break, and why?
Library Script:
    Lib_PubFuncName()
    Lib_pubVarName
    Lib__FuncName()
    Lib__varName
Main Script:
    _FuncName()
    _varName
Function:
    FuncName
    varName
    __argVarName
    ___refVarName
Code-style Explanation
The style follows this logic:
- Variable names (V) are in camelCase;
- Function names (F) are in PascalCase;
- Global F and V are prefixed with _ (underscore);
- Library "private" F and V are prefixed with an additional _;
- Library F and V are prefixed with the library name in PascalCase;
- Named function argument V (positional parameters) are prefixed with two _.
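The rules above can be seen together in one small sketch. The library name Greeting and every identifier in it are hypothetical, chosen only to show each kind of name once:

```shell
#! /usr/bin/env bash

# --- Library scope (hypothetical library "Greeting") ---
Greeting_defaultName='world';        # "Public" library variable.
Greeting__template='Hello, %s!';     # "Private" library variable.

# "Private" library function.
Greeting__Format() { printf "$Greeting__template" "$1"; };

# "Public" library function.
Greeting_Say()
{
    declare __name="${1:-$Greeting_defaultName}";   # Named positional argument.
    declare text;                                   # Plain local variable.

    text="$( Greeting__Format "$__name"; )" || return $?;
    printf '%s\n' "$text";
}

# --- Main script scope ---
_userName='Bash';                    # Global variable.

_Main() { Greeting_Say "$_userName"; Greeting_Say; };

_Main "$@";
# Prints:
# Hello, Bash!
# Hello, world!
```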
Capitalization
The common rationale for keeping everything lower-case is understandable too, of course: some CLI commands may involve names which are actually aliases, functions, or programs under the hood, and some find lower-case more convenient in terminals if the script is to be sourced explicitly. In the end, there is always alias, or explicitly renamed "wrapper functions" (depending on the case), if lower-case only is ever required.
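A wrapper function of that kind could be as small as the following sketch; Text_ToUpper stands in for some "public" library function, and both names are made up for illustration:

```shell
#! /usr/bin/env bash

# Stand-in for a "public" library function (hypothetical).
Text_ToUpper() { printf '%s\n' "${1^^}"; };

# Lower-case wrapper: forwards all arguments unchanged.
to_upper() { Text_ToUpper "$@"; };

to_upper 'snow';
# Prints: SNOW
```

A function is used here rather than an alias because aliases are not expanded in non-interactive shells unless the expand_aliases option is set.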
Yet, considering that in common enough Linux environments like Ubuntu, programs/commands which include or start with upper-case characters are rarely used but still exist, I chose capitalization over other options like additional characters (e.g. _), leaving those optional and mostly for the actual identifiers.
If we consider upper-case rare, then PascalCase also suggests custom logic (more scripting-oriented, I believe), leaving standard commands in lower-case.
In other words, having capitalization at one's disposal allows shorter names and still does not prohibit the use of _. This helps in more diverse cases without locking the developer in too much. For instance, automated name sections are possible too, like:
{Lib?}{_prefix?}_Func{suffix?}Name{_postfix?}
...which may express more details about the entity, yet still systematically.
"Private" and "Public"
By a "private" function or variable (identifier) I mean the general idea: a hint that it is for internal library functionality only. "Public" library variables may be used to modify the library's behavior, like its own default "verbosity", which you could set while the process is running without changing other environment variables.
File: :/lib/dialog.lib.sh:
# "Public"
Dialog_theme='light';
# "Private"
declare -A Dialog__colors=(
    ['light']='#FFF,#99C'
    ['dark']='#333,#77A'
);
Underscore
Using a single _ prefix for global-scope identifiers only, and none or (optionally) several inside functions, makes it more systematic, since most of the logic usually happens in functions (to keep it scoped, atomic, and organized) and the global scope comes first. The mandatory prefix in the global scope makes it harder to shadow a program/command with a custom name, which would otherwise require explicit escaping to reach the original (e.g. \test, which bypasses aliases, or command -- test, which also bypasses functions).
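A small sketch of the collision this prefix avoids, using the test builtin as the victim; the variable names are only for illustration:

```shell
#! /usr/bin/env bash

# An unprefixed function silently shadows the "test" builtin...
test() { printf 'custom test\n'; };

_shadowed="$( test -n 'x'; )";             # Runs the function, not the builtin.
command test -n 'x'; _builtinStatus=$?;    # "command" skips functions.
unset -f test;

# ...while a prefixed one cannot collide with it.
_Test() { printf 'custom _Test\n'; };

printf '%s | %s | %s\n' "$_shadowed" "$_builtinStatus" "$( _Test; )";
# Prints: custom test | 0 | custom _Test
```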
The "named function positional arguments" above are an example of a prefix with multiple _ in functions which expect positional arguments. These are prefixed with __ (two _ characters) for convenience only (e.g. __argVarName above).
Another example is variables passed by reference (i.e. via declare -n), which get a ___ prefix (three _ characters) when they are to be passed deeper, lowering the risk of being overwritten by accident (e.g. ___refVarName above):
File :/ref.sh:
#! /usr/bin/env bash
_Work() {
    declare __varName="$1";
    declare __a="$2";
    declare __b="$3";
    shift 3;
    declare -n ref="$__varName" || return $?;
    # Temporary local variable; it shadows a caller's variable named "value".
    declare value='';
    value+="work1:$(( (__a + __b) * 2 ))";
    value+=";work2:$(( (__a + __b) ** 2 ))";
    ref="$value";
}
_Main() {
    declare value;
    # "ref" resolves to the local "value" inside _Work, so this one stays unset.
    _Work value "$@";
    declare -p value;
    declare ___value;
    _Work ___value "$@";
    declare -p ___value;
}
_Main 3 4;
$ ./ref.sh;
declare -- value
declare -- ___value="work1:14;work2:49"
The rationale for the underscore positioning, i.e. Lib__FuncName() and not _Lib_FuncName for "private", is to keep the Lib name prefix as definite and strict as possible. This allows the main script to declare functions in a safer manner, since its own functions always carry the _ prefix:
# ...
declare -r _stepPrefix='_Step';
_buildName='awesome-build';
# From pre-sourced shell library "Dialog".
_Step_PromptBuildName() { declare ___name; Dialog_Prompt ___name; _buildName="${___name:-$_buildName}"; };
# From pre-sourced shell library "Notifications".
_Step_NotifyBuildStart() { Notifications_Send 'build-start'; };
_Step_NotifyBuildEnd() { Notifications_Send 'build-end'; };
_Step_PrepareBuild() { :; };
_Step_Build() { :; };
_Main()
{
    declare steps=(
        'PromptBuildName'
        'PrepareBuild'
        'NotifyBuildStart'
        'Build'
        'NotifyBuildEnd'
    );
    declare stepName; for stepName in "${steps[@]}";
    do
        printf $'Executing step \'%s\'.\n' "$stepName";
        # Resolves to e.g. "_Step_PrepareBuild".
        "${_stepPrefix}_${stepName}" \
            || return $?;
    done
}
_Main "$@";
Example
File :/lib/misc.lib.sh:
#! /bin/false
# (Private) Determine whether a string is a valid signed or unsigned integer.
Misc__IsValidInteger()
{
    # Signed.
    if [[ "$1" == '-s' ]];
    then
        [[ "$2" =~ ^(0|-?[1-9][0-9]*)$ ]];
        return;
    fi
    # Unsigned.
    [[ "$1" =~ ^(0|[1-9][0-9]*)$ ]];
}
# (Public) Repeat a string a definite number of times.
Misc_Repeat()
{
    declare __count="$1";
    shift;
    if ! Misc__IsValidInteger "$__count";
    then
        printf $'Invalid value of option \'count\': \'%s\'.\n' "$__count" >&2;
        return 2;
    fi
    declare i;
    for (( i = 0; i < __count; i++ ));
    do
        printf '%s' "$@";
    done
}
File :/snow.sh:
#! /usr/bin/env bash
declare _filepath; _filepath="$( readlink -en -- "${BASH_SOURCE[0]:-$0}"; )" || exit 99; readonly _filepath;
declare _dirpath; _dirpath="$( dirname -- "$_filepath"; )" || exit 99; readonly _dirpath;
declare -r _libsDirpath="${_dirpath}/lib";
\. "${_libsDirpath}/misc.lib.sh" || exit 98;
_Main()
{
    declare __count="${1:-2}";
    declare message; message="Let it snow$( Misc_Repeat "$__count" ', let it snow'; )..." || return $?;
    printf '%s\n' "$message";
}
_Main "$@";
$ ./snow.sh; echo $?;
Let it snow, let it snow, let it snow...
0
$ ./snow.sh -1; echo $?;
Invalid value of option 'count': '-1'.
2
$ ./snow.sh 1; echo $?;
Let it snow, let it snow...
0
Supplementary
Most of the code style is inspired by C#'s naming conventions, and while these projects, their features, and their purposes are quite massively... different, that convention is the most adequate I know of, and it takes into account various complex cases, including abbreviations. Or, as of 2024, the general TypeScript conventions may be relevant too. Both are the work of Anders Hejlsberg at Microsoft [2].
This is probably the over-9000th [3] time I am reconsidering it, having more than a decade of experience in Bash, and I am therefore still trying to clarify whether it is "too custom", or whether it will conflict with anything crucial in some common Linux environment I am not aware of.
This is for Bash v5+, and currently I don't have enough experience in Zsh and Ksh to know for sure.
References
-
Please could you provide as an accurate version of the code, and only the version you want reviewed. Whilst your question is interesting (to me) as is, providing only what you want reviewed is likely to be more well received and easier for answerers to answer. (Comment by Peilonrayz ♦, 2025-10-01 02:01:53 +00:00)