\$\begingroup\$

Marvelous as Bash is, it has no sufficient built-in namespace features that would scope functions or variables to a specific file or module, as you would find in PHP or JavaScript.

Conflicts in the global scope, which are normally hard to overcome in any frankly adequate way because of how common shells operate, are still my main concern. I have stumbled on them multiple times; they are rather rare, but they do happen, and sometimes only xtrace or even strace helps.

All in all, the main rationale is to make scripts safer, more systematic, and still convenient enough in production environments, without nesting functions inside functions just for the sake of workarounds.


Let's assume:

  1. Both libraries and the main script may source other libraries;
  2. Re-sourcing the same library anywhere is prohibited (there is logic that prevents it);
  3. Scripts in this case are to be executed (not sourced).
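
The guard from assumption 2 is not shown in this question. A minimal sketch of one possible approach, assuming a throwaway library with made-up names (`Demo__sourced`, `Demo__loadCount`, `Demo_Hello`) and a classic include guard at the top of the library file, might look like this:

```bash
#! /usr/bin/env bash

# Write a throwaway library with an include guard at the top.
# All 'Demo' names are hypothetical and exist only for this sketch.
_libPath="$( mktemp )" || exit 1;

cat > "$_libPath" << 'EOF'
# Guard: stop here if the library has been sourced already.
if [[ -n "${Demo__sourced+x}" ]]; then return 0; fi;
Demo__sourced=1;

Demo__loadCount=$(( ${Demo__loadCount:-0} + 1 ));
Demo_Hello() { printf 'hello\n'; };
EOF

\. "$_libPath";
\. "$_libPath";  # The second attempt returns early at the guard.
rm -f -- "$_libPath";

printf 'Library body executed %s time(s).\n' "$Demo__loadCount";
```

The guard lives in the library file itself rather than in a wrapper function, because sourcing from inside a function would make any plain `declare` in the library local to that function.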

There are conventions, more or less portable or non-POSIX, such as the possibly Bashism-like [1] Google code-style version with :: (double colon) for "packages" ("libraries", I assume). Some projects, like JSON.bash, use . (dot) instead of :: for library "exports".

The double colon (::) is indeed a popular convention, but it works for function names only. It is also known to cause some auto-completion errors (not very relevant in this case) and may carry different meanings (e.g. 1, 2, 3). These issues, and the fact that shells generally support the underscore (_) by default, made me choose the underscore over the double colon, as seen below.

For instance, here is a crude test of which characters from the ASCII 1-127 range are allowed in function and variable identifiers:

File: :/names.sh:

#! /usr/bin/env bash

_Main()
{
    export LC_ALL=C;
    printf '%s\n\n' "$BASH_VERSION";
    declare -A funcNames=();
    declare -A varNames=();

    # Try each ASCII character 1-127, appended to the optional prefix "$1", as an identifier.
    _T() {
        _N() { printf "\x$( printf '%x' "$1"; )"; };
        declare a; for a in {1..127};
        do
            declare n="${1}$( _N "$a"; )";
            if [[ "$( { eval "${n}() { printf 1; };" && "$n" && unset -f "$n"; } 2> '/dev/null'; )" == '1' ]]; then funcNames["$n"]=''; fi;
            if ( declare -- "$n"; ) 2> '/dev/null'; then varNames["$n"]=''; fi;
        done
    }

    _T;
    _T _;
    printf 'Function names (%s total): ' "${#funcNames[@]}";
    for n in "${!funcNames[@]}"; do printf '%s, ' "$( declare -p n | sed 's/^.*=//'; )"; done;
    printf '\n\nVariable names (%s total): ' "${#varNames[@]}";
    printf $'\'%s\', ' "${!varNames[@]}"; printf '\n';
    printf '\n';
}

_Main "$@";

$ ./names.sh | fold -sw 80;
5.2.15(1)-release
Function names (216 total): $'\037', $'\036', $'\035', $'\034', $'\E', $'\032',
$'\031', $'\030', $'\027', $'\026', $'\025', $'\024', $'\023', $'\022',
$'\021', $'\020', $'\017', $'\016', $'\r', $'\f', $'\v', $'\b', $'\a', $'\006',
$'\005', $'\004', $'\003', $'\002', $'\001', "?", ", ":", "9", "8", "7", "6",
"5", "4", "3", "2", "1", "0", "/", ".", ",", "+", "*", "_", "^", "]", "[", "Z",
"Y", "X", "W", "V", "U", "T", "S", "R", "Q", "P", "O", "N", "M", "L", "K", "J",
"I", "H", "G", "F", "E", "D", "C", "B", "A", "@", $'\177', "~", "z", "y", "x",
"w", "v", "u", "t", "s", "r", "q", "p", "o", "n", "m", "l", "k", "j", "i", "h",
"g", "f", "e", "d", "c", "b", "a", "_@", "_A", "_B", "_C", "_D", "_E", "_F",
"_G", "_H", "_I", "_J", "_K", "_L", "_M", "_N", "_O", "_P", "_Q", "_R", "_S",
"_T", "_U", "_V", "_W", "_X", "_Y", "_Z", "_]", "_^", "__", "_a", "_b", "_c",
"_d", "_e", "_f", "_g", "_h", "_i", "_j", "_k", "_l", "_m", "_n", "_o", "_p",
"_q", "_r", "_s", "_t", "_u", "_v", "_w", "_x", "_y", "_z", "_{", "_}", "_~",
$'_\177', $'_\001', $'_\002', $'_\003', $'_\004', $'_\005', $'_\006', $'_\a',
$'_\b', $'_\v', $'_\f', $'_\r', $'_\016', $'_\017', $'_\020', $'_\021',
$'_\022', $'_\023', $'_\024', $'_\025', $'_\026', $'_\027', $'_\030', $'_\031',
$'_\032', $'_\E', $'_\034', $'_\035', $'_\036', $'_\037', "_!", "_#", "_%",
"_*", "_+", "_,", "_-", "_.", "_/", "_0", "_1", "_2", "_3", "_4", "_5", "_6",
"_7", "_8", "_9", "_:", "_?",
Variable names (117 total): '_', 'Z', 'Y', 'X', 'W', 'V', 'U', 'T', 'S', 'R',
'Q', 'P', 'O', 'N', 'M', 'L', 'K', 'J', 'I', 'H', 'G', 'F', 'E', 'D', 'C', 'B',
'A', 'z', 'y', 'x', 'w', 'v', 'u', 't', 's', 'r', 'q', 'p', 'o', 'n', 'm', 'l',
'k', 'j', 'i', 'h', 'g', 'f', 'e', 'd', 'c', 'b', 'a', '_A', '_B', '_C', '_D',
'_E', '_F', '_G', '_H', '_I', '_J', '_K', '_L', '_M', '_N', '_O', '_P', '_Q',
'_R', '_S', '_T', '_U', '_V', '_W', '_X', '_Y', '_Z', '__', '_a', '_b', '_c',
'_d', '_e', '_f', '_g', '_h', '_i', '_j', '_k', '_l', '_m', '_n', '_o', '_p',
'_q', '_r', '_s', '_t', '_u', '_v', '_w', '_x', '_y', '_z', '_0', '_1', '_2',
'_3', '_4', '_5', '_6', '_7', '_8', '_9', '_=',

Question


All of the above, however, does not take into account cases where function names match variable names. Google's guide, for example, says only this about variable names:

Same as for function names.

Hence this question, and constant re-consideration of approaches.
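
To illustrate the overlap in question: Bash keeps functions and variables in separate namespaces, so a single name can be both at once. A small sketch (the name `Status` is made up for this illustration):

```bash
#! /usr/bin/env bash

# A function and a variable sharing one identifier (hypothetical name).
Status() { printf 'function\n'; };
Status='variable';

Status;                     # Calls the function.
printf '%s\n' "$Status";    # Expands the variable.

# 'unset' without options removes the variable first...
unset Status;
declare -p Status 2> '/dev/null' || printf 'variable is gone\n';

# ...while the function of the same name is still defined.
declare -F Status;
```

Nothing breaks here, but reading such code gets harder, which is why a convention that tells the two kinds apart by name alone is attractive.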

The following style has been working flawlessly, for years, in all the cases I have ever worked with or am aware of.

Yet I am still curious whether this code style matches any standard you know of, and what you think about it.
Do you know of any common, modern enough environment (i.e. CentOS 7 and up) where it would break, and why?


Library Script:
 Lib_PubFuncName()
 Lib_pubVarName
 Lib__FuncName()
 Lib__varName
Main Script:
 _FuncName()
 _varName
Function:
 FuncName
 varName
 __argVarName
 ___refVarName

Code-style Explanation

The style follows this logic:

  1. Variable names (V) are in camelCase;
  2. Function names (F) are in PascalCase;
  3. Global F and V are prefixed with _ (underscore);
  4. Library "private" F and V are prefixed with an additional _;
  5. Library F and V are prefixed with the library name in PascalCase;
  6. Named function argument V (positional parameters) are prefixed with two _.

Capitalization

The common rationale for keeping everything lower-case is understandable too, of course: some CLI commands may involve names that are actually aliases, functions, or programs under the hood, and some people find lower-case names more convenient in terminals when the script is to be sourced explicitly. In the end, there is always alias, or explicitly renamed "wrapper functions" (depending on the case), if lower-case-only names are ever required.

Yet, considering that in common enough Linux environments like Ubuntu, programs and commands whose names include or start with upper-case characters are rare (though they do exist), I chose capitalization over other options such as additional characters (e.g. _), which leaves those characters optional and available mostly for the identifiers themselves.
If we consider upper-case rare, then PascalCase also signals custom, more scripting-oriented logic, I believe, and leaves lower-case to the standard commands.

In other words, having capitalization at one's disposal allows shorter names and still does not prohibit the use of _. This supports more diverse cases without constraining the developer too much. For instance, automated name sections are possible too, like:

{Lib?}{_prefix?}_Func{suffix?}Name{_postfix?}

...which may express more details about an entity while staying systematic.

"Private" and "Public"

By a "private" function or variable (identifier) I mean the general notion: the name suggests it is for internal library functionality only. "Public" library variables may be used to modify the library's behavior, such as its own default "verbosity", which you could set while the process is running without changing other environment variables.

File: :/lib/dialog.lib.sh:

# "Public"
Dialog_theme='light';
# "Private"
declare -A Dialog__colors=(
 ['light']='#FFF,#99C'
 ['dark']='#333,#77A'
);
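
As a sketch of how such a "public" variable could steer the library, assume a hypothetical function `Dialog_PrintColors` (not part of the snippet above) that looks up the active theme in the "private" table:

```bash
#! /usr/bin/env bash

# "Public"
Dialog_theme='light';

# "Private"
declare -A Dialog__colors=(
    ['light']='#FFF,#99C'
    ['dark']='#333,#77A'
);

# Hypothetical "public" function: print the colors of the active theme.
Dialog_PrintColors()
{
    declare colors="${Dialog__colors["$Dialog_theme"]-}";

    if [[ -z "$colors" ]];
    then
        printf $'Unknown theme \'%s\'.\n' "$Dialog_theme" >&2;

        return 2;
    fi

    printf '%s\n' "$colors";
}

Dialog_PrintColors;     # Default theme.
Dialog_theme='dark';    # The caller changes the "public" variable...
Dialog_PrintColors;     # ...and the library follows.
```

The caller touches only `Dialog_theme`; the double underscore in `Dialog__colors` signals that the table is not meant to be edited from outside.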

Underscore

Using a single _ prefix for global scope identifiers only, and either no prefix or an optional multiple-_ prefix inside functions, makes the style more systematic, since most logic usually lives in functions (to keep it scoped, atomic, and organized) and the global scope comes first. The mandatory prefix in the global scope makes it harder to overwrite a program or command declaration with a custom name without explicit escaping (e.g. \test or command -- test).
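
A short sketch of the shadowing risk the prefix guards against, using the `test` builtin and a made-up function `_Test`:

```bash
#! /usr/bin/env bash

# An unprefixed global function silently shadows the 'test' builtin.
test() { printf 'custom test\n'; };

test -n 'x';                                        # Runs the function, not the builtin.
command test -n 'x' && printf 'builtin test\n';     # 'command' skips functions.

unset -f test;

# With the mandatory prefix (hypothetical name), the builtin is never touched.
_Test() { printf 'custom _Test\n'; };

_Test;
test -n 'x' && printf 'builtin test again\n';
```

Note that a leading backslash only suppresses alias expansion, so bypassing a function of the same name needs `command` (or `builtin`).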

The "named function positional arguments" above are an example of a multiple-_ prefix in functions that expect positional arguments. They are prefixed with __ (two _ characters) for convenience only (e.g. __argVarName above). Another example is variable references (i.e. declare -n), which get a ___ prefix (three _ characters) when they are to be passed deeper, lowering the risk of being overwritten by accident (e.g. ___refVarName above):

File :/ref.sh:

#! /usr/bin/env bash

_Work() {
    declare __varName="$1";
    declare __a="$2";
    declare __b="$3";
    shift 3;
    declare -n ref="$__varName" || return $?;

    # Temporary local variable.
    declare value='';
    value+="work1:$(( (__a + __b) * 2 ))";
    value+=";work2:$(( (__a + __b) ** 2 ))";
    ref="$value";
}

_Main() {
    declare value;
    _Work value "$@";
    declare -p value;
    declare ___value;
    _Work ___value "$@";
    declare -p ___value;
}

_Main 3 4;

$ ./ref.sh;
declare -- value
declare -- ___value="work1:14;work2:49"
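
The first call leaves `value` unset because the local `value` inside `_Work` shadows the caller's variable of the same name, so the reference resolves to the local one. A reduced sketch of the same effect (the names `_Set` and `_Demo` are made up for this illustration):

```bash
#! /usr/bin/env bash

# Hypothetical minimal reproduction of the name collision.
_Set()
{
    declare -n ref="$1";
    declare value='local';  # Same name as the caller's variable.
    ref='written';          # Resolves to the nearest 'value' in scope.
}

_Demo()
{
    declare value='untouched';
    _Set value;             # The write lands in the local 'value' of _Set.
    printf '%s\n' "$value";

    declare ___value='untouched';
    _Set ___value;          # No local of that name in _Set, so the caller's is written.
    printf '%s\n' "$___value";
}

_Demo;
```

This prints `untouched` and then `written`, which is the reason for giving reference targets a prefix that temporary locals never use.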

The rationale for the underscore position, i.e. Lib__FuncName() rather than _Lib_FuncName() for "private", is to keep the Lib name prefix as definite and strict as possible. This lets the main script declare its functions in a safer manner, as they always carry a prefix of their own:

# ...
declare -r _stepPrefix='_Step';
_buildName='awesome-build';

# From pre-sourced shell library "Dialog".
_Step_PromptBuildName() { declare ___name; Dialog_Prompt ___name; _buildName="${___name:-$_buildName}"; };

# From pre-sourced shell library "Notifications".
_Step_NotifyBuildStart() { Notifications_Send 'build-start'; };
_Step_NotifyBuildEnd() { Notifications_Send 'build-end'; };

_Step_PrepareBuild() { :; };
_Step_Build() { :; };

_Main()
{
    declare steps=(
        'PromptBuildName'
        'PrepareBuild'
        'NotifyBuildStart'
        'Build'
        'NotifyBuildEnd'
    );

    declare stepName; for stepName in "${steps[@]}";
    do
        printf $'Executing step \'%s\'.\n' "$stepName";

        # Resolves to e.g. "_Step_Build".
        "${_stepPrefix}_${stepName}" \
            || return $?;
    done
}

_Main "$@";
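
Because every step shares one strict prefix, the defined steps can also be enumerated by name. A sketch using the `compgen` builtin, with a hypothetical helper `_ListSteps` and stub functions standing in for the real steps:

```bash
#! /usr/bin/env bash

# Stub functions for the sketch; only two of them carry the step prefix.
_Step_Build() { :; };
_Step_PrepareBuild() { :; };
_Other() { :; };

# Hypothetical helper: list every function whose name starts with "_Step_",
# printed without the prefix and sorted for a stable order.
_ListSteps()
{
    declare name;

    while IFS= read -r name;
    do
        printf '%s\n' "${name#_Step_}";
    done < <( compgen -A function '_Step_' | sort; );
}

_ListSteps;
```

This only discovers names; the execution order would still come from an explicit list like `steps` above.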

Example


File :/lib/misc.lib.sh:

#! /bin/false

# (Private) Determine whether a string is a valid signed or unsigned integer.
Misc__IsValidInteger()
{
    # Signed.
    if [[ "$1" == '-s' ]];
    then
        [[ "$2" =~ ^(0|-?[1-9][0-9]*)$ ]];

        return;
    fi

    # Unsigned.
    [[ "$1" =~ ^(0|[1-9][0-9]*)$ ]];
}

# (Public) Repeat a string a definite number of times.
Misc_Repeat()
{
    declare __count="$1";
    shift;

    if ! Misc__IsValidInteger "$__count";
    then
        printf $'Invalid value of option \'count\': \'%s\'.\n' "$__count" >&2;

        return 2;
    fi

    declare i;

    for (( i = 0; i < __count; i++ ));
    do
        printf '%s' "$@";
    done
}

File :/snow.sh:

#! /usr/bin/env bash

declare _filepath; _filepath="$( readlink -en -- "${BASH_SOURCE[0]:-$0}"; )" || exit 99; readonly _filepath;
declare _dirpath; _dirpath="$( dirname -- "$_filepath"; )" || exit 99; readonly _dirpath;
declare -r _libsDirpath="${_dirpath}/lib";

\. "${_libsDirpath}/misc.lib.sh" || exit 98;

_Main()
{
    declare __count="${1:-2}";
    declare message; message="Let it snow$( Misc_Repeat "$__count" ', let it snow'; )..." || return $?;
    printf '%s\n' "$message";
}

_Main "$@";

$ ./snow.sh; echo $?;
Let it snow, let it snow, let it snow...
0
$ ./snow.sh -1; echo $?;
Invalid value of option 'count': '-1'.
2
$ ./snow.sh 1; echo $?;
Let it snow, let it snow...
0

Supplementary


Most of this code style is inspired by C#'s naming conventions. While the two projects, their features, and their purposes are quite massively... different, that convention is the most adequate one I know, and it takes into account various complex cases, including abbreviations. As of 2024, TypeScript's general conventions may be relevant too. Both are the work of Anders Hejlsberg at Microsoft [2].

This is probably the over-9000th [3] time I have reconsidered it, with more than a decade of experience in Bash, and I am still trying to clarify whether it is "too custom" or whether it will conflict with anything crucial in some common Linux environment I am not aware of.

This is for Bash v5+, and currently I don't have enough experience in Zsh and Ksh to know for sure.

References

[1] Bashism
[2] Why C# goes well with TypeScript
[3] Over9000!

asked Jul 28, 2024 at 22:22
\$\endgroup\$
  • Please could you provide as an accurate version of the code, and only the version you want reviewed. Whilst your question is interesting (to me) as is, providing only what you want reviewed is likely to be more well received and easier for answerers to answer. Commented 20 hours ago
