Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Public interface and backward compatibility #537

akinomyoga started this conversation in General
Discussion options

We are recommending that the completion for each specific command is preferred to be maintained along with the command itself. bash_completion provides helper functions which can be used from such external completions.

CONTRIBUTING.md [...] On the other hand, we do have a pretty nice test suite and a bunch of helper functions that you may find useful. And a whole slew of completions in one package. Our functions can be used from "external" completions as well, just make sure you test for their existence and/or fail gracefully if you intend your completion to be usable without having bash-completion installed.

However, one of the concerns that the writers of the external completion might have is the backward compatibility of helper functions in bash_completion.

#531 (comment) [...] All that said, I think the better approach is to pull the completion file into slackpkg, on the condition that you folks are willing to hold my hand later if modifications are needed due to changes in upstream bash-completion :-)

Question: Is there any borderline between public interface (i.e., stable helper functions which can be used from external completions) or private interface (i.e., the helper functions whose names might be changed or whose detailed behavior might be changed in the future without notice). Or are all the functions appearing in the main script bash_completion considered public interface?

If everything is considered public interface, I think it is harder to make changes to bash_completion keeping the backward compatibility. By designating public interface which can be used from external completions, I think we can more safely change/update/reorganize the implementation of bash_completion, etc., and also the external completion writers can easily look up useful helper functions from a shorter list and don't have to care about the version differences of bash_completion too much.

You must be logged in to vote

Replies: 6 comments 31 replies

Comment options

I share the concern, this is kind of a problematic area. Previously there has been no notion of a public interface, and as we do encourage upstream completions to use our helpers, it's fair for upstreams to have assumed that it's ok to use anything we provide.

I don't have a good answer for what would make something public and something not. Pretty much the only reason I wouldn't like everything to be public is the ability to make incompatible changes. I guess the best approach I can give off the cuff would be to not make those at all, but to introduce incompatibilities in new functions. That'll obviously build up cruft over time. We can deprecate things and add warnings to fight that and eventually just remove old stuff, but that's something we haven't been too good at. Added #542 related to that.

On another semi-related note, I've started applying _comp_ prefix to all our functions, see 822d113.

You must be logged in to vote
15 replies
Comment options

akinomyoga Jan 4, 2022
Collaborator Author

Re subshell: I think in the vast majority of cases a subshell with compgen is what return arrays would be eventually fed into anyway. We might save a few here and there by having the callers do it, but I don't think there will be a significant difference at the end of the day. Anyway, tempfiles are not an option.

OK. I see. I think it actually depends on how these functions will be used. If there are not so many functions that need to emit the payloads, and the number of forks introduced by the change is limited, the stdout approach can be an acceptable option. I was just afraid of the increased number of forks.

It's not that different from namerefs

Maybe you have implied this by italicizing "that", but namerefs do not solve the problem of name conflicts that _upvars solves.

I was thinking that we could actually use the COMPREPLY array and always append to it, as that's what the functions usually do. However that would be awkward when the functions actually want to do something different with the return values.

Hmm. Now I am wondering if they need to be consistent. I feel the function that generates the completions can just append to the COMPREPLY array, and the other functions can use another approach. This might be just my preference, but I still think some fixed variable name is efficient and manageable.

Actually, this problem is also what I have been thinking about for a long time for my project (ble.sh). In my project, performance is so important that I would like to reduce the cost of forks as much as possible. So I have been using the local variables to return values from functions. Initially, I tried to use the upvar trick or another way to accept arbitrary variable names, but it turned out to be a little bit slow, and also the code seems to be unnecessarily complicated (i.e., I don't want to write additional codes for every single function including small ones). So, now I use the fixed variable names. In particular, I basically use the fixed name ret everywhere in my project. It doesn't need to be ret as there is no particular reason for the choice, but a consistent variable name over the whole codebase is important for writing and managing the codes in my experience.

Maybe you have been already vaguely aware of it from the issues and PRs I made in the past, but I actually call the completion functions extracted from complete -p in ble.sh. In particular ble.sh calls completion functions for each keystroke, so it is also convenient for my project if bash-completion becomes more efficient.

Comment options

scop Jan 4, 2022
Maintainer

Having had an actual look at what kind of functions we actually have, I'm agreeing with you re consistency. Things that generate completions can append to COMPREPLY (and they can assume $cur is set to something appropriate they do not need to process further before just using). For other things, functions returning a boolean can use the actual return value as appropriate. For others, ret is a fine name, I'm fine with adopting it.

Regarding the ret cases, are you differentiating between functions that "return" a single value and ones that do multiple by populating an array? We could probably get away with not differentiating and always declare ret as an array -- functions that emit just one value can just do ret=something which populates the first element (or be explicit about it by assigning to the first element). That'd leave other values in indexes > 0 intact though, callers would have to be careful with that or empty ret before calling functions when there are more than one to be called. Or we could take the approach that functions are supposed to populate ret entirely, not append nor assign to first.

I'm only vaguely aware of ble.sh -- should definitely have a look sometime.

Comment options

akinomyoga Jan 6, 2022
Collaborator Author

Things that generate completions can append to COMPREPLY (and they can assume $cur is set to something appropriate they do not need to process further before just using). For other things, functions returning a boolean can use the actual return value as appropriate.

I agree with it.

For others, ret is a fine name, I'm fine with adopting it.

I have promoted the fixed varname approach, but I think it's fair to explain the downside of this approach before making the final decision.

(a) Common fixed-name approach

With the fixed variable name approach, we need to copy the resulting value to another variable before calling another function setting ret in some cases. For example, this is not the problem for the following case:

# stdout approach
echo "$(func_print_z $(func_print_y $(func_print_x)))"
# common fixed-name approach
local ret
func_get_x
func_get_y "$ret"
func_get_z "$ret"
echo "$ret"

However, for the following case, we need to save the result of each function call, which some people might not like.:

# stdout approach
echo "$(func_print_x),$(func_print_y),$(func_print_z)"
# common fixed-name
local ret
func_get_x; local x=$ret
func_get_y; local y=$ret
func_get_z; local z=$ret
echo "$x,$y,$z"

(b) Fixed variable name for each function

Some people might like this approach more because we don't have to copy the return values.

# func_get_{x,y,z} assign their return values directly to x,y,z respectively.
local x y z
func_get_x
func_get_y
func_get_z
echo "$x,$y,$z"

Nevertheless, we need to copy the return values when we call the same function multiple times.

local x
func_get_x 1; local x1=$x
func_get_x 2; local x2=$x
func_get_x 3; local x3=$x
echo "$x1,$x2,$x3"

The problem with this approach, IMO, is that the user needs to remember the variable name that each function assigns its return value. It is probably easy for the functions of the name get_xxx but not all the function returning values have that form of names.

Another bonus of this approach is that we can make it more consistent with the functions that return values in multiple variables. For example,

local x y z
func_get_x_and_y
func_get_z
echo "$x,$y,$z"

(c) The variable of the same name as the function

Another approach is to assign the return value to the variable of the same name as the function.

local func_get_x; func_get_x
local func_get_y; func_get_y
local func_get_z; func_get_z
echo "$func_get_x,$func_get_y,$func_get_z"

Maybe this is the most conservative solution. This approach cannot be used when we allow the non-POSIX compliant function names containing [^[:alnum:]_] characters (which is not the case in bash-completion though). Also, the function names can be very long. With this approach, we need to declare the variables of long names for every call of a function returning values, and we also need to write the long name when we reference the resulting values, which I didn't like.

Disclaimer for ble.sh

Actually, ble.sh is not perfectly consistent but combines several approaches. I have written that ble.sh basically uses the approach (a) with the name ret, but

  • The functions that modify the existing values (including the ones that append values to arrays) receive the variable name in its argument.
  • The functions that end with get-xxx assign its return value to the variable xxx as described in approach (b). A part of such functions optionally receives the option -v varname, but I'd like to drop the support for this option in the future.
  • Some of the functions of the form namespace/Class#Method receive the variable name as the first argument. Actually, this depends on Class. There is no consistent rule for now.
  • There are many functions that touch multiple local/global variables (like _init_completion of bash-completion).

Regarding the ret cases, are you differentiating between functions that "return" a single value and ones that do multiple by populating an array?

No. That's a good question which I thought I had to explain after the previous reply. In ble.sh, there is no differentiation of functions returning scalar/array through ret. The function returning scalar values sets the value by ret=..., and the function for arrays sets values by ret=(...).

We could probably get away with not differentiating and always declare ret as an array -- functions that emit just one value can just do ret=something which populates the first element (or be explicit about it by assigning to the first element). That'd leave other values in indexes > 0 intact though, callers would have to be careful with that or empty ret before calling functions when there are more than one to be called.

Yes, in ble.sh, the caller is supposed to know whether the callee function returns a scalar or an array, and the caller is supposed to properly use the resulting ret as a scalar or an array. I don't empty ret before calling the function for scalars.

Another thing that might be related to scalar/array return values is the declaration of ret. Since the variable ret is reused for multiple calls of different types of functions, the variable is usually just declared as local ret in ble.sh (without -a flag even when the variable is used only for array values). If there is any function returning an array, the variable becomes an array when the function assigns values as ret=(...).

Comment options

scop Jan 6, 2022
Maintainer

Thanks for the elaborate explanation. I don't think the downside of a) is that bad. I'm wondering if we could/should just use ret as the fixed name also for functions that modify the value(s). Is it different enough from the "returning" cases that it would warrant different, inconsistent behavior?

I didn't remember that the type of a variable changes between scalar/array just through assignment, thanks for pointing it out. So it doesn't matter much at all if we'd declare ret with -a or not. (Assigning array to something declared as an associative ones does cause problems, that I did remember.)

Comment options

akinomyoga Jan 6, 2022
Collaborator Author

I'm wondering if we could/should just use ret as the fixed name also for functions that modify the value(s). Is it different enough from the "returning" cases that it would warrant different, inconsistent behavior?

Actually, I don't have much experience with that usage, but I believe we can safely modify the existing value of ret from the function just as we assign a value to ret from the functions. In ble.sh, no function uses the existing values of ret set by the callers. I have never cared about it consciously, but the reason is probably that it doesn't seem intuitive to me. I think this is some kind of implicit rule for me not to be confused.

Also, I would always use the value of ret on the same line or the next line as the function call in the codebase of ble.sh. If I want to use the value later, I would give a name to the value, i.e., copy the value of ret to another variable even when there are no other function calls in between. This is also a rule for me not to make mistakes (but this is not strict).

Edit: Come to think of it, I'm probably using ret in a similar way as $?. But I think it's fine to extend the usage.

I didn't remember that the type of a variable changes between scalar/array just through assignment, thanks for pointing it out.

Oh, maybe I need to clarify that the type conversion is one-way: scalar -> array. Once the variable becomes an indexed array or an associative array, the type cannot be changed back to scalar or to a different type of array (unless they are unset). So, when ret is an array, ret=... does not make ret scalar but just assigns a value to the first element, i.e., equivalent to ret[0]=.... In this case, the problem arises when the caller wrongly assumes ret being an array and references ret as "${ret[@]}".

Comment options

akinomyoga
Sep 5, 2022
Collaborator Author

Functions returning results by setting local variables

Let me create a separate thread for returning results through variables. The original discussion started around #537 (reply in thread).

I summarize here my current understanding of the direction in the original discussion.

  • Completion functions that generate the completion candidates would append the result to the array COMPREPLY.
  • Predicate functions, i.e., the functions that return a boolean value, can just return it through the exit status 0 or 1.

For the other functions, we generally recommend using a common fixed name, ret, for the variable that is used to return values. Usually, the caller function declares the local variable ret (except when the caller function itself eventually returns results through ret), and the callee function can write results to ret.

  • The variable ret can be used for both a scalar and an array depending on the callee function. The function that returns multiple values sets values by ret=(values...). The function that returns a scalar sets a value by ret=value (which is equivalent to ret[0]=value if ret is an existing array) or ret=(value). This means that the functions returning a scalar might or might not keep the values of ret[i] where i > 0, but the caller is supposed to know whether the function returns a scalar or an array so that it can properly handle the result.
  • The variable ret is supposed to be reused for multiple calls of functions returning results by ret. The value that needs to be preserved should be assigned to another named variable before calling the next function returning results by ret.
  • Currently, the variable ret will not be used as associative arrays. This is because once a variable becomes an (indexed) array, the variable cannot be later turned into an associative array, and vice versa. This may cause a problem when the variable ret is reused. It is in principle possible to unset ret and reuse it later, but it requires careful handling.
  • The variable ret can also be used to modify existing values. Such a function is allowed to read the existing values of ret.
  • (... Anything else?)

edit:

We also have "exporter functions", namely, the functions that write variables on the specified variable name. The related discussion on the local variable names used in exporter functions is found in #539 (comment).

You must be logged in to vote
0 replies
Comment options

akinomyoga
Sep 5, 2022
Collaborator Author

Exceptions to ret?

This is actually a question that came to my mind in starting to work on the refactoring for the naming convention. I'm interested in how strictly we impose the rule of ret on the new functions and existing functions. Let me ask this question in a separate thread. After getting a conclusion, I'll add an item to #537 (comment) if necessary.

Basic utilities

The primary reason that I raise this discussion is related to the interface of _comp_expand_glob and _comp_split suggested in #803. According to the proposed rule of ret, these new functions are actually supposed to always write their result to the array ret instead of receiving the array name as is done now.

# Current usage
_comp_expand_glob files '*.txt'
_comp_split fields ',' "$input"
# With "ret" rule
_comp_expand_glob '*.txt'
files=("${ret[@]}")
_comp_split ',' "$input"
fields=("${ret[@]}")

I actually chose the former intentionally even though I recognize that it is different from the proposed rule. A part of the reason that I decided to break the rule intentionally is that these utilities are actually safer replacements of the basic shell feature of the following form:

files=(*.txt)
IFS=','; fields=($input)

Functions modifying existing arrays

Also, _comp_xfunc_ARRAY_filter in #739 gives me an insight (though I actually not intended to merge _comp_xfunc_ARRAY_filter in the master). This function _comp_xfunc_ARRAY_filter demonstrates the implementation of function that modifies existing array variables. Currently, it accepts an array name so as to be used as follows:

# Current implementation
_comp_xfunc_ARRAY_filter -mF arr hello

However, if would strictly follow the ret rule, we shall change the usage as follows:

# With "ret" rule
ret=("${arr[@]}")
_comp_xfunc_ARRAY_filter -mF hello
arr=("${ret[@]}")

which seems to be a bit distracting.

Another approach is to receive the content of the original array in the arguments, but it is not possible to pass other variable-length arguments with this approach. Also, this approach doesn't seem to be natural when the function just wants to append new elements after the existing elements.

Functions returning multiple named variables

Another example is _init_completion, which returns results by writing values to named variables cur, prev, words, cword, etc. I don't think we should force this function to store the results in a single array ret, like ret=("$cur" "$prev" "$cword" "${words[@]}").

Edit: Exporter functions

I think the exporter functions discussed in #539 (comment) and found in #731 are also a related example.

You must be logged in to vote
2 replies
Comment options

akinomyoga Sep 5, 2022
Collaborator Author

My current thinking is that we allow these exceptions. At the present stage, I think we do not have to make detailed rules in which cases we allow such exceptions, but just judge case by case. We can later decide the exception rules if a problem arises, or at least after we get a sufficient number of actual cases of maintenance. But maybe we can create a few rules from the beginning.

Comment options

scop Sep 12, 2022
Maintainer

I agree we don't have to be overly strict about this. My primary interest is in getting rid of unnecessary uses of stdout and the problems and hardship it inflicts.

Comment options

akinomyoga
Sep 5, 2022
Collaborator Author

Existing functions that return results through stdout

There are already utilities that return results through stdout. If we would be finally going to change them to use ret, the ongoing interface change for the naming convention is also the chance of changing them to use ret. We do not want to change our interface twice.

Then, the question is whether we should change the existing functions that output stdout? If yes, do we force to change all of such functions? Or do we discuss it for each such function?

I feel we would want to discuss each function in the main file bash_completion. For the functions defined and used in specific completions, I feel we do not have to change them soon.


In theory, we would also need to think about it also for the existing functions that return results through fixed variable names or specified variable names by _upvars, but I think these (such as _init_completion, _get_comp_words_by_ref, etc.) are all related to the basic part of bash-completion and we can just leave all of such functions as is.

You must be logged in to vote
8 replies
Comment options

scop Sep 11, 2022
Maintainer

I'd have a better feeling about being consistent about (not) using stdout, but I'm not insisting. Before the next release, we could leave the ones we're still thinking about to use their old names.

Anyway, _parse_help and _parse_usage are already doing the monitor and lastpipe toggling, so unless I'm missing something, that part of the problem seems already solved.

With regards to the output filtering part, I think there's no need to do it with sed/grep; we could rather modify callers to either do it by modifying the ret array contents directly before using it, or using compgen -X. The number of cases doing the output filtering seems low (this might not catch all cases though, but there shouldn't be many more), and they seem like ones that would be easily converted:

$ git grep -A 1 -En '^[^#]*_parse_(help|usage)[^|]+\|($|[^|])'
completions/aptitude:77: _parse_help - | tr '\n' ' ') "
completions/aptitude-78-
--
completions/dnssec-keygen:41: COMPREPLY=($(compgen -W '$(_parse_help "1ドル" | \
completions/dnssec-keygen-42- command sed -e "s/:\$//")' -- "$cur"))
--
completions/links:83: COMPREPLY=($(compgen -W '$(_parse_help "1ドル" |
completions/links-84- command grep -vF -- "->")' -- "$cur"))
--
completions/zopfli:16: '$(_parse_help "1ドル" -h | command sed -e "s/#$//")' -- "$cur"))
completions/zopfli-17- [[ ${COMPREPLY-} == --i ]] && compopt -o nospace
Comment options

scop Sep 11, 2022
Maintainer

Re backcompat stuff in a completions.d snippet: need to consider how this would be done cleanly in a run-from-git setup. Maybe just add a bash_completion.d dir to the project, and do similarly as we do in __load_completion to use one if it's next to the main bash_completion file.

BTW, both the existing and the hypothetical new loader would be better preferring the dirs found next to the bash_completion in use over system ones. This way a system installed bash-completion package would not interfere with a setup that runs off a git clone.

Comment options

scop Sep 30, 2022
Maintainer

Added a note about general conventions and out variables in #731

Comment options

akinomyoga Oct 2, 2022
Collaborator Author

Thank you! I have added a comment there.

Comment options

scop Jan 15, 2023
Maintainer

WIP branch for moving deprecated things in a separate file at #870

Comment options

akinomyoga
May 21, 2023
Collaborator Author

Switching ret => REPLY for the variable name to return a string

In the latest Bash devel commit e44e3d50, Chet implemented "nofork command substitution" ${ command; } and ${| command; }. The latter form expands to the content of the variable REPLY after executing command. This is expected to be combined with the function that returns a string in the local variable REPLY. This feature is going to be available in the next version of Bash (5.3 if the major version is not incremented).

Now I think we should switch from ret to REPLY for the variable name at some point. What do you think?

For reference, the original discussion that introduced ret in bash-completion has started around #537 (reply in thread).

You must be logged in to vote
3 replies
Comment options

scop May 22, 2023
Maintainer

If I understand the feature correctly, and even though it will likely take a long time until we can start make use of it, I suppose it makes sense.

I suggest we don't rush this, but rather check back on the bash status at the time we're nearing the 2.12 release, then go ahead with the change as one of the last things before the release if it's still that way in bash.

Comment options

scop May 22, 2023
Maintainer

We could draft a PR against api-and-naming.md already now and mark it as TODO in the 2.12 project, so we don't forget.

Comment options

akinomyoga May 22, 2023
Collaborator Author

Thank you. I agree that we wait to see how the design of Bash finally becomes. I think the variable name REPLY is stable because it is a feature copied from mksh. If something would be changed, the feature might be canceled or postponed.

I created a Draft PR #987 and put it into "To do".

Comment options

Hi!

While pondering about upstreaming the dpkg support, I ended up having to also ponder about the API bash-completion provides, and whether that can be used by external projects. First of all, I think the API cleanup, namespacing and general rework that has been going on is great. One problem though is that I'm not sure external projects can rely on being able to use those, because usually this is something that you install on the system, but do not want to have a hard dependency on, and that means it's hard to express versioned relationships (I mean for something like dpkg one could declare a versioned Breaks relationship to force a higher bash-completion version, but...).

This also brings the question that I think you have touched about when to obsolete/remove APIs, and for external projects that would seem tricky.

And finally there's also the tension between shipping the completion with bash-completion and getting all this global code improvements and being able to use the latest and greatest APIs, but having a slow release cycle disconnected from the changes in the upstream projects, or having the completion upstream in sync with the commands being completed, but not being able to benefit from these code cleanups and API improvements.

I'm tending towards the latter, but that also means that if every single project adopted the completion support upstream, as I think is the preference here, that would mean much of the great APIs provided might then not be usable by those external parties, which also seems like a pity. :/

You must be logged in to vote
3 replies
Comment options

akinomyoga Nov 22, 2023
Collaborator Author

This is just my personal opinion, but I believe external completions should use the older API until the new version of bash-completion replaces most of the distributed packages of bash-completion. I expect it will take several years. If you want it to be sure that the new API of the bash-completion is available in all the environments, it would probably take ten years to replace all the packages including those provided by LTS distributions. At which point external completion switches to the new API would be at the discretion of the author of the external completion, but I think one should wait for at least two or three years after the release of 2.12.

Comment options

scop Nov 27, 2023
Maintainer

I'm with @akinomyoga here, and coincidentally it has also passed my mind that it'll probably take a decade for the migration to fully complete. Some Debian tooling might be able to migrate to the new API at a quicker pace.

I hope we can make the release cycle here much, much faster once we get 2.12 out the door.

Comment options

akinomyoga Feb 12, 2025
Collaborator Author

I opened an issue at #1328.

I think we can close this GitHub discussion after processing #1328.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet

AltStyle によって変換されたページ (->オリジナル) /