There are two cases in which I do not understand why bash behaves as it behaves regarding syntax checking and asking for a newline.
Case 1
bash can execute the line ( (ls) | cat), but depending on where you hit return while entering it, it might work or not.
This works:
( (ls ) |⏎
cat)
This fails:
( (ls) ⏎
| cat)
with error
-bash: syntax error near unexpected token `|'
Is there a logic to why the second case does not work? Or is it just how bash works internally?
Case 2
Another thing I do not understand is that when you enter (ls) "⏎ bash asks for another line. Is there any way to finish this command without getting a syntax error? And if not, why does bash not print an error message for the syntax error right away?
You would want to think of this a bit in terms of "what the program does when it scans what I sent it by pressing enter". Some characters start a new "thing" (like the c in cat: that is clearly the beginning of some "word" the shell must understand), some characters just continue the existing "thing" (like the a continuing the c-started "thing"), and some characters end previously started "things" and put them on the pile of finished "things" (like the space after cat, making it clear that the "thing" cat cannot continue).
The "things" above would be called "tokens" in parser/lexer science.
Is there a logic to why the second case does not work? Or is it just how bash works internally?
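That's how shell syntax works, yes; having an opened, unmatched parenthesis doesn't start a "newline-absorbing" token, so a newline inside it still ends the current command. However, | is an operator that must be followed by another command, so the shell skips all whitespace and newlines after it while looking for that command.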
So, the "logic" is: that's how the shell is defined to work! Sorry.
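One way to see the difference without running anything is bash's -n option, which parses its input without executing it. The following is only a minimal sketch: the variable and function names are made up for illustration, and it assumes a bash binary is on the PATH.

```shell
# Form 1: the newline comes right after the pipe operator.
after_pipe='( (ls) |
cat )'

# Form 2: the newline comes before the pipe operator.
before_pipe='( (ls)
| cat )'

# check is a hypothetical helper: bash -n reads the commands on stdin
# and checks their syntax without executing them.
check() {
    if printf '%s\n' "$1" | bash -n 2>/dev/null; then
        echo "parses"
    else
        echo "syntax error"
    fi
}

after_pipe_result=$(check "$after_pipe")
before_pipe_result=$(check "$before_pipe")
echo "newline after |:  $after_pipe_result"
echo "newline before |: $before_pipe_result"
```

The first form should be reported as parsing and the second as a syntax error, matching what the interactive shell does.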
Another thing I do not understand is that when you enter (ls) "{return} bash asks for another line. Is there any way to finish this command without getting a syntax error?
Your " starts a double-quoted string, so until you type another ", every return you hit just becomes a line break inside that string.
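As a small illustration of a newline being absorbed into a quoted string, here is a sketch with made-up variable names. The newline between the quotes does not end the assignment; it becomes part of the value.

```shell
# The newline between the quotes is part of the string, not a command separator.
message="first line
second line"

# Count the lines in the value; grep -c '' counts every line it reads.
line_total=$(printf '%s\n' "$message" | grep -c '')
echo "lines in message: $line_total"
```

In an interactive shell, the second line of that assignment is what you would type at the continuation prompt.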
- "what the program does when it notices me typing something", well, I would rather say "when it scans what I sent it by pressing enter", since it doesn't really check all that while you're typing (but instead has the line editor as a separate component, allowing you to go back and write "dog" instead of "cat" after all). – ilkkachu, Dec 16, 2024 at 18:13
- @ilkkachu I'm on my phone; could you quickly fix my wording, please? – Marcus Müller, Dec 16, 2024 at 18:19
- If open brackets are not newline-absorbing, why does ( ( ls ) {return} ) work? – Bastian, Dec 16, 2024 at 18:32
- @Bastian ( (ls) ⏎ could be followed by another command, i.e. the newline terminates the command as usual (when there isn't something like a | at the end). So you could have ( (ls) ⏎ echo hello ⏎ ), similarly to ( (ls); echo hello ) (with or without the second semicolon or newline, in the case of ( ... )). It doesn't run the ls at the first newline, though, since it has to wait for the compound subshell to close, since there could be a redirection at the end, e.g. ( (ls) ⏎ ) > /dev/null. – ilkkachu, Dec 16, 2024 at 18:56
- Because (ls) echo hello is simply not valid shell input. (ls) ⏎ echo hello is the same as (ls);echo hello, which says "spawn a subshell, run ls in that, when done, execute echo hello". – Marcus Müller, Dec 16, 2024 at 19:43
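The point made in these comments can be checked with a short sketch. The function names below are hypothetical, and echo stands in for ls so that the output is predictable.

```shell
# A newline inside the outer subshell terminates the inner command,
# just like a semicolon would.
newline_form() {
    ( (echo first)
    echo second
    )
}

semicolon_form() {
    ( (echo first); echo second )
}

# A redirection after the closing parenthesis applies to the whole subshell,
# which is why nothing can run before the shell has seen that parenthesis.
redirected_form() {
    ( (echo hidden)
    ) > /dev/null
}

newline_output=$(newline_form)
semicolon_output=$(semicolon_form)
redirected_output=$(redirected_form)

echo "$newline_output"
echo "$semicolon_output"
echo "redirected output is empty: [$redirected_output]"
```

Both forms should print the same two lines, and the redirected form should print nothing.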
Here:
( (ls) |
cat )
# and
( (ls)
| cat )
the subshells are just a distraction. Consider the simpler case:
ls |
cat
# and
ls
| cat
The first is a pipeline with ls on the left-hand side, and cat on the right-hand side. The shell syntax allows a newline after the | so that the pipeline continues on the next line. (This can help in arranging a long pipeline more clearly, but I would guess it's also just always been done so. It's in no way ambiguous anyway: all that can come after the | is another command.)
The second is the command ls, followed by the broken pipeline | cat. I say broken pipeline, because the pipeline operator | requires a command on the left-hand side too. Here, there's nothing you can do to fix that syntax error, so e.g. an interactive shell complains about it immediately, without waiting for another line (even if you had it wrapped in ( )).
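To illustrate how a trailing | helps lay out a longer pipeline, here is a minimal sketch. The input words and the variable name are invented for the example; the tools are standard POSIX utilities.

```shell
# Each line ends with |, so the shell keeps reading the next line
# as the continuation of the same pipeline.
most_common=$(printf '%s\n' pear apple pear fig |
    sort |
    uniq -c |
    sort -rn |
    head -n 1 |
    awk '{print $2}')

echo "most common word: $most_common"
```

Moving any of those | characters to the start of the following line, without a backslash at the end of the previous one, would end the pipeline early and leave a | with nothing on its left.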
In
(ls) "
the shell doesn't get to the stage of actually checking the syntax of the command. Syntax-wise, the quoted string is a single item, and the shell has to wait for it to end before it can recognize the corresponding token and check the syntax. But yes, it doesn't seem like anything on the following line could make the command valid shell syntax, since you can't put a simple word after a subshell compound command.
(You could have something like "foo"abc'bar', with more than one quoted or unquoted part, in total producing only one "word" token, but you get the point.)
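Both observations can be tried out in a short sketch. The helper and variable names are hypothetical, and the syntax check assumes a bash binary on the PATH.

```shell
# Adjacent quoted and unquoted parts join into a single word:
# count_words simply reports how many arguments it received.
count_words() { echo "$#"; }
word_total=$(count_words "foo"abc'bar')
echo "words: $word_total"

# Closing the quote on the next line completes the token, but the result is
# still a plain word directly after a subshell, which the grammar rejects.
closed_quote='(ls) "
"'
if printf '%s\n' "$closed_quote" | bash -n 2>/dev/null; then
    closed_result="parses"
else
    closed_result="syntax error"
fi
echo "after closing the quote: $closed_result"
```

So the shell only reports the error once the quoted string is complete, because only then does it have a token to check.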
All good answers and comments above, but I'm surprised no one mentioned the "continuation character".
I agree that the ( ) sub-shells are a distraction. You can "fix" your broken code by adding one further character:
ls
| cat
Becomes
ls \
| cat
Your other version
ls |
cat
works because the shell sees the | char as a stand-alone token (as explained in MM's answer), and then allows it to be the last item on a line, but will wait for a second line before it finishes reading the input. (There is probably a better, more technical explanation, but this is the general idea.)
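Note that the \ char must be the last character on the line. If a space or tab follows it, the backslash escapes that character instead of the newline, so the line ends there and the | at the start of the next line is a syntax error again.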
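A minimal sketch of the backslash-newline form, using printf instead of ls so the result is predictable (the variable name is made up):

```shell
# Each backslash is the very last character on its line, so the shell removes
# the backslash-newline pair and sees one logical line: printf ... | sort | head -n 1
first_sorted=$(printf '%s\n' b a \
    | sort \
    | head -n 1)

echo "first after sorting: $first_sorted"
```

This style puts the | at the start of each line, which some people find easier to scan, at the cost of needing the trailing backslashes.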
shellcheck is a good tool for detecting and explaining syntax errors. In this case, it says:
paul: ~ $ shellcheck -s bash -
( (ls)
| cat )
In - line 1:
( (ls)
^-- SC1073 (error): Couldn't parse this explicit subshell. Fix to allow more checks.
In - line 2:
| cat )
^-- SC1072 (error): Expected ) closing the subshell. Fix any mentioned problems and try again.
^-- SC1133 (error): Unexpected start of line. If breaking lines, |/||/&& should be at the end of the previous one.
For more information:
https://www.shellcheck.net/wiki/SC1133 -- Unexpected start of line. If brea...
https://www.shellcheck.net/wiki/SC1072 -- Expected ) closing the subshell. ...
https://www.shellcheck.net/wiki/SC1073 -- Couldn't parse this explicit subs...
paul: ~ $
- The OP is using {return} to indicate that they pressed the return key on the keyboard. – Dec 16, 2024 at 16:43
- @terdon That's an interesting addition to the conventions of code block formatting. – Paul_Pedant, Dec 16, 2024 at 17:06