The absolute value of a number \$x\$ is normally written as \$|x|\$. The left and right side of the absolute value uses the same symbol, so it is not immediately obvious how to parse nested absolute values e.g. \$||1-2|+|3-|4-5|||\$
Your goal is to parse such an expression containing nested absolute values:
The expression will be given as a string of characters.
For simplicity the expression will only contain single-digit numbers (or letters if that is easier in your language),
the operators +
and -
(you can use any two distinct characters to represent these operations), and the symbol |
for the left and right side of an absolute value.
You do not need to handle the case where a number is directly adjacent to an absolute value (e.g. 2|3|
or |2|3
)
Your output should be the same expression in a form that allows you to determine how the absolute values are bracketed.
The output has to satisfy the following rules:
- The expression within an absolute value must not end with an operator (
+
or-
) - The expression within an absolute value cannot be empty
- Each
|
has to be part of exactly one absolute value
You may assume there is a valid way to parse the given input.
Examples:
|2| -> (2)
|2|+|3| -> (2)+(3)
||2|| -> ((2))
||2|-|3|| -> ((2)-(3))
|-|-2+3|| -> (-(-2+3))
|-|-2+3|+|4|-5| -> (-(-2+3)+(4)-5)
|-|-2+|-3|+4|-5| -> (-(-2+(-3)+4)-5)
||1-2|+|3-|4-5||| -> ((1-2)+(3-(4-5)))
This is code-golf the shortest solution wins.
Optional additional requirement:
Also support expressions that allow a number direct before or after a bracket. If the result is not unique, you may return any valid solution.
test-cases (for optional requirement):
|2|3|4| -> (2(3)4)
|2|3|4| -> (2)3(4)
||3|4| -> ((3)4)
|2|3-|4| -> (2)3-(4)
|1+|2|3| -> (1+(2)3)
|1+2|3|| -> (1+2(3))
7 Answers 7
JavaScript (ES6), 52 bytes
f=s=>s>(s=s.replace(/\|(.*?[\d)])\|/,"(1ドル)"))?f(s):s
Method
At each iteration, we look for the first substring consisting of:
\|
a leading pipe.*?
the shortest possible string[\d)]
either a digit or a closing parenthesis\|
a trailing pipe
and replace the pipes with (
and )
respectively.
If a replacement occurs, the new string is less than the original string in lexicographical order, because the leading pipe is turned into an opening parenthesis. Hence the test s > (s = s.replace(...))
which triggers a recursive call if true.
NB: A safer way would be to use [^|]*
instead of .*?
. If you find a test case where .*?
fails, please let me know.
-
\$\begingroup\$ The pattern works wonderfully with JavaScript regexes, but cannot make it work the same way with RE2, possibly because of differences in empty string matching. \$\endgroup\$doubleunary– doubleunary2023年12月05日 14:22:26 +00:00Commented Dec 5, 2023 at 14:22
-
1\$\begingroup\$
[^|]*
wouldn't need a?
, would it? \$\endgroup\$Neil– Neil2023年12月05日 15:09:50 +00:00Commented Dec 5, 2023 at 15:09
Retina 0.8.2, (削除) 22 (削除ここまで) 16 bytes
T`|`)`\d\|+
\|
(
Try it online! Link includes test cases. Explanation:
T`|`)`\d\|+
Replace all |
s that appear after digits with )
s.
\|
(
Replace the remaining |
s with (
s.
-
-
\$\begingroup\$ @ovs I took your idea to its logical conclusion... \$\endgroup\$Neil– Neil2023年12月05日 14:58:10 +00:00Commented Dec 5, 2023 at 14:58
K (ngn/k), (削除) 27 (削除ここまで) 22 bytes
Based on Arnauld's answer, but without regex.
{`c$(84-0=\"0|"'x)!'x}
Any closing parenthesis in the output directly follows a digit or another closing parenthesis, and none of the opening ones do. This means we can replace all consecutive |
's following a digit by closing parentheses. The remaining |
's are replaced by opening ones.
K (ngn/k), 36 bytes
My first idea, tests all ways of replacing |
with parentheses using try/eval.
{c@*<.[.:;;`]','c:+`c$(83+!2&x)!''x}
It is not very easy to show this is correct, as (1+)
is a valid expression in K. However after toying around with this for quite some time, I'm convinced that you can't construct a valid input where this actually becomes a problem
Charcoal, 16 bytes
FS≡ι|§)(‹ψ0«ι≔ιψ
Try it online! Link is to verbose version of code. Explanation:
FS≡ι
Loop over the input characters.
|§)(‹ψ0
If the current characters is a |
then output a )
or (
depending on whether the last non-|
was less than 0
.
«ι≔ιψ
Otherwise output the character and save it as the last non-|
.
JavaScript (Node.js), 44 bytes
s=>s.replace(d=/./g,c=>c>{}?++d?')':'(':d=c)
Look last non-pipe character before each pipe. Converting pipe to )
only when last non-pipe is a digit, and converting to (
otherwise. Does not work with the optional additional requirement.
05AB1E, 13 bytes
ε'|Qižu®dèëy©
Port of @Neil's Charcoal answer, so make sure to upvote that answer as well!
Outputs as a list of characters.
Try it online or verify all test cases.
Explanation:
ε # Map over the characters of the (implicit) input:
'|Qi '# If the current character is a "|":
žu # Push builtin string "()<>[]{}"
® # Push variable `®` (which is -1 by default)
d # Check whether it's a non-negative (>=0) number
è # Use that check (0 or 1) to (0-based) index into string "()<>[]{}"
ë # Else:
y # Simply push the current character
© # And store it in variable `®` (without popping)
# (after which the mapped list of characters is output implicitly as result)
Haskell + hgl, 26 bytes
gkx"(/d(/|{)_)+|/|{(_|.)+"
This solution uses a "regex" to replace each of the |
s with either (
or )
.
A closing paren must be preceded by a valid statement, and valid statements can only end in either digits, or a closing paren. Meanwhile a open paren can never come after a digit or a closing paren, so this classification is complete.
We translate these rules to the "regex":
( )+ -- at least once
/d -- a digit
( )+ -- at least one of
/| -- a |
{)_ -- replaced with )
| -- or
/| -- a |
{(_ -- replaced with (
| -- or
. -- any character
Without "regex", 36 bytes
gkY$dgS<>sO(hh"|"")")++hh"|""("++hdS
This is the same idea as the algorithm above, but implemented using parser combinators.
This is considerably longer, as parser combinators tend to be at a disadvantage to the "regex" when it comes to challenges involving the specific behavior of specific characters.
36 bytes
rA"|""("<gkY(dgS<>sO(hh"|"")")++hdS)
This is another solution which is also the same length, but which uses a mix of parser and non-parser functions.
Reflection
Although these solutions don't exactly feel good, there is remarkably little I see in the way of improvement.
Regex
There's very little that can be done to improve the "regex" based solution. I have no plans to change the syntax of "regex" in the near future, and to be honest I don't even know what I would do.
- The only thing I can think of is to make a combined version of
gkx
andgkY
. This would pull the(...)+
out and save 3 bytes.
Non-regex
I have more thoughts here, but even with all my suggestions this doesn't come close to catching up with the regex.
- The time has probably come for a combined version of
<>
andsO
. hh"|"")"
is annoyingly long. There are two ways to solve this:- Simplest is that like other characters we could just add a shortcut to
hh"|"
. Of course next time we are going to want to do something likehh"@""*"
and this will have been completely useless, but if we keep adding these eventually we will get there. - A more principled solution would be to implement something like
xx
but taking a single string argument and unconsing it to get two arguments. e.g.xxh(x:xs)=xx x xs
. This would allow it to bexxh"|)"
which saves only a byte.
- Simplest is that like other characters we could just add a shortcut to
(2(+)3)
\$\endgroup\$|1+2|3||
be a valid test case for the optional requirement? \$\endgroup\$