In most programming languages, the string Hello, World! can be represented as "Hello, World!". But if you want to represent "Hello, World!" you need to escape the double quotes with backslashes for "\"Hello, World!\"", and to represent that you also need to escape the backslashes, resulting in "\"\\\"Hello, World!\\\"\"".
Your challenge is to take a printable ASCII string that's been escaped multiple times (such as "\"\\\"Hello, World!\\\"\"") and find how many characters it contains when fully unescaped. Specifically, you should remove a single pair of enclosing " and replace \\ with \ and \" with ", repeating until there are no more enclosing " left.
You can assume that the string will be syntactically valid: at all stages, as long as the string starts and ends with ", all other backslashes and double quotes will be properly escaped, and only " and \ will be escaped. The input string will not be " at any level of escaping. If the string starts and ends with ", the last " cannot be escaped, so e.g. "abc\" won't occur.
This is code-golf, shortest wins!
Testcases
e -> 1
"hello" -> 5
"\"\\" -> 2
a""b"c -> 6
"c\d+e -> 6
"\"\\\"Hello, World!\\\"\"" -> 13
"c\\\"d\"" -> 5
"\"\"" -> 0
"r\"\"" -> 3
"\"hello\"+" -> 8
"\"\\\"\\\\\\\"\\\\\\\\\\\\\\\"Hello\\\\\\\\\\\\\\\"\\\\\\\"\\\"\"" -> 5
"\\\\\"" -> 3
[""] -> 4
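For reference, the rules above can be sketched as an ungolfed Python function. This is only one possible reading of the spec, not part of the challenge; the names is_quoted, unescape_once and unescaped_length are made up for illustration. It scans the inner text left to right so that each backslash consumes exactly the one character after it.

```python
def is_quoted(s):
    # A lone '"' does not count: two distinct enclosing quotes are needed.
    return len(s) >= 2 and s[0] == '"' and s[-1] == '"'


def unescape_once(s):
    # Drop the enclosing quotes, then scan left to right so that each
    # backslash consumes exactly the one character that follows it.
    inner = s[1:-1]
    out = []
    i = 0
    while i < len(inner):
        if inner[i] == "\\" and i + 1 < len(inner):
            i += 1  # skip the backslash, keep the escaped character
        out.append(inner[i])
        i += 1
    return "".join(out)


def unescaped_length(s):
    # Keep peeling levels of escaping until the string is no longer quoted.
    while is_quoted(s):
        s = unescape_once(s)
    return len(s)


# Several of the test cases above, written as raw strings.
CASES = {
    r'e': 1,
    r'"hello"': 5,
    r'"\"\\"': 2,
    r'a""b"c': 6,
    r'"c\d+e': 6,
    r'"\"\\\"Hello, World!\\\"\""': 13,
    r'"c\\\"d\""': 5,
    r'"\"\""': 0,
    r'"r\"\""': 3,
    r'"\"hello\"+"': 8,
    r'"\\\\\""': 3,
    r'[""]': 4,
}
for text, expected in CASES.items():
    assert unescaped_length(text) == expected, text
```

The length guard in is_quoted is there so that a single " is never treated as an enclosing pair, even though the challenge says that input won't occur.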
Jelly, 11 bytes
ŒV.ị=”"ȦƲ¿L
-2 bytes thanks to Jonathan Allan
Explanation
ŒV.ị=”"ȦƲ¿L  Main Link
  ------Ʋ    Link Grouping (see below)
         ¿   While
  .ị           the first and last (index at 0.5)
       Ȧ       are all
    =”"        equal to "
ŒV           Evaluate (Python)
          L  Length
The footer runs all tests and prints 1 for any correct answers. Ỵœṣ“ -> ”$€Ç=V}ɗ/€ means "split on newlines, split each string on substrings equal to " -> ", then for each pair, check that the result of the main link on the left side equals the right side evaluated (to convert to a number)".
For the link grouping, Ʋ combines four links into a monad, so you'd think we'd need to use $ to capture the fifth, but since ”"Ȧ would be an LCC (a chain that begins with a nilad and only has monads and dyad-nilad/nilad-dyad pairs after it, so a chain with a constant result), it doesn't count that as a link and therefore keeps going one extra time. This is an important tip for understanding the link-combining quicks: all of them do this, and it is easy to miscount or misunderstand the grouping because a quick may take more links than you expect.
ŒV.ị=⁾""Ʋ¿L would also work: instead of using =”"Ȧ to check that all are equal to ", we just check that the result is equal to ['"', '"']. This also makes the grouping a bit simpler because we don't have any potential LCCs, so it just takes four links.
- I’m clarifying, but I think this fails for the following input: ". It crashes rather than returning 1. – Nick Kennedy, Jan 3, 2024 at 8:45
- The OP has clarified that " won’t occur so you’re all good! – Nick Kennedy, Jan 3, 2024 at 11:05
- The code has 11 characters, but it is 19 bytes actually. – Kirill V. Lyadvinsky, Jan 11, 2024 at 14:40
- @KirillV.Lyadvinsky Jelly uses its own encoding so all 256 of the characters it uses occupy 1 byte each. – Jan 11, 2024 at 17:47
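Both numbers in this exchange can be reproduced with a short Python check. This is a sketch under one assumption: the program is spelled with U+201D as its character-literal marker, which is what the 19-byte figure implies. Python's standard library has no Jelly code page, so the snippet only shows the character count (which equals the Jelly byte count, per the reply above) and the UTF-8 size.

```python
# The 11-character program, spelled with escapes so the exact code points
# are unambiguous. Assumption: U+201D is the character-literal marker.
code = "\u0152V.\u1ecb=\u201d\"\u0226\u01b2\u00bfL"

chars = len(code)                       # one byte each in Jelly's code page
utf8_bytes = len(code.encode("utf-8"))  # size if saved as UTF-8 instead

print(chars, utf8_bytes)  # 11 19
```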
Retina 0.8.2, 47 bytes
+`(?=.*"$)(^"(?!$)|\\(.)|(?!^)"$)(?<=^".*)
$2
.
Try it online! Link includes test cases. Explanation:
(?=.*"$)
Ensure the string ends with a " before making any replacements.
(^"(?!$)|\\(.)|(?!^)"$)
Replace a leading ", an escaped character or a trailing ", but not a lone ".
(?<=^".*)
Ensure that the string starts with a " before making any replacements.
$2
Unescape the character.
+`
Repeat until the string can't be unquoted further.
.
Get the length of the final string.
If " is excluded as being an unsupported input, then ten bytes can be saved by removing (?!$) and (?!^).
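The same strip-and-unescape loop can be sketched in Python with a single regex substitution per level. This is a different formulation from the Retina program, not a translation of it: the quote checks are done with plain string tests instead of lookarounds, because Python's re module does not support variable-length lookbehind. The function name unescaped_length_re is hypothetical.

```python
import re


def unescaped_length_re(s):
    # Repeat while the string has two distinct enclosing quotes.
    while len(s) >= 2 and s.startswith('"') and s.endswith('"'):
        # Strip the quotes, then replace every backslash-plus-character pair
        # with the character alone (the job done by $2 in the Retina code).
        s = re.sub(r'\\(.)', r'\1', s[1:-1])
    return len(s)


print(unescaped_length_re(r'"\"\\\"Hello, World!\\\"\""'))  # 13
```

Because re.sub scans left to right without overlapping matches, a run such as \\\" is consumed as \\ followed by \", which is the pairing the challenge requires.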
Excel ms365, 138 bytes
Assuming input in A1:
=LET(x,LAMBDA(f,s,IF(s<"",s,f(f,IFNA(SUBSTITUTE(SUBSTITUTE(MID(s,XMATCH("""*""",s,2)+1,LEN(s)-2),"\\","\"),"\""",""""),LEN(s))))),x(x,A1))
It's a recursive LAMBDA that keeps calling itself until there are no more leading and trailing double quotes.
Google Spreadsheets, 107 bytes
Again, assuming input in A1, applying the same recursive logic with a few slight changes that make use of the regex functions:
=LET(x,LAMBDA(f,s,IF(REGEXMATCH(s,""".*"""),f(f,REGEXREPLACE(s,"^""|""$|\\(\\|"")","$1")),LEN(s))),x(x,A1))
- Use regexmatch(s,"^"".*""$") to cover all test cases? – doubleunary, Jan 3, 2024 at 21:08
Perl 5 -pl, 45 bytes
s/\\(\\|")/$1/g while s/^"(.*)"$/$1/;$_=y///c
Perl 5 -pl, 54 bytes
Handles the "\"foo\\\"" test case I proposed in the comments. If that's not a valid test case, then the shorter version above will suffice.
s/\\(\\|")/$1/g while s/^"(.*[^\\])"$/$1/;say;$_=y///c
- just 1 byte less Try it online! – Nahuel Fouilleul, Jan 4, 2024 at 13:07
- Your second version fails on "\"foo\\\\\"", which is a legal input for the first version. (Also it has a spare say; which you don't need.) – Neil, Jan 4, 2024 at 14:21
- Oh, and is there any reason not to use s/\\(.)/$1/g? – Neil, Jan 4, 2024 at 14:22
Python, 52 bytes (previously 51)
f=lambda s:f(eval(s))if'"'==s[-1:]==s[:1]else len(s)
Not complicated, but it works well.
Explanation:
Define a function f that takes an argument s:
f=lambda s:
If the string begins and ends with ", evaluate it as a Python string literal (which strips the enclosing quotes and resolves the escapes) and recurse on the result:
f(eval(s))if'"'==s[-1:]==s[:1]
Otherwise, output the length of the string:
len(s)
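An ungolfed sketch of the same idea is shown below. It swaps eval for ast.literal_eval, which is my substitution and not what the answer uses: literal_eval only accepts Python literals, so it cannot run arbitrary code. It relies on the challenge's guarantee that a quoted string contains no escapes other than \\ and \". The name unescaped_length_eval is hypothetical.

```python
import ast


def unescaped_length_eval(s):
    # Loop instead of recursing; the length guard skips a lone '"',
    # which is not a valid string literal.
    while len(s) >= 2 and s[0] == '"' == s[-1]:
        # Let Python's own literal parser strip the quotes and
        # resolve the \\ and \" escapes.
        s = ast.literal_eval(s)
    return len(s)


print(unescaped_length_eval(r'"\"\""'))  # 0
```

The golfed version compares the slices s[-1:] and s[:1] rather than indexing, so that the empty string (the result of the case above) does not raise an IndexError.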
- Reversing your condition order if'"'==s[0]==s[-1] saves a byte. – Neil, Jan 3, 2024 at 0:20
- Crash when result is 0 – l4m2, Jan 3, 2024 at 1:20
- I’m clarifying, but I think this fails for the following input: ". It crashes rather than returning 1. – Nick Kennedy, Jan 3, 2024 at 8:46
- @NickKennedy I'd say " is invalid input but "" is – l4m2, Jan 3, 2024 at 11:01
- The OP has clarified that " won’t occur so you’re all good! – Nick Kennedy, Jan 3, 2024 at 11:06
Charcoal, 33 bytes (previously 34)
W¬∨⌕θ"⌕⮌Φθλ"≔⪫−⪪✂θ1±1¦1\\¦\¦\θILθ
Try it online! Link is to verbose version of code. Explanation:
W¬∨⌕θ"⌕⮌Φθλ"
Repeat while the string starts with " and ends with a different "...
≔⪫−⪪✂θ1±1¦1\\¦\¦\θ
... remove those "s, split the string on \\, remove any remaining \s, then join the string on \, thus unquoting the string.
ILθ
Output the length of the final string.
(Yes I could use eval to save 14 bytes but that's boring.)
If " is excluded as being an unsupported input, then two bytes can be saved by replacing Φθλ with θ.
Python 3, 90 bytes (previously 91)
f=lambda s:f(s[1:-1].replace('\\"','"').replace(r'\\','\\'))if'"'==s[-1:]==s[0]else len(s)
Because eval is evil :P, with inspiration from Jakav.
- Welcome to Code Golf SE, nice first answer! I think .replace('\\"','"').replace(r'\\','\\') can be shortened to just .replace('\\','') maybe? Not sure though. – noodle person, Jan 6, 2024 at 21:33
- @noodleman that was what I was initially thinking, but unfortunately it breaks with something like "\\\\\"" – David_h, Jan 6, 2024 at 21:49
- Ah, right, that makes sense – noodle person, Jan 6, 2024 at 22:01
- if'"'==s[-1:]==s[0] – l4m2, Jan 8, 2024 at 1:08
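The failure mentioned in this thread is easy to reproduce. The sketch below runs the counterexample through both unescape steps, using a shared helper; the names length_with, naive and two_step are hypothetical and only serve the comparison.

```python
def length_with(unescape, s):
    # Shared loop: peel one pair of quotes at a time and apply the
    # given unescape step to the inner text.
    while len(s) >= 2 and s[0] == '"' == s[-1]:
        s = unescape(s[1:-1])
    return len(s)


def naive(inner):
    # Deletes every backslash, including ones that are themselves escaped.
    return inner.replace('\\', '')


def two_step(inner):
    # The order used in the answer: \" first, then \\.
    return inner.replace('\\"', '"').replace('\\\\', '\\')


tricky = r'"\\\\\""'
print(length_with(naive, tricky))     # 1
print(length_with(two_step, tricky))  # 3
```

The inner text holds two escaped backslashes followed by an escaped quote, so the correct result keeps two backslashes and the quote. Deleting all backslashes leaves only the quote.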
QBASIC, 302 bytes
Q$ = CHR$(34): LINE INPUT T$: DO WHILE LEFT$(T$, 1) + RIGHT$(T$, 1) = Q$ + Q$: U$ = MID$(T$, 2, LEN(T$) - 2): T$ = "": C$ = "": FOR J = 1 TO LEN(U$): C$ = C$ + MID$(U$, J, 1): B = -(C$ <> "\"): T$ = T$ + RIGHT$(C$, B * ((INSTR("\\" + Q$, C$) > 0) + 2)): C$ = LEFT$(C$, 1 - B): NEXT: LOOP: PRINT LEN(T$)
A bit of nostalgia...
Explanation:
'Set quote mark because QBASIC doesn't have string escaping
'then get string
Q$ = CHR$(34)
LINE INPUT T$
'Process the string while it starts and ends with quotes
DO WHILE LEFT$(T$, 1) + RIGHT$(T$, 1) = Q$ + Q$
    'Strip enclosing quotes, assign to a temporary variable,
    'prepare to rebuild the original string,
    'and prepare to track current character(s)
    U$ = MID$(T$, 2, LEN(T$) - 2)
    T$ = ""
    C$ = ""
    'Process string character by character.
    FOR J = 1 TO LEN(U$)
        'Add current character to current status
        C$ = C$ + MID$(U$, J, 1)
        'QBASIC boolean ops return -1 for true, 0 for false,
        'so set B to 0 if current character is an escape char, to 1 otherwise.
        B = -(C$ <> "\")
        'Add current character to string if not an escape char,
        'de-escaping if appropriate
        T$ = T$ + RIGHT$(C$, B * ((INSTR("\\" + Q$, C$) > 0) + 2))
        'Clear current character unless it's an escape char
        C$ = LEFT$(C$, 1 - B)
    NEXT
LOOP
PRINT LEN(T$)
05AB1E, 10 bytes
Δ¬'"Qi.E]g
Try it online or verify all test cases. (Note: in the single TIO, the input is wrapped within """-quotes to always have a string input.)
Explanation:
Δ           # Loop until the result no longer changes:
 ¬          #  Push its first character (without popping the string)
  '"Qi     '#  If it's a double quote:
      .E    #   Evaluate the string as Elixir code
        ]   # Close both the if-statement and changes-loop
         g  # Pop and push the length of the reduced string
            # (which is output implicitly as result)