Recently I've been having some trouble with the new TeaScript interpreter. The biggest problem is identifying whether or not a string contains any special characters.
Challenge
A special character is defined as a character with codepoint 160 to 255. You will be given an input which is a string of characters with codepoints 0 to 255, at most one of which is a special character. The input will consist of a prefix of zero or more characters, a quoted string, and a suffix of zero or more characters. If there is a special character in the quoted string you should output a truthy value, otherwise a falsey value.
Details
- The characters " and ' are considered quotes.
- Inside the quoted string, a backslash \ will be used to escape the following character. In the prefix and suffix, it has no special meaning.
- Quotes will always be balanced.
- There will only be one quoted string.
Examples
"Hello, World¡"
true
"Hello, World"¡
false
"Hello' Wo\"rld\\"¡
false
ab"cd\"ef\\gh\i\\"£
false
\"foo¡"
true
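The rules and examples above can be sketched as an ungolfed reference check. This is only an illustration, not part of the challenge: the function name `hasSpecialInQuotes` is hypothetical, and the sketch assumes the closing quote is the same character as the opening one and that "special" means code points 160 to 255 as stated in the challenge.

```javascript
// Hypothetical reference check (name and structure are assumptions).
function hasSpecialInQuotes(input) {
  const isQuote = (ch) => ch === '"' || ch === "'";
  const isSpecial = (ch) => {
    const code = ch.charCodeAt(0);
    return code >= 160 && code <= 255; // range taken from the challenge text
  };

  // Skip the prefix; a backslash has no special meaning here.
  let i = 0;
  while (i < input.length && !isQuote(input[i])) i++;
  if (i === input.length) return false; // no quoted string found

  const quote = input[i]; // assumed to also be the closing quote

  // Walk the quoted string, honouring backslash escapes.
  for (i++; i < input.length; i++) {
    let ch = input[i];
    if (ch === "\\") {
      i++; // the next character is escaped, so it cannot close the string
      if (i >= input.length) break;
      ch = input[i];
    } else if (ch === quote) {
      return false; // closing quote reached without seeing a special character
    }
    if (isSpecial(ch)) return true;
  }
  return false;
}

console.log(hasSpecialInQuotes('"Hello, World\xA1"')); // true
console.log(hasSpecialInQuotes(String.raw`ab"cd\"ef\\gh\i\\"` + "\xA3")); // false
```

The special characters are written as `\xA1` and `\xA3` escapes only to keep the snippet independent of the file encoding.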
2 Answers
Retina, 17 bytes (was 19)
Thanks to user81655 for saving 2 bytes.
Byte count uses ISO 8859-1.
['"].*[¡-ÿ].*['"]
Output is 0 or 1.
Explanation
Due to the assumptions of the challenge, the first ' or " will start the only string of the input and the last ' or " ends it. We also don't need to worry about them being the same because they are guaranteed to be the same anyway.
Therefore, the regex just tries to find a character with code point 161 to 255, inclusive, which is preceded by one quote and followed by another. There will always be either 0 or 1 match.
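For readers without Retina at hand, the same pattern can be tried as a JavaScript regular expression. This is a sketch under assumptions: the name `matchCount` is made up for illustration, the class `[\xA1-\xFF]` is the escaped spelling of the `[¡-ÿ]` range, and the input is assumed to be a single line because `.` does not match line terminators in a JavaScript regex.

```javascript
// Same pattern as the Retina answer, written with escapes for the range 161 to 255.
const pattern = /['"].*[\xA1-\xFF].*['"]/;

// Hypothetical wrapper mimicking the 0-or-1 output described above.
function matchCount(input) {
  return pattern.test(input) ? 1 : 0;
}

console.log(matchCount('"Hello, World\xA1"')); // 1
console.log(matchCount('"Hello, World"\xA1')); // 0
```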
- Won't this give a false positive for "abc"¡'? (I guess depending on how you read the OP, that bare single quote can never occur in an input, but technically there is only one set of quotes in this input.) – lynn, Dec 28, 2015 at 20:54
- @Mauris the spec says that quotes will always be balanced. – Martin Ender, Dec 28, 2015 at 22:56
- Here's another 17-byte solution: (['"]).*[¡-ÿ].*\1. It happens to be more practical. – Mama Fun Roll, Dec 30, 2015 at 20:21
- @ןnɟuɐɯɹɐןoɯ yeah, I considered that one, but figured it was overkill, given the assumptions of the challenge. ¯\_(ツ)_/¯ – Martin Ender, Dec 30, 2015 at 20:56
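The difference discussed in these comments can be checked with a small JavaScript sketch. The variable names are illustrative assumptions, and the input `"abc"¡'` falls outside the challenge's guarantees (its quotes are not balanced), so this only shows why the backreference variant is the more defensive of the two.

```javascript
// Pattern from the answer: any quote may open, any quote may close.
const loose = /['"].*[\xA1-\xFF].*['"]/;
// Variant from the comments: the closing quote must equal the opening one.
const strict = /(['"]).*[\xA1-\xFF].*\1/;

// The input from the first comment: "abc" followed by a special character and a bare '.
const tricky = '"abc"\xA1\'';

console.log(loose.test(tricky));  // true  (false positive)
console.log(strict.test(tricky)); // false
```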
Note: This can be done with a simple regular expression. s=>s.match`['"].*[¡-ÿ].*['"]` is 29 bytes in JavaScript, but it's more fun without regular expressions:
JavaScript (ES6), 82 bytes (was 84)
s=>[...s].map((c,i)=>q?i<s.lastIndexOf(q)&c>" "?r=1:s:c=="'"|c=='"'?q=c:0,q=r=0)|r
Explanation
Returns 1 for true and 0 for false. The " " in the code below is a U+00A0 NO-BREAK SPACE (code point 160).
s=>
  [...s].map((c,i)=>       // for each character c in the string
    q?
      i<s.lastIndexOf(q)   // if we are still inside the string
      &c>" "?r=1           // and c is a "unicode character", set the result to 1 (true)
      :s                   // returning s for false guarantees that the array returned by map
                           // will cast to NaN, which allows us to use |r instead of &&r
    :c=="'"|c=='"'?        // if we are starting a string
      q=c                  // set the end of string character
    :0,
    q=                     // q = end string character
    r=0                    // initialise r to 0 (false)
  )|r                      // return r
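The same idea can be written out without golfing. This is a sketch, not the answer's code: the name `hasSpecialUngolfed` is hypothetical, and `"\xA0"` is the escaped form of the NO-BREAK SPACE used in the comparison. Like the golfed version, it relies on the challenge's guarantee of a single quoted string, so the last occurrence of the opening quote character is taken to be the closing quote and backslashes need no special handling.

```javascript
// Hypothetical ungolfed version of the lastIndexOf approach.
function hasSpecialUngolfed(s) {
  const NBSP = "\xA0"; // code point 160

  // Find the first quote character, which opens the only string.
  let start = 0;
  while (start < s.length && s[start] !== "'" && s[start] !== '"') start++;
  if (start === s.length) return 0; // no quote at all

  // The last occurrence of the same quote character closes the string.
  const end = s.lastIndexOf(s[start]);

  // Any character strictly between the quotes with a code point above 160 counts.
  for (let i = start + 1; i < end; i++) {
    if (s[i] > NBSP) return 1;
  }
  return 0;
}

console.log(hasSpecialUngolfed(String.raw`ab"cd\"ef\\gh\i\\"` + "\xA3")); // 0
console.log(hasSpecialUngolfed(String.raw`\"foo` + '\xA1"')); // 1
```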
Test
var solution = s=>[...s].map((c,i)=>q?i<s.lastIndexOf(q)&c>" "?r=1:s:c=="'"|c=='"'?q=c:0,q=r=0)|r
<input type="text" id="input" value='ab"cd\"ef\\gh\i\\"£' />
<button onclick="result.textContent=solution(input.value)">Go</button>
<pre id="result"></pre>
- Does it handle the backslash to escape quotes? – edc65, Dec 28, 2015 at 10:01
- What do you mean? You could test it using the test snippet. – user81655, Dec 28, 2015 at 10:05
- Right. It does in fact. – edc65, Dec 28, 2015 at 10:14
- Oh, your regex is even shorter than my two-stage Retina solution. Do you mind if I use it? – Martin Ender, Dec 28, 2015 at 14:45
- @MartinBüttner Go ahead. It's pretty much the same anyway. – user81655, Dec 28, 2015 at 15:12
' strings and multiple strings in a single test case (ideally with the Unicode character between them).