In a string or not?

Question 1

Recently I've been having some trouble with the new TeaScript interpreter. The biggest problem is identifying whether or not a string contains any special characters.

Challenge

A special character is defined as a character with codepoint 160 to 255. You will be given an input which is a string of characters with codepoints 0 to 255, at most one of which is a special character. The input will consist of a prefix of zero or more characters, a quoted string, and a suffix of zero or more characters. If there is a special character in the quoted string you should output a truthy value, otherwise a falsey value.

Details

The characters "' are considered quotes.
Inside the quoted string, a backslash \ will be used to escape the following character. In the prefix and suffix, it has no special meaning.
Quotes will always be balanced.
There will only be one quoted string.

Examples

"Hello, World¡"
true
"Hello, World"¡
false
"Hello' Wo\"rld\\"¡
false
ab"cd\"ef\\gh\i\\"£
false
\"foo¡"
true
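As a non-golfed reference point, the rules above can be sketched as a small scanner. This is only one reading of the spec: the name `hasSpecialInString` is made up for illustration, and treating an escaped special character as still being inside the string is an assumption, since the challenge text does not settle that case.

```javascript
// Reference sketch (not golfed). Returns true if a special character
// (code point 160..255, per the challenge text) lies inside the quoted string.
function hasSpecialInString(input) {
  let i = 0;
  // Skip the prefix: backslashes have no special meaning here.
  while (i < input.length && input[i] !== '"' && input[i] !== "'") i++;
  if (i === input.length) return false; // no quoted string (outside the spec)
  const quote = input[i++];
  for (; i < input.length; i++) {
    let ch = input[i];
    if (ch === "\\") {
      ch = input[++i]; // the escaped character is taken literally
      if (ch === undefined) break; // dangling backslash (outside the spec)
    } else if (ch === quote) {
      return false; // closing quote reached without a special character
    }
    const code = ch.charCodeAt(0);
    // Assumption: an escaped special character still counts as "in the string".
    if (code >= 160 && code <= 255) return true;
  }
  return false;
}

console.log(hasSpecialInString('"Hello, World\u00a1"')); // true
console.log(hasSpecialInString('\\"foo\u00a1"'));        // true (the prefix is a lone backslash)
console.log(hasSpecialInString('"Hello, World"\u00a1')); // false
```

Unlike the golfed answers, this version tracks escapes explicitly, so it does not rely on the guarantee that the input contains exactly one quoted string.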

Question 2

This could use a test case where the Unicode character is escaped.

Question 3

Also test cases that actually use ' strings and multiple strings in a single test case (ideally with the Unicode character between them).

Question 4

@MartinBüttner One of the rules is that "There will only be one set of quotes" but +1 for the other test case ideas.

Question 5

@user81655 oh right, I overlooked that. That simplifies things.

Question 6

"Because there are only 1,114,112 characters in unicode, your code will need to be as short as possible" ................ I have no words for your golfing justifications.

Question 7

Retina, 17 bytes (previously 19)

Thanks to user81655 for saving 2 bytes.

Byte count uses ISO 8859-1.

['"].*[¡-ÿ].*['"]

Output is 0 or 1.

Try it online.

Explanation

Due to the assumptions of the challenge, the first ' or " starts the only string in the input and the last ' or " ends it. We also don't need to check that the two quotes match, because they are guaranteed to be the same.

Therefore, the regex just tries to find a character with code point 161 to 255, inclusive, which is preceded by one quote and followed by another. There will always be either 0 or 1 match.
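To see the greedy match at work outside Retina, the same pattern can be tried in JavaScript. This is a rough sketch: `countMatch` is a hypothetical helper name, and the character class is written with `\u` escapes for U+00A1 to U+00FF instead of the literal characters.

```javascript
// Same pattern as the Retina answer: a quote, anything, a special character,
// anything, and another quote.
const insideQuotes = /['"].*[\u00a1-\u00ff].*['"]/;

// Hypothetical helper: returns 1 or 0, mirroring the Retina program's output.
function countMatch(input) {
  return insideQuotes.test(input) ? 1 : 0;
}

console.log(countMatch('"Hello, World\u00a1"')); // 1: the special character sits between the quotes
console.log(countMatch('"Hello, World"\u00a1')); // 0: no quote follows the special character
console.log(countMatch('\\"foo\u00a1"'));        // 1: the leading backslash belongs to the prefix
```

Escaped quotes inside the string need no special handling here, because the pattern only cares that some quote appears before the special character and some quote appears after it.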

Question 8

Won't this give a false positive for "abc"¡'? (I guess depending on how you read the OP, that bare single quote can never occur in an input, but technically there is only one set of quotes in this input.)

Question 9

@Mauris the spec says that quotes will always be balanced.

Question 10

Here's another 17-byte solution: (['"]).*[¡-ÿ].*\1. It happens to be more practical.
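The difference between the two 17-byte patterns only shows up on input that breaks the challenge's guarantees, such as the "abc"¡' string raised in the earlier comment. A small JavaScript sketch of that comparison (the variable names are illustrative, and the character class uses `\u` escapes):

```javascript
// Pattern from the answer: any quote may open and any quote may close.
const plain = /['"].*[\u00a1-\u00ff].*['"]/;
// Backreference variant: the closing quote must equal the opening one.
const matched = /(['"]).*[\u00a1-\u00ff].*\1/;

const outOfSpec = '"abc"\u00a1\'';      // "abc"¡' (unbalanced, so not a valid input)
const inSpec = '"Hello, World\u00a1"';  // first example from the challenge

console.log(plain.test(outOfSpec));   // true: the trailing ' is accepted as a closing quote
console.log(matched.test(outOfSpec)); // false: no second " follows the special character
console.log(plain.test(inSpec), matched.test(inSpec)); // true true
```

On every input the challenge allows, both patterns should agree, which is why the shorter reasoning in the answer is sufficient.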

Question 11

@ןnɟuɐɯɹɐןoɯ yeah, I considered that one, but figured it was overkill, given the assumptions of the challenge. ¯\_(ツ)_/¯

Question 12

Note: This can be done with a simple regular expression. s=>s.match`['"].*[¡-ÿ].*['"]` is 29 bytes in JavaScript, but it's more fun without regular expressions:

JavaScript (ES6), 82 bytes (previously 84)

s=>[...s].map((c,i)=>q?i<s.lastIndexOf(q)&c>" "?r=1:s:c=="'"|c=='"'?q=c:0,q=r=0)|r

Explanation

Returns 1 for true and 0 for false. The " " in the code below is a U+00A0 NO-BREAK SPACE (code point 160).

s=>
  [...s].map((c,i)=>       // for each character c at index i in the input
    q?                     // if a string has already started
      i<s.lastIndexOf(q)   //   and we are still before its closing quote
      &c>" "?r=1           //   and c is above U+00A0, set the result to 1 (true)
      :s                   //   otherwise return s, so that the array returned by map
                           //   casts to NaN, which lets us use |r instead of &&r
    :c=="'"|c=='"'?        // else if c is a quote, the string starts here
      q=c                  //   remember the quote character that will end it
    :0,
    q=                     // q = end-of-string quote character (0 until a string starts)
    r=0                    // r = result, initialised to 0 (false)
  )|r                      // NaN|r evaluates to r, so this returns r
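The same idea reads more easily ungolfed. The following is a sketch rather than the author's code: `inString` and the variable names are invented here, and `"\u00a0"` stands in for the literal no-break space.

```javascript
// Ungolfed sketch of the approach: the first quote character opens the string,
// and the last occurrence of that same character closes it.
function inString(s) {
  let quote = null;   // quote character of the string, once it has started
  let result = false;
  [...s].forEach((c, i) => {
    if (quote) {
      // Still inside the string, and c is above U+00A0 (so U+00A1..U+00FF here).
      if (i < s.lastIndexOf(quote) && c > "\u00a0") result = true;
    } else if (c === "'" || c === '"') {
      quote = c;
    }
  });
  return result;
}

console.log(inString('ab"cd\\"ef\\\\gh\\i\\\\"\u00a3')); // false: the £ comes after the closing quote
console.log(inString('"Hello, World\u00a1"'));           // true
```

Because the closing quote is found with `lastIndexOf`, escaped quotes inside the string never need to be parsed: they always come before the final, unescaped one.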

Test

var solution = s=>[...s].map((c,i)=>q?i<s.lastIndexOf(q)&c>" "?r=1:s:c=="'"|c=='"'?q=c:0,q=r=0)|r

<input type="text" id="input" value='ab"cd\"ef\\gh\i\\"£' />
<button onclick="result.textContent=solution(input.value)">Go</button>
<pre id="result"></pre>

Question 13

Does it handle the backslash used to escape quotes?

Question 14

What do you mean? You could test it using the test snippet.

Question 15

Right. It does, in fact.

Question 16

Oh, your regex is even shorter than my two-stage Retina solution. Do you mind if I use it?

Question 17

@MartinBüttner Go ahead. It's pretty much the same anyway.

Martin Ender · Accepted Answer (Retina, 17 bytes) · 2015-12-28 09:37:58Z


edited Dec 28, 2015 at 15:14

answered Dec 28, 2015 at 9:37


Won't this give a false positive for "abc"¡'? (I guess depending on how you read the OP, that bare single quote can never occur in an input, but technically there is only one set of quotes in this input.) – lynn, Dec 28, 2015 at 20:54

@Mauris the spec says that quotes will always be balanced. – Martin Ender, Dec 28, 2015 at 22:56

Here's another 17-byte solution: (['"]).*[¡-ÿ].*\1. It happens to be more practical. – Mama Fun Roll, Dec 30, 2015 at 20:21

@ןnɟuɐɯɹɐןoɯ yeah, I considered that one, but figured it was overkill, given the assumptions of the challenge. ¯\_(ツ)_/¯ – Martin Ender, Dec 30, 2015 at 20:56
