6
\$\begingroup\$

Interpret a SqueezeL string

SqueezeL is a golfing language I'm developing. Its main distinguishing feature is its 40-character code page, which led me to create a semi-complicated encoding method for string literals. I think decoding it could make an interesting code golf challenge.

Input

Any string consisting only of spaces, quotes, parentheses, digits and lowercase letters.

Output

The string represented by the input. Here are the tokens that can appear in a string:

  • A space, digit, or lowercase letter, which represents itself.
  • A doubled parenthesis, which represents a single parenthesis. That is, (( represents ( and )) represents ).
  • )", which represents ".
  • A right parenthesis followed by a lowercase letter, which represents the uppercase version of that letter.
  • A "short base-36 code", which consists of a right parenthesis, followed by a digit, followed by an alphanumeric character. This can represent any character code point in [0, 359], in base 36. For example, a comma has the code point 44, which is 18 in base 36, so )18 represents a comma.
  • A "long base-36 code": a left parenthesis followed by four alphanumeric characters. Works the same way as a short base-36 code, but can represent any Unicode character.

You may assume the input won't contain any invalid tokens, such as (), (", or ) .
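
For reference, here is an ungolfed sketch of a decoder that follows the token list above one token at a time. It is not part of SqueezeL itself; the function name decodeSqueezeL is made up for illustration, and it relies on the stated assumption that the input contains no invalid tokens.

```javascript
// Ungolfed reference decoder (hypothetical name, not an official implementation).
// Assumes the input contains only valid tokens, as the challenge guarantees.
function decodeSqueezeL(input) {
  let out = "";
  let i = 0;
  while (i < input.length) {
    const c = input[i];
    const next = input[i + 1];
    if (c !== "(" && c !== ")") {
      // Space, digit, or lowercase letter: represents itself.
      out += c;
      i += 1;
    } else if (next === c) {
      // Doubled parenthesis: (( or )).
      out += c;
      i += 2;
    } else if (c === "(") {
      // Long code: four base-36 digits after the left parenthesis.
      out += String.fromCodePoint(parseInt(input.slice(i + 1, i + 5), 36));
      i += 5;
    } else if (next >= "0" && next <= "9") {
      // Short code: a digit followed by one more base-36 digit.
      out += String.fromCodePoint(parseInt(input.slice(i + 1, i + 3), 36));
      i += 3;
    } else {
      // )" or a right parenthesis followed by a lowercase letter.
      out += next.toUpperCase();
      i += 2;
    }
  }
  return out;
}
```

For example, decodeSqueezeL(")who)18 me)1r") walks through )w, )18 and )1r as three escapes and copies everything else unchanged.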

Examples

foobar123 -> foobar123
)capital -> Capital
((parentheses)) -> (parentheses)
)"quotes)" -> "quotes"
)who)18 me)1r -> Who, me?
)9z -> ŧ
(2r5s -> 😀
// more cases suggested by Arnauld
)9zzz -> ŧzz
(2r5uuu -> 😂uu

This is code-golf, so fewest bytes wins.

asked Jan 14 at 22:40
\$\endgroup\$
0

5 Answers

5
\$\begingroup\$

JavaScript (ES9), 105 bytes

s=>s.replace(/[()]((?<=\()\w{4}|\d?.)/g,(_,s)=>s[1]?String.fromCodePoint(parseInt(s,36)):s.toUpperCase())

Try it online!

Method

Regular expression:

[()] // a parenthesis followed by either:
( //
 (?<=\()\w{4} // if it's an opening parenthesis: 4 alphanumeric characters
 // matches: (xxxx
 | // or
 \d?. // an optional digit followed by any character
) // matches: ((, )), )a, )", and )xx

where the a in )a is a lower case letter and xx / xxxx are base-36 strings.

NB: Nothing else will be matched, provided that the input string does not contain invalid tokens, as specified in the challenge.
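
To see what the capture group holds for each kind of token, the regular expression can be run on its own. This is only an illustrative sketch: the names tokenRe and captures are made up, and the sample string is just a mix of the token types from the challenge.

```javascript
// Same regular expression as in the answer; lookbehind needs ES2018 (ES9) or later.
const tokenRe = /[()]((?<=\()\w{4}|\d?.)/g;

// Hypothetical helper: list the captured group for every escape found in str.
const captures = str => [...str.matchAll(tokenRe)].map(m => m[1]);

// Tokens in order: ((  a  ))  b  )c  )"  )18  (2r5s
console.log(captures('((a))b)c)")18(2r5s'));
// -> ["(", ")", "c", "\"", "18", "2r5s"]
```

Only the two base-36 forms produce a capture longer than one character, which is what the callback tests for.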

We don't really care about the escape character and only retrieve the string s that follows. It boils down to two cases:

  • If s is at least 2 characters long, proceed with base-36 decoding:

    String.fromCodePoint(parseInt(s,36))
    
  • Otherwise, return s in upper case (leaving (, ) and " unchanged):

    s.toUpperCase()
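
Put together, the two cases amount to something like the following ungolfed version. It is a readability sketch rather than the submitted answer; the name decodeWithRegex and the parameter names are made up, and the length test stands in for the golfed s[1] check.

```javascript
// Ungolfed equivalent of the golfed callback (hypothetical names).
function decodeWithRegex(input) {
  return input.replace(/[()]((?<=\()\w{4}|\d?.)/g, (match, code) =>
    code.length >= 2
      ? String.fromCodePoint(parseInt(code, 36)) // 2 or 4 base-36 digits
      : code.toUpperCase()                       // (, ), " or a lowercase letter
  );
}
```

Characters outside any escape are never matched, so replace leaves them as they are.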
    
answered Jan 15 at 0:21
\$\endgroup\$
4
\$\begingroup\$

Japt, 35 bytes

Port of Arnauld's JS solution.

r"%)%d?.|%(%w\{0,3}."Ȥ?XÅn36 d:XÌu

Try it (includes all test cases)

r"..."Ȥ?XÅn36 d:XÌu :Implicit input of string
r :Replace
 "..." : RegEx /\)\d?.|\(\w{0,3}./g
 È : Pass each match, X, through the following function
 ¤ : Slice off the first 2 characters
 ? : If truthy (non-empty string)
 XÅ : Slice off the first character
 n36 : Convert from base 36
 d : Get character at that codepoint
 : : Else
 XÌ : Last character
 u : Uppercase
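
For readers who don't know Japt, the steps above translate roughly to the JavaScript below. This is a loose rendering, not generated from the Japt source: the name decodeJaptStyle is made up, and String.fromCodePoint is my choice for the "character at that codepoint" step.

```javascript
// Loose JavaScript rendering of the Japt steps (hypothetical name).
const decodeJaptStyle = input =>
  input.replace(/\)\d?.|\(\w{0,3}./g, X =>
    X.slice(2)                                         // anything left after 2 characters?
      ? String.fromCodePoint(parseInt(X.slice(1), 36)) // drop the parenthesis, decode base 36
      : X.slice(-1).toUpperCase()                      // otherwise: last character, uppercased
  );
```

Because there is no capture group here, the whole match X is sliced instead, which is why the parenthesis has to be dropped before the base conversion.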
answered Jan 15 at 11:08
\$\endgroup\$
2
  • I've saved 2 bytes in the JS version with an updated regex. This may just make things longer in Japt, though. Commented Jan 16 at 17:34
  • Thanks, @Arnauld :) But, you're right, it would be longer in Japt; I'd only be able to reclaim 2 bytes, either by removing the matching group from the new RegEx, or by removing the slice before the base conversion and the indexing into the string before the case conversion. Commented Jan 17 at 9:56
3
\$\begingroup\$

Charcoal, 62 bytes

FS≡⪫υωω¿No()ι⊞υιι(≡ι⊟υι⊞υι)¿=ιIΣι≔⟦ωωι⟧υ∧⊟υ↥ι¿=L⊞Oυι4«c/o⍘υ36≔⟦⟧υ

Try it online! Link is to verbose version of code. Explanation:

FS

Loop over each character of the input string.

≡⪫υω

Check which state the program is in. The program state is kept as a list, because that makes it easier to reset the state by popping the character (saving 6 bytes). However, switch only works on hashable types, so the list is joined here. The only interesting cases are the empty list and the list containing a parenthesis.

ω

If the program is in the starting state:

¿No()ι

If the current character is a parenthesis, then...

⊞υι

... set the state to that character, otherwise...

ι

... output the character.

(

If the program is in the open parenthesis state:

≡ι⊟υ

If the current character is also an open parenthesis (popping the state for the comparison also resets it), then...

ι

... output the current character, otherwise...

⊞υι

... add this base 36 digit to the program state.

)

If the program is in the close parenthesis state:

¿=ιIΣι

If the current character is a digit, then...

≔⟦ωωι⟧υ

... set the program state to waiting for the final base 36 digit, otherwise...

∧⊟υ↥ι

... reset the program state and output the current character in upper case.

¿=L⊞Oυι4«

Otherwise, if this is the last base 36 digit to collect, then:

c/o⍘υ36

Convert the state from base 36 and output that Unicode character.

≔⟦⟧υ

Reset the program state.
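
The same character-by-character idea can be sketched in JavaScript. This is only an approximation of the walkthrough above: the names are made up, and the state is held in a few plain variables instead of the single list that the Charcoal program pops and pushes.

```javascript
// Character-by-character state machine (hypothetical names; the state
// representation is simplified compared with the Charcoal list).
function decodeByState(input) {
  let out = "";
  let state = "";  // "" = start, "(" or ")" = just saw that parenthesis, "#" = collecting digits
  let digits = ""; // base-36 digits collected so far
  let needed = 0;  // number of digits the current code requires
  for (const ch of input) {
    if (state === "") {
      if (ch === "(" || ch === ")") state = ch; // remember the parenthesis
      else out += ch;                           // plain character
    } else if (state === "(") {
      if (ch === "(") { out += ch; state = ""; }     // (( represents (
      else { digits = ch; needed = 4; state = "#"; } // first of four digits
    } else if (state === ")") {
      if (ch >= "0" && ch <= "9") { digits = ch; needed = 2; state = "#"; } // short code
      else { out += ch.toUpperCase(); state = ""; }  // )), )" or ) plus a letter
    } else {
      digits += ch;
      if (digits.length === needed) {
        // Last digit collected: decode and return to the start state.
        out += String.fromCodePoint(parseInt(digits, 36));
        state = "";
      }
    }
  }
  return out;
}
```

Unlike the regex answers, this version never looks ahead; each character is handled purely according to the current state.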

answered Jan 15 at 1:09
\$\endgroup\$
3
\$\begingroup\$

05AB1E, 43 bytes

¶ì„)(vJy©¡NUεDõQi®ë¬diX>·ôćAžhìÅβçšJëćuì]J¦

Try it online.

Explanation:

¶ì # Prepend a newline before the (implicit) input-string
 „)( # Push string ")("
 v # Loop over its characters `y`:
 J # (First iteration: no-op)
 # Second iteration: Join the list of the previous iteration back together to a string
 y # Push the current parenthesis-character
 © # Store it in variable `®` (without popping)
 ¡ # First iteration: split the implicit input by this character
 # Second iteration: split the current string by this character
 NU # And also store the current index in variable `X`
 ε # Map over each part:
 DõQi # If the current part is empty:
 ® # Push the current parenthesis-character `®` instead
 ë¬di # Else-if the current part starts with a digit:
 X # Push index `X`
 >· # Increment and double (0 becomes 2 and 1 becomes 4)
 ô # Split the string into parts of that size
 ć # Extract head; push first item and remainder-list separately
 AžhìÅβ # Convert it from custom base "0-9a-z" to a base-10 integer
 ç # Convert that from a codepoint-integer to a character
 š # Prepend it back to the list
 J # Join the list back together
 ë # Else: it doesn't start with a digit
 ć # Extract head
 u # Uppercase it (no-op for '"')
 ì # Prepend it back to the remainder-string
 ] # Close the if-else statements; map; and loop
 J # Join the list back together to a string
 ¦ # Remove the leading newline again
 # (after which the result is output implicitly)

Minor note: the double J (one right after v and one after ]) is shorter than a single join after closing both if-else statements and the map at the end of every loop-iteration:

 ↓ ↓↓
¶ì„)(vJy©¡NUεDõQi®ë¬diX>·ôćAžhìÅβçšJëćuì]J¦
¶ì„)(vy©¡NUεDõQi®ë¬diX>·ôćAžhìÅβçšJëćuì}}}J}¦
 ↑↑↑↑↑
answered Jan 15 at 9:00
\$\endgroup\$
3
\$\begingroup\$

Perl 5 -MMath::Base36=:all -p, 108 bytes

s/(?<!\))\)(\pL)/\U$1/g;s/(?<!\))\)(\d.)|(?<!\()\((\w{4})/chr decode_base36$1.$2/ge;s/\)([")])|\((\()/$1$2/g

Try it online!

answered Jan 15 at 16:52
\$\endgroup\$
