Decode an URL string

Question 1

Challenge

I think every one of us has heard of the URL encoding mechanism; it's basically everywhere.

Given a URL-encoded string on stdin, decode it and output the decoded form to stdout.

The encoding is very simple: + or %20 represents a space. Every percent sign followed by two hex digits (uppercase or lowercase) has to be replaced with the ASCII character whose code is that number.

Related, but it's the other way round

Example

100%25+working
=>
100% working
%24+%26+%3C+%3E+%3F+%3B+%23+%3A+%3D+%2C+%22+%27+%7E+%2B+%25
=>
$ & < > ? ; # : = , " ' ~ + %
%414243
=>
A4243

Test case that fails:

%24%XY+%%%
 ^~~~~~~
=>
$

Rules

Loopholes are forbidden.
If your programming language has a built-in function to decode a URL, using that function is forbidden.
Assume the ASCII character set. No Unicode and all that stuff.
This is code-golf, so the answer with the fewest bytes used to accomplish the task wins.
Please include a link to an online interpreter for your code.
Your program has to decode until EOF or until an error (e.g. a percent sign without two valid hex digits, such as %xy) is spotted; at that point you can safely terminate.
If anything is unclear, please let me know down in the comments.
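
For reference, the rules and examples above can be read as the following ungolfed Python sketch. It is only one interpretation: the name `url_decode` is made up for illustration, it works on a string rather than reading stdin, and it assumes "terminate on error" means keeping whatever was decoded before the bad escape.

```python
import string

HEX_DIGITS = set(string.hexdigits)  # 0-9, a-f, A-F


def url_decode(text: str) -> str:
    """Decode left to right; stop at the first '%' lacking two hex digits."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "+":
            out.append(" ")  # '+' stands for a space
            i += 1
        elif ch == "%":
            pair = text[i + 1:i + 3]
            if len(pair) < 2 or not set(pair) <= HEX_DIGITS:
                break  # invalid escape: stop, keeping what was decoded so far
            out.append(chr(int(pair, 16)))
            i += 3
        else:
            out.append(ch)  # ordinary character, copied unchanged
            i += 1
    return "".join(out)
```

Because each escape is consumed exactly once, a decoded `%2B` stays a literal `+` and is not turned into a space afterwards.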

Question 2

* has to decode until * ... this feels like it assumes a specific method of solving the problem, i.e. that you will decode left to right.

Question 3

The whole validation part of this is just additional to the challenge. I don't see why it's added on; it makes it two separate challenges, i.e. URL decode a string (the named challenge) and validate whether a string is a valid URL encoded string (which is a different challenge imo).

Question 4

Who says I'm going to read it right to left? What if I split the input by % and then do my work? What if I read right to left, determine whether I see two characters and then a percent, and then do work on it, etc.? Just because you aren't capable of understanding how a problem may be solved doesn't make it impossible.

Question 5

(also, you could change the name of the challenge to "Decode the URL as far as possible" or similar - I don't think it's a bad challenge at all if you set the correct expectations from the start)

Question 6

Does it absolutely need to be stdin, rather than default IO rules? Also, agreed the required input validation makes it feel like a chameleon challenge... on some platforms the code to do that may be longer than the actual challenge.

Question 7

05AB1E, ~~41~~ 30 bytes

'+ð:Δć©'%Qi2ôćuDHç©Ç`hÊiq}J}®?

-11 bytes thanks to @Grimy.

Try it online or verify all test cases.

Explanation:

'+ð: '# Replace all "+" in the (implicit) input with spaces
Δ # Loop until the result no longer changes:
 ć # Extract head; pop and push remainder-string and first character
 © # Store the character in variable `®` (without popping)
 '%Qi '# If this character is a "%":
 2ô # Split the remainder-string into parts of size 2
 # i.e. "abcde" → ["ab","cd","e"]
 ć # Extract head again
 u # Convert it to uppercase
 D # Duplicate it
 H # Convert it from hexadecimal to integer
 # (NOTE: even if it isn't a valid hexadecimal string,
 # it will still result in an integer regardless)
 ç # And then from integer to ASCII-character with this codepoint
 © # Replace variable `®` with this (without popping)
 Ç`h # Reverse process: ASCII-character → integer → hexadecimal string
 Êi # If both are NOT equal (so it initially was invalid hexadecimal):
 q # Stop the program
 }J # And join the list of 2-char strings back together
 }®? # And then print `®` without newline
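
The validation idea in the explanation (convert with a lenient hex parser, convert back, and compare with the uppercased original) can be sketched in Python. Everything here is an assumption made for illustration: Python's own `int(x, 16)` raises on bad input, so `lenient_hex` is a hypothetical stand-in that never fails, and the zero-padded formatting is my choice rather than something taken from the 05AB1E code.

```python
DIGITS = "0123456789ABCDEF"


def lenient_hex(text):
    """Hypothetical lenient parser: characters that are not hex digits count as 0."""
    value = 0
    for ch in text:
        value = value * 16 + max(DIGITS.find(ch), 0)
    return value


def is_valid_escape(pair):
    """Round-trip check: the pair is valid only if converting back reproduces it."""
    upper = pair.upper()
    round_trip = format(lenient_hex(upper), "02X")  # integer -> two hex digits
    return len(upper) == 2 and round_trip == upper
```

An invalid pair such as `XY` parses to some integer, but formatting that integer back can only yield hex digits, so the comparison fails and the caller knows to stop.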

Question 8

Fails on lowercase (challenge explicitly says lowercase hex has to be handled too).

Question 9

@Grimy Straight-forward fix with +3 bytes for now (adding Dl«). Thanks for reporting. I did check if H would convert lowercase or mixed case characters correctly, but forgot about my å check..

Question 10

36

Question 11

30

Question 12

@Grimy Nice, thanks. :)

Question 13

JavaScript (V8), 112 bytes

u=>u.match(/([^%]|%[\da-f]{2})*/i)[0].replace(/%..|./g,d=>d=="+"?" ":d[1]?String.fromCharCode("0x"+d[1]+d[2]):d)

Try it online!

Noticed Arnauld's 92 byte Node.js answer shortly after spending some time golfing this. Porting that method would save quite a few bytes (with the main difference being Buffer vs. String.fromCharCode), but I wanted to post this one as it's more interesting and the V8 port wouldn't be worth a separate answer.

This challenge requires input validation, so the first .match takes only the valid part of the URL. Then, each part of it is replaced using a function. One trick I used that's kind of neat is the "0x"+d[1]+d[2]. Ordinarily you can convert hexadecimal to decimal using +("0x"+n), but it seems String.fromCharCode casts to number on its own, saving three bytes. Instead of slicing the initial %, I just concatenate the second and third characters, which is shorter.
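
The same two-step shape (first match only the valid prefix, then replace each token) might look like this in ungolfed Python. This is a sketch of the approach rather than a port: the names `VALID_PREFIX`, `ESCAPE_OR_PLUS` and `decode_valid_prefix` are invented here, and Python needs an explicit `int(..., 16)` where the JavaScript relies on `String.fromCharCode` casting the `"0x.."` string.

```python
import re

# Longest prefix made only of non-% characters and well-formed %XX escapes.
VALID_PREFIX = re.compile(r"(?:[^%]|%[0-9a-fA-F]{2})*")
# Tokens that need rewriting inside that prefix.
ESCAPE_OR_PLUS = re.compile(r"%[0-9a-fA-F]{2}|\+")


def decode_valid_prefix(url):
    valid = VALID_PREFIX.match(url).group(0)  # always matches, possibly empty

    def replace(match):
        token = match.group(0)
        return " " if token == "+" else chr(int(token[1:], 16))

    return ESCAPE_OR_PLUS.sub(replace, valid)
```

Since the substitution is a single pass, text produced by a replacement is never rescanned, so `%2B` ends up as `+` and `%2541` as `%41`.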

Question 14

Gema, 45 characters

+= 
%<X2>=@int-char{@radix{16;10;$1}}
%=@fail

Insensitive on hexadecimal case.

Sample run:

bash-5.0$ gema '+= ;%<X2>=@int-char{@radix{16;10;$1}};%=@fail' <<< $'100%25+working\n%24+%26+%3C+%3E+%3F+%3B+%23+%3A+%3D+%2C+%22+%27+%7E+%2B+%25\n%414243\n%24+%XY+%%%'
100% working
$ & < > ? ; # : = , " ' ~ + %
A4243
$

Try it online!

Question 15

Perl 5 (`-0777p -Mre=/si`), 45 bytes

s/%.?[^\da-f].*//;y/+/ /;s/%(..)/chr hex$1/ge

TIO

Question 16

JavaScript (Node.js), 92 bytes

f=([x,y,z,...a])=>x=='%'?1/(n='0x'+y+z)?Buffer([n])+f(a):'':x?(x=='+'?' ':x)+f([y,z,...a]):a

Try it online!

Question 17

Python 3, 90 bytes

def d(s):t='%'!=s[0];print(end=t*s[0].replace('+',' ')or chr(int(s[1:3],16)));d(s[3-2*t:])

Try it online!

Explanation

Checks if the first character is %. If that is the case, it will try to hex-decode the following two characters and print the result. If not, it will just print the first character, replacing + with a space if necessary.

If the first character was %, the first three characters are sliced off the string and the function is called recursively. If not, only the first character is sliced off and the function is called again.

Raises an error if the hex string cannot be decoded or if end of line is reached.
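
Unrolled, the two tricks in the one-liner (a string multiplied by a boolean with an `or` fallback, and the `3-2*t` slice offset) look roughly like this. The helper names `step` and `decode` are mine, and the loop collects output and swallows the terminating exception instead of printing and crashing, so treat it as an illustration of the idea rather than the answer itself.

```python
def step(s):
    """Consume one token; return (decoded piece, remaining input)."""
    t = '%' != s[0]                 # True for ordinary characters (IndexError on empty input)
    # True * "x" == "x" and False * "x" == "", so `or` falls through to the hex branch for '%'.
    piece = t * s[0].replace('+', ' ') or chr(int(s[1:3], 16))
    return piece, s[3 - 2 * t:]     # drop 1 character normally, 3 after an escape


def decode(s):
    out = []
    try:
        while True:
            piece, s = step(s)
            out.append(piece)
    except (IndexError, ValueError):  # end of input, or an invalid hex pair
        pass
    return "".join(out)
```

For example, `step("%41bc")` gives `("A", "bc")` and `step("+ab")` gives `(" ", "ab")`.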

Question 18

Jelly, 35 bytes

ṣ"+Kṣ"%μḊḢ;ḢƊ€ŒuØHiⱮⱮ’ḅ48żFO<0œṗƊḢỌ

Try it online!

A monadic link that takes a string as its argument and returns the decoded string, terminating early at any invalid hex.

I’ve assumed for now that standard I/O rules apply. If it really has to be stdin, that will cost a byte.

Question 19

Perl 6, 62 bytes

{S:g{(<-[%]>*)\%?(..)?}={TR/+/ /}($0).print+print chr "0x"~$1}

Try it online!

Anonymous code block that outputs to STDOUT and then errors. This assumes that the input cannot contain spaces.

Question 20

Charcoal, 65 bytes

≔E¹⁶⍘ιφθWθ«≔⮌⪪S%ι⊟ιW∧ι⊟ι«¿∧›Lκ¹⬤01№θ↧§κμ«℅⍘↧…κ²¦¹⁶✂κ²»«≔υι≔υθ»»D⎚

Try it online! Link is to verbose version of code. Note that Charcoal prompts "Enter input:" if it runs out of input. Explanation:

≔E¹⁶⍘ιφθ

Grab the list of hex digits into a variable.

Wθ«

Repeat while the variable is not empty. This is used as a flag to break out of the loop, since Charcoal has no other way of terminating the loop.

≔⮌⪪S%ι

Read the next line of text and split it on %s.

⊟ι

Output the first split.

W∧ι⊟ι«

Repeat while there are more splits to process, but stop if any of them are empty.

¿∧›Lκ¹⬤01№θ↧§κμ«

Also check that the length of the split is at least 2 and that the first 2 characters are hex digits. (Inconveniently I can't use a literal 2, I have to use a string of length 2 instead.)

℅⍘↧…κ²¦¹⁶

Convert the first two characters from hex and output the character with that code.

✂κ²

Output the rest of that split.

»«≔υι≔υθ»»D⎚

Otherwise clear the loop variables so that we terminate processing. The canvas is also printed after each loop as otherwise Charcoal's input handling gets in the way again.
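
The split-on-% strategy described above can also be written out in Python as a rough sketch. The function name `decode_by_split` is hypothetical, and replacing `+` with a space before splitting is my own addition; the explanation above does not say how the Charcoal program treats `+`.

```python
HEX = "0123456789abcdef"


def decode_by_split(line):
    # Everything before the first '%' is literal; each later chunk starts with an escape body.
    head, *chunks = line.replace("+", " ").split("%")
    out = [head]
    for chunk in chunks:
        pair = chunk[:2].lower()
        if len(pair) < 2 or any(c not in HEX for c in pair):
            break  # empty or malformed chunk: stop processing
        out.append(chr(int(pair, 16)) + chunk[2:])
    return "".join(out)
```

Replacing `+` before the split matters: doing it after decoding would also turn a decoded `%2B` into a space.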

Question 21

Python 3, 91 bytes

lambda s:re.sub('%([A-Fa-f\d]{2})',lambda t:chr(int(t[1],16)),s.replace('+',' '))
import re

Try it online!

Approach this with regular expressions.

Question 22

Nice approach, but prints $%XY %%% instead of $ in the fourth example.
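
One possible way to make a regex-based approach stop at the first bad escape is to cut the input there before substituting. The sketch below is ungolfed and only illustrative; `BAD_ESCAPE`, `ESCAPE` and `decode_until_error` are names I made up, not part of the answer.

```python
import re

BAD_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")  # a '%' not followed by two hex digits
ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def decode_until_error(s):
    bad = BAD_ESCAPE.search(s)
    if bad:
        s = s[:bad.start()]  # keep only the part before the first invalid escape
    return ESCAPE.sub(lambda m: chr(int(m[1], 16)), s.replace("+", " "))
```

With the truncation in place, the fourth example `%24%XY+%%%` yields `$`.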

Question 23

Stax, 29 bytes

í☼∩ò☺μ◘Γπ╓l▄╓█₧ß:♦+ÇP¢Y╚↑░oHÑ

Run and debug it

It's mostly regex.

Question 24

Janet, 88 bytes

|(peg/match~(%(any(+(/"+"" ")(*"%"(/(number(2 :h)16),string/from-bytes))(*(!"%")'1))))$)

Returns a single-element array with the resulting string. +3 bytes if this is unacceptable:

|((peg/match~(%(any(+(/"+"" ")(*"%"(/(number(2 :h)16),string/from-bytes))(*(!"%")'1))))$)0)

In the case of invalid input, chops off the string at the last valid point, as requested.

Question 25

APL(NARS), 182 chars

r←f w;i;c;b
b←r←''⋄l←≢w⋄i×ばつ⍳l<i×ばつ⍳'%'=c←w[i×ばつ⍳'+'=c⋄r,←c×ばつ⍳l<i+←1⋄b,←w[i×ばつ⍳l<i+←1⋄b,←w[i×ばつ⍳∼b⊆⎕D∪⎕A[1..6]∪⎕a[1..6]⋄r,←⎕AV[1+⍎'16b',b]⋄b←''⋄→2
r,←' '⋄→2
r←,'$'

// +/ 12 16 44 93 10 7

the function with the line numbers

0:r←f w;i;c;b
1:b←r←''⋄l←≢w⋄i×ばつ⍳l<i×ばつ⍳'%'=c←w[i×ばつ⍳'+'=c⋄r,←c×ばつ⍳l<i+←1⋄b,←w[i×ばつ⍳l<i+←1⋄b,←w[i×ばつ⍳∼b⊆⎕D∪⎕A[1..6]∪⎕a[1..6]⋄r,←⎕AV[1+⍎'16b',b]⋄b←''⋄→2
4:r,←' '⋄→2
5:r←,'$'

f takes one string as input and returns one string. If an error is found (a % not followed by two hexadecimal digits, or a value outside the allowable range), it returns the string "$". ⍎'16bBB' evaluates to 187, the value of hex 0xBB.

It is a little long for any language, but if this is the only APL answer it wins all the same.

 f '100%25+working'
100% working
 f '%24+%26+%3C+%3E+%3F+%3B+%3A'
$ & < > ? ; :
 f'%414243'
A4243
 f'%24%XY+%%%'
$

score 4 · Answer 1 · 05AB1E, 30 bytes · 2019-08-02 11:31:28Z

rydwolf · Answer 2 · JavaScript (V8), 112 bytes · 2021-04-06 04:42:43Z

manatwork · Answer 3 · Gema, 45 characters · 2019-08-02 08:51:10Z

Nahuel Fouilleul · score 3 · Answer 4 · Perl 5 (`-0777p -Mre=/si`), 45 bytes · 2019-08-02 09:24:34Z

Arnauld · score 2 · Answer 5 · JavaScript (Node.js), 92 bytes · 2019-08-02 10:13:47Z

Jitse · Answer 6 · Python 3, 90 bytes · 2019-08-02 10:00:10Z

Nick Kennedy · Answer 7 · Jelly, 35 bytes · 2019-08-02 20:09:31Z

Jo King · Answer 8 · Perl 6, 62 bytes · 2019-08-02 14:00:30Z

Neil · Answer 9 · Charcoal, 65 bytes · 2019-08-02 19:09:06Z

movatica · score 0 · Answer 10 · Python 3, 91 bytes · 2019-08-02 22:22:12Z (comment by Jitse, 2019-08-03 07:49:46)

recursive · score 0 · Answer 11 · Stax, 29 bytes · 2019-08-12 14:55:40Z

Adamátor · Answer 12 · Janet, 88 bytes · 2025-09-05 19:09:02Z

Rosario · Answer 13 · APL(NARS), 182 chars · 2025-09-06 08:08:34Z
