I have an application which sends a POST request to the VB forum software and logs someone in (without setting cookies or anything).
Once the user is logged in I create a variable that creates a path on their local machine.
c:\tempfolder\date\username
The problem is that some usernames throw an "Illegal chars" exception. For example, if my username was mas|fenix, it would throw an exception.
Path.Combine( _
Environment.GetFolderPath(System.Environment.SpecialFolder.CommonApplicationData), _
DateTime.Now.ToString("ddMMyyhhmm") + "-" + form1.username)
I don't want to remove it from the string, but a folder with their username is created through FTP on a server. And this leads to my second question. If I am creating a folder on the server can I leave the "illegal chars" in? I only ask this because the server is Linux based, and I am not sure if Linux accepts it or not.
EDIT: It seems that URL encode is NOT what I want.. Here's what I want to do:
old username = mas|fenix
new username = mas%xxfenix
Where %xx is the ASCII value or any other value that would easily identify the character.
Incorporate this to make file system safe folder names: http://stackoverflow.com/questions/333175/is-there-a-way-of-making-strings-file-path-safe-in-c – missaghi, Feb 22, 2009 at 21:03
14 Answers
I've been experimenting with the various methods .NET provides for URL encoding. Perhaps the following table will be useful (as output from a test app I wrote):
Unencoded UrlEncoded UrlEncodedUnicode UrlPathEncoded EscapedDataString EscapedUriString HtmlEncoded HtmlAttributeEncoded HexEscaped
A A A A A A A A %41
B B B B B B B B %42
a a a a a a a a %61
b b b b b b b b %62
0 0 0 0 0 0 0 0 %30
1 1 1 1 1 1 1 1 %31
[space] + + %20 %20 %20 [space] [space] %20
! ! ! ! ! ! ! ! %21
" %22 %22 " %22 %22 &quot; &quot; %22
# %23 %23 # %23 # # # %23
$ %24 %24 $ %24 $ $ $ %24
% %25 %25 % %25 %25 % % %25
& %26 %26 & %26 & &amp; &amp; %26
' %27 %27 ' ' ' &#39; &#39; %27
( ( ( ( ( ( ( ( %28
) ) ) ) ) ) ) ) %29
* * * * %2A * * * %2A
+ %2b %2b + %2B + + + %2B
, %2c %2c , %2C , , , %2C
- - - - - - - - %2D
. . . . . . . . %2E
/ %2f %2f / %2F / / / %2F
: %3a %3a : %3A : : : %3A
; %3b %3b ; %3B ; ; ; %3B
< %3c %3c < %3C %3C &lt; &lt; %3C
= %3d %3d = %3D = = = %3D
> %3e %3e > %3E %3E &gt; > %3E
? %3f %3f ? %3F ? ? ? %3F
@ %40 %40 @ %40 @ @ @ %40
[ %5b %5b [ %5B %5B [ [ %5B
\ %5c %5c \ %5C %5C \ \ %5C
] %5d %5d ] %5D %5D ] ] %5D
^ %5e %5e ^ %5E %5E ^ ^ %5E
_ _ _ _ _ _ _ _ %5F
` %60 %60 ` %60 %60 ` ` %60
{ %7b %7b { %7B %7B { { %7B
| %7c %7c | %7C %7C | | %7C
} %7d %7d } %7D %7D } } %7D
~ %7e %7e ~ ~ ~ ~ ~ %7E
Ā %c4%80 %u0100 %c4%80 %C4%80 %C4%80 Ā Ā [OoR]
ā %c4%81 %u0101 %c4%81 %C4%81 %C4%81 ā ā [OoR]
Ē %c4%92 %u0112 %c4%92 %C4%92 %C4%92 Ē Ē [OoR]
ē %c4%93 %u0113 %c4%93 %C4%93 %C4%93 ē ē [OoR]
Ī %c4%aa %u012a %c4%aa %C4%AA %C4%AA Ī Ī [OoR]
ī %c4%ab %u012b %c4%ab %C4%AB %C4%AB ī ī [OoR]
Ō %c5%8c %u014c %c5%8c %C5%8C %C5%8C Ō Ō [OoR]
ō %c5%8d %u014d %c5%8d %C5%8D %C5%8D ō ō [OoR]
Ū %c5%aa %u016a %c5%aa %C5%AA %C5%AA Ū Ū [OoR]
ū %c5%ab %u016b %c5%ab %C5%AB %C5%AB ū ū [OoR]
The columns represent encodings as follows:
- UrlEncoded: HttpUtility.UrlEncode
- UrlEncodedUnicode: HttpUtility.UrlEncodeUnicode
- UrlPathEncoded: HttpUtility.UrlPathEncode
- EscapedDataString: Uri.EscapeDataString
- EscapedUriString: Uri.EscapeUriString
- HtmlEncoded: HttpUtility.HtmlEncode
- HtmlAttributeEncoded: HttpUtility.HtmlAttributeEncode
- HexEscaped: Uri.HexEscape
NOTES:
HexEscape can only handle the first 255 characters. Therefore it throws an ArgumentOutOfRange exception for the Latin A-Extended characters (eg Ā).
This table was generated in .NET 4.0 (see Levi Botelho's comment below that says the encoding in .NET 4.5 is slightly different).
EDIT:
I've added a second table with the encodings for .NET 4.5. See this answer: https://stackoverflow.com/a/21771206/216440
EDIT 2:
Since people seem to appreciate these tables, I thought you might like the source code that generates the table, so you can play around yourselves. It's a simple C# console application, which can target either .NET 4.0 or 4.5:
using System;
using System.Collections.Generic;
using System.Text;
// Need to add a Reference to the System.Web assembly.
using System.Web;
namespace UriEncodingDEMO2
{
    class Program
    {
        static void Main(string[] args)
        {
            EncodeStrings();
            Console.WriteLine();
            Console.WriteLine("Press any key to continue...");
            Console.Read();
        }
        public static void EncodeStrings()
        {
            string stringToEncode = "ABCD" + "abcd"
                + "0123" + " !\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~" + "ĀāĒēĪīŌōŪū";
            // Need to set the console encoding to display non-ASCII characters correctly (eg the
            // Latin A-Extended characters such as ĀāĒē...).
            Console.OutputEncoding = Encoding.UTF8;
            // Will also need to set the console font (in the console Properties dialog) to a font
            // that displays the extended character set correctly.
            // The following fonts all display the extended characters correctly:
            //     Consolas
            //     DejaVu Sans Mono
            //     Lucida Console
            // Also, in the console Properties, set the Screen Buffer Size and the Window Size
            // Width properties to at least 140 characters, to display the full width of the
            // table that is generated.
            Dictionary<string, Func<string, string>> columnDetails =
                new Dictionary<string, Func<string, string>>();
            columnDetails.Add("Unencoded", (unencodedString => unencodedString));
            columnDetails.Add("UrlEncoded",
                (unencodedString => HttpUtility.UrlEncode(unencodedString)));
            columnDetails.Add("UrlEncodedUnicode",
                (unencodedString => HttpUtility.UrlEncodeUnicode(unencodedString)));
            columnDetails.Add("UrlPathEncoded",
                (unencodedString => HttpUtility.UrlPathEncode(unencodedString)));
            columnDetails.Add("EscapedDataString",
                (unencodedString => Uri.EscapeDataString(unencodedString)));
            columnDetails.Add("EscapedUriString",
                (unencodedString => Uri.EscapeUriString(unencodedString)));
            columnDetails.Add("HtmlEncoded",
                (unencodedString => HttpUtility.HtmlEncode(unencodedString)));
            columnDetails.Add("HtmlAttributeEncoded",
                (unencodedString => HttpUtility.HtmlAttributeEncode(unencodedString)));
            columnDetails.Add("HexEscaped",
                (unencodedString
                    =>
                    {
                        // Uri.HexEscape can only handle the first 255 characters so for the
                        // Latin A-Extended characters, such as Ā, it will throw an
                        // ArgumentOutOfRange exception.
                        try
                        {
                            return Uri.HexEscape(unencodedString.ToCharArray()[0]);
                        }
                        catch
                        {
                            return "[OoR]";
                        }
                    }));
            char[] charactersToEncode = stringToEncode.ToCharArray();
            string[] stringCharactersToEncode = Array.ConvertAll<char, string>(charactersToEncode,
                (character => character.ToString()));
            DisplayCharacterTable<string>(stringCharactersToEncode, columnDetails);
        }
        private static void DisplayCharacterTable<TUnencoded>(TUnencoded[] unencodedArray,
            Dictionary<string, Func<TUnencoded, string>> mappings)
        {
            foreach (string key in mappings.Keys)
            {
                Console.Write(key.Replace(" ", "[space]") + " ");
            }
            Console.WriteLine();
            foreach (TUnencoded unencodedObject in unencodedArray)
            {
                string stringCharToEncode = unencodedObject.ToString();
                foreach (string columnHeader in mappings.Keys)
                {
                    int columnWidth = columnHeader.Length + 1;
                    Func<TUnencoded, string> encoder = mappings[columnHeader];
                    string encodedString = encoder(unencodedObject);
                    // ASSUMPTION: Column header will always be wider than encoded string.
                    Console.Write(encodedString.Replace(" ", "[space]").PadRight(columnWidth));
                }
                Console.WriteLine();
            }
        }
    }
}
Note that the .NET documentation says "Do not use; intended only for browser compatibility. Use UrlEncode.", but that method encodes a lot of other undesired characters. The closest one is Uri.EscapeUriString, but beware it doesn't support a null argument. – Andrew, Jan 24, 2018 at 15:16
I forgot to mention, my comment above is for UrlPathEncode. So basically replace UrlPathEncode with Uri.EscapeUriString. – Andrew, Mar 21, 2018 at 20:01
Watch out: this answer is misleading. Some of these escape methods escape differing chars depending on their context within the string. Some are quite dangerous if you don't fully understand their limitations. If you're sticking stuff into URIs, stick to Uri.EscapeDataString (not EscapeUriString!) unless you're very sure you know what you're doing. – Eamon Nerbonne, Jun 10, 2020 at 14:45
Do .NET Core 3+ and .NET 5 change any of these? – huang, May 19, 2021 at 17:39
The HTML standard uses a slightly different encoding for the query parameters that are after the "?" character in a URL. Before the "?", URI Percent Encoding is used. After the "?", "application/x-www-form-urlencoded" encoding is used. In particular, that is why space is encoded as %20 before the "?", but space is encoded as + after the "?". Notice that URLEncoded encodes space as +, while EscapedUriString encodes it as %20. See "Mutate action URL" and "application/x-www-form-urlencoded serializer" in the HTML standard. – user281806, Jan 13, 2022 at 22:14
You should encode only the user name or other part of the URL that could be invalid. URL encoding a URL can lead to problems since something like this:
string url = HttpUtility.UrlEncode("http://www.google.com/search?q=Example");
Will yield
http%3a%2f%2fwww.google.com%2fsearch%3fq%3dExample
This is obviously not going to work well. Instead, you should encode ONLY the value of the key/value pair in the query string, like this:
string url = "http://www.google.com/search?q=" + HttpUtility.UrlEncode("Example");
Hopefully that helps. Also, as teedyay mentioned, you'll still need to make sure illegal file-name characters are removed or else the file system won't like the path.
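As a minimal sketch of the "encode only the values" idea, the helper below escapes each key and value separately and leaves the URL's own delimiters alone. The names QueryBuilder and Build are hypothetical, and it uses Uri.EscapeDataString rather than HttpUtility.UrlEncode so that no reference to System.Web is needed; treat it as an illustration, not the only way to do this.

```csharp
using System;
using System.Collections.Generic;

public static class QueryBuilder
{
    // Hypothetical helper: appends key/value pairs to a base URL,
    // escaping each key and each value on its own so that the URL's
    // delimiters (":", "/", "?", "&", "=") are never touched.
    public static string Build(string baseUrl, IEnumerable<KeyValuePair<string, string>> parameters)
    {
        var parts = new List<string>();
        foreach (var p in parameters)
        {
            parts.Add(Uri.EscapeDataString(p.Key) + "=" + Uri.EscapeDataString(p.Value));
        }
        // No parameters: return the base URL unchanged.
        return parts.Count == 0 ? baseUrl : baseUrl + "?" + string.Join("&", parts);
    }
}
```

Called with the base URL http://www.google.com/search and the pair q / Ex&ple, this should produce a query string of q=Ex%26ple while the scheme and path stay readable.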
Using the HttpUtility.UrlPathEncode method should prevent the problem you're describing here. – vipirtti, Mar 9, 2009 at 10:08
@DJ Pirtu: It's true that UrlPathEncode won't make those undesired changes in the path, however it also won't encode anything after the ? (since it assumes the query string is already encoded). In Dan Herbert's example it looks like he's pretending Example is the text that requires encoding, so HttpUtility.UrlPathEncode("http://www.google.com/search?q=Example"); won't work. Try it with ?q=Ex&ple (where the desired result is ?q=Ex%26ple). It won't work because (1) UrlPathEncode doesn't touch anything after ?, and (2) UrlPathEncode doesn't encode & anyway. – Tim Goodman, Nov 29, 2010 at 18:21
See here: connect.microsoft.com/VisualStudio/feedback/details/551839/… I should add that of course it's good that UrlPathEncode doesn't encode &, because you need that to delimit your query string parameters. But there are times when you want encoded ampersands as well. – Tim Goodman, Nov 29, 2010 at 18:23
HttpUtility is succeeded by WebUtility in latest versions, save yourself some time :) – Wiseman, Apr 11, 2014 at 7:25
A better way is to use
Uri.EscapeUriString
to avoid referencing the Full Profile of .NET 4.
[Update]
Based on what the OP is asking for, the recommended API should be
Uri.EscapeDataString
(Thank you @ykadaru)
Totally agree since often the "Client Profile" is enough for apps using System.Net but not using System.Web ;-) – hfrmobile, Sep 7, 2012 at 17:08
OP is talking about checking it for file system compatibility, so this won't work. The Windows disallowed character set is '["/", "\\", "<", ">", ":", "\"", "|", "?", "*"]' but many of these don't get encoded using EscapedUriString (see table below - thanks for that table @Simon Tewsi) ... "creates a path on their local machine" - OP. UrlEncoded takes care of almost all of the problems, but doesn't solve the problem with "%" or "%3f" being in the original input, as a "decode" will now be different than the original. – m1m1k, Feb 7, 2013 at 0:32
just to make it clear: THIS answer WON'T WORK for file systems – m1m1k, Feb 7, 2013 at 0:41
In addition, starting with the .NET Framework 4.5, the Client Profile has been discontinued and only the full redistributable package is available. – twomm, Feb 19, 2013 at 13:46
stackoverflow.com/a/34189188/3436164 Use Uri.EscapeDataString NOT Uri.EscapeUriString. Read this comment, it helped me out. – ykadaru, Mar 13, 2017 at 15:40
Since .NET Framework 4.5 and .NET Standard 1.0 you should use WebUtility.UrlEncode. Advantages over alternatives:
- It is part of .NET Framework 4.5+, .NET Core 1.0+, .NET Standard 1.0+, UWP 10.0+ and all Xamarin platforms as well. HttpUtility, while being available in .NET Framework earlier (.NET Framework 1.1+), became available on other platforms much later (.NET Core 2.0+, .NET Standard 2.0+) and it is still unavailable in UWP (see related question).
- In .NET Framework, it resides in System.dll, so it does not require any additional references, unlike HttpUtility.
- It properly escapes characters for URLs, unlike Uri.EscapeUriString (see comments to drweb86's answer).
- It does not have any limits on the length of the string, unlike Uri.EscapeDataString (see related question), so it can be used for POST requests, for example.
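A short sketch of how this might look in practice. The wrapper class WebUtilityDemo is a hypothetical name introduced only for illustration; the only library calls used are WebUtility.UrlEncode and WebUtility.UrlDecode from System.Net.

```csharp
using System;
using System.Net;

public static class WebUtilityDemo
{
    // Encodes a single value (for example a user name), not a whole URL.
    public static string Encode(string value)
    {
        return WebUtility.UrlEncode(value);
    }

    // Reverses Encode, so the original value can be recovered later.
    public static string Decode(string value)
    {
        return WebUtility.UrlDecode(value);
    }
}
```

For the user name from the question, Encode("mas|fenix") should give mas%7Cfenix, and Decode turns it back into mas|fenix. Note that a space becomes + and that * is left as-is, so the result is not automatically a valid Windows folder name.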
I like the way it encodes using "+" instead of %20 for spaces.. but this one still does not remove " from the URL and gives me an invalid URL... oh well.. just gonna have to do a replace(""""","") – Piotr Kula, May 15, 2017 at 12:55
Edit: Note that this answer is now out of date. See Siarhei Kuchuk's answer below for a better fix
UrlEncoding will do what you are suggesting here. With C#, you simply use HttpUtility, as mentioned.
You can also Regex the illegal characters and then replace, but this gets far more complex, as you will have to have some form of state machine (switch ... case, for example) to replace with the correct characters. Since UrlEncode does this up front, it is rather easy.
As for Linux versus Windows, there are some characters that are acceptable in Linux that are not in Windows, but I would not worry about that, as the folder name can be returned by decoding the Url string, using UrlDecode, so you can round trip the changes.
this answer is out of date now. read a few answers below - as of .net45 this might be the correct solution: msdn.microsoft.com/en-us/library/… – blueberryfields, Jan 7, 2015 at 17:20
For FTP each Uri part (folder or file name) may be constructed using Uri.EscapeDataString(fileOrFolderName), allowing all non Uri compatible characters (spaces, unicode ...). For example to allow any character in filename, use: req =(FtpWebRequest)WebRequest.Create(new Uri(path + "/" + Uri.EscapeDataString(filename))); Using HttpUtility.UrlEncode() replaces spaces by plus signs (+). A correct behavior for search engines but incorrect for file/folder names. – Renaud Bancel, Feb 17, 2015 at 10:51
asp.net blocks the majority of xss in url as you get a warning whenever you try to add js script: "A potentially dangerous Request.Path value was detected from the client". – Learning, Sep 2, 2018 at 4:39
Levi Botelho commented that the table of encodings that was previously generated is no longer accurate for .NET 4.5, since the encodings changed slightly between .NET 4.0 and 4.5. So I've regenerated the table for .NET 4.5:
Unencoded UrlEncoded UrlEncodedUnicode UrlPathEncoded WebUtilityUrlEncoded EscapedDataString EscapedUriString HtmlEncoded HtmlAttributeEncoded WebUtilityHtmlEncoded HexEscaped
A A A A A A A A A A %41
B B B B B B B B B B %42
a a a a a a a a a a %61
b b b b b b b b b b %62
0 0 0 0 0 0 0 0 0 0 %30
1 1 1 1 1 1 1 1 1 1 %31
[space] + + %20 + %20 %20 [space] [space] [space] %20
! ! ! ! ! %21 ! ! ! ! %21
" %22 %22 " %22 %22 %22 &quot; &quot; &quot; %22
# %23 %23 # %23 %23 # # # # %23
$ %24 %24 $ %24 %24 $ $ $ $ %24
% %25 %25 % %25 %25 %25 % % % %25
& %26 %26 & %26 %26 & &amp; &amp; &amp; %26
' %27 %27 ' %27 %27 ' &#39; &#39; &#39; %27
( ( ( ( ( %28 ( ( ( ( %28
) ) ) ) ) %29 ) ) ) ) %29
* * * * * %2A * * * * %2A
+ %2b %2b + %2B %2B + + + + %2B
, %2c %2c , %2C %2C , , , , %2C
- - - - - - - - - - %2D
. . . . . . . . . . %2E
/ %2f %2f / %2F %2F / / / / %2F
: %3a %3a : %3A %3A : : : : %3A
; %3b %3b ; %3B %3B ; ; ; ; %3B
< %3c %3c < %3C %3C %3C &lt; &lt; &lt; %3C
= %3d %3d = %3D %3D = = = = %3D
> %3e %3e > %3E %3E %3E &gt; > &gt; %3E
? %3f %3f ? %3F %3F ? ? ? ? %3F
@ %40 %40 @ %40 %40 @ @ @ @ %40
[ %5b %5b [ %5B %5B [ [ [ [ %5B
\ %5c %5c \ %5C %5C %5C \ \ \ %5C
] %5d %5d ] %5D %5D ] ] ] ] %5D
^ %5e %5e ^ %5E %5E %5E ^ ^ ^ %5E
_ _ _ _ _ _ _ _ _ _ %5F
` %60 %60 ` %60 %60 %60 ` ` ` %60
{ %7b %7b { %7B %7B %7B { { { %7B
| %7c %7c | %7C %7C %7C | | | %7C
} %7d %7d } %7D %7D %7D } } } %7D
~ %7e %7e ~ %7E ~ ~ ~ ~ ~ %7E
Ā %c4%80 %u0100 %c4%80 %C4%80 %C4%80 %C4%80 Ā Ā Ā [OoR]
ā %c4%81 %u0101 %c4%81 %C4%81 %C4%81 %C4%81 ā ā ā [OoR]
Ē %c4%92 %u0112 %c4%92 %C4%92 %C4%92 %C4%92 Ē Ē Ē [OoR]
ē %c4%93 %u0113 %c4%93 %C4%93 %C4%93 %C4%93 ē ē ē [OoR]
Ī %c4%aa %u012a %c4%aa %C4%AA %C4%AA %C4%AA Ī Ī Ī [OoR]
ī %c4%ab %u012b %c4%ab %C4%AB %C4%AB %C4%AB ī ī ī [OoR]
Ō %c5%8c %u014c %c5%8c %C5%8C %C5%8C %C5%8C Ō Ō Ō [OoR]
ō %c5%8d %u014d %c5%8d %C5%8D %C5%8D %C5%8D ō ō ō [OoR]
Ū %c5%aa %u016a %c5%aa %C5%AA %C5%AA %C5%AA Ū Ū Ū [OoR]
ū %c5%ab %u016b %c5%ab %C5%AB %C5%AB %C5%AB ū ū ū [OoR]
The columns represent encodings as follows:
- UrlEncoded: HttpUtility.UrlEncode
- UrlEncodedUnicode: HttpUtility.UrlEncodeUnicode
- UrlPathEncoded: HttpUtility.UrlPathEncode
- WebUtilityUrlEncoded: WebUtility.UrlEncode
- EscapedDataString: Uri.EscapeDataString
- EscapedUriString: Uri.EscapeUriString
- HtmlEncoded: HttpUtility.HtmlEncode
- HtmlAttributeEncoded: HttpUtility.HtmlAttributeEncode
- WebUtilityHtmlEncoded: WebUtility.HtmlEncode
- HexEscaped: Uri.HexEscape
NOTES:
HexEscape can only handle the first 255 characters. Therefore it throws an ArgumentOutOfRange exception for the Latin A-Extended characters (eg Ā).
This table was generated in .NET 4.5 (see answer https://stackoverflow.com/a/11236038/216440 for the encodings relevant to .NET 4.0 and below).
EDIT:
- As a result of Discord's answer I added the new WebUtility UrlEncode and HtmlEncode methods, which were introduced in .NET 4.5.
Do not use UrlPathEncode - even the MSDN says it is not to be used. It was built to fix an issue with Netscape 2: msdn.microsoft.com/en-us/library/… – Jeff, Mar 20, 2014 at 20:03
Is Server.URLEncode yet another variation on this theme? Does it generate any different output? – ALEXintlsos, Nov 18, 2015 at 17:21
@ALEX: In ASP.NET the Server object is an instance of HttpServerUtility. Using the dotPeek decompiler I had a look at HttpServerUtility.UrlEncode. It just calls HttpUtility.UrlEncode so the output of the two methods would be identical. – Simon Elms, Nov 19, 2015 at 2:10
It seems like, even with this overabundance of encoding methods, they all still fail pretty spectacularly for anything above Latin-1, such as → or ☠. (UrlEncodedUnicode seems like it at least tries to support Unicode, but is deprecated/missing.) – brianary, Dec 15, 2015 at 16:46
Simon, can you just integrate this answer into the accepted answer? It would be nice to have it in one answer. You could integrate it and add an h1 heading at the bottom of that answer, or merge everything into one table and mark the different lines, like:
(Net4.0) ? %3f................................
(Net4.5) ? %3f ..................................
– T.Todua, Sep 18, 2017 at 14:17
Url Encoding is easy in .NET. Use:
System.Web.HttpUtility.UrlEncode(string url)
If that'll be decoded to get the folder name, you'll still need to exclude characters that can't be used in folder names (*, ?, /, etc.)
Does it encode every character that's not part of the alphabet? – masfenix, Feb 22, 2009 at 19:02
URL encoding converts characters that are not allowed in a URL into character-entity equivalents. List of unsafe characters: blooberry.com/indexdot/html/topics/urlencoding.htm – Ian Robinson, Feb 22, 2009 at 19:05
MSDN Link on HttpUtility.UrlEncode: msdn.microsoft.com/en-us/library/4fkewx0t.aspx – Ian Robinson, Feb 22, 2009 at 19:06
It is good practice to put the full System.Web... part in your answer, it saves a lot of people a little time :) thanks – Liam, Apr 24, 2009 at 12:09
This is dangerous: not all characters of the URL have to be encoded, only the values of the query string parameters. The way you suggest will also encode the & that is needed to separate multiple parameters in the query string. The solution is to encode each parameter value if needed. – Marco Staffoli, Jan 21, 2013 at 9:29
If you can't see System.Web, change your project settings. The target framework should be ".NET Framework 4" instead of ".NET Framework 4 Client Profile"
In my opinion developers should know about ".NET Profiles" and they should use the correct one for their purposes! Just adding the full profile in order to get (e.g.) System.Web without really knowing why they add the full profile isn't very smart. Use "Client Profile" for your client apps and the full profile only when needed (e.g. a WinForms or WPF client should use client profile and not full profile)! e.g. I don't see a reason for using the HttpServerUtility in a client app ^^ ... if this is needed then there is something wrong with the design of the app! – hfrmobile, Oct 26, 2012 at 11:28
Really? You don't ever see a need for a client app to construct a URL? What do you do for a living - janitorial duties? – sproketboy, Mar 26, 2013 at 20:33
@hfrmobile: no. It's all wrong with the profile model (which lived just once and was abandoned in the next version). And it was obvious from the beginning. Is it obvious for you now? Think first, don't accept everything 'as is' that msft tries to sell you ;P – abatishchev, Jan 19, 2014 at 17:58
Sorry, but I never said that a client never has to build/use a URL. As long as .NET 4.0 is in use, users should care about it. To put it short: Developers should think twice before adding HttpServerUtility to a client. There are other/better ways, just see the answer with 139 votes or "Since .NET Framework 4.5 you can use WebUtility.UrlEncode. First, it resides in System.dll, so it does not require any additional references.". – hfrmobile, Jan 20, 2014 at 5:45
The .NET implementation of UrlEncode does not comply with RFC 3986.
- Some characters are not encoded but should be. The !()* characters are listed in the RFC's section 2.2 as reserved characters that must be encoded, yet .NET fails to encode these characters.
- Some characters are encoded but should not be. The .-_ characters are not listed in the RFC's section 2.2 as reserved characters and should not be encoded, yet .NET erroneously encodes these characters.
- The RFC specifies that to be consistent, implementations should use upper-case HEXDIG, where .NET produces lower-case HEXDIG.
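For comparison, here is a minimal sketch of an encoder that follows the rules listed above: it leaves only the unreserved characters of RFC 3986 section 2.3 untouched and percent-encodes every other UTF-8 byte with upper-case hex digits. The class name Rfc3986 and the method name Encode are hypothetical; this is not how any built-in .NET method is implemented.

```csharp
using System;
using System.Text;

public static class Rfc3986
{
    // Unreserved characters per RFC 3986 section 2.3.
    // Everything else is percent-encoded by this sketch.
    private const string Unreserved =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    public static string Encode(string value)
    {
        var sb = new StringBuilder();
        // Work on UTF-8 bytes so that non-ASCII characters become
        // a sequence of %XX escapes, one per byte.
        foreach (byte b in Encoding.UTF8.GetBytes(value))
        {
            char c = (char)b;
            if (b < 128 && Unreserved.IndexOf(c) >= 0)
            {
                sb.Append(c);
            }
            else
            {
                sb.Append('%').Append(b.ToString("X2")); // upper-case HEXDIG
            }
        }
        return sb.ToString();
    }
}
```

With this sketch, the characters !()* are all escaped, while .-_~ pass through unchanged.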
I think people here got sidetracked by the UrlEncode message. URLEncoding is not what you want -- you want to encode stuff that won't work as a filename on the target system.
Assuming that you want some generality -- feel free to find the illegal characters on several systems (MacOS, Windows, Linux and Unix), union them to form a set of characters to escape.
As for the escape, a HexEscape should be fine (replacing the characters with %XX). Convert each character to UTF-8 bytes and encode everything >128 if you want to support systems that don't do Unicode. But there are other ways, such as using backslashes "\" or HTML encoding "&quot;". You can create your own. All any system has to do is 'encode' the incompatible character away. The above systems allow you to recreate the original name -- but something like replacing the bad chars with spaces works also.
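The approach described above can be sketched roughly as follows. The class name FileNameEscaper and the exact character set are assumptions made for illustration: the set is the Windows-disallowed list quoted elsewhere on this page plus '%' (so that decoding stays unambiguous) and ASCII control characters. It does not handle other Windows restrictions such as reserved device names or trailing dots.

```csharp
using System;
using System.Text;

public static class FileNameEscaper
{
    // Assumed set: characters Windows rejects in file names, plus '%'
    // so that the escape character itself can be told apart when decoding.
    private const string CharsToEscape = "/\\<>:\"|?*%";

    public static string Escape(string name)
    {
        var sb = new StringBuilder();
        foreach (char c in name)
        {
            if (CharsToEscape.IndexOf(c) >= 0 || c < 32)
            {
                sb.Append(Uri.HexEscape(c)); // e.g. '|' becomes "%7C"
            }
            else
            {
                sb.Append(c); // everything else, including non-ASCII, is kept
            }
        }
        return sb.ToString();
    }

    public static string Unescape(string escaped)
    {
        var sb = new StringBuilder();
        int i = 0;
        while (i < escaped.Length)
        {
            // Uri.HexUnescape moves i past a "%XX" escape,
            // or past a single character when there is no escape at i.
            sb.Append(Uri.HexUnescape(escaped, ref i));
        }
        return sb.ToString();
    }
}
```

For the user name in the question, Escape("mas|fenix") should yield mas%7Cfenix, which matches the mas%xxfenix form the OP asked for, and Unescape recovers the original name.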
On the same tangent as above, the only one to use is Uri.EscapeDataString -- it encodes everything that is needed for OAuth, it doesn't encode the things that OAuth forbids encoding, and it encodes the space as %20 and not + (also in the OAuth spec). See: RFC 3986. AFAIK, this is the latest URI spec.
I have written a C# method that url-encodes ALL symbols:
/// <summary>
/// !#$345Hf} → %21%23%24%33%34%35%48%66%7D
/// </summary>
public static string UrlEncodeExtended( string value )
{
    // Encode the UTF-8 bytes rather than the UTF-16 code units, so that
    // characters above U+00FF still produce valid two-digit %XX escapes.
    byte[] bytes = System.Text.Encoding.UTF8.GetBytes( value );
    StringBuilder encodedValue = new StringBuilder();
    foreach (byte b in bytes)
    {
        encodedValue.Append( "%" + b.ToString( "X2" ) );
    }
    return encodedValue.ToString();
}
Ideally these would go in a class called "FileNaming" or maybe just rename Encode to "FileNameEncode". Note: these are not designed to handle Full Paths, just the folder and/or file names. Ideally you would Split("/") your full path first and then check the pieces. And obviously instead of a union, you could just add the "%" character to the list of chars not allowed in Windows, but I think it's more helpful/readable/factual this way. Decode() is exactly the same but switches the Replace(Uri.HexEscape(s[0]), s) "escaped" with the character.
// Requires: using System; using System.Collections.Generic; using System.Linq;
public static List<string> urlEncodedCharacters = new List<string>
{
    "/", "\\", "<", ">", ":", "\"", "|", "?", "%" //and others, but not *
};
//Since this is a superset of urlEncodedCharacters, we won't be able to only use UrlEncode() - instead we'll use HexEncode
public static List<string> specialCharactersNotAllowedInWindows = new List<string>
{
    "/", "\\", "<", ">", ":", "\"", "|", "?", "*" //windows disallowed character set
};
public static string Encode(string fileName)
{
    //CheckForFullPath(fileName); // optional: make sure it's not a path?
    // "%" has to be escaped first; otherwise the "%" of every escape produced below would be escaped again.
    List<string> charactersToChange = new List<string>(
        urlEncodedCharacters.Where(x => !specialCharactersNotAllowedInWindows.Contains(x))); // the non duplicates (%)
    charactersToChange.AddRange(specialCharactersNotAllowedInWindows);
    charactersToChange.ForEach(s => fileName = fileName.Replace(s, Uri.HexEscape(s[0]))); // "?" => "%3F"
    return fileName;
}
Thanks @simon-tewsi for the very useful table above!
Also useful: Path.GetInvalidFileNameChars() – m1m1k, Feb 8, 2013 at 22:06
Yes. Here's one way of doing it: foreach (char c in System.IO.Path.GetInvalidFileNameChars()) { filename = filename.Replace(c, '_'); } – netfed, Jun 25, 2013 at 2:02
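In addition to @Dan Herbert's answer, we should generally encode just the values.
Split has a params parameter, so Split('&','=') splits the expression by '&' and by '='; the odd elements are therefore the values to be encoded, as shown below.
public static void EncodeQueryString(ref string queryString)
{
    // Assumes every key has a value and that no raw value contains '&' or '='.
    var array = queryString.Split('&', '=');
    var result = new System.Text.StringBuilder();
    for (int i = 0; i < array.Length; i++)
    {
        if (i % 2 == 1)
        {
            // Odd elements are values: encode them.
            result.Append('=').Append(System.Web.HttpUtility.UrlEncode(array[i]));
        }
        else
        {
            // Even elements are keys: copy them unchanged.
            if (i > 0) result.Append('&');
            result.Append(array[i]);
        }
    }
    // Rebuild the string instead of calling Replace, which would also
    // change any key that happens to contain a value's text.
    queryString = result.ToString();
}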
For .NET Core users, use this:
Microsoft.AspNetCore.Http.Extensions.UriHelper.Encode(Uri uri)