6

How to encode URLs containing Unicode? I would like to pass it to a command line utility and I need to encode it first.

Example: http://zh.wikipedia.org/wiki/白雜訊

becomes http://zh.wikipedia.org/wiki/%E7%99%BD%E9%9B%9C%E8%A8%8A.

lmcanavals
asked Sep 7, 2011 at 13:32
4 Comments

  • It seems the Stack Overflow text editor encoded the Unicode URL. I would like to do the same in C#. Click on the link to get the actual Unicode URL. Commented Sep 7, 2011 at 13:33
  • Stack Overflow didn’t do this – your browser did! It displays the URL as Unicode, but when you copy it, the copied text contains the URL-encoded string. Commented Sep 7, 2011 at 13:36
  • @KonradRudolph My browser, however, did not. I see it as what I presume to be Chinese characters. :) Commented Nov 14, 2011 at 17:10
  • @TheDag That’s a misconception: the browser may still display the URL as Unicode, but internally it’s URL-encoded. To check this, try copying the Unicode URL from the address bar and pasting it into a text field (but not the address bar). Commented Nov 14, 2011 at 17:55

4 Answers

8

You can use the HttpUtility.UrlPathEncode method in the System.Web assembly (requires the full .NET Framework 4 profile):

var encoded = HttpUtility.UrlPathEncode("http://zh.wikipedia.org/wiki/白雜訊");
answered Sep 7, 2011 at 13:36

3 Comments

How do I find the Unicode characters? The URL will be supplied by users, and I do not know where Unicode characters appear in it.
@Tomas: Updated answer in response to your comment.
Note that UrlPathEncode is the correct thing to do for characters in the path and other parts of the URL except for the hostname. If you have Unicode characters in the hostname of an IRI, then to make a URI of it you must encode them using the IDN algorithm (Punycode).
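To illustrate the hostname point in the last comment, here is a minimal sketch using System.Globalization.IdnMapping, which converts a Unicode host name to its ASCII (Punycode) form. The class and method names (HostEncoding, ToAsciiHost) and the sample host are made up for illustration; only the host part of a URL should be passed in, never the whole URL.

```csharp
using System;
using System.Globalization;

public static class HostEncoding
{
    // Hypothetical helper: converts a Unicode host name to its IDN (Punycode) form.
    // Labels that are already ASCII are returned unchanged.
    public static string ToAsciiHost(string unicodeHost)
    {
        var idn = new IdnMapping();
        return idn.GetAscii(unicodeHost);
    }
}
```

For example, HostEncoding.ToAsciiHost("bücher.example") returns "xn--bcher-kva.example". The path and query still need percent-encoding separately.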
4

According to MSDN, UrlPathEncode should no longer be used.

So the correct way of doing it now is:

var urlString = Uri.EscapeUriString("http://zh.wikipedia.org/wiki/白雜訊");
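One caveat for newer runtimes: on .NET 6 and later, Uri.EscapeUriString is itself marked obsolete (warning SYSLIB0013). A possible alternative, sketched below under the assumption that the input is an absolute URL the Uri class can parse, is to read the AbsoluteUri property, which returns the escaped form of the URL. The names UrlNormalizer and ToEscapedUrl are hypothetical, and Unicode host names are not handled here.

```csharp
using System;

public static class UrlNormalizer
{
    // Hypothetical helper: parses an absolute URL and returns its escaped form.
    // Throws UriFormatException if the input is not a valid absolute URL.
    public static string ToEscapedUrl(string url)
    {
        var uri = new Uri(url, UriKind.Absolute);
        // AbsoluteUri percent-encodes non-ASCII characters in the path and query,
        // while leaving delimiters such as "://", "/" and "?" intact.
        return uri.AbsoluteUri;
    }
}
```

Calling UrlNormalizer.ToEscapedUrl("http://zh.wikipedia.org/wiki/白雜訊") should yield the encoded URL from the question; it is worth verifying on the target runtime, since Uri parsing details have changed between .NET versions.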
Liam
answered Nov 14, 2016 at 12:44


0
Server.UrlEncode(s);

.NET strings are natively Unicode strings (UTF-16 in memory; UrlEncode percent-encodes their UTF-8 bytes by default), so you need do nothing more than invoke HttpServerUtility.UrlEncode (the so-called "intrinsic" Server property is available in most ASP.NET contexts where you may want to do this).

Liam
answered Sep 7, 2011 at 13:36

4 Comments

I do not want to encode :// characters, only Unicode characters.
You encode the individual parameter values, not the entire url.
If I pass a Unicode URL to Server.UrlEncode(s), it will encode all Unicode characters along with special URL characters like :, ? and //. I do not want to do that.
That's why you encode individual parameters. <a href="mysite.com?myParameter=<%=Server.UrlEncode("SomeUnicodeString")%>">My Link</a>
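The "encode each parameter value, not the whole URL" advice from these comments can also be sketched outside ASP.NET with Uri.EscapeDataString, which needs no reference to System.Web. The helper name, base URL, and parameter name below are made up for illustration.

```csharp
using System;

public static class QueryEncoding
{
    // Hypothetical helper: appends one query parameter to a base URL.
    // Only the name and value are escaped; the base URL is left as-is,
    // so its "://" and "/" delimiters survive.
    public static string WithParameter(string baseUrl, string name, string value)
    {
        return baseUrl + "?" + Uri.EscapeDataString(name) + "=" + Uri.EscapeDataString(value);
    }
}
```

For example, QueryEncoding.WithParameter("http://example.com/search", "q", "白雜訊 a&b") returns "http://example.com/search?q=%E7%99%BD%E9%9B%9C%E8%A8%8A%20a%26b". The space and the & inside the value are escaped, so they cannot be mistaken for query delimiters.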
-1

I had a problem with Turkish characters. Using <a href="/@Html.Raw(string)" solved it.

answered Dec 14, 2016 at 14:24

1 Comment

Using Html.Raw() here means introducing an XSS vulnerability. There's nothing special about the Turkish-I, so using @myStringValue will work without any problems unless you're not using Unicode/UTF-8 consistently in your project and/or project files.
