I have recently seen a few URIs containing the query parameter "utf8=✓". My first impression (after thinking "mmm, looks cool") was that this could be used to detect a broken character encoding.
So, is this a better way to resolve potential problems with character encoding, or is it just a developer having fun with a hack?
-
I disagree. There are schemes out there that look like URNs and that take query parameters, such as Bitcoin. URIs are not confined to browsers. See en.wikipedia.org/wiki/URI_scheme. This question may also address the general case where character encoding is required when a browser accesses a protocol handler. – Gary, Oct 19, 2012 at 8:29
-
Give examples of these URLs or it didn't happen. – hakre, Oct 22, 2012 at 12:59
-
Off topic, but OK. Here's my personal donation Bitcoin URI: bitcoin:1KzTSfqjF2iKCduwz59nv2uqh1W2JsTxZH?amount=0.5&label=Agile%20Stack. Notice that the scheme is essentially a URN with query parameters, but it hands off to a protocol handler. This kind of URI could probably benefit from the "utf8=✓" workaround as well. – Gary, Oct 22, 2012 at 17:47
-
@GaryRowe So did you ever get any donations off that link? – Kyralessa, Sep 18, 2018 at 10:30
-
@Gary I can't imagine possibly being a millionaire because of an off-hand comment on stackexchange 12 years ago. You're insanely lucky. – stickynotememo, Jan 16, 2025 at 8:10
1 Answer
By default, older versions of IE (<=8) will submit form data in Latin-1 encoding if possible. Including a character that can't be expressed in Latin-1 forces IE to use UTF-8 encoding for its form submissions, which simplifies various backend processes, for example database persistence.
If the parameter were instead utf8=true, it wouldn't trigger the UTF-8 encoding in these browsers, because every character in "true" can be expressed in Latin-1.
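To make the difference between "✓" and "true" concrete, here is a minimal Python sketch. It does not reproduce IE's behaviour. It only checks which values fit in Latin-1 and shows how the same form field is percent-encoded under each encoding. The helper name `fits_latin1` and the sample field `q=café` are assumptions made up for illustration.

```python
from urllib.parse import parse_qs, urlencode


def fits_latin1(text: str) -> bool:
    """Return True if every character in text can be encoded as Latin-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# "true" fits in Latin-1, so it gives a browser no reason to leave Latin-1.
# The check mark (U+2713) does not fit, so Latin-1 is not an option for it.
print(fits_latin1("true"))
print(fits_latin1("✓"))

# The same (hypothetical) field value is ambiguous on the wire unless the
# server knows which encoding the client used.
as_latin1 = urlencode({"q": "café"}, encoding="latin-1")
as_utf8 = urlencode({"utf8": "✓", "q": "café"}, encoding="utf-8")
print(as_latin1)
print(as_utf8)

# Decoding the UTF-8 submission round-trips both fields.
decoded = parse_qs(as_utf8, encoding="utf-8")
print(decoded)
```

With the check mark present, the whole submission has to use an encoding that can represent it, so the backend can decode every field as UTF-8 rather than guessing between `caf%E9` and `caf%C3%A9`.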
-
@LarsViklund I should have been clearer with my comment. I meant that the validation associated with character encoding is simplified, not bypassed. – Gary, Oct 13, 2012 at 13:48
-
@Lars Correct, it doesn't absolve you from having to check your input. But it does mean that encoding tweaks only become part of your security handling and don't taint the concept of your "standard processing" path. – Gareth, Oct 14, 2012 at 10:08
-
Also see stackoverflow.com/questions/3222013/…. Apparently Ruby on Rails used to use a snowman character, which was later changed to a checkmark that was less ambiguous but less funny. – Jack V., Oct 17, 2012 at 10:06
-
@JohnLBevan It's ignored by the receiving end; it has done its job, which is to force the browser to send things in UTF-8 instead of Latin-1. I've also seen it as ie=💩 (that's the 'pile of poo' code point; looks like it's not rendering in comments). – cabbey, Oct 18, 2012 at 19:54
-
@Gareth: Can you back up the statement that IE <= 8 forms do not support the document and/or form encoding? – hakre, Oct 22, 2012 at 13:00