Last Updated: February 25, 2016
· chluehr

Debugging encodings and character sets.

Garbled text on your screen?

  1. Put your data in a plain text file (use vim; you do not want BOMs in your data!)
  2. Run the command hexdump -C file
  3. Locate the strange characters and determine their byte sequences
  4. Look them up, e.g. here: UTF-8 charset table (German)
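If you would rather skip the manual lookup, steps 3 and 4 can be roughly sketched in Python. This is only an illustration, not part of the original workflow: the function name describe_non_ascii and the sample string are made up for this example, and it assumes the data decodes as UTF-8 at all.

```python
import unicodedata

def describe_non_ascii(data: bytes):
    """Return (byte offset, char, hex bytes, Unicode name) for each non-ASCII character.

    Hypothetical helper for illustration; assumes `data` is valid UTF-8.
    """
    text = data.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8
    rows = []
    offset = 0
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1:  # multi-byte sequence, i.e. not plain ASCII
            hex_bytes = " ".join(f"{b:02x}" for b in encoded)
            rows.append((offset, ch, hex_bytes, unicodedata.name(ch, "<unnamed>")))
        offset += len(encoded)
    return rows

# Sample data containing both spellings of the umlaut (made up for this example).
sample = "f\u00fcr u\u0308ber".encode("utf-8")
rows = describe_non_ascii(sample)
for row in rows:
    print(row)
```

The byte offsets should line up with the offsets shown in the left column of hexdump -C, which makes it easier to match a suspicious character to its bytes.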

An example, the German umlaut ü ("ue"):

The correct UTF-8 encoding is the following (you would see c3 bc in the hexdump):

U+00FC ü c3 bc LATIN SMALL LETTER U WITH DIA.

A valid UTF-8 character sequence that displays identically but is not the single character "ü"; it is a plain u followed by a combining mark (you would see 75 cc 88 in the hexdump):

U+0075 u 75 LATIN SMALL LETTER U
U+0308 ̈ cc 88 COMBINING DIAERESIS
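The two spellings are the precomposed and decomposed forms of the same letter, and Unicode normalization converts between them. A minimal Python sketch, assuming you want to confirm which form you have and bring the data to one consistent form (the variable names are just for this example):

```python
import unicodedata

precomposed = "\u00fc"   # ü as a single code point, U+00FC
decomposed = "u\u0308"   # u followed by U+0308 COMBINING DIAERESIS

# They look the same on screen but are different strings with different bytes.
same = precomposed == decomposed
precomposed_hex = precomposed.encode("utf-8").hex()
decomposed_hex = decomposed.encode("utf-8").hex()
print(same, precomposed_hex, decomposed_hex)

# NFC composes to the single code point, NFD decomposes to base letter + mark.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", precomposed)
print(nfc == precomposed, nfd == decomposed)
```

Normalizing both sides to the same form (NFC is a common choice) before comparing or storing strings avoids mismatches between text that looks identical.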
