Here you can find character set and code page information from software vendors (Microsoft, HP, IBM, Sun, etc.) and international standards organizations (e.g. ISO, ECMA, INCITS). Push any "button" and you will be taken either to the chart of a code page provided by the vendor, or to the vendor's web page of links to code page charts. This gives you fast access to popular code pages, as well as access to more complete lists of code page charts.
The links are (mostly) organized by vendor or standards organization. Some code pages are listed redundantly, usually because the same code page is described by different vendors. Sometimes the difference is important: one vendor's view of a code page may differ from another's, and the character conversion or mapping tables in particular may be very different. Sometimes a code page has been updated and one vendor still refers to an earlier version of it.
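As a small illustration of how mapping tables for nominally the same code page can differ, the sketch below decodes one double-byte sequence with two of Python's built-in codecs. The codec names "shift_jis" and "cp932" are Python's own labels and the byte pair 0x81 0x60 is picked for illustration; neither comes from the vendor charts linked on this page.

```python
# One double-byte code, chosen for illustration only.
raw = b"\x81\x60"

# Python's "shift_jis" codec follows a JIS-based table,
# while "cp932" follows Microsoft's code page 932 table.
as_shift_jis = raw.decode("shift_jis")
as_cp932 = raw.decode("cp932")

print(f"shift_jis: U+{ord(as_shift_jis):04X}")
print(f"cp932:     U+{ord(as_cp932):04X}")
print("same character?", as_shift_jis == as_cp932)
```

With Python's tables the first decodes to U+301C (WAVE DASH) and the second to U+FF5E (FULLWIDTH TILDE), so text converted with one table and converted back with the other does not round-trip.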
Note that a "code page" is also known by various other names: codepage, encoding, charset, character set, coded character set (CCS), graphic character set, character map, et al. Some of these have more specific names, such as DBCS (double-byte character set) and MBCS (multi-byte character set). Some encodings are the result of transformations and are known as transformation formats; examples include Unicode UTF-8, UTF-16, and UTF-32.
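To make the idea of transformation formats concrete, here is a minimal sketch that encodes the same three characters in each of the three formats and compares the byte counts. The sample string and the variable names are illustrative assumptions, and the big-endian variants are used only so that no byte order mark is added.

```python
# An ASCII letter, a Hiragana letter, and a supplementary character (U+10400).
text = "A\u3042\U00010400"

sizes = {}
for name in ("utf-8", "utf-16-be", "utf-32-be"):
    encoded = text.encode(name)
    sizes[name] = len(encoded)
    print(f"{name:10} {len(encoded):2d} bytes  {encoded.hex()}")
```

The code points are identical in all three cases; only the byte-level representation changes.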
If you are interested in UTF-16 surrogate code points, or supplementary characters, see Setting up Microsoft Windows NT, 2000 or Windows XP to Support Unicode Supplementary Characters and Conversion Table: Unicode Surrogates to Scalar Value/UTF-32.
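The arithmetic behind a surrogate-to-scalar conversion is short enough to sketch. The function names below are hypothetical and the code is only an illustration of the standard UTF-16 formula, not a reproduction of the conversion table linked above.

```python
def surrogates_to_scalar(high: int, low: int) -> int:
    """Combine a UTF-16 surrogate pair into a Unicode scalar value."""
    if not 0xD800 <= high <= 0xDBFF:
        raise ValueError("high surrogate must be in D800..DBFF")
    if not 0xDC00 <= low <= 0xDFFF:
        raise ValueError("low surrogate must be in DC00..DFFF")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def scalar_to_surrogates(scalar: int) -> tuple:
    """Split a supplementary scalar value into (high, low) surrogates."""
    if not 0x10000 <= scalar <= 0x10FFFF:
        raise ValueError("only U+10000..U+10FFFF need surrogates")
    offset = scalar - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


high, low = scalar_to_surrogates(0x10400)
print(f"U+10400 -> {high:04X} {low:04X}")
print(f"D83D DE00 -> U+{surrogates_to_scalar(0xD83D, 0xDE00):04X}")
```

The scalar value is also the UTF-32 code unit, which is why the same arithmetic covers conversion to UTF-32.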
Other Unicode pages on this site that may be of interest include: Cheat Sheet: Unicode-Enabling Microsoft C/C++ Source Code, Hiragana Characters, Hebrew Characters, Benefits of the Unicode Standard, and the Compelling Unicode Demo.
I18n Guy's Hiragana Unicode Chart
Dik Winter's Character Set History
Piotr Trzcionkowski's Polish code page site (in Polish)
I18nGuru's Character Sets page
Character Encoding Model (TR-17)
I18n Guy's Hebrew Unicode Chart
ISO 6429 = ECMA-48 (pdf) (Control codes)
RFC 1555 Hebrew Character Encoding for Internet Messages
RFC 1556 Handling of Bi-directional Texts in MIME
RFC 1556 defines ISO-8859-6-e, ISO-8859-6-i, ISO-8859-8-e, ISO-8859-8-i
Armenian Character Sets ArmSCII
Czyborra's ISO 8859 Alphabet Soup
So vat's Unicode? Chicken soup?
In the following web pages, leadbytes are indicated by light gray background shading. Each of these leadbytes links to a new page showing the 256 character block associated with that leadbyte. Unused leadbytes are identified by a darker gray background.
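For readers unfamiliar with leadbytes, the following sketch shows how a leadbyte determines whether the next byte belongs to the same character. It assumes the commonly cited code page 932 leadbyte ranges 0x81-0x9F and 0xE0-0xFC; the function names are hypothetical, and the vendor's chart remains the authority on which leadbytes are actually in use.

```python
from typing import List


def is_cp932_lead_byte(b: int) -> bool:
    # Assumed leadbyte ranges for code page 932 (see the lead-in).
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def split_characters(data: bytes) -> List[bytes]:
    """Split a byte string into single-byte and double-byte characters."""
    chars = []
    i = 0
    while i < len(data):
        # A leadbyte consumes the following byte as its trail byte.
        width = 2 if is_cp932_lead_byte(data[i]) and i + 1 < len(data) else 1
        chars.append(data[i:i + width])
        i += width
    return chars


# Latin A, Hiragana A, and halfwidth Katakana A.
sample = "A\u3042\uff71".encode("cp932")
for ch in split_characters(sample):
    print(ch.hex(), "->", ch.decode("cp932"))
```

Each double-byte character here corresponds to one cell in the 256-character block that its leadbyte links to.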
I18n Guy's Hiragana Unicode Chart
Conversion Problems CP932 & Unicode
Simplified Chinese GBK (CP 936)
Microsoft's Windows code pages
Microsoft's Windows code pages by country
I18n Guy's Hiragana Unicode Chart
CP 00290 (EBCDIC) Japanese (Katakana) Non-extended
CP 00290 (EBCDIC) Japanese (Katakana) Extended
CP 00833 (EBCDIC) Korea Extended
CP 00836 (EBCDIC) Simplified Chinese Extended
CP 00903 (IBM PC) People's Republic of China (PRC)
CP 00904 (IBM PC) Republic of China (ROC)
CP 00905 (EBCDIC) Turkey Extended
CP 01027 (EBCDIC) Japanese (Latin) Extended
CP 01040 (IBM PC) Korean Extended
CP 01041 (IBM PC) Japanese Extended
CP 01042 (IBM PC) Simplified Chinese Extended
CP 01043 (IBM PC) Traditional Chinese
Copyright © 2002, 2003, 2004, 2005 Tex Texin. All rights reserved.