Cork encoding

From Wikipedia, the free encyclopedia
Latin script character encoding used by LaTeX

The Cork encoding (also known as T1 or EC) is a character encoding used for encoding glyphs in fonts.[1] It is named after the city of Cork in Ireland, where it was introduced as a new encoding for LaTeX at a TeX Users Group (TUG) conference in 1990.[1] It contains 256 characters and supports most Western and Eastern European languages written in the Latin alphabet.[2]

Details

In 8-bit TeX engines the font encoding has to match the encoding of the hyphenation patterns, and the Cork encoding is the one most commonly used for those patterns.[3] In LaTeX one can switch to this encoding with \usepackage[T1]{fontenc}, while in ConTeXt MkII it is already the default encoding. In modern engines such as XeTeX and LuaTeX, Unicode is fully supported and the 8-bit font encodings are obsolete.
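
Why the two encodings have to match can be illustrated with a minimal Python sketch. It assumes, purely for illustration, a fragment of a pattern file that was written in ISO 8859-2 and then read as Cork positions. The four entries in CORK_SAMPLE are copied from the character set table in this article; the names CORK_SAMPLE and read_as_cork are hypothetical.

```python
# Four Cork positions copied from the character set table (row digit + column digit).
CORK_SAMPLE = {
    0xAA: "\u0142",  # l with stroke
    0xB3: "\u015F",  # s with cedilla
    0xB9: "\u017A",  # z with acute
    0xBC: "\u0133",  # ij ligature
}


def read_as_cork(data: bytes) -> str:
    """Interpret bytes as Cork positions; '?' marks positions not in the sample."""
    return "".join(CORK_SAMPLE.get(byte, "?") for byte in data)


# Two letters chosen only for illustration: l with stroke, z with acute.
letters = "\u0142\u017A"

# Assumption: the pattern file was written in ISO 8859-2 instead of Cork.
latin2_bytes = letters.encode("iso8859_2")

# The same bytes select different glyphs in a Cork-encoded font.
mismatched = read_as_cork(latin2_bytes)

# Bytes that really are Cork positions give the intended letters.
matched = read_as_cork(bytes([0xAA, 0xB9]))

print(latin2_bytes.hex(), mismatched, matched)
```

With mismatched encodings the engine would hyphenate and typeset different letters from the ones the pattern author intended, which is why an 8-bit setup keeps both in the same encoding.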

Character set

Cork encoding
Columns, left to right: 0 1 2 3 4 5 6 7 8 9 A B C D E F
0x  ` (0060)  ´ (00B4)  ˆ (02C6)  ˜ (02DC)  ¨ (00A8)  ˝ (02DD)  ˚ (02DA)  ˇ (02C7)  ˘ (02D8)  ¯ (00AF)  ˙ (02D9)  ¸ (00B8)  ˛ (02DB)  ‚ (201A)  ‹ (2039)  › (203A)
1x  “ (201C)  ” (201D)  „ (201E)  « (00AB)  » (00BB)  – (2013)  — (2014)  ZWSP [a] (200B)  ₀ [b] (2080)  ı [c] (0131)  ȷ [c] (0237)  ff (FB00)  fi (FB01)  fl (FB02)  ffi (FB03)  ffl (FB04)
2x  SP  !  "  #  $  %  &  ’ (2019)  (  )  *  +  ,  -  .  /
3x  0  1  2  3  4  5  6  7  8  9  :  ;  <  =  >  ?
4x  @  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O
5x  P  Q  R  S  T  U  V  W  X  Y  Z  [  \  ]  ^  _
6x  ‘ (2018)  a  b  c  d  e  f  g  h  i  j  k  l  m  n  o
7x  p  q  r  s  t  u  v  w  x  y  z  {  |  }  ~  SHY [d]
8x  Ă (0102)  Ą (0104)  Ć (0106)  Č (010C)  Ď (010E)  Ě (011A)  Ę (0118)  Ğ (011E)  Ĺ (0139)  Ľ (013D)  Ł (0141)  Ń (0143)  Ň (0147)  Ŋ (014A)  Ő (0150)  Ŕ (0154)
9x  Ř (0158)  Ś (015A)  Š (0160)  Ş (015E)  Ť (0164)  Ţ (0162)  Ű (0170)  Ů (016E)  Ÿ (0178)  Ź (0179)  Ž (017D)  Ż (017B)  IJ (0132)  İ (0130)  đ (0111)  § (00A7)
Ax  ă (0103)  ą (0105)  ć (0107)  č (010D)  ď (010F)  ě (011B)  ę (0119)  ğ (011F)  ĺ (013A)  ľ (013E)  ł (0142)  ń (0144)  ň (0148)  ŋ (014B)  ő (0151)  ŕ (0155)
Bx  ř (0159)  ś (015B)  š (0161)  ş (015F)  ť (0165)  ţ (0163)  ű (0171)  ů (016F)  ÿ (00FF)  ź (017A)  ž (017E)  ż (017C)  ij (0133)  ¡ (00A1)  ¿ (00BF)  £ (00A3)
Cx  À  Á  Â  Ã  Ä  Å  Æ  Ç  È  É  Ê  Ë  Ì  Í  Î  Ï
Dx  Ð [e]  Ñ  Ò  Ó  Ô  Õ  Ö  Œ (0152)  Ø  Ù  Ú  Û  Ü  Ý  Þ  SS [f] (1E9E)
Ex  à  á  â  ã  ä  å  æ  ç  è  é  ê  ë  ì  í  î  ï
Fx  ð  ñ  ò  ó  ô  õ  ö  œ (0153)  ø  ù  ú  û  ü  ý  þ  ß (00DF)
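
As a rough illustration of how the table maps 8-bit positions to Unicode, the following Python sketch builds a 256-entry lookup list from the code points listed above. It rests on assumptions the article does not state: cells shown without a code point are taken to match ASCII or ISO 8859-1 at the same position (rows Cx and Ex included), position 0x7F is mapped to U+00AD only because the table labels it SHY, and the names CORK_TO_UNICODE and decode_cork are hypothetical.

```python
# Code points for positions 0x00-0x1F, in table order.
ROWS_0_1 = [
    0x0060, 0x00B4, 0x02C6, 0x02DC, 0x00A8, 0x02DD, 0x02DA, 0x02C7,
    0x02D8, 0x00AF, 0x02D9, 0x00B8, 0x02DB, 0x201A, 0x2039, 0x203A,
    0x201C, 0x201D, 0x201E, 0x00AB, 0x00BB, 0x2013, 0x2014, 0x200B,
    0x2080, 0x0131, 0x0237, 0xFB00, 0xFB01, 0xFB02, 0xFB03, 0xFB04,
]

# Code points for positions 0x80-0xBF, in table order.
ROWS_8_B = [
    0x0102, 0x0104, 0x0106, 0x010C, 0x010E, 0x011A, 0x0118, 0x011E,
    0x0139, 0x013D, 0x0141, 0x0143, 0x0147, 0x014A, 0x0150, 0x0154,
    0x0158, 0x015A, 0x0160, 0x015E, 0x0164, 0x0162, 0x0170, 0x016E,
    0x0178, 0x0179, 0x017D, 0x017B, 0x0132, 0x0130, 0x0111, 0x00A7,
    0x0103, 0x0105, 0x0107, 0x010D, 0x010F, 0x011B, 0x0119, 0x011F,
    0x013A, 0x013E, 0x0142, 0x0144, 0x0148, 0x014B, 0x0151, 0x0155,
    0x0159, 0x015B, 0x0161, 0x015F, 0x0165, 0x0163, 0x0171, 0x016F,
    0x00FF, 0x017A, 0x017E, 0x017C, 0x0133, 0x00A1, 0x00BF, 0x00A3,
]

# Positions inside the ASCII / ISO 8859-1 ranges where the table gives
# a different code point. 0x7F -> U+00AD is an assumption (see lead-in).
OVERRIDES = {
    0x27: 0x2019, 0x60: 0x2018, 0x7F: 0x00AD,
    0xD7: 0x0152, 0xDF: 0x1E9E, 0xF7: 0x0153, 0xFF: 0x00DF,
}


def build_table() -> list:
    """Return a list where index = Cork position and value = Unicode string."""
    table = [chr(cp) for cp in ROWS_0_1]                # 0x00-0x1F
    table += [chr(pos) for pos in range(0x20, 0x80)]    # assumed ASCII
    table += [chr(cp) for cp in ROWS_8_B]               # 0x80-0xBF
    table += [chr(pos) for pos in range(0xC0, 0x100)]   # assumed ISO 8859-1
    for pos, cp in OVERRIDES.items():
        table[pos] = chr(cp)
    return table


CORK_TO_UNICODE = build_table()


def decode_cork(data: bytes) -> str:
    """Map each byte to the Unicode character at that Cork position."""
    return "".join(CORK_TO_UNICODE[byte] for byte in data)


print(decode_cork(b"\x8a\xf3d\xb9"))  # the Polish city name Łódź
```

A list indexed by byte value keeps the lookup trivial; a real converter would also need the reverse direction and a policy for characters that have no Cork position.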

Notes

  • Hexadecimal values given with the characters in the table are their Unicode code points.
  • The first 12 characters are often used as combining characters.
  a. 0x17 is dubbed a "compound word mark" (CWM) in the Cork encoding, and is an innovation of this standard. It is an invisible character that separates the parts of a compound word, for instance in German, in order to prevent aesthetic ligatures across compound boundaries.[2] It is mapped to the Unicode "zero-width space" (ZWSP, U+200B), defined at about the same time, whose purpose is similar, if not identical.
  b. 0x18 is a "small o", used to compose ‰ or ‱ (or arbitrarily smaller quantities) out of the percent sign (%).[2]
  c. Dotless i and dotless j may be used to compose accented variants such as i with macron (ī).
  d. 0x7F is the hyphenation character, not really a soft hyphen (SHY) as defined by Unicode.
  e. 0xD0 is used both as Eth (Ð, U+00D0) and as D with stroke (Đ, U+0110), which can cause problems in some situations, such as copying text from a PDF or hyphenation.
  f. 0xDF contains SS (two letters S). This allows TeX to convert the German lowercase ß into its uppercase form automatically.
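
The ambiguity at 0xD0 described in the notes above can be made concrete with a short Python sketch. It is a hypothetical partial encoder covering only four letters; the positions come from the character set table, and the names ENCODE, DECODE and round_trip are invented for this illustration.

```python
# Hypothetical partial encoder: Unicode character -> Cork position.
ENCODE = {
    "\u00D0": 0xD0,  # uppercase Eth
    "\u0110": 0xD0,  # uppercase D with stroke, shares the position
    "\u00F0": 0xF0,  # lowercase eth
    "\u0111": 0x9E,  # lowercase d with stroke
}

# Decoding direction as the table shows it: 0xD0 is listed as Eth.
DECODE = {
    0xD0: "\u00D0",
    0xF0: "\u00F0",
    0x9E: "\u0111",
}


def round_trip(text: str) -> str:
    """Encode to Cork positions and decode again."""
    return "".join(DECODE[ENCODE[ch]] for ch in text)


lower = round_trip("\u0111\u00F0")  # lowercase letters have separate positions
upper = round_trip("\u0110\u00D0")  # uppercase letters collapse onto 0xD0

print(lower, upper)
```

The lowercase pair survives the round trip, while uppercase D with stroke comes back as Eth, which is the kind of problem the note mentions for text copied from a PDF.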

Supported languages

The encoding supports most European languages written in the Latin alphabet. Notable exceptions are:

Languages with slightly suboptimal support include:

References

  1. Petrlik, Lukas (1996-06-19). "The Czech and Slovak Character Encoding Mess Explained". cs-encodings-faq. 1.10. Archived from the original on 2016-06-21. Retrieved 2016-06-21.
  2. Ferguson, Michael (1990), "Report on Multilingual Activities" (PDF), TUGboat, 11 (4): 514–516
  3. TeX hyphenation patterns