Cork encoding

The Cork encoding (also known as T1 or EC) is a character encoding used for encoding glyphs in fonts.[1] It is named after the city of Cork in Ireland, where the new encoding for LaTeX was introduced at a TeX Users Group (TUG) conference in 1990.[1] It contains 256 characters and supports most Western and Eastern European languages written in the Latin alphabet.[2]

Details

In 8-bit TeX engines the font encoding has to match the encoding of the hyphenation patterns, which is where the Cork encoding is most commonly used.[3] In LaTeX one can switch to this encoding with \usepackage[T1]{fontenc}, while in ConTeXt MkII it is already the default encoding. In modern engines such as XeTeX and LuaTeX, Unicode is fully supported and 8-bit font encodings are obsolete.
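
As an illustration of the switch described above, the following is a minimal sketch of a document for an 8-bit engine such as pdfLaTeX. The sample sentence is an arbitrary choice made here, not taken from the sources; it only uses letters that have their own slots in the Cork encoding.

```latex
% Minimal sketch, assuming pdfLaTeX; XeLaTeX and LuaLaTeX do not need this.
\documentclass{article}
\usepackage[T1]{fontenc}    % select the Cork (T1) font encoding
\usepackage[utf8]{inputenc} % UTF-8 input; already the default in recent LaTeX releases
\begin{document}
% Illustrative text only: every accented letter below is a single T1 glyph.
Pěkně žluťoučký kůň; zażółć gęślą jaźń.
\end{document}
```

Because each accented letter is a single glyph in the font, the engine can treat such words like any other word when applying hyphenation patterns written for this encoding.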

Character set

Cork encoding
Each cell shows the glyph followed by its Unicode code point (hexadecimal). The slot number is 16 × the row digit plus the column digit; for example, row 8_, column _A is slot 0x8A (138).

    _0       _1       _2       _3       _4       _5       _6       _7       _8       _9       _A       _B       _C       _D       _E       _F
0_  ` 0060   ´ 00B4   ˆ 02C6   ˜ 02DC   ¨ 00A8   ˝ 02DD   ˚ 02DA   ˇ 02C7   ˘ 02D8   ¯ 00AF   ˙ 02D9   ¸ 00B8   ˛ 02DB   ‚ 201A   ‹ 2039   › 203A
1_  “ 201C   ” 201D   „ 201E   « 00AB   » 00BB   – 2013   — 2014   ZWSP 200B   ₀/‰ 2080/2030 [note 1]   ı 0131 [note 2]   ȷ 0237   ﬀ FB00   ﬁ FB01   ﬂ FB02   ﬃ FB03   ﬄ FB04
2_  SP 0020  ! 0021   " 0022   # 0023   $ 0024   % 0025   & 0026   ’ 2019   ( 0028   ) 0029   * 002A   + 002B   , 002C   - 002D   . 002E   / 002F
3_  0 0030   1 0031   2 0032   3 0033   4 0034   5 0035   6 0036   7 0037   8 0038   9 0039   : 003A   ; 003B   < 003C   = 003D   > 003E   ? 003F
4_  @ 0040   A 0041   B 0042   C 0043   D 0044   E 0045   F 0046   G 0047   H 0048   I 0049   J 004A   K 004B   L 004C   M 004D   N 004E   O 004F
5_  P 0050   Q 0051   R 0052   S 0053   T 0054   U 0055   V 0056   W 0057   X 0058   Y 0059   Z 005A   [ 005B   \ 005C   ] 005D   ^ 005E   _ 005F
6_  ‘ 2018   a 0061   b 0062   c 0063   d 0064   e 0065   f 0066   g 0067   h 0068   i 0069   j 006A   k 006B   l 006C   m 006D   n 006E   o 006F
7_  p 0070   q 0071   r 0072   s 0073   t 0074   u 0075   v 0076   w 0077   x 0078   y 0079   z 007A   { 007B   | 007C   } 007D   ~ 007E   SHY/‐ 00AD/2010 [note 3]
8_  Ă 0102   Ą 0104   Ć 0106   Č 010C   Ď 010E   Ě 011A   Ę 0118   Ğ 011E   Ĺ 0139   Ľ 013D   Ł 0141   Ń 0143   Ň 0147   Ŋ 014A   Ő 0150   Ŕ 0154
9_  Ř 0158   Ś 015A   Š 0160   Ş 015E   Ť 0164   Ţ 0162   Ű 0170   Ů 016E   Ÿ 0178   Ź 0179   Ž 017D   Ż 017B   Ĳ 0132   İ 0130   đ 0111   § 00A7
A_  ă 0103   ą 0105   ć 0107   č 010D   ď 010F   ě 011B   ę 0119   ğ 011F   ĺ 013A   ľ 013E   ł 0142   ń 0144   ň 0148   ŋ 014B   ő 0151   ŕ 0155
B_  ř 0159   ś 015B   š 0161   ş 015F   ť 0165   ţ 0163   ű 0171   ů 016F   ÿ 00FF   ź 017A   ž 017E   ż 017C   ĳ 0133   ¡ 00A1   ¿ 00BF   £ 00A3
C_  À 00C0   Á 00C1   Â 00C2   Ã 00C3   Ä 00C4   Å 00C5   Æ 00C6   Ç 00C7   È 00C8   É 00C9   Ê 00CA   Ë 00CB   Ì 00CC   Í 00CD   Î 00CE   Ï 00CF
D_  Ð/Đ 00D0 [note 4]   Ñ 00D1   Ò 00D2   Ó 00D3   Ô 00D4   Õ 00D5   Ö 00D6   Œ 0152   Ø 00D8   Ù 00D9   Ú 00DA   Û 00DB   Ü 00DC   Ý 00DD   Þ 00DE   SS 1E9E [note 5]
E_  à 00E0   á 00E1   â 00E2   ã 00E3   ä 00E4   å 00E5   æ 00E6   ç 00E7   è 00E8   é 00E9   ê 00EA   ë 00EB   ì 00EC   í 00ED   î 00EE   ï 00EF
F_  ð 00F0   ñ 00F1   ò 00F2   ó 00F3   ô 00F4   õ 00F5   ö 00F6   œ 0153   ø 00F8   ù 00F9   ú 00FA   û 00FB   ü 00FC   ý 00FD   þ 00FE   ß 00DF

Notes

  • The hexadecimal value shown with each character in the table is its Unicode code point.
  • The first 13 characters (slots 0x00 to 0x0C) are accents and are often used as combining characters.
  1. 0x18 is just a "trailing zero", used to compose the per mille sign (‰) or the per ten thousand sign (‱), or arbitrarily smaller quantities, from the percent sign (%).
  2. Dotless i and dotless j may be used to compose accented variants like i with macron (ī).
  3. 0x7F is the hyphenation character (not really a soft hyphen).
  4. 0xD0 is used both as Eth (Ð, U+00D0) and as D with stroke (Đ, U+0110), which can cause problems in some situations, for example when copying text from a PDF or in hyphenation.
  5. 0xDF contains SS (two letters S). It allows TeX to automatically convert the German lowercase ß into the uppercase form.
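
Several of the notes above can be tried out directly in LaTeX. The following is a minimal sketch, assuming pdfLaTeX and the standard T1 definitions shipped with LaTeX; the choice of test lines is illustrative only.

```latex
% Sketch for pdfLaTeX with the T1 encoding; comments restate the notes above.
\documentclass{article}
\usepackage[T1]{fontenc}
\begin{document}
\symbol{"19}\par          % glyph in slot 0x19 (dotless i)
\={\i}\par                % macron accent placed over the dotless i
\MakeUppercase{\ss}\par   % uppercase of sharp s; T1 provides SS in slot 0xDF
\DH\ and \DJ\par          % both commands are declared for slot 0xD0
\end{document}
```

In the output, \DH and \DJ should look identical, since both select the same slot. This is the ambiguity that note 4 refers to.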

Supported languages

The encoding supports most European languages written in the Latin alphabet. Notable exceptions are:

Languages with slightly suboptimal support include:

References

  1. Petrlik, Lukas (1996-06-19). "The Czech and Slovak Character Encoding Mess Explained". cs-encodings-faq. 1.10. Archived from the original on 2016-06-21. Retrieved 2016-06-21.
  2. Ferguson, Michael (1990), "Report on Multilingual Activities" (PDF), TUGboat, Volume 11 (Issue 4): 514–516
  3. TeX hyphenation patterns
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.