ISO/IEC 646
ISO 646 Invariant. Red Bowen knots (⌘) denote national code points. Other red characters are changed in noteworthy minor modifications.
Standard | ISO/IEC 646, ITU T.50 |
---|---|
Classification | 7-bit Basic Latin encoding |
Preceded by | US-ASCII |
Succeeded by | ISO 8859, ISO 10646 |
Other related encoding(s) | DEC NRCS, World System Teletext. Adaptations to other alphabets: ELOT 927, Symbol, KOI-7, SRPSCII and MAKSCII, ASMO 449, SI 960 |
ISO/IEC 646 is the name of a set of ISO standards, described as Information technology — ISO 7-bit coded character set for information interchange and developed in cooperation with ASCII at least since 1964.[1][2] Since its first edition in 1967[3] it has specified a 7-bit character code from which several national standards are derived.
ISO/IEC 646 was also ratified by ECMA as ECMA-6. The first version of ECMA-6 had been published in 1965,[4] based on work the ECMA's Technical Committee TC1 had carried out since December 1960.[4]
Characters in the ISO/IEC 646 Basic Character Set are invariant characters.[5] Because that invariant set, shared by all countries, specifies only the letters of the ISO basic Latin alphabet, countries that used additional letters had to create national variants of ISO 646 in order to write their native languages. Since transmission and storage of 8-bit codes was not standard at the time, the national characters had to fit within the constraints of 7 bits, meaning that some characters that appear in ASCII do not appear in the national variants of ISO 646.
History
ISO/IEC 646 and its predecessor ASCII (ASA X3.4) largely endorsed existing practice regarding character encodings in the telecommunications industry.
As ASCII did not provide a number of characters needed for languages other than English, a number of national variants were made that substituted some less-used characters with needed ones. Due to the incompatibility of the various national variants, an International Reference Version (IRV) of ISO/IEC 646 was introduced, in an attempt to at least restrict the replaced set to the same characters in all variants. The original version (ISO 646 IRV) differed from ASCII only in that, at code point 0x24, ASCII's dollar sign ($) was replaced by the international currency symbol (¤). The final version of the code, ISO 646:1991, is also known as ITU T.50, International Reference Alphabet or IRA, formerly International Alphabet No. 5 (IA5). This standard leaves 12 characters variable (two alternative graphic characters and 10 nationally defined characters) for users to assign. Among the possible assignments, the ISO 646:1991 IRV (International Reference Version) is explicitly defined and is identical to ASCII.[6]
The ISO 8859 series of standards governing 8-bit character encodings supersedes the ISO 646 international standard and its national variants by providing 96 additional characters with the additional bit, thus avoiding any substitution of ASCII codes. The ISO 10646 standard, directly related to Unicode, supersedes all of the ISO 646 and ISO 8859 sets with one unified set of character encodings using a larger 21-bit value.
A legacy of ISO/IEC 646 is visible on Windows, where in many East Asian locales the backslash character used in filenames is rendered as ¥ or other characters such as ₩. Despite the fact that a different code for ¥ was available even on the original IBM PC's code page 437, and a separate double-byte code for ¥ is available in Shift_JIS (although this often uses alternative mapping), so much text was created with the backslash code used for ¥ (due to Shift_JIS being officially based on ISO 646:JP, although Microsoft maps it as ASCII) that even modern Windows fonts have found it necessary to render the code that way. A similar situation exists with ₩ and EUC-KR. Another legacy is the existence of trigraphs in the C programming language.
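The trigraph legacy can be illustrated with a short sketch. C defines nine three-character sequences, each beginning with `??`, that stand in for punctuation occupying national code points in ISO 646. The Python below is only an illustration of that substitution step, not a C preprocessor; the names `TRIGRAPHS` and `replace_trigraphs` are hypothetical.

```python
# The nine C trigraphs and the characters they stand for.
TRIGRAPHS = {
    "??=": "#",
    "??(": "[",
    "??/": "\\",
    "??)": "]",
    "??'": "^",
    "??<": "{",
    "??!": "|",
    "??>": "}",
    "??-": "~",
}


def replace_trigraphs(source: str) -> str:
    """Scan left to right, replacing each trigraph with its single character."""
    out = []
    i = 0
    while i < len(source):
        chunk = source[i:i + 3]
        if chunk in TRIGRAPHS:
            out.append(TRIGRAPHS[chunk])
            i += 3
        else:
            out.append(source[i])
            i += 1
    return "".join(out)


print(replace_trigraphs("??=include <stdio.h>"))
print(replace_trigraphs("a??(0??) = '??/n';"))
```

With such sequences, source text could be typed on a terminal whose national variant had replaced brackets or braces with accented letters.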
Published standards
- ISO/R646-1967[3]
- ISO 646:1972[7]
- ISO 646:1983[8]
- ISO/IEC 646:1991[7][9]
- ECMA-6 (1965-04-30), first edition[4]
- ECMA-6 (1967-06), second edition[3][4]
- ECMA-6 (1970-07), third edition[4][10]
- ECMA-6 (1973-08), fourth edition[4][10]
- ECMA-6 (1984-12, 1985-03), fifth edition[4]
- ECMA-6 (1991-12, 1997-08), sixth edition[7]
Code page layout
The following table shows the ISO/IEC 646 Invariant character set. National code points are shown empty. Each character is shown with the hex code of its Unicode equivalent and the decimal value of the ISO/IEC 646 code. Certain code points (with a heavy border below) contain characters which may represent combinable (with the backspace character) diacritics in certain regions, which may affect glyph choice.
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0_ | NUL 0000 0 | SOH 0001 1 | STX 0002 2 | ETX 0003 3 | EOT 0004 4 | ENQ 0005 5 | ACK 0006 6 | BEL 0007 7 | BS 0008 8 | HT 0009 9 | LF 000A 10 | VT 000B 11 | FF 000C 12 | CR 000D 13 | SO 000E 14 | SI 000F 15 |
1_ | DLE 0010 16 | DC1 0011 17 | DC2 0012 18 | DC3 0013 19 | DC4 0014 20 | NAK 0015 21 | SYN 0016 22 | ETB 0017 23 | CAN 0018 24 | EM 0019 25 | SUB 001A 26 | ESC 001B 27 | FS 001C 28 | GS 001D 29 | RS 001E 30 | US 001F 31 |
2_ | SP 0020 32 | ! 0021 33 | " 0022 34 | 35 | 36 | % 0025 37 | & 0026 38 | ' 0027 39 | ( 0028 40 | ) 0029 41 | * 002A 42 | + 002B 43 | , 002C 44 | - 002D 45 | . 002E 46 | / 002F 47 |
3_ | 0 0030 48 | 1 0031 49 | 2 0032 50 | 3 0033 51 | 4 0034 52 | 5 0035 53 | 6 0036 54 | 7 0037 55 | 8 0038 56 | 9 0039 57 | : 003A 58 | ; 003B 59 | < 003C 60 | = 003D 61 | > 003E 62 | ? 003F 63 |
4_ | 64 | A 0041 65 | B 0042 66 | C 0043 67 | D 0044 68 | E 0045 69 | F 0046 70 | G 0047 71 | H 0048 72 | I 0049 73 | J 004A 74 | K 004B 75 | L 004C 76 | M 004D 77 | N 004E 78 | O 004F 79 |
5_ | P 0050 80 | Q 0051 81 | R 0052 82 | S 0053 83 | T 0054 84 | U 0055 85 | V 0056 86 | W 0057 87 | X 0058 88 | Y 0059 89 | Z 005A 90 | 91 | 92 | 93 | 94 | _ 005F 95 |
6_ | 96 | a 0061 97 | b 0062 98 | c 0063 99 | d 0064 100 | e 0065 101 | f 0066 102 | g 0067 103 | h 0068 104 | i 0069 105 | j 006A 106 | k 006B 107 | l 006C 108 | m 006D 109 | n 006E 110 | o 006F 111 |
7_ | p 0070 112 | q 0071 113 | r 0072 114 | s 0073 115 | t 0074 116 | u 0075 117 | v 0076 118 | w 0077 119 | x 0078 120 | y 0079 121 | z 007A 122 | 123 | 124 | 125 | 126 | DEL 007F 127 |
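As a minimal sketch of how the table can be used, the following Python checks whether a piece of text stays within the invariant set. The twelve national code points are the positions shown empty in the table (decimal 35, 36, 64, 91 to 94, 96 and 123 to 126); the names `NATIONAL_CODE_POINTS` and `is_invariant` are illustrative and not part of any standard.

```python
# Code points left to national variants (the empty cells of the table).
NATIONAL_CODE_POINTS = frozenset(
    [0x23, 0x24, 0x40, 0x5B, 0x5C, 0x5D, 0x5E, 0x60, 0x7B, 0x7C, 0x7D, 0x7E]
)


def is_invariant(text: str) -> bool:
    """Return True if every character is a 7-bit code outside the national positions."""
    return all(
        ord(ch) < 0x80 and ord(ch) not in NATIONAL_CODE_POINTS for ch in text
    )


def national_characters(text: str) -> list:
    """Return the characters of text that are not in the invariant set."""
    return [ch for ch in text if ord(ch) >= 0x80 or ord(ch) in NATIONAL_CODE_POINTS]


print(is_invariant("Hello, world!"))
print(national_characters("a[0] = b;"))
```

Text that passes this check should read the same under any national variant that respects the invariant set.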
Related encoding families
National Replacement Character Set
The National Replacement Character Set (NRCS) is a family of 7-bit encodings introduced in 1983 by DEC with the VT200 series of computer terminals. It is closely related to ISO 646, being based on a similar invariant subset of ASCII, differing in retaining $ as invariant but not _ (although most NRCS variants retain the _, and hence comply with the ISO 646 invariant set). Most NRCS variants are closely related to corresponding national ISO 646 variants where they exist, with the exception of the Dutch variant.
World System Teletext
The European telecommunications standard ETS 300 706, "Enhanced Teletext specification", defines Latin, Greek, Cyrillic, Arabic and Hebrew code sets with several national variants for both Latin and Cyrillic.[11] The G0 set of the Latin variant is a family of encodings, based on a similar invariant subset of ASCII as ISO 646 and NRCS, but retaining neither $ nor _ as invariant. Unlike NRCS, variants often differ considerably from corresponding national ISO 646 variants.
Variant codes and descriptions
ISO 646 national variants
Some national variants of ISO 646 are as follows:
Code | ISO-IR | ISO ESC | Approved | National Standard | Description |
---|---|---|---|---|---|
CA | 121 | ESC 2/8 7/7 | ISO 646 | CSA Z243.4-1985-1 | Canada (No. 1 alternative, with “î”) (French, classical) (Code page 1020[12]) |
CA2 | 122 | ESC 2/8 7/8 | ISO 646 | CSA Z243.4-1985-2 | Canada (No. 2 alternative, with “É”) (French, reformed orthography) |
CN | 57[13] | ESC 2/8 5/4 | ? | GB/T 1988-80 | People's Republic of China (Basic Latin) |
CU | 151 | ESC 2/8 2/1 4/1 | ISO 646 | NC 99-10:81 / NC NC00-10:81 | Cuba (Spanish) |
DANO | 9-1[14] | ESC 2/8 4/5[14] | SIS? | NATS-DANO | Norway and Denmark (journalistic texts). Invariant code point 0x22 is displayed as « (compare " in the IRV). It is, however, still considered a double quotation mark.[15] Accompanies SEFI (NATS-SEFI). |
DE | 21[14][13] | ESC 2/8 4/11[14] | ISO 646 | DIN 66003 | Germany (German) (Code page 1011,[16] 20106[17][18][19]) |
DK | — | — | ? | DS 2089[20][21] | Denmark (Danish) (Code page 1017[22]) |
ES | 17[14] | ESC 2/8 5/10[14] | ECMA | Olivetti | Spanish (international) (Code page 1023[23]) |
ES2 | 85[13] | ESC 2/8 6/8 | ECMA | IBM | Spain (Basque, Castilian, Catalan, Galician) (Code page 1014[24]) |
FI | 10[13] | ISO 646 | SFS 4017 | Finland (basic version) (Code page 1018[25]) | |
FR | 69[13] | ESC 2/8 6/6 | ISO 646 | AFNOR NF Z 62010-1982 | France (French) (Code page 1010[26]) |
FR1 | 25[14][13] | ESC 2/8 5/2[14] | ISO 646 | AFNOR NF Z 62010-1973 | France (obsolete since April 1985) (Code page 1104[27]) |
GB | 4[14][13] | ESC 2/8 4/1[14] | ISO 646 | BS 4730 | United Kingdom (English) (Code page 1013[28]) |
HU | 86 | ESC 2/8 6/9 | ISO 646 | MSZ 7795/3 | Hungary (Hungarian) |
IE | 207 | ? | NSAI 433:1996 | Ireland (Irish) | |
INV | 170 | ESC 2/8 2/1 4/2 | ISO 646 | ISO 646:1983 | Invariant subset |
(IRV) | 2[14][13] | ESC 2/8 4/0[14] | ISO 646 | ISO 646:1973 | International Reference Version. 0x7E as an overline (ISO-IR-002).[29] |
(IRV) | ? | ? | ISO 646 | ISO 646:1983 | International Reference Version. 0x7E as a tilde (Code page 1009,[30] 20105[17][18][31]). |
ISO 646:1991 International Reference Version matches the US variant (see below). | |||||
IS | ? | ? | ? | Iceland (Icelandic) | |
IT | 15[14][13] | ESC 2/8 5/9[14] | ECMA | UNI 0204-70 / Olivetti? | Italian (Code page 1012[32]) |
JP | 14[14][13] | ESC 2/8 4/10[14] | ISO 646 | JIS C 6220:1969-ro | Japan (Romaji) (Code page 895[33]). Also used as an 8-bit code with the corresponding Katakana supplementary set. |
JP-OCR-B | 92 | ESC 2/8 6/14 | ISO 646 | JIS C 6229-1984-b | Japan (OCR-B) |
KR | — | — | ? | KS C 5636-1989 | South Korea |
MT | — | — | ? | ? | Malta (Maltese, English) |
NL | — | — | ECMA | IBM | Netherlands (Dutch) (Code page 1019[34]) |
NO | 60[13] | ESC 2/8 6/0 | ISO 646 | NS 4551 version 1[13] | Norway (Code page 1016[35]) |
NO2 | 61[13] | ESC 2/8 6/1 | ISO 646 | NS 4551 version 2[13] | Norway (obsolete since June 1987) (Code page 20108[17][18][36]) |
pl | — | BN-74/3101-01 | Poland (Polish has 18 letters with diacritical marks, but only 9 lowercase letters are normalized due to code space reasons). | ||
PT | 16[13] | ESC 2/8 4/12 | ECMA | Olivetti | Portuguese (international) |
PT2 | 84[13] | ESC 2/8 6/7 | ECMA | IBM | Portugal (Portuguese, Spanish) (Code page 1015[37]) |
SE | 10[14][13] | ESC 2/8 4/7[14] | ISO 646 | SEN 850200 Annex B, SIS 63 61 27 | Sweden (basic Swedish) (Code page 1018,[25] D47) |
SE2 | 11[14][13] | ESC 2/8 4/8[14] | ISO 646 | SEN 850200 Annex C, SIS 63 61 27 | Sweden (extended Swedish for names) (Code page 20107,[17][18][38] E47) |
SEFI | 8-1[14] | ESC 2/8 4/3[14] | SIS | NATS-SEFI | Sweden and Finland (journalistic texts). Accompanies DANO (NATS-DANO). |
T.61-7bit | 102 | ESC 2/8 7/5 | ? | ITU/CCITT T.61 Recommendation | International (Teletex). Also used with the corresponding supplementary set as an 8-bit code. |
TW | — | — | ? | CNS 5205-1996 | Republic of China (Taiwan) |
US / (IRV) | 6[14][13] | ESC 2/8 4/2[14] | ISO 646 | ANSI X3.4-1968 and ISO 646:1983 (also IRV in ISO/IEC 646:1991) | United States (ASCII, Code page 367,[39] 20127[17][18][40]) |
YU | 141 | ESC 2/8 7/10 | ISO 646 | JUS I.B1.002 (YUSCII) | former Yugoslavia (Croatian, Slovene, Serbian, Bosnian) |
INIS | 49 | ESC 2/8 5/7 | IAEA | INIS | ISO 646 IRV subset |
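The "ISO ESC" column uses column/row notation, in which `c/r` denotes the byte with value 16 × c + r. So `ESC 2/8 4/11` is the byte sequence 0x1B 0x28 0x4B, that is ESC followed by `(` and `K`. A small sketch of the conversion follows; the function name `parse_iso_esc` is hypothetical, and citation markers such as `[14]` are assumed to have been stripped from the cell first.

```python
def parse_iso_esc(notation: str) -> bytes:
    """Convert a string such as 'ESC 2/8 4/11' into the bytes it denotes."""
    parts = notation.split()
    if not parts or parts[0] != "ESC":
        raise ValueError("expected the notation to start with 'ESC'")
    out = bytearray([0x1B])  # the ESC control character itself
    for part in parts[1:]:
        col_text, row_text = part.split("/")
        col, row = int(col_text), int(row_text)
        if not (0 <= col <= 7 and 0 <= row <= 15):
            raise ValueError("column must be 0-7 and row 0-15: " + part)
        out.append(col * 16 + row)
    return bytes(out)


print(parse_iso_esc("ESC 2/8 4/2"))      # US / IRV row
print(parse_iso_esc("ESC 2/8 4/11"))     # DE row
print(parse_iso_esc("ESC 2/8 2/1 4/1"))  # CU row
```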
National derivatives
Some national character sets also exist which are based on ISO 646 but do not strictly follow its invariant set (see also § Derivatives for other alphabets):
Character set | ISO-IR | ISO ESC | Approved | National Standard | Description |
---|---|---|---|---|---|
BS_viewdata | 47 | ESC 2/8 5/6 | British Post Office | Viewdata and Teletext. Viewdata square (⌗) substituted for normally invariant underscore (_) which cannot be displayed on the target hardware.[41] This is actually the encoding of Microsoft's WST_Engl. | |
GR / greek7 | 88 | ESC 2/8 6/10 | ? | HOS ELOT 927 | Greece (withdrawn in November 1986). Uses Greek letters in place of Roman ones[42] and hence is not strictly speaking an ISO 646 variant. |
greek7-old | 18 | ESC 2/8 5/11 | ECMA | ? | Greek graphic set. Similar in concept to greek7, but uses a different mapping of letters. Also, the upper case follows the lower case. |
latin-greek | 19 | ESC 2/8 5/12 | ECMA | ? | Latin-Greek combined graphics (capitals only). Follows greek7-old, but includes Latin capitals without modification, and Greek capitals over the Latin lower case. |
Latin-greek-1 | 27[14] | ESC 2/8 5/5[14] | ECMA | Honeywell-Bull | Latin-Greek mixed graphics (Greek capitals only).[14] Visually unifies Greek capitals with Latin capitals where possible, and adds the remaining Greek capitals. Unlike the other Greek versions, all Basic Latin letters remain intact. Replaces invariant punctuation as well as national characters, however,[43] and hence is still not strictly speaking an ISO 646 variant. |
swi | — | — | ECMA | Olivetti | Switzerland (French, German) (Code page 1021[44]). Invariant code point 0x5F is changed from _ to è. Is a DEC NRCS variant, closely related to ISO 646, but lacks a fully ISO 646 compliant equivalent. |
Control characters
All the variants listed above are solely graphical character sets, and are to be used with a C0 control character set such as listed in the following table:
ISO-IR | ISO ESC | Approved | Description |
---|---|---|---|
1[14] | ESC 2/1 4/0[14] | ISO 646 | ISO 646 controls[14] ("ASCII controls") |
7[14] | ESC 2/1 4/1[14] | ISO 646 | Scandinavian newspaper (NATS) controls[14] |
26[14] | ESC 2/1 4/3[14] | ISO 646 | IPTC controls[14] |
Associated supplementary character sets
The following table lists supplementary graphical character sets defined by the same standard as specific ISO 646 variants. These would be selected by using a mechanism such as shift out or the NATS super shift (single shift),[45] or by setting the eighth bit in environments where one was available:
ISO-IR | ISO ESC | National Standard | Description |
---|---|---|---|
8-2[14] | ESC 2/8 4/4[14] | NATS-SEFI-ADD | Supplementary code used with NATS-SEFI. |
9-2[14] | ESC 2/8 4/6[14] | NATS-DANO-ADD | Supplementary code used with NATS-DANO. |
13[14][13] | ESC 2/8 4/9[14] | JIS C 6220:1969-jp | Katakana, used as a supplementary code with ISO-646-JP. |
103 | ESC 2/8 7/6 | ITU/CCITT T.61 Recommendation, Supplementary Set | Supplementary code used with T.61. |
Variant comparison chart
The specifics of the changes for some of these variants are given in the following table. Character assignments unchanged across all listed variants (i.e. which remain the same as ASCII) are not shown.
For ease of comparison, variants detailed include national variants of ISO 646, DEC's closely related National Replacement Character Set (NRCS) series used on VT200 terminals, the related European World System Teletext encoding series defined in ETS 300 706, and a few other closely related encodings based on ISO 646. Individual code charts are linked from the second column. The cells with non-white background emphasize the differences from US-ASCII (also the Basic Latin subset of ISO/IEC 10646 and Unicode).
Several characters could be used as combining characters, when preceded or followed with a backspace C0 control. This is attested in the code charts for IRV, GB, FR1, CA and CA2, which note that the characters " ' , ^ would behave as, respectively, the diaeresis, acute accent, cedilla and circumflex (rather than quotation marks, a comma and an upward arrowhead) when preceded or followed by a backspace. The tilde character (~) was similarly introduced as a diacritic (˜). This encoding method originated in the typewriter/teletype era when use of backspace would overstamp a glyph, and may be considered deprecated.
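The backspace convention can be sketched as a conversion to Unicode combining marks. The pairing of each punctuation character with a diacritic follows the paragraph above; the choice of Unicode combining code points, the restriction to alphabetic base letters, and the name `compose_overstrikes` are assumptions made for illustration only.

```python
import unicodedata

BS = "\x08"  # the backspace C0 control

# Punctuation that acts as a diacritic when overstruck (assumed Unicode equivalents).
COMBINING = {
    '"': "\u0308",  # diaeresis
    "'": "\u0301",  # acute accent
    ",": "\u0327",  # cedilla
    "^": "\u0302",  # circumflex
    "~": "\u0303",  # tilde
}


def compose_overstrikes(text: str) -> str:
    """Turn 'letter BS mark' or 'mark BS letter' into a precomposed letter where possible."""
    out = []
    i = 0
    while i < len(text):
        if i + 2 < len(text) and text[i + 1] == BS:
            first, second = text[i], text[i + 2]
            if first.isalpha() and second in COMBINING:
                out.append(first + COMBINING[second])
                i += 3
                continue
            if first in COMBINING and second.isalpha():
                out.append(second + COMBINING[first])
                i += 3
                continue
        out.append(text[i])
        i += 1
    # NFC merges a base letter and combining mark into one code point when one exists.
    return unicodedata.normalize("NFC", "".join(out))


print(compose_overstrikes("nai\x08\"ve"))
print(compose_overstrikes("gar^\x08con c\x08,a"))
```

Punctuation that is not adjacent to a backspace is left untouched, so ordinary quotation marks and commas keep their usual meaning.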
Later, when wider character sets gained more acceptance, ISO 8859, vendor-specific character sets and eventually Unicode became the preferred methods of coding most of these variants.
Variant Code | Code Chart | Characters for each ISO 646 / NRCS compatible or derived charset | |||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
US / IRV (1991) | ISO-IR-006 | ! | " | # | $ | & | : | ? | @ | [ | \ | ] | ^ | _ | ` | { | | | } | ~ |
Older International Reference Versions | |||||||||||||||||||
IRV (1973) | ISO-IR-002 | ! | " | # | ¤ | & | : | ? | @ | [ | \ | ] | ^ | _ | ` | { | | | } | ‾ |
IRV (1983) | CP01009 | ! | " | # | ¤ | & | : | ? | @ | [ | \ | ] | ^ | _ | ` | { | | | } | ~ |
Invariant and other IRV subsets | |||||||||||||||||||
INV | ISO-IR-170 | ! | " | | | & | : | ? | | | | | | _ | | | | | |
INV (NRCS)[lower-alpha 1] | --- | ! | " | | $ | & | : | ? | | | | | | | | | | | |
INV (Teletext)[lower-alpha 1] | ETS WST[46] | ! | " | | | & | : | ? | | | | | | | | | | | |
INIS Subset[lower-alpha 1] | ISO-IR-049 | | | | $ | | : | | | [ | | ] | | | | | | | | |
T.61 | ISO-IR-102 | ! | " | # | ¤ | & | : | ? | @ | [ | | ] | | _ | | | | | | |
East Asian | |||||||||||||||||||
JP | ISO-IR-014 | ! | " | # | $ | & | : | ? | @ | [ | ¥ | ] | ^ | _ | ` | { | | | } | ‾ |
JP-OCR-B | ISO-IR-092 | ! | " | # | $ | & | : | ? | @ | [ | ¥ | ] | ^ | _ | | { | | | } | |
KR | (KS X 1003) | ! | " | # | $ | & | : | ? | @ | [ | ₩ | ] | ^ | _ | ` | { | | | } | ‾ |
CN | ISO-IR-057 | ! | " | # | ¥ | & | : | ? | @ | [ | \ | ] | ^ | _ | ` | { | | | } | ‾ |
TW | (CNS 5205) | ! | " | # | $ | & | : | ? | @ | [ | \ | ] | ^ | _ | ` | { | | | } | ‾ |
British and Irish | |||||||||||||||||||
GB | ISO-IR-004 | ! | " | £ | $ | & | : | ? | @ | [ | \ | ] | ^ | _ | ` | { | | | } | ‾ |
GB (NRCS) | CP01101 | ! | " | £ | $ | & | : | ? | @ | [ | \ | ] | ^ | _ | ` | { | | | } | ~ |
Viewdata[lower-alpha 2][lower-alpha 3] | ISO-IR-047 | ! | " | £ | $ | & | : | ? | @ | ← | ½ | → | ↑ | ⌗ | ― | ¼ | ‖ | ¾ | ÷ |
IE | ISO-IR-207 | ! | " | £ | $ | & | : | ? | Ó | É | Í | Ú | Á | _ | ó | é | í | ú | á |
Francophone | |||||||||||||||||||
FR (1983) | ISO-IR-069 | ! | " | £ | $ | & | : | ? | à | ° | ç | § | ^ | _ | µ | é | ù | è | ¨ |
FR (1973)[lower-alpha 4] | ISO-IR-025 | ! | " | £ | $ | & | : | ? | à | ° | ç | § | ^ | _ | ` | é | ù | è | ¨ |
FR Teletext[lower-alpha 3] | ETS WST[11] | ! | " | é | ï | & | : | ? | à | ë | ê | ù | î | ⌗ | è | â | ô | û | ç |
CA[lower-alpha 4] | ISO-IR-121 | ! | " | # | $ | & | : | ? | à | â | ç | ê | î | _ | ô | é | ù | è | û |
CA2 | ISO-IR-122 | ! | " | # | $ | & | : | ? | à | â | ç | ê | É | _ | ô | é | ù | è | û |
Francophone-Germanophone | |||||||||||||||||||
swi (NRCS)[lower-alpha 3] | CP01021 | ! | " | ù | $ | & | : | ? | à | é | ç | ê | î | è | ô | ä | ö | ü | û |
Germanophone | |||||||||||||||||||
DE[lower-alpha 4][lower-alpha 5] | ISO-IR-021 | ! | " | # | $ | & | : | ? | § | Ä | Ö | Ü | ^ | _ | ` | ä | ö | ü | ß |
Nordic (Eastern) and Baltic | |||||||||||||||||||
FI / SE | ISO-IR-010 | ! | " | # | ¤ | & | : | ? | @ | Ä | Ö | Å | ^ | _ | ` | ä | ö | å | ‾ |
SE2[lower-alpha 6] | ISO-IR-011 | ! | " | # | ¤ | & | : | ? | É | Ä | Ö | Å | Ü | _ | é | ä | ö | å | ü |
SE (NRCS) | CP01106 | ! | " | # | $ | & | : | ? | É | Ä | Ö | Å | Ü | _ | é | ä | ö | å | ü |
FI (NRCS) | CP01103 | ! | " | # | $ | & | : | ? | @ | Ä | Ö | Å | Ü | _ | é | ä | ö | å | ü |
SEFI (NATS)[lower-alpha 7] | ISO-IR-008-1 | ! | " | # | $ | & | : | ? | UA | Ä | Ö | Å | ■ | _ | UB | ä | ö | å | – |
EE (Teletext)[lower-alpha 3] | ETS WST[11] | ! | " | # | õ | & | : | ? | Š | Ä | Ö | Ž | Ü | Õ | š | ä | ö | ž | ü |
LV / LT (Teletext)[lower-alpha 3] | ETS WST[11] | ! | " | # | $ | & | : | ? | Š | ė | ę | Ž | č | ū | š | ą | ų | ž | į |
Nordic (Western) | |||||||||||||||||||
DK | CP01017 | ! | " | # | ¤ | & | : | ? | @ | Æ | Ø | Å | Ü | _ | ` | æ | ø | å | ü |
DK/NO (NRCS) | CP01105 | ! | " | # | $ | & | : | ? | Ä | Æ | Ø | Å | Ü | _ | ä | æ | ø | å | ü |
DK/NO-alt (NRCS) | CP01107 | ! | " | # | $ | & | : | ? | @ | Æ | Ø | Å | ^ | _ | ` | æ | ø | å | ~ |
NO | ISO-IR-060 | ! | " | # | $ | & | : | ? | @ | Æ | Ø | Å | ^ | _ | ` | æ | ø | å | ‾ |
NO2 | ISO-IR-061 | ! | " | § | $ | & | : | ? | @ | Æ | Ø | Å | ^ | _ | ` | æ | ø | å | | |
DANO (NATS)[lower-alpha 7][lower-alpha 8] | ISO-IR-009-1 | ! | « | » | $ | & | : | ? | UA | Æ | Ø | Å | ■ | _ | UB | æ | ø | å | – |
IS | | ! | " | # | $ | & | : | ? | Ð | Þ | \ | Æ | Ö | _ | ð | þ | | | æ | ö |
Hispanophone | |||||||||||||||||||
ES | ISO-IR-017 | ! | " | # | $ | & | : | ? | § | ¡ | Ñ | ¿ | ^ | _ | ` | ° | ñ | ç | ~ |
ES (NRCS) | CP01023 | ! | " | £ | $ | & | : | ? | § | ¡ | Ñ | ¿ | ^ | _ | ` | ° | ñ | ç | ~ |
ES2 | ISO-IR-085 | ! | " | # | $ | & | : | ? | · | ¡ | Ñ | Ç | ¿ | _ | ` | ´ | ñ | ç | ¨ |
CU | ISO-IR-151 | ! | " | # | ¤ | & | : | ? | @ | ¡ | Ñ | ] | ¿ | _ | ` | ´ | ñ | [ | ¨ |
Hispanophone-Lusophone | |||||||||||||||||||
ES/PT Teletext[lower-alpha 3] | ETS WST[11] | ! | " | ç | $ | & | : | ? | ¡ | á | é | í | ó | ú | ¿ | ü | ñ | è | à |
Lusophone | |||||||||||||||||||
PT | ISO-IR-016 | ! | " | # | $ | & | : | ? | § | Ã | Ç | Õ | ^ | _ | ` | ã | ç | õ | ° |
PT2 | ISO-IR-084 | ! | " | # | $ | & | : | ? | ´ | Ã | Ç | Õ | ^ | _ | ` | ã | ç | õ | ~ |
PT (NRCS) | --- | ! | " | # | $ | & | : | ? | @ | Ã | Ç | Õ | ^ | _ | ` | ã | ç | õ | ~ |
Greek | |||||||||||||||||||
Latin-GR mixed[lower-alpha 3] | ISO-IR-027 | Ξ | " | Γ | ¤ | & | Ψ | Π | Δ | Ω | Θ | Φ | Λ | Σ | ` | { | | | } | ‾ |
ISO-IR-088 (GR / ELOT 927), ISO-IR-018 and ISO-IR-019 replace Roman letters with Greek letters and are detailed in a separate chart. | |||||||||||||||||||
Slavic (Latin script) | |||||||||||||||||||
YU | ISO-IR-141 | ! | " | # | $ | & | : | ? | Ž | Š | Đ | Ć | Č | _ | ž | š | đ | ć | č |
YU Teletext[lower-alpha 3] | ETS WST[11] | ! | " | # | Ë | & | : | ? | Č | Ć | Ž | Đ | Š | ë | č | ć | ž | đ | š |
YU-alt Teletext[lower-alpha 3] | ETS WST[11] | ! | " | # | $ | & | : | ? | Č | Ć | Ž | Đ | Š | ë | č | ć | ž | đ | š |
CS/CZ/SK (Teletext)[lower-alpha 3] | ETS WST[11] | ! | " | # | ů | & | : | ? | č | ť | ž | ý | í | ř | é | á | ě | ú | š |
pl | (BN-74/3101-01) | ! | " | # | zł | & | : | ? | ę | ź | \ | ń | ś | _ | ą | ó | ł | ż | ć |
PL Teletext[lower-alpha 3] | ETS WST[11] | ! | " | # | ń | & | : | ? | ą | zł | Ś | Ł | ć | ó | ę | ż | ś | ł | ź |
Adaptations for the Cyrillic script replace Roman letters and are detailed in a separate chart | |||||||||||||||||||
Other | |||||||||||||||||||
NL | CP01019 | ! | " | # | $ | & | : | ? | @ | [ | \ | ] | ^ | _ | ` | { | | | } | ‾ |
NL NRCS | CP01102 | ! | " | £ | $ | & | : | ? | ¾ | ij | ½ | | | ^ | _ | ` | ¨ | ƒ | ¼ | ´ |
IT[lower-alpha 4] | ISO-IR-015 | ! | " | £ | $ | & | : | ? | § | ° | ç | é | ^ | _ | ù | à | ò | è | ì |
IT (Teletext)[lower-alpha 3] | ETS WST[11] | ! | " | £ | $ | & | : | ? | é | ° | ç | → | ↑ | ⌗ | ù | à | ò | è | ì |
HU | ISO-IR-086 | ! | " | # | ¤ | & | : | ? | Á | É | Ö | Ü | ^ | _ | á | é | ö | ü | ˝ |
MT | --- | ! | " | # | $ | & | : | ? | @ | ġ | ż | ħ | ^ | _ | ċ | Ġ | Ż | Ħ | Ċ |
RO (Teletext)[lower-alpha 3] | ETS WST[11] | ! | " | # | ¤ | & | : | ? | Ţ | Â | Ş | Ă | Î | ı | ţ | â | ş | ă | î |
TR (Teletext)[lower-alpha 3] | ETS WST[11] | ! | " | TL | ğ | & | : | ? | İ | Ş | Ö | Ç | Ü | Ğ | ı | ş | ö | ç | ü |
- 1 2 3 Is a subset of one of the International Reference Versions of ISO 646, but does not include all characters which are present in the invariant set. Included for comparison.
- ↑ Also UK Teletext.
- 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Does not completely conform to the invariant set, but is a closely related derivative of ISO 646. Included here for comparison.
- 1 2 3 4 ISO 646 variant identical to NRCS variant.
- ↑ Also World System Teletext (DE)
- ↑ Also World System Teletext (SE/FI/HU)
- 1 2 The NATS charsets (e.g. NATS-SEFI) replace @ (0x40) and ` (0x60) with "Unit space A" (UA) and "Unit space B" (UB). The plain space (0x20) expands on justification. UA and UB are for fixed widths; UA must be at least as wide as UB. RFC 1345 maps UA and UB to ISO 10646 (UCS) code points U+E002 and U+E003 respectively, both in the Private Use Area (although it also lists PUA mappings for several other characters which now have UCS code points). Unicode contains a number of space characters which might approximately correspond.
- ↑ Conformance to the ISO 646 invariant set is questionable, but it is a closely related derivative of ISO 646. Included here for comparison.
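To make the chart concrete, the sketch below decodes the same 7-bit bytes under different national variants. Only the differences from US-ASCII are stored, taken from the DE and GB rows of the chart above; the names `VARIANTS` and `decode_iso646` are hypothetical, and this is not an existing library API.

```python
# Differences from US-ASCII, copied from the DE and GB rows of the chart.
VARIANTS = {
    "US": {},
    "DE": {
        0x40: "§", 0x5B: "Ä", 0x5C: "Ö", 0x5D: "Ü",
        0x7B: "ä", 0x7C: "ö", 0x7D: "ü", 0x7E: "ß",
    },
    "GB": {0x23: "£", 0x7E: "‾"},
}


def decode_iso646(data: bytes, variant: str) -> str:
    """Decode 7-bit data, applying the replacements of the chosen variant."""
    overrides = VARIANTS[variant]
    chars = []
    for byte in data:
        if byte > 0x7F:
            raise ValueError("not a 7-bit code: 0x%02X" % byte)
        chars.append(overrides.get(byte, chr(byte)))
    return "".join(chars)


print(decode_iso646(b"Gr}~e", "DE"))
print(decode_iso646(b"if (x) { y[0] = 1; }", "DE"))
print(decode_iso646(b"#5", "GB"))
```

The second call shows why program source written for ASCII was hard to read on a terminal set to a national variant: brackets and braces appear as accented letters.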
Derivatives for other alphabets
Some 7-bit character sets for non-Latin alphabets are derived from the ISO 646 standard. They do not themselves constitute ISO 646 variants because they do not follow its invariant code points (often replacing the letters of at least one case): the alphabets they support need more encoding space than the set of national code points provides. Examples include:
- 7-bit Turkmen (ISO-IR-230).[47]
- 7-bit Greek.
- In ELOT 927 (ISO-IR-088),[42] the Greek alphabet is mapped in alphabetical order (except for the final-sigma) to positions 0x61–0x71 and 0x73–0x79, on top of the Latin lowercase letters.
- ISO-IR-018[48] maps the Greek alphabet over both letter cases using a different scheme (not in alphabetical order, but trying where possible to match Greek letters over Roman letters which correspond in some sense), and ISO-IR-019[49] maps the Greek uppercase alphabet over the Latin lowercase letters using the same scheme as ISO-IR-018.
- The lower half of the Symbol font character encoding[50] uses its own scheme for mapping Greek letters of both cases over the ASCII Roman letters, also trying to map Greek letters over Roman letters which correspond in some sense, but making different decisions in this regard (see chart below). It also replaces invariant code points 0x22 and 0x27 and five national code points with mathematical symbols. Although not intended for use in typesetting Greek prose, it is sometimes used for that purpose.
- ISO-IR-027[43] (detailed in the chart above rather than below) includes the Latin alphabet unchanged, but adds some Greek capital letters which cannot be represented with Latin-script homoglyphs; while it is explicitly based on ISO 646, some of these are mapped to code points which are invariant in ISO 646 (0x21, 0x3A and 0x3F), and it is therefore not a true ISO 646 variant.
- The World System Teletext encoding for Greek uses yet another scheme of mapping Greek letters in alphabetical order over the ASCII letters of both cases, notably including several letters with diacritics.[51]
- 7-bit Cyrillic
- KOI-7 or Short KOI, used for Russian. The Cyrillic characters are mapped to positions 0x60–0x7E, on top of the Latin lowercase letters, matching homologous letters where possible (where в is mapped to w, not v). Superseded by the KOI-8 variants.
- SRPSCII and MAKSCII, Cyrillic variants of YUSCII (the Latin variant is YU/ISO-IR-141 in the chart above), used for Serbian and Macedonian respectively. Largely homologous to the Latin variant of YUSCII (following Serbian digraphia rules), except for Љ (lj), Њ (nj), Џ (dž) and ѕ (dz), which correspond to digraphs in Latin-script orthography, and are mapped over letters which are not used in Serbian or Macedonian (q, w, x, y).
- World System Teletext encodings for Russian/Bulgarian[52] and Ukrainian[53] use G0 sets similar to KOI-7 with some modifications. The corresponding encoding for Serbian Cyrillic and Macedonian[lower-alpha 1][54] uses a scheme based on the Teletext encoding for Latin-script Serbo-Croatian and Slovene, as opposed to the significantly different YUSCII.
- 7-bit Hebrew, SI 960. The Hebrew alphabet is mapped to positions 0x60–0x7A, on top of the lowercase Latin letters (and grave accent for aleph). 7-bit Hebrew was always stored in visual order. This mapping with the high bit set, i.e. with the Hebrew letters in 0xE0–0xFA, is ISO 8859-8. The World System Teletext encoding for Hebrew uses the same letter mappings, but uses BS_Viewdata as its base encoding (whereas SI 960 uses US-ASCII) and includes a shekel sign at 0x7B.
- 7-bit Arabic, ASMO 449 (ISO-IR-089).[55] The Arabic alphabet is mapped to positions 0x41–0x5A and 0x60–0x6A, on top of both uppercase and lowercase Latin letters.
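The relationship between SI 960 and ISO 8859-8 described above lends itself to a short sketch: setting the high bit on bytes 0x60–0x7A yields the ISO 8859-8 Hebrew letters, which Python's standard `iso8859_8` codec can decode. The function name `si960_to_unicode` is hypothetical, and no reordering is attempted, so the result stays in the stored (visual) order.

```python
def si960_to_unicode(data: bytes) -> str:
    """Decode 7-bit Hebrew (SI 960) by mapping 0x60-0x7A onto ISO 8859-8's 0xE0-0xFA."""
    if any(b > 0x7F for b in data):
        raise ValueError("SI 960 is a 7-bit code")
    shifted = bytes(b | 0x80 if 0x60 <= b <= 0x7A else b for b in data)
    return shifted.decode("iso8859_8")


# 0x60 (grave accent in ASCII) is aleph; 0x7A ('z' in ASCII) is tav.
print(si960_to_unicode(b"\x60"))
print(si960_to_unicode(b"mely"))  # final mem, vav, lamed, shin in stored order
```

Bytes outside 0x60–0x7A are left as their US-ASCII meaning, matching the statement that SI 960 uses US-ASCII as its base.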
A comparison of some of these encodings is below. Only one case is shown, except in instances where the cases are mapped to different letters. In such instances, the mapping with the smallest code is shown first. Possible transcriptions are given for some letters; where this is omitted, the letter can be considered to correspond to the Roman one which it is mapped over.
English (ASCII) | Cyrillic alphabets | Greek alphabet | Hebrew | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Semi-transliterative | Naturally ordered | ||||||||||
Russian (KOI-7) | Russian, Bulgarian (WST RU/BG) | Ukrainian (WST UKR) | Serbian (SRPSCII) | Macedonian (MAKSCII) | Serbian, Macedonian (WST SRP) | Greek (Symbol) | Greek (IR-18) | Greek (ELOT 927) | Greek (WST EL) | Hebrew (SI 960) | |
@ ` | Ю (ju/yu) | Ю (ju/yu) | Ю (ju/yu) | Ж (ž) | Ж (ž) | Ч (č) | ≅ ‾ | ´ ` | @ ` | ΐ ΰ | א (ʾ/ʔ) |
A | А | А (a/á) | А | А | А | А | Α | Α | Α | Α | ב (b) |
B | Б | Б | Б | Б | Б | Б | Β | Β | Β | Β | ג (g) |
C | Ц (c/ts) | Ц (c/ts) | Ц (c/ts) | Ц (c/ts) | Ц (c/ts) | Ц (c/ts) | Χ (ch/kh) | Ψ (ps) | Γ (g) | Γ (g) | ד (d) |
D | Д | Д | Д | Д | Д | Д | Δ | Δ | Δ | Δ | ה (h) |
E | Е (je/ye) | Е (je/ye) | Е (e) | Е (e) | Е (e) | Е (e) | Ε | Ε | Ε | Ε | ו (w) |
F | Ф | Ф | Ф | Ф | Ф | Ф | Φ (ph/f) | Φ (ph/f) | Ζ (z) | Ζ (z) | ז (z) |
G | Г | Г | Г | Г | Г | Г | Γ | Γ | Η (ē) | Η (ē) | ח (ch/kh) |
H | Х (h/kh/ch) | Х (h/kh/ch) | Х (h/kh/ch) | Х (h/kh/ch) | Х (h/kh/ch) | Х (h/kh/ch) | Η (ē) | Η (ē) | Θ (th) | Θ (th) | ט (tt) |
I | И | И | И (y) | И | И | И | Ι | Ι | Ι | Ι | י (j/y) |
J | Й (j/y) | Й (j/y) | Й (j/y) | Ј (j/y) | Ј (j/y) | Ј (j/y) | ϑ (th) ϕ (ph/f) | Ξ (x/ks) | | Κ (k) | ך (k final) |
K | К | К | К | К | К | К | Κ | Κ | Κ | Λ (l) | כ |
L | Л | Л | Л | Л | Л | Л | Λ | Λ | Λ | Μ (m) | ל |
M | М | М | М | М | М | М | Μ | Μ | Μ | Ν (n) | ם (m final) |
N | Н | Н | Н | Н | Н | Н | Ν | Ν | Ν | Ξ (x/ks) | מ (m) |
O | О | О | О | О | О | О | Ο | Ο | Ξ (x/ks) | Ο | ן (n final) |
P | П | П | П | П | П | П | Π | Π | Ο (o) | Π | נ (n) |
Q | Я (ja/ya) | Я (ja/ya) | Я (ja/ya) | Љ (lj/ly) | Љ (lj/ly) | Ќ (Ḱ/kj) | Θ (th) | ͺ ( | Π (p) | Ρ (r) | ס (s) |
R | Р | Р | Р | Р | Р | Р | Ρ | Ρ | Ρ | ʹ ς (s final) | ע (ʿ/ŋ) |
S | С | С | С | С | С | С | Σ | Σ | Σ | Σ | ף (p final) |
T | Т | Т | Т | Т | Т | Т | Τ | Τ | Τ | Τ | פ (p) |
U | У | У | У | У | У | У | Υ | Θ (th) | Υ | Υ | ץ (ṣ/ts final) |
V | Ж (ž) | Ж (ž) | Ж (ž) | В | В | В | ς (s final) ϖ (p) | Ω (ō) | Φ (f/ph) | Φ (f/ph) | צ (ṣ/ts) |
W | В (v) | В (v) | В (v) | Њ (nj/ny/ñ) | Њ (nj/ny/ñ) | Ѓ (ǵ/gj) | Ω (ō) | ς (s final) | ς (s final) | Χ (ch/kh) | ק (q) |
X | Ь (’) | Ь (’) | Ь (’) | Џ (dž) | Џ (dž) | Љ (lj/ly) | Ξ | Χ (ch/kh) | Χ (ch/kh) | Ψ (ps) | ר (r) |
Y | Ы (y/ı) | Ъ (″/ǎ/ŭ) | І (i) | Ѕ (dz) | Ѕ (dz) | Њ (nj/ny/ñ) | Ψ (ps) | Υ (u) | Ψ (ps) | Ω (ō) | ש (š/sh) |
Z | З | З | З | З | З | З | Ζ | Ζ | Ω (ō) | Ϊ | ת (t) |
[ { | Ш (š/sh) | Ш (š/sh) | Ш (š/sh) | Ш (š/sh) | Ш (š/sh) | Ћ (ć) | [ { | ᾿̃ ῾̃ | [ { | Ϋ | [ { |
\ | | Э (e) | Э (e) | Є (je/ye) | Ђ (đ/dj) | Ѓ (ǵ/gj) | Ж (ž) | ∴ | | ᾿ ῾ (h) | \ | | ά ό | \ | |
] } | Щ (šč) | Щ (šč) | Щ (šč) | Ћ (ć) | Ќ (Ḱ/kj) | Ђ (đ/dj) | ] } | ᾿´ ῾´ | ] } | έ ύ | ] } |
^ ~ | Ч (č) | Ч (č) | Ч (č) | Ч (č) | Ч (č) | Ш (š/sh) | ⊥ ~ | ˜ ¨ | ^ ‾ | ή ώ | ^ ‾ |
_ | Ъ (″) | Ы (y/ı) | Ї (ji/yi) | _ | _ | Џ (dž) | _ | _ | _ | ί | _ |
See also
- ISO basic Latin alphabet (consisting exactly of the letters in ISO 646)
- ASCII
- DEC National Replacement Character Set (NRCS)
- ISO/IEC 2022 Information technology: Character code structure and extension techniques
- ISO/IEC 6937 (ANSI)
- C Trigraph
- ITU T.50
- ISO/IEC JTC 1/SC 2
Footnotes
- ↑ Labelled "Cyrillic G0 Primary Set - Option 1 - Serbian/Croatian", but includes letters specific to Macedonian (and Croatian is written in Latin script).
References
- ↑ Mullendore, Ralph Elvin (1964) [1963]. Ptak, John F., ed. "On the Early Development of ASCII - The History of ASCII". JF Ptak Science Books (published March 2012). Archived from the original on 2016-05-26. Retrieved 2016-05-26.
- ↑ 6 and 7 Bit Coded Character Sets for Information Processing Interchange (draft), International Organization for Standardization, July 1964 (NB. 21 pages. With cover letter for the members of the X3.2 and Task Groups from Eric Clamons.)
- ↑ Mackenzie, Charles E. (1980). Coded Character Sets, History and Development. The Systems Programming Series (1 ed.). Addison-Wesley Publishing Company, Inc. pp. 7, 9, 412. ISBN 0-201-14460-3, ISBN 978-0-201-14460-4. LCCN 77-90165. Retrieved 2016-05-22.
- ↑ Standard ECMA-6: 7-Bit Coded Character Set (PDF) (5th ed.). Geneva, Switzerland: European Computer Manufacturers Association (Ecma). March 1985. Archived (PDF) from the original on 2016-05-29. Retrieved 2016-05-29. "The Technical Committee TC1 of ECMA met for the first time in December 1960 to prepare standard codes for Input/Output purposes. On April 30, 1965, Standard ECMA-6 was adopted by the General Assembly of ECMA."
- ↑ Bodfish, John; Wilson, Mark; Gregory, Stephen; Nye, Julie Blume. Bodfish, John, ed. "Invariant Character Handling". NISO Circulation Interchange Protocol. Colorado Department of Education, USA: NCIP Standing Committee (NCIP-SC). Archived from the original on 2013-12-24. Retrieved 2016-05-30.
- ↑ Demchenko, Yuri (2000) [1997]. "International Standardization of 7-Bit Codes, ISO 646". TERENA. 4. Archived from the original on 2016-06-17. Retrieved 2012-08-13.
- ↑ Standard ECMA-6: 7-Bit coded Character Set (PDF) (6th ed.). Geneva, Switzerland: European Computer Manufacturers Association (Ecma). August 1997 [December 1991]. Archived (PDF) from the original on 2016-05-29. Retrieved 2016-05-29.
- ↑ "Information processing -- ISO 7-bit coded character set for information interchange". 1983-07-01. ISO 646:1983. Archived from the original on 2016-05-30. Retrieved 2016-05-30.
- ↑ "Information technology -- ISO 7-bit coded character set for information interchange" (3rd ed.). 1991-12-16. ISO/IEC 646:1991. Archived from the original on 2016-05-30. Retrieved 2016-05-30.
- ↑ Standard ECMA-6: 7-Bit Input/Output Coded Character Set (PDF) (4th ed.). Geneva, Switzerland: European Computer Manufacturers Association (Ecma). August 1973. Archived (PDF) from the original on 2016-05-29. Retrieved 2016-05-29.
- ↑ "15.6.2 Latin National Option Sub-Sets, Table 36". ETS 300 706: Enhanced Teletext specification (PDF). European Telecommunications Standards Institute (ETSI). p. 115.
- ↑ "SBCS code page information - CPGID: 01020 / Name: Canadian (French) Variant". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1992-10-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ "HP PCL/PJL Reference PCL 5 Comparison Guide" (PDF) (2 ed.). Hewlett-Packard Company, LP. June 2003. HP part-number 502-0378. Archived from the original (PDF) on 2016-08-10. Retrieved 2016-08-10.
- ↑ Bemer, Robert William (1980). "Chapter 1: Inside ASCII". General Purpose Software (PDF). Best of Interface Age. 2. Portland, OR, USA: dilithium Press. pp. 1–50. ISBN 0-918398-37-1. LCCN 79-67462. Archived from the original on 2016-08-27. Retrieved 2016-08-27. From: Bemer, Robert William (May 1978). "Inside ASCII - Part I". Interface Age. Portland, OR, USA: dilithium Press. 3 (5): 96–102; Bemer, Robert William (June 1978). "Inside ASCII - Part II". Interface Age. Portland, OR, USA: dilithium Press. 3 (6): 64–74; Bemer, Robert William (July 1978). "Inside ASCII - Part III". Interface Age. Portland, OR, USA: dilithium Press. 3 (7): 80–87.
- ↑ "Graphic Character Set ISO-IR-009-1" (PDF). Information Technology Standards Commission of Japan (ITSCJ/IPSJ). Retrieved 1 February 2018.
- ↑ "SBCS code page information - CPGID: 01011 / Name: 7-Bit Germany F.R." IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1987-08-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ "Code Page Identifiers". Microsoft Developer Network. Microsoft. 2014. Archived from the original on 2016-06-19. Retrieved 2016-06-19.
- ↑ "Web Encodings - Internet Explorer - Encodings". WHATWG Wiki. 2012-10-23. Archived from the original on 2016-06-20. Retrieved 2016-06-20.
- ↑ Foller, Antonin (2014) [2011]. "German (IA5) encoding - Windows charsets". WUtils.com - Online web utility and help. Motobit Software. Archived from the original on 2016-06-20. Retrieved 2016-06-20.
- ↑ Danish Standard DS 2089: Application of ISO 7-bit coded character set. February 1974. UDC 681.3:003.62.
- ↑ Stroustrup, Bjarne (1994-03-29). Design and Evolution of C++ (1st ed.). Addison-Wesley Publishing Company. ISBN 0-201-54330-3.
- ↑ "SBCS code page information - CPGID: 01017 / Name: 7-Bit Denmark". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1987-08-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ "SBCS code page information - CPGID: 01023 / Name: Spain Variant". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1992-10-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ "SBCS code page information - CPGID: 01014 / Name: 7-Bit Spain". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1987-10-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ "SBCS code page information - CPGID: 01018 / Name: 7-Bit Finland/Sweden". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1987-08-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ "SBCS code page information - CPGID: 01010 / Name: 7-Bit France". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1987-08-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ "SBCS code page information - CPGID: 01104 / Name: French NRC Set". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1987-08-01. Archived from the original on 2016-06-21. Retrieved 2016-06-21.
- ↑ "SBCS code page information - CPGID: 01013 / Name: 7-Bit United Kingdom". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1987-08-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ "Graphic Character Set ISO-IR-002" (PDF). Information Technology Standards Commission of Japan (ITSCJ/IPSJ). Retrieved 1 February 2018.
- ↑ "SBCS code page information - CPGID: 01009 / Name: ISO IRV". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1990-04-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ Foller, Antonin (2014) [2011]. "Western European (IA5) encoding - Windows charsets". WUtils.com - Online web utility and help. Motobit Software. Archived from the original on 2016-06-20. Retrieved 2016-06-20.
- ↑ "SBCS code page information - CPGID: 01012 / Name: 7-Bit Italy". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1987-08-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ "SBCS code page information - CPGID: 00895 / Name: Japan 7-Bit Latin". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1986-10-01. Archived from the original on 2016-06-18. Retrieved 2016-06-18.
- ↑ "SBCS code page information - CPGID: 01019 / Name: 7-Bit Netherlands". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1987-08-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ "SBCS code page information - CPGID: 01016 / Name: 7-Bit Norway". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1987-08-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ Foller, Antonin (2014) [2011]. "Norwegian (IA5) encoding - Windows charsets". WUtils.com - Online web utility and help. Motobit Software. Archived from the original on 2016-06-20. Retrieved 2016-06-20.
- ↑ "SBCS code page information - CPGID: 01015 / Name: 7-Bit Portugal". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1987-08-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ Foller, Antonin (2014) [2011]. "Swedish (IA5) encoding - Windows charsets". WUtils.com - Online web utility and help. Motobit Software. Archived from the original on 2016-06-20. Retrieved 2016-06-20.
- ↑ "SBCS code page information - CPGID: 00367 / Name: ASCII". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1978-01-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ Foller, Antonin (2014) [2011]. "US-ASCII encoding - Windows charsets". WUtils.com - Online web utility and help. Motobit Software. Archived from the original on 2016-06-20. Retrieved 2016-06-20.
- ↑ "Graphic Character Set ISO-IR-047" (PDF). Information Technology Standards Commission of Japan (ITSCJ/IPSJ). Retrieved 1 February 2018.
- ↑ "Graphic Character Set ISO-IR-088" (PDF). Information Technology Standards Commission of Japan (ITSCJ/IPSJ). Retrieved 1 February 2018.
- ↑ "Graphic Character Set ISO-IR-027" (PDF). Information Technology Standards Commission of Japan (ITSCJ/IPSJ). Retrieved 1 February 2018.
- ↑ "SBCS code page information - CPGID: 01021 / Name: Switzerland Variant". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. 1. IBM. 1992-10-01. Archived from the original on 2016-06-17. Retrieved 2016-06-17.
- ↑ "Graphic Character Set ISO-IR-007" (PDF). Information Technology Standards Commission of Japan (ITSCJ/IPSJ). Retrieved 1 February 2018.
- ↑ "15.6.1 Latin G0 Set". ETS 300 706: Enhanced Teletext specification (PDF). European Telecommunications Standards Institute (ETSI). p. 114.
- ↑ "Graphic Character Set ISO-IR-230" (PDF). Information Technology Standards Commission of Japan (ITSCJ/IPSJ). Retrieved 23 March 2018.
- ↑ "Graphic Character Set ISO-IR-018" (PDF). Information Technology Standards Commission of Japan (ITSCJ/IPSJ). Retrieved 1 February 2018.
- ↑ "Graphic Character Set ISO-IR-019" (PDF). Information Technology Standards Commission of Japan (ITSCJ/IPSJ). Retrieved 1 February 2018.
- ↑ "Map (external version) from Mac OS Symbol character set to Unicode 4.0 and later".
- ↑ "15.6.8: Greek G0 Set". ETS 300 706: Enhanced Teletext specification (PDF). European Telecommunications Standards Institute (ETSI). p. 121.
- ↑ "15.6.5: Cyrillic G0 Set - Option 2 - Russian/Bulgarian". ETS 300 706: Enhanced Teletext specification (PDF). European Telecommunications Standards Institute (ETSI). p. 118.
- ↑ "15.6.6: Cyrillic G0 Set - Option 3 - Ukrainian". ETS 300 706: Enhanced Teletext specification (PDF). European Telecommunications Standards Institute (ETSI). p. 119.
- ↑ "15.6.4: Cyrillic G0 Set - Option 1 - Serbian/Croatian". ETS 300 706: Enhanced Teletext specification (PDF). European Telecommunications Standards Institute (ETSI). p. 117.
- ↑ "Graphic Character Set ISO-IR-089" (PDF). Information Technology Standards Commission of Japan (ITSCJ/IPSJ).
External links
- Zeichensatz nach ISO 646 (ASCII) (in German)
- History at GNU Aspell website
- ISO646 Character Tables by Koichi Yasuoka (安岡孝) (see Domestic ISO646 Character Tables and Quasi-ISO646 Character Tables)
- Turkish Text Deasciifier, a tool (based on statistical pentagram analysis of the Turkish language) that restores the diacritics normally needed in Turkish but missing from the US-ASCII set, resolving the ambiguity of ASCII-fied Turkish text.