ISO/IEC 2022

ISO 2022
Language(s)	Various.
Standard	ISO 2022, ECMA 35, JIS X 0202
Classification	Stateful encoding
Transforms / Encodes	US-ASCII and, depending on implementation: GB 2312; JIS X 0201; JIS X 0208; JIS X 0212; JIS X 0213; KS X 1001; CNS 11643; Various others;
Succeeded by	ISO 10646 (Unicode)

ISO/IEC 2022 Information technology—Character code structure and extension techniques, is an ISO standard (equivalent to the ECMA standard ECMA-35^[1]) specifying

a technique for including multiple character sets in a single character encoding system, and
a technique for representing these character sets in both 7-bit and 8-bit systems using the same encoding.

Many of the character sets included as ISO/IEC 2022 encodings are double-byte encodings, in which two bytes correspond to a single character. This makes ISO/IEC 2022 a variable-width encoding. A specific implementation does not have to implement all of the standard, however; the conformance level and the supported character sets are defined by the implementation.

Introduction

Many languages or language families not based on the Latin alphabet, such as Greek, Cyrillic, Arabic, or Hebrew, have historically been represented on computers with different 8-bit extended ASCII encodings. Written East Asian languages, specifically Chinese, Japanese, and Korean, use far more characters than can be represented in an 8-bit byte and were first represented on computers with language-specific double-byte encodings.

ISO/IEC 2022 was developed as a technique to attack both of these problems: to represent characters in multiple character sets within a single character encoding, and to represent large character sets.

A second requirement of ISO/IEC 2022 was compatibility with 7-bit communication channels. Although ISO/IEC 2022 is an 8-bit code, any 8-bit sequence can be re-encoded to use only 7 bits without loss, normally with only a small increase in size.

To represent multiple character sets, the ISO/IEC 2022 character encodings include escape sequences which indicate the character set for the characters that follow. The escape sequences are registered with ISO and follow the patterns defined within the standard. These character encodings require data to be processed sequentially in a forward direction, since the correct interpretation of the data depends on previously encountered escape sequences. Note, however, that other standards such as ISO-2022-JP may impose extra conditions, for example that the current character set be reset to US-ASCII before the end of a line.

To represent large character sets, ISO/IEC 2022 builds on ISO/IEC 646's property that a seven-bit code normally defines 94 graphic (printable) characters (in addition to space and 33 control characters). Using two bytes, it is thus possible to represent up to 8836 (94×94) characters; and, using three bytes, up to 830584 (94×94×94) characters. Although the standard allows for them, no registered character set uses three bytes (the unregistered set that EUC-TW places in G2 does). For the two-byte character sets, the code point of each character is normally specified in so-called kuten (Japanese: 区点) form (sometimes called quwei (Chinese: 区位), especially when dealing with GB2312 and related standards), which specifies a zone (区, Japanese: ku, Chinese: qu), and the point (Japanese: 点 ten) or position (Chinese: 位 wei) of that character within the zone.
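The arithmetic above, and the relationship between kuten coordinates and the bytes that appear in GL, can be sketched as follows. This is a minimal illustration: the function names are hypothetical, and it assumes the common convention that zone and point each run from 1 to 94 and map onto the graphic byte range 0x21–0x7E by adding 0x20.

```python
def kuten_to_gl_bytes(ku: int, ten: int) -> bytes:
    """Map a (zone, point) pair, each in 1..94, to two 7-bit GL bytes."""
    if not (1 <= ku <= 94 and 1 <= ten <= 94):
        raise ValueError("ku and ten must both be in 1..94")
    # The 94 graphic positions occupy 0x21..0x7E, so offset each by 0x20.
    return bytes([0x20 + ku, 0x20 + ten])


def gl_bytes_to_kuten(pair: bytes) -> tuple[int, int]:
    """Inverse mapping: two GL bytes back to (zone, point)."""
    if len(pair) != 2 or not all(0x21 <= b <= 0x7E for b in pair):
        raise ValueError("expected two bytes in 0x21..0x7E")
    return pair[0] - 0x20, pair[1] - 0x20


# Capacity of two-byte and three-byte 94-sets.
print(94 ** 2, 94 ** 3)                # 8836 830584
print(kuten_to_gl_bytes(16, 1).hex())  # 3021
```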

The escape sequences therefore not only declare which character set is being used but also, through the known properties of that character set, indicate whether a 94-, 96-, 8836-, or 830584-character (or some other sized) encoding is being dealt with.

In practice, the escape sequences declaring the national character sets may be absent if context or convention dictates that a certain national character set is to be used. For example, ISO-8859-1 states that no defining escape sequence is needed and RFC 1922, which defines ISO-2022-CN, allows ISO-2022 SHIFT characters to be used without explicit use of escape sequences.

The ISO-2022 definitions of the ISO-8859-X character sets are specific fixed combinations of the components that form ISO-2022. Specifically, the lower control characters (C0), the US-ASCII character set (in GL), and the upper control characters (C1) are standard, and the high characters (GR) are defined for each of the ISO-8859-X variants; for example, ISO-8859-1 is defined by the combination of ISO-IR-1, ISO-IR-6, ISO-IR-77 and ISO-IR-100, with no shifts or character changes allowed.

Although ISO/IEC 2022 character sets using control sequences are still in common use, particularly ISO-2022-JP, most modern e-mail applications are moving to the simpler Unicode transforms such as UTF-8. The encodings that don't use control sequences, such as the ISO-8859 sets, are still very common.

Code structure

ISO/IEC 2022 coding specifies a two-layer mapping between character codes and displayed characters. Escape sequences allow any of a large registry of graphic character sets to be "designated" into one of four working sets, named G0 through G3, and shorter control sequences specify the working set that is "invoked" to interpret bytes in the stream.

Character codes from the 7-bit ASCII graphic range (0x20–0x7F), being on the left side of a character code table, are referred to as "GL" codes (with "GL" standing for "graphics left") while codes from the "high ASCII" range (0xA0–0xFF), if available, are referred to as the "GR" codes ("graphics right").

By default, GL codes specify G0 characters, and GR codes specify G1 characters, but this may be modified with control codes or by prior agreement:

Code	Abbr.	Name	Effect
`0x0F`	SI / LS0	Shift In / Locking shift zero	GL encodes G0 from now on
`0x0E`	SO / LS1	Shift Out / Locking shift one	GL encodes G1 from now on
`ESC 0x6E` (n)	LS2	Locking shift two	GL encodes G2 from now on
`ESC 0x6F` (o)	LS3	Locking shift three	GL encodes G3 from now on
`0x8E` or `ESC 0x4E` (N)	SS2	Single shift two	GL encodes G2 for next character only
`0x8F` or `ESC 0x4F` (O)	SS3	Single shift three	GL encodes G3 for next character only
`ESC 0x7E` (~)	LS1R	Locking shift one right	GR encodes G1 from now on
`ESC 0x7D` (})	LS2R	Locking shift two right	GR encodes G2 from now on
`ESC 0x7C` (|)	LS3R	Locking shift three right	GR encodes G3 from now on

Each of the four working sets may be a 94-character set or a 94ⁿ-character set. Additionally, G1 through G3 may be a 96- or 96ⁿ-character set. When one of the latter is invoked in the GL region, the space and delete characters (codes 0x20 and 0x7F) are not available.
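The locking shifts in the table above amount to a small state machine. The sketch below tracks which working set is invoked in GL and in GR after a byte string has been scanned. It is only an illustration: the function and table names are hypothetical, single shifts (SS2, SS3) are ignored because they do not change the locked state, and designation sequences are not interpreted.

```python
ESC = 0x1B

# Single-byte locking shifts: control byte -> (region, working set index)
SINGLE = {0x0F: ("GL", 0), 0x0E: ("GL", 1)}            # SI/LS0, SO/LS1
# Two-byte locking shifts: byte after ESC -> (region, working set index)
ESCAPED = {
    0x6E: ("GL", 2), 0x6F: ("GL", 3),                   # LS2, LS3
    0x7E: ("GR", 1), 0x7D: ("GR", 2), 0x7C: ("GR", 3),  # LS1R, LS2R, LS3R
}


def track_locking_shifts(data: bytes) -> dict[str, int]:
    """Return which working set (0..3) is invoked in GL and GR after data."""
    state = {"GL": 0, "GR": 1}  # defaults: GL encodes G0, GR encodes G1
    i = 0
    while i < len(data):
        b = data[i]
        if b in SINGLE:
            region, g = SINGLE[b]
            state[region] = g
        elif b == ESC and i + 1 < len(data) and data[i + 1] in ESCAPED:
            region, g = ESCAPED[data[i + 1]]
            state[region] = g
            i += 1  # skip the byte that followed ESC
        i += 1
    return state


print(track_locking_shifts(b"\x0eabc"))   # {'GL': 1, 'GR': 1}
print(track_locking_shifts(b"\x1bn\x1b}"))  # {'GL': 2, 'GR': 2}
```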

There are additional (rarely used) features for switching control character sets, but this is a single-level lookup: the 0x00–0x1F range is the C0 control character set, the 0x80–0x9F range is the C1 control character set, and there are escape sequences which switch in various alternatives. It is required that any C0 character set include the ESC character at position 0x1B, so that further changes are possible.

As seen in the SS2 and SS3 examples above, single control characters from the C1 control character set may be invoked using only 7 bits using the sequences ESC 0x40 (@) through ESC 0x5F (_). Additional control functions are assigned in the range ESC 0x60 (`) through ESC 0x7E (~). While this article describes escape sequences using the corresponding ASCII characters, they are actually defined in terms of byte values, and the graphic assigned to that byte value may be altered without affecting the control sequence.
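The 7-bit form of a C1 control is obtained by subtracting 0x40 from its 8-bit value and prefixing ESC, so 0x8E (SS2) becomes ESC 0x4E. A minimal sketch of that mapping in both directions follows; the function names are hypothetical, and the code deliberately leaves every other byte untouched (re-encoding GR graphic bytes for a 7-bit channel would additionally need shifts).

```python
ESC = 0x1B


def c1_to_7bit(data: bytes) -> bytes:
    """Rewrite each C1 control (0x80..0x9F) as ESC followed by byte - 0x40."""
    out = bytearray()
    for b in data:
        if 0x80 <= b <= 0x9F:
            out += bytes([ESC, b - 0x40])
        else:
            out.append(b)
    return bytes(out)


def c1_from_7bit(data: bytes) -> bytes:
    """Rewrite each ESC 0x40..0x5F pair as the single 8-bit C1 control."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ESC and i + 1 < len(data) and 0x40 <= data[i + 1] <= 0x5F:
            out.append(data[i + 1] + 0x40)
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


print(c1_to_7bit(b"\x8eA"))    # b'\x1bNA'
print(c1_from_7bit(b"\x1bO"))  # b'\x8f'
```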

Escape sequences to designate character sets take the form ESC I [I...] F, where there are one or more intermediate I bytes from the range 0x20–0x2F, and a final F byte from the range 0x40–0x7E. (The range 0x30–0x3F is reserved for private-use F bytes.) The I bytes identify the type of character set and the working set it is to be designated to, while the F byte identifies the character set itself.

Code	Hex	Abbr.	Name	Effect
`ESC ! F`	`1B 21 F`	CZD	C0-designate	F selects a C0 control character set to be used.
`ESC " F`	`1B 22 F`	C1D	C1-designate	F selects a C1 control character set to be used.
`ESC % F`	`1B 25 F`	DOCS	Designate other coding system	F selects an 8-bit code; use `ESC % @` to return to ISO/IEC 2022.
`ESC % / F`	`1B 25 2F F`	DOCS	Designate other coding system	F selects an 8-bit code; there is no standard way to return.
`ESC & F`	`1B 26 F`	IRR	Identify revised registration	F, adjusted to the range 1-63, indicates which revision of the immediately-following registration is needed, so that old systems know that they are old.
`ESC ( F`	`1B 28 F`	GZD4	G0-designate 94-set	F selects a 94-character set to be used for G0.
`ESC ) F`	`1B 29 F`	G1D4	G1-designate 94-set	F selects a 94-character set to be used for G1.
`ESC * F`	`1B 2A F`	G2D4	G2-designate 94-set	F selects a 94-character set to be used for G2.
`ESC + F`	`1B 2B F`	G3D4	G3-designate 94-set	F selects a 94-character set to be used for G3.
`ESC - F`	`1B 2D F`	G1D6	G1-designate 96-set	F selects a 96-character set to be used for G1.
`ESC . F`	`1B 2E F`	G2D6	G2-designate 96-set	F selects a 96-character set to be used for G2.
`ESC / F`	`1B 2F F`	G3D6	G3-designate 96-set	F selects a 96-character set to be used for G3.
`ESC $ F` or `ESC $ ( F`	`1B 24 F` or `1B 24 28 F`	GZDM4	G0-designate multibyte 94-set	F selects a 94ⁿ-character set to be used for G0.
`ESC $ ) F`	`1B 24 29 F`	G1DM4	G1-designate multibyte 94-set	F selects a 94ⁿ-character set to be used for G1.
`ESC $ * F`	`1B 24 2A F`	G2DM4	G2-designate multibyte 94-set	F selects a 94ⁿ-character set to be used for G2.
`ESC $ + F`	`1B 24 2B F`	G3DM4	G3-designate multibyte 94-set	F selects a 94ⁿ-character set to be used for G3.
`ESC $ - F`	`1B 24 2D F`	G1DM6	G1-designate multibyte 96-set	F selects a 96ⁿ-character set to be used for G1.
`ESC $ . F`	`1B 24 2E F`	G2DM6	G2-designate multibyte 96-set	F selects a 96ⁿ-character set to be used for G2.
`ESC $ / F`	`1B 24 2F F`	G3DM6	G3-designate multibyte 96-set	F selects a 96ⁿ-character set to be used for G3.

Note that the registry of F bytes is independent for the different types. The 94-character graphic set designated by ESC ( A through ESC + A is not related in any way to the 96-character set designated by ESC - A through ESC / A. And neither of those is related to the 94ⁿ-character set designated by ESC $ ( A through ESC $ + A, and so on; the final bytes must be interpreted in context. (Indeed, without any intermediate bytes, ESC A is a way of specifying the C1 control code 0x81.)

Also note that C0 and C1 control character sets are independent; the C0 control character set designated by ESC ! A (which happens to be the NATS control set for newspaper text transmission) is not the same as the C1 control character set designated by ESC " A (the CCITT attribute control set for Videotex).

Additional I bytes may be added before the F byte to extend the F byte range. This is currently only used with 94-character sets, where codes of the form ESC ( ! F have been assigned. At the other extreme, no multibyte 96-sets have been registered, so the sequences above are strictly theoretical.
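The structure of the designation sequences can be illustrated with a small classifier. This is a sketch under stated assumptions, not a complete parser: the `Designation` type and `parse_designation` function are hypothetical names, the DOCS and IRR sequences are rejected rather than interpreted, and the identity of the set named by the F byte is not looked up in the registry.

```python
from typing import NamedTuple


class Designation(NamedTuple):
    target: str      # "G0".."G3", or "C0"/"C1"
    size: int        # 94 or 96 for graphic sets, 32 for control sets
    multibyte: bool  # True when the sequence contains the "$" I byte
    ident: str       # any extra I bytes plus the F byte


# First I byte (after an optional "$") -> (working set, set size)
_GRAPHIC = {
    "(": ("G0", 94), ")": ("G1", 94), "*": ("G2", 94), "+": ("G3", 94),
    "-": ("G1", 96), ".": ("G2", 96), "/": ("G3", 96),
}


def parse_designation(seq: bytes) -> Designation:
    """Classify one complete designation sequence such as b'\\x1b$)C'."""
    if len(seq) < 3 or seq[0] != 0x1B:
        raise ValueError("expected ESC, at least one I byte, and an F byte")
    *inter, final = seq[1:]
    # 0x30..0x3F are private-use F bytes; registered ones start at 0x40.
    if not all(0x20 <= b <= 0x2F for b in inter) or not 0x30 <= final <= 0x7E:
        raise ValueError("malformed I or F byte")
    inter_s = bytes(inter).decode("ascii")
    multibyte = inter_s.startswith("$")
    if multibyte:
        inter_s = inter_s[1:]
        if inter_s == "":
            # Legacy forms ESC $ @, ESC $ A and ESC $ B imply G0.
            if chr(final) not in "@AB":
                raise ValueError("missing I byte selecting the working set")
            inter_s = "("
    head, extra = inter_s[0], inter_s[1:]
    ident = extra + chr(final)
    if not multibyte and head == "!":
        return Designation("C0", 32, False, ident)
    if not multibyte and head == '"':
        return Designation("C1", 32, False, ident)
    if head in _GRAPHIC:
        target, size = _GRAPHIC[head]
        return Designation(target, size, multibyte, ident)
    raise ValueError("not a character-set designation")


print(parse_designation(b"\x1b$)C"))
print(parse_designation(b"\x1b.A"))
```

Because the F-byte registry is separate for each type of set, a caller would need to key any lookup on the whole tuple rather than on `ident` alone.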

ISO/IEC 2022 character sets

Various ISO 2022 and other CJK encodings supported by Mozilla Firefox as of 2004. (This support has been reduced in later versions to avoid certain cross site scripting attacks.)

Character encodings using ISO/IEC 2022 mechanism include:

ISO-2022-JP. A widely used encoding for Japanese. Starts in ASCII and includes the following escape sequences
- ESC ( B to switch to ASCII (1 byte per character)
- ESC ( J to switch to JIS X 0201-1976 (ISO/IEC 646:JP) Roman set (1 byte per character)
- ESC $ @ to switch to JIS X 0208-1978 (2 bytes per character)
- ESC $ B to switch to JIS X 0208-1983 (2 bytes per character)
ISO-2022-JP-1. The same as ISO-2022-JP with one additional escape sequence
- ESC $ ( D to switch to JIS X 0212-1990 (2 bytes per character)
ISO-2022-JP-2. A multilingual extension of ISO-2022-JP. The same as ISO-2022-JP-1 with the following additional escape sequences ^[2]
- ESC $ A to switch to GB 2312-1980 (2 bytes per character)
- ESC $ ( C to switch to KS X 1001-1992 (2 bytes per character)
- ESC . A to switch to ISO/IEC 8859-1 high part, Extended Latin 1 set (1 byte per character) [designated to G2]
- ESC . F to switch to ISO/IEC 8859-7 high part, Basic Greek set (1 byte per character) [designated to G2]
ISO-2022-JP-3. The same as ISO-2022-JP with three additional escape sequences
- ESC ( I to switch to JIS X 0201-1976 Kana set (1 byte per character)
- ESC $ ( O to switch to JIS X 0213-2000 Plane 1 (2 bytes per character)
- ESC $ ( P to switch to JIS X 0213-2000 Plane 2 (2 bytes per character)
ISO-2022-JP-2004. The same as ISO-2022-JP-3 with one additional escape sequence
- ESC $ ( Q to switch to JIS X 0213-2004 Plane 1 (2 bytes per character)
ISO-2022-KR. An encoding for Korean.
- ESC $ ) C to switch to KS X 1001-1992,^[3]^[4] previously named KS C 5601-1987 (2 bytes per character) [designated to G1]
ISO-2022-CN. An encoding for Chinese.
- ESC $ ) A to switch to GB 2312-1980 (2 bytes per character) [designated to G1]
- ESC $ ) G to switch to CNS 11643-1992 Plane 1 (2 bytes per character) [designated to G1]
- ESC $ * H to switch to CNS 11643-1992 Plane 2 (2 bytes per character)
ISO-2022-CN-EXT. The same as ISO-2022-CN with six additional escape sequences
- ESC $ ) E to switch to ISO-IR-165 (2 bytes per character) [designated to G1]
- ESC $ + I to switch to CNS 11643-1992 Plane 3 (2 bytes per character) [designated to G3]
- ESC $ + J to switch to CNS 11643-1992 Plane 4 (2 bytes per character) [designated to G3]
- ESC $ + K to switch to CNS 11643-1992 Plane 5 (2 bytes per character) [designated to G3]
- ESC $ + L to switch to CNS 11643-1992 Plane 6 (2 bytes per character) [designated to G3]
- ESC $ + M to switch to CNS 11643-1992 Plane 7 (2 bytes per character) [designated to G3]
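As an illustration of how these escape sequences partition a byte stream, the following sketch splits a basic ISO-2022-JP string into runs, one per active character set. The table holds only the four ISO-2022-JP sequences listed above; `split_runs` and the other names are hypothetical, and a real decoder would also map each run to characters. The sample bytes encode the text "abc日本語xyz".

```python
# Escape sequence -> (character set label, bytes per character)
ISO_2022_JP = {
    b"\x1b(B": ("ASCII", 1),
    b"\x1b(J": ("JIS X 0201 Roman", 1),
    b"\x1b$@": ("JIS X 0208-1978", 2),
    b"\x1b$B": ("JIS X 0208-1983", 2),
}


def _run(label: str, width: int, raw: bytes) -> tuple[str, bytes]:
    """Build one run, rejecting a run that ends in the middle of a character."""
    if len(raw) % width:
        raise ValueError(f"truncated character in {label} run")
    return (label, raw)


def split_runs(data: bytes) -> list[tuple[str, bytes]]:
    """Split ISO-2022-JP bytes into (character set, raw bytes) runs."""
    runs = []
    label, width = ISO_2022_JP[b"\x1b(B"]  # the encoding starts in ASCII
    start = 0
    i = 0
    while i < len(data):
        if data[i] != 0x1B:
            i += 1
            continue
        seq = data[i:i + 3]  # every sequence in the table is 3 bytes long
        if seq not in ISO_2022_JP:
            raise ValueError(f"unsupported escape sequence {seq!r}")
        if i > start:
            runs.append(_run(label, width, data[start:i]))
        label, width = ISO_2022_JP[seq]
        i += 3
        start = i
    if start < len(data):
        runs.append(_run(label, width, data[start:]))
    return runs


sample = b"abc\x1b$BF|K\\8l\x1b(Bxyz"
for name, raw in split_runs(sample):
    print(name, raw)
```

Python's standard library also ships an `iso2022_jp` codec, so `sample.decode("iso2022_jp")` can be used to check the result of a hand-written splitter like this one.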

The character after the ESC (for single-byte character sets) or ESC $ (for multi-byte character sets) specifies the type of character set and the working set to which it is designated. In the above examples, the character ( (0x28) designates a 94-character set to the G0 working set. It may be replaced by ), * or + (0x29–0x2B) to designate to the G1–G3 working sets.

Two of the codes above are 96-character sets; in those examples, the character . (0x2E) designates to the G2 working set. It may be replaced with - or / (0x2D or 0x2F) to designate to the G1 or G3 working sets. As mentioned earlier, a 96-character set may not be designated to the G0 set.

There are three special cases for multi-byte codes. The code sequences ESC $ @, ESC $ A, and ESC $ B were all registered before the ISO/IEC 2022 standard was finalized, so they must be accepted as synonyms for the sequences ESC $ ( @ through ESC $ ( B, which designate to the G0 working set. The longer form may also be used, and may be adapted by changing the ( character to designate to the G1 through G3 working sets.

The standard also defines a way to specify coding systems that do not follow its own structure. Of particular interest, the sequence ESC % G designates the UTF-8 coding system, which does not reserve the range 0x80–0x9F for control characters.

Comparison with other encodings

Advantages

As ISO/IEC 2022's entire range of 94-set graphical character encodings can be invoked in GL, the available glyphs are not significantly limited by an inability to represent GR and C1, such as in a system limited to 7-bit encodings. It accordingly enables the representation of a large set of characters in such a system. Generally, this 7-bit compatibility is not really an advantage, except for backwards compatibility with older systems. The vast majority of modern computers use 8 bits for each byte.
As compared to Unicode, ISO/IEC 2022 sidesteps Han unification by using escape sequences to switch between discrete encodings for different East Asian languages. This avoids the issues associated with unification, such as difficulty supporting multiple CJK languages with their associated character variants in a single document and font.

Disadvantages

Since ISO/IEC 2022 is a stateful encoding, a program cannot jump into the middle of a block of text to search, insert or delete characters. This makes manipulation of the text very cumbersome and slow when compared to non-stateful encodings. Any jump into the middle of the text may require backing up to the previous escape sequence before the bytes following that escape sequence can be interpreted.
Due to the stateful nature of ISO/IEC 2022, an identical and equivalent character may be encoded in different character sets, which may be designated to any of G0 through G3, which may in turn be accessed using single shifts or by using locking shifts to GL or GR. Consequently, characters can be represented in multiple ways, meaning that two visually identical and equivalent strings cannot be reliably compared for equality at the byte level.
Some systems, like DICOM and several e-mail clients, use a variant of ISO-2022 in addition to supporting several other encodings.^[5] This type of variation makes it difficult to portably transfer text between computer systems.
UTF-1, the multi-byte Unicode transformation format compatible with ISO/IEC 2022, has various disadvantages in comparison with UTF-8, and switching from or to other charsets, as supported by ISO/IEC 2022, is typically unnecessary in Unicode documents.
Because of its escape sequences, it is possible to construct attack byte sequences that round-trip from ISO/IEC 2022 to Unicode and back. Use of this encoding is thus treated as suspicious by malware protection suites.^[6]
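The first two disadvantages can be demonstrated with Python's built-in `iso2022_jp` codec. This is a small sketch, assuming that codec is available (it is part of the standard CPython distribution); `decoded_equal` is a hypothetical helper. A redundant designation of ASCII changes the bytes but not the text, and a slice taken from the middle of a two-byte run is read as ASCII once the preceding escape sequence is lost.

```python
def decoded_equal(a: bytes, b: bytes, encoding: str = "iso2022_jp") -> bool:
    """Compare two encoded strings by decoded text rather than by bytes."""
    return a.decode(encoding) == b.decode(encoding)


# Same text, different bytes: the second form designates ASCII to G0
# again before the letter, which is redundant but permitted.
plain = b"A"
redundant = b"\x1b(BA"
print(plain == redundant)
print(decoded_equal(plain, redundant))

# Same bytes, different text: without the ESC $ B that preceded them,
# the two-byte codes are interpreted as six ASCII characters.
full = b"\x1b$BF|K\\8l\x1b(B"
middle = full[3:9]
print(full.decode("iso2022_jp"))
print(middle.decode("iso2022_jp"))
```

A byte-wise comparison or a mid-stream search would therefore need to normalise or replay the escape sequences first, which is the extra work the list above describes.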

References

[1] "Standard ECMA 35" (PDF).
[2] RFC 1554 - ISO-2022-JP-2: Multilingual Extension of ISO-2022-JP. Tools.ietf.org. Retrieved on 2014-05-20.
[3] "KS X 1001:1992" (PDF).
[4] "KS C 5601:1987" (PDF). 1988-10-01.
[5] "DICOM ISO 2022 variation".
[6] https://bugzilla.mozilla.org/show_bug.cgi?id=935453

Lunde, Ken. CJKV Information Processing. Cambridge, Massachusetts: O'Reilly & Associates, 1998. ISBN 1-56592-224-7.

External links

ISO/IEC 2022:1994
ISO/IEC 2022:1994/Cor 1:1999
ECMA-35, equivalent to ISO/IEC 2022 and freely downloadable.
International Register of Coded Character Sets to be Used with Escape Sequences, a full list of assigned character sets and their escape sequences
History of Character Codes in North America, Europe, and East Asia from 1999, rev. 2004
CJK.INF: a document on encoding Chinese, Japanese, and Korean (CJK) languages, including a discussion of the various variants of ISO/IEC 2022.

RFCs

RFC 1468: description of ISO-2022-JP
RFC 2237: description of ISO-2022-JP-1
RFC 1554: description of ISO-2022-JP-2
RFC 1922: description of ISO-2022-CN and ISO-2022-CN-EXT
RFC 1557: description of ISO-2022-KR

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.
