ISO/IEC 10367

ISO/IEC 10367:1991 is a standard developed by ISO/IEC JTC 1/SC 2,[1] defining graphical character sets for use in character encodings implementing levels 2 and 3 of ISO/IEC 4873[2] (as opposed to ISO/IEC 8859, which defines character encodings at level 1 of ISO/IEC 4873).

Relationship to ISO/IEC 8859

The parts of ISO/IEC 8859 define complete encodings at level 1 of ISO/IEC 4873 (i.e. as stateless extended ASCII single-byte encodings, reserving the C1 area), and do not allow for use of multiple parts together. For use at levels 2 and 3 of ISO/IEC 4873 (i.e. with shift codes for additional graphical character sets), ISO/IEC 8859 stipulates that equivalent sets from ISO/IEC 10367 should be used instead.[3]

ISO/IEC 10367:1991 includes ASCII, together with sets matching the G1 sets used for the right-hand sides (non-ASCII parts) of ISO/IEC 6937 (ITU T.51) and of ISO/IEC 8859 parts 1 through 9 (i.e. those parts which existed as of 1991, when it was published). It also includes a set of additional Roman characters supplementing some of those parts, and a set of box drawing characters (both shown below).[2][4]

Supplementary G3 Latin set

ISO/IEC 10367 includes the ISO-IR-154 graphical set, which is intended to supplement Latin alphabets number 1, 2 and 5 (i.e. ISO-8859-1, ISO-8859-2 and ISO-8859-9).[4] Specifically, it is intended for use as a G3 set in a profile of ISO/IEC 4873 in which the G1 and G2 sets include the right hand side of ISO-8859-2, and also that of either ISO-8859-1 or ISO-8859-9.[5] These configurations allow the entire ISO/IEC 6937 repertoire (ITU T.51 Annex A) to be represented without the use of non-spacing codes.[6]

For instance, the letter Ĉ would be encoded under ISO/IEC 4873 level 2 as 0x8F 0x23 if this set is included.
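The byte sequence in that example can be illustrated with a minimal sketch. It assumes that SS3 is the C1 control 0x8F and that the byte following it selects a position in the 0x20 to 0x7F range of the G3 set, as the example implies. Only position 0x23 is stated in the text; positions 0x22 and 0x24 are inferred from the order of the first chart row, and the names `G3_SUBSET`, `encode_g3` and `decode_g3` are hypothetical.

```python
SS3 = 0x8F  # single-shift three (C1 control), as used in the example above

# Partial G3 table: position -> character.
# Only 0x23 is stated in the text; 0x22 and 0x24 are inferred from chart order.
G3_SUBSET = {0x22: "\u0100", 0x23: "\u0108", 0x24: "\u010A"}  # Ā, Ĉ, Ċ
G3_REVERSE = {ch: pos for pos, ch in G3_SUBSET.items()}


def encode_g3(ch: str) -> bytes:
    """Encode one character of the subset as SS3 followed by its position."""
    return bytes([SS3, G3_REVERSE[ch]])


def decode_g3(data: bytes) -> str:
    """Decode a two-byte SS3 sequence back to a character."""
    if len(data) != 2 or data[0] != SS3:
        raise ValueError("expected SS3 followed by exactly one byte")
    return G3_SUBSET[data[1]]


print(encode_g3("\u0108").hex(" "))  # 8f 23
```

A complete codec would also need the G0, G1 and G2 sets and would have to handle SS2; this sketch covers only the single-shift step for G3.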

Some characters in this set also appear in ISO-8859-1 (ð, Þ, þ, Ý and ý) or in ISO-8859-9 (Ğ, ğ, İ and ı). Under the current edition of ISO/IEC 4873 / ECMA-43 (although not earlier editions),[7] characters must be used from the lowest-numbered working set they appear in, hence those characters are not used from this G3 set when the respective ISO-8859 right-hand side set is used as the G1 or G2 set.[8]
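The unique-coding rule can be sketched as a lookup that returns the lowest-numbered working set containing a character. This is an illustration only: it assumes G0 is ASCII, G1 is the right-hand side of ISO-8859-2 and G2 is that of ISO-8859-1 or ISO-8859-9 (the text does not say which of the two 8859 sets is G1 and which is G2), and it uses Python's built-in codecs to test membership. `G3_SUBSET` holds just three characters from the supplementary set, and all function names are hypothetical.

```python
from typing import Optional

# Three characters taken from the supplementary G3 set: ð, Ĉ, ı
G3_SUBSET = {"\u00F0", "\u0108", "\u0131"}


def in_right_half(ch: str, codec: str) -> bool:
    """True if `ch` maps to a single byte in 0xA0-0xFF under `codec`."""
    try:
        encoded = ch.encode(codec)
    except UnicodeEncodeError:
        return False
    return len(encoded) == 1 and encoded[0] >= 0xA0


def lowest_working_set(ch: str, g1: str = "iso8859_2",
                       g2: str = "latin_1") -> Optional[int]:
    """Return 0-3 for the lowest-numbered set containing `ch`, else None."""
    if 0x20 <= ord(ch) <= 0x7E:
        return 0  # G0: ASCII
    if in_right_half(ch, g1):
        return 1
    if in_right_half(ch, g2):
        return 2
    if ch in G3_SUBSET:
        return 3
    return None


print(lowest_working_set("\u00F0"))                  # 2: ð is in ISO-8859-1
print(lowest_working_set("\u00F0", g2="iso8859_9"))  # 3: ð is not in ISO-8859-9
```

With ISO-8859-1 as G2, ð is taken from G2 and its G3 position goes unused; with ISO-8859-9 as G2 the same holds for ı instead.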

ISO/IEC 10367 supplementary G3 Latin set[5]

      | _0     | _1     | _2     | _3     | _4     | _5     | _6     | _7     | _8     | _9     | _A     | _B     | _C     | _D     | _E     | _F
2_/A_ |        |        | Ā 0100 | Ĉ 0108 | Ċ 010A |        | Ė 0116 | Ē 0112 | Ĝ 011C | ‘ 2018 | “ 201C | ™ 2122 | ← 2190 | ↑ 2191 | → 2192 | ↓ 2193
3_/B_ |        |        | ā 0101 | ĉ 0109 | ċ 010B | ð 00F0 | ė 0117 | ē 0113 | ĝ 011D | ’ 2019 | ” 201D | ♪ 266A | ⅛ 215B | ⅜ 215C | ⅝ 215D | ⅞ 215E
4_/C_ |        | Ğ 011E | Ġ 0120 | Ģ 0122 | Ĥ 0124 | Ħ 0126 | Ĩ 0128 | İ 0130 | Ī 012A | Į 012E | Ĳ 0132 | Ĵ 0134 | Ķ 0136 | Ļ 013B | Ŀ 013F | Ņ 0145
5_/D_ | — 2014 | Ŋ 014A | Ō 014C | Œ 0152 | Ŗ 0156 | Ŝ 015C | Ŧ 0166 | Þ 00DE | Ũ 0168 | Ŭ 016C | Ū 016A | Ų 0172 | Ŵ 0174 | Ý 00DD | Ŷ 0176 | Ÿ 0178
6_/E_ | Ω 2126 | ğ 011F | ġ 0121 | ģ 0123 | ĥ 0125 | ħ 0127 | ĩ 0129 | ı 0131 | ī 012B | į 012F | ĳ 0133 | ĵ 0135 | ķ 0137 | ļ 013C | ŀ 0140 | ņ 0146
7_/F_ | ĸ 0138 | ŋ 014B | ō 014D | œ 0153 | ŗ 0157 | ŝ 015D | ŧ 0167 | þ 00FE | ũ 0169 | ŭ 016D | ū 016B | ų 0173 | ŵ 0175 | ý 00FD | ŷ 0177 | ŉ 0149

Box drawing set

The following shows the box drawing set from ISO/IEC 10367, which is registered for ISO/IEC 2022 use as ISO-IR-155. Although it does not make use of the 0x20/A0 or 0x7F/FF positions, it is registered as a 96-character set.[9]

The Perl library libintl-perl includes an "ISO_10367-BOX" codec. It encodes and decodes ASCII over GL and the ISO-IR-155 box drawing set over GR, with a few deviations: it includes double-lined box-drawing characters in place of heavy-lined characters, and it replaces the upper half block (▀) at 0xCB with the private use character U+E019, documented as "Unit space B".[10]
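A decoder for this arrangement (ASCII over GL, the box drawing set over GR) might look like the following sketch. It uses the code points listed in the chart for this set and follows the registered set, so it does not reproduce the libintl-perl deviations described above. The names `GR_TABLE` and `decode_box` are hypothetical, and mapping unassigned bytes to U+FFFD is an assumption of this sketch rather than anything the standard prescribes.

```python
# Code points for rows C and D of the GR area, in column order.
ROW_C = [0x2503, 0x2501, 0x250F, 0x2513, 0x2517, 0x251B, 0x2523, 0x252B,
         0x2533, 0x253B, 0x254B, 0x2580, 0x2584, 0x2588, 0x25AA]
ROW_D = [0x2502, 0x2500, 0x250C, 0x2510, 0x2514, 0x2518, 0x251C, 0x2524,
         0x252C, 0x2534, 0x253C, 0x2591, 0x2592, 0x2593]

# Byte value -> character for the assigned GR positions.
GR_TABLE = {0xC0 + i: chr(cp) for i, cp in enumerate(ROW_C)}
GR_TABLE.update({0xD0 + i: chr(cp) for i, cp in enumerate(ROW_D)})


def decode_box(data: bytes) -> str:
    """Decode bytes as ASCII (below 0x80) plus the box drawing set (GR)."""
    out = []
    for b in data:
        if b < 0x80:
            out.append(chr(b))
        else:
            # Unassigned positions become U+FFFD (a choice made for this sketch).
            out.append(GR_TABLE.get(b, "\uFFFD"))
    return "".join(out)


# Draws a small box using light lines from row D.
print(decode_box(b"\xD2\xD1\xD3\n\xD4\xD1\xD5"))
```

Byte 0xCB decodes to the upper half block (U+2580) here, which is the position the libintl-perl codec maps to U+E019 instead.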

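ISO/IEC 10367 box drawing set[9]

      | _0     | _1     | _2     | _3     | _4     | _5     | _6     | _7     | _8     | _9     | _A     | _B     | _C     | _D     | _E     | _F
4_/C_ | ┃ 2503 | ━ 2501 | ┏ 250F | ┓ 2513 | ┗ 2517 | ┛ 251B | ┣ 2523 | ┫ 252B | ┳ 2533 | ┻ 253B | ╋ 254B | ▀ 2580 | ▄ 2584 | █ 2588 | ▪ 25AA |
5_/D_ | │ 2502 | ─ 2500 | ┌ 250C | ┐ 2510 | └ 2514 | ┘ 2518 | ├ 251C | ┤ 2524 | ┬ 252C | ┴ 2534 | ┼ 253C | ░ 2591 | ▒ 2592 | ▓ 2593 |        |

Rows 2_/A_, 3_/B_, 6_/E_ and 7_/F_ contain no characters.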

References

  1. ISO/IEC JTC 1/SC 2 (1991). "Information technology — Standardized coded graphic character sets for use in 8-bit codes". ISO. ISO/IEC 10367:1991.
  2. van Wingen, Johan W (1999). "8. Code Extension, ISO 2022 and 2375, ISO 4873 and 10367". Character sets. Letters, tokens and codes. Terena.
  3. ISO/IEC JTC 1/SC 2 (1998-02-12). Final Text of DIS 8859-10, Information Technology — 8-bit single-byte coded graphic character sets — Part 10: Latin alphabet No. 6 (PDF). ISO/IEC FDIS 8859-10:1998, JTC1/SC2 N2992, WG3 N415.
  4. "8-Bit Character Sets - ISO/IEC 10367". Guide to the use of Character Sets in Europe. DKUUG.
  5. ECMA (1990-03-01). "Supplementary Set for Latin Alphabets 1, 2 and 5" (PDF). ITSCJ/IPSJ. ISO-IR-154.
  6. ISO/IEC JTC 1/SC 2/WG 3 (1998-04-15). "Annex E: Alternative coded representation of the repertoire with no non-spacing diacritical marks". WD 6937, Coded graphic character set for text communication - Latin alphabet (PDF). p. 37. JTC1/SC2/N454.
  7. ECMA (1991). "Main differences between the second edition (1985) and the present (third) edition of this ECMA Standard". ECMA-43: 8-Bit Coded Character Set Structure and Rules (PDF) (ECMA Standard) (3rd ed.). p. 23.
  8. ECMA (1991). "Unique coding of characters". ECMA-43: 8-Bit Coded Character Set Structure and Rules (PDF) (ECMA Standard) (3rd ed.). p. 10.
  9. ISO/IEC JTC 1/SC 2/WG 3 (1990-04-16). "Basic Box-Drawings Set" (PDF). ITSCJ/IPSJ. ISO-IR-155.
  10. Flohr, Guido. "Conversion routines for ISO_10367_BOX". libintl-perl. Locale::RecodeData::ISO_10367_BOX.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.