KOI8-R

KOI8-R (RFC 1489) is an 8-bit character encoding designed to cover Russian, which uses a Cyrillic alphabet. It also covers Bulgarian, but has not been used for that purpose since CP1251 was accepted. A derivative encoding, KOI8-U, adds Ukrainian characters. The original KOI-8 encoding was designed by Soviet authorities in 1974. KOI8 remains much more commonly used than ISO 8859-5, which never really caught on. Another common Cyrillic character encoding is Windows-1251. These older code pages are being replaced by Unicode, which can represent Cyrillic together with other languages.

KOI8-R
Language(s): Russian, Bulgarian
Classification: 8-bit KOI, extended ASCII
Extends: KOI8-B
Based on: KOI-8
Other related encoding(s): KOI8-U, KOI8-RU

In Microsoft Windows, KOI8-R is assigned the code page number 20866. In IBM, KOI8-R is assigned code page 878.[1][2]

KOI8 stands for Kod Obmena Informatsiey, 8 bit (Russian: Код Обмена Информацией, 8 бит) which means "Code for Information Exchange, 8 bit".

The KOI8 character sets place the Russian Cyrillic letters in pseudo-Roman order rather than in the normal Cyrillic alphabetical order used by ISO 8859-5 and Unicode. Although this may seem unnatural, it means that if the 8th bit is stripped, the text remains partially readable in ASCII and may convert to syntactically correct KOI7. For instance, "Русский Текст" in KOI8-R becomes "rUSSKIJ tEKST" ("Russian Text") if the 8th bit is stripped; interpreting the ASCII string "rUSSKIJ tEKST" as KOI7 yields "РУССКИЙ ТЕКСТ". KOI8 was based on Russian Morse code, which was derived from Latin Morse code by sound similarity and relates to the Latin Morse codes for A-Z in the same way that KOI8 relates to ASCII.
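The bit-stripping behaviour described above can be reproduced in a few lines. This is a minimal sketch that assumes Python's standard `koi8_r` codec; the function name `strip_high_bit` is illustrative and not part of any standard.

```python
def strip_high_bit(text: str) -> str:
    """Encode text as KOI8-R, clear the 8th bit of every byte, and read the result as ASCII."""
    koi8_bytes = text.encode("koi8_r")
    # Masking with 0x7F clears the high bit and leaves the low 7 bits untouched.
    stripped = bytes(b & 0x7F for b in koi8_bytes)
    return stripped.decode("ascii")


print(strip_high_bit("Русский Текст"))  # rUSSKIJ tEKST
```

The case appears inverted because lowercase Cyrillic letters occupy 0xC0-0xDF, which map onto the uppercase Latin range 0x40-0x5F once the high bit is cleared, while uppercase Cyrillic letters (0xE0-0xFF) land on lowercase Latin letters.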

Character set

The following table shows the KOI8-R encoding. Each character is shown with its equivalent Unicode code point.

KOI8-R[3][4][5][6]

         _0    _1    _2    _3    _4    _5    _6    _7    _8    _9    _A    _B    _C    _D    _E    _F

0_   0   (ASCII control characters, not shown)

1_  16   (ASCII control characters, not shown)

2_  32   SP    !     "     #     $     %     &     '     (     )     *     +     ,     -     .     /
         0020  0021  0022  0023  0024  0025  0026  0027  0028  0029  002A  002B  002C  002D  002E  002F

3_  48   0     1     2     3     4     5     6     7     8     9     :     ;     <     =     >     ?
         0030  0031  0032  0033  0034  0035  0036  0037  0038  0039  003A  003B  003C  003D  003E  003F

4_  64   @     A     B     C     D     E     F     G     H     I     J     K     L     M     N     O
         0040  0041  0042  0043  0044  0045  0046  0047  0048  0049  004A  004B  004C  004D  004E  004F

5_  80   P     Q     R     S     T     U     V     W     X     Y     Z     [     \     ]     ^     _
         0050  0051  0052  0053  0054  0055  0056  0057  0058  0059  005A  005B  005C  005D  005E  005F

6_  96   `     a     b     c     d     e     f     g     h     i     j     k     l     m     n     o
         0060  0061  0062  0063  0064  0065  0066  0067  0068  0069  006A  006B  006C  006D  006E  006F

7_ 112   p     q     r     s     t     u     v     w     x     y     z     {     |     }     ~
         0070  0071  0072  0073  0074  0075  0076  0077  0078  0079  007A  007B  007C  007D  007E

8_ 128   ─     │     ┌     ┐     └     ┘     ├     ┤     ┬     ┴     ┼     ▀     ▄     █     ▌     ▐
         2500  2502  250C  2510  2514  2518  251C  2524  252C  2534  253C  2580  2584  2588  258C  2590

9_ 144   ░     ▒     ▓     ⌠     ■     ∙     √     ≈     ≤     ≥     NBSP  ⌡     °     ²     ·     ÷
         2591  2592  2593  2320  25A0  2219  221A  2248  2264  2265  00A0  2321  00B0  00B2  00B7  00F7

A_ 160   ═     ║     ╒     ё     ╓     ╔     ╕     ╖     ╗     ╘     ╙     ╚     ╛     ╜     ╝     ╞
         2550  2551  2552  0451  2553  2554  2555  2556  2557  2558  2559  255A  255B  255C  255D  255E

B_ 176   ╟     ╠     ╡     Ё     ╢     ╣     ╤     ╥     ╦     ╧     ╨     ╩     ╪     ╫     ╬     ©
         255F  2560  2561  0401  2562  2563  2564  2565  2566  2567  2568  2569  256A  256B  256C  00A9

C_ 192   ю     а     б     ц     д     е     ф     г     х     и     й     к     л     м     н     о
         044E  0430  0431  0446  0434  0435  0444  0433  0445  0438  0439  043A  043B  043C  043D  043E

D_ 208   п     я     р     с     т     у     ж     в     ь     ы     з     ш     э     щ     ч     ъ
         043F  044F  0440  0441  0442  0443  0436  0432  044C  044B  0437  0448  044D  0449  0447  044A

E_ 224   Ю     А     Б     Ц     Д     Е     Ф     Г     Х     И     Й     К     Л     М     Н     О
         042E  0410  0411  0426  0414  0415  0424  0413  0425  0418  0419  041A  041B  041C  041D  041E

F_ 240   П     Я     Р     С     Т     У     Ж     В     Ь     Ы     З     Ш     Э     Щ     Ч     Ъ
         041F  042F  0420  0421  0422  0423  0416  0412  042C  042B  0417  0428  042D  0429  0427  042A
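Individual rows of the table can be cross-checked against a codec implementation. The sketch below assumes Python's standard `koi8_r` codec matches the table; `koi8r_row` is a hypothetical helper name chosen for illustration.

```python
def koi8r_row(high_nibble: int) -> list[tuple[int, str, str]]:
    """Return (byte value, character, 'U+XXXX') for the 16 bytes of one table row."""
    row = []
    for low_nibble in range(16):
        byte = (high_nibble << 4) | low_nibble
        # Decode the single byte as KOI8-R to obtain its Unicode character.
        char = bytes([byte]).decode("koi8_r")
        row.append((byte, char, f"U+{ord(char):04X}"))
    return row


# Print row C_ (bytes 0xC0-0xCF), the first row of lowercase Cyrillic letters.
for byte, char, code_point in koi8r_row(0xC):
    print(f"0x{byte:02X}  {char}  {code_point}")
```

Calling `koi8r_row(0xE)` lists the matching uppercase letters in the same pseudo-Roman order.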


References

  1. "SBCS code page information - CPGID: 00878 / Name: Russian internet koi8-r". IBM Software: Globalization: Coded character sets and related resources: Code pages by CPGID: Code page identifiers. IBM. C-H 3-3220-050. Archived from the original on 2017-02-18. Retrieved 2017-02-18.
  2. "CCSID information document; CCSID 878; KOI8-R CYRILLIC". IBM. Retrieved 2017-02-18.
  3. Richter, Helmut (2016-01-04) [1999-08-18]. "KOI8-R.TXT". 2.0. Retrieved 2016-12-09.
  4. "Code Page CPGID 00878" (PDF). IBM.
  5. "Code Page CPGID 00878" (TXT). IBM.
  6. "ibm-878_P100-1996.ucm". International Components for Unicode (ICU). 2002-12-03.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.