CJK Unified Ideographs Extension B

Range: U+20000..U+2A6DF (42,720 code points)
Plane: SIP
Scripts: Han
Assigned: 42,711 code points
Unused: 9 reserved code points
Unicode version history: 3.1: 42,711 (+42,711)
Note: [1][2]

CJK Unified Ideographs Extension B is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese.
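The block's range and size can be checked with a few lines of arithmetic. The following is a minimal sketch; the names `EXT_B_FIRST`, `EXT_B_LAST`, `in_extension_b` and `utf16_units` are illustrative and not part of any standard API.

```python
# Block boundaries as stated for CJK Unified Ideographs Extension B.
EXT_B_FIRST = 0x20000
EXT_B_LAST = 0x2A6DF


def in_extension_b(ch: str) -> bool:
    """Return True if the single character ch lies in the Extension B range."""
    return EXT_B_FIRST <= ord(ch) <= EXT_B_LAST


def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to encode ch in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


# Size of the range, counting both endpoints.
block_size = EXT_B_LAST - EXT_B_FIRST + 1

# The plane number is the code point divided by 0x10000; plane 2 is the SIP.
plane = EXT_B_FIRST >> 16

print(block_size, plane)                 # 42720 2
print(in_extension_b("\U0002017B"))      # True
print(utf16_units("\U0002017B"))         # 2 (a surrogate pair)
```

Because every code point in the block lies outside the Basic Multilingual Plane, each character occupies two UTF-16 code units, which matters for software that indexes strings by code unit.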

The block has dozens of variation sequences defined for standardized variants.[3]

It also has thousands of ideographic variation sequences registered in the Unicode Ideographic Variation Database (IVD).[4][5] These sequences specify the desired glyph variant for a given Unicode character.
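An ideographic variation sequence consists of a base ideograph followed by one variation selector from the range U+E0100..U+E01EF (VARIATION SELECTOR-17 to VARIATION SELECTOR-256). The sketch below only shows how such a sequence is composed as text; the helper name `make_ivs` is illustrative, and the code does not check whether a given sequence is actually registered in the IVD.

```python
# Variation selectors used for ideographic variation sequences.
IVS_FIRST = 0xE0100  # VARIATION SELECTOR-17
IVS_LAST = 0xE01EF   # VARIATION SELECTOR-256


def make_ivs(base: str, selector_number: int) -> str:
    """Append VARIATION SELECTOR-<selector_number> (17..256) to base.

    This builds the code point sequence only; whether the sequence is
    registered in the IVD must be looked up separately.
    """
    if len(base) != 1:
        raise ValueError("base must be a single character")
    if not 17 <= selector_number <= 256:
        raise ValueError("selector_number must be between 17 and 256")
    return base + chr(IVS_FIRST + (selector_number - 17))


# Hypothetical pairing for illustration: U+20000 followed by VS17.
sequence = make_ivs("\U00020000", 17)
print([f"U+{ord(c):04X}" for c in sequence])  # ['U+20000', 'U+E0100']
```

A renderer that does not support the sequence is expected to ignore the selector and display the base character's default glyph.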

It is the only CJK Unified Ideographs Extension block with a UCS2003 source identifier. Because Extension B contained so many characters, the original code charts were produced with a single glyph per character for all regions. The glyphs were designed by Beijing Zhongyi Electronic Ltd. After the introduction of multi-column code charts, the original glyphs were retained under the UCS2003 source identifier. The glyphs are packaged in the "SimSun-ExtB" font distributed with the Simplified Chinese versions of Windows, and do not adhere to the glyph conventions of the Mainland China region.

Known issues

Three incorrectly unified characters in Extension B

Some characters in CJK Unified Ideographs Extension B were incorrectly unified with others: U+2017B (𠅻), U+204AF (𠒯) and U+24CB2 (𤲲). In the first two, the Chinese Mainland and Vietnamese source glyphs were wrongly unified; in the last, the Chinese Mainland and Taiwanese source glyphs were wrongly unified.[6]

Unifiable variants and exact duplicates in Extension B

Extension B also encodes hundreds of glyph variants.[7] In addition to these deliberately encoded close variants, six exact duplicates (where the same character was inadvertently encoded twice) and two semi-duplicates (where the CJK-B character represents a de facto disunification of two glyph forms unified in the corresponding BMP character) were encoded by mistake:[8]

  • U+34A8 㒨 = U+20457 𠑗 : U+20457 is the same as the China-source glyph for U+34A8, but it is significantly different from the Taiwan-source glyph for U+34A8
  • U+3DB7 㶷 = U+2420E 𤈎 : same glyph shapes
  • U+8641 虁 = U+27144 𧅄 : U+27144 is the same as the Korean-source glyph for U+8641, but it is significantly different from the Chinese Mainland-, Taiwan- and Japan-source glyphs for U+8641
  • U+204F2 𠓲 = U+23515 𣔕 : same glyph shapes, but ordered under different radicals
  • U+249BC 𤦼 = U+249E9 𤧩 : same glyph shapes
  • U+24BD2 𤯒 = U+2A415 𪐕 : same glyph shapes, but ordered under different radicals
  • U+26842 𦡂 = U+26866 𦡦 : same glyph shapes
  • U+FA23 﨣 = U+27EAF 𧺯 : same glyph shapes (U+FA23 﨣 is a unified CJK ideograph, despite its name "CJK COMPATIBILITY IDEOGRAPH-FA23")
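Software that compares or searches text may want to treat the pairs listed above as equivalent. The sketch below folds the second code point of each pair onto the first using a translation table. This is an application-level convention assumed for illustration, not a Unicode normalization form; the names `DUPLICATE_PAIRS` and `fold_duplicates` are hypothetical, and for pairs that lie entirely within Extension B the choice of which code point to keep is arbitrary.

```python
# (kept, folded) pairs taken from the list above.
DUPLICATE_PAIRS = [
    (0x34A8, 0x20457),   # semi-duplicate: matches the China-source glyph only
    (0x3DB7, 0x2420E),
    (0x8641, 0x27144),   # semi-duplicate: matches the Korean-source glyph only
    (0x204F2, 0x23515),
    (0x249BC, 0x249E9),
    (0x24BD2, 0x2A415),
    (0x26842, 0x26866),
    (0xFA23, 0x27EAF),
]

# str.translate accepts a dict mapping code points to code points.
FOLD_TABLE = {folded: kept for kept, folded in DUPLICATE_PAIRS}


def fold_duplicates(text: str) -> str:
    """Replace each listed duplicate with its counterpart; leave other text alone."""
    return text.translate(FOLD_TABLE)


print(fold_duplicates("\U0002420E") == "\u3DB7")  # True
```

Folding the two semi-duplicates discards a glyph distinction that exists in some regional sources, so an application that cares about that distinction would leave those two pairs out of the table.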

History

The following Unicode-related documents record the purpose and process of defining specific characters in the CJK Unified Ideographs Extension B block:

Version: 3.1
Final code points[a]: U+20000..2A6D6
Count: 42,711

L2 ID | WG2 ID | IRG ID | Document
L2/99-239 | | | Addition of three hundred and fourteen KANJIs (from JIS X0213), 1999-07-15
L2/99-310 | | | Addition of three hundred and thirteen KANJIs (from JIS X0213), 1999-08-23
L2/99-335 | N2109 | N674 | Zhoucai, Zhang (1999-09-03), SuperCJK, version 9.0 with Kangxi and HYD data
L2/99-336 | N2105 | N675 | CJK Unified Ideographs Extension B WD 6.0, 1999-09-03
L2/99-316 | | | Whistler, Ken (1999-09-13), Comments on JCS proposal
L2/99-312 | | | excerpt of usages and sources of proposed KANJIs in contemporary Japanese, 1999-10-06
L2/99-366 | | | Suignard, Michel (1999-11-24), Text for CD ballot of ISO/IEC 10646 part 2
L2/99-366.1 | | | Cover page for N3393, 1999-11-24
L2/99-366.2 | | | Suignard, Michel (1999-11-24), Text of CD 10646-2
L2/99-366.3 | | | Suignard, Michel (1999-11-24), CJK Ext. B pages 001-100
L2/99-366.4 | | | Suignard, Michel (1999-11-24), CJK Ext. B pages 101-200
L2/99-366.5 | | | Suignard, Michel (1999-11-24), CJK Ext. B pages 201-300
L2/99-366.6 | | | Suignard, Michel (1999-11-24), CJK Ext. B pages 301-335
L2/99-366.7 | | | Suignard, Michel (1999-11-24), Special Purpose Plane and Annexes
L2/99-366.8 | | | Suignard, Michel (1999-11-24), Mapping of CJK Ext. B characters
L2/99-385 | N2144 | N713R | Jenkins, John (1999-12-08), Clarification of the Non-Cognate Rule
L2/00-021R | | | ISO CD 10646 Part-2 vote -- A proposal to move JIS X 0213 Kanji characters on Extension-B into BMP, 2000-01-21
L2/00-030 | | | Enomoto, Yoshi (2000-01-31), Background of the proposal (for encoding of 302 ideographs from JIS X 0213)
L2/00-036 | | | Umamaheswaran, V. S.; Sargent, Murray (2000-02-03), Expert contribution on the placement of additional unified ideographs from JIS X0213, HK, and Korea
L2/01-026 | N2298 | N758 | CJK Unified Ideographs Extension B, PreDIS R1 For ISO/IEC DIS 10646-2:2000, 2000-11-21
L2/01-027 | N2299 | N759 | Zhoucai, Zhang (2000-11-21), SuperCJK 11.1, A Super Set of Unified CJK Ideographs and Its Extension A & B
L2/01-136 | N2334 | | Sato, T. K. (2001-03-28), Notification of an error and request for a correction regarding mapping information for a particular JIS X 0213 character in CJK UNIFIED IDEOGRAPHS EXTENSION-B
L2/01-163 | N2347 | N785 | CJK Unified Ideographs Extension B PreIS For ISO/IEC 10646-2:2000, 2001-03-30
L2/01-162 | N2349 | N787 | Zhoucai, Zhang (2001-04-02), Clarification On Versions of CJK Unified Ideographs Extension B As Well As SuperCJK
L2/02-122 | N2427 | | Ksar, Mike (2002-03-18), Proposal to add 1 Hanja code of D P R of Korea into 10646-2:2001
L2/02-156 | N2427 | | Proposal to add 1 Hanja code of D P R of Korea into ISO/IEC 10646-2:2001 [duplicate of L2/02-122], 2002-03-18
L2/02-201 | N2448 | N924 | Error Correction, 2002-05-08
L2/02-416 | N2518 | | Proposal to add 2 hanja codes of D P R of Korea into 10646-2:2001, 2002-11-01
L2/03-017 | | | Late DPRK Comments on SC 2 N 3625, 10646-2: 2001/FPDAM 1, 2002-12-09
L2/03-287 | | | Cook, Richard (2003-08-24), 16 UniHan.txt errors
L2/03-301 | | | Cook, Richard (2003-08-27), 24 more UniHan.txt errors
L2/03-311 | | | West, Andrew (2003-09-17), Unicode 4.0.1 Beta Review, comments from Andrew C. West
L2/03-399 | | | Fok, Anthony (2003-10-13), Unihan reported errors / changes re kHKSCS entries
L2/03-398 | | | Nguyen, D. (2003-10-29), Unihan reported errors / changes re kCowles
L2/03-453 | | | Minutes of the Editorial Group Ad Hoc Discussion, 2003-12-17
L2/04-008 | N2695 | N1026 | China's confirmation on fonts for CJK_B 21E2D and 21E45, 2004-01-05
L2/04-208 | N2774R | N1064 | Proposal to add 6 KP source references to existing CJK Unified Ideographs, 2004-05-25
L2/04-281 | N2830 | | Suignard, Michel (2004-06-23), CJK Ideograph source visual references information
L2/04-417 | | | Cook, Richard (2004-11-18), Extension B font versioning: preliminary work
L2/05-022 | | | Cook, Richard (2005-01-25), Extension B font versioning: follow-up report, part 1 [text]
L2/05-023 | | | Cook, Richard (2005-01-25), Extension B font versioning: follow-up report, part 2 [tables]
L2/07-208 | N3285 | | Proposal to replace 11 KP source references to existing ISO/IEC 10646:2003, 2007-07-18
L2/08-234 | | N1406 | Cook, Richard; Bishop, Thomas; Lunde, Ken (2008-06-06), Han Unification Issues
L2/08-310 | | | Cook, Richard (2008-08-12), Fonts for Extension B and C and IRG
L2/10-215 | | | Lunde, Ken (2010-06-22), "Hanyo-Denshi" IVD Collection (PRI 167) to Adobe-Japan1-6 Mapping Table
L2/11-243 | N4111 | | Sources for Orphaned CJK Ideographs, 2011-06-14
L2/11-254 | | | Constable, Peter (2011-06-20), UTC Liaison Report from WG2
L2/14-260 | | | Suignard, Michel (2014-10-23), CJK chart and source references update

  a. Proposed code points and character names may differ from final code points and names.

References

  1. "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
  2. "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.
  3. "Unicode Character Database: Standardized Variation Sequences". The Unicode Consortium.
  4. "Ideographic Variation Database". Unicode Consortium.
  5. "UTS #37, Unicode Ideographic Variation Database". Unicode Consortium.
  6. Eiso Chan (陈永聪), Comments on four error glyphs on CJK Unified Ideographs Ext B & E.
  7. unifiable glyph variants
  8. Cook, Richard (6 October 2003). "Defect Report on Duplicate Encoded CJK Forms" (PDF). ISO/IEC JTC1/SC2/WG2. Retrieved 2012-03-28.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.