CJK Unified Ideographs Extension B
CJK Unified Ideographs Extension B is a Unicode block containing rare and historic CJK ideographs for Chinese, Japanese, Korean, and Vietnamese.
- List of CJK Unified Ideographs Extension B (Part 1 of 7). Range: U+20000–U+215FF.
- List of CJK Unified Ideographs Extension B (Part 2 of 7). Range: U+21600–U+230FF.
- List of CJK Unified Ideographs Extension B (Part 3 of 7). Range: U+23100–U+245FF.
- List of CJK Unified Ideographs Extension B (Part 4 of 7). Range: U+24600–U+260FF.
- List of CJK Unified Ideographs Extension B (Part 5 of 7). Range: U+26100–U+275FF.
- List of CJK Unified Ideographs Extension B (Part 6 of 7). Range: U+27600–U+290FF.
- List of CJK Unified Ideographs Extension B (Part 7 of 7). Range: U+29100–U+2A6DF.
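As an illustration of how the seven parts partition the block, the following Python sketch maps a character to the part that lists it. The function name `extension_b_part` is hypothetical, and the boundaries are simply the ranges given in the list above.

```python
from bisect import bisect_right
from typing import Optional

# First code point of each of the seven list parts (from the ranges above).
PART_STARTS = [0x20000, 0x21600, 0x23100, 0x24600, 0x26100, 0x27600, 0x29100]
BLOCK_END = 0x2A6DF  # last code point of the block (end of Part 7)


def extension_b_part(char: str) -> Optional[int]:
    """Return the list part (1-7) covering `char`, or None if it is outside the block."""
    cp = ord(char)
    if not PART_STARTS[0] <= cp <= BLOCK_END:
        return None
    # bisect_right counts how many part starts are <= cp, which is the part number.
    return bisect_right(PART_STARTS, cp)


print(extension_b_part("\U00020457"))  # 1
print(extension_b_part("\U0002A415"))  # 7
print(extension_b_part("\u4e00"))      # None (BMP ideograph, not in Extension B)
```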
CJK Unified Ideographs Extension B | |
---|---|
Range | U+20000..U+2A6DF (42,720 code points) |
Plane | SIP |
Scripts | Han |
Assigned | 42,718 code points |
Unused | 2 reserved code points |
Unicode version history | |
3.1 | 42,711 (+42,711) |
13.0 | 42,718 (+7) |
Note: [1][2] |
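The counts in the table above follow from the code point ranges. A minimal Python sketch of the arithmetic, using the per-version ranges listed in the History section (U+20000..2A6D6 for Unicode 3.1 and U+2A6D7..2A6DD for Unicode 13.0); the variable names are illustrative only:

```python
FIRST, LAST = 0x20000, 0x2A6DF            # block range

block_size = LAST - FIRST + 1             # every code point in the block
assigned_3_1 = 0x2A6D6 - FIRST + 1        # U+20000..2A6D6, added in Unicode 3.1
added_13_0 = 0x2A6DD - 0x2A6D7 + 1        # U+2A6D7..2A6DD, added in Unicode 13.0
assigned = assigned_3_1 + added_13_0
reserved = block_size - assigned

plane = FIRST >> 16                       # plane number; 2 is the SIP

print(block_size, assigned_3_1, added_13_0, assigned, reserved, plane)
# 42720 42711 7 42718 2 2
```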
The block has dozens of variation sequences defined for standardized variants.[3]
It also has thousands of ideographic variation sequences registered in the Unicode Ideographic Variation Database (IVD).[4][5] These sequences specify the desired glyph variant for a given Unicode character.
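A variation sequence is a base character followed by a single variation selector. The sketch below only checks the shape of such a sequence for an Extension B base character, assuming the usual selector ranges: VS1 to VS16 (U+FE00–U+FE0F) for standardized variation sequences and VS17 to VS256 (U+E0100–U+E01EF) for ideographic variation sequences. It does not check whether a sequence is actually defined or registered, which requires the data files cited above; the function name `classify_sequence` is hypothetical.

```python
from typing import Optional

EXT_B = range(0x20000, 0x2A6E0)             # U+20000..U+2A6DF
STANDARDIZED_VS = range(0xFE00, 0xFE10)     # VS1-VS16
IDEOGRAPHIC_VS = range(0xE0100, 0xE01F0)    # VS17-VS256


def classify_sequence(text: str) -> Optional[str]:
    """Classify `text` as an Extension B base plus one variation selector.

    Returns "standardized" or "ideographic" according to the selector range,
    or None if `text` does not have that shape. Registration is NOT checked.
    """
    if len(text) != 2 or ord(text[0]) not in EXT_B:
        return None
    selector = ord(text[1])
    if selector in STANDARDIZED_VS:
        return "standardized"
    if selector in IDEOGRAPHIC_VS:
        return "ideographic"
    return None


# The selector here is chosen for illustration; the sequence may not be registered.
print(classify_sequence("\U00020000\U000E0100"))  # ideographic
```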
It is the only CJK Unified Ideographs Extension block with a UCS2003 source identifier. Because Extension B contained so many characters, the original code charts were produced with a single glyph per character rather than a separate glyph for each region. The glyphs were designed by Beijing Zhongyi Electronic Ltd. After the introduction of multi-column code charts, the original glyphs were retained under the UCS2003 source identifier. The glyphs are packaged in the "SimSun-ExtB" font distributed with the Simplified Chinese versions of Windows, and do not follow the glyph forms used for the Mainland China region.
Known issues
Three incorrectly unified characters in Extension B
In CJK Unified Ideographs Extension B, some characters are incorrectly unified with others. These include U+2017B (𠅻), U+204AF (𠒯) and U+24CB2 (𤲲). For the first two, the Chinese Mainland and Vietnamese source glyphs were wrongly unified; for the last, the Chinese Mainland and Taiwanese source glyphs were wrongly unified.[6]
Unifiable variants and exact duplicates in Extension B
Extension B also encodes hundreds of glyph variants that could have been unified.[7] In addition to these deliberately encoded close variants, six exact duplicates (where the same character has inadvertently been encoded twice) and two semi-duplicates (where the CJK-B character represents a de facto disunification of two glyph forms unified in the corresponding BMP character) were encoded by mistake:[8]
- U+34A8 㒨 = U+20457 𠑗 : U+20457 is the same as the China-source glyph for U+34A8, but it is significantly different from the Taiwan-source glyph for U+34A8
- U+3DB7 㶷 = U+2420E 𤈎 : same glyph shapes
- U+8641 虁 = U+27144 𧅄 : U+27144 is the same as the Korean-source glyph for U+8641, but it is significantly different from the Chinese Mainland-, Taiwan- and Japan-source glyphs for U+8641
- U+204F2 𠓲 = U+23515 𣔕 : same glyph shapes, but ordered under different radicals
- U+249BC 𤦼 = U+249E9 𤧩 : same glyph shapes
- U+24BD2 𤯒 = U+2A415 𪐕 : same glyph shapes, but ordered under different radicals
- U+26842 𦡂 = U+26866 𦡦 : same glyph shapes
- U+FA23 﨣 = U+27EAF 𧺯 : same glyph shapes (U+FA23 﨣 is a unified CJK ideograph, despite its name "CJK COMPATIBILITY IDEOGRAPH-FA23")
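Because the duplicates are separate code points with no decomposition mappings, Unicode normalization does not merge them; software that wants to treat each pair as one character has to fold them explicitly. The following Python sketch builds such a table from the list above. Folding toward the left-hand code point of each pair is an arbitrary choice made for illustration, the names `DUPLICATE_FOLD` and `fold_duplicates` are hypothetical, and for the two semi-duplicates (U+20457 and U+27144) folding discards a glyph distinction.

```python
import unicodedata

# Right-hand code point of each pair above -> left-hand code point.
# The folding direction is an assumption made for this example.
DUPLICATE_FOLD = {
    0x20457: 0x34A8,   # semi-duplicate
    0x2420E: 0x3DB7,
    0x27144: 0x8641,   # semi-duplicate
    0x23515: 0x204F2,
    0x249E9: 0x249BC,
    0x2A415: 0x24BD2,
    0x26866: 0x26842,
    0x27EAF: 0xFA23,
}


def fold_duplicates(text: str) -> str:
    """Replace each listed duplicate with its counterpart; other characters pass through."""
    return text.translate(DUPLICATE_FOLD)


# Normalization alone leaves the Extension B duplicate unchanged...
print(unicodedata.normalize("NFKC", "\U0002420E") == "\U0002420E")  # True
# ...whereas the explicit table folds it onto U+3DB7.
print(fold_duplicates("\U0002420E") == "\u3DB7")                    # True
# U+FA23 has no decomposition, consistent with its being a unified ideograph.
print(unicodedata.decomposition("\uFA23") == "")                    # True
```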
History
The following Unicode-related documents record the purpose and process of defining specific characters in the CJK Unified Ideographs Extension B block:
| Version | Final code points | Count | L2 ID | WG2 ID | IRG ID | Document |
|---|---|---|---|---|---|---|
| 3.1 | U+20000..2A6D6 | 42,711 | L2/98-260 | | | Ng, Nelson; Kung, Michael (1998-05-26), "CJK UNIFIED IDEOGRAPHS EXTENSION B", Report on IRG meeting #11 |
| | | | L2/99-239 | | | Addition of three hundred and fourteen KANJIs (from JIS X0213), 1999-07-15 |
| | | | L2/99-310 | | | Addition of three hundred and thirteen KANJIs (from JIS X0213), 1999-08-23 |
| | | | L2/99-335 | N2109 | N674 | Zhang, Zhoucai (1999-09-03), SuperCJK, version 9.0 with Kangxi and HYD data |
| | | | L2/99-336 | N2105 | N675 | CJK Unified Ideographs Extension B WD 6.0, 1999-09-03 |
| | | | L2/99-316 | | | Whistler, Ken (1999-09-13), Comments on JCS proposal |
| | | | L2/99-312 | | | excerpt of usages and sources of proposed KANJIs in contemporary Japanese, 1999-10-06 |
| | | | L2/99-366 | | | Suignard, Michel (1999-11-24), Text for CD ballot of ISO/IEC 10646 part 2 |
| | | | L2/99-366.1 | | | Cover page for N3393, 1999-11-24 |
| | | | L2/99-366.2 | | | Suignard, Michel (1999-11-24), Text of CD 10646-2 |
| | | | L2/99-366.3 | | | Suignard, Michel (1999-11-24), CJK Ext. B pages 001-100 |
| | | | L2/99-366.4 | | | Suignard, Michel (1999-11-24), CJK Ext. B pages 101-200 |
| | | | L2/99-366.5 | | | Suignard, Michel (1999-11-24), CJK Ext. B pages 201-300 |
| | | | L2/99-366.6 | | | Suignard, Michel (1999-11-24), CJK Ext. B pages 301-335 |
| | | | L2/99-366.7 | | | Suignard, Michel (1999-11-24), Special Purpose Plane and Annexes |
| | | | L2/99-366.8 | | | Suignard, Michel (1999-11-24), Mapping of CJK Ext. B characters |
| | | | L2/99-385 | N2144 | N713R | Jenkins, John (1999-12-08), Clarification of the Non-Cognate Rule |
| | | | L2/00-010 | N2103 | | Umamaheswaran, V. S. (2000-01-05), "10.3", Minutes of WG 2 meeting 37, Copenhagen, Denmark: 1999-09-13--16 |
| | | | L2/00-021R (pdf, rtf) | | | ISO CD 10646 Part-2 vote -- A proposal to move JIS X 0213 Kanji characters on Extension-B into BMP, 2000-01-21 |
| | | | L2/00-030 | | | Enomoto, Yoshi (2000-01-31), Background of the proposal (for encoding of 302 ideographs from JIS X 0213) |
| | | | L2/00-036 | | | Umamaheswaran, V. S.; Sargent, Murray (2000-02-03), Expert contribution on the placement of additional unified ideographs from JIS X0213, HK, and Korea |
| | | | L2/01-026 (pdf, doc) | N2298 | N758 | CJK Unified Ideographs Extension B, PreDIS R1 For ISO/IEC DIS 10646-2:2000, 2000-11-21 |
| | | | L2/01-136 | N2334 (pdf, doc) | | Sato, T. K. (2001-03-28), Notification of an error and request for a correction regarding mapping information for a particular JIS X 0213 character in CJK UNIFIED IDEOGRAPHS EXTENSION-B |
| | | | L2/01-163 | N2347 | N785 | CJK Unified Ideographs Extension B PreIS For ISO/IEC 10646-2:2000, 2001-03-30 |
| | | | L2/01-162 | N2349 (pdf, doc) | N787 | Zhang, Zhoucai (2001-04-02), Clarification On Versions of CJK Unified Ideographs Extension B As Well As SuperCJK |
| | | | L2/02-122 | N2427 | | Ksar, Mike (2002-03-18), Proposal to add 1 Hanja code of D P R of Korea into 10646-2:2001 |
| | | | L2/02-201 | N2448 | N924 | Error Correction, 2002-05-08 |
| | | | L2/02-416 | N2518 | | Proposal to add 2 hanja codes of D P R of Korea into 10646-2:2001, 2002-11-01 |
| | | | L2/03-017 | | | Late DPRK Comments on SC 2 N 3625, 10646-2: 2001/FPDAM 1, 2002-12-09 |
| | | | L2/03-287 | | | Cook, Richard (2003-08-24), 16 UniHan.txt errors |
| | | | L2/03-301 | | | Cook, Richard (2003-08-27), 24 more UniHan.txt errors |
| | | | L2/03-311 | | | West, Andrew (2003-09-17), Unicode 4.0.1 Beta Review, comments from Andrew C. West |
| | | | L2/03-399 | | | Fok, Anthony (2003-10-13), Unihan reported errors / changes re kHKSCS entries |
| | | | L2/03-398 | | | Nguyen, D. (2003-10-29), Unihan reported errors / changes re kCowles |
| | | | L2/03-453 | | | Minutes of the Editorial Group Ad Hoc Discussion, 2003-12-17 |
| | | | L2/04-008 | N2695 | N1026 | China's confirmation on fonts for CJK_B 21E2D and 21E45, 2004-01-05 |
| | | | L2/04-208 | N2774R | N1064 | Proposal to add 6 KP source references to existing CJK Unified Ideographs, 2004-05-25 |
| | | | L2/04-281 | N2830 | | Suignard, Michel (2004-06-23), CJK Ideograph source visual references information |
| | | | L2/04-417 | | | Cook, Richard (2004-11-18), Extension B font versioning: preliminary work |
| | | | L2/05-022 | | | Cook, Richard (2005-01-25), Extension B font versioning: follow-up report, part 1 [text] |
| | | | L2/05-023 | | | Cook, Richard (2005-01-25), Extension B font versioning: follow-up report, part 2 [tables] |
| | | | | N3353 (pdf, doc) | | Umamaheswaran, V. S. (2007-10-10), "M51.9", Unconfirmed minutes of WG 2 meeting 51 Hanzhou, China; 2007-04-24/27 |
| | | | L2/07-208 | N3285 | | Proposal to replace 11 KP source references to existing ISO/IEC 10646:2003, 2007-07-18 |
| | | | L2/08-234 | | N1406 | Cook, Richard; Bishop, Thomas; Lunde, Ken (2008-06-06), Han Unification Issues |
| | | | L2/08-310 | | | Cook, Richard (2008-08-12), Fonts for Extension B and C and IRG |
| | | | L2/10-215 | | | Lunde, Ken (2010-06-22), "Hanyo-Denshi" IVD Collection (PRI 167) to Adobe-Japan1-6 Mapping Table |
| | | | | N3903 (pdf, doc) | | "M57.07 (CJK Ext. B glyphs from 2nd edition)", Unconfirmed minutes of WG2 meeting 57, 2011-03-31 |
| | | | L2/11-243 | N4111 | | Sources for Orphaned CJK Ideographs, 2011-06-14 |
| | | | L2/11-254 | | | Constable, Peter (2011-06-20), "Update to UTR #45 U-Source Ideographs requested", UTC Liaison Report from WG2 |
| | | | | N4103 | | "Resolution 58.05", Unconfirmed minutes of WG 2 meeting 58, 2012-01-03 |
| | | | L2/14-260 | N4621 | | Suignard, Michel (2014-10-23), CJK chart and source references update |
| | | | L2/16-052 | N4603 (pdf, doc) | | Umamaheswaran, V. S. (2015-09-01), "M63.05", Unconfirmed minutes of WG 2 meeting 63 |
| | | | L2/17-180 | | N2202 | Chan, Eiso (2017-06-02), Request for consideration to add kIRG_GSource values to thirteen ideographs and change two G-source glyphs for the Table of General Standard Chinese Characters [Affects 20164] |
| | | | L2/17-362 | | | Moore, Lisa (2018-02-02), "Consensus 153-C16", UTC #153 Minutes |
| | | | | N4974 | N2301 | Request of TCA’s Horizontal Extension for Chemical Terminology [Affects U+20BBF, U+20C02, U+20CED, U+26B4C, U+26CBE, U+26E3D, U+28834, U+289A1, U+289C0, U+28A0F, and U+28B46], 2018-06-12 |
| | | | | N4987 | | Proposal on China’s Horizontal Extension for 14 CJK Ideographs [Affects U+37C3, 3FE0, 9FD4, 20164, 24A7D, 25ED7, 2677C, 26C21, 2A917, 2AA30, 2BD77, 2C494, 2C72F, and 2CB38], 2018-06-13 |
| | | | | N4988 | | Proposal on Updating 11 G glyphs of CJK Unified Ideographs to ISO/IEC 10646 [Affects U+3B9D, 3CFD, 4A76, 6FF9, 809E, 891D, 21D4C, 2278B, 23AB8, 2459B, and 2A8FB], 2018-06-13 |
| | | | | | N2336 | Modify the G glyph for U+23517, 2018-09-10 |
| | | | | N5016 | N2349 | Shin, Sanghyun; Cho, Sungduk; Pyo, Seungju; Kim, Kyongsok (2018-12-13), Request to move character K6-1022 in Horizontal Extension of KS X 1027-5 from U+3EAC to U+248F2 |
| | | | | N5020 (pdf, doc) | | Umamaheswaran, V. S. (2019-01-11), "10.4.6, 10.4.8, and 10.4.9", Unconfirmed minutes of WG 2 meeting 67 |
| | | | | | N2369 | Chan, Eiso (2019-05-06), Feedback on IRGN2369 [Affects U+20219 U+21249, U+21827, U+22C3A, U+2327B, U+2363B, U+23839, U+23FD5, U+24261, U+2548E, and U+26C9E] |
| | | | | N5086 | N2379 | Proposal of China’s horizontal extension for technical used characters [Affects U+23496, U+2355E, U+236ED, U+24726, U+26FE1, U+27334, and U+2A38C], 2019-05-10 |
| | | | L2/19-237 | N5068 | | Editorial Report on Miscellaneous Issues (meeting IRG#52) [Affects U+23517, U+248F2, and U+26657], 2019-05-17 |
| | | | L2/19-244 | N5107 | | TCA's UNC Proposal for WG2 submission [Affects U+27C0E], 2019-05-24 |
| | | | L2/19-241 | N5083 | N2391 | Errata report for WG2 submission_TCA [Affects U+26657], 2019-05-31 |
| | | | | N5082 | N2391 | Updated G Font of U+23517, 2019-05-31 |
| 13.0 | U+2A6D7..2A6DD | 7 | L2/17-087 | | | Chan, Eiso; Wang, Xiaolei; Le, Hou; You, Jerry (2017-04-03), Proposal to encode characters for Gongche Notation |
| | | | L2/17-103 | | | Moore, Lisa (2017-05-18), "E.5", UTC #151 Minutes |
| | | | | | N2299 | Chan, Eiso (2018-04-22), Request to discuss how to handle seven unencoded Gongche characters for Kunqu Opera |
| | | | L2/18-245 | N4967 | | Chan, Eiso; You, Jerry; Wang, Xiaolei; Le, Hou (2018-06-01), Updated proposal on Gongche characters for Kunqu Opera |
| | | | L2/18-241 | | | Anderson, Deborah; et al. (2018-07-25), "17", Recommendations to UTC # 156 July 2018 on Script Proposals |
| | | | L2/18-183 | | | Moore, Lisa (2018-11-20), "B.4.1", UTC #156 Minutes |
| | | | | N5020 (pdf, doc) | | Umamaheswaran, V. S. (2019-01-11), "10.2.3", Unconfirmed minutes of WG 2 meeting 67 |
| | | | | N5122 | | "M68.01", Unconfirmed minutes of WG 2 meeting 68, 2019-12-31 |
| | | | L2/19-243 | N5106 | | Suignard, Michel (2019-06-20), "Gongche", Disposition of comments on ISO/IEC CD.2 10646 6th edition |
| | | | L2/19-270 | | | Moore, Lisa (2019-08-02), "Consensus 160-C9", UTC #160 Minutes |
References
- "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
- "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.
- "Unicode Character Database: Standardized Variation Sequences". The Unicode Consortium.
- "Ideographic Variation Database". Unicode Consortium.
- "UTS #37, Unicode Ideographic Variation Database". Unicode Consortium.
- Eiso Chan (陈永聪), Comments on four error glyphs on CJK Unified Ideographs Ext B & E.
- "unifiable glyph variants" (PDF). Archived from the original (PDF) on 2006-05-15. Retrieved 2017-12-01.
- Cook, Richard (6 October 2003). "Defect Report on Duplicate Encoded CJK Forms" (PDF). ISO/IEC JTC1/SC2/WG2. Retrieved 2012-03-28.