General Punctuation
General Punctuation is a Unicode block containing punctuation, spacing, and formatting characters for use with all scripts and writing systems. Included are the defined-width spaces, joining formats, directional formats, smart quotes, archaic and novel punctuation such as the interrobang, and invisible mathematical operators.
General Punctuation | |
---|---|
Range | U+2000..U+206F (112 code points) |
Plane | BMP |
Scripts | Common (109 char.), Inherited (2 char.) |
Symbol sets | Punctuation, Spaces, Format controls |
Assigned | 111 code points |
Unused | 1 reserved code point, 6 deprecated |
Unicode version history | |
1.0.0 | 67 (+67) |
1.1 | 76 (+9) |
3.0 | 83 (+7) |
3.2 | 95 (+12) |
4.0 | 97 (+2) |
4.1 | 106 (+9) |
5.1 | 107 (+1) |
6.3 | 111 (+4) |
Note: [1][2] |
Additional punctuation characters are in the Supplemental Punctuation block and sprinkled in dozens of other Unicode blocks.
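The figures in the summary table above can be checked against the character database bundled with a programming language. The following is a minimal sketch in Python, assuming an interpreter whose `unicodedata` module is based on Unicode 6.3 or later; the helper name `block_summary` is made up for this illustration.

```python
import unicodedata

BLOCK_START, BLOCK_END = 0x2000, 0x206F  # range given in the summary table


def block_summary():
    """Return (total, assigned, unassigned code points) for the block."""
    code_points = range(BLOCK_START, BLOCK_END + 1)
    # General category "Cn" marks code points with no assigned character.
    unassigned = [cp for cp in code_points
                  if unicodedata.category(chr(cp)) == "Cn"]
    return len(code_points), len(code_points) - len(unassigned), unassigned


total, assigned, unassigned = block_summary()
print(total, assigned, [f"U+{cp:04X}" for cp in unassigned])
```

The six deprecated characters (U+206A..U+206F) count as assigned here, since deprecation does not remove a character from the standard.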
Block
General Punctuation[1][2][3] Official Unicode Consortium code chart (PDF)
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| U+200x | NQ SP | MQ SP | EN SP | EM SP | 3/M SP | 4/M SP | 6/M SP | F SP | P SP | TH SP | H SP | ZW SP | ZW NJ | ZW J | LRM | RLM |
| U+201x | ‐ | NB ‑ | ‒ | – | — | ― | ‖ | ‗ | ‘ | ’ | ‚ | ‛ | “ | ” | „ | ‟ |
| U+202x | † | ‡ | • | ‣ | ․ | ‥ | … | ‧ | L SEP | P SEP | LRE | RLE | PDF | LRO | RLO | NNB SP |
| U+203x | ‰ | ‱ | ′ | ″ | ‴ | ‵ | ‶ | ‷ | ‸ | ‹ | › | ※ | ‼ | ‽ | ‾ | ‿ |
| U+204x | ⁀ | ⁁ | ⁂ | ⁃ | ⁄ | ⁅ | ⁆ | ⁇ | ⁈ | ⁉ | ⁊ | ⁋ | ⁌ | ⁍ | ⁎ | ⁏ |
| U+205x | ⁐ | ⁑ | ⁒ | ⁓ | ⁔ | ⁕ | ⁖ | ⁗ | ⁘ | ⁙ | ⁚ | ⁛ | ⁜ | ⁝ | ⁞ | MM SP |
| U+206x | WJ | ƒ() | × | , | + | | LRI | RLI | FSI | PDI | I SS | A SS | I AFS | A AFS | NA DS | NO DS |
Notes
Several characters in this block are usually not rendered with a directly visible glyph. Ten whitespace characters, U+2002 through U+200B (fixed en or 1⁄2 em, em, 1⁄3 em, 1⁄4 em, 1⁄6 em, figure and punctuation space, variable thin or 1⁄5 em and hair space, fixed zero-width space), and U+205F (math medium or 2⁄9 em space) differ in horizontal width, while U+2000 and U+2001 (en quad and em quad) are effectively aliases of U+2002 and U+2003, respectively. Another two, U+202F (narrow no-break space) and U+2060 (word joiner), are variants of U+2009 or U+2004 and of U+200B, respectively, that prohibit line breaks. Three zero-width characters, U+200B through U+200D (zero-width space, non-joiner and joiner), differ in how they affect ligation and shaping of adjacent letters. Eleven invisible characters, U+200E and U+200F (left-to-right and right-to-left mark), U+202A through U+202E (embeddings, pop and overrides) and U+2066 through U+2069 (isolates), control the directionality of text unless higher-level markup overrides them. The explicit line and paragraph separators are at U+2028 and U+2029.
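Because these characters are invisible, their properties are easiest to inspect programmatically. The sketch below uses Python's standard `unicodedata` module to look at the quad-to-space equivalence, the general categories of U+2002..U+200D, the bidirectional classes of the eleven directional controls, and the line and paragraph separators (U+2028 and U+2029). The helper `kind` and its labels are invented for this example, and the results assume a Python build with Unicode 6.3 or later data.

```python
import unicodedata


def kind(ch):
    """Label a character by its Unicode general category (labels are ad hoc)."""
    labels = {"Zs": "space", "Zl": "line separator",
              "Zp": "paragraph separator", "Cf": "format control"}
    return labels.get(unicodedata.category(ch), "other")


# The quads canonically decompose to the en and em spaces, so NFC
# normalization replaces U+2000/U+2001 with U+2002/U+2003.
normalized = unicodedata.normalize("NFC", "\u2000\u2001")

# U+2002..U+200A are spaces; U+200B..U+200D are format controls.
kinds = {f"U+{cp:04X}": kind(chr(cp)) for cp in range(0x2002, 0x200E)}

# Bidirectional classes of the eleven directional controls.
BIDI_CONTROLS = [0x200E, 0x200F, *range(0x202A, 0x202F), *range(0x2066, 0x206A)]
bidi = {f"U+{cp:04X}": unicodedata.bidirectional(chr(cp))
        for cp in BIDI_CONTROLS}

# Python's str.splitlines() treats U+2028 and U+2029 as line boundaries.
lines = "one\u2028two\u2029three".splitlines()

print(normalized == "\u2002\u2003")
print(kinds)
print(bidi)
print(lines)
```

Note that U+200B, although named a space, is classified as a format control rather than a space separator, which is why whitespace tests such as `str.isspace()` do not match it.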
Emoji
The General Punctuation block contains two emoji: U+203C and U+2049.[3][4]
The block has four standardized variants defined to specify emoji-style (U+FE0F VS16) or text presentation (U+FE0E VS15) for the two emoji, both of which default to a text presentation.[5]
U+ | 203C | 2049 |
base code point | ‼ | ⁉ |
base+VS15 (text) | ‼︎ | ⁉︎ |
base+VS16 (emoji) | ‼️ | ⁉️ |
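The four sequences in the table are simply the base character followed by a variation selector. A minimal Python sketch of how they could be built follows; the function name `variation_sequences` is hypothetical, and whether a given sequence actually renders as text or emoji depends on the font and platform.

```python
VS15 = "\uFE0E"  # VARIATION SELECTOR-15: requests text presentation
VS16 = "\uFE0F"  # VARIATION SELECTOR-16: requests emoji presentation

# The two emoji in the General Punctuation block.
EMOJI_IN_BLOCK = ["\u203C", "\u2049"]


def variation_sequences(base):
    """Return the text-style and emoji-style sequences for a base character."""
    return {"text": base + VS15, "emoji": base + VS16}


for base in EMOJI_IN_BLOCK:
    for style, seq in variation_sequences(base).items():
        # Show each sequence as a list of code points.
        print(style, " ".join(f"U+{ord(c):04X}" for c in seq))
```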
History
The following Unicode-related documents record the purpose and process of defining specific characters in the General Punctuation block:
Version | Final code points | Count | UTC ID | L2 ID | WG2 ID | Document |
---|---|---|---|---|---|---|
1.0.0 | U+2000..202E, 2030..203E, 2040..2044 | 67 | (to be determined) | |||
L2/11-438 | N4182 | Edberg, Peter (2011-12-22), Emoji Variation Sequences (Revision of L2/11-429) | ||||
L2/17-086 | Burge, Jeremy; et al. (2017-03-27), Add ZWJ, VS-16, Keycaps & Tags to Emoji_Component | |||||
L2/17-103 | Moore, Lisa (2017-05-18), "E.1.7 Add ZWJ, VS-16, Keycaps & Tags to Emoji_Component", UTC #151 Minutes | |||||
1.1 | U+203F, 2045..2046 | 3 | (to be determined) | |||
U+206A..206F | 6 | (to be determined) | ||||
UTC/1992-xxx | Freytag, Asmus (1992-05-12), "C. Bidi", Unconfirmed minutes for UTC Meeting #52, May 8, 1992 at Xerox | |||||
L2/01-275 | Davis, Mark (2001-07-16), New Properties (ReservedForCf, Deprecated, Discouraged) | |||||
L2/01-301 | Whistler, Ken (2001-08-01), "Alternate format controls inherited from 10646", Analysis of Character Deprecation in the Unicode Standard | |||||
L2/01-326 | Davis, Mark (2001-08-15), New Properties: Reserved_Cf_Code_Point & Deprecated | |||||
L2/01-295R | Moore, Lisa (2001-11-06), "Motion 88-M13", Minutes from the UTC/L2 meeting #88 | |||||
3.0 | U+202F, 2048..2049 | 3 | L2/97-288 | N1603 | Umamaheswaran, V. S. (1997-10-24), "8.18", Unconfirmed Meeting Minutes, WG 2 Meeting # 33, Heraklion, Crete, Greece, 20 June - 4 July 1997 | |
L2/98-088 | N1711 | The Working Meeting on Mongolian Encoding Attended by Representatives of China and Mongolia, 1998-02-15 | ||||
L2/98-104 | N1734 | Whistler, Ken (1998-03-20), Comments on the Mongolian Encoding Proposal, WG2 N1711 | ||||
L2/98-252 (pdf, txt) | N1833RM (pdf, doc) | Moore, Richard (1998-05-04), Feedback on Ken Whistler's Comments on Mongolian Encoding: N 1734 | ||||
L2/98-251 (pdf, html, txt) | N1808 (pdf, doc) | Reply to "Proposal WG2 N1734" Raised at the Seattle Meeting Regarding "Proposal WG 2 N1711", 1998-07-09 | ||||
L2/98-281R (pdf, html) | Aliprand, Joan (1998-07-31), "Mongolian (IV.A)", Unconfirmed Minutes - UTC #77 & NCITS Subgroup L2 # 174 JOINT MEETING, Redmond, WA -- July 29-31, 1998 | |||||
N1862 | Revision of N1711 - Mongolian, 1998-09-17 | |||||
N1865 | US Position - Mongolian (N1711, N1734 and N1808), 1998-09-18 | |||||
N1918 | Paterson, Bruce (1998-10-28), Text for Combined PDAM registration and consideration ballot - SC2 N 3208 | |||||
L2/99-010 | N1903 (pdf, html, doc) | Umamaheswaran, V. S. (1998-12-30), "8.1.3", Minutes of WG 2 meeting 35, London, U.K.; 1998-09-21--25 | ||||
L2/99-075.1 | N1973 | Irish Comments on SC 2 N 3208, 1999-01-19 | ||||
L2/99-075 | N1972 (pdf, html, doc) | Summary of Voting on SC 2 N 3208, PDAM ballot on WD for ISO/IEC 10646-1/Amd. 29: Mongolian, 1999-02-12 | ||||
N2020 | Paterson, Bruce (1999-04-05), FPDAM 29 Text - Mongolian | |||||
L2/99-113 | Text for FPDAM ballot of ISO/IEC 10646, Amd. 29 - Mongolian, 1999-04-06 | |||||
L2/99-232 | N2003 | Umamaheswaran, V. S. (1999-08-03), "6.1.3 PDAM29 – Mongolian script", Minutes of WG 2 meeting 36, Fukuoka, Japan, 1999-03-09--15 | ||||
L2/99-304 | N2126 | Paterson, Bruce (1999-10-01), Revised Text for FDAM ballot of ISO/IEC 10646-1/FDAM 29, AMENDMENT 29: Mongolian | ||||
L2/99-381 | Final text for ISO/IEC 10646-1, FDAM 29 -- Mongolian, 1999-12-07 | |||||
L2/00-010 | N2103 | Umamaheswaran, V. S. (2000-01-05), "6.4.4", Minutes of WG 2 meeting 37, Copenhagen, Denmark: 1999-09-13--16 | ||||
L2/07-209 | Whistler, Ken (2007-07-05), UTR 14 and U+202F NARROW NO-BREAK SPACE | |||||
L2/11-438 | N4182 | Edberg, Peter (2011-12-22), Emoji Variation Sequences (Revision of L2/11-429) | ||||
L2/15-187 | Moore, Lisa (2015-08-11), "B.14.5", UTC #144 Minutes | |||||
L2/16-258 | N4752R2 | Eck, Greg (2016-09-19), Mongolian Base Forms, Positional Forms, & Variant Forms | ||||
L2/16-259 | N4753 | Eck, Greg; Rileke, Orlog Ou (2016-09-20), WG2 #65 Mongolian Discussion Points | ||||
L2/16-266 | N4763 | Anderson, Deborah; Whistler, Ken; McGowan, Rick; Pournader, Roozbeh; Glass, Andrew; Iancu, Laurențiu; Moore, Lisa (2016-09-26), "1. Mongolian", Comments on Mongolian, Small Khitan, and other WG2 #65 documents | ||||
L2/16-297 | N4769 | Anderson, Deborah (2016-10-27), Mongolian ad hoc report | ||||
U+204A | 1 | L2/98-214 | N1747 | Everson, Michael (1998-05-25), Contraction characters for the UCS | ||
L2/98-281R (pdf, html) | Aliprand, Joan (1998-07-31), "Characters from ISO 5426-2 (IV.C.5-6)", Unconfirmed Minutes - UTC #77 & NCITS Subgroup L2 # 174 JOINT MEETING, Redmond, WA -- July 29-31, 1998 | |||||
L2/98-292R (pdf, html, Figure 1) | "2.6", Comments on proposals to add characters from ISO standards developed by ISO/TC 46/SC 4, 1998-08-19 | |||||
L2/98-292 | N1840 | "2.6", Comments on proposals to add characters from ISO standards developed by ISO/TC 46/SC 4, 1998-08-25 | ||||
L2/98-301 | N1847 | Everson, Michael (1998-09-12), Responses to NCITS/L2 and Unicode Consortium comments on numerous proposals | ||||
L2/98-372 | N1884R2 (pdf, doc) | Whistler, Ken; et al. (1998-09-22), Additional Characters for the UCS | ||||
L2/98-329 | N1920 | Combined PDAM registration and consideration ballot on WD for ISO/IEC 10646-1/Amd. 30, AMENDMENT 30: Additional Latin and other characters, 1998-10-28 | ||||
L2/99-010 | N1903 (pdf, html, doc) | Umamaheswaran, V. S. (1998-12-30), "8.1.5.1", Minutes of WG 2 meeting 35, London, U.K.; 1998-09-21--25 | ||||
U+204B..204D | 3 | L2/98-215 | N1748 | Everson, Michael (1998-05-25), Additional signature mark characters for the UCS | ||
L2/98-281R (pdf, html) | Aliprand, Joan (1998-07-31), "Signature Marks (IV.C.7)", Unconfirmed Minutes - UTC #77 & NCITS Subgroup L2 # 174 JOINT MEETING, Redmond, WA -- July 29-31, 1998 | |||||
L2/98-292R (pdf, html, Figure 1) | "2.7", Comments on proposals to add characters from ISO standards developed by ISO/TC 46/SC 4, 1998-08-19 | |||||
L2/98-292 | N1840 | "2.7", Comments on proposals to add characters from ISO standards developed by ISO/TC 46/SC 4, 1998-08-25 | ||||
L2/98-301 | N1847 | Everson, Michael (1998-09-12), Responses to NCITS/L2 and Unicode Consortium comments on numerous proposals | ||||
L2/98-372 | N1884R2 (pdf, doc) | Whistler, Ken; et al. (1998-09-22), Additional Characters for the UCS | ||||
L2/98-329 | N1920 | Combined PDAM registration and consideration ballot on WD for ISO/IEC 10646-1/Amd. 30, AMENDMENT 30: Additional Latin and other characters, 1998-10-28 | ||||
L2/99-010 | N1903 (pdf, html, doc) | Umamaheswaran, V. S. (1998-12-30), "8.1.5.1", Minutes of WG 2 meeting 35, London, U.K.; 1998-09-21--25 | ||||
3.2 | U+2047, 2051 | 2 | L2/99-238 | Consolidated document containing 6 Japanese proposals, 1999-07-15 | ||
N2092 | Addition of forty eight characters, 1999-09-13 | |||||
L2/99-365 | Moore, Lisa (1999-11-23), Comments on JCS Proposals | |||||
L2/00-024 | Shibano, Kohji (2000-01-31), JCS proposal revised | |||||
L2/99-260R | Moore, Lisa (2000-02-07), "JCS Proposals", Minutes of the UTC/L2 meeting in Mission Viejo, October 26-28, 1999 | |||||
L2/00-098, L2/00-098-page5 | N2195 | Rationale for non-Kanji characters proposed by JCS committee, 2000-03-15 | ||||
L2/00-119 | N2191R | Whistler, Ken; Freytag, Asmus (2000-04-19), Encoding Additional Mathematical Symbols in Unicode | ||||
L2/00-234 | N2203 (rtf, txt) | Umamaheswaran, V. S. (2000-07-21), "8.18, 8.20", Minutes from the SC2/WG2 meeting in Beijing, 2000-03-21 -- 24 | ||||
L2/00-115R2 | Moore, Lisa (2000-08-08), "Motion 83-M11", Minutes Of UTC Meeting #83 | |||||
L2/00-297 | N2257 | Sato, T. K. (2000-09-04), JIS X 0213 symbols part-1 | ||||
L2/00-342 | N2278 | Sato, T. K.; Everson, Michael; Whistler, Ken; Freytag, Asmus (2000-09-20), Ad hoc Report on Japan feedback N2257 and N2258 | ||||
L2/01-050 | N2253 | Umamaheswaran, V. S. (2001-01-21), "7.16 JIS X0213 Symbols", Minutes of the SC2/WG2 meeting in Athens, September 2000 | ||||
U+204E..2050, 2057, 205F, 2061..2062 | 7 | L2/00-005R2 | Moore, Lisa (2000-02-14), "Motion 82-M11", Minutes of UTC #82 in San Jose | |||
L2/00-119 | N2191R | Whistler, Ken; Freytag, Asmus (2000-04-19), Encoding Additional Mathematical Symbols in Unicode | ||||
L2/00-234 | N2203 (rtf, txt) | Umamaheswaran, V. S. (2000-07-21), "8.18", Minutes from the SC2/WG2 meeting in Beijing, 2000-03-21 -- 24 | ||||
L2/00-115R2 | Moore, Lisa (2000-08-08), "Motion 83-M11", Minutes Of UTC Meeting #83 | |||||
U+2052, 2063 | 2 | L2/01-142 | N2336 | Beeton, Barbara; Freytag, Asmus; Ion, Patrick (2001-04-02), Additional Mathematical Symbols | ||
L2/01-156 | N2356 | Freytag, Asmus (2001-04-03), Additional Mathematical Characters (Draft 10) | ||||
L2/01-344 | N2353 (pdf, doc) | Umamaheswaran, V. S. (2001-09-09), "7.7 Mathematical Symbols", Minutes from SC2/WG2 meeting #40 -- Mountain View, April 2001 | ||||
U+2060 | 1 | L2/99-260R | Moore, Lisa (2000-02-07), "Unicode in Markup Languages", Minutes of the UTC/L2 meeting in Mission Viejo, October 26-28, 1999 | |||
L2/00-005R2 | Moore, Lisa (2000-02-14), "Zero Width Grapheme Break/Join", Minutes of UTC #82 in San Jose, Action Item for Arnold Winkler: As the zero width grapheme break/join proposal was withdrawn, re-open Action Item 81-12 (for Mark Davis to prepare a proposal for WG2 for the Zero Width Word Joiner.) | |||||
L2/00-258 | N2235 | Davis, Mark (2000-08-09), Proposal for addition of ZERO WIDTH WORD JOINER | ||||
L2/00-369 | Whistler, Ken (2000-10-06), "e. (ZERO WIDTH) WORD JOINER", WG2 in Vouliagmeni (Athens) | |||||
L2/01-050 | N2253 | Umamaheswaran, V. S. (2001-01-21), "7.7 Proposal for addition of ZERO WIDTH WORDJOINER", Minutes of the SC2/WG2 meeting in Athens, September 2000 | ||||
4.0 | U+2053..2054 | 2 | L2/02-141 | N2419 | Everson, Michael; et al. (2002-03-20), Uralic Phonetic Alphabet characters for the UCS | |
L2/02-192 | Everson, Michael (2002-05-02), Everson's Reply on UPA | |||||
N2442 | Everson, Michael; Kolehmainen, Erkki I.; Ruppel, Klaas; Trosterud, Trond (2002-05-21), Justification for placing the Uralic Phonetic Alphabet in the BMP | |||||
L2/02-291 | Whistler, Ken (2002-05-31), WG2 report from Dublin | |||||
L2/02-292 | Whistler, Ken (2002-06-03), Early look at WG2 consent docket | |||||
L2/02-166R2 | Moore, Lisa (2002-08-09), "Scripts and New Characters - UPA", UTC #91 Minutes | |||||
L2/02-253 | Moore, Lisa (2002-10-21), "Consensus 92-C2", UTC #92 Minutes | |||||
4.1 | U+2055 | 1 | L2/03-151R | Constable, Peter; Lloyd-Williams, James; Lloyd-Williams, Sue; Chowdhury, Shamsul Islam; Ali, Asaddar; Sadique, Mohammed; Chowdhury, Matiar Rahman (2003-05-10), Revised Proposal for Encoding Syloti Nagri Script in the BMP | ||
L2/03-136 | Moore, Lisa (2003-08-18), "Scripts and New Characters - Syloti Nagri Script", UTC #95 Minutes | |||||
U+2056, 2058..2059 | 3 | L2/03-282R | N2610R | Everson, Michael; Cleminson, Ralph (2003-09-04), Final proposal for encoding the Glagolitic script in the UCS | ||
L2/03-324 | N2642 | Pantelia, Maria (2003-10-06), Proposal to encode additional Greek editorial and punctuation characters in the UCS | ||||
U+205A..205C | 3 | L2/03-157 | Pantelia, Maria (2003-05-19), Additional Beta Code Characters not in Unicode (WIP) | |||
L2/03-193R | N2612-7 | Pantelia, Maria (2003-06-11), Proposal to encode additional Punctuation Characters in the UCS | ||||
U+205D | 1 | L2/02-312R | Pantelia, Maria (2002-11-07), Proposal to encode additional Greek editorial and punctuation characters in the UCS | |||
L2/03-324 | N2642 | Pantelia, Maria (2003-10-06), Proposal to encode additional Greek editorial and punctuation characters in the UCS | ||||
U+205E | 1 | L2/03-354 | N2655 | Freytag, Asmus (2003-10-10), Proposal -- Symbols used in Dictionaries | ||
L2/03-356R2 | Moore, Lisa (2003-10-22), "Consensus 97-C15", UTC #97 Minutes | |||||
5.1 | U+2064 | 1 | L2/07-011R | N3198R | Freytag, Asmus; Beeton, Barbara; Ion, Patrick; Sargent, Murray; Carlisle, David; Pournader, Roozbeh (2007-01-15), 29 Additional Mathematical and Symbol Characters | |
L2/07-015 | Moore, Lisa (2007-02-08), "Mathematical Characters and Symbols (C.4)", UTC #110 Minutes | |||||
L2/07-268 | N3253 (pdf, doc) | Umamaheswaran, V. S. (2007-07-26), "M50.16", Unconfirmed minutes of WG 2 meeting 50, Frankfurt-am-Main, Germany; 2007-04-24/27 | ||||
6.3 | U+2066..2069 | 4 | L2/12-186R | Lanin, Aharon; Davis, Mark; Pournader, Roozbeh (2012-07-24), A Proposal for Bidi Isolates in Unicode | ||
L2/12-290 | N4310 | Lanin, Aharon; Davis, Mark; Pournader, Roozbeh (2012-07-31), Proposal for Four Characters for Bidi | ||||
L2/12-239 | Moore, Lisa (2012-08-14), "Consensus 132-C12", UTC #132 Minutes | |||||
L2/13-040 | Pournader, Roozbeh; Lanin, Aharon (2013-01-29), Fasttracking Arabic Letter Mark (ALM) | |||||
L2/13-125 | N4447 | Constable, Peter (2013-06-10), Unicode Liaison Report to WG2 | ||||
References
- "Unicode character database". The Unicode Standard. Retrieved 2016-07-09.
- "Enumerated Versions of The Unicode Standard". The Unicode Standard. Retrieved 2016-07-09.
- "UTR #51: Unicode Emoji". Unicode Consortium. 2020-02-11.
- "UCD: Emoji Data for UTR #51". Unicode Consortium. 2020-01-28.
- "UTS #51 Emoji Variation Sequences". The Unicode Consortium.