Unicode alias names and abbreviations
In Unicode, each character can have a unique name. A character can also have one or more alias names. An alias name can be an abbreviation, a C0 or C1 control name, a correction, an alternate name or a figment. Like a name, an alias is unique across all names and aliases, and therefore identifies exactly one character.
Background
The formal, primary Unicode name is unique among all names, uses only a restricted set of characters in a fixed format, and is guaranteed never to change. It consists of the characters A–Z (uppercase), 0–9, " " (space), and "-" (hyphen). In addition to this name, a character can have one or more formal (normative) alias names. An alias name follows the same rules as a name: it uses only A–Z, 0–9, space, and hyphen, and never characters such as a–z, %, or $. Alias names are also unique within the full name set; that is, no two entries in the combined set of names and alias names are the same. Alias names are formally described in the Unicode Standard.[1][2]
In this sense, an abbreviation is also considered a name.
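The character-set rule described above can be sketched as a simple check. This is a minimal illustration, not the Standard's full naming rules: it tests only the allowed characters (A–Z, 0–9, space, hyphen) and ignores finer constraints such as where spaces and hyphens may appear. The function name uses_name_charset is hypothetical.

```python
import re

# Allowed characters per the rule above: A-Z, 0-9, space, hyphen.
# Assumption: only the character set is checked, not spacing or
# hyphen-placement rules.
NAME_CHARSET = re.compile(r"[A-Z0-9 -]+")


def uses_name_charset(candidate: str) -> bool:
    """Return True if candidate consists only of A-Z, 0-9, space and hyphen."""
    return NAME_CHARSET.fullmatch(candidate) is not None


print(uses_name_charset("NO-BREAK SPACE"))   # True
print(uses_name_charset("NBSP"))             # True: an abbreviation follows the same rule
print(uses_name_charset("<control-0008>"))   # False: a label, not a name
print(uses_name_charset("Backspace"))        # False: lowercase letters
```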
The Unicode Standard also publishes "alias names" that are not formal and are not listed in the normative NameAliases.txt file. These names may not be unique and may use characters that are not allowed in formal names.
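Because formal names and formal aliases share one namespace, a lookup by either resolves to a single character. As one illustration, Python's standard unicodedata module accepts name aliases in lookup() (since Python 3.3), while name() reports only the primary name. The sketch below uses the correction alias for U+01A2; the exact set of aliases available depends on the Unicode version bundled with the interpreter.

```python
import unicodedata

# A primary name and a formal alias resolve to the same character.
primary = unicodedata.lookup("LATIN CAPITAL LETTER OI")
alias = unicodedata.lookup("LATIN CAPITAL LETTER GHA")
print(primary == alias, f"U+{ord(alias):04X}")   # True U+01A2

# The reverse direction gives the primary name, never an alias.
print(unicodedata.name(alias))                   # LATIN CAPITAL LETTER OI

# Control characters have no primary name, so name() needs a default
# to avoid raising ValueError.
print(unicodedata.name("\x08", "<no name>"))     # <no name>
```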
Reason to add an alias
There are five possible reasons to assign an alias name to a code point.[1] A character can have multiple aliases: for example U+0008 <control-0008> has control alias BACKSPACE and abbreviation alias BS.
- 1. Abbreviation
- Commonly occurring abbreviations (or acronyms) for control codes, format characters, spaces, and variation selectors.
- There are 352 such aliases, including 256 aliases for variation selectors (VS1 ... VS256).
- For example, U+00A0 NO-BREAK SPACE has alias NBSP.
- Presentation: in the code charts, the abbreviation is shown in a dashed box: NBSP.
- 2. Control
- ISO 6429 names for C0 and C1 control functions, and similar commonly occurring names, are added as aliases to the character.
- There are 84 such aliases.
- For example, U+0008 <control-0008> has alias BACKSPACE.
- Presentation: Control characters do not have a primary name; they are labeled like <control-0008>. An alias name such as BACKSPACE is used in the chart documentation, never as a primary name.
- 3. Correction
- This is a correction for a "serious problem" in the primary character name, usually an error.
- There are 28 such aliases.
- For example, U+2118 ℘ SCRIPT CAPITAL P is actually a lowercase p, and so is given alias name ※ WEIERSTRASS ELLIPTIC FUNCTION: "actually this has the form of a lowercase calligraphic p, despite its name, and through the alias the correct spelling is added."
- Presentation: A corrected name is preceded by symbol ※ (the reference mark).
- 4. Alternate
- A few widely used alternate names for format characters.
- There is 1 such alias.
- Example: U+FEFF ZERO WIDTH NO-BREAK SPACE has alternate BYTE ORDER MARK.
- Presentation: listed in the description in the character charts.
- 5. Figment
- Several documented labels for C1 control code points which were never actually approved in any standard (a figment is something feigned or invented).
- There are 3 such aliases.
- For example, U+0099 <control-0099> has figment alias SINGLE GRAPHIC CHARACTER INTRODUCER. This name is an architectural concept from early drafts of ISO/IEC 10646-1, but it was never approved and standardized.
- Presentation: These figment abbreviations are not published in the Standard; the chart informally shows "XXX" for each, which is not a unique or identifying abbreviation.
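The five alias types can be read mechanically from NameAliases.txt, whose data lines have three semicolon-separated fields: code point, alias, and type. The following is a minimal sketch that parses a small embedded sample built from the examples above rather than the real file; the helper name parse_aliases is hypothetical.

```python
from collections import defaultdict

# A few lines in the semicolon-separated layout of NameAliases.txt
# (code point;alias;type). Entries are taken from the examples above,
# not copied from the complete file.
SAMPLE = """\
# code point;alias;type
0008;BACKSPACE;control
0008;BS;abbreviation
00A0;NBSP;abbreviation
0099;SINGLE GRAPHIC CHARACTER INTRODUCER;figment
2118;WEIERSTRASS ELLIPTIC FUNCTION;correction
FEFF;BYTE ORDER MARK;alternate
"""


def parse_aliases(text):
    """Map each code point to a list of (alias, type) pairs."""
    aliases = defaultdict(list)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        code, alias, kind = line.split(";")
        aliases[int(code, 16)].append((alias, kind))
    return dict(aliases)


table = parse_aliases(SAMPLE)
for code, entries in sorted(table.items()):
    print(f"U+{code:04X}", entries)
```

In this sample, U+0008 ends up with two entries, matching the statement that a character can carry both a control alias and an abbreviation alias.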
Formal aliases
U+ | HTML decimal | Name or <label> | Alias abbreviation | Alias name | Reason | Chart | Note |
---|---|---|---|---|---|---|---|
0000 | � |
<control-0000> | NUL |
NULL | Control | C0 Controls and Basic Latin (pdf) | |
0001 |  |
<control-0001> | SOH |
START OF HEADING | Control | C0 Controls and Basic Latin (pdf) | |
0002 |  |
<control-0002> | STX |
START OF TEXT | Control | C0 Controls and Basic Latin (pdf) | |
0003 |  |
<control-0003> | ETX |
END OF TEXT | Control | C0 Controls and Basic Latin (pdf) | |
0004 |  |
<control-0004> | EOT |
END OF TRANSMISSION | Control | C0 Controls and Basic Latin (pdf) | |
0005 |  |
<control-0005> | ENQ |
ENQUIRY | Control | C0 Controls and Basic Latin (pdf) | |
0006 |  |
<control-0006> | ACK |
ACKNOWLEDGE | Control | C0 Controls and Basic Latin (pdf) | |
0007 |  |
<control-0007> | BEL |
ALERT | Control | C0 Controls and Basic Latin (pdf) | |
0008 |  |
<control-0008> | BS |
BACKSPACE | Control | C0 Controls and Basic Latin (pdf) | |
0009 | 	 	 |
<control-0009> | TAB |
CHARACTER TABULATION | Control | C0 Controls and Basic Latin (pdf) | |
HT |
HORIZONTAL TABULATION | Control | |||||
000A | |
<control-000A> | LF |
LINE FEED | Control | C0 Controls and Basic Latin (pdf) | |
NL |
NEW LINE | Control | |||||
EOL |
END OF LINE | Control | |||||
000B |  |
<control-000B> | LINE TABULATION | Control | C0 Controls and Basic Latin (pdf) | ||
VT |
VERTICAL TABULATION | Control | |||||
000C |  |
<control-000C> | FF |
FORM FEED | Control | C0 Controls and Basic Latin (pdf) | |
000D | |
<control-000D> | CR |
CARRIAGE RETURN | Control | C0 Controls and Basic Latin (pdf) | |
000E |  |
<control-000E> | SO |
SHIFT OUT | Control | C0 Controls and Basic Latin (pdf) | |
LOCKING-SHIFT ONE | Control | ||||||
000F |  |
<control-000F> | SI |
SHIFT IN | Control | C0 Controls and Basic Latin (pdf) | |
LOCKING-SHIFT ZERO | Control | ||||||
0010 |  |
<control-0010> | DLE |
DATA LINK ESCAPE | Control | C0 Controls and Basic Latin (pdf) | |
0011 |  |
<control-0011> | DC1 |
DEVICE CONTROL ONE | Control | C0 Controls and Basic Latin (pdf) | |
0012 |  |
<control-0012> | DC2 |
DEVICE CONTROL TWO | Control | C0 Controls and Basic Latin (pdf) | |
0013 |  |
<control-0013> | DC3 |
DEVICE CONTROL THREE | Control | C0 Controls and Basic Latin (pdf) | |
0014 |  |
<control-0014> | DC4 |
DEVICE CONTROL FOUR | Control | C0 Controls and Basic Latin (pdf) | |
0015 |  |
<control-0015> | NAK |
NEGATIVE ACKNOWLEDGE | Control | C0 Controls and Basic Latin (pdf) | |
0016 |  |
<control-0016> | SYN |
SYNCHRONOUS IDLE | Control | C0 Controls and Basic Latin (pdf) | |
0017 |  |
<control-0017> | ETB |
END OF TRANSMISSION BLOCK | Control | C0 Controls and Basic Latin (pdf) | |
0018 |  |
<control-0018> | CAN |
CANCEL | Control | C0 Controls and Basic Latin (pdf) | |
0019 |  |
<control-0019> | EOM |
END OF MEDIUM | Control | C0 Controls and Basic Latin (pdf) | |
001A |  |
<control-001A> | SUB |
SUBSTITUTE | Control | C0 Controls and Basic Latin (pdf) | |
001B |  |
<control-001B> | ESC |
ESCAPE | Control | C0 Controls and Basic Latin (pdf) | |
001C |  |
<control-001C> | INFORMATION SEPARATOR FOUR | Control | C0 Controls and Basic Latin (pdf) | ||
FS |
FILE SEPARATOR | Control | |||||
001D |  |
<control-001D> | INFORMATION SEPARATOR THREE | Control | C0 Controls and Basic Latin (pdf) | ||
GS |
GROUP SEPARATOR | Control | |||||
001E |  |
<control-001E> | INFORMATION SEPARATOR TWO | Control | C0 Controls and Basic Latin (pdf) | ||
RS |
RECORD SEPARATOR | Control | |||||
001F |  |
<control-001F> | INFORMATION SEPARATOR ONE | Control | C0 Controls and Basic Latin (pdf) | ||
US |
UNIT SEPARATOR | Control | |||||
0020 |   |
SPACE | SP |
Abbreviation | C0 Controls and Basic Latin (pdf) | ||
007F |  |
<control-007F> | DEL |
DELETE | Control | C0 Controls and Basic Latin (pdf) | |
0080 | € |
<control-0080> | PAD |
PADDING CHARACTER | Figment | C1 Controls and Latin-1 Supplement (pdf) | Aliases are not widely published by Unicode; chart shows non-unique XXX |
0081 |  |
<control-0081> | HOP |
HIGH OCTET PRESET | Figment | C1 Controls and Latin-1 Supplement (pdf) | Aliases are not widely published by Unicode; chart shows non-unique XXX |
0082 | ‚ |
<control-0082> | BPH |
BREAK PERMITTED HERE | Control | C1 Controls and Latin-1 Supplement (pdf) | |
0083 | ƒ |
<control-0083> | NBH |
NO BREAK HERE | Control | C1 Controls and Latin-1 Supplement (pdf) | |
0084 | „ |
<control-0084> | IND |
INDEX | Control | C1 Controls and Latin-1 Supplement (pdf) | |
0085 | … |
<control-0085> | NEL |
NEXT LINE | Control | C1 Controls and Latin-1 Supplement (pdf) | |
0086 | † |
<control-0086> | SSA |
START OF SELECTED AREA | Control | C1 Controls and Latin-1 Supplement (pdf) | |
0087 | ‡ |
<control-0087> | ESA |
END OF SELECTED AREA | Control | C1 Controls and Latin-1 Supplement (pdf) | |
0088 | ˆ |
<control-0088> | CHARACTER TABULATION SET | Control | C1 Controls and Latin-1 Supplement (pdf) | ||
HTS |
HORIZONTAL TABULATION SET | Control | |||||
0089 | ‰ |
<control-0089> | CHARACTER TABULATION WITH JUSTIFICATION | Control | C1 Controls and Latin-1 Supplement (pdf) | ||
HTJ |
HORIZONTAL TABULATION WITH JUSTIFICATION | Control | |||||
008A | Š |
<control-008A> | LINE TABULATION SET | Control | C1 Controls and Latin-1 Supplement (pdf) | ||
VTS |
VERTICAL TABULATION SET | Control | |||||
008B | ‹ |
<control-008B> | PARTIAL LINE FORWARD | Control | C1 Controls and Latin-1 Supplement (pdf) | ||
PLD |
PARTIAL LINE DOWN | Control | |||||
008C | Œ |
<control-008C> | PARTIAL LINE BACKWARD | Control | C1 Controls and Latin-1 Supplement (pdf) | ||
PLU |
PARTIAL LINE UP | Control | |||||
008D |  |
<control-008D> | REVERSE LINE FEED | Control | C1 Controls and Latin-1 Supplement (pdf) | ||
RI |
REVERSE INDEX | Control | |||||
008E | Ž |
<control-008E> | SINGLE SHIFT TWO | Control | C1 Controls and Latin-1 Supplement (pdf) | ||
SS2 |
SINGLE-SHIFT-2 | Control | |||||
008F |  |
<control-008F> | SINGLE SHIFT THREE | Control | C1 Controls and Latin-1 Supplement (pdf) | ||
SS3 |
SINGLE-SHIFT-3 | Control | |||||
0090 |  |
<control-0090> | DCS |
DEVICE CONTROL STRING | Control | C1 Controls and Latin-1 Supplement (pdf) | |
0091 | ‘ |
<control-0091> | PRIVATE USE ONE | Control | C1 Controls and Latin-1 Supplement (pdf) | ||
PU1 |
PRIVATE USE-1 | Control | |||||
0092 | ’ |
<control-0092> | PRIVATE USE TWO | Control | C1 Controls and Latin-1 Supplement (pdf) | ||
PU2 |
PRIVATE USE-2 | Control | |||||
0093 | “ |
<control-0093> | STS |
SET TRANSMIT STATE | Control | C1 Controls and Latin-1 Supplement (pdf) | |
0094 | ” |
<control-0094> | CCH |
CANCEL CHARACTER | Control | C1 Controls and Latin-1 Supplement (pdf) | |
0095 | • |
<control-0095> | MW |
MESSAGE WAITING | Control | C1 Controls and Latin-1 Supplement (pdf) | |
0096 | – |
<control-0096> | START OF GUARDED AREA | Control | C1 Controls and Latin-1 Supplement (pdf) | ||
SPA |
START OF PROTECTED AREA | Control | |||||
0097 | — |
<control-0097> | END OF GUARDED AREA | Control | C1 Controls and Latin-1 Supplement (pdf) | ||
EPA |
END OF PROTECTED AREA | Control | |||||
0098 | ˜ |
<control-0098> | SOS |
START OF STRING | Control | C1 Controls and Latin-1 Supplement (pdf) | |
0099 | ™ |
<control-0099> | SGC |
SINGLE GRAPHIC CHARACTER INTRODUCER | Figment | C1 Controls and Latin-1 Supplement (pdf) | Aliases are not widely published by Unicode; chart shows non-unique XXX |
009A | š |
<control-009A> | SCI |
SINGLE CHARACTER INTRODUCER | Control | C1 Controls and Latin-1 Supplement (pdf) | |
009B | › |
<control-009B> | CSI |
CONTROL SEQUENCE INTRODUCER | Control | C1 Controls and Latin-1 Supplement (pdf) | |
009C | œ |
<control-009C> | ST |
STRING TERMINATOR | Control | C1 Controls and Latin-1 Supplement (pdf) | |
009D |  |
<control-009D> | OSC |
OPERATING SYSTEM COMMAND | Control | C1 Controls and Latin-1 Supplement (pdf) | |
009E | ž |
<control-009E> | PM |
PRIVACY MESSAGE | Control | C1 Controls and Latin-1 Supplement (pdf) | |
009F | Ÿ |
<control-009F> | APC |
APPLICATION PROGRAM COMMAND | Control | C1 Controls and Latin-1 Supplement (pdf) | |
00A0 | ,     |
NO-BREAK SPACE | NBSP |
Abbreviation | C1 Controls and Latin-1 Supplement (pdf) | ||
00AD | ­ ­ |
SOFT HYPHEN | SHY |
Abbreviation | C1 Controls and Latin-1 Supplement (pdf) | ||
01A2 | Ƣ |
LATIN CAPITAL LETTER OI | LATIN CAPITAL LETTER GHA | ※ Correction | Latin Extended-B (pdf) | ||
01A3 | ƣ |
LATIN SMALL LETTER OI | LATIN SMALL LETTER GHA | ※ Correction | Latin Extended-B (pdf) | ||
034F | ͏ |
COMBINING GRAPHEME JOINER | CGJ |
Abbreviation | Combining Diacritical Marks (pdf) | The name of this character is misleading; it does not actually join graphemes | |
061C | ؜ |
ARABIC LETTER MARK | ALM |
Abbreviation | Arabic (pdf) | See RLM | |
0709 | ܉ |
SYRIAC SUBLINEAR COLON SKEWED RIGHT | SYRIAC SUBLINEAR COLON SKEWED LEFT | ※ Correction | Syriac (pdf) | ||
0CDE | ೞ |
KANNADA LETTER FA | KANNADA LETTER LLLA | ※ Correction | Kannada (pdf) | ||
0E9D | ຝ |
LAO LETTER FO TAM | LAO LETTER FO FON | ※ Correction | Lao (pdf) | ||
0E9F | ຟ |
LAO LETTER FO SUNG | LAO LETTER FO FAY | ※ Correction | Lao (pdf) | ||
0EA3 | ຣ |
LAO LETTER LO LING | LAO LETTER RO | ※ Correction | Lao (pdf) | ||
0EA5 | ລ |
LAO LETTER LO LOOT | LAO LETTER LO | ※ Correction | Lao (pdf) | ||
0FD0 | ࿐ |
TIBETAN MARK BSKA- SHOG GI MGO RGYAN | TIBETAN MARK BKA- SHOG GI MGO RGYAN | ※ Correction | Tibetan (pdf) | ||
11EC | ᇬ |
HANGUL JONGSEONG IEUNG-KIYEOK | HANGUL JONGSEONG YESIEUNG-KIYEOK | ※ Correction | Hangul Jamo (pdf) | ||
11ED | ᇭ |
HANGUL JONGSEONG IEUNG-SSANGKIYEOK | HANGUL JONGSEONG YESIEUNG-SSANGKIYEOK | ※ Correction | Hangul Jamo (pdf) | ||
11EE | ᇮ |
HANGUL JONGSEONG SSANGIEUNG | HANGUL JONGSEONG SSANGYESIEUNG | ※ Correction | Hangul Jamo (pdf) | ||
11EF | ᇯ |
HANGUL JONGSEONG IEUNG-KHIEUKH | HANGUL JONGSEONG YESIEUNG-KHIEUKH | ※ Correction | Hangul Jamo (pdf) | ||
180B | ᠋ |
MONGOLIAN FREE VARIATION SELECTOR ONE | FVS1 |
Abbreviation | Mongolian (pdf) | ||
180C | ᠌ |
MONGOLIAN FREE VARIATION SELECTOR TWO | FVS2 |
Abbreviation | Mongolian (pdf) | ||
180D | ᠍ |
MONGOLIAN FREE VARIATION SELECTOR THREE | FVS3 |
Abbreviation | Mongolian (pdf) | ||
180E | ᠎ |
MONGOLIAN VOWEL SEPARATOR | MVS |
Abbreviation | Mongolian (pdf) | ||
200B | ​, ​, ​, ​, ​ ​ |
ZERO WIDTH SPACE | ZWSP |
Abbreviation | General Punctuation (pdf) | ||
200C | ‌ ‌ |
ZERO WIDTH NON-JOINER | ZWNJ |
Abbreviation | General Punctuation (pdf) | ||
200D | ‍ ‍ |
ZERO WIDTH JOINER | ZWJ |
Abbreviation | General Punctuation (pdf) | ||
200E | ‎ ‎ |
LEFT-TO-RIGHT MARK | LRM |
Abbreviation | General Punctuation (pdf) | ||
200F | ‏ ‏ |
RIGHT-TO-LEFT MARK | RLM |
Abbreviation | General Punctuation (pdf) | ||
202A | ‪ |
LEFT-TO-RIGHT EMBEDDING | LRE |
Abbreviation | General Punctuation (pdf) | ||
202B | ‫ |
RIGHT-TO-LEFT EMBEDDING | RLE |
Abbreviation | General Punctuation (pdf) | ||
202C | ‬ |
POP DIRECTIONAL FORMATTING | PDF |
Abbreviation | General Punctuation (pdf) | ||
202D | ‭ |
LEFT-TO-RIGHT OVERRIDE | LRO |
Abbreviation | General Punctuation (pdf) | ||
202E | ‮ |
RIGHT-TO-LEFT OVERRIDE | RLO |
Abbreviation | General Punctuation (pdf) | ||
202F |   |
NARROW NO-BREAK SPACE | NNBSP |
Abbreviation | General Punctuation (pdf) | ||
205F |     |
MEDIUM MATHEMATICAL SPACE | MMSP |
Abbreviation | General Punctuation (pdf) | ||
2060 | ⁠ ⁠ |
WORD JOINER | WJ |
Abbreviation | General Punctuation (pdf) | ||
2066 | ⁦ |
LEFT-TO-RIGHT ISOLATE | LRI |
Abbreviation | General Punctuation (pdf) | ||
2067 | ⁧ |
RIGHT-TO-LEFT ISOLATE | RLI |
Abbreviation | General Punctuation (pdf) | ||
2068 | ⁨ |
FIRST STRONG ISOLATE | FSI |
Abbreviation | General Punctuation (pdf) | ||
2069 | ⁩ |
POP DIRECTIONAL ISOLATE | PDI |
Abbreviation | General Punctuation (pdf) | ||
2118 | ℘, ℘ ℘ |
SCRIPT CAPITAL P | WEIERSTRASS ELLIPTIC FUNCTION | ※ Correction | Letterlike Symbols (pdf) | ||
2448 | ⑈ |
OCR DASH | MICR ON US SYMBOL | ※ Correction | Optical Character Recognition (pdf) | ||
2449 | ⑉ |
OCR CUSTOMER ACCOUNT NUMBER | MICR DASH SYMBOL | ※ Correction | Optical Character Recognition (pdf) | ||
2B7A | ⭺ |
LEFTWARDS TRIANGLE-HEADED ARROW WITH DOUBLE HORIZONTAL STROKE | LEFTWARDS TRIANGLE-HEADED ARROW WITH DOUBLE VERTICAL STROKE | ※ Correction | Miscellaneous Symbols and Arrows (pdf) | ||
2B7C | ⭼ |
RIGHTWARDS TRIANGLE-HEADED ARROW WITH DOUBLE HORIZONTAL STROKE | RIGHTWARDS TRIANGLE-HEADED ARROW WITH DOUBLE VERTICAL STROKE | ※ Correction | Miscellaneous Symbols and Arrows (pdf) | ||
A015 | ꀕ |
YI SYLLABLE WU | YI SYLLABLE ITERATION MARK | ※ Correction | Yi Syllables (pdf) | ||
FE00 ... FE0F |
︀ ... ️ |
VARIATION SELECTOR-1 ... VARIATION SELECTOR-16 |
VS1 ... VS16 |
Abbreviation | Variation Selectors (pdf) | ||
(16 code points) | |||||||
Abbreviation | |||||||
FE18 | ︘ |
PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET | PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET | ※ Correction | Vertical Forms (pdf) | ||
FEFF |  |
ZERO WIDTH NO-BREAK SPACE | BOM |
BYTE ORDER MARK | Alternate | Arabic Presentation Forms-B (pdf) | |
ZWNBSP |
Abbreviation | ||||||
122D4 | 𒋔 |
CUNEIFORM SIGN SHIR TENU | CUNEIFORM SIGN NU11 TENU | ※ Correction | Cuneiform (pdf) | ||
122D5 | 𒋕 |
CUNEIFORM SIGN SHIR OVER SHIR BUR OVER BUR | CUNEIFORM SIGN NU11 OVER NU11 BUR OVER BUR | ※ Correction | Cuneiform (pdf) | ||
16E56 | 𖹖 |
MEDEFAIDRIN CAPITAL LETTER HP | MEDEFAIDRIN CAPITAL LETTER H | ※ Correction | Medefaidrin (pdf) | ||
16E57 | 𖹗 |
MEDEFAIDRIN CAPITAL LETTER NY | MEDEFAIDRIN CAPITAL LETTER NG | ※ Correction | Medefaidrin (pdf) | ||
16E76 | 𖹶 |
MEDEFAIDRIN SMALL LETTER HP | MEDEFAIDRIN SMALL LETTER H | ※ Correction | Medefaidrin (pdf) | ||
16E77 | 𖹷 |
MEDEFAIDRIN SMALL LETTER NY | MEDEFAIDRIN SMALL LETTER NG | ※ Correction | Medefaidrin (pdf) | ||
1B001 | 𛀁 |
HIRAGANA LETTER ARCHAIC YE | HENTAIGANA LETTER E-1 | ※ Correction | Kana Supplement (pdf) | ||
1D0C5 | 𝃅 |
BYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS | BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS | ※ Correction | Byzantine Musical Symbols (pdf) | ||
E0100 ... E01EF |
󠄀 ... 󠇯 |
VARIATION SELECTOR-17 ... VARIATION SELECTOR-256 |
VS17 ... VS256 |
Abbreviation | Variation Selectors Supplement (pdf) | ||
(240 code points) | |||||||
Abbreviation |
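The two variation selector ranges in the table (VS1 to VS16 at U+FE00 to U+FE0F, and VS17 to VS256 at U+E0100 to U+E01EF) follow a simple arithmetic pattern. The sketch below maps an abbreviation number to its code point under that pattern; the function name variation_selector is hypothetical.

```python
def variation_selector(n: int) -> str:
    """Return the variation selector character abbreviated VSn (1 <= n <= 256)."""
    if 1 <= n <= 16:
        return chr(0xFE00 + (n - 1))      # VS1..VS16 -> U+FE00..U+FE0F
    if 17 <= n <= 256:
        return chr(0xE0100 + (n - 17))    # VS17..VS256 -> U+E0100..U+E01EF
    raise ValueError(f"no variation selector VS{n}")


for n in (1, 16, 17, 256):
    print(f"VS{n} -> U+{ord(variation_selector(n)):04X}")
# VS1 -> U+FE00
# VS16 -> U+FE0F
# VS17 -> U+E0100
# VS256 -> U+E01EF
```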
See also
- Control Pictures Separate characters (glyphs) to represent a control character. For example, U+2407 ␇ SYMBOL FOR BELL (U+0007).
- U+FFFD � REPLACEMENT CHARACTER (HTML &#65533;)
- Regional Indicator Symbols in the Enclosed Alphanumeric Supplement (Unicode block)
- Tags (Unicode block)
References
- "NameAliases-13.0.0.txt". The Unicode Consortium. 2019-09-09. Retrieved 2020-03-12.
- "The Unicode Standard" (PDF). 13.0.0. The Unicode Consortium. 2020. ISBN 978-1-936213-26-9.