ANSEL
ANSEL, the American National Standard for Extended Latin Alphabet Coded Character Set for Bibliographic Use, was a character set used in text encoding. It provided a table of coded values for representing characters of the extended Latin alphabet in machine-readable form, covering thirty-five languages written in the Latin alphabet and fifty-one romanized languages. The standard was reaffirmed in 2003 but was administratively withdrawn by ANSI effective 14 February 2013.[1] It is registered as number 231 in the ISO International Register of Coded Character Sets to be Used with Escape Sequences.[2][3]
Alias(es) | ISO-IR 231 |
---|---|
Standard | ANSI/NISO Z39.47 (withdrawn) |
Classification | Extended ASCII, 8-bit encoding |
Extends | US-ASCII |
Extensions | MARC Extended Latin, GEDCOM ANSEL |
ANSEL comprises 63 graphic characters intended for use with ASCII (the American National Standard Code for Information Interchange, ANSI X3.4-1986),[3] of which 29 are combining diacritic characters. A combining diacritic character precedes the spacing character on which it is to be superimposed,[1] the reverse of the Unicode convention, in which a combining mark follows its base character. The initial revision of ANSEL was released in 1985.
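Because ANSEL places a diacritic before its base character, a converter to Unicode has to hold each diacritic back until the base character arrives. The following Python sketch illustrates that reordering. The function name `decode_ansel` and the small table `ANSEL_SUBSET` are illustrative assumptions rather than part of any standard or library, and the table covers only a handful of code points from the chart in this article.

```python
import unicodedata

# Illustrative subset of the ANSEL high half (byte -> Unicode character).
# A real decoder would need the full code chart.
ANSEL_SUBSET = {
    0xA1: "\u0141",  # LATIN CAPITAL LETTER L WITH STROKE
    0xB1: "\u0142",  # LATIN SMALL LETTER L WITH STROKE
    0xE1: "\u0300",  # COMBINING GRAVE ACCENT
    0xE2: "\u0301",  # COMBINING ACUTE ACCENT
    0xE8: "\u0308",  # COMBINING DIAERESIS
    0xF0: "\u0327",  # COMBINING CEDILLA
}


def decode_ansel(data: bytes, table=ANSEL_SUBSET) -> str:
    """Decode ANSEL bytes, moving each diacritic after its base character."""
    out = []
    pending = []  # diacritics read so far that still wait for a base character
    for byte in data:
        # Bytes below 0x80 are plain ASCII; others are looked up in the table
        # (a KeyError signals a byte this sketch does not cover).
        ch = chr(byte) if byte < 0x80 else table[byte]
        if byte >= 0xE0:
            # Rows E_ and F_ of the chart hold the combining diacritics.
            pending.append(ch)
        else:
            out.append(ch)
            out.extend(pending)  # Unicode order: base first, then marks
            pending.clear()
    out.extend(pending)  # keep any trailing diacritics that never got a base
    # Compose base + mark pairs into precomposed characters where they exist.
    return unicodedata.normalize("NFC", "".join(out))


print(decode_ansel(b"Jos\xe2e"))     # José
print(decode_ansel(b"M\xe8uller"))   # Müller
```

Normalizing to NFC at the end is a design choice of this sketch; a caller that prefers decomposed text could skip it and keep the reordered sequence as is.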
Code page layout
The following table shows ANSI/NISO Z39.47-1993 (R2003).[1] Each character is shown with its Unicode equivalent.
| | _0 | _1 | _2 | _3 | _4 | _5 | _6 | _7 | _8 | _9 | _A | _B | _C | _D | _E | _F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0_ | NUL 0000 | SOH 0001 | STX 0002 | ETX 0003 | EOT 0004 | ENQ 0005 | ACK 0006 | BEL 0007 | BS 0008 | HT 0009 | LF 000A | VT 000B | FF 000C | CR 000D | SO 000E | SI 000F |
| 1_ | DLE 0010 | DC1 0011 | DC2 0012 | DC3 0013 | DC4 0014 | NAK 0015 | SYN 0016 | ETB 0017 | CAN 0018 | EM 0019 | SUB 001A | ESC 001B | FS 001C | GS 001D | RS 001E | US 001F |
| 2_ | SP 0020 | ! 0021 | " 0022 | # 0023 | $ 0024 | % 0025 | & 0026 | ' 0027 | ( 0028 | ) 0029 | * 002A | + 002B | , 002C | - 002D | . 002E | / 002F |
| 3_ | 0 0030 | 1 0031 | 2 0032 | 3 0033 | 4 0034 | 5 0035 | 6 0036 | 7 0037 | 8 0038 | 9 0039 | : 003A | ; 003B | < 003C | = 003D | > 003E | ? 003F |
| 4_ | @ 0040 | A 0041 | B 0042 | C 0043 | D 0044 | E 0045 | F 0046 | G 0047 | H 0048 | I 0049 | J 004A | K 004B | L 004C | M 004D | N 004E | O 004F |
| 5_ | P 0050 | Q 0051 | R 0052 | S 0053 | T 0054 | U 0055 | V 0056 | W 0057 | X 0058 | Y 0059 | Z 005A | [ 005B | \ 005C | ] 005D | ^ 005E | _ 005F |
| 6_ | ` 0060 | a 0061 | b 0062 | c 0063 | d 0064 | e 0065 | f 0066 | g 0067 | h 0068 | i 0069 | j 006A | k 006B | l 006C | m 006D | n 006E | o 006F |
| 7_ | p 0070 | q 0071 | r 0072 | s 0073 | t 0074 | u 0075 | v 0076 | w 0077 | x 0078 | y 0079 | z 007A | { 007B | \| 007C | } 007D | ~ 007E | DEL 007F |
| 8_ | | | | | | | | | | | | | | | | |
| 9_ | | | | | | | | | | | | | | | | |
| A_ | | Ł 0141 | Ø 00D8 | Đ 0110 | Þ 00DE | Æ 00C6 | Œ 0152 | ʹ 02B9 | · 00B7 | ♭ 266D | ® 00AE | ± 00B1 | Ơ 01A0 | Ư 01AF | ʼ 02BC | |
| B_ | ʻ 02BB | ł 0142 | ø 00F8 | đ 0111 | þ 00FE | æ 00E6 | œ 0153 | ʺ 02BA | ı 0131 | £ 00A3 | ð 00F0 | | ơ 01A1 | ư 01B0 | | |
| C_ | ° 00B0 | ℓ 2113 | ℗ 2117 | © 00A9 | ♯ 266F | ¿ 00BF | ¡ 00A1 | | | | | | | | | |
| D_ | | | | | | | | | | | | | | | | |
| E_ | ̉ 0309 | ̀ 0300 | ́ 0301 | ̂ 0302 | ̃ 0303 | ̄ 0304 | ̆ 0306 | ̇ 0307 | ̈ 0308 | ̌ 030C | ̊ 030A | ︠ FE20 | ︡ FE21 | ̕ 0315 | ̋ 030B | ̐ 0310 |
| F_ | ̧ 0327 | ̨ 0328 | ̣ 0323 | ̤ 0324 | ̥ 0325 | ̳ 0333 | ̲ 0332 | ̦ 0326 | ̜ 031C | ̮ 032E | ︢ FE22 | ︣ FE23 | | | ̓ 0313 | |
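The code chart can be turned into a lookup table and checked against the counts given earlier (63 graphic characters, 29 of them combining). The Python sketch below is one way to do that. The names `ANSEL_HIGH` and `_fill` are illustrative, and the positions of the unassigned cells (0xA0, 0xBB, 0xFC, 0xFD and others) are an assumption based on the published ANSEL and MARC 21 layouts, so they should be verified against the standard before use.

```python
import unicodedata

# Byte -> Unicode character for the ANSEL high half (0xA1 to 0xFE).
ANSEL_HIGH = {}


def _fill(start, code_points):
    """Assign consecutive bytes from `start`; None marks an unassigned cell."""
    for offset, cp in enumerate(code_points):
        if cp is not None:
            ANSEL_HIGH[start + offset] = chr(cp)


# Row A_ (0xA0 itself is unassigned, so the row starts at 0xA1).
_fill(0xA1, [0x0141, 0x00D8, 0x0110, 0x00DE, 0x00C6, 0x0152, 0x02B9,
             0x00B7, 0x266D, 0x00AE, 0x00B1, 0x01A0, 0x01AF, 0x02BC])
# Row B_ (0xBB is assumed unassigned).
_fill(0xB0, [0x02BB, 0x0142, 0x00F8, 0x0111, 0x00FE, 0x00E6, 0x0153,
             0x02BA, 0x0131, 0x00A3, 0x00F0, None, 0x01A1, 0x01B0])
# Row C_.
_fill(0xC0, [0x00B0, 0x2113, 0x2117, 0x00A9, 0x266F, 0x00BF, 0x00A1])
# Rows E_ and F_: combining diacritics (0xFC and 0xFD assumed unassigned).
_fill(0xE0, [0x0309, 0x0300, 0x0301, 0x0302, 0x0303, 0x0304, 0x0306,
             0x0307, 0x0308, 0x030C, 0x030A, 0xFE20, 0xFE21, 0x0315,
             0x030B, 0x0310])
_fill(0xF0, [0x0327, 0x0328, 0x0323, 0x0324, 0x0325, 0x0333, 0x0332,
             0x0326, 0x031C, 0x032E, 0xFE22, 0xFE23, None, None, 0x0313])

# A non-zero canonical combining class identifies a combining mark.
combining = [b for b, ch in ANSEL_HIGH.items() if unicodedata.combining(ch)]

print(len(ANSEL_HIGH))   # 63
print(len(combining))    # 29
```

Every combining entry in this table falls at 0xE0 or above, which is why a decoder can use the byte value alone to decide whether a character must be moved behind its base character.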
Use
GEDCOM
The GEDCOM specification for exchanging genealogical data names ANSEL (ANSI/NISO Z39.47-1985) as a valid text encoding for GEDCOM files and extends it with the additional characters shown in the following table.[4][5]
Hex | Unicode | Glyph | Description |
---|---|---|---|
0xBE | 25A1 | □ | empty box |
0xBF | 25A0 | ■ | black box |
0xCD | 0065 | e | midline e |
0xCE | 006F | o | midline o |
0xCF | 00DF | ß | es zet |
0xFC | 0338 | ̸ | diacritic slash through char |
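The GEDCOM additions can be modelled as an overlay on a base ANSEL table. The Python sketch below shows one possible approach. The helper `with_extension`, the stand-in table `BASE_SUBSET` and the function `is_prefix_diacritic` are hypothetical names introduced for illustration, and `BASE_SUBSET` holds only two entries in place of the full code chart.

```python
import unicodedata

# GEDCOM additions to ANSEL, taken from the table above.
GEDCOM_EXTRA = {
    0xBE: "\u25A1",  # empty box
    0xBF: "\u25A0",  # black box
    0xCD: "e",       # midline e
    0xCE: "o",       # midline o
    0xCF: "\u00DF",  # es zet
    0xFC: "\u0338",  # diacritic slash through char
}

# Stand-in for the full ANSEL table (assumption: a real table has 63 entries).
BASE_SUBSET = {0xA1: "\u0141", 0xE2: "\u0301"}


def with_extension(base: dict, extra: dict) -> dict:
    """Return base plus extra, refusing to silently redefine a base byte."""
    clash = sorted(base.keys() & extra.keys())
    if clash:
        names = ", ".join(f"0x{b:02X}" for b in clash)
        raise ValueError(f"extension redefines bytes: {names}")
    return {**base, **extra}


def is_prefix_diacritic(byte: int, table: dict) -> bool:
    """True if the byte maps to a combining mark, which precedes its base."""
    return unicodedata.combining(table[byte]) != 0


GEDCOM_TABLE = with_extension(BASE_SUBSET, GEDCOM_EXTRA)
print(is_prefix_diacritic(0xFC, GEDCOM_TABLE))  # True
print(is_prefix_diacritic(0xBE, GEDCOM_TABLE))  # False
```

Of the six additions, only 0xFC maps to a combining mark, so it is the only one a decoder would need to reorder behind the following base character.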
MARC 21
The Extended Latin character set from MARC 21 is synchronized with ANSEL[3] but additionally supports the eszett (ß) at 0xC7 and the euro sign (€) at 0xC8.[6]
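One practical consequence is that the two variants described in this section place the eszett at different bytes, so a decoder has to know which variant produced a file. The short Python sketch below compares the two sets of additions. The dictionary names and the `positions` helper are illustrative assumptions, and only the additions named in this article are included.

```python
# Additions described in this article, keyed by byte value.
GEDCOM_EXTRA = {
    0xBE: "\u25A1", 0xBF: "\u25A0", 0xCD: "e",
    0xCE: "o", 0xCF: "\u00DF", 0xFC: "\u0338",
}
MARC21_EXTRA = {
    0xC7: "\u00DF",  # eszett
    0xC8: "\u20AC",  # euro sign
}


def positions(char: str, extension: dict) -> list:
    """Return the byte positions, as hex strings, that map to char."""
    return [f"0x{b:02X}" for b, c in sorted(extension.items()) if c == char]


print(positions("\u00DF", MARC21_EXTRA))  # ['0xC7']
print(positions("\u00DF", GEDCOM_EXTRA))  # ['0xCF']
print(positions("\u20AC", GEDCOM_EXTRA))  # []
```

The two sets of additions do not share any byte value, so a single table merging both would decode either variant, but an encoder would still have to pick one position for the eszett.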
References
- "Project Overview: ANSI/NISO Z39.47-1993 (R2003) Extended Latin Alphabet Coded Character Set for Bibliographic Use (ANSEL) (Inactive)". National Information Standards Organization. Archived from the original on 14 March 2014. Retrieved 5 May 2014.
- Extended Latin Alphabet Coded Character Set for Bibliographic Use (PDF) (National information standard specification). 1993 (R2003). Bethesda, Maryland: NISO Press. 3 May 1993. ISBN 1-880124-02-5. ISSN 1041-5653. OCLC 25546245. OL 12137795M. ANSI/NISO Z39.47-1993 (R2003). Archived from the original (PDF) on 14 March 2014. Retrieved 5 May 2014.
- "International Register Of Coded Character Sets To Be Used With Escape Sequences (Registration Listing Ordered By Registration Number)". International Register Of Coded Character Sets To Be Used With Escape Sequences. Information Technology Standards Commission of Japan. Archived from the original on 9 April 2014. Retrieved 5 May 2014.
- The Church of Jesus Christ of Latter-day Saints, Family History Department (2 December 1995). "Appendix D: ANSEL Character Set". The GEDCOM Standard Release 5.5 (Information standard specification). Salt Lake City, Utah: The Church of Jesus Christ of Latter-day Saints. pp. 87–89.
- The Church of Jesus Christ of Latter-day Saints, Family History Department (4 November 1993). The GEDCOM Standard Release 5.3 (Information standard specification). Salt Lake City, Utah: The Church of Jesus Christ of Latter-day Saints. pp. 67–72.
- "MARC 21 Specifications for Record Structure, Character Sets, and Exchange Media: Code Table Extended Latin (ANSEL)". Library Standards at the Library of Congress. Library of Congress. December 2007.