ISO/IEC 8859-8

ISO/IEC 8859-8, Information technology — 8-bit single-byte coded graphic character sets — Part 8: Latin/Hebrew alphabet, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings. ISO/IEC 8859-8:1999 is the second and current edition; the first edition, ISO/IEC 8859-8:1988, was published in 1988. It is informally referred to as Latin/Hebrew. ISO/IEC 8859-8 covers all the Hebrew letters, but no Hebrew vowel signs. IBM assigned code page 916 (CCSIDs 916 and 5012) to it.[2][3][4] This character set was also adopted, with some extensions, by Israeli Standard SI1311:2002.

ISO-8859-8: Latin/Hebrew
MIME / IANA: ISO-8859-8
Alias(es): iso-ir-138, hebrew, csISOLatinHebrew[1]
Language(s): Hebrew, English
Standard: ISO/IEC 8859-8, SI 1311
Classification: extended ASCII, ISO 8859
Based on: DEC Hebrew (8-bit), ISO/IEC 8859-1
Other related encoding(s): Windows-1255

ISO-8859-8 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. The text is usually stored in logical order, so bidirectional (bidi) processing is required for display. Nominally, ISO-8859-8 (code page 28598) denotes visual order and ISO-8859-8-I (code page 38598) denotes logical order. In practice, however, and as required for XML documents, ISO-8859-8 usually also denotes logical-order text. The WHATWG Encoding Standard used by HTML5 treats ISO-8859-8 and ISO-8859-8-I as distinct encodings with the same mapping because the former influences the layout direction; it notes that this distinction no longer applies to ISO-8859-6 (Arabic), only to ISO-8859-8.[5]
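As a rough illustration of the difference between logical and visual order, the sketch below decodes a few ISO-8859-8 bytes with Python's built-in iso8859_8 codec. The sample bytes and the helper visual_to_logical_naive are assumptions made for this example; the helper only handles a line made up purely of Hebrew letters and is not a bidi implementation.

```python
# Bytes for the Hebrew word "shalom" stored in logical (reading) order:
# shin (0xF9), lamed (0xEC), vav (0xE5), final mem (0xED).
logical_bytes = bytes([0xF9, 0xEC, 0xE5, 0xED])

text = logical_bytes.decode("iso8859_8")
code_points = [f"U+{ord(ch):04X}" for ch in text]
print(code_points)  # ['U+05E9', 'U+05DC', 'U+05D5', 'U+05DD']


def visual_to_logical_naive(line: bytes) -> str:
    """Hypothetical helper: treat `line` as visual-order Hebrew letters only.

    For a line consisting solely of right-to-left letters, visual order is
    the reverse of logical order. Mixed text (digits, Latin letters,
    punctuation) needs a real bidi algorithm instead.
    """
    return line.decode("iso8859_8")[::-1]


# The same word as it would be stored in visual order (left to right on screen).
visual_bytes = logical_bytes[::-1]
assert visual_to_logical_naive(visual_bytes) == text
```

Real documents mix directions, so a production converter would rely on a full implementation of the Unicode Bidirectional Algorithm rather than a simple reversal.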

There is also ISO-8859-8-E, which supposedly requires directionality to be explicitly specified with special control characters; this variant is unused in practice.

The Microsoft Windows code page for Hebrew, Windows-1255, is mostly an extension of ISO/IEC 8859-8 without C1 controls, except for the omission of the double underscore, and replacement of the generic currency sign (¤) with the sheqel sign (₪). It adds support for vowel points as combining characters, and some additional punctuation.
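The differences described above can be observed with Python's built-in iso8859_8 and cp1255 codecs. This is a small sketch, assuming those codecs follow the published mappings; decode_byte is a hypothetical helper introduced only for this example.

```python
def decode_byte(value: int, codec: str) -> str:
    """Decode one byte; yields U+FFFD if the codec leaves the byte undefined."""
    return bytes([value]).decode(codec, errors="replace")


# 0xA4: generic currency sign in ISO-8859-8, sheqel sign in Windows-1255.
iso_currency = decode_byte(0xA4, "iso8859_8")
win_currency = decode_byte(0xA4, "cp1255")
print(f"U+{ord(iso_currency):04X}", f"U+{ord(win_currency):04X}")  # U+00A4 U+20AA

# 0xDF: the double underscore (double low line) in ISO-8859-8.
print(f"U+{ord(decode_byte(0xDF, 'iso8859_8')):04X}")  # U+2017

# The Hebrew letters 0xE0..0xFA decode identically in both codecs.
letters = bytes(range(0xE0, 0xFB))
assert letters.decode("iso8859_8") == letters.decode("cp1255")
```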

More than a decade after the standard's publication, Unicode is preferred, at least on the Internet[6] (in practice UTF-8, the dominant encoding for web pages). As of January 2019, ISO-8859-8 is used by less than 0.1% of websites.[7]

Code page layout

ISO/IEC 8859-8[8][9][10][11]
_0 _1 _2 _3 _4 _5 _6 _7 _8 _9 _A _B _C _D _E _F
0_
0
1_
16
2_
32
SP
0020
!
0021
"
0022
#
0023
$
0024
%
0025
&
0026
'
0027
(
0028
)
0029
*
002A
+
002B
,
002C
-
002D
.
002E
/
002F
3_
48
0
0030
1
0031
2
0032
3
0033
4
0034
5
0035
6
0036
7
0037
8
0038
9
0039
:
003A
;
003B
<
003C
=
003D
>
003E
?
003F
4_
64
@
0040
A
0041
B
0042
C
0043
D
0044
E
0045
F
0046
G
0047
H
0048
I
0049
J
004A
K
004B
L
004C
M
004D
N
004E
O
004F
5_
80
P
0050
Q
0051
R
0052
S
0053
T
0054
U
0055
V
0056
W
0057
X
0058
Y
0059
Z
005A
[
005B
\
005C
]
005D
^
005E
_
005F
6_
96
`
0060
a
0061
b
0062
c
0063
d
0064
e
0065
f
0066
g
0067
h
0068
i
0069
j
006A
k
006B
l
006C
m
006D
n
006E
o
006F
7_
112
p
0070
q
0071
r
0072
s
0073
t
0074
u
0075
v
0076
w
0077
x
0078
y
0079
z
007A
{
007B
|
007C
}
007D
~
007E
8_
128
9_
144
A_
160
NBSP
00A0
¢
00A2
£
00A3
¤
00A4
¥
00A5
¦
00A6
§
00A7
¨
00A8
©
00A9
×
00D7
«
00AB
¬
00AC
SHY
00AD
®
00AE
¯
00AF
B_
176
°
00B0
±
00B1
²
00B2
³
00B3
´
00B4
µ
00B5
¶
00B6
·
00B7
¸
00B8
¹
00B9
÷
00F7
»
00BB
¼
00BC
½
00BD
¾
00BE
C_
192
D_
208
‗
2017
E_
224
א
05D0
ב
05D1
ג
05D2
ד
05D3
ה
05D4
ו
05D5
ז
05D6
ח
05D7
ט
05D8
י
05D9
ך
05DA
כ
05DB
ל
05DC
ם
05DD
מ
05DE
ן
05DF
F_
240
נ
05E0
ס
05E1
ע
05E2
ף
05E3
פ
05E4
ץ
05E5
צ
05E6
ק
05E7
ר
05E8
ש
05E9
ת
05EA
LRM
200E
RLM
200F


  Different from DEC Hebrew (8-bit) to match ISO-8859-1.
  Different from both DEC Hebrew (8-bit) and ISO-8859-1.

0xFD is the left-to-right mark (U+200E) and 0xFE is the right-to-left mark (U+200F), as specified in the newer edition, ISO/IEC 8859-8:1999.
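The layout above can be summarised as a small byte-to-code-point function. The following is a minimal sketch written from the table, not a reference implementation; the name iso_8859_8_to_code_point is hypothetical, and the control ranges (0x00 to 0x1F and 0x7F to 0x9F) are assumed to map to the same-numbered code points, as in the IANA ISO-8859-8 charset.

```python
from typing import Optional


def iso_8859_8_to_code_point(byte: int) -> Optional[int]:
    """Return the Unicode code point for `byte`, or None if the cell is undefined."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte out of range")
    if byte <= 0xA0:              # ASCII, C0/C1 controls (assumed), DEL, NBSP
        return byte
    if byte == 0xAA:              # multiplication sign
        return 0x00D7
    if byte == 0xBA:              # division sign
        return 0x00F7
    if 0xA2 <= byte <= 0xBE:      # symbols at the same positions as ISO-8859-1
        return byte
    if byte == 0xDF:              # double low line
        return 0x2017
    if 0xE0 <= byte <= 0xFA:      # alef .. tav, contiguous from U+05D0
        return 0x05D0 + (byte - 0xE0)
    if byte == 0xFD:              # left-to-right mark
        return 0x200E
    if byte == 0xFE:              # right-to-left mark
        return 0x200F
    return None                   # 0xA1, 0xBF-0xDE, 0xFB, 0xFC, 0xFF


# Example: decode a byte string by hand, rejecting undefined cells.
def decode(data: bytes) -> str:
    out = []
    for b in data:
        cp = iso_8859_8_to_code_point(b)
        if cp is None:
            raise ValueError(f"undefined byte 0x{b:02X}")
        out.append(chr(cp))
    return "".join(out)
```

Because the Hebrew letters occupy one contiguous run, the conversion for them is a single offset (0x05D0 minus 0xE0); only the few symbol cells need special cases.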

2002 Israeli Standard extensions

Israeli Standard SI1311:2002 matches ISO/IEC 8859-8:1999 except for a number of additional character allocations for the euro sign, new shekel sign and more advanced explicit bidirectional formatting.[12]

SI1311:2002[12]
_0 _1 _2 _3 _4 _5 _6 _7 _8 _9 _A _B _C _D _E _F
D_
208
€
20AC
₪
20AA
LRO
202D
RLO
202E
PDF
202C
‗
2017
E_
224
א
05D0
ב
05D1
ג
05D2
ד
05D3
ה
05D4
ו
05D5
ז
05D6
ח
05D7
ט
05D8
י
05D9
ך
05DA
כ
05DB
ל
05DC
ם
05DD
מ
05DE
ן
05DF
F_
240
נ
05E0
ס
05E1
ע
05E2
ף
05E3
פ
05E4
ץ
05E5
צ
05E6
ק
05E7
ר
05E8
ש
05E9
ת
05EA
LRE
202A
RLE
202B
LRM
200E
RLM
200F
  Absent from ISO/IEC 8859-8:1999, added in SI1311:2002.

See also

  • 8-bit DEC Hebrew (similar DEC code page)
  • Code page 1255 (similar Windows code page)
  • SI 960
  • 7-bit DEC Hebrew

References

  1. Character Sets, Internet Assigned Numbers Authority (IANA), 2018-12-12
  2. "Code page 916 information document". Archived from the original on 2017-02-16.
  3. "CCSID 916 information document". Archived from the original on 2014-11-29.
  4. "CCSID 5012 information document". Archived from the original on 2016-03-27.
  5. van Kesteren, Anne. "9. Legacy single-byte encodings". Encoding Standard. WHATWG. Note: ISO-8859-8 and ISO-8859-8-I are distinct encoding names, because ISO-8859-8 has influence on the layout direction. And although historically this might have been the case for ISO-8859-6 and "ISO-8859-6-I" as well, that is no longer true.
  6. John, Nicholas A. (2013). "The Construction of the Multilingual Internet: Unicode, Hebrew, and Globalization". Journal of Computer-Mediated Communication. 18 (3): 321–338. doi:10.1111/jcc4.12015. ISSN 1083-6101. Background: the problem of Hebrew and the Internet
  7. "Usage Statistics of ISO-8859-8 for Websites, January 2019". w3techs.com. Retrieved 2019-01-17.
  8. Code Page CPGID 00916 (PDF), IBM
  9. Code Page CPGID 00916 (txt), IBM
  10. International Components for Unicode (ICU), ibm-916_P100-1995.ucm, 2002-12-03
  11. International Components for Unicode (ICU), ibm-5012_P100-1999.ucm, 2002-12-03
  12. Standards Institution of Israel. "ISO-IR 234: Latin/Hebrew character set for 8-bit codes" (PDF). Information Technology Standards Commission of Japan (ITSCJ/IPSJ).
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.