Code page 852

Code page 852 (CCSID 852) (also known as CP 852, IBM 00852, OEM 852 (Latin II),[2][3] MS-DOS Latin 2[4]) is a code page used under DOS to write Central European languages that use Latin script (such as Bosnian, Croatian, Czech, Hungarian, Polish, Romanian, Serbian, Slovak or Slovene).[5]

OEM 852 (DOS-Latin 2)
MIME / IANA: IBM852
Alias(es): cp852, 852, csPCp852[1]
Language(s): Gaj's Latin alphabet (Bosnian, Croatian, Serbian), Slovene, Czech, Slovak, Polish, Romanian, Hungarian
Classification: OEM code page, extended ASCII
Based on: OEM 850 (DOS-Latin 1), OEM 437 (OEM-US)
Transforms / Encodes: ISO/IEC 8859-2 (reordered)
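
As a rough illustration of the names listed above, Python's standard `codecs` module registers a `cp852` codec along with several of these aliases. The sketch below assumes that codec faithfully reflects the code page; the sample sentence and the variable names are chosen only for illustration.

```python
import codecs

# Names taken from the list above; Python normalises case before lookup.
ALIASES = ["IBM852", "cp852", "852", "csPCp852"]

# Each alias should resolve to the same underlying codec.
resolved = {alias: codecs.lookup(alias).name for alias in ALIASES}

# A Polish pangram (illustrative sample text, not from the article).
text = "Zażółć gęślą jaźń"
raw = text.encode("cp852")        # one byte per character
round_trip = raw.decode("cp852")  # back to the original string

print(resolved)
print(raw.hex(" "))
print(round_trip == text)
```

Because code page 852 is a single-byte encoding, the encoded form has exactly as many bytes as the text has characters.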

CCSID 9044 is the euro currency update of code page/CCSID 852.[6] In that update, byte AA encodes € instead of ¬.[7][8]
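
The single-byte difference can be sketched in Python. Python's standard library has a `cp852` codec but, as far as this sketch assumes, no codec for CCSID 9044, so `decode_ccsid_9044` below is a hypothetical helper that layers the euro substitution on top of the plain code page 852 decoding.

```python
def decode_ccsid_9044(data: bytes) -> str:
    """Hypothetical helper: decode as CP852, then apply the euro update."""
    # In CP852 only byte 0xAA maps to U+00AC (¬), so swapping the
    # character after decoding affects exactly that one byte value.
    return data.decode("cp852").replace("\u00ac", "\u20ac")

sample = b"100 \xaa"
classic = sample.decode("cp852")     # reading under plain code page 852
updated = decode_ccsid_9044(sample)  # reading under the euro update

print(classic)
print(updated)
```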

Note that code page 852 (DOS Latin 2) is very different from ISO/IEC 8859-2 (ISO Latin-2), although both are informally referred to as "Latin-2" in different language regions.[9] However, code page 852 includes all printable characters from ISO 8859-2, in a different arrangement that preserves a subset of the box-drawing characters of the original DOS code page 437 while sacrificing others (those combining single and double lines) in order to include more letters with diacritics. Code page 850, the equivalent for ISO 8859-1, takes the same approach.
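
Both claims in this paragraph can be checked with a short script. This is a minimal sketch that assumes Python's built-in `iso8859_2`, `cp437` and `cp852` codecs match the respective tables; the helper name `box_drawing` is made up for the example.

```python
# Printable half of ISO 8859-2 (0xA0-0xFF) and the high half of CP852.
iso_chars = set(bytes(range(0xA0, 0x100)).decode("iso8859_2"))
cp852_chars = set(bytes(range(0x80, 0x100)).decode("cp852"))

# ISO 8859-2 characters that CP852 cannot represent (expected to be empty).
missing = iso_chars - cp852_chars

def box_drawing(codec: str) -> set:
    """Characters of a code page that lie in the Unicode Box Drawing block."""
    high = bytes(range(0x80, 0x100)).decode(codec)
    return {ch for ch in high if 0x2500 <= ord(ch) <= 0x257F}

# Box-drawing characters of CP437 that were given up in CP852.
dropped = box_drawing("cp437") - box_drawing("cp852")

print(sorted(missing))
print("".join(sorted(dropped)))
```

The dropped set consists of the mixed single/double line pieces such as ╒ and ╖, while pure single-line and pure double-line pieces such as ┌ and ╔ survive.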

This reduced box-drawing support caused display glitches in DOS applications that used the box-drawing characters to display a GUI-like interface in text mode (e.g. Norton Commander). Several local, more language-specific encodings were invented to avoid the problem, for example the Kamenický encoding for Czech and Slovak[10] or the Mazovia encoding for Polish.
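
The kind of glitch described here can be reproduced by decoding the same bytes under both code pages. The byte string below is an invented example of what a program written for code page 437 might emit for the top edge of a frame; it is not taken from any particular application.

```python
# Top edge of a frame: mixed single/double corners around a double line.
frame_top = b"\xd5\xcd\xcd\xcd\xb8"

as_cp437 = frame_top.decode("cp437")  # what the program intended
as_cp852 = frame_top.decode("cp852")  # what a CP852 display shows

print(as_cp437)
print(as_cp852)
```

Under code page 852 the double horizontal line survives, but the two mixed corners are rendered as the letters Ň and Ş.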

Character set

The following table shows code page 852.[2][11] Each character is shown with its equivalent Unicode code point. Only the second half of the table (128–255) is shown, the first half (0–127) being the same as code page 437.

Code page 852[4][7][8][12]
      _0    _1    _2    _3    _4    _5    _6    _7    _8    _9    _A    _B    _C    _D    _E    _F
8_    Ç     ü     é     â     ä     ů     ć     ç     ł     ë     Ő     ő     î     Ź     Ä     Ć
128   00C7  00FC  00E9  00E2  00E4  016F  0107  00E7  0142  00EB  0150  0151  00EE  0179  00C4  0106

9_    É     Ĺ     ĺ     ô     ö     Ľ     ľ     Ś     ś     Ö     Ü     Ť     ť     Ł     ×     č
144   00C9  0139  013A  00F4  00F6  013D  013E  015A  015B  00D6  00DC  0164  0165  0141  00D7  010D

A_    á     í     ó     ú     Ą     ą     Ž     ž     Ę     ę     ¬     ź     Č     ş     «     »
160   00E1  00ED  00F3  00FA  0104  0105  017D  017E  0118  0119  00AC  017A  010C  015F  00AB  00BB

B_    ░     ▒     ▓     │     ┤     Á     Â     Ě     Ş     ╣     ║     ╗     ╝     Ż     ż     ┐
176   2591  2592  2593  2502  2524  00C1  00C2  011A  015E  2563  2551  2557  255D  017B  017C  2510

C_    └     ┴     ┬     ├     ─     ┼     Ă     ă     ╚     ╔     ╩     ╦     ╠     ═     ╬     ¤
192   2514  2534  252C  251C  2500  253C  0102  0103  255A  2554  2569  2566  2560  2550  256C  00A4

D_    đ     Đ     Ď     Ë     ď     Ň     Í     Î     ě     ┘     ┌     █     ▄     Ţ     Ů     ▀
208   0111  0110  010E  00CB  010F  0147  00CD  00CE  011B  2518  250C  2588  2584  0162  016E  2580

E_    Ó     ß     Ô     Ń     ń     ň     Š     š     Ŕ     Ú     ŕ     Ű     ý     Ý     ţ     ´
224   00D3  00DF  00D4  0143  0144  0148  0160  0161  0154  00DA  0155  0170  00FD  00DD  0163  00B4

F_    SHY   ˝     ˛     ˇ     ˘     §     ÷     ¸     °     ¨     ˙     ű     Ř     ř     ■     NBSP
240   00AD  02DD  02DB  02C7  02D8  00A7  00F7  00B8  00B0  00A8  02D9  0171  0158  0159  25A0  00A0

Points different from both code page 437 and code page 850 are shaded, while differences from code page 437 which match code page 850 are shown boxed.
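
The two groups of cells described above can be recomputed by comparing the three code pages position by position. This is a minimal sketch assuming Python's built-in `cp852`, `cp437` and `cp850` codecs agree with the tables; the list names are invented for the example.

```python
def high_half(codec: str) -> str:
    """Decode bytes 0x80-0xFF with the given single-byte codec."""
    return bytes(range(0x80, 0x100)).decode(codec)

cp852_hi = high_half("cp852")
cp437_hi = high_half("cp437")
cp850_hi = high_half("cp850")

differs_from_both = []  # differs from CP437 and from CP850 ("shaded")
matches_850_only = []   # differs from CP437 but matches CP850 ("boxed")

for offset, ch in enumerate(cp852_hi):
    byte = 0x80 + offset
    if ch == cp437_hi[offset]:
        continue  # unchanged from CP437
    if ch == cp850_hi[offset]:
        matches_850_only.append(byte)
    else:
        differs_from_both.append(byte)

# The lower half (0-127) decodes identically under CP852 and CP437.
same_low_half = bytes(range(128)).decode("cp852") == bytes(range(128)).decode("cp437")

print([f"{b:02X}" for b in differs_from_both])
print([f"{b:02X}" for b in matches_850_only])
```

For instance, byte 85 (ů) lands in the first list because both other code pages have à there, while byte B5 (Á) lands in the second because code page 850 also places Á at that position.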

References

  1. Character Sets, Internet Assigned Numbers Authority (IANA), 2018-12-12
  2. "OEM 852". Go Global Developer Center. Microsoft. Retrieved 11 Nov 2011.
  3. "Code Pages Supported by Windows: OEM Code Pages". Go Global Developer Center. Microsoft. Archived from the original on 2 November 2011. Retrieved 11 Oct 2011.
  4. "Code Page 852 DOS Latin 2". Developing International Software. Microsoft. Retrieved 11 Nov 2011.
  5. "CCSID 852 information document". Archived from the original on 2016-03-27.
  6. "CCSID 9044 information document". Archived from the original on 2016-03-27.
  7. Code Page CPGID 00852 (pdf) (PDF), IBM
  8. Code Page CPGID 00852 (txt), IBM
  9. The Czech and Slovak Character Encoding Mess Explained / PC Latin 2
  10. The Czech and Slovak Character Encoding Mess Explained / Kamenicky
  11. "cp852_DOSLatin2 to Unicode table" (TXT). The Unicode Consortium. Retrieved 11 Nov 2011.
  12. International Components for Unicode (ICU), ibm-852_P100-1995.ucm, 2002-12-03
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.