Unicode and ASCII backward compatibility

0

If the first 128 characters are the same, why are we still using ASCII, and is there a backward-compatibility issue when using Unicode instead of ASCII?

Stribor

Posted 2015-11-22T23:11:18.080

Reputation: 33

1

The accepted answer for ANSI to UTF-8 in Notepad++ would be useful reading.

– Thomas Dickey – 2015-11-23T02:00:24.613

Answers

-1

ASCII, later called ANSI, has a 1:1 relation between byte and character. Multibyte character systems, including Unicode, have the advantage of representing additional characters at the expense of requiring additional storage. In addition, there are many implementations of multibyte character systems; in some, the byte order is specified by a byte order mark (BOM). Interpreting the same byte string as UTF-8, UTF-16, or UTF-32 produces different values. Further, there are different ISO standards for differing alphabets, such as the Scandinavian one with the letter Å (A with a ring above), as in "Åland Islands".
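To make the point about interpretation concrete, here is a minimal Python sketch. The four sample bytes are an assumption chosen for illustration, not something from the question; the codec names and BOM constants come from Python's standard `codecs` module.

```python
import codecs

# Illustrative bytes (an assumption for this sketch, not from the answer).
data = b"\x41\x00\x42\x00"

as_utf8 = data.decode("utf-8")          # four code points: 'A', NUL, 'B', NUL
as_utf16le = data.decode("utf-16-le")   # two code points: 'A', 'B'
as_utf16be = data.decode("utf-16-be")   # two code points: U+4100, U+4200

# With a BOM in front, the generic "utf-16" codec picks the byte order itself
# and drops the BOM from the decoded result.
with_bom = (codecs.BOM_UTF16_LE + data).decode("utf-16")

# The letter Å (U+00C5): one byte in ISO 8859-1, two bytes in UTF-8.
a_ring_latin1 = "\u00c5".encode("iso-8859-1")
a_ring_utf8 = "\u00c5".encode("utf-8")

print(repr(as_utf8), repr(as_utf16le), repr(as_utf16be), repr(with_bom))
print(a_ring_latin1.hex(), a_ring_utf8.hex())
```

The same four bytes come out as three different strings depending on the codec, which is why the encoding has to be known (or signalled, for example by a BOM) before the bytes can be read.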

So for simple database purposes, or where storage is very limited, for example, ANSI has space advantages and is not subject to misinterpretation. If one needs to display the full character sets of many alphabets, though, multibyte sets are useful.
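The storage trade-off can be checked directly. The following is only a sketch: `encoded_sizes` is a hypothetical helper and the sample strings are assumptions, but the codecs are the standard Python ones. It also shows that pure-ASCII text produces the same bytes in ASCII and in UTF-8, so the extra storage applies to UTF-16 and UTF-32, or to non-ASCII characters.

```python
def encoded_sizes(text, encodings=("ascii", "utf-8", "utf-16-le", "utf-32-le")):
    """Return the encoded length in bytes per encoding (hypothetical helper).

    An encoding that cannot represent the text maps to None.
    """
    sizes = {}
    for enc in encodings:
        try:
            sizes[enc] = len(text.encode(enc))
        except UnicodeEncodeError:
            sizes[enc] = None
    return sizes


plain = encoded_sizes("Aland")         # ASCII-only sample
accented = encoded_sizes("\u00c5land")  # "Åland", not representable in ASCII

# For ASCII-only text, the ASCII and UTF-8 byte sequences are identical.
same_bytes = "Aland".encode("ascii") == "Aland".encode("utf-8")

print(plain)
print(accented)
print(same_bytes)
```

The `-le` codec variants are used so that no BOM is added to the counts; the plain `utf-16` and `utf-32` codecs would prepend one when encoding.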

DrMoishe Pippik

Posted 2015-11-22T23:11:18.080

Reputation: 13 291

I understand that storage differs between ASCII and Unicode, but say the character "a" will have – Stribor – 2015-11-23T00:13:57.657

The same encoding, only with different padding? Is that accurate? – Stribor – 2015-11-23T00:14:29.483

A in ANSI is byte(65), dec, or 41, hex. A in UTF-16 is 0041, hex, or 4100 with the byte order reversed (which a BOM would signal). Not only is there padding, but it may be left or right padding. – DrMoishe Pippik – 2015-11-23T00:30:12.900
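For reference, the byte values for "A" can be printed with Python's standard codecs. This is a small sketch, and the variable names are only illustrative. Note that the two-byte forms 0041 and 4100 belong to UTF-16 (big-endian and little-endian respectively); in UTF-8, "A" is the single byte 41, identical to ASCII.

```python
letter = "A"

ascii_bytes = letter.encode("ascii")      # single byte 0x41
utf8_bytes = letter.encode("utf-8")       # also the single byte 0x41
utf16_be = letter.encode("utf-16-be")     # 0x00 0x41
utf16_le = letter.encode("utf-16-le")     # 0x41 0x00

for name, raw in [("ascii", ascii_bytes), ("utf-8", utf8_bytes),
                  ("utf-16-be", utf16_be), ("utf-16-le", utf16_le)]:
    # bytes.hex() renders the raw bytes as hexadecimal digits
    print(name, raw.hex())
```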

1

ISO/IEC 8859-x are single-byte character sets. I've seen no knowledgeable source referring to US-ASCII as "ANSI".

– Thomas Dickey – 2015-11-23T01:20:53.530

Sorry, "ANSI" was just plain wrong. That was my typo :( – Stribor – 2015-11-23T02:02:06.173

1

@ThomasDickey In the Windows world, "ANSI" was used to denote the 8-bit default GUI codepage. Whether appropriate or (rather) not, it's still widely used. See for example https://msdn.microsoft.com/en-us/library/windows/desktop/dd317752.aspx: "Windows code pages, commonly called 'ANSI code pages'".

– dxiv – 2015-11-23T02:05:07.973

2

And in correct contexts (other than Windows) ANSI means the organization http://www.ansi.org which has developed or adopted standards for many thousands of things other than ASCII, from magtapes to encryption to photographic film to machine tools to eyeguards and workboots.

– dave_thompson_085 – 2015-11-23T04:59:48.043