Unicode is a standard for encoding plain text. Thus, any symbol used in mathematical texts is a candidate for encoding as a Unicode character, and a very large number of such characters have been encoded. The process is ongoing, and new characters will be added if they have actually been taken into use.
Superscripting and subscripting are, as such, not plain text but “rich text”, just as italics, bolding, specific fonts, colors, backgrounds, borders, and animated letters are. A superscript “2” is still the character “2”, just in a raised position and typically in a smaller size. From this perspective, we could say that superscripts and subscripts need not be encoded at all. Normal characters can be used, together with devices beyond the plain text level (“higher-level protocols”), such as commands in a word processor, style settings, or HTML or MathML markup.
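The Unicode character data itself reflects this view: the encoded superscript “2” carries a compatibility mapping back to the ordinary digit. A minimal sketch using Python's standard `unicodedata` module (the variable names are only illustrative):

```python
import unicodedata

sup_two = "\u00b2"  # the encoded character SUPERSCRIPT TWO

# The character's official name and its compatibility decomposition.
print(unicodedata.name(sup_two))           # SUPERSCRIPT TWO
print(unicodedata.decomposition(sup_two))  # <super> 0032

# Compatibility normalization (NFKC) discards the "raised" styling
# and leaves the plain digit.
plain = unicodedata.normalize("NFKC", "x" + sup_two)
print(plain)                               # x2
```

The `<super>` tag in the decomposition says, in effect, “this is U+0032 with superscript formatting”. Note that normalizing this way loses the distinction between x² and x2, which is exactly why the styling normally belongs to a higher-level protocol.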
So the question is really why superscripts and subscripts have been included at all in Unicode, rather than why they do not constitute a uniform set. One reason is that other character codes have superscript and subscript characters, so Unicode has to include them in order to convert text to and from those codes without losing information. Another reason is given in the note Unicode in XML and other Markup Languages: “Super and subscripted letters and digits are quite common in some forms of phonetic or phonemic transcriptions, where the use of styles is both awkward and prone to data integrity issues when exported to plain text. For super or subscripted letters in phonetic transcription in particular, a change from superscript or subscript to regular style would alter the meaning. Note that such use in transcription is not limited to letters: superscripted small digits are often used to indicate tone. When used for these purposes, these characters should be retained and markup should not be used.”
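The compatibility reason can be illustrated with ISO 8859-1 (Latin-1), an older character code that already contained superscript one, two, and three. A minimal sketch, with illustrative variable names, of the round trip that Unicode needs to support:

```python
import unicodedata

# Bytes 0xB9, 0xB2, 0xB3 are the superscript digits 1, 2, 3 in ISO 8859-1.
legacy_bytes = b"\xb9\xb2\xb3"

# Decoding works only because Unicode has a character for each of them.
text = legacy_bytes.decode("latin-1")
for ch in text:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Encoding back gives the original bytes, so nothing was lost.
round_tripped = text.encode("latin-1")
print(round_tripped == legacy_bytes)  # True
```

If Unicode had mapped these bytes to the plain digits 1, 2, 3 instead, the conversion back to Latin-1 could not tell them apart from ordinary digits.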
However, adding superscript and subscript versions of every character would mean adding about 200,000 characters. Next, someone would want to have italic and bold versions of every character, and so on, and we would run out of encoding space. Before that, typographers would have nervous breakdowns: they really don’t want to design glyphs for such characters (most of which would never be used).
This is why the cited document adds: “When used in mathematical context (MathML) it is recommended to consistently use style markup for superscripts and subscripts. This is because mathematical layout allows not just individual symbols, but entire expressions to be superscripted or subscripted in a regular, nested manner.”
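To see what “regular, nested” means in practice, consider e raised to the power x², where the superscript is itself an expression containing a superscript. A rough sketch that builds the MathML for it; `msup`, `mi`, and `mn` are real MathML elements, but the Python helper functions below are hypothetical and exist only for this illustration:

```python
def mi(name):
    """MathML identifier element, e.g. a variable name."""
    return f"<mi>{name}</mi>"

def mn(value):
    """MathML number element."""
    return f"<mn>{value}</mn>"

def msup(base, exponent):
    """MathML superscript: first child is the base, second the exponent."""
    return f"<msup>{base}{exponent}</msup>"

# e^(x^2): the exponent is itself a superscripted expression.
nested = msup(mi("e"), msup(mi("x"), mn(2)))
print(nested)
# <msup><mi>e</mi><msup><mi>x</mi><mn>2</mn></msup></msup>
```

Markup nests to any depth using the same construct, whereas a superscript character cannot be superscripted again: plain text has no way to write “superscript of a superscript”.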
Thanks, that pretty much answers the question. Basically, while Unicode is a nice workaround for including better-looking math in emails, using it for super/subscripts is stretching the intent. – kdb – 2014-05-08T21:10:01.010