Impossible to put a zero after an aleph?

143

28

A friend and I were joking about alephs. When we tried to type א0 (with those two characters swapped), they swapped themselves! No sequence of symbols placed between them stops this effect. Why is this?

Try to type these with the 0 and א reversed (c&p for א):

א0

א - 0

א \\\ 0

א -./ 0

Words however separate them

א foobar 0

I'm on Arch Linux and have not tested this on any other OS yet.

EDIT: Number does not have to be zero. It works with numbers, but not letters.

Gman Smith

Posted 2017-07-31T06:18:27.487

Reputation: 1 681

14At first glance I thought you were crazy. Turns out it's simply an artefact of how different language directions are used. Great question! – None – 2017-07-31T13:09:09.337

1It strongly depends on the software you are using. Programs that put the 0 to the left of the א switch to right-to-left; those that put the 0 to the right of the א either don't support RTL or don't switch automatically based on the characters entered. – simlev – 2017-07-31T14:34:54.743

6For writing Hebrew text, this order makes sense. Otherwise it would be most annoying to type stuff like ב-5 דקות (in 5 minutes). – ugoren – 2017-07-31T14:50:08.320

1@ugoren: But I dare say that most of us who are using aleph are writing math, not Hebrew. Maybe there should be a math aleph (and other Hebrew chars that might be used in equations), and a Hebrew language one? – jamesqf – 2017-07-31T16:23:15.333

7@jamesqf: And there is one, see IllidanS4's post. – user1686 – 2017-07-31T18:32:39.877

28@jamesqf, Hebrew letters exist in Unicode for writing Hebrew. And I dare say there are more of us who write Hebrew (7 million or so) than those who write about set cardinalities. – ugoren – 2017-07-31T20:30:14.230

@cmbuckley cough Vsauce cough – Jack – 2017-07-31T21:01:36.177

27

@ugoren by some counts there are 130k mathematicians, but in truth, mathematics is the universal language, so there's really ℵ₀.

– Nick T – 2017-07-31T22:34:02.370

Answers

109

'א', 'HEBREW LETTER ALEF' (U+05D0) has the BIDI (bi-directional) class "Right-to-Left [R]", because Hebrew is traditionally written right-to-left. Digits, on the other hand, have no strong directionality of their own, and so the whole chunk of aleph and zero is interpreted as being right-to-left. As a result, the following character is not necessarily placed to the right of the preceding character; Unicode's rather complex bi-directional rules decide where it goes.
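The two classes can be checked with Python's standard `unicodedata` module. This is only a small inspection sketch; the helper name `describe` is made up for illustration and is not part of any library.

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name, bidi class) for a single character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.bidirectional(ch))

# The Hebrew letter from the question and the ordinary ASCII digit zero.
for ch in ("\u05D0", "0"):
    print(describe(ch))
# ('U+05D0', 'HEBREW LETTER ALEF', 'R')
# ('U+0030', 'DIGIT ZERO', 'EN')
```

`R` is the strong right-to-left class; `EN` ("European Number") is one of the weak classes, which is why the digit follows the letter's direction.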

You have several options to work around this issue.

  1. You can use 'ℵ', 'ALEF SYMBOL' (U+2135). It's a symbol and has the left-to-right property: ℵ0.

  2. Instead of the usual digit 0, you can use a zero-like character with left-to-right directionality, such as '〇', 'IDEOGRAPHIC NUMBER ZERO' (U+3007).

  3. The cleanest way is to use the 'LEFT-TO-RIGHT MARK' (U+200E) character (Wikipedia) after the aleph: "א‎0". This is an invisible zero-width character that is defined to have left-to-right directionality. Thus, it has the same effect on the bidirectional text layout algorithm as inserting, say, a left-to-right Latin letter after the א, except that no visible letter will appear there.
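As a rough sketch, the three workarounds can be built from explicit escapes in Python, so that no invisible character gets lost when copying and pasting. The variable names are invented for this example; only the code points come from the list above.

```python
import unicodedata

ALEF_LETTER = "\u05D0"       # HEBREW LETTER ALEF, strong right-to-left
ALEF_SYMBOL = "\u2135"       # ALEF SYMBOL, the mathematical character
IDEOGRAPHIC_ZERO = "\u3007"  # IDEOGRAPHIC NUMBER ZERO
LRM = "\u200E"               # LEFT-TO-RIGHT MARK, invisible and zero-width

workarounds = {
    "1. alef symbol + 0": ALEF_SYMBOL + "0",
    "2. alef letter + ideographic zero": ALEF_LETTER + IDEOGRAPHIC_ZERO,
    "3. alef letter + LRM + 0": ALEF_LETTER + LRM + "0",
}

# Print each string together with the bidi class of every character in it.
for label, text in workarounds.items():
    classes = [unicodedata.bidirectional(ch) for ch in text]
    print(label, repr(text), classes)
```

In each variant, the character that follows the aleph (or the aleph itself, in option 1) has the strong class `L`, so the zero is no longer pulled into a right-to-left run.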

IllidanS4 wants Monica back

Posted 2017-07-31T06:18:27.487

Reputation: 1 240

69In a mathematical context (which I expect this is), U+2135 is the correct character to use. – cmbuckley – 2017-07-31T13:46:13.633

10You have to be careful with overrides: wherever you place them in text, it is important to terminate them (using the "pop directional formatting" character U+202C) when the context you wish them to operate on ends. – J... – 2017-07-31T14:01:23.150

4Also, the "override" characters are kind of overkill, "embedding" is sufficient for this use case. There's also a new class called "isolate", not sure what the difference is in this situation. – Random832 – 2017-07-31T14:48:49.013

4I'd recommend swapping 2 and 3. – wizzwizz4 – 2017-07-31T15:19:08.953

10

@Random832 All of those are overkill. All you really need is a left-to-right mark (U+200E) between the alef and the zero. That way you don't need any extra "pop" characters, either.

– Ilmari Karonen – 2017-07-31T17:31:30.177

@IlmariKaronen that should be an answer (or an edit to this one). – ypercubeᵀᴹ – 2017-08-01T10:48:20.057

Thanks for providing the edit. U+200E is really the best way. – IllidanS4 wants Monica back – 2017-08-01T17:13:14.683

1@IlmariKaronen the alef symbol isn't overkill... it even saves the most bytes of all the options. – NH. – 2017-08-03T17:47:58.990

2@NH. Agreed, the math alef symbol is certainly the best choice if you're writing math instead of Hebrew text. By "all of those", I was referring to the Unicode bidi embed / override / isolate characters mentioned in the comment I was replying to. All of those are overkill where a simple ‎ will do. – Ilmari Karonen – 2017-08-03T19:40:05.243

196

Aleph (U+05D0) is a Hebrew letter, and Hebrew is written right-to-left, so Unicode assigns it the "Right-to-Left" bidirectional class. (See Unicode TR9: Bidirectional Algorithm for more details.)

Latin letters are of course "Left-to-Right". However, zero (U+0030) is in the "European Number" bidirectional class, which is a weak class – while LtR by default, it can switch to RtL if there's a "strong" Right-to-Left character before it. (See Bidirectional Character Types and Resolving Weak Types in TR9.)

As a result, the directions of before and after are swapped for the entire word – if you put the zero 'before', it will show up to the right; if you write the zero 'after' aleph, it will show up on the left.
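This also suggests why the symbols tried in the question did not help while a word did. The sketch below is not an implementation of the bidirectional algorithm; it only looks up the bidi class of each separator character and checks whether any of them is a strong left-to-right character. The function names are hypothetical.

```python
import unicodedata

def bidi_classes(text):
    """Bidi class of every character in text, in memory order."""
    return [unicodedata.bidirectional(ch) for ch in text]

def has_strong_ltr(text):
    """True if text contains at least one strong left-to-right (class L) character."""
    return "L" in bidi_classes(text)

# The separators tried in the question, placed between the aleph and the digit.
for sep in ("", " - ", r" \\\ ", " -./ ", " foobar "):
    print(repr(sep), bidi_classes(sep), has_strong_ltr(sep))
```

Spaces, hyphens, slashes, dots and backslashes are all weak or neutral (`WS`, `ES`, `CS`, `ON`), so nothing interrupts the right-to-left run started by the aleph. Only " foobar " contains class `L` letters, which is consistent with the observation that words separate the two characters.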

user1686

Posted 2017-07-31T06:18:27.487

Reputation: 283 655

14This is an extremely common problem in a number of text editors and websites when typing in Hebrew - I imagine it's true of other right-to-left languages as well. It's certainly gotten better over time, but imagine trying to write a word problem - switching back and forth between Hebrew words (like aleph character) and numbers (like the 0 character) repeatedly... – Jake – 2017-07-31T18:09:04.573

I'm curious, what's the standard when you have, for instance, a textbook written (mostly) in English for learning Hebrew? do you just make sure each language switch occurs on a new line, or do you mix RtL and LtR, or what? – Tin Man – 2017-07-31T20:45:19.607

3

@Walt Most textbooks I've seen are the "immersion" type, which use extremely simple Hebrew but pretty much entirely Hebrew. It may seem counter-intuitive to use a language to teach the language, but it allows for a more organic buildup of language skills. You might see a transliteration or translation inline (something like http://lh4.ggpht.com/-_Vc8TUDwznQ/UlhaLFjnrGI/AAAAAAAAzQk/_zm4BMC0aLw/talam_thumb.jpg?imgmax=800 - "Shalom Kita Aleph" = "Hello First Grade")

– Jake – 2017-07-31T20:50:21.463

1@Jake Ah, that makes sense. The only foreign language I really took was Latin; our textbooks tended to be mostly English with a single chunk of Latin text to decipher each chapter, right up until the whole class format switched from "learn Latin" to "translate this whole epic poem, a bit at a time, through the course of the school year". – Tin Man – 2017-07-31T20:55:02.093

5

@Walt: I think there may be a misunderstanding. If I type a Latin (LTR) word, then a Hebrew (RTL) word, then another Latin word, I can freely have them all in a sentence, and only the Hebrew word renders RTL. It's all designed to easily fit in the same sentence. The problem is that the number 0 is used by both LTR and RTL languages, and so the software just makes it the same direction as the previous letter. If it follows LTR characters, it's LTR. If it follows RTL letters, it's RTL. There are also overrides to swap it. http://www.fileformat.info/info/unicode/char/202d/index.htm

– Mooing Duck – 2017-07-31T23:19:08.207

@MooingDuck I get that from a technical perspective. I was curious about the usability side, I guess; if you have a Hebrew word (RTL) in an English sentence (LTR), wouldn't that be hard to read? Maybe a bad example, since words are often parsed in a single glance rather than following from letter to letter, but what if you had a full sentence (but not quite enough to justify a blockquote) of Hebrew quoted in a paragraph of English? – Tin Man – 2017-07-31T23:23:56.577

@Walt: The characters will display as you typed them. If you type them in the same paragraph, they'll flow as I described, and if you put them in separate paragraphs, they'll be in separate paragraphs. It's really the only sane thing to do. – Mooing Duck – 2017-07-31T23:43:40.907

4The zero isn't becoming RTL - it's still LTR, and a sequence of digits will show up left-to-right even with Hebrew around it, but the embedding levels interact in such a way that the zero shows up on the left of the Hebrew character preceding it in memory order. (Unicode bidirectionality is complicated.) – user2357112 supports Monica – 2017-08-01T07:48:50.777

@Walt I don't know too much about this, but I took a couple of years of Hebrew classes and my two conclusions were: 1) yeah, it's mixed and you get used to that super quickly, 2) typesetting is a pain when you combine the two, and the author of the 'book' had made a fair number of mistakes (it was a university class and the 'book' was written by the lecturer) – David Mulder – 2017-08-07T12:55:50.923

20

Perhaps, a better way to achieve this would be to:

echo -e "\u200F0א"

And the mandatory xkcd reference https://xkcd.com/1137/

‮LTR

wvxvw

Posted 2017-07-31T06:18:27.487

Reputation: 803

14

It's perfectly possible to have a zero in front as shown in the following example which was made in Notepad++.

[Image: Alef with 0]

What you're seeing, which also becomes apparent if you try to select the characters in your question, is that Hebrew is written right to left and, because the 0 is directly attached to it, the text is handled right to left instead of left to right.

See the second example for the trouble Firefox (on my end) has with a clear selection.

[Image: Firefox selecting a right to left text]

Seth

Posted 2017-07-31T06:18:27.487

Reputation: 7 657

17This is terrible advice, because it plays games with the actual character ordering in order to get a particular visual ordering. The other answers explain why this occurs and some include the right way to deal with it (the override and explicit direction marks). – Dranon – 2017-07-31T14:01:08.773

8Could you point out where I'm including some form of advice? I'm merely showing an example of what happens, showing that it is indeed possible to have a suffix numeral, and providing information about why it happens the way it does. – Seth – 2017-08-01T05:51:16.257

13

Hebrew is written right to left, so the aleph character carries the information that the next character should be printed to the left of it.

If you hex-check your document (or move the cursor through your text with the arrow keys in a suitable editor), you will notice that you get to the aleph first, then to the digit.

I.e.: The assumption "next character == character to the right" does not hold.
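A quick way to do that hex check without a hex editor is sketched below in Python. The string is built from escapes so that the memory order is unambiguous regardless of how an editor displays it.

```python
# Aleph typed first, then the digit: this is the order stored in memory.
text = "\u05D0" + "0"

code_points = [f"U+{ord(ch):04X}" for ch in text]
utf8_hex = text.encode("utf-8").hex(" ")

print(code_points)  # ['U+05D0', 'U+0030']
print(utf8_hex)     # d7 90 30
```

The aleph's two UTF-8 bytes come before the digit's single byte, even though a bidi-aware renderer draws the digit on the left.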

Eugen Rieck

Posted 2017-07-31T06:18:27.487

Reputation: 15 128

3

א0 0א 0-א א-0

The issue is where you do this, and the implementation. To get Hebrew-number behavior all the characters must be in right-to-left directionality. In HTML/CSS that is:

<p style="direction:rtl"> א0 0א 0-א א-0 </p>

In the Operating System, Hebrew and bi-directionality must be enabled.

The workarounds that suggest substituting other characters defeat the purpose of Unicode. The aleph as a mathematical operator may look the same in some character sets, but it is an entirely different character from the Hebrew aleph, both in context and in how it will be parsed. For example, a native Hebrew speaker or a computer will not process it correctly if it is used as part of a Hebrew word. Numbers and non-alphabetic characters are a problem when they are not given the same directional encoding as the alphabetic characters. Thus, ironically, numbers, which seemingly should be independent of character set and directionality, take on the Unicode directionality of the preceding letter. In a Hebrew document the numbers therefore become 'Hebraicised', i.e. directionally like Hebrew, whereas in an English (Latin) document the Hebrew letters can be mixed up because no right-to-left directionality is attributed to the paragraph.

Danny F

Posted 2017-07-31T06:18:27.487

Reputation: 151

The OP was trying to use aleph as the numeric operator, no? – user1686 – 2017-08-07T07:34:30.597

Well, it was not clear in the post at all. In any case, directionality shouldn't be relevant. Aleph is used in set notation and has an infinite, infinite series designation. It should be directionally left to right, since all maths is left to right regardless of what language you're using.

However using aleph as a character in Hebrew is directionally set at right to left. – Danny F – 2017-08-08T14:11:10.347

2

It's possible:

‭א0

‭א - 0

‭א \\ 0

‭א -./ 0

‭א foobar 0

(This answer didn't answer "why is this", as it is already answered by others. But it does answer the question in the title, "impossible to...?")
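As for how: the lines above appear to begin with an invisible LEFT-TO-RIGHT OVERRIDE (U+202D). That is a reading of the text, not something the answer states. Under that assumption, a minimal Python sketch of the same trick follows; the helper name `force_ltr` is made up, and the closing U+202C is the "pop directional formatting" character recommended in the comments on another answer.

```python
import unicodedata

LRO = "\u202D"  # LEFT-TO-RIGHT OVERRIDE: forces left-to-right display
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING: ends the override

def force_ltr(text):
    """Wrap text so that a bidi-aware renderer lays it out strictly left to right."""
    return LRO + text + PDF

sample = force_ltr("\u05D0" + "0")
print([unicodedata.name(ch) for ch in sample])
# ['LEFT-TO-RIGHT OVERRIDE', 'HEBREW LETTER ALEF', 'DIGIT ZERO', 'POP DIRECTIONAL FORMATTING']
```

An override is heavier than the left-to-right mark suggested elsewhere in this thread, because it changes the displayed direction of every character it encloses, including Hebrew words.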

user23013

Posted 2017-07-31T06:18:27.487

Reputation: 167

7But it also doesn't answer HOW, and so is almost useless. – NH. – 2017-08-03T17:43:06.240