Why is ^M used to represent a carriage return in VIM and other contexts?
My guess is that M is the 13th letter of the Latin alphabet and a carriage return is \x0D, or decimal 13. Is this the reason? Is this representation documented anywhere?
I notice that Tab is represented by ^I, and I is the ninth letter of the Latin alphabet. Likewise, Tab is \x09, or decimal 9, which supports my theory stated above. However, where might this be documented as fact?
Also keep in mind that DOS/Windows use "0x0d 0x0a", also written "CR LF", but Unix/Linux use only "0x0a", or "LF". So when you open a Windows document in Linux it detects an extra "CR", and when you open a Linux document in Windows it doesn't detect new lines. – LatinSuD – 2014-06-05T08:47:03.667
@LatinSuD caret notation (and the corresponding use of the Ctrl key) relates to the C0 control set (historically part of ASCII) directly, and not to whether and how a given operating system or program uses part of that set in representing new lines, or anything else. Similarly, whether ^H deletes a character or allows overprinting (such as n^H~ as an obsolete way to produce ñ) or any other actual use of the control character is separate from the caret notation. – Jon Hanna – 2014-06-05T12:05:28.250
old one ... I can't remember the original code, but ctrl-G rings a bell! – Brian Drummond – 2014-06-05T13:28:57.913
The ^M you see in Linux (which uses "0x0a" (LF)) is probably from a file made on Windows (which uses "0x0d 0x0a" (CR LF)). Thus, at the end of each line, you see the extra "0x0d" (CR). The 0x0a is interpreted as a newline and not shown in vi (well, it is: the next line will have a "~" if the previous line didn't end with a newline). So the ^M is not exactly a "carriage return"; it's part of what a carriage return is in Windows. The Answer tells why it's represented that way (using caret notation: ^@ = 0x00, ^A = 0x01, etc., ^M = 0x0d, ...) – Olivier Dulac – 2014-06-05T14:47:58.107
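As a rough illustration of the CR LF versus LF point made in the comments above, here is a small Python sketch. The function names count_line_endings and lf_only_view are made up for this example, and the second function only imitates the effect of splitting on LF alone; it is not a description of how vi or Vim actually decides what to display.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CR LF pairs, bare LF bytes, and bare CR bytes in raw file data."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "LF": data.count(b"\n") - crlf,   # LF bytes not preceded by CR
        "CR": data.count(b"\r") - crlf,   # CR bytes not followed by LF
    }


def lf_only_view(data: bytes) -> list:
    """Split on LF only and render any leftover CR (0x0D) as ^M.

    Assumes ASCII input; this only imitates what an LF-only reader would see.
    """
    lines = data.decode("ascii").split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after a final LF
    return [line.replace("\r", "^M") for line in lines]
```

For example, lf_only_view(b"one\r\ntwo\r\n") gives ["one^M", "two^M"], while the same text with plain LF endings comes back without any ^M.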
@OlivierDulac no, the ^M is exactly a carriage return, just like ^J is exactly a line-feed. While different OSs have had different views as to whether line-feed and/or carriage return or something else (like the Newline character used by some IBM characters but not part of ASCII and so not part of the historical heritage of some other OSs) should represent a new line in a text file, and while some programs have then overridden that in different ways, U+000D itself is still a carriage return, whatever later operating systems like Unix or DOS decided to do with it. (Of course, calling it... – Jon Hanna – 2014-06-05T21:43:42.940
@OlivierDulac ... U+000D is proleptic, since that name came with Unicode in the 1990s, but that does quite definitely reference the code as it existed in ASCII in 1963, and through that as it existed in Murray's modified Baudot code in 1901. Murray was solving problems related to moving paper around, with the same tools used in the concept of "text file" many decades later. Hammer a screw into something like a nail, and it's still a screw. Use LF and/or CR to represent the end of a line in a text file, and they're still line-feeds and carriage returns. – Jon Hanna – 2014-06-05T21:47:39.003
@JonHanna: apologies, I mixed up carriage returns and newlines in my comment. – Olivier Dulac – 2014-06-06T07:43:11.753
Because Control-M was the ASR-33 TTY keyboard combination to get the character. (And yes, Brian, Ctrl-G does ring a bell.) – Daniel R Hicks – 2014-06-06T18:30:07.070
Has nothing to do with "letter of the alphabet", other than when the ASCII table was laid out the alpha characters were assigned sequentially, starting from 0x41. – Daniel R Hicks – 2014-06-06T23:05:06.073
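The arithmetic behind the comments above can be sketched in a few lines of Python. This assumes caret notation is simply the control code with the 0x40 bit flipped, which lands it on the ASCII column that starts at '@' (0x40) and 'A' (0x41). The function names to_caret and from_caret are hypothetical, chosen for this example only.

```python
def to_caret(code: int) -> str:
    """Return caret notation for a C0 control code (0x00-0x1F) or DEL (0x7F)."""
    if not (0 <= code <= 0x1F or code == 0x7F):
        raise ValueError(f"not a control character: {code:#04x}")
    # Flipping the 0x40 bit maps 0x00-0x1F onto '@', 'A'..'Z', '[', '\', ']',
    # '^', '_', and maps DEL (0x7F) onto '?'.
    return "^" + chr(code ^ 0x40)


def from_caret(notation: str) -> int:
    """Inverse of to_caret: turn a string such as '^M' back into its code."""
    if len(notation) != 2 or notation[0] != "^":
        raise ValueError(f"not caret notation: {notation!r}")
    code = ord(notation[1].upper()) ^ 0x40
    if not (0 <= code <= 0x1F or code == 0x7F):
        raise ValueError(f"not caret notation: {notation!r}")
    return code
```

With this, to_caret(0x0D) is "^M" because 0x0D + 0x40 = 0x4D, the code for M; to_caret(0x09) is "^I"; and to_caret(0x07), the bell, is "^G". The match with "13th letter" and "ninth letter" falls out of the letters being numbered consecutively from 0x41.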
I knew you could actually use ctrl+i as tab (I use it on connectbot on my phone in vim), but I didn't realize that ^M works the same way, and that they work basically everywhere. Cool! – Wayne Werner – 2014-06-09T19:36:57.357