No tool is adding anything. The confusion (not your fault at all) arises for a few reasons.
There are two common line endings:
- Unix-style, one character, denoted LF (or \n or 0x0a),
- Windows-style, two characters, denoted CRLF (or \r\n or 0x0d 0x0a).
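To see the two endings as raw bytes, a quick sketch like the one below may help. The helper name hexbytes is made up for this illustration; it only wraps od to print stdin as hex pairs.

```shell
# hexbytes: hypothetical helper; prints the bytes of stdin as space-separated hex pairs.
hexbytes() {
    od -An -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

printf 'a\n'   | hexbytes   # Unix-style ending:    61 0a
printf 'a\r\n' | hexbytes   # Windows-style ending: 61 0d 0a
```

The extra 0d before 0a in the second dump is the CR character.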
You download from two different URLs. It seems the server claims each file is text/plain, so they should use CRLF. The second one (the one you curl) does indeed use CRLF, but the first one (the one you wget) illegally uses a bare LF instead.
If you download only from the first URL (no matter if with wget or curl) and store the result in a hosts1 file, then file hosts1 will yield:
hosts1: UTF-8 Unicode text
(This means the line endings are LF; otherwise it would be UTF-8 Unicode text, with CRLF line terminators.)
If you download only from the second URL and store the result in a hosts2 file, then file hosts2 will yield:
hosts2: ASCII text, with CRLF line terminators
If you download both to the same file (say hosts12) in the way you do, you will get LF as line endings for lines that came from the first URL and CRLF as line endings for lines that came from the second URL.
In practice, any tool that tries to tell whether a file uses LF or CRLF examines at most a few initial lines, not all of them. Try file hosts12 and you'll get:
hosts12: UTF-8 Unicode text
exactly as it was for hosts1. The same happens when you vim hosts12: the editor detects line endings as LF based on the beginning of the file. Then you skip to the end and you see many ^M-s, which denote CR characters. vim prints them because it doesn't consider CR to be a part of a proper line ending in this case.
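If you don't want to rely on a guess made from the first few lines, you can count the endings over the whole file. This is only a sketch; count_endings is a name I made up, and it assumes a grep that supports -c.

```shell
# count_endings FILE: hypothetical helper; reports how many lines end in CRLF
# and how many end in a bare LF, looking at every line of FILE.
count_endings() {
    cr=$(printf '\r')                               # a literal CR character
    crlf=$(grep -c "${cr}\$" "$1" || true)          # lines whose last character is CR
    total=$(grep -c '' "$1" || true)                # all lines
    echo "CRLF: $crlf, LF: $((total - crlf))"
}

# Small demonstration on a file with mixed endings.
tmp=$(mktemp)
printf 'one\ntwo\nthree\r\n' > "$tmp"
count_endings "$tmp"    # CRLF: 1, LF: 2
rm -f "$tmp"
```

Running count_endings hosts12 should then show non-zero counts for both kinds, which is exactly the mix that confuses vim.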
However, when you vim hosts2, the editor correctly detects line endings as CRLF. The same CR characters that were printed as ^M earlier are now hidden from you because vim considers them to be parts of proper line endings. If you added a new line by hand, vim would use the Windows-style line ending even if you're on Unix. You may think the file is "perfectly normal" but it's not a normal Unix text file.
The confusion arises because the two files on the server use different line endings; then vim tries to be smart.
In Linux (Unix in general) you want your /etc/hosts to use LF as line endings. See the POSIX definitions of line and newline character. It's explicitly stated the character is \n:
3.243 Newline Character (<newline>)
A character that in the output stream indicates that printing should start at the beginning of the next line. It is the character designated by '\n' in the C language.
I don't think tools are obligated to support \r\n then. The simple solution is to run wget … && curl … >> … exactly as you did, then invoke dos2unix /etc/hosts.
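Where dos2unix is not installed, roughly the same conversion can be done with sed. Treat this as a sketch rather than a drop-in replacement: to_lf is a made-up name, it works as a filter instead of editing in place, and it only strips a CR that sits directly before the end of a line.

```shell
# to_lf: hypothetical helper; copies stdin to stdout with CRLF endings turned into LF.
to_lf() {
    cr=$(printf '\r')     # a literal CR, so the pattern does not depend on sed knowing \r
    sed "s/${cr}\$//"     # drop one CR at the end of each line, if present
}

printf 'one\ntwo\r\n' | to_lf | od -c   # the dump shows no \r
```

For example, to_lf < hosts12 > hosts12.lf would write a converted copy (hosts12.lf is just an illustrative name).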
If I were you, I would work with another file, say /etc/hosts.tmp. I would use wget, curl, dos2unix, chmod --reference=/etc/hosts, chown --reference=/etc/hosts. Only when the file is complete would I mv it to replace /etc/hosts. This feature of rename(2) is relevant:
If newpath already exists, it will be atomically replaced, so that there is no point at which another process attempting to access newpath will find it missing.
So any process would find either the old /etc/hosts (before mv) or the new one (after mv). Your current approach, working directly with /etc/hosts, allows scenarios where another process finds the file incomplete or with wrong line endings near its end.
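The last step of that workflow could look like the sketch below. replace_atomically is a hypothetical helper name; the --reference option is a GNU coreutils feature, and the rename is only atomic when both paths are on the same filesystem.

```shell
# replace_atomically NEW TARGET
# Hypothetical helper: give NEW the mode and owner of TARGET, then rename NEW over TARGET.
# Assumes GNU chmod/chown (for --reference) and that NEW and TARGET share a filesystem,
# so that mv boils down to a single rename(2).
replace_atomically() {
    new=$1
    target=$2
    chmod --reference="$target" "$new" &&
    chown --reference="$target" "$new" &&
    mv -f "$new" "$target"
}
```

Once wget, curl and dos2unix have finished writing /etc/hosts.tmp, the call would be replace_atomically /etc/hosts.tmp /etc/hosts. If any step fails, the && chain stops and /etc/hosts is left untouched.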
"It's supposed to be 0x0a not 0x0a0x0d" – I think what you get is
0x0d 0x0a, not the other way around. Or is it really 0x0a 0x0d? – Kamil Maciorowski – 2018-06-19T20:47:44.320
@KamilMaciorowski whichever is ^M on VIM. – AK_ – 2018-06-20T11:36:12.687