In addition to @jrb's answer: in Vim, the character encoding of a file is detected based on the fileencodings option (note the 's' at the end of fileencodings).

On Windows, for example, the default value of the fileencodings option is ucs-bom, which means: check whether a BOM exists at the beginning of the file. If a BOM exists, read the file's character encoding from the BOM. If no BOM exists (which in this case also means that every character encoding listed in the fileencodings option failed to match), read the file with the character encoding specified in the encoding option. The default for the encoding option is latin1. Now, because latin1 is a single-byte encoding, every byte in the file is a valid latin1 character, even the Nul character ^@ that you're seeing*.

*Strictly speaking, ^@ is the newline character in Vim's buffer text, not the Nul character.
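As a quick check of what Vim actually did with your file, you can query the relevant options (standard ex commands, nothing beyond stock Vim assumed):

    " the list of encodings Vim tries, in order, when reading a file
    :set fileencodings?
    " Vim's internal encoding, used as the fallback
    :set encoding?
    " the encoding Vim settled on for the current buffer
    :set fileencoding?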
The proper way to read the file is to specify the character encoding manually, as UTF-16 in this case (since UTF-16 looks like the actual encoding of the file).
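For example, something along these lines should work (a sketch: utf-16le is an assumption, and your file may be big-endian, in which case use utf-16 instead):

    " reload the current file, forcing Vim to interpret it as UTF-16 (little-endian)
    :e ++enc=utf-16le

After the reload, the ^@ characters should be gone, because Vim now decodes the two-byte UTF-16 sequences instead of treating every byte as a latin1 character.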
Just stumbled upon this question/answer through a related link: this is actually bad advice and will only work properly in very few cases. It's better to actually change the encoding rather than removing the null bytes. If you remove the null bytes, you might still have other multibyte characters that show up as garbage. – Mario – 2014-03-07T09:57:30.517
@Mario could you tell us more about the encoding change? Is it something related to jrb's answer below? – George – 2014-03-08T13:11:04.633
See rpyzh's answer further down; it shows loading the file with the proper encoding as well as saving it with a different one (although that answer could use some more explanation). jrb's last note is enough if you just want to read the file, but not if you want to save it without the null bytes using another encoding. – Mario – 2014-03-08T13:37:47.797
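To make that load-then-convert route concrete, a minimal sketch (assuming the file has already been reopened as UTF-16 as in the sketch above; utf-8 as the target encoding is just an example):

    " tell Vim to write this buffer out as UTF-8 instead of UTF-16
    :set fileencoding=utf-8
    " save the file; the Nul bytes that came from the UTF-16 representation are gone
    :w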