How do I find the encoding of the current buffer in vim?

92

24

Say I am editing some file with vim (or gvim). I have no idea about the file's encoding and I want to know whether it is UTF-8, ISO-8859-1, or something else. Can I somehow tell vim to show me which encoding is used?

innaM

Posted 2009-08-24T13:48:39.127

Reputation: 9 208

Answers

105

The fileencoding setting shows the current buffer's encoding:

:set fileencoding
fileencoding=utf-8

There really isn't a general way to determine the encoding of a plain-text file, as that information isn't saved in the file itself. The exception is a UTF-8 file that starts with a so-called BOM (byte order mark), which indicates the encoding. This is why XML and HTML files declare their charset, for example in a meta tag.
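To illustrate the BOM point: a BOM is just a fixed byte sequence at the very start of the file, and UTF-16 and UTF-32 have their own BOMs too. Here is a minimal Python sketch of checking for one. The function name sniff_bom is made up for this example, and this is not what Vim itself runs.

```python
import codecs

# Known BOMs. UTF-32 must be checked before UTF-16 because the
# UTF-32 LE BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is no BOM."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


print(sniff_bom(codecs.BOM_UTF8 + b"abc"))  # utf-8
print(sniff_bom(b"abc"))                    # None
```

A result of None is the common case: it only means the file has no BOM, not that the file isn't UTF-8.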

You can enforce a particular encoding with the 'encoding' setting. See :help encoding and :help fileencoding in Vim for how the editor handles these settings. You can also list several encodings in the 'fileencodings' option (note the plural) in your vimrc to have Vim try each of them in order when it opens a file.
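The "try a list of encodings in order" idea can be sketched in Python as follows. This is only a rough model of the detection behavior described in :help fileencodings (the first candidate that decodes without an error wins). The function name first_decodable and the candidate list are assumptions made for this example.

```python
def first_decodable(data: bytes, candidates):
    """Return the first encoding in `candidates` that decodes `data` cleanly."""
    for enc in candidates:
        try:
            data.decode(enc)
        except UnicodeDecodeError:
            continue  # this candidate cannot represent the bytes; try the next
        return enc
    return None


# latin-1 accepts every possible byte, so it only makes sense as the last entry.
candidates = ["utf-8", "latin-1"]

print(first_decodable("naïve".encode("utf-8"), candidates))    # utf-8
print(first_decodable("naïve".encode("latin-1"), candidates))  # latin-1
```

The order matters: an encoding that accepts any byte sequence hides every candidate listed after it, so strict encodings such as UTF-8 belong at the front.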

jtimberman

Posted 2009-08-24T13:48:39.127

Reputation: 20 109

2

Probably worth mentioning that BOMs are 1.) Not unique to UTF-8 -- though UTF-8's is distinct from other BOMs, 2.) Not required and often not found in UTF-8.

– ruffin – 2014-10-16T15:09:54.497

@jtimberman Did you mean to write set fileencoding? (with a trailing question-mark)? – SeldomNeedy – 2016-06-22T08:28:25.817

@SeldomNeedy it will work without the question mark too – Ruslan – 2019-08-09T09:53:00.503

7

Unfortunately, not correct. Vim cannot find the encoding of the file you're reading, because it is not written in the file. It can only guess based on the characters present in the file. For example, a file with the text "abcdef" can be in several encodings, since practically all of them support those characters, but a file with "šđčćž" will likely be in CP1250. So you're not reading the encoding from somewhere, but guessing what the encoding could be and displaying the file based on that guess. – Rook – 2009-08-24T14:29:12.567

6

What you are doing here is explicitly setting the encoding, based on your observations of the file's contents. If you want vim to try several encodings when opening a file, put several of them in the 'fileencodings' option in your _vimrc. – Rook – 2009-08-24T14:32:24.947

@ldigas, thanks for the feedback, I've updated the answer to be a bit more clear on that (I hope!) – jtimberman – 2009-08-24T15:18:22.607

I only wish that the answer were this easy. It's not, see my answer below for the 'right' way and explanation. – dotancohen – 2013-12-26T07:00:48.303

14

Note that a file's encoding is not explicitly stated anywhere in the file. Thus, VIM and other applications must guess at the encoding. One common way of doing this is with the chardet application, which can be run from within VIM like so:

:!chardet %

The answer provided by jtimberman shows you the encoding of the current buffer, which may not be the same as the encoding of the file on disk. Thus, you will notice that chardet will sometimes show a different encoding than VIM, especially if you have VIM configured to always use a specific encoding (e.g. UTF-8).

The nice thing about chardet is that it gives a confidence score for its guess, whereas VIM can be (and often is) wrong about guessing the encoding if there are not many characters above \x7F (ASCII 127). For instance, adding a single א to a long file of PHP code makes chardet think that the file is ISO-8859-2 with a confidence of 0.72, whereas adding the slightly longer phrase שלום, עולם!‏ gives UTF-8 with a confidence score of 0.99. In both cases, set fileencoding? showed UTF-8 not because the file on disk was UTF-8, but because VIM is configured to use UTF-8 internally.
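To see why a single non-ASCII character gives a guesser so little to work with, here is a small Python sketch that lists every candidate encoding under which some bytes decode without error. It is not how chardet works internally (chardet is statistical); the function name plausible_encodings and the candidate list are assumptions made for this example.

```python
def plausible_encodings(data: bytes,
                        candidates=("utf-8", "iso-8859-1", "iso-8859-2")):
    """Return the candidate encodings that can decode `data` without error."""
    ok = []
    for enc in candidates:
        try:
            data.decode(enc)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding
        ok.append(enc)
    return ok


# Pure ASCII is valid in all three candidates.
print(plausible_encodings(b"abcdef"))

# "\u05d0" is the Hebrew letter alef. Its two UTF-8 bytes (D7 90) are also
# valid, though meaningless, in both ISO-8859 variants.
print(plausible_encodings("\u05d0".encode("utf-8")))
```

Both calls print all three candidates, so validity alone cannot settle the question. A guesser has to fall back on how likely the decoded text looks, and that takes more than a couple of bytes.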

dotancohen

Posted 2009-08-24T13:48:39.127

Reputation: 9 798

I suggest that you mention a word about the availability of chardet across OS'es. – Soundararajan – 2018-08-31T09:28:41.717

@Soundararajan: I'm probably not the guy to mention that as I use Debian and CentOS only. You are invited to edit the answer if you have relevant information, though. Thanks! – dotancohen – 2018-08-31T12:28:19.293

I don't see the need to do that inside VIM, better to do it from outside: chardet <file>. Still, good suggestion. – lepe – 2019-08-03T07:10:35.850

-1

I found this: https://vim.fandom.com/wiki/Reloading_a_file_using_a_different_encoding

You can reload a file using a different encoding if Vim was not able to detect the correct encoding:

:e ++enc=<encoding>

where encoding could be cp850, ISO-8859-1, UTF-8, ...

You can use file yourfilename to find the encoding, or chardetect (provided by python-chardet or uchardet, depending on your Linux distribution), as suggested by dotancohen.

Pierre-Damien

Posted 2009-08-24T13:48:39.127

Reputation: 161

This doesn't answer the question of how to find out current encoding. Instead this command will force some other encoding on the buffer. – Ruslan – 2019-08-09T09:55:28.617