MS Excel: save spreadsheet to tab-separated with UNIX end of lines and UTF-8?

4

I have a tab-separated text file with a table that has UNIX end-of-line characters (i.e. \n) and UTF-8 encoding. I can open it in MS Excel 2010 without major problems using Data → Import and selecting related options. However, if I save it back into a text file using File → Save as → Tab-separated, I end up with a file with DOS end-of-lines (i.e. \r\n) and one-byte national Windows encoding (i.e. "ANSI").

Of course, I can restore proper end-of-lines and encoding with an external tool, but that looks like a lot of hassle in the long term.

Is there a quick way to save tab-separated file with UNIX EOLs and in UTF8 (i.e. using all the settings that were used during the import) in MS Excel?

GreyCat

Posted 2014-11-24T08:25:00.063

Reputation: 150

1are you open to using a VBA solution? – jjk_charles – 2014-12-02T13:16:31.383

Answers

3

Short answer: no. Excel can't do that without using VBA.

About different encodings
Check this Stackoverflow question about the different encodings (UTF-8, UTF-16, Unicode, etc.)

What Excel can do
Microsoft Excel can export text files using ANSI (Windows-1252 or ISO-8859-1, which are quite similar) or using Unicode (which is the same as UTF-16). Both have a DOS-style end-of-line ([CR]+[LF] or '\r\n' or in ASCII HEX code: 0x0D0A).

What Excel can't do
Microsoft Excel can't export UTF-8 and can't use UNIX style newlines ([LF] or '\n' or in ASCII HEX code: 0x0D).

See this Microsoft Knowledge base article where this is confirmed. They revert to another tool (such as Notepad) to save the ANSI file to UTF-8.

For a VBA answer, see this solution.

agtoever

Posted 2014-11-24T08:25:00.063

Reputation: 5 490

Really? I was quite sure it supported at least UTF-16... – user1686 – 2014-12-05T17:51:15.740

Excel actually has a Save As option for Unicode Text (*.txt). However this is not UTF-8 and produces DOS end-of-line characters. You probably need to use an external tool e.g. Notepad++ and look into the Macro functionality. – Mike Honey – 2014-12-08T03:34:09.393

@MikeHoney good point. Expanded the answer: mentioned ANSI as well as UTF-16 (Unicode) support and made a remark about CR+LF vs. LF too. – agtoever – 2014-12-08T09:11:58.510

Makes sense, thanks. A negative answer is still an answer. – GreyCat – 2014-12-08T15:50:45.883