Windows tool to convert large text file from UTF-8 to "Windows Unicode" (UTF-16)

6

1

I need to view large Unicode text files (current version is 2,379,415,348 bytes) on Windows 7.

Normally I prefer UTF-8, but after looking on SuperUser it seems the best Windows large file viewer can't handle UTF-8, so I don't mind doing one-off conversions of these files to UTF-16-LE until a better viewer comes along.

So in the meantime I need a tool that can convert the encoding. Note that I can't use an editor for this or I would just view the file in that editor. Either a command line or GUI tool would be fine.

(I have a netbook maxed out at 2 GB of RAM. Sometimes I can view these files fine in gVim, but I often have lots of browser windows open and have run out of memory plenty of times. LTFViewer can view text files right from the disk without loading the whole thing into RAM.)

hippietrail

Posted 2011-08-03T14:50:39.773

Reputation: 3 699

2

Have you tried Notepad? (Just kidding) – user541686 – 2011-08-03T15:02:34.583

@Mehrdad: Ok you made me laugh d-; – hippietrail – 2011-08-03T15:09:58.120

Answers

12

GNU iconv has a Windows version.

iconv -f utf-8 -t utf-16le < in.txt > out.txt
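If installing iconv is not an option, the same conversion can be sketched in a few lines of Python. This is only an illustration, not part of the original answer: the function name `convert_utf8_to_utf16le`, the 1 MiB chunk size, and the choice to write a byte order mark by default are all assumptions. It reads the input in chunks and uses an incremental decoder, so a multi-byte UTF-8 sequence split across two chunks is still decoded correctly and the whole file is never held in RAM.

```python
import codecs


def convert_utf8_to_utf16le(src_path, dst_path, chunk_size=1 << 20, write_bom=True):
    """Stream-convert a UTF-8 file to UTF-16LE, one chunk at a time.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    # "utf-8-sig" behaves like UTF-8 but drops a leading UTF-8 BOM if present.
    decoder = codecs.getincrementaldecoder("utf-8-sig")()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        if write_bom:
            # Assumption: the viewer accepts (or needs) a little-endian BOM.
            dst.write(codecs.BOM_UTF16_LE)
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            # The decoder buffers incomplete byte sequences between calls.
            dst.write(decoder.decode(chunk).encode("utf-16-le"))
        # Final flush; raises UnicodeDecodeError if the file ends mid-character.
        dst.write(decoder.decode(b"", final=True).encode("utf-16-le"))
```

Calling `convert_utf8_to_utf16le("in.txt", "out.txt")` would mirror the iconv command above, except that it also writes a BOM unless `write_bom=False` is passed.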

user1686

Posted 2011-08-03T14:50:39.773

Reputation: 283 655

2

libintl.dll can be found here: http://gnuwin32.sourceforge.net/packages/libintl.htm, just put libintl.dll into the same directory as iconv.exe. – Ben – 2014-08-29T18:05:14.480

That command line seems to have output UTF-16 big-endian! (I guess it chose the native format of its native *nix rather than the native format of its current port) – hippietrail – 2011-08-03T15:36:28.890

@hippietrail: If you require little-endian, specify -t utf-16le. // The "native format" depends on the CPU architecture, not on the OS. (FWIW, iconv chooses little-endian on Linux x86_64.) Most Windows and *nix programs can deal with both utf-16be and utf-16le just fine, and iconv even includes the "byte order mark" to help with this.

– user1686 – 2011-08-03T15:43:09.183
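To make the byte-order point concrete, here is a small Python sketch (an illustration only; the helper name `detect_utf16_byte_order` is hypothetical). It shows how the same character is laid out in each byte order and how the byte order mark at the start of a file tells a reader which one was used.

```python
import codecs
import sys


def detect_utf16_byte_order(prefix):
    """Return "little", "big", or None based on a leading UTF-16 BOM.

    Hypothetical helper; it only inspects the first two bytes.
    """
    if prefix.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "little"
    if prefix.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "big"
    return None  # no BOM: the reader has to guess or be told


# The letter "A" (U+0041) in each explicit byte order, without a BOM.
print("A".encode("utf-16-le"))  # low byte first
print("A".encode("utf-16-be"))  # high byte first

# Python's plain "utf-16" codec writes a BOM followed by the CPU's native order.
print(sys.byteorder, detect_utf16_byte_order("A".encode("utf-16")))
```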

1

Sadly, LTFViewer can only handle UTF-16LE, despite the BOM. I would rubbish it, but it seems to be the only free disk-based large file viewer on Windows. By the way, I did specifically mention LE in my question, but I can never remember whether it's UTF-16LE or UTF-16-LE. I am running an Intel CPU in 32-bit mode.

– hippietrail – 2011-08-03T16:16:48.170

1

@hippietrail: If you require little-endian, specify -t utf-16le in iconv. – user1686 – 2011-08-03T16:19:12.397

1

This is a great answer, but only if you are allowed to install software on the machine in question. Is there no included utility to do this on Windows? – Noah – 2013-08-22T01:37:36.647