How to pipe Windows "dir" in ANSI codepage

3

This applies to both Windows XP and Windows 7.

Some of my files have names with European characters, for example the German a-umlaut, also known as a-diaeresis.

These are displayed correctly in Windows Explorer, and also in a command shell (cmd.exe) window in response to the "dir" command.

However, if that "dir" command is directed to a file, e.g.

dir > file.txt

then the European characters in that file are represented in a DOS codepage; for example the a-umlaut is represented as decimal 132 (hex 0x84). This is not what I want. I want the file to be in the ANSI codepage, where for example a-umlaut is decimal 228 (hex 0xE4).
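To make the byte values above concrete, here is a minimal Python sketch that encodes a-umlaut in both code pages. It assumes the DOS code page is 850 (as stated later in this thread) and the ANSI code page is Windows-1252; the codec names `cp850` and `cp1252` are Python's, not something cmd.exe uses.

```python
# Show how the same character maps to different byte values per code page.
ch = "\u00e4"  # a-umlaut (a-diaeresis)

oem = ch.encode("cp850")    # assumed DOS/OEM code page 850
ansi = ch.encode("cp1252")  # assumed Windows "ANSI" code page 1252

print(oem.hex(), oem[0])    # 84 132
print(ansi.hex(), ansi[0])  # e4 228
```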

Issuing the command "cmd /?" results in help information including the line

/A      Causes the output of internal commands to a pipe or file to be ANSI

This sounds like exactly what I want. However, either the sequence of commands

cmd /A
dir > file.txt
exit

or the equivalent single command line

cmd /A /C dir > file.txt

produces exactly the same file.txt as before, with its European characters still in the DOS code page.

So my question is, how can I get "dir" to write a file in the ANSI codepage?

  • Rich

Rich Pasco

Posted 2011-08-14T15:12:34.567

Reputation: 31

Have you tried with the Unicode (/U) flag instead of ANSI? – Breakthrough – 2011-08-14T15:18:05.943

Switching from OEM to ANSI is like upgrading from DOS to Windows 3.1... it will cause even more suffering in the end. Use Unicode if possible. – user1686 – 2011-08-14T15:30:09.567

Nice plug for Unicode, but for many reasons this has to interface with existing scripts that use single-byte characters. – Rich Pasco – 2011-08-15T02:51:02.697

Answers

3

I think there is an easy way, "out of the box":

chcp 1252
dir > file.txt

Maximus

Posted 2011-08-14T15:12:34.567

Reputation: 19 395

0

You are being led up the garden path by the letter "A". The /A option isn't distinguishing "ANSI" from "OEM" code pages. It's distinguishing 8-bit single-byte/multiple-byte character sets from 16-bit Unicode (the /U option). 8-bit SBCS/MBCS output from a Win32 program, such as CMD, to a console is handled in the "OEM" code page.
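If the /U route is taken, the redirected file would then need converting to a single-byte code page afterwards. The following is only a sketch of that step in Python. It assumes the /U output is UTF-16 little-endian without a byte-order mark and that the target is Windows-1252. The function name `utf16le_to_ansi` is made up for illustration.

```python
def utf16le_to_ansi(raw: bytes, ansi: str = "cp1252") -> bytes:
    """Re-encode UTF-16LE bytes (assumed format of `cmd /U` output) as ANSI."""
    text = raw.decode("utf-16-le")
    # Characters that do not exist in the target code page become "?".
    return text.encode(ansi, errors="replace")
```

For example, after `cmd /U /C dir > file.u16` (a hypothetical file name), the bytes of file.u16 could be read in binary mode, passed through this function, and written to file.txt.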

JdeBP

Posted 2011-08-14T15:12:34.567

Reputation: 23 855

OK, JdeBP, I understand what you're saying: the help information sure is misleading. But then how do I get the output of the "DIR" command to be in ANSI code page (well, ISO 8859-1, the standard for Windows) rather than the OEM DOS codepage (850 in my case)? I suppose I could write a simple EXE filter with a lookup table, and then pipe my DIR through it, or maybe I could write a whole new replacement for DIR which does what I want in the first place. But surely there's an easier way! – Rich Pasco – 2011-08-15T02:38:29.853

I found one way to do it: download my utility cp850win and then issue the command

dir | cp850win > file.txt

– Rich Pasco – 2011-08-15T06:48:30.583
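The lookup-table filter idea mentioned above can be sketched in a few lines of Python. This is not the cp850win utility, whose internals are not shown here. It is an illustration that assumes code page 850 as the source and Windows-1252 as the target, with hypothetical names `build_table` and `convert_file`.

```python
OEM = "cp850"    # assumed DOS/OEM code page
ANSI = "cp1252"  # assumed Windows "ANSI" code page

def build_table(oem: str = OEM, ansi: str = ANSI) -> bytes:
    """Build a 256-entry byte-to-byte lookup table from OEM to ANSI."""
    table = bytearray(256)
    for b in range(256):
        ch = bytes([b]).decode(oem)
        # Characters missing from the ANSI code page (e.g. box-drawing
        # characters) are replaced with "?".
        table[b] = ch.encode(ansi, errors="replace")[0]
    return bytes(table)

TABLE = build_table()

def convert_file(src: str, dst: str) -> None:
    """Read OEM-encoded bytes from src and write ANSI-encoded bytes to dst."""
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(data.translate(TABLE))
```

With this, `dir > oem.txt` followed by `convert_file("oem.txt", "file.txt")` would give a file in which a-umlaut is 0xE4 rather than 0x84. The same table could be applied to standard input and output to get a pipe-style filter.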