Is there a way that I manually have a user look up the current Codepage and locale of their windows OS? Is there a registry setting that stores that information?
It would also be useful if the technique worked all the way back to Windows 2000.
chcp will get you the active code page.
systeminfo will display system locale and input locale, among other things.
"Note: This command (systeminfo) is not available in Windows 2000 but you can still query Windows 2000 computer by running this command on Windows XP or Windows 2003 computer and set remote computer to Windows 2000 computer. If the current user logon that execute this command already has privilege on remote machine (for instance, Domain Administrators), you don’t have to use /u and /p."
From here.
Note that a given system has two active code pages of interest, the OEM code page and the ANSI code page, both determined by the legacy setting named language for non-Unicode programs, formerly known as system locale (see the bottom section for background information).
Note: There are two more code pages, but they are rarely used anymore and are therefore not discussed here: the EBCDIC code page and the (pre-OS X) Mac code page - see the WinAPI docs.
The active OEM code page is most easily obtained via chcp, as shown in Forgotten Semicolon's helpful answer - assuming the console window wasn't configured with a custom code page via the registry and that the code page wasn't explicitly changed in the session with chcp <codePageNum>.
Determining the active ANSI code page is not as simple, but PowerShell can help, also with determining the name and language of the system locale:
In Windows 8+ / Windows Server 2012+: Use the Get-WinSystemLocale cmdlet:
Get-WinSystemLocale | Select-Object Name, DisplayName,
@{ n='OEMCP'; e={ $_.TextInfo.OemCodePage } },
@{ n='ACP'; e={ $_.TextInfo.AnsiCodePage } }
Caveat: The information returned does not reflect a potential UTF-8 override that may be in place via a new Windows 10 feature (see this SO answer); instead, the information always reflects the code pages originally associated with the active system locale. If you do need to know whether the UTF-8 override is in effect, see the registry-based method below.
On a US-English system, the above yields:
Name  DisplayName             OEMCP ACP
----  -----------             ----- ---
en-US English (United States) 437   1252
OEMCP is the OEM code page, ACP the ANSI code page.
A registry-based method that also works on older systems down to Windows XP:
# Get the code pages:
Get-ItemProperty HKLM:\SYSTEM\CurrentControlSet\Control\Nls\CodePage |
Select-Object OEMCP, ACP
On a US-English system, the above yields:
OEMCP ACP
----- ---
437   1252
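The same registry key can also be used to check for the UTF-8 override mentioned in the caveat above. The following is only a minimal sketch: it assumes that the override shows up as the value 65001 in ACP (the values are stored as strings), and the property name Utf8Override is made up for illustration.

```powershell
# Read the code-page values from the registry; they are stored as strings.
$cp = Get-ItemProperty 'HKLM:\SYSTEM\CurrentControlSet\Control\Nls\CodePage'

# Assumption: with the UTF-8 override in effect, ACP reads '65001'.
# 'Utf8Override' is a hypothetical property name, not part of any API.
$info = [pscustomobject] @{
  OEMCP        = $cp.OEMCP
  ACP          = $cp.ACP
  Utf8Override = ($cp.ACP -eq '65001')
}
$info
```

On a system without the override, Utf8Override should come out as False and the two code-page columns should match the output shown above. Note that [pscustomobject] requires PowerShell 3 or later.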
If you also want to get the system locale's [friendly] name and LCID (though note that LCIDs are deprecated):
[Globalization.CultureInfo]::GetCultureInfo([int] ('0x' + (
Get-ItemProperty 'HKLM:\SYSTEM\CurrentControlSet\Control\Nls\Language' Default
).Default)
)
On a US-English system, the above yields:
LCID Name  DisplayName
---- ----  -----------
1033 en-US English (United States)
Background information:
System locale is the legacy name for what is now more descriptively called language for non-Unicode programs (see NLS terminology), and, as the names suggest:
The setting applies only to legacy programs (programs that don't support Unicode).
It applies system-wide, irrespective of a given user's locale settings, and administrative privileges are required to change it.
It is important to note that this is a legacy setting, because code pages no longer apply to programs that use Unicode internally and call the Unicode versions of the Windows API.
Notably, it determines the active code pages, i.e., the character encodings used by default:
the ANSI code page, which is used to translate strings to and from Unicode when non-Unicode programs call the non-Unicode (ANSI) versions of the Windows API, such as the ANSI version of the TextOut function; this notably determines how the program's strings render in the GUI.
the OEM code page to make active by default in console windows, as reflected by chcp.
Switching a console to code page 65001, which represents the UTF-8 encoding of Unicode, is a solution, but that can cause legacy command-line programs to misinterpret data and even to fail - see this StackOverflow answer for details.
To switch to a different code page, e.g. 850, run chcp 850 in cmd.exe, and $OutputEncoding = [console]::InputEncoding = [console]::OutputEncoding = [text.encoding]::GetEncoding(850) in PowerShell.
Additionally, the setting determines the rarely used EBCDIC and Mac code pages.
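To make the effect of the active code page more concrete, here is a small sketch that decodes one and the same byte with the OEM code pages 437 and 850 and the ANSI code page 1252. The byte value 0xE9 and the variable names are arbitrary choices for illustration. It assumes the code pages are available to .NET, which is the case in Windows PowerShell.

```powershell
# One and the same byte, interpreted via three different code pages.
[byte[]] $bytes = 0xE9

$decoded = foreach ($codePage in 437, 850, 1252) {
  # Decode the byte using the given code page's encoding.
  $char = [Text.Encoding]::GetEncoding($codePage).GetString($bytes)
  [pscustomobject] @{
    CodePage  = $codePage
    Char      = $char
    CodePoint = 'U+{0:X4}' -f [int] $char[0]
  }
}
$decoded
```

The byte maps to U+0398 in code page 437, to U+00DA in 850 and to U+00E9 in 1252. This is why text written by a program using one code page appears garbled when read by a program that assumes another.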
Despite the word locale used in the legacy term and the word language in the current term:
The only aspects controlled by the setting are the set of active code pages and the default bitmap fonts, not also other elements of a locale (which are controlled by the user-level locale settings).
A given code page is typically shared by many locales and covers multiple languages; e.g., the widely used 1252
code page is used by many Western European languages, including English.
However, when you do change the setting via the Control Panel, you do pick the setting by way of a specific locale.
For a list of all Windows code pages, see https://docs.microsoft.com/en-us/windows/desktop/Intl/code-page-identifiers
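The Windows API that returns the active output code page of the current console is GetConsoleOutputCP(). The system's active ANSI and OEM code pages are returned by GetACP() and GetOEMCP(), respectively.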
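If you'd rather call the Windows API than read the registry, the following sketch shows one possible way to do so from PowerShell via P/Invoke. It is Windows-only, and the type name CodePageApi and the namespace Win32Sketch are made up for illustration. GetACP, GetOEMCP and GetConsoleOutputCP are the actual kernel32.dll functions.

```powershell
# P/Invoke declarations for the three kernel32.dll functions.
$signature = @'
[DllImport("kernel32.dll")] public static extern uint GetACP();
[DllImport("kernel32.dll")] public static extern uint GetOEMCP();
[DllImport("kernel32.dll")] public static extern uint GetConsoleOutputCP();
'@

# 'CodePageApi' and 'Win32Sketch' are hypothetical names chosen for this sketch.
$native = Add-Type -MemberDefinition $signature -Name CodePageApi -Namespace Win32Sketch -PassThru

$apiInfo = [pscustomobject] @{
  ACP             = $native::GetACP()             # active ANSI code page
  OEMCP           = $native::GetOEMCP()           # active OEM code page
  ConsoleOutputCP = $native::GetConsoleOutputCP() # output code page of the current console
}
$apiInfo
```

ConsoleOutputCP reflects changes made with chcp in the current session, whereas ACP and OEMCP do not.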