50

Which characters are allowed and which of them must be escaped on the command line in different operating systems?

voretaq7
java.is.for.desktop
  • There are some useful answers below, but what are you trying to achieve? Coding up your own character white-listing routines is probably not the best route. – medina Jun 12 '10 at 23:16
  • Thanks to everyone! All answers are helpful. *What I need the info for is: I'm writing a tool which would tag files across the filesystem, by altering their names (no metadata).* – java.is.for.desktop Jun 13 '10 at 12:48
  • See also answer on [superuser](http://superuser.com/questions/358855/what-characters-are-safe-in-cross-platform-file-names-for-linux-windows-and-os/358861#358861). – pevik Jun 21 '16 at 10:21

4 Answers

30

There's a discussion of filename characters in the Wikipedia article on File Names.

You may find this essay informative: Fixing Unix/Linux/POSIX Filenames.

This article compares OS X and Windows XP: X vs. XP: Forbidden Characters in Filenames (PDF, see pp approx. 64-66).

Things That Shouldn’t Be in File Names for $1,000 Alex

I don't know which characters must be unescaped, but in Linux it's probably not a good idea to put a backslash in front of characters whose escaped form has a special meaning, such as "n" ("\n" is a newline) and "t" ("\t" is a tab), although that's generally not a problem in file operations. Perhaps you mean "escaped" rather than "unescaped". The characters that most commonly need escaping are the ones the shell interprets, such as space, ">", and "<". See some of the articles I linked for a discussion of those.

Dennis Williamson
29

The only characters not allowed in a filename in *nix are NUL and /. In Windows, only NUL, :, and \ are truly not allowed, but many apps restrict that further, also preventing ?, *, +, and %.

At no point do any characters in a filename need to be escaped except as required in order to not be interpreted by the shell.
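To illustrate the point about shell-only escaping, here is a minimal Python sketch that quotes arbitrary filenames for a POSIX shell using the standard library's `shlex.quote`. The helper name `shell_command` is made up for this example, and the quoting rules apply to POSIX shells only, not to `cmd.exe` or PowerShell.

```python
import shlex


def shell_command(program, *filenames):
    """Build a POSIX shell command line with every argument quoted.

    `shell_command` is a hypothetical helper for illustration only.
    """
    parts = [shlex.quote(program)]
    parts.extend(shlex.quote(name) for name in filenames)
    return " ".join(parts)


# The filenames themselves are unchanged on disk; only the command
# line text is quoted so the shell passes them through literally.
print(shell_command("ls", "plain.txt", "two words.txt", "a>b"))
```

If the program is started without a shell at all (for example `subprocess.run(["ls", "two words.txt"])` with an argument list), no quoting is needed, which matches the statement above.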

Ignacio Vazquez-Abrams
  • The second point deserves emphasis. Usually, “escaping” refers to a shell mechanism that allows the user to specify strings (e.g. pathnames) that contain characters which the shell would otherwise treat in a special manner. If the OP means using something like “percent encoding” to encode otherwise disallowed characters, then that is a purely application level “pathname protocol” that each involved program must adopt (or not). – Chris Johnsen Jun 13 '10 at 04:20
  • I'm scanning a folder with readdir then trying to open the files with the names it returns. Some of them fail to open with ENOENT which suggests even for the OS sometimes you have to escape? – gman Jan 06 '18 at 05:59
16

If you create a file on Windows with Explorer using one of the following characters, it will complain that the characters are not allowed:

\ / : * ? " < > |

A good reference is here:

Naming Files, Paths, and Namespaces
http://msdn.microsoft.com/en-us/library/aa365247%28VS.85%29.aspx

Microsoft further states:

"... on Windows-based desktop platforms, invalid path characters might include ASCII/Unicode characters 1 through 31, as well as quote ("), less than (<), greater than (>), pipe (|), backspace (\b), null (\0) and tab (\t)."

http://msdn.microsoft.com/en-us/library/system.io.path.getinvalidpathchars.aspx
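As a rough sketch of how a tool could check names against the lists above, the following Python function reports which characters in a proposed name fall into the Explorer list or the control-character range from the Microsoft quote. The names `WINDOWS_FORBIDDEN` and `windows_invalid_chars` are made up for this example, and the check is deliberately incomplete: it does not look at reserved device names or trailing dots and spaces, which the linked "Naming Files, Paths, and Namespaces" article also discusses.

```python
# Characters Explorer rejects, taken from the list above.
WINDOWS_FORBIDDEN = set('\\/:*?"<>|')


def windows_invalid_chars(name):
    """Return the offending characters in `name`, in order of first appearance.

    Hypothetical helper: covers only the characters listed in this answer
    (the Explorer list plus NUL and control characters 1 through 31).
    """
    bad = []
    for ch in name:
        if (ch in WINDOWS_FORBIDDEN or ord(ch) < 32) and ch not in bad:
            bad.append(ch)
    return bad


print(windows_invalid_chars("report.txt"))
print(windows_invalid_chars("what?: a draft"))
```

An empty list means none of the listed characters were found; it is not a guarantee that Windows will accept the name.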

Greg Askew
  • I remember reading a couple years ago that user-mode Windows has those restrictions as well as being case-insensitive ("ABC.txt" === "abc.txt"). However, kernel-mode Windows has fewer restrictions and is case-sensitive ("ABC.txt" !== "abc.txt" just like *NIX). For all intents and purposes, though, the above characters will apply to the majority of programs because they run in user-mode. – CubicleSoft Mar 03 '13 at 13:36
  • I can escape `\ / : * ? " < > |` all of them, and create them with mkdir on my GNU/Linux system. You can use `mkdir '?'` to create the `?` directory as well. I have used the ramdisk and XFS file system to test that. – S.Goswami Sep 30 '19 at 05:37
7

On Linux and other POSIX-compatible systems, "/" is reserved because it's the directory separator, and "\0" (the NUL character) marks the end of the string. Everything else is allowed.
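The rule above is small enough to write down directly. This is only a sketch: `is_valid_posix_component` is a made-up name, and rejecting the empty string, "." and ".." is an extra assumption of this example (they are not usable as new file names), not something stated above. Length limits are not checked.

```python
def is_valid_posix_component(name):
    """True if `name` could be a single file name on a POSIX system.

    Hypothetical helper for illustration only.
    """
    # Assumption added for this sketch: the empty string and the
    # current/parent directory entries cannot be used as new names.
    if name in ("", ".", ".."):
        return False
    # The two characters the answer says are not allowed.
    return "/" not in name and "\0" not in name


# Characters that Windows rejects are fine here, as are newlines.
print(is_valid_posix_component('\\:*?"<>|'))
print(is_valid_posix_component("line\nbreak"))
print(is_valid_posix_component("dir/file"))
```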

janneb
  • Although it's highly recommended to avoid newlines, tabs, control characters, and the like, and to make sure the filename is valid UTF-8. – Flimm Sep 23 '15 at 10:32