
Is it dangerous to have several parallel jobs create the same directory using mkdir -p? (This is under Linux.)

In my case, I send many jobs to a SUN grid to process them in parallel, and some of these jobs start by creating a certain directory foo. So, the execution of the different mkdir commands might happen at exactly the same time ...

user9474

4 Answers


A simple mkdir is atomic (if you are using NTFS, it may not be, so that needs checking).

By deduction, mkdir -p folder1/folder2/ starts by creating folder1, which is atomic. If at the same time another process also tries to create folder1, it will see that folder1 already exists and will move on to creating folder2, which will either fail (if the first process already created folder2) or succeed, in which case the first process's attempt will fail.

This is not a problem as long as the failure is handled properly (i.e. with good error handling).
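For example, a sketch of such handling in a job script (the directory name foo comes from the question; the error message and exit code are illustrative):

```shell
#!/bin/sh
# Each parallel job can run this safely: mkdir -p exits 0 both when it
# creates "foo" and when "foo" already exists, so a race between jobs
# is harmless. A non-zero exit signals a genuine problem (permissions,
# read-only filesystem, a plain file named "foo" in the way, ...).
if ! mkdir -p foo; then
    echo "could not create foo" >&2
    exit 1
fi
```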

Weboide

There should be no danger to any POSIX-compatible filesystem from multiple concurrent (racing) mkdir -p commands. In fact, I have tested my own shell-script locking function (wrapped around mkdir ... || ..., but NOT with -p) using hundreds of racing processes in an effort to detect any failures from the race, on Linux and Solaris with a few different local filesystems. I never saw any failures, and my searching and reading suggested that it should be safe.

(Nitpick: in your case it sounds like atomicity is not required. Atomicity is critical for mutex/locking but not necessary for mere safety --- mkdir() can fail safely when the directory is already there. Multiple racing mkdir shell commands should each have some of their calls fail harmlessly as they traverse the components of the target path, attempting to create each one. Atomicity is irrelevant to that.)
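The locking idiom mentioned above can be sketched like this (plain mkdir, NOT -p; the name "lockdir" is illustrative, not something from the question):

```shell
#!/bin/sh
# Because a single mkdir() call is atomic, exactly one racing process
# gets exit status 0 here, so creating the directory doubles as
# acquiring a mutex.
if mkdir lockdir 2>/dev/null; then
    echo "lock acquired"
    # ... critical section ...
    rmdir lockdir   # release the lock
else
    echo "lock held by another process"
fi
```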

Jim Dennis

A local filesystem should be POSIX-compliant, and all operations (including creation of directories) should be atomic.

I guess it comes down to what mkdir -p does when it starts creating a path and then suddenly encounters a (further) element of that path that's already created. If it's sane it will continue independent of what it did before, and your operations should be safe. For the details of your particular mkdir tool you should see the source code.

On networked/clustered filesystems it may very well depend on network latency, server load or mount options.

Also, it would not be hard to write a script that tries it many times with high concurrency; failures should be easy to detect.
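A minimal version of such a script might look like this (the path, the job count, and the marker-file names are arbitrary choices for illustration):

```shell
#!/bin/sh
# 50 background jobs race to mkdir -p the same nested path; any job
# whose mkdir -p fails leaves a marker file behind.
rm -rf stress fail.*
for i in $(seq 1 50); do
    ( mkdir -p stress/a/b/c 2>/dev/null || touch "fail.$i" ) &
done
wait
ls fail.* 2>/dev/null && echo "detected failures" || echo "no failures detected"
```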

Joris
    POSIX does not guarantee atomicity of "all operations" on a filesystem. However, the `mkdir()` system call should be atomic (though I'm not a standards lawyer and thus cannot cite chapter and verse). Any implementation of `mkdir` is wrapped around that system call ... but, for example, `mkdir -p` is almost certainly a loop around a series of system calls, checking for and/or attempting to create the directories at each level along the path. That whole sequence is, SURPRISE ... NOT atomic. – Jim Dennis Jun 19 '10 at 22:02

No, this is not dangerous.

One job will succeed in creating the directory, and the others will fail.

Mark Harrison