I have the following setup:
- Windows 8.1 32-bit
- Drive 0: system drive, SSD, NTFS, mounted at C:\
- Drive 1: data drive, magnetic HDD, NTFS, mounted at C:\Users\Database User\Documents and additionally at Z:\
In a sub-sub-directory of C:\Users\Database User\Documents I have about 50 000 files of about 2 KB each on average, spread over about 10 subdirectories. (A bcolz column database.)
With cross-drive NTFS junction points I find huge performance discrepancies depending on whether a process' file IO targets its working directory (or a sub-directory thereof) or any other directory.
Below the NTFS junction, acceptable performance is achieved only when the target lies in the process's working directory or in one of its subdirectories:
- Working directory C:\Users\Database User\Documents\abc\def: executing rmdir /Q /S mydata.bcolz is an IO-bound (disk-bound) operation
- Working directory C:\Users\Database User\Documents\abc: executing rmdir /Q /S def\mydata.bcolz is an IO-bound (disk-bound) operation
- Working directory C:\Users\Database User\Documents\abc\def\xyz: executing rmdir /Q /S ..\mydata.bcolz is a CPU-bound operation
In the first two cases, the cmd.exe process consumes hardly any CPU time, while in the last case it consumes 100% of one core. The operation is identical in all three cases; only the working directory differs.
But note:
- Working directory Z:\abc\def\xyz: executing rmdir /Q /S ..\mydata.bcolz is again an IO-bound operation!
This phenomenon occurs with any rapid file IO on a very large number of very small files. It is not limited to rmdir or cmd.exe. The above example is only for illustration.
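For anyone who wants to reproduce this outside cmd.exe, here is a minimal Python sketch that compares wall-clock time with process CPU time for the same delete issued from two different working directories. The helper names (measure, make_small_files, timed_delete), the file counts, and the use of a temporary directory are my own placeholders; to test the junction case, base would have to point to a directory below the junction. I am also assuming that shutil.rmtree is affected in the same way as rmdir, which I have not verified.

```python
import os
import shutil
import tempfile
import time


def measure(func):
    """Run func() and return (wall_seconds, cpu_seconds) for this process."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    return time.perf_counter() - wall_start, time.process_time() - cpu_start


def make_small_files(root, n_dirs=10, files_per_dir=100, size=2048):
    """Create n_dirs subdirectories under root, each holding small files."""
    payload = b"x" * size
    for d in range(n_dirs):
        sub = os.path.join(root, "col%02d" % d)
        os.makedirs(sub)
        for f in range(files_per_dir):
            with open(os.path.join(sub, "%05d.bin" % f), "wb") as fh:
                fh.write(payload)


def timed_delete(workdir, relative_target):
    """chdir into workdir, delete relative_target, return (wall, cpu)."""
    previous = os.getcwd()
    os.chdir(workdir)
    try:
        return measure(lambda: shutil.rmtree(relative_target))
    finally:
        os.chdir(previous)  # always restore the original working directory


if __name__ == "__main__":
    # Placeholder location: replace with a directory below the junction.
    base = tempfile.mkdtemp()
    parent = os.path.join(base, "abc", "def")
    sibling = os.path.join(parent, "xyz")
    os.makedirs(sibling)
    target = os.path.join(parent, "mydata.bcolz")

    cases = [
        (parent, "mydata.bcolz"),                           # target below cwd
        (sibling, os.path.join(os.pardir, "mydata.bcolz")),  # target outside cwd
    ]
    for workdir, rel in cases:
        make_small_files(target)  # recreate the data before each run
        wall, cpu = timed_delete(workdir, rel)
        print("%-20s wall=%.3fs cpu=%.3fs" % (rel, wall, cpu))

    shutil.rmtree(base)
```

If the CPU time is close to the wall time, the run is CPU bound; if it is only a small fraction of the wall time, the process is mostly waiting on the disk.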
Any idea what is going on and how to fix it?