I am looking at this setup:
- Windows Server 2012
- 1 TB NTFS drive, 4 KB clusters, ~90% full
- ~10M files stored in 10,000 folders = ~1,000 files/folder
- Files are mostly quite small (< 50 KB)
- Virtual drive hosted on disk array
When an application accesses files stored in random folders, it takes 60-100 ms to read each file. According to a test tool, the delay occurs when opening the file; reading the data afterwards takes only a fraction of that time.
In summary, this means that reading 50 files can easily take 3-4 seconds, which is much more than expected. Writing is done in batches, so write performance is not an issue here.
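For reference, the measurement looks roughly like the following sketch (not the actual test tool; ROOT and the sample size are placeholders), which times the open and the read separately:

```python
import os
import random
import time

ROOT = r"D:\data"   # placeholder for the 1 TB volume
SAMPLE = 50         # number of randomly chosen files to time

# Build a list of candidate files (capped, since walking 10M files is slow)
files = []
for dirpath, _dirs, names in os.walk(ROOT):
    files.extend(os.path.join(dirpath, n) for n in names)
    if len(files) >= 100_000:
        break

open_ms, read_ms = [], []
for path in random.sample(files, SAMPLE):
    t0 = time.perf_counter()
    f = open(path, "rb")        # CreateFile: this is where the 60-100 ms shows up
    t1 = time.perf_counter()
    f.read()                    # ReadFile of a small (< 50 KB) file: fast
    f.close()
    t2 = time.perf_counter()
    open_ms.append((t1 - t0) * 1000)
    read_ms.append((t2 - t1) * 1000)

print("avg open %.1f ms, avg read %.1f ms"
      % (sum(open_ms) / len(open_ms), sum(read_ms) / len(read_ms)))
```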
I already followed advice on SO and SF to arrive at these numbers.
- Using folders to reduce number of files per folder (Storing a million images in the filesystem)
- Running contig to defragment folders and files (https://stackoverflow.com/a/291292/1059776)
- 8.3 names and last access time disabled (Configuring NTFS file system for performance); a registry check for these two settings is sketched below
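A minimal sketch of that check, assuming the documented value names under HKLM\SYSTEM\CurrentControlSet\Control\FileSystem:

```python
import winreg

# Read back the two NTFS tuning values; 1 means the feature is disabled
# (for 8.3 names, values 2/3 mean per-volume configuration).
KEY = r"SYSTEM\CurrentControlSet\Control\FileSystem"
with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY) as k:
    dis_8dot3, _ = winreg.QueryValueEx(k, "NtfsDisable8dot3NameCreation")
    dis_atime, _ = winreg.QueryValueEx(k, "NtfsDisableLastAccessUpdate")

print("NtfsDisable8dot3NameCreation:", dis_8dot3)
print("NtfsDisableLastAccessUpdate:", dis_atime)
```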
What to do about the read times?
- Consider 60-100 ms per file to be ok (it isn't, is it?)
- Any ideas how the setup can be improved?
- Are there low-level monitoring tools that can tell what exactly the time is spent on?
UPDATE
As mentioned in the comments, the system runs Symantec Endpoint Protection. However, disabling it does not change the read times.
PerfMon measures 10-20 ms per read. This would mean that any file read takes ~6 physical I/O read operations (60-100 ms per file divided by 10-20 ms per read), right? Would this be MFT lookups and ACL checks?
The MFT has a size of ~8.5 GB, which is more than the machine's main memory.
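Making that arithmetic explicit (the 1 KB MFT record size is the NTFS default and an assumption about this volume):

```python
# Back-of-the-envelope numbers behind the update above.
per_file_ms = (60, 100)   # observed latency per file access
per_io_ms = (10, 20)      # PerfMon avg. disk sec/read, in ms

# Implied number of physical reads per file access
print("implied reads per file: %.0f-%.0f"
      % (per_file_ms[0] / per_io_ms[1], per_file_ms[1] / per_io_ms[0]))  # 3-10

# MFT size estimate: ~10M files x 1 KB per MFT record (default record size)
files = 10_000_000
print("estimated MFT size: ~%.1f GB" % (files * 1024 / 1024**3))  # ~9.5 GB, near the observed 8.5 GB
```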