Can file system performance decrease if there is a very large number of files in a single directory (NTFS)?

6

4

I have heard that file system performance (on an NTFS partition) can start to decrease if the number of files in a single directory becomes very large (e.g. >= 10,000,000 items). Is it true?

If true, what is the recommended maximum number of files in a single directory?

EDIT:

About performance: I'm thinking about file operations inside that folder (read, write, create, delete) that could possibly get slow.

tigrou

Posted 2013-07-25T07:01:09.203

Reputation: 759

Yes. MSDN advises not to keep more than 20k files in a single directory. (Windows Vista, 2 GB RAM) - I have noticed that when it goes over 40k (Windows 7, 4 GB RAM) it grinds to a halt. Everything just hangs and stops working. But having 100k subdirectories does not affect speed at all :) – Piotr Kula – 2013-07-25T08:29:42.913

Answers

7

I'll answer my own question: yes, it's definitely slower.

I wrote a C# console application that creates many empty files in a folder and then randomly accesses them. Here are the results:

10 files in a folder        : ~26,000 operations/sec
1,000,000 files in a folder : ~6,000 operations/sec

Here is the source code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

Directory.CreateDirectory(@"C:\test\");    // make sure the target folder exists

List<string> files = new List<string>();

Console.WriteLine("creating files...");
for (int i = 0; i < 1000 * 1000; i++)
{
    string filename = @"C:\test\" + Guid.NewGuid().ToString();
    File.Create(filename).Dispose();       // create an empty file and close the handle right away
    files.Add(filename);
}

Console.WriteLine("benchmark...");
Random r = new Random();
Stopwatch sw = new Stopwatch();
sw.Start();

// Read randomly chosen files for 5 seconds and count completed operations.
int count = 0;
while (sw.ElapsedMilliseconds < 5000)
{
    string filename = files[r.Next(files.Count)];
    string text = File.ReadAllText(filename);
    count++;
}
Console.WriteLine("{0} operations/sec", count / 5);    // 5000 ms elapsed, so divide by 5

tigrou

Posted 2013-07-25T07:01:09.203

Reputation: 759

+1 for the code. I found that as long as there were more than 1,000 files, the time was very similar - no difference between 1k and 300k. Under 1,000 files, it depended on the number of files. – wezten – 2017-02-28T13:57:57.617

To be useful, you need to compare to some alternative way to randomly store and access 1M files. E.g. make 1000 subfolders each containing 1000 files, then randomly access those 1M files. – ToolmakerSteve – 2019-03-29T06:11:40.837
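For illustration, here is a minimal sketch of the bucketed variant the comment above suggests, reusing the 5-second random-read loop from the answer; the C:\test2 path and the 1000 × 1000 split are assumptions for the example, not something the original post measured:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

List<string> files = new List<string>();

Console.WriteLine("creating 1000 subfolders with 1000 files each...");
for (int folder = 0; folder < 1000; folder++)
{
    string dir = Path.Combine(@"C:\test2", folder.ToString());
    Directory.CreateDirectory(dir);
    for (int i = 0; i < 1000; i++)
    {
        string filename = Path.Combine(dir, Guid.NewGuid().ToString());
        File.Create(filename).Dispose();   // empty file, handle closed immediately
        files.Add(filename);
    }
}

Console.WriteLine("benchmark...");
Random r = new Random();
Stopwatch sw = Stopwatch.StartNew();

// Same 5-second random-read loop as the flat-folder test above.
int count = 0;
while (sw.ElapsedMilliseconds < 5000)
{
    string text = File.ReadAllText(files[r.Next(files.Count)]);
    count++;
}
Console.WriteLine("{0} operations/sec", count / 5);

Comparing this figure with the flat-folder result would show how much of the slowdown comes from directory size rather than from the total number of files.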

2

If you read this, you should get a pretty good understanding of how NTFS indexes files and folders.

Locally it shouldn't be much of a hassle to index files and folders if you follow the guidelines in the link above, but it will need a lot of maintenance with that many files.
On a network it is another story. It will be slow - this is from my own experience at work, where we have folders with thousands of subfolders, and it takes some time to index them over a network.

Another thing that will probably help with that many files is to disable short-naming. This stops Windows from creating a second directory entry for each file following the 8.3 convention (the MS-DOS file-naming convention), and it decreases the time it takes to enumerate folders, because NTFS no longer has to look up the short names associated with the long names. To do so:

  • Go to Run in the Start menu
  • Type cmd, and when you see the Command Prompt, right-click it and select Run as administrator
  • At the command prompt, type fsutil behavior set disable8dot3 1 to disable short-naming
  • Reboot

If you want to enable it again, type fsutil behavior set disable8dot3 0.
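As a side note (not part of the original answer), you can check the current setting with fsutil behavior query disable8dot3 (1 means creation is disabled). On Windows 7 and later, fsutil can also remove short names that were already created, which is what the comment below refers to: fsutil 8dot3name strip /s /v C:\test, where C:\test is an example path, /s recurses into subdirectories, and /v lists what was changed.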

Jesper Jensen

Posted 2013-07-25T07:01:09.203

Reputation: 654

See StephenR's comments on this answer - if you already have many files, then after disabling 8.3 you also need to strip the existing 8.3 names to get the speed improvement.

– ToolmakerSteve – 2019-03-29T06:27:57.353

Not entirely true. Have you ever tried to access a folder with 80k files (say, a bad email folder on a server) without any tweaks? You can wait a day before it enumerates. – Piotr Kula – 2013-07-25T08:31:01.090

No, of course it's not true in all cases, but I still believe that if you do it right and maintain it regularly, you could have a working system. What do you mean by bad email folder? – Jesper Jensen – 2013-07-25T08:50:52.743

You clearly never had to deal with a mail server before :) You need to write in your answer that if it gets maintained well (about 80% of system admins don't do that) then there will be no problems. Besides, your answer does not really talk about read/write performance, or about how disabling 8dot3 will affect it. Neither are there hard facts that this helps. Sorry to be such a pain.. but your answer needs improvement. -1 till you do so. Let me know – Piotr Kula – 2013-07-25T08:57:12.470

I never said that I've dealt with mail servers, or that the above is from my own experience (except the network part) :). It is in my answer that it will need a lot of maintenance with that many files. But thanks for the criticism, and I'll try to improve my answer a bit. – Jesper Jensen – 2013-07-25T09:16:01.567