Disk space usage in large folders with over one million files


I have an NTFS-compressed folder containing one to two million small (~25 KB) text files.

Recently, I noticed that far more space was being consumed than would be expected. Digging deeper revealed that right-clicking the containing folder and viewing its properties frees up space, to the tune of about 7 GB. The freeing starts when I open the properties window and stops when it has finished counting the files and size of the folder. I can actually watch my disk space free up in real time as the properties window adds up all the files in the folder.

This came as a pretty odd surprise. Can anyone explain this behavior or suggest how to avoid it?

The box runs Windows 8.1 Pro, with an SSD and no disk encryption.

Folder attributes:

Archive bit: Off  
Indexing: Off   
Compression: On  
Encryption: Off

user1393477

Posted 2015-02-04T16:15:57.100

Reputation: 103

Did you boot with Hiren's Boot CD in Windows live mode and check the size of this folder? That way you can be sure whether it comes from Windows 8 or from its compression. – valko – 2015-02-04T16:31:10.190

Two notes: 1. The apparent freeing of space may be an artifact of estimated vs. actual measurement rather than real, as @valko states. 2. When formatting a disk, you can specify the allocation unit size, which affects disk usage for small files in particular. If you are concerned with disk space, the allocation unit size should be smaller than or equal to the prospective file size. – DrMoishe Pippik – 2015-02-04T18:12:59.407

Answers

1

See this page. It's a balance between disk performance and drive space; you cannot have both, and one will always suffer.

NTFS Optimization

If you investigate your storage needs, you can tune some of the global NTFS parameters to achieve a significant increase in disk performance. Other techniques, such as disk defragmentation, can help as well.

Several factors affect NTFS performance (drive type, rpm ... are not covered here): cluster size, the location and fragmentation of the Master File Table (MFT) and the paging file, NTFS volume compression, and the NTFS volume source (created fresh or converted from an existing FAT volume).

Define Cluster Size Properly

A cluster is an allocation unit. If you create a file of, say, 1 byte, at least one cluster is allocated on a FAT file system. On NTFS, if the file is small enough, it can be stored in the MFT record itself without using additional clusters.

When a file grows beyond the cluster boundary, another cluster is allocated. This means that the bigger the cluster size, the more disk space is wasted; performance, however, is better.
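To make the trade-off concrete, here is a minimal Python sketch of how much space a single file occupies when it is stored in whole clusters. The function names are illustrative, and the model is a simplifying assumption: it ignores MFT-resident files and NTFS compression, both of which change the real on-disk figure.

```python
def allocated_size(file_size: int, cluster_size: int) -> int:
    """Bytes occupied on disk when a file is stored in whole clusters.

    Assumption: no MFT-resident storage and no NTFS compression.
    """
    if file_size <= 0:
        return 0
    # Round the file size up to the next multiple of the cluster size.
    clusters = -(-file_size // cluster_size)
    return clusters * cluster_size


def slack(file_size: int, cluster_size: int) -> int:
    """Bytes allocated to the file but not used by its data."""
    return allocated_size(file_size, cluster_size) - file_size
```

For a 25 KB (25,600-byte) file, slack(25600, 4096) gives 3,072 bytes, while slack(25600, 65536) gives 39,936 bytes, so the waste per file grows quickly with the cluster size.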

The following table shows the default values that Windows NT/2000/XP uses for NTFS formatting:

Drive size (logical volume)     Cluster size          Sectors per cluster

512 MB or less                  512 bytes             1
513 MB - 1,024 MB (1 GB)        1,024 bytes (1 KB)    2
1,025 MB - 2,048 MB (2 GB)      2,048 bytes (2 KB)    4
2,049 MB and larger             4,096 bytes (4 KB)    8
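The table can be read as a simple lookup. The following Python sketch, with hypothetical function names, returns the default cluster size listed above for a given volume size. It only encodes this table and does not query Windows; the 512-byte sector size is the one the table's "Sectors" column implies.

```python
# (upper volume limit in MB, default cluster size in bytes), taken from the table.
DEFAULT_NTFS_CLUSTERS = [
    (512, 512),
    (1024, 1024),
    (2048, 2048),
]

SECTOR_SIZE = 512  # bytes per sector, as implied by the table


def default_cluster_size(volume_mb: int) -> int:
    """Default NTFS cluster size in bytes for a volume of the given size."""
    for limit_mb, cluster_bytes in DEFAULT_NTFS_CLUSTERS:
        if volume_mb <= limit_mb:
            return cluster_bytes
    return 4096  # 2,049 MB and larger


def sectors_per_cluster(cluster_bytes: int) -> int:
    """Number of 512-byte sectors that make up one cluster."""
    return cluster_bytes // SECTOR_SIZE
```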

However, when you format the partition manually, you can specify a cluster size of 512 bytes, 1 KB, 2 KB, 4 KB, 8 KB, 16 KB, 32 KB, or 64 KB in the format dialog box or as a parameter to the command-line FORMAT utility.

What does this give us? Determine the average file size and format the partition accordingly. How do you determine it? The simplest (but rough) way is to divide the total disk usage in kilobytes by the number of files on the drive.
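The average file size is the total number of bytes divided by the file count. A rough Python sketch of that measurement follows; the function name is illustrative, and note the assumption that os.path.getsize reports the logical file size, not the compressed or cluster-rounded size on disk.

```python
import os


def average_file_size(root: str) -> float:
    """Average logical file size in bytes for all files under root."""
    total_bytes = 0
    file_count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total_bytes += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # Skip files that vanish or cannot be read during the walk.
                continue
            file_count += 1
    return total_bytes / file_count if file_count else 0.0
```

Dividing the result by 1024 gives kilobytes, which can then be compared with the candidate cluster sizes. On a folder with millions of files the walk itself may take a while.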

Another idea is to estimate the approximate data size in advance, before formatting the hard drive. If you are going to store multimedia files, which are usually huge, make the cluster size bigger to increase performance.

If you plan to store small web pages or text documents, make the cluster size smaller so that you do not lose a lot of disk space. Think!

Moab

Posted 2015-02-04T16:15:57.100

Reputation: 54 203