10

We have an application that is planning to store around 1.1 TB of XML files that average 8.5 KB in size.

These represent a rolling 18 months of data, with around 200,000 new files being created every day.

Each file will be written only once, and then has a 3% chance of being read a small number (<10) of times over the following 18 months.

What NTFS options are open to us that will help with performance?

Current ones on our list are:

Edit

Regarding fragmentation: We are planning to use a 2 KB cluster size for disk space usage efficiency. Each file will be written only once (i.e. no file edits). Files will be deleted after 18 months on a day-by-day basis.

Therefore we don't believe that fragmentation will be a significant issue.

Richard Ev
  • 240
  • 1
  • 3
  • 14

6 Answers

11

Disable last access time stamp and reserve space for the MFT.
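Both can be done with fsutil from an elevated command prompt; a minimal sketch (a reboot afterwards is recommended):

rem stop NTFS updating the last-access timestamp on every read
fsutil behavior set disablelastaccess 1

rem reserve a larger MFT zone than the default (default value is 1)
fsutil behavior set mftzone 2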

Janis Veinbergs
  • 1,545
  • 4
  • 23
  • 34
alexandrul
  • 1,435
  • 2
  • 19
  • 25
7

I would also add:

Turn off disk defragmentation. Change the block (cluster) size to 16 KB so each file is written into a single block (a sketch of both changes is at the end of this answer).

Rationale for this:

You want to write 1.7 GB of data a day, in 200,000 files. Assuming these files are written over a 24-hour day, this means around 2-3 files a second. That does not seem to be a significant load for a single SATA disk, so my guess is that you have other problems as well as disk performance.

(i.e. do you have enough memory? or are you paging memory to disk as well?)

However

  1. Windows by default attempts to defragment NTFS file systems in the background via a scheduled task. Disk defragmentation will kill performance whilst you are defragmenting the disk. Since performance already seems to be an issue, this will only make matters worse for you.

  2. There is a balance between using small cluster sizes and IO performance when writing files. The files and the file system's allocation metadata will not be in the same place on the disk, so having to allocate blocks as you write files will cause the disk head to constantly move around. Using a cluster size capable of storing 95% of your files in one cluster each will improve your IO write performance.

  3. As other people have pointed out, using a tiny cluster size of 2k will cause fragmentation over time. Think of it like this: during the first 18 months you will be writing files onto a clean, empty disk, but the OS doesn't know that, once closed, no more data will be added to each file, so it leaves some blocks available at the end of each file in case that file is extended later. Long before you fill the disk, you will find that the only free space is in gaps between other files. Not only that, but when it is selecting a gap for your file, the OS does not know whether you are writing a 5-block file or a 2-block file, so it can't make a good choice about where to save your file.

At the end of the day, engineering is about handling conflicting needs and choosing the lowest-cost solution that balances them. My guess is that buying a larger hard drive is probably cheaper than buying faster hard drives.
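As a sketch of both suggestions above (assuming the data lives on a dedicated D: volume and a Vista/Server 2008-era Windows; the drive letter is just a placeholder, and reformatting erases the volume):

rem reformat the data volume with 16 KB clusters
format D: /FS:NTFS /A:16K /Q

rem disable the built-in background defragmentation task
schtasks /Change /TN "\Microsoft\Windows\Defrag\ScheduledDefrag" /Disable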

Michael Shaw
  • 663
  • 4
  • 9
  • We were planning to use a block size of 2k for disk space usage efficiency reasons – Richard Ev Jul 28 '09 at 11:26
  • 4
    Ahh, the joys of engineering judgement. Each block used by a file needs to be removed from free space and allocated to a file. By using small block sizes you improve disk space efficiency (less wasted space in partially used blocks), but you reduce I/O efficiency, as you increase the amount of block allocation that occurs. – Michael Shaw Jul 28 '09 at 12:16
  • 2
    What's more, by going for space efficiency like that you fragment a lot. By using a larger block size like ptolemy suggested, where each file fits in one block, fragmentation should be a very minor problem. – sysadmin1138 Jul 28 '09 at 14:59
  • 1
    In my experience, compression should be turned on - especially if you're not expecting a high read rate. As to the number of files in a directory: I've personally had issues exceeding more than a few thousand if I want to do much reading; for write-only, it hasn't seemed to matter much – warren Sep 07 '09 at 01:48
2

To elaborate on my comment on Ptolemy's answer...

By setting your block size so that the very large majority of files each fit within one block, you do get I/O efficiencies. With a 2K block size and an 8.5K average file size, an average-sized file spans 5 blocks, so roughly half of your I/O operations will be to 5 blocks or more. With a 16K block size, the very large majority of writes would be to a single block, which would also make those 3% of reads much more efficient when they happen.
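If you want to confirm what a volume was actually formatted with, fsutil reports it (D: is just an assumed drive letter); look for "Bytes Per Cluster" in the output:

fsutil fsinfo ntfsinfo D: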

One thing to consider is backup I/O. If you are backing up the data, every file will get read at least once, and its directory entry will be trawled on every backup pass. If you are going to back this up, please consider backup I/O in your designs.

Caveats: if your underlying storage system is one that already does some storage virtualization (such as an HP EVA disk array, or other arrays of that class), then this doesn't matter so much. Fragmentation of this type will not be noticed, as the data already physically exists in a highly fragmented form on the actual drives. In that case, the 2K block size is just fine and won't affect performance as much. There will still be performance gains from selecting a block size large enough to hold the majority of your expected file sizes, but the magnitude won't be as significant.

sysadmin1138
  • 131,083
  • 18
  • 173
  • 296
  • Good point about filesystem fragmentation when there's abstraction from the storage hardware. This is also true when dealing with virtual disks sitting on top of a disk array. Potentially, NTFS sits on top of datastore filesystem (e.g., VMFS) which in turn sits on top of however the disk array presents itself. – damorg Jul 28 '09 at 18:51
2

Late for this party, but might benefit others, so...

Re. cluster size: first and most important, you'd need to look at the distribution of file sizes, so you can optimize for both low fragmentation and low disk space waste and size the clusters close to where most files fall, not the overall average. For example, if most files fall near 2k, a 2k cluster size would be optimal; if near 4k, then a 4k cluster, and so forth. If, on the other hand, file sizes are evenly/randomly distributed, the best you can do is use a cluster size close to the average file size, or store files in partitions with different cluster sizes for different file sizes, like some larger systems do, but you'd need software/fs support for that.
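As a rough sketch of such a survey (the D:\XmlStore path and the 2048-byte candidate cluster size are just placeholders), this batch script buckets each file by the number of clusters it would occupy:

@echo off
setlocal enabledelayedexpansion
set CLUSTER=2048
for /l %%i in (1,1,8) do set count%%i=0
set countbig=0
rem walk the tree and count how many clusters each file would need at this cluster size
for /r "D:\XmlStore" %%f in (*.xml) do (
    set /a "clusters=(%%~zf + %CLUSTER% - 1) / %CLUSTER%"
    if !clusters! leq 8 (set /a count!clusters!+=1) else (set /a countbig+=1)
)
for /l %%i in (1,1,8) do echo files needing %%i clusters: !count%%i!
echo files needing more than 8 clusters: !countbig!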

user268372
  • 21
  • 1
1

You may also want to look into RAID for your design. There are various forms of RAID, but you would do well to look into RAID 5, which would allow you to write files to different drives at the same time while the data still lives on one volume... This gives you several benefits:

1) You are creating a backup as you go. This allows you to have a drive crash and still recover. RAID 1 would create a mirrored copy, but RAID 5 involves striping with parity - RAID 1 would only give you the benefit of that backup. Though RAID 5 is more involved and you would need more drives to set it up (a minimum of 3, versus the 2 needed for RAID 1), you get other benefits as well.

2) The striping also increases performance. Because you can be writing multiple files at once (an estimated 3 per second, above), the striping allows the files to be "distributed" across the disks, with each disk taking only part of the burden. The more disks involved, the lighter the burden per disk, but there is a point where you would reach a limit of performance versus cost...

3) If you back up the data, the backup can take place without degrading write performance - depending on the size of the disks' caches, of course, and the form of backup... but for the most part, you wouldn't need to shut anything down to run the backups.

Also, the way you have the system set up, it even sounds like backups would be easier for you - you only need to back up one day's data at a time, as the files aren't modified later. You could even write a batch job that compresses the data if you were concerned about the space taken up by the files... XML is mostly text, so the compression ratios would be high, and decompression would rarely be needed, on only 3% of the files... so you could include compression on the drive without any fear of decompression time. This would also reduce the block sizes needed and could further increase the efficiency of the system, with the CPU compressing the data rather than just being the go-between for data. (i.e. if all you did was store data, it'd be a waste of that nice CPU in that system... but if it were using the "wasted" clock cycles to compress the data and distribute it to the drives more efficiently, all the better!)
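For instance (the path and one-folder-per-day layout are just assumptions), NTFS compression could be switched on for each day's folder once it is complete using compact.exe; note that NTFS compression only works on volumes with cluster sizes of 4 KB or smaller, so it fits the planned 2K clusters but not a 16K cluster size:

rem compress yesterday's completed folder in place (quietly, continuing past any errors)
compact /c /s:"D:\XmlStore\2009-07-27" /i /q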

With compression, your 2K blocks would probably hold your 8.5K files without a problem. Add striping and RAID redundancy, along with a hefty CPU and enough memory that you aren't paging to disk, and you're well on your way to a good system for what you are looking to do.

  • 4
    In regard to #1 - RAID is not a backup. Ever. It provides redundancy, which enhances availability. – MDMarra Sep 06 '09 at 23:10
0

This is a simple script to increase NTFS performance by turning off some NTFS features that are not much used nowadays (or not so important).

https://gist.github.com/p3x-robot/185e5c1b699d726bcce1bb51d5ca82d8

rem execute as an Administrator

rem based on http://www.windowsdevcenter.com/pub/a/windows/2005/02/08/NTFS_Hacks.html
rem based on https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-2000-server/cc938961(v=technet.10)

rem http://archive.oreilly.com/cs/user/view/cs_msg/95219 (some installers need 8dot3 filenames)
rem disable 8dot3 filenames
rem Warning: Some applications such as incremental backup utilities rely on this update information and do not function correctly without it.
fsutil behavior set disable8dot3 1

rem increase the NTFS MFT zone size
fsutil behavior set mftzone 2

rem disable last access time on all files
fsutil behavior set disablelastaccess 1
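
rem (optional) confirm the new settings took effect
fsutil behavior query disable8dot3
fsutil behavior query mftzone
fsutil behavior query disablelastaccess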

echo now you can reboot
Patrik Laszlo
  • 175
  • 1
  • 10
  • I tested with an npm system of about 200k files: one build took 30 seconds on SSD, 20 seconds on NVMe, and after tuning the speed was down to 15 seconds (on Linux the same build takes 3 seconds...) – Patrik Laszlo Apr 17 '18 at 15:52