How to calculate the max write threads a hard drive can handle if X speed per thread is desired?

0

I'm trying to establish the maximum number of write threads a hard drive can handle. For example, if the desired speed is 20 KB/sec per thread, how can I test the max simultaneous writes before the drive throttles and becomes slower? Let's assume the OS, filesystem or the application is not a part of the bottleneck.

Each user is writing their own separate file.

I did read Achieve Maximum write speed on hard disk, posted by another user, but my question is different: that question is about how many files per second, while mine is about how many simultaneous writes at X KB/sec each.

I ran tests using HD Tune and CrystalDiskMark, but sadly I think they only cover single-threaded transfers, or else I don't know how to read the results and calculate from them.

Here's the result from CrystalDiskMark; I'm unsure whether it's helpful or not.

[screenshot: CrystalDiskMark results]

Question(s)

  • How can I test a hard drive and work out how many simultaneous disk writes it can handle while each write sustains a minimum speed of 100 KB/sec? Something roughly like the sketch below is what I have in mind.
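A minimal Python sketch of the kind of test I mean (the thread count, per-thread rate, block size and scratch directory are made-up values, and this is not a polished benchmark): each thread writes its own file, throttled to the target rate, then reports what it actually achieved. If the slowest thread ends up well below the target, the drive can't keep up with that many writers.

    import os
    import threading
    import time

    THREADS = 50            # made-up: number of simultaneous "users" to simulate
    TARGET_KBPS = 100       # desired per-thread write speed (KB/s)
    BLOCK_KB = 4            # write in small 4 KB blocks to mimic scattered writes
    DURATION = 30           # seconds to run the test
    TEST_DIR = "writetest"  # hypothetical scratch directory on the drive under test

    results = []            # achieved KB/s per thread (list.append is thread-safe in CPython)

    def writer(idx):
        block = os.urandom(BLOCK_KB * 1024)
        interval = BLOCK_KB / TARGET_KBPS          # seconds between blocks at the target rate
        written_kb = 0
        deadline = time.monotonic() + DURATION
        with open(os.path.join(TEST_DIR, f"file_{idx}.bin"), "wb") as f:
            while time.monotonic() < deadline:
                start = time.monotonic()
                f.write(block)
                f.flush()
                os.fsync(f.fileno())               # hit the disk, don't just fill the OS cache
                written_kb += BLOCK_KB
                spare = interval - (time.monotonic() - start)
                if spare > 0:
                    time.sleep(spare)              # throttle down to the target rate
        results.append(written_kb / DURATION)

    os.makedirs(TEST_DIR, exist_ok=True)
    threads = [threading.Thread(target=writer, args=(i,)) for i in range(THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    print(f"slowest thread: {min(results):.1f} KB/s, "
          f"average: {sum(results) / len(results):.1f} KB/s")

The idea would be to re-run it with increasing THREADS until the slowest thread drops below the target. The os.fsync() call deliberately bypasses the write cache, so this measures the drive itself rather than RAM, which also makes it pessimistic compared to a real, buffered upload workload.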

Simon Hayter

Posted 2019-01-30T17:38:30.370

Reputation: 242

Why not throw in an SSD and bypass the problem? SSDs don't significantly suffer from differences in sequential vs random reads and are much faster all round. (Meaning the answer becomes much closer to speed/users.) – davidgo – 2019-01-30T18:30:11.607

Answers

1

It depends entirely on whether you're doing sequential or random I/O, and how often you want / need to flush to disk...

Both 20 KB/s and 100 KB/s are negligible with today's hardware. From the CrystalDiskMark screenshot and your concern, I'd suspect you're dealing with a spinning disk... why not use an SSD?


max simultaneous writes before the drive throttles and becomes slower

It's not a matter of the drive throttling, but rather that the physical movement of the head takes time to complete. With random I/O this is exacerbated as the size of each written block shrinks, and the seek time between writes increases.
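As a rough back-of-the-envelope, using typical 7,200 RPM figures rather than numbers measured from your drive:

    ~9 ms average seek + ~4 ms rotational latency ≈ ~13 ms per scattered write
    1 / 0.013 s ≈ ~75 random writes per second
    ~75 writes/s x 4 KiB ≈ ~300 KiB/s of scattered 4 KiB writes

Sequential writes on the same drive can easily exceed 100 MB/s, which is why the access pattern matters far more than the 20 KB/s or 100 KB/s figures themselves.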

let's assume the OS, Filesystem or the Application is not a part of the bottleneck

Without knowing the state of the filesystem in terms of fragmentation and free space, you cannot assume this, and you certainly can't assume it over the life of a product or installation.


If you're suffering from performance issues, then you'll want to make use of buffered I/O - i.e: writing to a file actually collects data into a buffer, before writing a larger block to disk at once.

Writing 100 KB/s for a period of 10 seconds can be presented to the storage as any of the following (or wider):

  • a block of 1 KB every 10ms
  • a block of 10 KB every 100ms
  • a block of 100 KB every 1 second
  • a block of 1,000 KB every 10 seconds

Are we discussing the regular (red), or infrequent (green)? Each of the colors will "write" the same amount of data over the same timeframe.

[graph: write throughput with different block sizes]

Writing larger blocks at once will help with throughput and filesystem fragmentation, though there is a trade-off to consider.

  • Writing larger blocks, less frequently - will improve throughput, but requires more RAM, and in the event of power loss or crash, a larger portion of data will be lost
  • Writing smaller blocks, more frequently - will degrade throughput, but requires less RAM, and less data is held in volatile memory.

The filesystem or OS may impose rules about how frequently the file cache is written to disk, so you may need to manage this caching within the application... Start with using buffered I/O, and if that doesn't cut it, review the situation.
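As a very rough illustration of what application-level buffering might look like (a Python sketch with a made-up 4 MB threshold, not a drop-in solution):

    FLUSH_THRESHOLD = 4 * 1024 * 1024      # made-up: accumulate ~4 MB before touching the disk

    class BufferedUpload:
        """Collect small incoming chunks in RAM and write them out as large blocks."""

        def __init__(self, path):
            self.f = open(path, "wb")
            self.buf = bytearray()

        def feed(self, chunk):
            # Called for each small piece of the upload as it arrives (e.g. a 20 KB/s trickle).
            self.buf.extend(chunk)
            if len(self.buf) >= FLUSH_THRESHOLD:
                self._flush()

        def _flush(self):
            self.f.write(self.buf)         # one large, mostly sequential write
            self.buf.clear()

        def close(self):
            self._flush()                  # write out whatever is left in the buffer
            self.f.close()

In practice the language runtime and OS page cache may already be doing much of this for you; the point is simply that the disk sees a few large writes per upload rather than a continuous stream of tiny ones.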


let's pretend 1,000 users are uploading a 1 GB file each at 20 KB/sec

You're comfortable with users uploading a 1 GB file over ~14.5 hours (1,048,576 KB ÷ 20 KB/s ≈ 52,400 seconds)? With all of the issues that failures incur (i.e: re-uploading from the beginning).

Attie

Posted 2019-01-30T17:38:30.370

Reputation: 14 841

Sorry, random IO, new files per user. – Simon Hayter – 2019-01-30T17:56:31.130

Basically, users will be uploading files. They vary in size, but it would be deemed acceptable for them to upload at around 20 KB/sec. Obviously, a mechanical drive at 20 KB/sec per user with thousands of users would cause delays due to the disk head bouncing back and forth; it's that part I want to measure. So, ideally, I want to be able to estimate that this hard drive can handle, say, 250 users. I know it's not as easy as that, because then you've got the response time between the user and the disk, but I want to factor that into the estimate as well. – Simon Hayter – 2019-01-30T18:03:42.980

Isn't there a utility, either in PowerShell or a standalone application, that I can run which fires off 1,000 write threads, for example? – Simon Hayter – 2019-01-30T18:05:53.903

Unless you're dealing with a stream of data that needs to be captured in real time at 20 KB/sec, is this really an issue? Even so, I would expect your application / OS to cache a chunk of the upload, and write a large block at once (like the green in the graph). It may take many seconds to fill the cache and trigger a write to disk, depending on the configuration. – Attie – 2019-01-30T18:08:24.987

Yes, there probably is an application that can benchmark this, but I'm not convinced that is in any way useful information for your use case... – Attie – 2019-01-30T18:09:38.273

Well, let's pretend 1,000 users are uploading a 1 GB file each at 20 KB/sec. As far as I know, each byte gets queued, but these files don't go in order; the disk will surely go back and forth doing them bit by bit, no? Meaning they all get partially uploaded over time, all finishing at roughly the same point. – Simon Hayter – 2019-01-30T18:11:31.820

"the disk will surely go back and forth doing them bit by bit no?"... no... possibly 4KB by 4KB (or sector size), but ideally more like ~4-16MB by 4-16MB depending on available RAM. – Attie – 2019-01-30T18:13:04.150

I see, so you're saying most of the uploads will be written to RAM and then written in bigger chunks to the disk? – Simon Hayter – 2019-01-30T18:14:01.653

Correct, if you're looking to make this work, that's a preferred solution. Don't write many small blocks, write fewer large blocks... that also helps significantly with filesystem fragmentation. – Attie – 2019-01-30T18:15:39.983

Oh, so if the majority of the disk has files bigger than 1 GB, is it better to have a bigger block size, like 1 MB? For example, using exFAT and assuming I have lots of RAM? – Simon Hayter – 2019-01-30T18:22:47.277

That's a whole other question / topic... Typically no, don't change the filesystem's block size - just write more data at once, and this will be written to adjoining blocks on disk. – Attie – 2019-01-30T18:43:30.740