
I have a centralized folder location on a network drive (traditional hard disk) that is shared by a few web services running on different application servers. The services continually process incoming files via HTTP requests and write them to this location.

Every request gets its own subfolder with a unique name. Once all the files for a particular request are saved, the service that saved them will notify another internal service, which will read the files from that request folder and carry out further tasks.

For example,

If D:/MyNetworkFolder/ is the parent directory, and ServiceA is processing Request1 while ServiceB is processing Request2, both services will be saving the incoming files of their requests (total size up to 2 GB each) to D:/MyNetworkFolder/Request1 and D:/MyNetworkFolder/Request2 respectively. Once all the files for a request are saved, another service will read the files from D:/MyNetworkFolder/RequestNumber and carry out its tasks.
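For illustration, here is a minimal C# sketch of the write side under this layout. The folder names match the example above; SaveRequestAsync and NotifyProcessor are hypothetical names, and the notification could be an HTTP call or a queue message.

    // Sketch only: saves a request's files into its own subfolder, then signals.
    using System;
    using System.IO;
    using System.Threading.Tasks;

    public static class RequestWriter
    {
        private const string ParentDir = @"D:\MyNetworkFolder";

        public static async Task SaveRequestAsync(string requestId, Stream[] incomingFiles)
        {
            // Each request gets its own uniquely named subfolder, e.g. "Request1".
            string requestDir = Path.Combine(ParentDir, requestId);
            Directory.CreateDirectory(requestDir);

            for (int i = 0; i < incomingFiles.Length; i++)
            {
                string target = Path.Combine(requestDir, $"file_{i}.dat");
                using (var output = File.Create(target))
                {
                    await incomingFiles[i].CopyToAsync(output);
                }
            }

            // Once every file is on disk, tell the downstream service to pick it up.
            NotifyProcessor(requestDir);
        }

        private static void NotifyProcessor(string requestDir)
        {
            // Hypothetical hook: e.g. an HTTP request or message-queue publish
            // to the internal service that carries out the further tasks.
        }
    }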

So, during peak hours, there will always be a set of services writing new files to the network folder and another set of services reading the saved files from it, and possibly another service deleting files that have been completely processed.

Is this type of parallel file processing possible? Would it affect the application's I/O performance or the hard disk's health, given that multiple services are trying to read and write under the same parent location at the same time? The other option we have is to give each service its own physical network drive, or to consider using SSDs.

All servers are running on Windows Server 2008 and above and the web services are written using C# and .NET.


3 Answers


Is this type of parallel file processing possible? Would it affect the application's I/O performance or the hard disk's health because multiple services are trying to read/write from the same Parent location at the same time?

You've essentially described how shared network folders have worked since the dawn of... shared network folders. Performance will depend on your infrastructure, but there is nothing inherently harmful to performance in doing this.

joeqwerty

Is it possible for multiple web services to write to a centralized network directory/folder location at the same time?

In short, yes.

In practice, you will need to benchmark your applications and storage, and size them accordingly.
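For example, a crude benchmark sketch along these lines will tell you more than guesswork; the file count and size are arbitrary placeholders, so match them to your real request profile.

    // Writes dummy files to the share and reports rough write throughput.
    using System;
    using System.Diagnostics;
    using System.IO;

    class ShareBenchmark
    {
        static void Main()
        {
            const string targetDir = @"D:\MyNetworkFolder\_benchmark";
            const int fileCount = 20;
            const int fileSizeBytes = 100 * 1024 * 1024; // 100 MB per file

            Directory.CreateDirectory(targetDir);
            var payload = new byte[fileSizeBytes];
            new Random().NextBytes(payload);

            var sw = Stopwatch.StartNew();
            for (int i = 0; i < fileCount; i++)
            {
                File.WriteAllBytes(Path.Combine(targetDir, "bench_" + i + ".bin"), payload);
            }
            sw.Stop();

            double totalMb = (double)fileCount * fileSizeBytes / (1024 * 1024);
            Console.WriteLine("Wrote {0} MB in {1:F1} s ({2:F1} MB/s)",
                totalMb, sw.Elapsed.TotalSeconds, totalMb / sw.Elapsed.TotalSeconds);
        }
    }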

You might also want to consider:

  • A large number of files or directories in a single directory can hurt performance. How large a number becomes problematic depends on your setup; again, benchmark. (A couple of thousand is no problem; millions are generally a bad idea.)
  • You need to ensure that unique file and directory names are generated and used.
  • If the file system is your queue, you need to prevent processing of files that are still in a previous stage (i.e. no further processing of files that haven't finished uploading yet, no deleting files before processing is completed, etc.). When your files are on a single file system, you can achieve that by renaming the file/directory at the beginning and end of each stage, which is atomic and as good as instantaneous (see the sketch below). If you instead needed to copy or move those 2 GB files to different file systems, that would take far more time and I/O.
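A minimal sketch of that rename-as-state-marker idea, assuming a ".uploading"/".ready" suffix convention (the suffixes are purely illustrative):

    using System.IO;

    public static class StageMarker
    {
        // Called by the uploading service before it starts writing files.
        public static string BeginUpload(string parentDir, string requestId)
        {
            string dir = Path.Combine(parentDir, requestId + ".uploading");
            Directory.CreateDirectory(dir);
            return dir;
        }

        // Called once all files are written. Within one volume, Directory.Move
        // is a rename, so readers see the ".ready" folder all at once or not at all.
        public static string MarkReady(string uploadingDir)
        {
            string readyDir = uploadingDir.Replace(".uploading", ".ready");
            Directory.Move(uploadingDir, readyDir);
            return readyDir;
        }

        // The downstream service only ever enumerates ".ready" folders.
        public static string[] FindReadyRequests(string parentDir)
        {
            return Directory.GetDirectories(parentDir, "*.ready");
        }
    }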

On a side note, in 2016 I wouldn't build a new custom application with Windows Server 2008 as the target platform...

HBruijn

This is all perfectly fine, possible, and standard. File servers are designed to deal with several clients all requesting reads and writes at once.

As for performance and health, both of these come down to metrics...

Performance: You define acceptable performance metrics for the file server, for the applications, or for both, and you use performance monitoring tools to make sure those metrics are met.
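For example, a small C# sketch polling the standard Windows PhysicalDisk counters; verify the category, counter, and instance names on your own servers before relying on them.

    using System;
    using System.Diagnostics;
    using System.Threading;

    class DiskMetrics
    {
        static void Main()
        {
            using (var queue = new PerformanceCounter(
                "PhysicalDisk", "Avg. Disk Queue Length", "_Total"))
            using (var writes = new PerformanceCounter(
                "PhysicalDisk", "Disk Write Bytes/sec", "_Total"))
            {
                // The first sample of a rate counter is always 0, so prime them.
                queue.NextValue();
                writes.NextValue();

                while (true)
                {
                    Thread.Sleep(5000);
                    Console.WriteLine("Queue: {0:F2}  Writes: {1:F1} MB/s",
                        queue.NextValue(), writes.NextValue() / (1024 * 1024));
                }
            }
        }
    }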

Health: If you buy reasonable-quality components, you should be able to achieve decent uptime. Past a certain point you should consider high-availability solutions, as there are no guarantees that components will never fail, and all operating systems need downtime for maintenance after all.

Rob Moir