I'm running a workstation with dual Xeon X5690s (12 physical / 24 logical cores), 192 GB of RAM (i.e., maxed out), Windows 7 64-bit, 5 slots for adapter cards, and 1 TB of internal storage, with 5 more internal bays available.
I have an app that creates data files totaling about 88 TB. These are written once every 14 months; the rest of the time the app only needs to read them, and more than 95% of the reads are sequential reads of huge chunks of data. I have some control over how big the individual files are, but ideally each would be between 5 and 8 TB.
The app will be reading from only one drive at a time, and the nature of the data is such that if (when) a drive dies I can restore the data to a new disk from tape.
While it would be nice to be able to use the fastest drive/controllers available, at this point size matters more than speed.
After doing a lot of reading, I am leaning toward buying a bunch of cheap 2 TB drives and putting them into a bunch of cheap enclosures. All of this is going into my home office, so I need to avoid the raised-floor/refrigerated approach.
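Here's my back-of-envelope math on that plan (nominal capacities only; real formatted capacity runs a few percent lower), which also shows why I'm asking the RAID question below: a single 5-8 TB file can't fit on one 2 TB drive.

```python
# Rough drive-count arithmetic for the proposed 2 TB-drive build.
# All figures are nominal assumptions, not measured capacities.
TOTAL_TB = 88      # total data set size
DRIVE_TB = 2       # capacity of one cheap drive
MAX_FILE_TB = 8    # upper end of my preferred file size

drives_needed = -(-TOTAL_TB // DRIVE_TB)       # ceiling division -> 44
drives_per_file = -(-MAX_FILE_TB // DRIVE_TB)  # ceiling division -> 4

print(f"{drives_needed} drives to hold {TOTAL_TB} TB")
print(f"an {MAX_FILE_TB} TB file spans at least {drives_per_file} drives, "
      "so files either get split or sit on a spanned/RAID volume")
```

So I'm looking at roughly 44 drives minimum, before spares, and some form of spanning (or splitting the files smaller) regardless of which enclosure route I take.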
My questions:
Is the cheap drive/enclosure solution the best one for this situation?
Given the nature of the app and the way the data is used, does RAID make sense? If so, which one?
For huge sequential reads, would USB 3.0 and eSATA be a wash performance-wise?
For each slot available on the workstation, can I hook up an enclosure that can hold multiple drives? Or is it one controller per drive?
If I can have multiple drives on one controller, am I essentially splitting the bandwidth (throughput)? For example, if I have a 12 bay enclosure, is the throughput of the controller reduced by a factor of 12?
Are there any Windows 7 volume/drive/capacity limits I should be aware of?
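To make the bandwidth-splitting question concrete, here's my rough arithmetic. The link and drive speeds are assumptions I pulled from typical published figures, not measurements, so please correct them if they're off:

```python
# Sanity check of shared-link bandwidth for a multi-drive enclosure.
# Assumed usable throughput after protocol overhead (not measured):
LINKS_MB_S = {
    "eSATA (3 Gb/s)": 300,    # ~SATA II effective payload rate
    "USB 3.0 (5 Gb/s)": 400,  # typical real-world ceiling
}
DRIVE_SEQ_MB_S = 150          # assumed sequential rate of a cheap 2 TB drive

def per_drive_share(link_mb_s: float, active_drives: int) -> float:
    """Bandwidth each drive gets if all active drives stream at once."""
    return link_mb_s / active_drives

for name, link in LINKS_MB_S.items():
    solo = min(DRIVE_SEQ_MB_S, per_drive_share(link, 1))
    shared = min(DRIVE_SEQ_MB_S, per_drive_share(link, 12))
    print(f"{name}: 1 active drive -> {solo:.0f} MB/s, "
          f"12 active drives -> {shared:.0f} MB/s each")
```

If I understand this right, since my app reads from only one drive at a time, the 12-way split would only bite if multiple drives were streaming concurrently; with a single active drive, the drive itself is the bottleneck, not the link. Is that the correct way to think about it?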