
// short update at the bottom

// another update near the bottom replies to a comment

So, at first I had this idea: find a virtual driver to set up and use software RAID on Windows. Result: failed, even with support from the developer.

The next idea came into my mind after watching a YouTube video about virtualization: put in a 2nd, rather cheap GPU for a Linux system that runs bare metal, and set up my Windows in a VM with my main GPU via passthrough. This way I could have used mdadm/lvm and let Linux do all that software RAID stuff. Result: failed, due to some weird issues with my motherboard not liking the 2nd GPU at all.

Then I read something about Windows Storage Spaces and that it is able to provide fault tolerance comparable to a software RAID6 (as far as I understand, it's done by filesystem shadow copies spread across the physical drives). So I gave it a try and got it working (although it required some manual lines in PowerShell, as the GUI doesn't expose some of the advanced settings).
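
For reference, here's a minimal sketch of the kind of PowerShell lines I mean - the pool and disk names are just placeholders from my test setup, and -PhysicalDiskRedundancy 2 is the setting the GUI doesn't expose:

# Collect all disks that are eligible for pooling
$disks = Get-PhysicalDisk -CanPool $true

# Create the pool on the local storage subsystem
New-StoragePool -FriendlyName "DataPool" -StorageSubSystemFriendlyName (Get-StorageSubSystem).FriendlyName -PhysicalDisks $disks

# Dual parity = RAID6-like, survives two simultaneous drive failures;
# the GUI only offers single parity, hence the manual step
New-VirtualDisk -StoragePoolFriendlyName "DataPool" -FriendlyName "DataSpace" -ResiliencySettingName Parity -PhysicalDiskRedundancy 2 -ProvisioningType Fixed -UseMaximumSize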

As this was just in a VM, the test performance was rather bad, but I noticed that data is written multiple times, which can sometimes end up with the drives being used rather unevenly. As an example: one of the virtual disks only had about 2GB written to it, whereas another drive had about 4GB written to it. So, whatever distribution algorithm is used (it doesn't look like round-robin, but more like most-available-physical-space-first), it's far from how I would expect a software RAID6 to behave.
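
For what it's worth, the uneven allocation is easy to watch with a line like this (pool name again being my placeholder from above):

# Show how much each physical disk in the pool actually has allocated
Get-StoragePool -FriendlyName "DataPool" | Get-PhysicalDisk | Select-Object FriendlyName, @{ n = 'AllocatedGB'; e = { [math]::Round($_.AllocatedSize / 1GB, 1) } }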

I also noticed it's rather wasteful with physical disk space. My test was using 8 disks with 50GB each. An mdadm software RAID6 results in just short of 300GB usable space ((8 - 2) x 50GB), but the Storage Spaces one gave only about 250GB - another 15% "penalty". OK, I guess that's all the overhead and such, but even from a software RAID I expected to get a bit better use out of my physical disk space.

I then tested what happens when I start to remove drives, and as I had set it up with -PhysicalDiskRedundancy 2, it was able to survive it and all test data were still available.
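
The check after pulling drives (and the way I understand a rebuild is kicked off after replacing one) was basically this - treat it as a sketch, with the placeholder names from above:

# Pool and space report a degraded health status, but the volume stays online
Get-StoragePool -FriendlyName "DataPool" | Select-Object FriendlyName, HealthStatus
Get-VirtualDisk -FriendlyName "DataSpace" | Select-Object FriendlyName, HealthStatus, OperationalStatus

# After swapping in a replacement: retire the dead disk and start the repair
Get-PhysicalDisk | Where-Object { $_.OperationalStatus -ne 'OK' } | Set-PhysicalDisk -Usage Retired
Repair-VirtualDisk -FriendlyName "DataSpace"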

So, overall it seems to fit my needs for a software RAID on Windows supporting RAID6-like fault tolerance to survive a double failure (that is: a 2nd drive failing while rebuilding the 1st failed one). About the performance: well, it's software RAID - and as I'm currently using fakeRAID (basically a driver-specific software RAID shadowed by the BIOS), there won't be much more system performance impact than I have right now.

What really made me think thrice about it: there are currently two major issues: a) it can't be mounted on a Linux system (I haven't yet tested if and how it can be mounted in a recovery environment), and b) on the current Win10 2004 there are a lot of issues that have already caused data loss, as reported by users on different forums.

Why am I asking this: the main "issue" is that I currently don't have the financial means to invest in new/better hardware. I only have to work with what I currently own. Hence I'm searching for a software solution. I tried WinBTRFS as it claims to support software RAID for its volumes, but I wasn't able to set it up correctly even with help from its developer. So, the base question boils down to: is using Storage Spaces a viable option if one can't afford hardware RAID or other solutions like virtualization (due to hardware incompatibility)? Sure, I have many of my "really important" data backed up on an external drive, but still: I would rather build a reliable system than go the "I believe nothing will happen" way.

// update

Just a small update about if and how you can access such a virtual disk via WinPE: I downloaded the current 2004 ADK and created a fresh WinPE image. As I had to use PowerShell to access the information, I just copied the instructions found in the ADK PE documentation. After that I created an ISO and booted it in the VM. Without any further commands, the virtual disk was available right from boot. As I read on the MSDN forums, this is only true for client versions of Windows. On server versions, storage spaces start up in a read-only and detached state (I guess for safety). So in order to read from one, you have to attach it manually. To write to it, obviously, you have to change it from read-only to read-write - but as my question was about how to read data in a recovery environment, writing to such a volume isn't needed for me.
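
For the server-version behaviour, the manual steps should boil down to something like this (hedged, as I only tested the client path myself; the names are my placeholders from above):

# On Server the pool comes up read-only - flip it to read-write first (only needed for writing)
Set-StoragePool -FriendlyName "DataPool" -IsReadOnly $false

# Then attach the detached virtual disk so its volume shows up
Connect-VirtualDisk -FriendlyName "DataSpace"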

// additional reply

As DarcyThomas suggested in his comment, here's my background on why I currently use a RAID5 and why I think I need to migrate to a safer setup like RAID6:

  1. Am I doing it for the small read speed advantage? Although I noticed that the array is capable of streaming data a bit faster than one of the drives is able to on its own, it only really shows when I copy large files, resulting in long sequential reads. When I deal with a lot of small files, which cause a lot of random I/O, the performance sometimes gets worse compared to a single drive. As for write speeds, it's about the same story. So, to answer this question: no, a speed advantage is surely not what I'm aiming for, hence I'm OK with the even worse "penalties" a RAID6 implies.

  2. Am I doing it as a cheap backup? One could surely argue yes. And I sure do take advantage of still having all data available if one of the drives fails. Sure, I do have the really important data on another offline drive, so in a catastrophic loss of the array (i.e. due to hardware malfunction or the board going up in smoke) I will still have my important data safe. But I certainly do appreciate the convenience of not having to worry about a drive failing as much as if I were using them as single drives (or maybe in another configuration). I've already had two drives fail (both rather shortly after moving - so it's possible that it was physical transport damage both times instead of the drives wearing out), and the rebuild times were quite long (about 14 hours for just 3 TB).

  3. Do I really need that one single large volume? Although another debatable question, to keep it short I would simply reply: yes, at least for convenience. The array is already filled more than 1/3, and managing such a vast amount of data across multiple drives/volumes would result in chaos (at least for me). Another neat side effect: if someone comes by with new stuff (music, movies, etc.), I can just "dump" it on the array and reorganize and de-duplicate later without having to worry about clogging up one of the drives. I'm someone with a brain like a fly: I would forget I had put data on another drive after a few hours, and it would take me another few hours to find it again. Just having it all in one place suits me.

  4. As for "online" backup solutions: yes, I know they're out there. And yes, I also know there are some one can get for free, or at least cheap. And sure, I would have the ability to write myself some small encryptor/decryptor code making use of asymmetric keys to secure the symmetric one, rather than using passphrases. And it's not that I don't trust them. But the same is true as in number 3: over time I would simply forget about a few of them. And although I have a rather fast connection (250/50), having all my data across the net isn't something I'm looking forward to. But I guess that's just a personal thing.

So, to summarize: moving from a 5-drive RAID5 to an 8-drive RAID6 is, for me, just the next logical step. The investment will be rather low (just the additional drives plus one or two simple HBAs) and, done right, it shouldn't depend on proprietary stuff like the one I'm using right now. Yes, I figured out how to access a storage space from a recovery environment, but this requires its proprietary spec to stay the same, without sudden changes causing incompatibilities (like the chaos with just office documents). Maybe this addition will help others in the future.

cryptearth
  • I think you need to define what problem you are trying to solve. RAID solves 2 problems (depending on the configuration): increased speed with concurrent read/write, and redundancy - if one disk dies you still have your data. Do you need either of these, and if so, why? Could you just use a service like Dropbox or rsync for redundancy/backup? Then you only need to worry about speed, and you can use a simpler RAID configuration for that. This doesn't answer your question, but it may be worth considering, so you don't get stuck in a rabbit hole trying to shave a yak! – DarcyThomas Aug 06 '20 at 00:46
  • I have added some additional information in response to your comment, to maybe clear up some things. Thanks for the great input. If it had been its own answer, it sure would have deserved some upvotes. – cryptearth Aug 06 '20 at 10:48

2 Answers


Windows Parity Spaces are dog slow and (according to Microsoft) aren't designed for anything except archive workloads. Microsoft keeps trying to improve write performance, say by implementing the write log that hardware RAIDs have, but the lack of a battery-backed write-back cache takes away all the fun. You can, however, try to improve writes by telling Spaces you have a UPS (the pool name below is a placeholder):

https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/deploy-standalone-storage-spaces

# Mark the pool as power-protected so writes are cached more aggressively
Set-StoragePool -FriendlyName "YourPool" -IsPowerProtected $true
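
A quick sanity check afterwards (my addition, not from the linked doc) is to confirm the flag took:

Get-StoragePool | Select-Object FriendlyName, IsPowerProtected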

Another option is to use ReFS and Storage Spaces combined into so-called Mirror-Accelerated Parity: writes land on the SSD (mirror) tier first and are destaged to the HDD (parity) tier later.

https://docs.microsoft.com/en-us/windows-server/storage/refs/mirror-accelerated-parity

http://knowledgebase.45drives.com/kb/kb450193-creating-mirror-accelerated-parity-volumes-and-storage-tiers-in-storage-spaces-windows-server-2019/
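
Per the linked docs, the tiered setup looks roughly like this - take it as a sketch, all names and sizes are placeholders:

# Two tiers in one pool: mirrored SSDs take the writes, parity HDDs hold the capacity
New-StorageTier -StoragePoolFriendlyName "Pool" -FriendlyName "Performance" -MediaType SSD -ResiliencySettingName Mirror
New-StorageTier -StoragePoolFriendlyName "Pool" -FriendlyName "Capacity" -MediaType HDD -ResiliencySettingName Parity

# A ReFS volume spanning both tiers = mirror-accelerated parity
New-Volume -StoragePoolFriendlyName "Pool" -FriendlyName "Data" -FileSystem ReFS -StorageTierFriendlyNames "Performance","Capacity" -StorageTierSizes 100GB,900GB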

Unfortunately, this isn't a 100% supported scenario for anything except Storage Spaces Direct (which is another can of worms on its own).

I'd suggest Linux MDRAID+XFS, due to its stellar stability and lots of proven deployments, or an old-stock LSI hardware RAID card from eBay if you absolutely need to stick with a Windows Server OS.

BaronSamedi1958
  • It's a client OS, not Server. I already had the idea to use Linux bare metal and run Windows as a VM, but to use my main GPU via passthrough it would need a 2nd one for Linux - which my board doesn't like at all (it doesn't even boot with a 2nd GPU). As I use the rig for games I'm stuck on the Windows platform, as not many of them run natively on Linux - and Wine performance is even worse. It's also not supposed to be high performance; the OS runs on its own drive, but it has to be reliable. It should also be accessible by a rescue system (still testing this one). Sorry for the short reply, but you know the comment length limit. – cryptearth Aug 03 '20 at 10:15
  • Just one note - I am not sure why Storage Spaces Direct is a can of worms. We've been using it without ANY problems since Server 2019 came out. Anything earlier - ouch - but from that version upward it is rock solid and stays up even during patch reboots (rolling upgrade - nice). The only time we really take it down these days is when there is a hard reason to take down power (i.e. major rework on the UPS line). Last time was when we installed a bypass switch so we could move from UPS to grid power - otherwise it works and works and works. With VERY good performance thanks to a lot of SSD write caches. – TomTom Aug 03 '20 at 12:11
  • We have a very different experience with Storage Spaces Direct (node lock-ups, data corruption with ReFS, various performance issues, etc.) and have no plans to re-implement it anytime soon. – BaronSamedi1958 Aug 03 '20 at 12:47
  • S2D in Windows 2019 had an initial issue with lost quorum in small 2-node clusters, and there was a huge chance of all storage going offline after enabling maintenance mode on one server for patching or a graceful restart. This issue might already be solved in some of the winter patches. I deploy Starwind VSAN for small clusters at customers as an alternative to S2D, because it was developed for such usage. P.S. ReFS is also not as "resilient" as promised - https://bit.ly/2XqpJGC. – batistuta09 Aug 04 '20 at 08:45
  • Storage Spaces Direct (S2D? AzSHCI?) is anything but reliable tech. ReFS turning itself into RAW under heavy load or after reaching a specific volume capacity is what people see in the wild. https://forums.veeam.com/veeam-backup-replication-f2/windows-2019-large-refs-and-deletes-t57726-150.html – RiGiD5 Aug 04 '20 at 18:39

"Windows Storage Spaces - a useful replacement for RAID6?"

If by "RAID6" you mean "I hate my data and want to get to it in as slow a way as possible" then yeah, sure - we lost 62TB of data to it at one point; luckily we had a backup of it all, but never again.

EDIT: Don't trust Windows software RAID, don't trust double-parity hardware RAID of great capacity, always follow the 3-2-1 backup rule, and "In backup we trust".

BaronSamedi1958
Chopper3
  • Speed is an issue, but you cannot attribute a data loss to S2D. I had the pleasure of watching a file server with RAID6 die within 5 minutes due to 2 discs failing. RAID is never a guarantee for data. It is not designed for this. – TomTom Aug 03 '20 at 12:13
  • "Disc" = optical media; spinning rust is called a "disk" in English. Chopper3 clearly stated they recovered from backup, so your "RAID isn't a backup" remark is pointless. – BaronSamedi1958 Aug 03 '20 at 12:48
  • That is not a helpful answer, but a personal anecdote. Please edit to provide some context and reasons. – aaaaa says reinstate Monica Aug 03 '20 at 15:36
  • I'm not sure I can follow this one, but to reply: by RAID6 I mean an array of 6 or more physical drives working together to withstand a double failure, like having a 2nd drive fail while rebuilding the 1st failed one. I'm aware that this is still by no means an excuse for skipping a separate backup, but even my consumer-grade fakeRAID has already saved me from data loss 2 times. Also: again, as this is NOT a server but only a private machine used to run games, performance doesn't really matter that much. How did you even end up with 62TB in a single RAID6 without nesting? That sounds more like a multi-error to me. – cryptearth Aug 04 '20 at 02:12
  • I tried to improve Chopper3's reply with a short summary, but in general I agree with him and think his maybe a bit emotional reply IS an answer. Peace! :) – BaronSamedi1958 Aug 04 '20 at 08:00
  • This link about the 3-2-1 backup rule should help: https://www.vmwareblog.org/3-2-1-backup-rule-data-will-always-survive/ – RiGiD5 Aug 04 '20 at 18:42