
Still trying to configure my large 24-disk array (2.4 TB disks) for an archive/NAS of mixed huge and small files. Beyond that, I am now focused on understanding how striped RAIDs work under the hood, but the more I read, the more confused I get, because most of the literature's examples are based on a "low" number of disks. (I asked the manufacturer, but they were reluctant to answer some of these questions publicly, citing "reserved information".)

  • Stripe size is usually (number of data disks) × (strip size) (or chunk size), e.g. 8 × 64 KB = 512 KB or 10 × 256 KB = 2560 KB.
  • How are files split and saved into a stripe? One file per stripe (with the remaining strips zero-filled), or many files per stripe until all its strips are filled?
  • For a large array, is the stripe size still important? I discovered my PERC uses a fixed 1 MB stripe size whenever the computed value would be bigger than 1 MB (e.g. 8 × 256 KB). In that case, how is the stripe arranged? Is it still 8 × 256 KB = 2 MB, internally divided into 2 × 1 MB? Or is it 1 MB divided across the 8 data disks?
  • Nowadays, should I configure a striped RAID with "power of 2" in mind? My PERC lets me configure any number of disks for any RAID level, including counts that are not a power of 2.
  • Knowing these limitations(?), is it worth setting the array up as a 2 × 12-disk RAID 60 with a 256 KB strip size? We need to not waste too much space.
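To sanity-check the arithmetic in the first bullet, here is a quick sketch (assuming stripe size is simply data disks × strip size; `stripe_size_kib` is just an illustrative helper, not a controller API):

```python
# Stripe size = (number of data disks) x (strip/chunk size).
def stripe_size_kib(data_disks, strip_kib):
    return data_disks * strip_kib

# The examples from the bullet above:
assert stripe_size_kib(8, 64) == 512      # 8 x 64 KiB = 512 KiB
assert stripe_size_kib(10, 256) == 2560   # 10 x 256 KiB = 2560 KiB

# One 12-disk RAID 6 span (10 data + 2 parity) at 256 KiB strips:
print(stripe_size_kib(10, 256))  # → 2560 (full stripe in KiB)
```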

1 Answer


how are files split and saved into the stripe? one file per stripe (the remaining strips are filled with zeros) or many files for a stripe until it all its strips are filled?

Arrays like this don't think in terms of files, just blocks. The filesystem itself defines which files are made up of which blocks; it's not the underlying disk system that does that.

So don't think of it as files, just blocks: imagine all the files on your filesystem, but take away all the information about folders and files, so it's just one big pile of blocks. It's those blocks that get striped across the available disks for performance and resilience.
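To make the block-level view concrete, here is a minimal sketch of how a plain RAID 0-style layout might map a logical block to a disk and an on-disk offset. This is a simplified illustration only; real controllers such as a PERC also rotate parity (RAID 5/6) and keep their layout details proprietary, and `locate` is a hypothetical helper:

```python
# Hypothetical mapping of a logical block number to (disk, block offset)
# in a simple RAID 0-style stripe. Real RAID 5/6 layouts also rotate
# parity strips across disks, which this sketch ignores.
def locate(block, n_disks, blocks_per_strip):
    strip = block // blocks_per_strip   # which strip the block falls in
    stripe = strip // n_disks           # which full stripe (row) that strip is in
    disk = strip % n_disks              # which disk holds that strip
    offset = stripe * blocks_per_strip + block % blocks_per_strip
    return disk, offset

# 4 disks, 2 blocks per strip: blocks 0-7 fill exactly one full stripe,
# two consecutive blocks per disk, then the next stripe (row) begins.
print([locate(b, 4, 2) for b in range(8)])
# → [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1)]
```

Note how the pile of blocks is dealt out strip by strip, with no notion of which file any block belongs to.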

Generally speaking, the defaults for filesystems and RAID arrays like this will fit 95% of all applications just fine. The ability to tune them is great if you have the time to play about and test all the various combinations, or if you have an application with unusual requirements (such as constantly reading or writing either lots of tiny random files or, at the other end, huge sequential files); in those cases, yes, some of the tuning can have significant benefits. But again, generally speaking, the defaults are usually pretty good for most use cases. I do VoD, so we do often tune our storage volumes to have very large strips/blocks because we know they're all large sequential files; but then we don't put our DB files, logs, etc. on those arrays/volumes, because they'd be terrible for that use.

Anyway, back to recommendations: glad you seem to have settled on R60. We get people here all the time with R5/50 issues; it's dead, don't use it at all. R6/60 and R1/10 are the only game in town, unless you have a boner for ZFS anyway :) If I were doing this, I'd do exactly what you suggest: R60 made up of 2 x 12-disk R6s. Leave the stripe at the defaults, and then, as your application starts to make use of this array, you can look at how it's performing; if you really feel you need to tune it and will get a lot of benefit from doing so, then go ahead, but I bet you'll be just fine with the defaults.
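For completeness, a rough capacity sketch of the suggested layout (assuming, as in the question, 24 disks of 2.4 TB each arranged as 2 x 12-disk RAID 6 spans; these figures are illustrative, before filesystem and metadata overhead):

```python
# Usable capacity of RAID 60 = spans x (disks per span - 2 parity) x disk size.
disk_tb = 2.4
spans = 2
disks_per_span = 12
parity_per_span = 2  # RAID 6 uses two parity strips per stripe

usable_tb = spans * (disks_per_span - parity_per_span) * disk_tb
raw_tb = spans * disks_per_span * disk_tb

print(usable_tb)  # → 48.0 usable TB
print(raw_tb)     # 57.6 raw TB, i.e. ~17% goes to parity
```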

Best of luck.

Chopper3
  • Thanks, another brick in my wall of knowledge. I like ZFS but have only used it inside built-in NAS software, never with an OS from scratch. Also, it seems a powerful controller; what are the benefits of using "software RAID" against HW? About my other doubts: from your answer I now guess that a "stripe of strips" is filled up to "completion" before the next one is used. What if the controller uses a "stripe size" shorter than the value I set? Would using a "power of 2" number of disks increase performance because the chunk-stripe-PV/LVM-XFS alignment is perfect? – pink0.pallino Feb 01 '22 at 08:26
  • @pink0.pallino Generally, larger stripe sizes perform better with large, sequential writes (because of less processing overhead) and worse with small, random writes (because of write amplification). You should select the stripe size that fits your application; if you don't know, run tests. All RAID variants generally perform best when the number of disks is a power of two plus the disks added for redundancy, e.g. 4, 6, 10, or 18 for RAID 6, or 8, 12, 20, or 36 for RAID 60. – Zac67 Feb 01 '22 at 09:24
  • 1
    @Zac67 yes, I am choosing a 256 KB strip size for a 2 x (10+2) RAID 6 array. This should fit our needs. My curiosity was about how the controller manages this large stripe (2560 KB) if it internally uses its own 1 MB stripes, which my chosen size is not a multiple of; I did not want to set it "wrong". Can you explain why for RAID 60 the "power of two" rule gives 8, 12, 20, 36 and not the same numbers as RAID 6? I always thought there was a RAID 0 on top of two RAID 6 arrays, and that for each of those the 4, 6, 10, 18 rule applies. – pink0.pallino Feb 01 '22 at 10:08
  • RAID 60 (also called RAID 6+0) consists of two RAID 6 subarrays that are striped like RAID 0; see https://en.wikipedia.org/wiki/Nested_RAID_levels#RAID_60_(RAID_6+0) – Zac67 Feb 01 '22 at 10:47