13

I've heard that the write performance of RAID 5 can be appalling at times. While I want the redundancy that it provides I don't want to sacrifice my database insert/update times.

Is this something I should be worried about and if so, what would be the recommendation to get redundancy with good write performance?

Scott Saad
  • 436
  • 2
  • 7
  • 10
  • 1
    Which DB? Oracle + RAID 5 used to be a no-no. Not sure if this is still the case. – cagcowboy Apr 30 '09 at 14:45
  • In this particular example the database runs on MySql and MSSQL. – Scott Saad Apr 30 '09 at 15:36
  • A little more specificity in the question with regards to the application (DB read/write mix, up time and recovery time requirements) might get a more applicable answer; these things might make a difference in the best solution. – Jay Stevens May 01 '09 at 16:20

11 Answers11

23

RAID 10 is usually recommended since the I/O is so random. Here's an example. The calculations are a bit simplified, but pretty representative.

Let's say you have a 6 drive array and your drives can do 100 I/Os per second (IOPS). If you have 100% reads, all six drives will be used and you'll have about 600 IOPS for both RAID 10 and RAID 5.

The worst case scenario is 100% writes. In that scenario, RAID 10's performance will be cut in half (since each write goes to two drives), so it will get 300 IOPS. RAID-5 will convert each write into two reads followed by two writes, so it will get 1/4 the performance or about 150 IOPS. That's a pretty big hit.

Your actual read/write pattern will be somewhere in-between these two extremes, but this is why RAID 10 is usually recommended for databases.

However, if you don't have a busy database server, then you could even do RAID-6. I often do that if I know the database isn't going to be bottleneck since it gives you much more safety than RAID 10 or RAID 5.

TorgoGuy
  • 1,028
  • 1
  • 7
  • 13
22

Transactional databases

RAID-5 is relatively slow to write as the controller needs to load in enough data to recalculate the parity on a write. Write operations will incur at least four disk operations:

  • Reading in the parity block

  • Reading in the old block (assuming it is not already in cache) to XOR the value with the parity block.

  • Writing the new parity block (old parity block XOR old data block XOR new data block)

  • Writing the new data block.

If the system does not use write-back caching this means that all of these oprations are on the critical path for I/O completion. Often, this is the case with database writes - in fact, Microsoft (for example) has a certification program for SAN equipment for use with SQL server that requires the vendors to guarantee this behaviour. Sometimes older RAID-5 equipment did not use this optimisation and had to recalculate the parity from the whole stripe.

RAID-10 has a mirror for every single drive and does not need to read additional data to calculate parity. This means that writes need much less physical I/O.

RAID-50 sits somewhere in the middle, with the volume split into multiple RAID-5 volumes, which are in turn striped. On a RAID-50 made from groups striped in a 3+1 scheme a write generates at most three additional disk I/O requests. If you feel so inclined you can view RAID-5 and RAID-10 as special cases of RAID-50. RAID-50 is mainly used to provide large volumes across many physical disks

Other parity schemes such as RAID-6 (a parity scheme with two redundant disks per set) also exist, Modern disks are large enough that rebulding an array can take quite a long time - long enough that the risk of a second disk failure during the rebuild is quite significant. RAID-6 mitigates this risk by having a second parity disk, requiring three disk failures to cause data loss. A similar trick to RAID-50 schemes can be used to make RAID-60 arrays.

Finally, a single mirrored pair (known as RAID-1) can provide redundancy and good enough performance for some tasks. In particular you will probably find that RAID-1 gves you sufficient throughput for quite a lot of database log traffic. More on this below.

If you have a write-heavy workload you will probably get a performance gain from a RAID-10 volume. This can be a win as you can probably get your required throughput from a smaller number of physical disks, assuming the disks have enough space). Some items such as logs or temporay areas on a database server should be on RAID-1 or RAID-10 volumes, as these get lots of write traffic.

Logs

Log volumes are characterised by a mostly sequential data access pattern, and are essentially a ring buffer consisting of commands along the lines of 'write this data to this block' They are written out as a producer by the core DBMS engine and processed as a conumer by the log reader function. A single mirrored pair will actually handle quite a lot of log traffic.

Read-heavy systems and file servers

On a read-heavy system such as a data warehouse you may want to use one or more RAID-5 volumes. On a file server, disk accesses will largely be done on a whole file basis, so writes will probably write out most of the blocks that make up the parity block anyway. In this case the performance penalyt for RAID-5 will be lighter.

In the past the cost savings on disk may have been significant but this is less likely to be an issue now.

Write-back caching and RAID-5

On a SAN or internal RAID controller with a battery-backed cache you may be able to enable 'Write-back' caching. This caches writes and returns control to the application. The I/O is reported by the controller as completed. However, it does not necessarily write the data out to disk immediately. This facility allows RAID-5 parity read/write operations to be substantially optimised and can soften the write performance penalty for RAID-5x volumes.

However, this still carries a small risk of data integrity issues. The host system has been told that this write has been completed when this is not in fact the case. It is possible for a hardware failure create data inconsistencies between (say) log and data volumes on a database server. For this reason, write-back caching is not recommended for transactional systems, although it may be a performance win for something like an ETL process.

Summary

Disk space is so cheap nowadays that transactional systems should probably use RAID-1 or RAID-10 for log volumes and RAID-10 for data volumes. The physical disk size is likely to be much larger than the database and RAID-10 will allow more write throughput for the same number of disks, potentially reducing the number of disk volumes needed to support the system.

On something like a data warehouse you can still chew space with large, heavily indexed fact tables so you might get small price win with RAID-5 or RAID-50 data volumes. However logs and tempdb should still be placed on a RAID-10 volume as they will probably get a lot of work during ETL processing. However, the cost saving on disk is likely to be fairly small.

  • Write-back cache: if you're buying a RAID controller with a "battery-backed" write-back cache note that the battery doesn't seem to come included. Make sure you get your supplier to include one. – David Hicks May 01 '09 at 19:07
  • Some do, some don't. I've got quite a few Adaptec 2200s's which didn't come with batteries. Some do come with them as standard. – ConcernedOfTunbridgeWells May 01 '09 at 20:36
  • 1
    You have an error in your response. You do *not* need to read every drive to recalculate parity. A write becomes 2 reads and 2 writes. The other 12 drives in your example 14-drive array will not be touched by RAID5. – TorgoGuy May 03 '09 at 18:55
  • Actually, I do believe the man is right. You could read and XOR the parity with the old value of the block and again with the new value of the block. Never seen that described before but it would work. – ConcernedOfTunbridgeWells May 03 '09 at 19:25
3

Depends on how much writing you're doing.

If it's a fairly lightweight "web app" then you're unlikely to see any performance hit on RAID5.

If you're building a multi-GB data warehouse with large ETL's, then the write buffer on the RAID 5 will quickly overflow and your straight into the "poor write performance" of RAID 5.

Every RAID5 write will cause at least 3 writes (plus a CRC calculation). When buffered, this is fine and fast (small short bursts of activity - single record updates and inserts). If this is sustained writes (large bulk insert / updates) then it will be noticed.

It's a balance between performance and space. RAID 10 (mirror of striped drives) gives both performance and resiliency, but 50% reduction in capacity.

RAID5 gives higher capacity, good read performance but poor (large) write performance.

hbogert
  • 411
  • 1
  • 4
  • 18
Guy
  • 2,658
  • 2
  • 20
  • 24
3

Well, it depends heavily on your fault/risk tolerance. RAID5 has a lot of issues. My DB server currently have two mirrored drives, and if I was to scale that up, I'd go for something with more parity, probably RAID6 or RAID10.

Also, if your application is uptime-critical, I'd probably recommend having two database servers with replication instead, master-master or hot spare or whatever. RAID only helps against disk failures, but there's a lot more that could go wrong on a server :)

mikl
  • 612
  • 1
  • 11
  • 17
2

RAID 1, that's my final answer

Reasons:

mirrored pair gives adequate redundancy for failed disks and the RAID continues to function to the last disk.

mirrored pair yields greater I/O performance for reads if you place your data and indexes carefully... [hint: Use seperate volumes for data and its index(s)]. You can obtain even more performance by duplexing your controllers.

ninegrid
  • 213
  • 3
  • 6
2

For a good few "why not" guides, this was always very popular with the Oracle DBA's I have worked with...

The Battle against any Raid Five - http://www.miracleas.com/BAARF/

Mark Regensberg
  • 1,421
  • 12
  • 14
2

Short answer: no.

Long answer: unless you have a very small database or very minimal requirements, no. Data retrieval is heavily dependent on disk I/O operations per second, and the overhead of striping will eat up your disk access over time, especially with long query runs. Most databases are run on a RAID 10 style setup, or with specific volumes holding partitions of the data. Yes, RAID 10 will cost you in writes, but your read performance (with the right setup) will go way up.

Avery Payne
  • 14,326
  • 1
  • 48
  • 87
1

what would be the recommendation to get redundancy with good write performance?

A large write-back cache. Increase the RAM on your hardware RAID controller or the RAM available to your software RAID solution (i.e. for Linux's MDADM increase the system RAM, MDADM balloons to use otherwise unused system RAM as write cache). This advice holds for given values of "large" - if you'll be often (5% of the time?) writing data at a rate fast enough to fill the write cache no matter how large it is then this will make little difference.

David Hicks
  • 2,258
  • 2
  • 15
  • 12
1

Really its about separating your data files, log files, from your OS file. Logs write sequentially Data incur lots of random reads and some random writes

By building RAID configurations that support these traits you enhance performance greatly

Raid 1 - mirroring is great for the logs files Raid 10 is good for your data files. Its also worth looking at separating TempbDB and backups to separate drives. Adding filegroups is another way to increase performance. When it comes to SANs this is not so clear. It depends whether you are building specific raid configurations for each LUN or relying on number of spindles.

CPU_BUSY
  • 2,322
  • 17
  • 17
0

Running a database off of RAID5 is usually a mistake. I have only seen it done in two cases - well-engineered databases that were read-intensive with few writes, and databases where "wasting space" due to RAID5 was not politically viable.

RAID5 will destroy transactional performance.

Also, if you are considering RAID5, see if you can get RAID6 to work. Theoretical reliability is so much better, although real-world reliability is often worse due to immature implementations.

carlito
  • 2,489
  • 18
  • 12
0

In the spirit of the recent StackOverflow blog article saying we shouldn't re-present answers already available on the Internet, I point you at this

RAID-5 isn't unique in the redundancy it provides, it just does it while consuming less additional disks than some alternatives. You can pick something else with equal or better redundancy and better write performance