
We're considering building an Oracle database on 12 Intel X25-M G2 160GB drives in Linux software RAID10. The database gets very heavy write activity during the early-morning data load; other than that it's mostly read-only, and the read load is fairly minimal.

We're currently running on 11 × 150GB Velociraptors (also Linux software RAID10), and are hoping the X25-Ms will speed up the data load.

We currently have redo on different disks than the rest of the data.
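
For context, here's a rough back-of-envelope comparison of the proposed array against the current one. It's only a sketch: the per-drive random-write IOPS figures (roughly 300 for the X25-M per Intel's spec, and on the order of 125 for a 10K RPM drive) are the ones that come up in the comments below, not measurements of our workload.

    # Back-of-envelope RAID10 sizing -- the per-drive figures are assumptions.
    def raid10_estimate(drives, size_gb, write_iops_per_drive):
        # Linux md RAID10 keeps two copies of every block (it also handles odd
        # drive counts), so usable space and random-write throughput are
        # roughly half of the raw totals.
        usable_gb = drives * size_gb / 2
        write_iops = drives * write_iops_per_drive / 2
        return usable_gb, write_iops

    print(raid10_estimate(12, 160, 300))  # proposed X25-M array   -> (960.0, 1800.0)
    print(raid10_estimate(11, 150, 125))  # current Velociraptors  -> (825.0, 687.5)

Treat those as upper bounds; as the comment thread below shows, measured array IOPS can come in well under this kind of estimate.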

I'm wondering a few things:

  1. Any experience with using X25-M drives for databases? The X25-Es are unfortunately beyond our budget.
  2. Would it hurt to separate redo off to some magnetic (non-SSD) drives, say 2 (RAID1) or 4 (RAID10) Seagate Constellations?
derobert
  • http://serverfault.com/questions/229833/is-it-safe-to-use-consumer-mlc-ssds-in-a-server -- If you don't mind some downtime you might chance the X25-M.... I think you are playing with fire though. – Kyle Brandt Mar 03 '11 at 22:55
  • Oracle sells flash products to speed up their database in their servers. http://www.oracle.com/us/products/servers-storage/storage/flash-storage/index.html – Brian Mar 03 '11 at 23:03
  • @Kyle Brandt: Curious about why you'd expect it to cause downtime, especially in RAID10? I don't see anything in the other question to suggest that. – derobert Mar 03 '11 at 23:46
  • @derobert: I don't have any solid evidence, but I think the write wear in RAID would happen at about the same rate for a mirrored pair. So when one fails the other might fail at about the same time. – Kyle Brandt Mar 03 '11 at 23:57
  • This seems like incredible overkill. A single 10K RPM hard drive like the Velociraptor can handle about 125 random IOPS. A single good SSD like the X25-M can handle thousands of random IOPS. – sciurus Mar 04 '11 at 01:11
  • @sciurus: Do you have a source for this? We got about 100 random writes per second on 6 × 15K SAS drives in RAID 10. – Kyle Brandt Mar 04 '11 at 12:47
  • @sciurus: Intel only claims 300 random writes per second for the X25-M. See http://www.intel.com/design/flash/NAND/mainstream/pdf/322697.pdf – derobert Mar 04 '11 at 17:03
  • @Kyle Brandt: Intel gives a write endurance w/ random writes of only 15TB, so that is indeed a concern. We'd be replacing them fairly often, it sounds like :-( Please make an answer out of that... – derobert Mar 04 '11 at 17:08
  • @derobert - In that same table they claim up to 8,600 random 4KB writes per second on an 8GB range of the drive. I'm curious why it drops so dramatically when the range is across the entire drive. – sciurus Mar 04 '11 at 19:54
  • @Kyle Brandt: If you're getting higher figures it's because of caching. The IOPS a disk can provide is determined by the latency and seek time. The IOPS an array can provide is determined by the number of disks, IOPS per disk, read/write mix, and RAID penalty. A good overview is http://www.cmdln.org/2010/04/22/analyzing-io-performance-in-linux/ , and a good calculator is http://wmarow.com/strcalc/ – sciurus Mar 04 '11 at 20:05
  • @sciurus: Sorry, the 100k was a typo; I meant 100. I don't buy those formulas myself. That's the conclusion I came to at http://blog.serverfault.com/post/798854017/. The article you linked at least mentions the different read modes etc. for software RAID in Linux -- but I have never seen these formulas backed by a significant amount of *actual* data. Also, don't attack me too hard on my benchmarks in that post -- I would do a much better job these days :-P – Kyle Brandt Mar 04 '11 at 20:11
  • @sciurus: When the write range is only 8GB, it can use the other 152GB as a scratch area, to avoid flash erases. When the whole disk is being hit, the -M has I believe a 7% or so scratch area, so there is a lot more overhead for flash erases & wear leveling. Also probably a harder hit on the block reallocation in the disk. – derobert Mar 04 '11 at 20:23
  • @sciurus: So for example that calculator puts the random write IOPS on the array we had at 450, when in reality we get 95. So not even close.... Granted, maybe we just have some major configuration error? But the partition is aligned, and the block size is 64 (which the calculator accounts for). I just really want to see someone back up these formulas with a lot of data (the basic formula is sketched in code below these comments). – Kyle Brandt Mar 04 '11 at 20:39
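
For reference, a minimal sketch of the textbook formula discussed above (the per-drive IOPS figure here is an assumption chosen to reproduce the calculator's ~450 number, not a measurement):

    # Textbook array-IOPS estimate: each host read costs one back-end I/O, each
    # host write costs `raid_write_penalty` back-end I/Os (2 for RAID 10).
    def array_host_iops(drives, iops_per_drive, write_fraction, raid_write_penalty):
        backend_iops = drives * iops_per_drive
        cost_per_host_io = (1 - write_fraction) + write_fraction * raid_write_penalty
        return backend_iops / cost_per_host_io

    # 6 x 15K SAS drives at an assumed ~150 IOPS each, 100% random writes, RAID 10:
    print(array_host_iops(6, 150, 1.0, 2))  # 450.0 -- versus the ~95 measured above

As the measured numbers in the comments show, this models the drives' theoretical capability, not what a particular filesystem, chunk size, and workload will actually deliver.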

3 Answers


According to Intel (though this site breaks it down in an easier-to-read form, and was faster to find on Google), writing 100GB of data a day to the 80GB model of the X25-M gives it a life of 5 years. How many write cycles you get also depends on how full the drive is kept, since free space is used for wear leveling and as scratch write space.
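
To put the endurance numbers side by side, a quick back-of-envelope calculation (the 100GB/day figure is just the one used above; the 15TB figure is the random-write endurance Intel quotes, mentioned in the comments on the question):

    # Years until the rated write endurance is exhausted at a steady daily volume.
    def lifetime_years(endurance_tb, daily_writes_gb):
        return endurance_tb * 1000 / daily_writes_gb / 365

    print(round(lifetime_years(15, 100), 2))  # 0.41 -- months, not years, if the writes are random
    print(100 * 365 * 5 / 1000)               # 182.5 TB implied by "100GB a day for 5 years"

In other words, whether the drives last years or months depends heavily on how much of the nightly load behaves like sequential writes versus small random writes.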

Brian

I don't have any solid evidence, but I think the write wear in a RAID array would happen at about the same rate for a mirrored pair, so when one fails the other might fail at about the same time. Therefore I personally have concerns about consumer-level SSDs in production servers unless you can tolerate some downtime and maybe some data loss.

Kyle Brandt

No experience, but it's worth checking out the flash cache posts here. There is more material on SSDs on his blog, such as here.

Gary
  • A warning about flash cache: it's only supported on Oracle's Linux distribution, not on Red Hat or CentOS. – sciurus Mar 04 '11 at 01:03