1

Currently I have an sql server that is experiencing high cpu usage. There are tons of reads and writes happening continuously.

The machine is a Dual Xeon E5645 @2.4GHz with 48GB of ram and using SSD drives in raid 5.

I would like to beef up the horsepower for our sql instance. We are considering getting a 4 cpu server with Xeons in the 3GHz+ range.

However we are debating on wheter to use windows clustering. Then just setting up the cluster to run on the 1 machine and add more machines as the workload requires.

Would this be a feasible solution? or would just having the one machine be enough? Should I not bother with a 2008 cluster?

Here is some data from the server using the query.

SELECT DB_NAME(fs.database_id) AS [Database Name], mf.physical_name, io_stall_read_ms, num_of_reads,
CAST(io_stall_read_ms/(1.0 + num_of_reads) AS NUMERIC(10,1)) AS [avg_read_stall_ms],io_stall_write_ms, 
num_of_writes,CAST(io_stall_write_ms/(1.0+num_of_writes) AS NUMERIC(10,1)) AS [avg_write_stall_ms],
io_stall_read_ms + io_stall_write_ms AS [io_stalls], num_of_reads + num_of_writes AS [total_io],
CAST((io_stall_read_ms + io_stall_write_ms)/(1.0 + num_of_reads + num_of_writes) AS NUMERIC(10,1)) 
AS [avg_io_stall_ms]
FROM sys.dm_io_virtual_file_stats(null,null) AS fs
INNER JOIN sys.master_files AS mf
ON fs.database_id = mf.database_id
AND fs.[file_id] = mf.[file_id]
ORDER BY avg_io_stall_ms DESC OPTION (RECOMPILE);
    Database Name   io_stall_read_ms    num_of_reads    avg_read_stall_ms   io_stall_write_ms   num_of_writes   avg_write_stall_ms  io_stalls   total_io    avg_io_stall_ms
    RViewWf          19751827778        232140460             85.1             7254139          1444051                 5.0         19759081917 233584511   84.6

What does everyone suggest?

Prescient
  • 109
  • 1
  • 1
  • 4
  • 3
    `using SSD drives in raid 5` If you hate your data that much, just delete it. It's cheaper, won't cause as many problems down the road, and will probably improve performance over R5, as well. – HopelessN00b Feb 12 '13 at 15:54
  • I like the snarky response. What would be a better solution in your opinion? – Prescient Feb 12 '13 at 15:56
  • 1
    Honestly? Almost anything. RAID5 for a database is the problem, not the solution. So, switch to a sane RAID level and get back to us if there are still problems. [Check here to try to locate an appropriate RAID level](http://serverfault.com/q/339128/118258). – HopelessN00b Feb 12 '13 at 15:58
  • Depending on the size of the RAID and the IO pattern of the DB, RAID 5 could be a perfectly fine solution. If it's got a lot of disks and is large in overall size (6TB or more), I might consider looking at RAID 10, but if its just a few SSD and a couple hundred GB, RAID 5 is probably perfectly fine. – MDMarra Feb 12 '13 at 16:01
  • By the way, I didn't mean to come off as harsh as (I'm told) I did. I just don't feel that RAID5 is ever the right RAID level for your database server to put its databases on, given the value of the information typically stored in business databases. – HopelessN00b Feb 12 '13 at 16:22

3 Answers3

5

Windows Clustering's full name is Windows Failover Clustering. It's not active/active database instances. It doesn't sound like it's what you're looking for. MSSQL server doesn't do sharding or any of that stuff. It "scales up" instead of "scaling out" so to speak.

MDMarra
  • 100,183
  • 32
  • 195
  • 326
2

Windows Failover Clustering provides high availability for clustered resources (services, applications, virtual machines). It doesn't provide workload "sharing".

joeqwerty
  • 108,377
  • 6
  • 80
  • 171
0

Rather than binding several disks into a striped RAID on the system itself (using a single controller, which is where you get your perf hit), you may be better off using a SAN arrangement, so that your disks are run by redundant controllers that are optimized for data transfer. Of course, SAN solutions can be expensive, so it'd depend on how much data you're trying to manage (and how valuable it is).

Most SAN's can set up RAID arrays of multiple types. Depending on how redundant you want things to be, you can run with RAID 0+1 (two striped arrays with no parity, mirrored to each other) so that you don't take the parity performance hit if a drive fails. Or you could set up RAID 5+1 (with or without a hot spare) to add an extra layer of redundancy.

George Erhard
  • 804
  • 6
  • 12
  • RAID 1+0 is almost always preferred to 0+1 – MDMarra Feb 12 '13 at 16:43
  • Yah I'm always trying to keep them straight... is it a stripe made up of mirrors, or a mirror of two stripes? :) – George Erhard Feb 13 '13 at 17:45
  • 1+0 is a stripe of mirrors, which has shorter rebuild times and is more fault tolerant than 0+1 which is a mirror of stripes. They both perform roughly the same. – MDMarra Feb 13 '13 at 17:49
  • Good info. We tend to go with 5+1's with hot spares, to add an extra layer of redundancy, but that's overkill for smaller operations. – George Erhard Feb 13 '13 at 18:10