
We are researching ultra-fast shared storage to back a Microsoft SQL Server Failover Cluster Instance (FCI). At the start of the project we need roughly 500K IOPS at an 8k block size with an approximately 70% read / 30% write pattern. We also need the ability to scale performance up to 2M IOPS (same pattern) within a year or so, as the SQL Server workload is expected to grow.

For this project we plan to deploy a 4-node Microsoft Storage Spaces Direct (S2D) cluster. On the hardware side we already have 2x Dell R730xd rack servers, each with 2x E5-2697 CPUs and 512 GB RAM, and we are ready to buy 2 more.

For storage, Microsoft recommends NVMe or NVMe + SSD to obtain maximum performance (source). After some research, Samsung SSDs looked like a good fit:

https://www.starwindsoftware.com/blog/benchmarking-samsung-nvme-ssd-960-evo-m-2

http://www.storagereview.com/samsung_960_pro_m2_nvme_ssd_review

The setup we are considering is: 1x Samsung 960 EVO NVMe + 4x Samsung PM863 SSD per S2D host.

Can an S2D implementation using Samsung 960 EVO NVMe and Samsung PM863 drives deliver 500K IOPS to the SQL FCI?
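
To make the question more concrete, here is a rough, back-of-the-envelope sizing sketch (Python, illustrative only). The 3-way mirror factor is an assumption about the S2D resiliency setting rather than a decided configuration, and caching, CPU and network overheads are ignored:

```python
# Back-of-the-envelope translation of the 500K front-end IOPS target into
# back-end drive IOPS (assumes 3-way mirror resiliency; overheads ignored).

TARGET_IOPS = 500_000   # front-end target, 8k blocks, 70/30 read/write
READ_RATIO, WRITE_RATIO = 0.70, 0.30
MIRROR_COPIES = 3       # assumption: 3-way mirror in S2D
NODES = 4

# Reads are served from a single copy; every write lands on MIRROR_COPIES drives.
backend_iops = TARGET_IOPS * (READ_RATIO + WRITE_RATIO * MIRROR_COPIES)
print(f"Back-end IOPS, whole cluster: {backend_iops:,.0f}")          # ~800,000
print(f"Back-end IOPS per node:       {backend_iops / NODES:,.0f}")  # ~200,000
```

So under these assumptions each host's drive set would need to sustain roughly 200K mixed 8k IOPS, before any CSV or cache overhead is counted.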

EDIT:

a) didn't you ask something similar the other day? - I did. I posted a new question because the first attempt was off-topic; the subject and body have been changed, and the previous question will be deleted.

b) they're consumer drives - The question is about finding an S2D setup that can deliver the required 500K IOPS at the start. What setup would you recommend?

c) how are you planning on connecting all of those? I'm unaware of a server out there with 5 x M.2 slots - we need to know this - Only 1x M.2 drive per node will be used. I have corrected the setup above: 1x Samsung 960 EVO NVMe + 4x Samsung PM863 SATA SSD per S2D host.

d) what kind of IOPS (size and type)? - A read-intensive SQL FCI workload with 4k, 8k, and 64k blocks. Reads are 70-90% of the load, writes the remaining 30-10%.

e) 500k-to-2M is a very wide range of requirement variance - why such a wide range? - Performance demand is expected to grow significantly over a short period, so we must be able to run 4x the workload on the same hardware by the end of the first year. A year after that we will add 4 more hosts to the cluster (a rough per-node breakdown is sketched below).

We are a Microsoft shop, so there is no option other than Microsoft SQL Server 2016. Also, as you might assume, the project requires redundancy and high availability, which is why a SQL Failover Cluster Instance will be deployed on top of S2D.
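
Regarding (e), the growth phases break down per node roughly as follows. This is illustrative arithmetic only, assuming the load spreads evenly across hosts:

```python
# Front-end IOPS budget per node for each growth phase described in (e),
# assuming an even spread across hosts (caching, CPU and network ignored).

phases = [
    ("Year 0: initial deployment", 500_000, 4),
    ("Year 1: 4x workload, same 4 hosts", 2_000_000, 4),
    ("Year 2: 4 additional hosts added", 2_000_000, 8),
]

for name, cluster_iops, nodes in phases:
    print(f"{name}: {cluster_iops // nodes:,} front-end IOPS per node")
```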

Joshua Turnwell
  • a) didn't you ask something similar the other day? b) they're consumer drives, c) how are you planning on connecting all of those, I'm unaware of a server out there with 5 x M.2 slots - we need to know this, d) what kind of IOPS (size and type)? e) 500k-to-2M is a very wide range of requirement variance - why such a wide range? f) We could do with knowing a lot more about your server specs - details please. – Chopper3 Mar 30 '17 at 08:34
  • @Chopper3 Thank you for the comment. I have added information. – Joshua Turnwell Mar 30 '17 at 08:52
  • That answers one of those questions - what about the rest? – Chopper3 Mar 30 '17 at 08:54
  • @Chopper3 Please review the added information. What else is required? – Joshua Turnwell Mar 30 '17 at 09:08
  • Thanks, still no idea how you're planning to connect those 5 x M.2 drives to the server but I'm giving up asking again. One final question - do you NEED a relational database for this really? You may very well do, but if you ask this question of yourself and you could get away with a NoSQL engine like Couchbase or MongoDB etc., then you'd suddenly find it very easy indeed to go well over 2M IOPS. The reason I ask all this is because you want to do this via MSSQL, S2D (therefore WS2016) and via consumer SSD - this is all very new and untested....tbc – Chopper3 Mar 30 '17 at 09:27
  • ...certainly you won't find anyone here doing similar - you'll find people running SQL server on 2012R2 in very high-throughput solutions, you'll find people with server layouts that can handle the performance and failure characteristics of using consumer storage in servers, you may even find someone here who's done some S2D work (I've barely scratched the surface myself) - but finding someone who'll empirically state that that setup will give you what you want is just not going to happen - you're going to have to benchmark this yourself. – Chopper3 Mar 30 '17 at 09:29
  • Personally I'd try really hard to see if you can 'NoSQL' it, because if you can do that for the majority of your work and just use SQL for low-performance/batch/reporting jobs then you open up the possibility of getting billions of operations from cheap servers that can easily handle single-node failures and use low-end/consumer hardware. MSSQL is great but does limit your performance; if you have to go that way then please look at booting off a regular R1 pair of 'spinny' disks and then use a PCIe NVMe drive like Intel's P36xx/37xx series - they're built for this kind of job, unlike the Samsungs. – Chopper3 Mar 30 '17 at 09:32
  • @Chopper Only 1x M.2 drive per node will be used. I have corrected the shared storage setup: 1x Samsung 960 EVO NVMe + 4x Samsung PM863 SATA SSD per S2D host. We are a Microsoft shop so there is no option other than Microsoft SQL Server 2016. Also, as you might assume, the project requires redundancy and extra availability, therefore a SQL Failover Cluster Instance will be deployed on top of S2D. – Joshua Turnwell Mar 30 '17 at 09:41
  • Ah ok, sorry, missed that the PM863s were SATA - they're not going to be any good, sorry, their specs just aren't good enough - firstly you're limited to 6Gbps and their random writes are very slow, the 4k ones especially; even their fastest 'stat', the 8k random reads, is less than 100k - and they're not cheap either. Please look at PCIe NVMe cards - personally I really rate the P3608s, but others here have had more experience of the P37xx series and of other manufacturers - you'll get ~5x the bandwidth and many times more random IOPS, and they may not be much more expensive either. – Chopper3 Mar 30 '17 at 09:55
  • I see in the [benchmark post](https://www.starwindsoftware.com/blog/benchmarking-samsung-nvme-ssd-960-evo-m-2) that a single Samsung 960 EVO NVMe can deliver up to 500K IOPS for a 4k random 70/30 workload. The [PM863](http://www.storagereview.com/samsung_pm863_ssd_review) delivers up to 100K on small block sizes. Therefore, is it fair to expect 500K IOPS from 1x 960 EVO with 4x PM863, and 2M IOPS since it's a 4-node S2D cluster? – Joshua Turnwell Mar 30 '17 at 10:44
  • I assumed you wanted the 960 for boot only - it doesn't have the wear-support for an SQL server - you'll kill it. – Chopper3 Mar 30 '17 at 11:39

1 Answer


It's a bad idea to use consumer SSDs in SDS deployments. VMware vSAN and Microsoft S2D both assume writes are "atomic": a write ACK-ed to the host is actually on persistent media. Consumer SSDs don't have power-loss protection, so they MIGHT lose your data. Write endurance is also very different.

https://blogs.technet.microsoft.com/filecab/2016/11/18/dont-do-it-consumer-ssd/

https://blogs.vmware.com/vsphere/2013/12/virtual-san-hardware-guidance-part-1-solid-state-drives.html

http://www.yellow-bricks.com/2013/09/16/frequently-asked-questions-virtual-san-vsan/

I'd suggest sticking with enterprise-grade NVMe cards.
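
To put a rough number on the endurance point: the sketch below assumes the write portion of the 500K target runs flat out (a worst case), and the TBW ratings are example figures only, not datasheet values for any specific model, so check the vendor specs before drawing conclusions.

```python
# Rough endurance check: how much data a single drive would absorb per year
# if the write portion of the target ran continuously (worst-case assumption).

SECONDS_PER_YEAR = 365 * 24 * 3600

def tb_written_per_year(write_iops: float, block_bytes: int) -> float:
    """TB written in one year at a sustained random-write rate."""
    return write_iops * block_bytes * SECONDS_PER_YEAR / 1e12

# 30% of 500K IOPS are writes, 3-way mirrored, spread over 4 nodes x 5 drives.
per_drive_write_iops = 500_000 * 0.30 * 3 / (4 * 5)          # 22,500 IOPS
yearly_tb = tb_written_per_year(per_drive_write_iops, 8 * 1024)

CONSUMER_TBW = 400       # example rating for a 1 TB consumer M.2 drive
ENTERPRISE_TBW = 30_000  # example rating for a high-endurance enterprise NVMe card

print(f"~{yearly_tb:,.0f} TB written per drive per year")
print(f"Consumer drive worn out after:   {CONSUMER_TBW / yearly_tb:.2f} years")
print(f"Enterprise drive worn out after: {ENTERPRISE_TBW / yearly_tb:.1f} years")
```

Even if the real sustained write rate is only a fraction of the peak, the gap of roughly two orders of magnitude in rated endurance under these example figures is what makes consumer drives a poor fit here.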

BaronSamedi1958
  • This! For this case, I would recommend taking a look at Intel enterprise NVMe cards such as the P3700: http://www.storagereview.com/intel_ssd_dc_p3700_25_nvme_ssd_review Here are the PM863 benchmarks from the same site, btw: http://www.storagereview.com/samsung_pm863_ssd_review – batistuta09 Apr 03 '17 at 11:29
  • Intel P3700s are great. I'll check Intel enterprise NVMe for this case. Thanks. – Joshua Turnwell Apr 04 '17 at 09:57