
Is it a good idea to have a transactional database running on a server that uses iSCSI?

The main argument is that two servers can access the same disk array, so if one server breaks the other can kick in and do the work.

So I would have one server in cold standby mode. Is this setup really safer than having the disks and the server in the same machine?

That would imply that iSCSI storage appliances don't fail as often as servers do.

Another concern is the WAL, which could be affected negatively if the network lags. Does anyone have experience with this?

I'm happy to hear your opinions on this topic.

Franz Kafka

6 Answers


Your question needs to be broken down into two parts: internal vs external storage, and iSCSI vs the alternatives (FC, FCoE, NFS). My experience is with large corporate clients running mainly Oracle, so this may not apply to smaller environments or other databases. In my opinion, there is a lot of value in external storage; iSCSI is one option for providing this but it is not as mature as some of the alternatives.

Internal vs External Storage

External storage, generally provisioned in a disk array and presented through a storage area network (SAN), provides a number of advantages:

  • Fast performance, often with hardware-accelerated RAID and large battery-backed caches.
  • Easy to scale up volumes to use more disks for performance and capacity.
  • Centralized access to storage resources to avoid silos of wasted resources.
  • Able to share storage for high-performance or high-availability clusters (e.g. Oracle RAC).
  • May come with features for snapshotting and replicating data to remote sites.
  • May come with decent analytics tools to track performance.

The main disadvantages of external storage are the complexity and cost of setting up and maintaining the storage area network and the storage arrays.

iSCSI vs Alternatives

Fibre Channel is currently the standard mechanism for accessing external storage for databases. It is common to see SAS (Serial-Attached-SCSI) used for less critical data, more as an extension to internal disk than as disk in a SAN. The key point about these technologies is that they run on dedicated storage networks.

Newer alternatives such as FCoE and iSCSI provide effectively the same protocol as FC and SAS, except they run over Ethernet and therefore can use the same infrastructure used for host-to-host networking. The idea is that by converging on Ethernet, companies can reduce the cost and complexity of having external storage. However, there are still questions around whether Ethernet as a transport provides the speed and reliability of the dedicated technologies.

NFS is a file-level protocol which also runs over Ethernet. In the past it has been considered to have too much overhead for databases, but with hardware offloading in newer network adapters, better OS network stacks, and direct support by the database (e.g. the Direct NFS feature in Oracle) it is also a viable option for some companies. NFS is particularly nice because it reduces the administration involved in resizing volumes, and also fits the model of virtualized storage such as EMC VNX and Oracle ZFS Storage.
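To make the NFS option concrete, here is a minimal sketch of presenting an NFS export to an Oracle host; hostnames, IPs, and paths are hypothetical, and the mount options follow common vendor guidance (hard mounts, large transfer sizes) rather than any one array's documentation.

```shell
# Mount the export with options typically recommended for database files
# (filer01 and the paths are placeholders for your environment):
mount -t nfs -o rw,hard,nointr,rsize=32768,wsize=32768,tcp,vers=3 \
    filer01:/vol/oradata /u02/oradata

# With Direct NFS, the database bypasses the OS NFS client. An oranfstab
# file tells it which server and network path to use, roughly like:
cat > "$ORACLE_HOME/dbs/oranfstab" <<'EOF'
server: filer01
path: 192.168.10.5
export: /vol/oradata mount: /u02/oradata
EOF
```

Check your Oracle release's documentation for the exact oranfstab keywords; the point is only that dNFS is configured per-path, independently of the OS mount.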

Tom Shaw
  • Okay, I'm talking about Oracle as well. I don't have the money for the Enterprise Edition to use real replication at the WAL level (I think they call that Data Guard). But with the Standard Edition you get this RAC stuff for "free". So the best way to go is to buy a really safe SAN and two fast servers? Somehow I'm not convinced that this is safe. Two buildings with two independent servers seem better. Do you know whether you can use both servers with RAC at the same time under the Standard Edition One licence, or does that need two licences? – Franz Kafka Jun 04 '11 at 17:00
  • We're an Oracle shop and as we understand it RAC is not available under the SEO license. – Keith Stokes Jun 04 '11 at 17:07
  • @Franz: RAC is to protect against server failure and allow horizontal scaling - not to protect your data. There's no substitute for having a remote copy of your data, preferably with an ability to roll-back changes (snapshots or delayed application of logs). Data Guard is the gold standard but is expensive. I've seen some people roll-their-own log shipping, some use array-based replication+snapshots, some use logical volume replication. A new alternative is Delphix - I've never used it but I respect their technical team. – Tom Shaw Jun 05 '11 at 00:08

In my opinion iSCSI is a less stable storage-network transport than Fibre Channel or Fibre Channel over Ethernet. Presuming you've made the needed optimizations, it should work. It'll work best if the iSCSI connection runs over a dedicated Ethernet network, with no non-iSCSI traffic on it to get in the way. iSCSI initiators have gotten better over the years and even support multipathing now, if you want to go there.
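As a rough illustration of what "initiator setup with multipathing" looks like with the Linux open-iscsi tools, here is a sketch; the portal IPs and the IQN are hypothetical, and it assumes the initiator and dm-multipath packages are already installed.

```shell
# Discover the targets offered by the array's first portal:
iscsiadm -m discovery -t sendtargets -p 192.168.20.10

# Log in to a discovered target:
iscsiadm -m node -T iqn.2011-06.com.example:array1.lun0 \
    -p 192.168.20.10 --login

# For multipathing, log in over a second portal (ideally a second NIC
# and switch) and let dm-multipath coalesce the paths into one device:
iscsiadm -m node -T iqn.2011-06.com.example:array1.lun0 \
    -p 192.168.21.10 --login
multipath -ll   # verify both paths show up as active
```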

As for running a transactional database over iSCSI, I have my doubts. A cold-standby server should be able to pick up where the hot server left off, presuming it left the database files in a recoverable state. That said, this sort of "lots of little transactions" workload is exactly where the dedicated storage transport networks really shine.

In the end, by my assessment, it'll work, but may fail catastrophically more often than using FC or FCoE.

sysadmin1138
  • It's come a long way... – SpacemanSpiff Jun 04 '11 at 13:48
  • 2
    In my opinion iSCSI can be just as stable as fibre when implemented properly. Fibre deployments almost force you into good design where iSCSI's flexibility allows you to do it poorly when you don't do your homework. – icky3000 Jun 04 '11 at 14:09
  • 1
    @icky3000: I have nothing against iSCSI, but the problem is that by the time you've installed and configured all the dedicated Ethernet bits and pieces to make it fast and reliable enough for storage, you've lost a lot of the promised simplicity and savings! The fact that any Joe can pick it up and use it means it will continue to grow in popularity, but the average installation will be less reliable until the vendors make it foolproof. – Tom Shaw Jun 04 '11 at 14:51
  • @Tom Shaw: I do see your point but best practices for iSCSI are more about simple design steps that don't really cost that much money. I don't think you'll find it reaches the cost of fibre even when done really well. – icky3000 Jun 04 '11 at 17:20

You've raised a couple of different issues. First, can iSCSI work well for a transactional database? Yes, absolutely, but it is a qualified yes. You'll get better performance from fibre in most implementations. You need to understand your IO requirements to determine whether you actually need that performance; if not, iSCSI might be perfect. That said, what sometimes happens is that people see how easy iSCSI is to implement and just roll it out without giving much thought to design. Best practices for iSCSI are easily found on Google, but you should be thinking about things like how you'll separate that traffic from other traffic (dedicated switches or VLANs), how you might implement jumbo frames, whether you'll use MCS or MPIO (your vendor will probably dictate this for you), etc.
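The network-side best practices above (jumbo frames, a dedicated VLAN) come down to a few commands on Linux; this is a sketch with hypothetical interface names and a hypothetical VLAN ID, and every switch port and the array must agree on the MTU for jumbo frames to help.

```shell
# Enable jumbo frames on the NIC dedicated to storage traffic:
ip link set dev eth2 mtu 9000

# If you can't dedicate switches, put storage traffic on its own VLAN:
ip link add link eth2 name eth2.100 type vlan id 100
ip link set dev eth2.100 mtu 9000
ip addr add 192.168.20.11/24 dev eth2.100
ip link set dev eth2.100 up
```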

The second issue is how you're going to design redundancy for your database(s). Yes, it is true that if the primary server fails you can assign those disks to another box but I wouldn't rely on that as my HA solution because you aren't guaranteed that the dying primary server left those databases in good shape. There are many ways to do HA and most of them work well with iSCSI, but you need to put some more thought into it.

icky3000

Your plan is seriously flawed because performance over iSCSI will be distinctly inferior to local drives. Further, there are better schemes than having a cold standby access the same drives; that design still has a single point of failure, and it is the most unreliable part: the drives.

I suggest you use a more conventional approach, one that has been well tested and proven to work: use local disks on both machines and master/slave or master/master replication to provide your redundancy.
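Since the question mentions the WAL, here is what a minimal master/slave setup looks like for PostgreSQL streaming replication; it is a sketch for a recent PostgreSQL release, and the hostnames, user, and data directory are hypothetical.

```shell
# On the master, in postgresql.conf:
#   wal_level = replica
#   max_wal_senders = 3
# and in pg_hba.conf, allow the standby host to connect for replication:
#   host replication repuser 192.168.30.2/32 md5

# On the standby, seed the data directory with a base backup over the wire;
# -R writes the standby settings so the server starts up streaming from
# the master:
pg_basebackup -h master-db -U repuser -D /var/lib/pgsql/data -R -P

# If the master fails, promote the standby to read/write:
pg_ctl promote -D /var/lib/pgsql/data
```

The same idea (ship the log, replay it on an independent machine in another building) is what Data Guard or log shipping gives you on Oracle.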

John Gardeniers
  • I agree with everything except your first sentence. While it is true that local storage would be faster, it isn't necessarily true that the poster needs that performance. He does need to understand his IO requirements before diving into a storage solution but he may find that iSCSI can meet those requirements very easily. Or not. – icky3000 Jun 04 '11 at 14:11
  • Local drives don't work everywhere. I have clients that would have extreme performance pain and capacity constraints if limited to internal disk because of the lack of scalability and no battery-backed cache. – Tom Shaw Jun 04 '11 at 14:25
  • @Tom Shaw: You can get local RAID controllers with a BBU for the cache. – Sven Jun 04 '11 at 14:50
  • @SvenW: Valid point - also, SSDs can fill some of that role now too. However the other benefits of external storage still stand. – Tom Shaw Jun 04 '11 at 14:57
  • @Tom Shaw: Agreed that there are big benefits to external storage, especially in larger environments when you need a lot of it. I'd rather manage one big system than 50 little ones. – icky3000 Jun 04 '11 at 17:23
  • All of you are reasoning from the perspective of what you know, comparing iSCSI with SAS, with SSDs, etc. because they are all 'storage' somehow. Different applications and constraints will shift the ideal choice across the existing standards. iSCSI has constraints from being networked, but it's block-level, so at least it could be efficient. Latency will reduce IOPS, but parallelism would make that irrelevant. Network setup modulates latency anyway, and a network cable is also just a copper cable, so it's really fat network stacks that ruin the fun. Lots to think about. – Lodewijk Oct 12 '18 at 09:10

We've used iSCSI to external storage in our Oracle transactional database environment for years without trouble and with excellent performance. The iSCSI component is not part of our redundancy plan, however; we address that as others have illustrated. Also, for the reasons mentioned elsewhere, we are migrating to NFS-based storage using Oracle's dNFS. With 11gR1 we saw quite a few issues which seem not to be present in 11gR2. We don't yet have enough users to see how performance will really work out; that will probably happen in the next few weeks.

Keith Stokes

There are many SAS disk arrays out there that allow multiple servers to connect. For example, Dell MD32x0 arrays, when configured with dual controllers, can serve up to four servers with two redundant connections per server (http://www.dell.com/downloads/global/products/pvaul/en/powervault-md3200-md3220-transition-guide.pdf).

If you don't have a specific need to build a SAN environment, have never used SANs before, and just want active/standby failover, this will be a simpler, easier, and cheaper solution.

Max Alginin