4

I am trying to achieve high iSCSI speeds between my ESX box and my Synology NAS. I am hoping for a top speed of 300-400 MB/s, but so far all I've reached is 150-170 MB/s.

The main test that I am using is to create a 20 GB virtual disk, Thick Eager Zeroed, in the SSD-based iSCSI datastore (and variations of this).

Some questions:

  1. I am assuming that creating this disk is a sequential write?
  2. The Synology never passes 30-40% CPU usage, and its memory is almost fully used. I am assuming the Synology is capable of writing at these speeds to an SSD, right?
  3. Also, is ESX able to max out the available bandwidth when creating a virtual disk over iSCSI?
  4. If using a benchmark tool, what would you recommend, and how can I be sure the bottleneck won't be on the sending side? Can I install the tool in a VM on the SSD datastore and run it "against itself"?

This is my setup.

I have a Synology 1513+ with the following disks and configuration:

  1. 3 × 4 TB WD disks (unused)
  2. 1 Samsung 860 EVO (1 volume, no RAID)
  3. 1 Samsung 256 GB SATA III 3D NAND (1 volume, no RAID)
  4. 2 iSCSI targets, one per SSD (8 VMware iSCSI initiator connections in total)

Network config:

  1. Synology 4000 Mbps bond, MTU 1500, full duplex.
  2. Synology dynamic link aggregation (802.3ad LACP).
  3. Cisco SG350 with link aggregation configured for the 4 Synology ports.
  4. Storage and iSCSI network physically separated from the main network.
  5. Cat 6 cables.

vSphere:

  1. PowerEdge R610 (Xeon E5620 @ 2.40 GHz, 64 GB memory)
  2. Broadcom NetXtreme II BCM5709 1000Base-T (8 NICs)
  3. vSphere 5.5.0 build 1623387

vSphere config:

  1. 4 vSwitches, 1 NIC each for iSCSI. MTU 1500, full duplex.
  2. iSCSI software initiator with the 4 VMkernel ports bound in the port group, all compliant and with active path status.
  3. 2 iSCSI targets with 4 MPIO paths each, all active and Round Robin.

So basically, 4 cables from the NAS go to the Cisco LAG, and 4 iSCSI cables from the ESX host go to regular ports on the switch.
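
For reference, the MPIO paths and policy described above can be inspected from the ESXi shell; a minimal sketch (both commands exist in ESXi 5.5, and the output names will differ per host):

```shell
# List every storage path with its state (active/dead):
esxcli storage nmp path list
# Show each device's path-selection policy (VMW_PSP_RR for Round Robin):
esxcli storage nmp device list
```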

Tests and configs I've performed:

  1. Setting MTU to 9000 on all vSwitches, VMkernel ports, the Synology, and the Cisco. I have also tried other values like 2000 and 4000.
  2. Creating 1 (and 2 or 3 simultaneous) virtual disks on 1 or 2 iSCSI targets to maximise the workload.
  3. Disabling/enabling header and data digests, and delayed ACK.
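
As an aside, an MTU change can be verified end to end from the ESXi shell before trusting it; a sketch (10.10.74.200 is this setup's Synology iSCSI IP, used here as an example):

```shell
# 8972 = 9000-byte MTU minus 28 bytes of IP/ICMP headers; -d sets Don't Fragment.
vmkping -d -s 8972 10.10.74.200
# If this fails while MTU 9000 is set everywhere, some hop is still at MTU 1500.
```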

I've lost count of all the things that I have tried. I am not sure where my bottleneck is or what I have configured wrongly. I have attached some screenshots.

Any help would be much appreciated!

iSCSI paths on ESX

Networking config on ESX

Example of the vmkernel config

iSCSI initiator network configuration

Cisco LAG config 1

Cisco LAG config 2

acanessa

2 Answers

3
  1. It might be accelerated by the VAAI ZERO primitive (I can't tell exactly on your outdated vSphere version), but it's a sequential write either way. It also depends on how you created your iSCSI target. Newer DSM versions by default create Advanced LUNs, which sit on top of a file system; older versions by default used LVM disks directly and performed much worse.
  2. ~400 MB/s should be achievable.
  3. 400 MB/s is not a problem, if the target can provide the IO.
  4. If you're looking at pure sequential throughput, then dd on the Linux side or a simple CrystalDiskMark on Windows will work.
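
For the dd route, a minimal sketch (run inside a Linux VM whose disk sits on the iSCSI datastore; file name and size are arbitrary):

```shell
# Sequential write: conv=fdatasync flushes data to disk before dd reports a
# rate, so the MB/s figure reflects the datastore, not the guest page cache.
dd if=/dev/zero of=seqwrite.bin bs=1M count=1024 conv=fdatasync
# For a meaningful read figure, drop the page cache first (needs root):
# echo 3 > /proc/sys/vm/drop_caches
dd if=seqwrite.bin of=/dev/null bs=1M
rm -f seqwrite.bin
```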

LAGs and iSCSI usually don't mix. Disable bonding on the Synology and configure the ports as separate interfaces. Enable multi-initiator iSCSI on the Synology. Unfortunately I don't have a Synology at hand for exact instructions.

Configure vSphere like this:

  • vSphere initiator --> Synology target IP/port 1
  • vSphere initiator --> Synology target IP/port 2
  • vSphere initiator --> Synology target IP/port 3
  • vSphere initiator --> Synology target IP/port 4

Disable unnecessary paths (keep one vSphere source IP to one Synology IP); vSphere supports only 8 paths per target on iSCSI (not enforced). I don't remember if you can limit target access per source on the Synology side; likely not. Also, you already have enough paths for reliability, and any more will not help, as you're likely bandwidth limited.
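
Disabling a surplus path can be done from the ESXi shell; a sketch (the runtime path name below is a hypothetical example; take real ones from the path list):

```shell
# List all paths with their runtime names (vmhbaX:C0:TY:LZ):
esxcli storage core path list
# Disable one redundant path; the name here is a placeholder:
esxcli storage core path set --path vmhba38:C0:T1:L0 --state off
```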

Change the Round Robin IOPS policy to a lower value; see https://kb.vmware.com/s/article/2069356. Otherwise 1000 IOPS will go down one path before a path change occurs.
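
Per that KB article, the change is made per device; a sketch with a hypothetical NAA ID:

```shell
# Set the device to Round Robin (if not already) and lower the IOPS limit
# from the default 1000 to 1; naa.xxxxxxxxxxxxxxxx is a placeholder.
esxcli storage nmp device set --device naa.xxxxxxxxxxxxxxxx --psp VMW_PSP_RR
esxcli storage nmp psp roundrobin deviceconfig set \
  --device naa.xxxxxxxxxxxxxxxx --type=iops --iops=1
```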

Keep using jumbo frames. It's only about a 5% win in bandwidth, but on gigabit you can easily become bandwidth starved.

Don Zoomik
  • Thanks! 1. I didn't want to use VAAI because that would transfer the workload onto the Synology. 2. Understood. 3. Understood. 4. I tried this, with the same low-speed results. However, CrystalDiskMark helped me prove that my local ESXi SSD storage is capable of 500 MB/s. Without LAG, I have to use ALB on the Synology, and I can tell that only one link is being used at a time. I have also tried your configuration, with no change. Tried changing IOPS, and no change. So I am really at a loss as to what I am doing wrong. Thank you for the feedback though! – acanessa Feb 26 '19 at 15:42
  • VAAI is usually a good thing; there's no point pushing zeroes over the network if the array can generate them internally. I have used similar iSCSI configurations in the past and could easily saturate 4 × 1 Gbit links, but it's hard to tell where the issue is. – Don Zoomik Feb 26 '19 at 21:15
  • I agree with you on VAAI. I actually have it implemented, but not for this storage, because I wanted to generate the zeroes specifically from the ESX host, via the network. I will keep looking and let you know what I find. – acanessa Feb 26 '19 at 22:06
1

UPDATE:

I managed to solve my problem. Bottom line: it was 80% my fault, 20% configuration.

The Synology and switch configuration were correct after all. Using LACP on the Synology and the Cisco did work for me. My NAS has only one IP where the iSCSI targets are available, and the ESX host has 4 NICs/VMkernel ports pointing to it.

Configure vSphere like this:

  • vSphere ini. 10.10.74.11|
  • vSphere ini. 10.10.74.12|
  • vSphere ini. 10.10.74.13|
  • vSphere ini. 10.10.74.14|--4 cables-->[CISCO]--4 cables-->Synology IP (10.10.74.200)

I used MPIO with Round Robin.

The main problem was that I wasn't stressing the system enough. My test of creating zeroed virtual disks, assuming that the ESX host would use all available bandwidth to do it, appears to have been wrong.

Configuring CrystalDiskMark correctly was key! Changing the IOPS setting according to the documentation in the link (https://kb.vmware.com/s/article/2069356) was also part of the solution. I am now getting around 450 MB/s reads / 300 MB/s writes!

acanessa