
I'm having big trouble with my iSCSI network and can't seem to get it working as fast as it could.

I've tried pretty much everything to get full performance out of my SAN, including bringing in specialists from VMware and EMC.

A short description of my gear:

- 3x HP DL360 G7 / vSphere 5.5 / 4 onboard NICs / 4 PCIe Intel NICs for iSCSI
- 2x HP 2510-24G
- 1x EMC VNXe 3100 / 2 storage processors, each with 2 iSCSI-dedicated NICs / 24x 15k SAS RAID10 / 6x 7.2k SAS RAID6

I followed best practices and distributed the storage pools evenly across both iSCSI servers. I created 2 iSCSI servers, one on each storage processor. Please see the image for my iSCSI configuration.

iSCSI configuration

iSCSI traffic is separated via VLAN (forbid set for other VLANs); I even tried it with another HP switch from the 29xx series. Flow control is enabled (I also tried it disabled), jumbo frames are disabled. There is no routing involved.

On the ESX hosts, all iSCSI NICs are being used since I set the Round Robin path policy for every datastore. I also tried a path change policy of 1 IO, as so many others seem to have gained performance that way. I tried the onboard NICs (Broadcom) too, but there's no difference. On the switches I can see that the ports are being used very evenly, on both the ESX side and the VNXe side. I have perfect load balancing, HOWEVER: I can't get past 1 Gbit in total.

I do understand that the VNXe is optimized for multiple connections and that Round Robin needs that too, but even when I do a storage vMotion between 2 hosts and 2 datastores (using different iSCSI servers), I see a flat line at around 84 MB/s in the Unisphere web interface. I see that line at exactly the same value so often that I can't believe my disks wouldn't deliver more or that the task isn't demanding enough. It gets even better: with only one cable per host and one per storage processor I achieve the SAME performance. So I got a lot of redundancy but no extra speed at all.
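For reference, this is roughly how the effective path policy and round-robin IOPS limit can be checked from the ESXi shell; the naa. device ID is just a placeholder for one of my LUNs:

esxcli storage nmp device list
esxcli storage nmp psp roundrobin deviceconfig get --device=naa.xxxxxxxxxxxxxxxx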

As I've seen quite a few people talking about their iSCSI performance, I am desperate to find out what is wrong with my configuration (which has been tested and verified by trained people from VMware and EMC). I'm thankful for every opinion!

EDIT:

Yes, I have configured vMotion to use multiple NICs. Besides that, storage vMotion always goes through the iSCSI adapters, not the vMotion adapters. I have attached screenshots of my configuration.

iSCSI Port binding

iSCSI Destinations

iSCSI Paths

I know storage vMotion is no benchmark; however, I had to do a lot of it over the last few days and the upper limit has always been around 80 MB/s. A pool of 6x 15k 600 GB SAS disks in RAID 10 should easily be able to push a whole lot more through, don't you think? I did an IOMeter test for you - I tried several profiles, and the fastest was 256 KiB 100% read at 64.45 MB/s; Unisphere shows about the same speed. The test ran in a VM stored on a pool of 6x 15k 300 GB SAS disks (RAID 10) with hardly any other activity at this time of day.

IO Meter

Unisphere

EDIT2:

Sorry for the duplicate usernames, but I wrote this question at work and it didn't use the username I already have at Stack Overflow. Here is the screenshot showing my Round Robin settings. It is the same on all hosts and all datastores.

Round Robin

ewwhite
Ryan Hardy
  • I don't think anything is wrong. What exactly are you expecting? Have you configured [multi-NIC vMotion?](http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2007467) – ewwhite Jun 17 '14 at 15:15
  • Doing storage vMotion is not a benchmark for storage systems, as the vmkernel is restricted in I/O and CPU usage. Have you tried benchmarking using IOMeter etc.? What kind of disks are in the VNXe, and what kind of RAID/storage pool setup? – pauska Jun 17 '14 at 15:23
  • Also, can you post a screenshot of the LUN "manage path" inside vSphere? – pauska Jun 17 '14 at 15:26
  • You logged in with a different user than you used to ask the question, so your edit got stuck in a queue. – pauska Jun 17 '14 at 17:14
  • Thanks for the screenshots, but it's still not the one I asked for. Click on one of your ESXi hosts, the configure tab, datastores, select the datastore you want to troubleshoot and click the "Properties" link on the bottom right. Then click on "manage paths" and send us a screenshot of that window. – pauska Jun 17 '14 at 17:16
  • Alright, the multipathing looks correct. I can help you further with this, but it's going to be a lot of back-and-forth. May I suggest you join our chatroom? I'm there from around 9 AM to 4 PM CEST nearly every day. http://chat.stackexchange.com/rooms/127/the-comms-room – pauska Jun 17 '14 at 22:41

3 Answers


It is possible that you do not generate enough IOPS for this to really kick in.
Have a look here for how to change the setting from the default of 1,000 IOPS to a smaller value. (This is Symmetrix specific, but you can do the same for the VMware Round Robin provider.)

However, I'm not yet convinced that it can really utilize more than one link fully in parallel with just one datastore. I think you have to run the IOMeter test on more than one datastore in parallel to see the benefit. (Not 100% sure, though.)
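Something along these lines should lower the IOPS limit for a single device and verify it afterwards; the naa. ID is just a placeholder for one of your LUNs:

esxcli storage nmp device list
esxcli storage nmp psp roundrobin deviceconfig set --device=naa.xxxxxxxxxxxxxxxx --type=iops --iops=1
esxcli storage nmp psp roundrobin deviceconfig get --device=naa.xxxxxxxxxxxxxxxx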

MichelZ

Create a SATP rule for the storage vendor "EMC", set the path policy to Round Robin and the IOPS limit from the default of 1000 to 1. This will persist across reboots, and any time a new EMC iSCSI LUN is presented, this rule will be picked up. For this to apply to existing EMC iSCSI LUNs, reboot the host.

esxcli storage nmp satp rule add --satp="VMW_SATP_DEFAULT_AA" \
  --vendor="EMC" -P "VMW_PSP_RR" -O "iops=1"

I've played around with changing the IOPS between 1 and 3, and found it performs best on a single VM. That said, if you have a lot of VMs and a lot of datastores, 1 may not be optimal...

Be sure you have each interface on the VNXe set to an MTU of 9000. Also, the vSwitch with your iSCSI interfaces should be set to MTU 9000, along with each VMkernel. On your VNXe, create two iSCSI servers - one for SPA and one for SPB. Associate one IP with each initially. Then view the details for each iSCSI server and add additional IPs for each active interface per SP. This will give you the round-robin performance you are looking for.
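Once everything is at MTU 9000, you can roughly sanity-check it end to end like this; vmk2 and the target address are placeholders for one of your iSCSI VMkernels and one of the VNXe iSCSI interfaces, and the 8972-byte don't-fragment ping will fail if jumbo frames are not working the whole way:

esxcli network vswitch standard list
esxcli network ip interface list
vmkping -I vmk2 -d -s 8972 192.168.50.10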

Then create at minimum two datastores. Associate one datastore with iSCSIServer-SPA and one with iSCSIServer-SPB. This will ensure one of your SPs isn't sitting there idle.

Lastly, all interfaces on the ESX side that are being used for iSCSI should go to a separate vSwitch, with all interfaces active. You will also want a VMkernel for each of those interfaces within that designated vSwitch, and you must override the vSwitch failover order for each VMkernel so that it has one Active adapter and all others Unused. Below is the deployment script I use for provisioning ESX hosts. Each host has a total of 8 interfaces, 4 for LAN and 4 for iSCSI/vMotion traffic.

  1. Perform the configuration below

a. # DNS

esxcli network ip dns search add --domain=mydomain.net

esxcli network ip dns server add --server=X.X.X.X

esxcli network ip dns server add --server=X.X.X.X

b. # set hostname (update accordingly)

esxcli system hostname set --host=server1 --domain=mydomain.net

c. # add uplinks to vSwitch0

esxcli network vswitch standard uplink add --uplink-name=vmnic1 --vswitch-name=vSwitch0

esxcli network vswitch standard uplink add --uplink-name=vmnic4 --vswitch-name=vSwitch0

esxcli network vswitch standard uplink add --uplink-name=vmnic5 --vswitch-name=vSwitch0

d. # create vSwitch1 for storage and set MTU to 9000

esxcli network vswitch standard add --vswitch-name=vSwitch1

esxcli network vswitch standard set --vswitch-name=vSwitch1 --mtu=9000

e. # add uplinks to vSwitch1

esxcli network vswitch standard uplink add --uplink-name=vmnic2 --vswitch-name=vSwitch1

esxcli network vswitch standard uplink add --uplink-name=vmnic3 --vswitch-name=vSwitch1

esxcli network vswitch standard uplink add --uplink-name=vmnic6 --vswitch-name=vSwitch1

esxcli network vswitch standard uplink add --uplink-name=vmnic7 --vswitch-name=vSwitch1

f. # set active NIC for vSwitch0

esxcli network vswitch standard policy failover set --vswitch-name=vSwitch0 --active-uplinks=vmnic0,vmnic1,vmnic4,vmnic5

g. # set active NIC for vSwitch1

esxcli network vswitch standard policy failover set --vswitch-name=vSwitch1 --active-uplinks=vmnic2,vmnic3,vmnic6,vmnic7

h. # create port groups for iSCSI and vmkernels on ESX01 only (not ESX02)

esxcli network vswitch standard portgroup add --portgroup-name=iSCSI-vmnic2 --vswitch-name=vSwitch1

esxcli network ip interface add --interface-name=vmk2 --portgroup-name=iSCSI-vmnic2 --mtu=9000

esxcli network ip interface ipv4 set --interface-name=vmk2 --ipv4=192.168.50.152 --netmask=255.255.255.0 --type=static

vim-cmd hostsvc/vmotion/vnic_set vmk2

esxcli network vswitch standard portgroup add --portgroup-name=iSCSI-vmnic3 --vswitch-name=vSwitch1

esxcli network ip interface add --interface-name=vmk3 --portgroup-name=iSCSI-vmnic3 --mtu=9000

esxcli network ip interface ipv4 set --interface-name=vmk3 --ipv4=192.168.50.153 --netmask=255.255.255.0 --type=static

vim-cmd hostsvc/vmotion/vnic_set vmk3

esxcli network vswitch standard portgroup add --portgroup-name=iSCSI-vmnic6 --vswitch-name=vSwitch1

esxcli network ip interface add --interface-name=vmk6 --portgroup-name=iSCSI-vmnic6 --mtu=9000

esxcli network ip interface ipv4 set --interface-name=vmk6 --ipv4=192.168.50.156 --netmask=255.255.255.0 --type=static

vim-cmd hostsvc/vmotion/vnic_set vmk6

esxcli network vswitch standard portgroup add --portgroup-name=iSCSI-vmnic7 --vswitch-name=vSwitch1

esxcli network ip interface add --interface-name=vmk7 --portgroup-name=iSCSI-vmnic7 --mtu=9000

esxcli network ip interface ipv4 set --interface-name=vmk7 --ipv4=192.168.50.157 --netmask=255.255.255.0 --type=static

vim-cmd hostsvc/vmotion/vnic_set vmk7

i. # create port groups for iSCSI and vmkernels on ESX02 only (not ESX01)

esxcli network vswitch standard portgroup add --portgroup-name=iSCSI-vmnic2 --vswitch-name=vSwitch1

esxcli network ip interface add --interface-name=vmk2 --portgroup-name=iSCSI-vmnic2 --mtu=9000

esxcli network ip interface ipv4 set --interface-name=vmk2 --ipv4=192.168.50.162 --netmask=255.255.255.0 --type=static

vim-cmd hostsvc/vmotion/vnic_set vmk2

esxcli network vswitch standard portgroup add --portgroup-name=iSCSI-vmnic3 --vswitch-name=vSwitch1

esxcli network ip interface add --interface-name=vmk3 --portgroup-name=iSCSI-vmnic3 --mtu=9000

esxcli network ip interface ipv4 set --interface-name=vmk3 --ipv4=192.168.50.163 --netmask=255.255.255.0 --type=static

vim-cmd hostsvc/vmotion/vnic_set vmk3

esxcli network vswitch standard portgroup add --portgroup-name=iSCSI-vmnic6 --vswitch-name=vSwitch1

esxcli network ip interface add --interface-name=vmk6 --portgroup-name=iSCSI-vmnic6 --mtu=9000

esxcli network ip interface ipv4 set --interface-name=vmk6 --ipv4=192.168.50.166 --netmask=255.255.255.0 --type=static

vim-cmd hostsvc/vmotion/vnic_set vmk6

esxcli network vswitch standard portgroup add --portgroup-name=iSCSI-vmnic7 --vswitch-name=vSwitch1

esxcli network ip interface add --interface-name=vmk7 --portgroup-name=iSCSI-vmnic7 --mtu=9000

esxcli network ip interface ipv4 set --interface-name=vmk7 --ipv4=192.168.50.167 --netmask=255.255.255.0 --type=static

vim-cmd hostsvc/vmotion/vnic_set vmk7

j. # set active NIC for each iSCSI vmkernel

esxcli network vswitch standard portgroup policy failover set --portgroup-name=iSCSI-vmnic2 --active-uplinks=vmnic2

esxcli network vswitch standard portgroup policy failover set --portgroup-name=iSCSI-vmnic3 --active-uplinks=vmnic3

esxcli network vswitch standard portgroup policy failover set --portgroup-name=iSCSI-vmnic6 --active-uplinks=vmnic6

esxcli network vswitch standard portgroup policy failover set --portgroup-name=iSCSI-vmnic7 --active-uplinks=vmnic7
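
# optional sanity check (not required): confirm each iSCSI port group ended up with exactly one active uplink, e.g. for vmnic2

esxcli network vswitch standard portgroup policy failover get --portgroup-name=iSCSI-vmnic2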

k. # create port groups

esxcli network vswitch standard portgroup add --portgroup-name=VMNetwork1 --vswitch-name=vSwitch0

esxcli network vswitch standard portgroup add --portgroup-name=VMNetwork2 --vswitch-name=vSwitch0

esxcli network vswitch standard portgroup add --portgroup-name=VMNetwork3 --vswitch-name=vSwitch0

l. # set VLAN to VM port groups

esxcli network vswitch standard portgroup set -p VMNetwork1 --vlan-id ##

esxcli network vswitch standard portgroup set -p VMNetwork2 --vlan-id ##

esxcli network vswitch standard portgroup set -p VMNetwork3 --vlan-id ###

m. # remove default VM portgroup

esxcli network vswitch standard portgroup remove --portgroup-name="VM Network" --vswitch-name=vSwitch0

n. # enable iSCSI Software Adapter

esxcli iscsi software set --enabled=true

esxcli iscsi networkportal add -A vmhba33 -n vmk2

esxcli iscsi networkportal add -A vmhba33 -n vmk3

esxcli iscsi networkportal add -A vmhba33 -n vmk6

esxcli iscsi networkportal add -A vmhba33 -n vmk7
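
# optional sanity check (not required): the software iSCSI adapter name can vary per host, so confirm it really is vmhba33 and that all four vmkernels are bound

esxcli iscsi adapter list

esxcli iscsi networkportal list --adapter=vmhba33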

o. # rename local datastore

var=$(hostname)

vim-cmd hostsvc/datastore/rename datastore1 "local-$var"

p. # Define Native Multi Path Storage Array Type Plugin for EMC VNXe 3300 and tune round-robin IOPS from 1000 to 1

esxcli storage nmp satp rule add --satp="VMW_SATP_DEFAULT_AA" --vendor="EMC" -P "VMW_PSP_RR" -O "iops=1"

q. # refresh networking

esxcli network firewall refresh

vim-cmd hostsvc/net/refresh

  2. Configure the NTP client using the vSphere Client on each host

a. Configuration --> Time Configuration --> Properties --> Options --> NTP Settings --> Add --> ntp.mydomain.net --> Check "Restart NTP service to apply changes" --> OK --> wait… --> Select "Start and stop with host" --> OK --> Check "NTP Client Enabled" --> OK

  3. Reboot the host

  4. Proceed with EMC VNXe storage provisioning; return to this guide when complete

  5. Log in to the vSphere Client on each host

  6. Upgrade each datastore to VMFS-5

a. Configuration --> Storage --> Highlight Datastore --> Upgrade to VMFS-5

Andrew Schulman

Unfortunately, I think nothing is wrong with your setup. You simply cannot use more than 1 Gb/s for a single VM.

The point here is that you don't simply want to use two (or more) NICs; you want to use them concurrently, in a RAID-0-like configuration.

802.3ad, the link-level aggregation standard that I think you have configured on your switches, typically can't be configured to stripe a single connection across different NICs. This is due to how the interface-selection algorithm works: it hashes the source and destination MACs and/or IPs/ports, and a single connection will always have the same MACs/IPs/ports.

This does not mean that your setup can't push higher numbers (in both throughput and IOPS), but it does put a hard limit on how much performance a single VM can extract. Try running 2 or 4 IOMeter instances on 2/4 different VMs: I bet the aggregate throughput will be way higher than the single-VM benchmark, but no single machine will get past the 1 Gb/s limit.

Linux bonding and some high-end switches support different link-aggregation methods and enable fully striped, aggregated network interfaces. However, this has non-trivial ramifications for how other switches/systems interact with these "non-standard" aggregation methods.
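For example, on Linux a striped balance-rr bond can be set up roughly like this; the interface names and the address are placeholders, and the switch side has to be configured accordingly (e.g. a static trunk):

# create a round-robin (striping) bond from two placeholder NICs
ip link add bond0 type bond mode balance-rr
ip link set eth0 down
ip link set eth1 down
ip link set eth0 master bond0
ip link set eth1 master bond0
ip addr add 192.168.50.200/24 dev bond0
ip link set bond0 up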

Anyway, for a storage network, you should really enable jumbo frames, if supported.

shodanshok