CentOS Client - Unable to Establish iSCSI connection with multiple interfaces on the initiator

Question

So after upgrading to CentOS 6.2, I am seemingly no longer able to log in to my iSCSI targets. I have multiple interfaces on different subnets on the system, and I first suspected that I was not binding to the correct interfaces. netstat seems to confirm this, as the source address here is clearly wrong:

[root]⌘ netstat -na|grep .90
tcp        0      1 10.10.100.60:42354          10.10.8.90:3260             SYN_SENT    
tcp        0      1 10.10.100.60:40777          10.10.9.90:3260             SYN_SENT
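
The mismatch in the output above (a 10.10.100.x source talking to a 10.10.8.x or 10.10.9.x portal) can be spotted mechanically. This is only a rough sketch: the function name is made up for illustration, and it assumes the storage subnets are /24, as in the routing table further down.

```shell
#!/bin/sh
# Hypothetical helper, not part of open-iscsi or net-tools.
# Reads `netstat -nat` style lines on stdin and prints every TCP connection
# to the iSCSI port (3260) whose local address is not in the same /24 as
# the portal. The /24 comparison is an assumption; adjust for other prefixes.
find_mismatched_iscsi_sources() {
    awk '
        $1 == "tcp" && $5 ~ /:3260$/ {
            split($4, src, ":")            # src[1] = local IP
            split($5, dst, ":")            # dst[1] = portal IP
            split(src[1], s, "[.]")
            split(dst[1], d, "[.]")
            if (s[1] != d[1] || s[2] != d[2] || s[3] != d[3])
                print src[1] " -> " dst[1] " (" $6 ")"
        }
    '
}
```

Usage would be along the lines of `netstat -nat | find_mismatched_iscsi_sources`. Any line it prints is a session leaving from an address outside the portal's subnet. Separately, `ip route get 10.10.8.90` shows which interface and source address the kernel would choose for that destination by default.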

I then went ahead and disabled all but one interface. As a result netstat appears to be correct, but the login issue remains. I am positive that the target never sees a packet, because I see nothing but SYN_SENT. I know the problem is on my client, because the target is servicing multiple systems, none of which are CentOS 6.2. At this point I am pretty confident that something changed between CentOS 6.0/6.1 and 6.2. So, if anyone has any thoughts, or has run into this, I would very much like to hear about it.

[root]⌘ iscsiadm --mode node --targetname iqn.2011-12.dom.homer:01:lab-centos-servers-00001 --portal 10.10.8.90:3260,2 --interface=sw-iscsi-0 --login
Logging in to [iface: sw-iscsi-0, target: iqn.2011-12.dom.homer:01:lab-centos-servers-00001, portal: 10.10.8.90,3260] (multiple)
iscsiadm: Could not login to [iface: sw-iscsi-0, target: iqn.2011-12.dom.homer:01:lab-centos-servers-00001, portal: 10.10.8.90,3260].
iscsiadm: initiator reported error (8 - connection timed out)
iscsiadm: Could not log into all portals


[root]⌘ netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
10.10.8.0       0.0.0.0         255.255.255.0   U         0 0          0 eth2.7
10.10.9.0       0.0.0.0         255.255.255.0   U         0 0          0 eth3.7
10.10.100.0     0.0.0.0         255.255.252.0   U         0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth1
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth2
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth3
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth2.7
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth3.7
0.0.0.0         10.10.100.1     0.0.0.0         UG        0 0          0 eth0

Output of ip addr show for the two interfaces involved:

[root]⌘ for i in 2.7 3.7; do ip addr show eth$i; done
6: eth2.7@eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether 00:0c:29:94:5b:8d brd ff:ff:ff:ff:ff:ff
    inet 10.10.8.60/24 brd 10.10.8.255 scope global eth2.7
    inet6 fe80::20c:29ff:fe94:5b8d/64 scope link 
       valid_lft forever preferred_lft forever
7: eth3.7@eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP 
    link/ether 00:0c:29:94:5b:97 brd ff:ff:ff:ff:ff:ff
    inet 10.10.9.60/24 brd 10.10.9.255 scope global eth3.7
    inet6 fe80::20c:29ff:fe94:5b97/64 scope link 
       valid_lft forever preferred_lft forever

Update 01/06/2012:

This issue seems to be getting more interesting by the day. I went back a few weeks and grabbed a snapshot of this system from before the upgrade to 6.2. I spun up a new system from the snapshot and reconfigured the interface info and host keys, as well as the iSCSI initiator and iSCSI interface info to match the new MACs. I changed nothing else.

Then I attempted to connect to my targets, and had no issues at all. I cannot say that this was unexpected. I then compared sysctl settings from both systems. There were differences after the upgrade, but nothing seemingly relevant to iSCSI or IP that could contribute to this. I also noticed that two sessions per connection were now enabled by default after the upgrade, but I changed it back to 1 session in /etc/iscsi/iscsid.conf.

On the problematic system we can see that the source interface is seemingly wrong, but even when I disable the 10.10.100 interface, the problem persists. So, while this may be relevant, I could not confirm it. Needless to say, further research is necessary. Something is clearly different between releases: the working system is on 6.1, and the non-working one is on 6.2.

::Working System::
tcp        0      0 10.10.8.210:39566           10.10.8.90:3260             ESTABLISHED 
tcp        0      0 10.10.9.210:46518           10.10.9.90:3260             ESTABLISHED 

[root]⌘ ip route show
10.10.8.0/24 dev eth2.6  proto kernel  scope link  src 10.10.8.210 
10.10.9.0/24 dev eth3.7  proto kernel  scope link  src 10.10.9.210 
10.10.100.0/22 dev eth0  proto kernel  scope link  src 10.10.100.210 
169.254.0.0/16 dev eth0  scope link  metric 1002 
169.254.0.0/16 dev eth2.6  scope link  metric 1006 
169.254.0.0/16 dev eth3.7  scope link  metric 1007 
default via 10.10.100.1 dev eth0

::Non-working System::
tcp        0      1 10.10.100.60:44737          10.10.9.90:3260             SYN_SENT    
tcp        0      1 10.10.100.60:55479          10.10.8.90:3260             SYN_SENT

[root]⌘ ip route show
10.10.8.0/24 dev eth2.6  proto kernel  scope link  src 10.10.8.60 
10.10.9.0/24 dev eth3.7  proto kernel  scope link  src 10.10.9.60 
10.10.100.0/22 dev eth0  proto kernel  scope link  src 10.10.100.60 
169.254.0.0/16 dev eth0  scope link  metric 1002 
169.254.0.0/16 dev eth2.6  scope link  metric 1006 
169.254.0.0/16 dev eth3.7  scope link  metric 1007 
default via 10.10.100.1 dev eth0 

And the result is still the same:

[root]⌘ iscsiadm: Could not login to [iface: sw-iscsi-0, target: iqn.2011-12.dom.homer:01:lab-centos-servers-00001, portal: 10.10.8.90,3260].
iscsiadm: initiator reported error (8 - connection timed out)
iscsiadm: Could not login to [iface: sw-iscsi-1, target: iqn.2011-12.dom.homer:02:lab-centos-servers-00001, portal: 10.10.9.90,3260].
iscsiadm: initiator reported error (8 - connection timed out)
iscsiadm: Could not log into all portals

Update 01/08/2012:

I believe I have figured out the answer to my issue. It is quite obscure and I doubt this will happen to anyone else any time soon. It turns out that setting both iface.iscsi_ifacename and iface.hwaddress in the interface configuration file is not legal. When one manually adds an iSCSI target, as below, all settings from the interface config file are copied into the node config file that the command creates. The result is that iface.iscsi_ifacename and iface.hwaddress end up together in the same config file. These parameters are seemingly mutually exclusive, which does not exactly make sense, or perhaps there is an oversight in the code path. Perhaps I will investigate further.

# iscsiadm -m node --op new -T iqn.2011-12.dom.homer:01:lab-centos-servers-00001 -p 10.10.8.90,3260,2 -I sw-iscsi-0
# iscsiadm -m node --op new -T iqn.2011-12.dom.homer:02:lab-centos-servers-00001 -p 10.10.9.90,3260,2 -I sw-iscsi-1

Note that below I commented out iface.hwaddress and iface.ipaddress, after which I re-added the targets with the same commands as above. Everything now works fine.

[root]⌘ cat *
# BEGIN RECORD 2.0-872.33.el6
iface.iscsi_ifacename = sw-iscsi-0
iface.net_ifacename = eth2.6
#iface.hwaddress = XX:XX:XX:XX:XX:XX 
#iface.ipaddress = 10.10.8.60
iface.transport_name = tcp
iface.vlan_id = 6
iface.vlan_priority = 0
iface.iface_num = 0
iface.mtu = 0
iface.port = 0
# END RECORD
# BEGIN RECORD 2.0-872.33.el6
iface.iscsi_ifacename = sw-iscsi-1
iface.net_ifacename = eth3.7
#iface.hwaddress = XX:XX:XX:XX:XX:XX
#iface.ipaddress = 10.10.9.60
iface.transport_name = tcp
iface.vlan_id = 7
iface.vlan_priority = 0
iface.iface_num = 0
iface.mtu = 0
iface.port = 0
# END RECORD
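
For anyone who wants to script the same edit, a minimal sketch follows. The function name is made up for illustration, and it only prints the modified record to stdout rather than changing the file in place, so the result can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print an iface record with iface.hwaddress and
# iface.ipaddress commented out, leaving every other line untouched.
# $1 is the path to the iface record file; the file itself is not modified.
comment_out_hw_binding() {
    sed -e 's/^iface\.hwaddress[[:space:]]*=/#&/' \
        -e 's/^iface\.ipaddress[[:space:]]*=/#&/' "$1"
}
```

Something like `comment_out_hw_binding /var/lib/iscsi/ifaces/sw-iscsi-0` would show the result. That path is an assumption about where the iface records live on this install, so check it before relying on it. The targets still have to be re-added afterwards with the `--op new` commands shown earlier, since the node records carry their own copy of the iface settings.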

Again, the chances of this happening to someone else are slim to none, so typing this up is likely a waste of time. But if someone does encounter this issue, I hope this post will help.

What does ifconfig eth2.7 and ifconfig eth3.7 show for your IP addresses? — Jed Daniels, Jan 05 '12 at 02:38
Jed, I updated the question with `ip addr show` for the two if's. Thanks a lot. — slashdot, Jan 05 '12 at 12:37
Unfortunately, I have no real guidance here, but I'm very curious to know what is wrong. I'd look at the output of arp -an (after attempting the connection), check /var/log/daemon for anything relevant (and check messages too, of course), and probably try iscsiadm login with '-d8' to turn up debugging on it. Also, if possible, look for changes/differences in /etc/iscsi/iscsid.conf from before and after your upgrade. If you have the ability, I'd also check the logs and maybe even tcpdump on the target to determine if you are actually getting the connection request. — Jed Daniels, Jan 05 '12 at 20:47
Oh yea, and make sure there isn't some new firewalling going on post-upgrade. — Jed Daniels, Jan 05 '12 at 20:47
Have done all these already, and am still not quite there. I have a few more ideas, so will test, and if I find an answer will be sure to share. — slashdot, Jan 05 '12 at 21:46

Joaquin Villanueva · Answer 1 · 2012-05-09T10:55:02.407

I found the same issue, but on a new CentOS 6.2 install. iSCSI logins time out after creating a bridge for KVM over the Ethernet adapter. Before this (with no bridge created), iSCSI logged in without problems.

It seems iscsiadm tries to connect through the interface matching the defined iface.hwaddress, but in my setup two interfaces carry that address (eth1 and br1), and it picks eth1. The connection then times out.

Removing iface.hwaddress and adding iface.net_ifacename (as suggested), set to the correct interface name (br1), does the trick. Problem solved.
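
To check whether a host is in the same situation (one MAC address shared by several interfaces, as with eth1 and br1 here), something like the following sketch may help. The function name is made up for illustration, and it assumes a Linux system that exposes per-interface MAC addresses under /sys/class/net.

```shell
#!/bin/sh
# Hypothetical helper: list MAC addresses carried by more than one interface.
# $1 is the sysfs net directory; it defaults to /sys/class/net.
find_shared_macs() {
    dir=${1:-/sys/class/net}
    for f in "$dir"/*/address; do
        [ -r "$f" ] || continue
        iface=$(basename "$(dirname "$f")")
        printf '%s %s\n' "$(cat "$f")" "$iface"
    done | sort | awk '
        {
            # Collect interface names per MAC, space separated.
            if (count[$1]++) names[$1] = names[$1] " " $2
            else names[$1] = $2
        }
        END {
            for (m in count)
                if (count[m] > 1) print m ": " names[m]
        }
    '
}
```

Running `find_shared_macs` with no arguments prints one line per shared MAC followed by the interfaces that carry it. If the address configured in iface.hwaddress shows up there, binding by iface.net_ifacename avoids the ambiguity.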
