I set up an OpenStack Folsom (2012.2) multi-node, single-network infrastructure. Everything runs fine: instances run well on any compute node, the private network works like a charm, and all instances are reachable from the outside via Floating IPs and can reach the outside themselves.

But a network request from a VM to itself via its Floating IP fails.

Neither ping nor ssh works.

Security Groups are all open.

Between two different VMs, ping works via Floating IPs but ssh doesn't.

Some data for one example

  • 10.0.0.0/24 is the private network
  • 10.0.0.1 is the controller
  • 10.1.100.0/24 is the Floating IP network
  • the VM with 10.0.0.13 has the Floating IP 10.1.100.4

iptables entries (regarding 10.1.100.4/10.0.0.13) on the controller (all services including network):

-A nova-network-2.7-OUTPUT -d 10.1.100.4/32 -j DNAT --to-destination 10.0.0.13
-A nova-network-2.7-PREROUTING -d 10.1.100.4/32 -j DNAT --to-destination 10.0.0.13
-A nova-network-2.7-float-snat -s 10.0.0.13/32 -o eth0 -j SNAT --to-source 10.1.100.4

iptables entries on the compute node:

regarding 10.1.100.4/10.0.0.13:

-A nova-compute-2.7-local -d 10.0.0.13/32 -j nova-compute-2.7-inst-143

regarding nova-compute-2.7-inst-143:

-N nova-compute-2.7-inst-143
-A nova-compute-2.7-inst-143 -m state --state INVALID -j DROP
-A nova-compute-2.7-inst-143 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A nova-compute-2.7-inst-143 -j nova-compute-2.7-provider
-A nova-compute-2.7-inst-143 -s 10.0.0.1/32 -p udp -m udp --sport 67 --dport 68 -j ACCEPT
-A nova-compute-2.7-inst-143 -s 10.0.0.0/24 -j ACCEPT
-A nova-compute-2.7-inst-143 -p tcp -m tcp --dport 22 -j ACCEPT
-A nova-compute-2.7-inst-143 -p tcp -m tcp --dport 3389 -j ACCEPT
-A nova-compute-2.7-inst-143 -p tcp -m multiport --dports 1:65535 -j ACCEPT
-A nova-compute-2.7-inst-143 -p udp -m multiport --dports 1:65535 -j ACCEPT
-A nova-compute-2.7-inst-143 -p icmp -j ACCEPT
-A nova-compute-2.7-inst-143 -j nova-compute-2.7-sg-fallback

Any suggestions where to search for the problem are welcome. Of course I will provide any necessary data to solve the problem. Currently I am not quite sure which data would help.

030
Tilo Prütz
  • You can try changing the compute node's configuration to solve the `iptables` rules problem: $ echo 1 > /proc/sys/net/bridge/bridge-nf-call-iptables –  Feb 11 '14 at 07:58

2 Answers

Okay, I have found the problem:

All packets to Floating IPs (10.1.100.0/24 in my case) are DNATed to their private network destination (10.0.0.0/24 in my case). The ssh packets go via the controller and come right back to the VM. The ssh server answers, but sends the packet with its private address as the source (of course, it has no other). So the ssh client gets a packet from 10.0.0.13 in answer to a request it sent to 10.1.100.4, and ignores it.

So when sending packets from private to Floating IPs, not only the destination has to be NATed but the source too. That's not straightforward, because the DNAT happens in PREROUTING while the SNAT happens in POSTROUTING. It can be done using the connection tracking module:

iptables -t nat -A nova-network-2.7-float-snat -s 10.0.0.13/32 -d 10.0.0.0/24 -j SNAT --to-source 10.1.100.4 -m conntrack --ctstate DNAT

This did the trick for me (with one such rule per Floating IP, of course). It rewrites the source of every private-to-private packet that has already been DNATed (meaning it was originally addressed to a Floating IP), so that it appears to come from a Floating IP.
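Since the rule has to be repeated for every Floating IP, here is a minimal dry-run sketch that only prints the commands instead of executing them. The chain name and the private network are taken from my setup above. The function names and the "fixed floating" input format are made up for illustration. The match is placed before the target, which is the conventional order and should be the same rule as the one above.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) one hairpin SNAT rule per
# fixed/floating pair. CHAIN and PRIVATE_NET come from the setup in
# this post; the function names and input format are assumptions.
CHAIN="nova-network-2.7-float-snat"
PRIVATE_NET="10.0.0.0/24"

# Print the SNAT rule for one pair. It matches only packets of
# connections that were already DNATed.
hairpin_snat_rule() {
    fixed_ip="$1"
    floating_ip="$2"
    printf 'iptables -t nat -A %s -s %s/32 -d %s -m conntrack --ctstate DNAT -j SNAT --to-source %s\n' \
        "$CHAIN" "$fixed_ip" "$PRIVATE_NET" "$floating_ip"
}

# Read "fixed floating" pairs from stdin, one per line, skipping blank lines.
print_hairpin_rules() {
    while read -r fixed_ip floating_ip; do
        [ -n "$fixed_ip" ] || continue
        hairpin_snat_rule "$fixed_ip" "$floating_ip"
    done
}

# Example with the one pair from this post.
printf '10.0.0.13 10.1.100.4\n' | print_hairpin_rules
```

After reviewing the printed lines, they could be piped to a root shell to apply them. Note that nova-network manages these chains itself and may rewrite them, so rules added by hand might not persist.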

In my ssh scenario the following now happens:

  • client sends from 10.0.0.13 to 10.1.100.4
  • packet is DNATed to 10.0.0.13
  • packet is SNATed to 10.1.100.4
  • server answers packet to 10.1.100.4
  • packet is DNATed to 10.0.0.13
  • packet is SNATed to 10.1.100.4
  • client gets answer from 10.1.100.4 and is happy

This works as well for ping and also for traffic between different VMs.

Looks like I have to patch the nova-network code and submit it to the openstack project :-/.

Tilo Prütz
  • nova-network is frozen at this point. Quantum should be taking over, and the deprecated nova-network project will eventually be removed. –  Dec 22 '12 at 01:53
  • Okay, but do you know whether Quantum has the same issue? – Tilo Prütz Dec 22 '12 at 07:00

Tilo,

This is captured pretty well in https://bugs.launchpad.net/nova/+bug/1096259, and patches are currently in progress for nova (https://review.openstack.org/#/c/19139/) as of today (Jan 7th, 2013).

The full fix also goes with bug 1096987 (https://bugs.launchpad.net/nova/+bug/1096987) and bug 1096985 (https://bugs.launchpad.net/nova/+bug/1096985), which cover the more common deployment scenarios where you use either a predefined external gateway or the nova-network Linux/iptables public bridge setup.

Joe Heck