
I have the following problem when trying to implement a two-node failover cluster, using Hetzner as the hosting provider.

My corosync.conf is as follows:

# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
version: 2
secauth: off
interface {
    member {
        memberaddr: 144.76.91.XXX
    }
    member {
        memberaddr: 5.9.121.XXX
    }
    ringnumber: 0
    bindnetaddr: 5.9.121.0
    mcastport: 5405
    ttl: 1
}
transport: udpu
}

logging {
fileline: off
to_logfile: yes
to_syslog: yes
debug: on
logfile: /var/log/cluster/corosync.log
debug: off
timestamp: on
logger_subsys {
    subsys: AMF
    debug: off
}
}

The problem is that 5.9.121.XXX binds correctly (and appears in crm_mon as part of the cluster):

udp 0 0 5.9.121.XXX:5405 0.0.0.0:* 8281/corosync

However, 144.76.91.XXX fails to bind to its own address and binds to localhost instead:

udp 0 0 127.0.0.1:5405 0.0.0.0:* 7889/corosync

Analysis of tcpdump logs indicates that 144.76.91.XXX replies to 5.9.121.XXX with an ICMP type 3 (destination unreachable), code 3 (port unreachable).

Running corosync -f repeatedly prints:

Jun 24 12:53:28 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.

UDP traffic is enabled between the 2 hosts, no firewalls are currently in place and I am using Debian (thus no SELinux).

Any ideas for a workaround for this issue? Is it even possible to create a cluster with 2 machines on different subnets, or do I need to order servers within the same subnet? Thanks in advance for any replies.

thanasisk

1 Answer


From the mailing list, courtesy of Dan Friscu (this solved my problem):

"You're not supposed to use the same bindnetaddr in both places, it's relevant only at node level for UDPU (your use case). On the node where you have the 144.* address, use that as bindnetaddr."
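As a rough sketch of what that advice means in practice, the totem section on the node holding the 144.* address might look like the fragment below. It reuses the settings from the question unchanged except for bindnetaddr; the XXX octets stay elided as in the original, and the exact value to use is an assumption based on the quote rather than a tested configuration. The other node would keep its own 5.9.121.* value.

```
totem {
    version: 2
    secauth: off
    interface {
        member {
            memberaddr: 144.76.91.XXX
        }
        member {
            memberaddr: 5.9.121.XXX
        }
        ringnumber: 0
        # Node-local setting: on this node, point bindnetaddr at its own
        # 144.* address instead of the other node's 5.9.121.0 network.
        bindnetaddr: 144.76.91.XXX
        mcastport: 5405
        ttl: 1
    }
    transport: udpu
}
```

With UDPU the member list is identical on both nodes; only bindnetaddr differs per node, so the two corosync.conf files are no longer exact copies of each other.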

Hope that this is useful to someone.

thanasisk