
I'm running a setup on our staging system with 2 root servers and 1 failover IP. We use corosync and pacemaker. Corosync is configured for multicast communication on port 5405, and everything works fine.

Now I want to deploy this system on 2 root servers with a failover IP. Multicast communication won't work there, because the root servers are not directly connected: they are connected through a router and are located in different data centers.

So I changed corosync.conf to use the udpu (UDP unicast) transport, following the example. I'm using corosync v1.4.1.

corosync.conf:

compatibility: whitetank

totem {
        version: 2
        secauth: off
        interface {
                member {
                        memberaddr: A.A.A.A
                }
                member {
                        memberaddr: B.B.B.B
                }
                ringnumber: 0
                bindnetaddr:A.A.A.A
                mcastport: 5405
        }
transport: udpu
}

logging {
    fileline: off
        to_stderr: no
        to_logfile: yes
        to_syslog: yes
        logfile: /var/log/cluster/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: AMF
                debug: off
        }
}

service {
    name: pacemaker
    ver: 1
}

This is the output of 'netstat -nlp':

Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name   
tcp        0      0 0.0.0.0:47150               0.0.0.0:*                   LISTEN      1224/rpc.statd      
tcp        0      0 0.0.0.0:111                 0.0.0.0:*                   LISTEN      1206/rpcbind        
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      1314/sshd           
tcp        0      0 127.0.0.1:25                0.0.0.0:*                   LISTEN      1391/master         
tcp        0      0 :::111                      :::*                        LISTEN      1206/rpcbind        
tcp        0      0 :::22                       :::*                        LISTEN      1314/sshd           
tcp        0      0 :::45791                    :::*                        LISTEN      1224/rpc.statd      
udp        0      0 0.0.0.0:59806               0.0.0.0:*                               1769/corosync       
udp        0      0 0.0.0.0:39859               0.0.0.0:*                               1769/corosync       
udp        0      0 0.0.0.0:957                 0.0.0.0:*                               1206/rpcbind        
udp        0      0 0.0.0.0:56137               0.0.0.0:*                               1224/rpc.statd      
udp        0      0 0.0.0.0:976                 0.0.0.0:*                               1224/rpc.statd      
udp        0      0 0.0.0.0:111                 0.0.0.0:*                               1206/rpcbind        
udp        0      0 :::957                      :::*                                    1206/rpcbind        
udp        0      0 :::111                      :::*                                    1206/rpcbind        
udp        0      0 :::36209                    :::*                                    1224/rpc.statd

If I try to telnet to this port, the connection is refused. I've also disabled SELinux, the firewall, etc., but it still doesn't work.

/var/log/cluster/corosync.log:

May 08 13:54:42 corosync [MAIN  ] Corosync Cluster Engine exiting with status 0 at main.c:1858.
May 09 12:22:18 corosync [MAIN  ] Corosync Cluster Engine ('1.4.1'): started and ready to provide service.
May 09 12:22:18 corosync [MAIN  ] Corosync built-in features: nss dbus rdma snmp
May 09 12:22:18 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
May 09 12:22:18 corosync [TOTEM ] Initializing transport (UDP/IP Unicast).
May 09 12:22:18 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 09 12:22:18 corosync [TOTEM ] bind token socket failed: Cannot assign requested address (99)
May 09 12:22:18 corosync [TOTEM ] The network interface [A.A.A.A] is now up.
Set r/w permissions for uid=0, gid=0 on /var/log/cluster/corosync.log
May 09 12:22:18 corosync [pcmk  ] info: process_ais_conf: Reading configure
May 09 12:22:18 corosync [pcmk  ] info: config_find_init: Local handle: 7739444317642555395 for logging
May 09 12:22:18 corosync [pcmk  ] info: config_find_next: Processing additional logging options...
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Found 'off' for option: debug
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Found 'yes' for option: to_logfile
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Found '/var/log/cluster/corosync.log' for option: logfile
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Found 'yes' for option: to_syslog
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Defaulting to 'daemon' for option: syslog_facility
May 09 12:22:18 corosync [pcmk  ] info: config_find_init: Local handle: 5650605097994944516 for quorum
May 09 12:22:18 corosync [pcmk  ] info: config_find_next: No additional configuration supplied for: quorum
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: No default for option: provider
May 09 12:22:18 corosync [pcmk  ] info: config_find_init: Local handle: 2730409743423111173 for service
May 09 12:22:18 corosync [pcmk  ] info: config_find_next: Processing additional service options...
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Found '1' for option: ver
May 09 12:22:18 corosync [pcmk  ] info: process_ais_conf: Enabling MCP mode: Use the Pacemaker init script to complete Pacemaker startup
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Defaulting to 'pcmk' for option: clustername
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Defaulting to 'no' for option: use_logd
May 09 12:22:18 corosync [pcmk  ] info: get_config_opt: Defaulting to 'no' for option: use_mgmtd
May 09 12:22:18 corosync [pcmk  ] info: pcmk_startup: CRM: Initialized
May 09 12:22:18 corosync [pcmk  ] Logging: Initialized pcmk_startup
May 09 12:22:18 corosync [pcmk  ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
May 09 12:22:18 corosync [pcmk  ] info: pcmk_startup: Service: 10
May 09 12:22:18 corosync [pcmk  ] info: pcmk_startup: Local hostname: CentOS-60-64-minimal
May 09 12:22:18 corosync [pcmk  ] info: pcmk_update_nodeid: Local node id: 1632251470
May 09 12:22:18 corosync [pcmk  ] info: update_member: Creating entry for node 1632251470 born on 0
May 09 12:22:18 corosync [pcmk  ] info: update_member: 0x152edf0 Node 1632251470 now known as CentOS-60-64-minimal (was: (null))
May 09 12:22:18 corosync [pcmk  ] info: update_member: Node CentOS-60-64-minimal now has 1 quorum votes (was 0)
May 09 12:22:18 corosync [pcmk  ] info: update_member: Node 1632251470/CentOS-60-64-minimal is now: member
May 09 12:22:18 corosync [SERV  ] Service engine loaded: Pacemaker Cluster Manager 1.1.6
May 09 12:22:18 corosync [SERV  ] Service engine loaded: corosync extended virtual synchrony service
May 09 12:22:18 corosync [SERV  ] Service engine loaded: corosync configuration service
May 09 12:22:18 corosync [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01
May 09 12:22:18 corosync [SERV  ] Service engine loaded: corosync cluster config database access v1.01
May 09 12:22:18 corosync [SERV  ] Service engine loaded: corosync profile loading service
May 09 12:22:18 corosync [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
May 09 12:22:18 corosync [MAIN  ] Compatibility mode set to whitetank.  Using V1 and V2 of the synchronization engine.
May 09 12:22:27 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
May 09 12:22:30 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
May 09 12:22:32 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
May 09 12:22:34 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
May 09 12:22:36 corosync [TOTEM ] Totem is unable to form a cluster because of an operating system or network fault. The most common cause of this message is that the local firewall is configured improperly.
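
The "bind token socket failed: Cannot assign requested address (99)" line in the log is errno 99, which on Linux is EADDRNOTAVAIL: the kernel refused to bind a socket to the requested local address. A minimal sketch for checking, outside of corosync, whether a given address can be bound on this host is shown here. The helper name `can_bind_udp` is hypothetical, and `127.0.0.1` is only a placeholder for the value used as `bindnetaddr`. This only tests whether the address is assignable locally; it is not a statement about how corosync itself binds.

```python
import errno
import socket
from typing import Tuple


def can_bind_udp(address: str, port: int = 0) -> Tuple[bool, str]:
    """Try to bind a UDP socket to (address, port) and report the outcome.

    Port 0 lets the kernel pick a free port, so the check is about the
    address only. Pass 5405 to test the configured mcastport as well
    (that will fail with EADDRINUSE if something already uses it).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((address, port))
        return True, "ok"
    except OSError as exc:
        # Errno 99 (EADDRNOTAVAIL) on Linux means the address is not
        # configured on any local interface.
        name = errno.errorcode.get(exc.errno, "UNKNOWN")
        return False, f"{name} ({exc.errno}): {exc.strerror}"
    finally:
        sock.close()


if __name__ == "__main__":
    # Placeholder address: replace with the bindnetaddr from corosync.conf.
    print(can_bind_udp("127.0.0.1"))
```

If the check fails with EADDRNOTAVAIL for the configured address, comparing it against the addresses listed by `ip addr` on that node would be a reasonable next step.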

If I run 'crm status', I get 'could not connect to cluster'.

What am I doing wrong? Can anyone help or give me a hint?

Rene Hellmann